Image pulls from registry.k8s.io sometimes fail with 403
I've just had a CI pipeline fail because ingress-nginx failed to install: it couldn't pull an ingress-nginx image from registry.k8s.io. I confirmed this by manually trying to pull on the VPS:
# crictl pull registry.k8s.io/ingress-nginx/kube-webhook-certgen:v20230407@sha256:543c40fd093964bc9ab509d3e791f9989963021f1e9e4c9c7b6700b02bfb227b
E0613 11:11:22.248419 136261 remote_image.go:242] "PullImage from image service failed" err="rpc error: code = Unknown desc = failed to pull and unpack image \"registry.k8s.io/ingress-nginx/kube-webhook-certgen@sha256:543c40fd093964bc9ab509d3e791f9989963021f1e9e4c9c7b6700b02bfb227b\": failed to resolve reference \"registry.k8s.io/ingress-nginx/kube-webhook-certgen@sha256:543c40fd093964bc9ab509d3e791f9989963021f1e9e4c9c7b6700b02bfb227b\": pulling from host registry.k8s.io failed with status code [manifests sha256:543c40fd093964bc9ab509d3e791f9989963021f1e9e4c9c7b6700b02bfb227b]: 403 Forbidden" image="registry.k8s.io/ingress-nginx/kube-webhook-certgen:v20230407@sha256:543c40fd093964bc9ab509d3e791f9989963021f1e9e4c9c7b6700b02bfb227b"
FATA[0000] pulling image: rpc error: code = Unknown desc = failed to pull and unpack image "registry.k8s.io/ingress-nginx/kube-webhook-certgen@sha256:543c40fd093964bc9ab509d3e791f9989963021f1e9e4c9c7b6700b02bfb227b": failed to resolve reference "registry.k8s.io/ingress-nginx/kube-webhook-certgen@sha256:543c40fd093964bc9ab509d3e791f9989963021f1e9e4c9c7b6700b02bfb227b": pulling from host registry.k8s.io failed with status code [manifests sha256:543c40fd093964bc9ab509d3e791f9989963021f1e9e4c9c7b6700b02bfb227b]: 403 Forbidden
Pulling the same image using docker pull from my workstation works fine. This leads me to suspect that some of our IPs are banned (temporarily, I hope) by registry.k8s.io. Doing repeated cluster installations on the same instance generates quite a bit of traffic from a single IP, so it's not unthinkable that the ban is applied automatically because our usage is seen as abuse. In fact, this seems to be a well-known issue when running Kubernetes nodes on other cloud providers, see https://github.com/kubernetes/registry.k8s.io/issues/211
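To separate an IP-level block from a problem in the container runtime, the manifest request that crictl made can be replayed with plain curl from both the VPS and a workstation. This is only a rough sketch: classify_status and check_manifest are hypothetical helper names, and it assumes the registry answers the standard /v2/<repo>/manifests/<digest> endpoint without authentication.

```shell
#!/bin/sh
# Sketch: replay the manifest request and report the HTTP status.
# classify_status and check_manifest are hypothetical helper names.

# Map an HTTP status code to a short verdict.
classify_status() {
  case "$1" in
    200) echo "ok" ;;
    403) echo "forbidden" ;;
    000) echo "unreachable" ;;   # curl prints 000 when no HTTP response arrived
    *)   echo "other" ;;
  esac
}

# Request a manifest by digest and print the final HTTP status code
# (after following redirects). Arguments: repository path, digest.
check_manifest() {
  repo="$1"
  digest="$2"
  curl -sL -o /dev/null --max-time 30 \
    -H 'Accept: application/vnd.oci.image.index.v1+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.docker.distribution.manifest.v2+json' \
    -w '%{http_code}' \
    "https://registry.k8s.io/v2/${repo}/manifests/${digest}"
}

# Example use (run on the VPS and on a workstation, then compare):
#   code=$(check_manifest ingress-nginx/kube-webhook-certgen \
#     sha256:543c40fd093964bc9ab509d3e791f9989963021f1e9e4c9c7b6700b02bfb227b)
#   echo "$code $(classify_status "$code")"
```

If the VPS reports forbidden while the workstation reports ok for the same digest, that would be consistent with the IP-ban theory, though it would not prove it.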
We should really take another crack at setting up a Docker image mirror. We tried using GitLab's built-in dependency proxy before, but ran into a GitLab bug, which is still open but incidentally has seen some recent activity: https://gitlab.com/gitlab-org/gitlab/-/issues/350485
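For reference, here is a rough sketch of how a node could be pointed at a mirror once one exists. It assumes the nodes run containerd (the error wording above suggests so) and that containerd's CRI registry config_path is set to the certs directory. The function name write_mirror_hosts and the mirror URL are placeholders, and whether a given mirror can actually proxy registry.k8s.io would need to be checked.

```shell
#!/bin/sh
# Sketch: write a containerd hosts.toml that tries a mirror for registry.k8s.io.
# write_mirror_hosts is a hypothetical helper; the mirror URL is a placeholder.
# Arguments: containerd certs directory, mirror base URL.
write_mirror_hosts() {
  certs_dir="$1"
  mirror_url="$2"
  target_dir="$certs_dir/registry.k8s.io"
  mkdir -p "$target_dir"
  # "server" is the upstream used as fallback; the [host] entry is tried first.
  cat > "$target_dir/hosts.toml" <<EOF
server = "https://registry.k8s.io"

[host."$mirror_url"]
  capabilities = ["pull", "resolve"]
EOF
}

# Example use on a node (path assumes config_path = "/etc/containerd/certs.d"):
#   write_mirror_hosts /etc/containerd/certs.d https://mirror.example.internal
```

Keeping registry.k8s.io as the fallback server means pulls would still work from IPs that are not blocked if the mirror is down.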