Failed to pull image from private repository on kubernetes cluster

October 26, 2018
DigitalOcean Docker Kubernetes

Hello,

I’m using DO’s new Kubernetes cluster and there seems to be an issue pulling an image from a private repository. I am running GitLab on the cluster with a configured Docker registry. The registry works and I can pull the image locally. However, when deploying on Kubernetes I get the following error:

“Failed to pull image {redacted}: rpc error: code = Unknown desc = Error response from daemon: Get {redacted}: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)”

From related issues online, it appears most people get this error because of a proxy or DNS problem, so could this be an internal DO issue?

The request is not even logged on the registry server, so it does not appear to be an authentication issue either (although I do have a dockercfg secret set up for authentication).

Thanks,
Ryan

5 comments
  • I am also facing this issue. Is this something that you have figured out?

  • Haven’t investigated much more, just avoiding it for now. However, their support directed me to a previous FAQ from someone who had managed to set up GitLab with a private repository and deploy from it. Not much help, but I will probably go over my deployment again.

  • I’m having the exact same problem. I can pull images from any private docker registry outside of my cluster (e.g. dockerhub.io, gitlab.com etc.). From my local dev machine I can also push and pull images in private docker registries that I have created inside the kubernetes cluster (I have created both a nexus registry and one using the ‘stable/docker-registry’ helm chart). The registries I have created in the cluster have a public https endpoint via a DigitalOcean load balancer.

    I also discovered that if I SSH into the nodes themselves and run docker commands directly on the underlying droplet, I get exactly the same result (external registries work fine, internal ones fail). Any help appreciated. I don’t want to admit how much sweat I have wasted on this one.

  • I am also having the exact same problem. One of my nodes can pull the image and two of them can’t. I built a private repo inside the k8s cluster.

4 Answers

I had the same issue when running a private registry on the DO cluster. I fixed it by scaling up the registry to run on all nodes.
It seems to be a weird routing issue with the DO load balancer when addressing the outside IP from inside the cluster. The DO k8s controller tries to be smart and routes node requests internally instead of sending traffic to the load balancer IP. The problem is that if there is no registry pod on that node, the packets go nowhere.
So, scale up or route through a non-DO LB.
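
To check whether this is what you are hitting, it can help to list which nodes have no running registry pod, since by the explanation above those are the nodes where pulls would time out. Here is a minimal sketch in Python. It assumes you have saved the output of `kubectl get nodes -o json` and of `kubectl get pods -l app=registry -o json` (the `app=registry` label is a placeholder for whatever label your registry pods carry) and parsed each with `json.load`. The function name and the sample data are made up for illustration.

```python
def nodes_without_registry(nodes_json, pods_json):
    """Return the sorted names of nodes that host no Running registry pod.

    nodes_json: parsed output of `kubectl get nodes -o json`
    pods_json:  parsed output of `kubectl get pods -l <your label> -o json`
    """
    all_nodes = {item["metadata"]["name"] for item in nodes_json["items"]}
    covered = {
        item["spec"]["nodeName"]
        for item in pods_json["items"]
        # Only count pods that are scheduled and actually running.
        if item.get("spec", {}).get("nodeName")
        and item.get("status", {}).get("phase") == "Running"
    }
    return sorted(all_nodes - covered)


# Hypothetical three-node cluster with a single registry replica on node-a.
example_nodes = {"items": [{"metadata": {"name": n}} for n in ("node-a", "node-b", "node-c")]}
example_pods = {"items": [{"spec": {"nodeName": "node-a"}, "status": {"phase": "Running"}}]}

print(nodes_without_registry(example_nodes, example_pods))
# ['node-b', 'node-c']
```

An empty list means every node has a registry pod, which is the state the "scale up" fix is aiming for.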

  • Thanks for the answer. I’m facing exactly the same problem right now running k8s on Azure. It seems they have exactly the same issue. Your answer also explains why k8s sometimes successfully runs a pod with an image from the in-cluster registry (when it starts on the same node as the docker registry).

Any update on this? I’m having the same issue here.

Here’s what I did as a workaround:

  1. Set up your docker-registry ingress as usual (with tls etc.)
  2. Point your registry domain to your load balancer (as usual)
  3. Log in to your nodes via ssh
  4. Add the cluster IP of your ingress-controller to /etc/hosts:
     <ingress controller IP> registry.example.com

Use the domain name when deploying images in the cluster (e.g. registry.example.com/image).

If your /etc/hosts is managed, you might need to update your /etc/cloud/templates/host.*.tmpl as well.
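
If you have to repeat the /etc/hosts step on several nodes, a small helper keeps the edit idempotent so that re-running it does not pile up duplicate lines. This is only a sketch in Python: the function name is made up, and `10.245.0.10` stands in for your ingress controller's cluster IP. It works on the file contents as a string and leaves the actual write (which needs root) to you.

```python
def upsert_hosts_entry(hosts_text, ip, hostname):
    """Return hosts-file text in which `hostname` maps only to `ip`.

    Any existing mapping for `hostname` is removed first; other names
    on the same line, and all unrelated lines, are kept.
    """
    kept = []
    for line in hosts_text.splitlines():
        body, sep, comment = line.partition("#")
        fields = body.split()
        if len(fields) >= 2 and hostname in fields[1:]:
            # Strip our hostname from this line; keep the line if other names remain.
            names = [n for n in fields[1:] if n != hostname]
            if names:
                kept.append(" ".join([fields[0], *names]) + (f" #{comment}" if sep else ""))
            continue
        kept.append(line)
    kept.append(f"{ip} {hostname}")
    return "\n".join(kept) + "\n"


# Placeholder values: substitute your ingress controller IP and registry domain.
sample = "127.0.0.1 localhost\n"
print(upsert_hosts_entry(sample, "10.245.0.10", "registry.example.com"), end="")
```

On a node you would read /etc/hosts, pass its contents through the function, and write the result back as root. If the file is regenerated from the cloud-init templates mentioned above, the same entry has to go into the template too.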
