Docker pull error in systemd script on CoreOS

December 2, 2014

I created a droplet with the following configuration:

512MB RAM, 20GB SSD disk, Amsterdam 3, CoreOS (beta) 494.0.0

CoreOS was subsequently updated to 494.1.0.

Since I don't need a cluster, I didn't provide a cloud-config.

Then I added the following systemd unit file as /etc/systemd/system/hello.service (copied from the official CoreOS tutorial):

[Unit]
Description=MyApp
After=docker.service
Requires=docker.service

[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill busybox1
ExecStartPre=-/usr/bin/docker rm busybox1
ExecStartPre=/usr/bin/docker pull busybox
ExecStart=/usr/bin/docker run --name busybox1 busybox /bin/sh -c "while true; do echo Hello World; sleep 1; done"

[Install]
WantedBy=multi-user.target

I ran 'sudo systemctl enable hello' and 'sudo systemctl start hello'. Everything worked perfectly.

Then I rebooted the machine, and this happened:

systemd[1]: Starting MyApp...
docker[478]: busybox1
docker[600]: busybox1
docker[610]: Pulling repository busybox
docker[610]: Get https://index.docker.io/v1/repositories/library/busybox/images: dial tcp: lookup index.docker.io: connection refused
systemd[1]: hello.service: control process exited, code=exited status=1
systemd[1]: Failed to start MyApp.
systemd[1]: Unit hello.service entered failed state.

I destroyed and recreated the droplet, with exactly the same result.

I tried the same approach and unit file locally with Vagrant as well as on Google Compute Engine, and it worked as expected both times.

Am I doing something wrong, or does the problem lie with DigitalOcean?

2 comments
  • I just tried this again, and the problem still exists. The error message, however, has now changed to:

    Get https://index.docker.io/v1/repositories/library/busybox/images: 
    dial tcp: lookup index.docker.io: no such host
    

    Until Digital Ocean fixes this, I can think of a few workarounds:

    • Instead of systemd, use a Docker Restart Policy to make sure that your service keeps running.
    • Use a different web hosting provider.
  • I'm having exactly the same issue! I have a Server Fault question trying to resolve it here: https://serverfault.com/questions/675591/coreos-cant-pull-docker-container-on-boot/675864#675864. It also appears to have nothing to do with waiting for the network interface to come up. Let me know if you find a proper resolution. You'd think more people would try to use CoreOS on DigitalOcean and pull on boot.

5 Answers

It seems that the pull command runs before networking is fully up. You can depend on the network-online.target unit (a target, not a service) to make sure that networking is up before the docker pull command runs. See "Running Services After the Network is up" from the systemd documentation for more info.

You might also want to move the pull command to ExecStart to make sure that it is only run after everything else has come up. The following works for me:

[Unit]
Description=MyApp
Wants=docker.service network-online.target
After=docker.service network-online.target
Requires=docker.service

[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill busybox1
ExecStartPre=-/usr/bin/docker rm busybox1
ExecStart=/usr/bin/docker pull busybox
ExecStartPost=/usr/bin/docker run --name busybox1 busybox /bin/sh -c "while true; do echo Hello World; sleep 1; done"

[Install]
WantedBy=multi-user.target
  • Adding the network-online.target did the trick — thanks for the hint!

    I guess your Wants should really be a Requires, since it makes no sense to start our service if either Docker or the network fails to start.

    There is, however, one huge problem with your script: Turning the ExecStart line into ExecStartPost results in a service that keeps starting forever. That is definitely not what we want.

    The improved service file — which finally also works on Digital Ocean — looks like this:

    [Unit]
    Description=MyApp
    Requires=docker.service network-online.target
    After=docker.service network-online.target
    
    [Service]
    TimeoutStartSec=0
    ExecStartPre=-/usr/bin/docker kill busybox1
    ExecStartPre=-/usr/bin/docker rm busybox1
    ExecStartPre=/usr/bin/docker pull busybox
    ExecStart=/usr/bin/docker run --name busybox1 busybox /bin/sh -c "while true; do echo Hello World; sleep 1; done"
    
    [Install]
    WantedBy=multi-user.target
    
  • @patrickhoefler Right you are! Thanks for following up with your solution.

  • @patrickhoefler Your snippet didn't work for me. Am I missing a step? I used the snippet exactly as you posted it on March 16, 2015.

    I should mention that I'm using CoreOS (647.0.0).

    core@x~ $ systemctl status test.service -l
    ● test.service - MyApp
       Loaded: loaded (/etc/systemd/system/test.service; enabled; vendor preset: disabled)
       Active: failed (Result: exit-code) since Sat 2015-05-23 13:52:15 UTC; 1min 24s ago
    
    May 23 13:52:15 x docker[593]: Error response from daemon: No such container: busybox1
    May 23 13:52:15 x  docker[593]: time="2015-05-23T13:52:15Z" level="fatal" msg="Error: failed to kill one or more containers"
    May 23 13:52:15 x docker[603]: Error response from daemon: No such container: busybox1
    May 23 13:52:15 x  docker[603]: time="2015-05-23T13:52:15Z" level="fatal" msg="Error: failed to remove one or more containers"
    May 23 13:52:15 x systemd[1]: test.service: control process exited, code=exited status=1
    May 23 13:52:15 x systemd[1]: Failed to start MyApp.
    May 23 13:52:15 x systemd[1]: Unit test.service entered failed state.
    May 23 13:52:15 x systemd[1]: test.service failed.
    May 23 13:52:15 x docker[626]: Pulling repository busybox
    May 23 13:52:15 x docker[626]: time="2015-05-23T13:52:15Z" level="fatal" msg="Get https://index.docker.io/v1/repositories/library/busybox/images: dial tcp: lookup index.docker.io: Temporary failure in name resolution"
    

For those still hitting this issue: use systemd-networkd-wait-online.service instead of network-online.target. Here is the [Unit] section of my unit, which works well:

      [Unit]
      After=early-docker.service systemd-networkd-wait-online.service
      Requires=early-docker.service systemd-networkd-wait-online.service
      Before=early-docker.target
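
If ordering on a wait-online unit still isn't enough, another option might be to poll name resolution directly before the pull, since the errors in this thread are all DNS lookup failures. Below is a minimal sketch, not something taken from the CoreOS docs: wait_for_dns and the DNS_PROBE override are hypothetical names, and it assumes getent is available on the host.

```shell
# wait_for_dns HOST [TIMEOUT_SECONDS]
# Polls once per second until HOST resolves, or gives up after the timeout.
# Hypothetical helper; DNS_PROBE is an assumed override hook so a different
# lookup command can be swapped in.
wait_for_dns() {
    host=$1
    timeout=${2:-30}
    # Default probe: ask the system resolver via getent.
    probe=${DNS_PROBE:-getent hosts}
    elapsed=0
    # $probe is intentionally unquoted so "getent hosts" splits into words.
    until $probe "$host" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "gave up waiting for $host after ${timeout}s" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

Saved as a script that ends with wait_for_dns "$@" (the path /opt/bin/wait-for-dns is made up), it could run as ExecStartPre=/opt/bin/wait-for-dns index.docker.io 60 ahead of the pull line. Note that this only shows that the host can resolve the name at that moment; it doesn't guarantee the pull itself will succeed.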

Adding Restart and RestartSec to the [Service] section fixed the issue:

# Restart after crash
Restart=always
# Give the service 10 seconds to recover after the previous restart
RestartSec=10s
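
The reason this helps: Restart= also applies when an ExecStartPre= command fails, so systemd keeps starting the unit again every 10 seconds until the pull finally gets through. The same try, wait, try again idea can be sketched in plain shell if you'd rather retry only the pull. This is an illustration under assumptions, not a replacement for the unit settings: retry is a hypothetical helper, and unlike Restart=always it stops after a fixed number of attempts.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds, sleeping DELAY_SECONDS between attempts,
# and gives up after ATTEMPTS tries. Hypothetical helper for illustration.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    until "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}
```

For example, retry 6 10 /usr/bin/docker pull busybox would try the pull up to six times, ten seconds apart.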

