Zero-downtime Unicorn deploy

December 18, 2014 2.4k views


I'm trying to use the Ruby on Rails with Ubuntu image. It has an init script to manage the Unicorn process, but it doesn't do zero-downtime deploys.

I'm not familiar enough with init scripts, so I can't do this myself.

Here is a different flavour of init script that knows how to do this: https://github.com/ValencePM/capistrano-unicorn-init

Here is the official documentation for this feature of Unicorn: http://unicorn.bogomips.org/SIGNALS.html (see the "Procedure to replace a running unicorn executable" paragraph)


2 Answers

To do a "zero-downtime deploy" or "graceful reload" by sending the USR2 signal, you can add a new case to the Unicorn init script in /etc/init.d/unicorn:

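  upgrade)
        log_daemon_msg "Upgrading $DESC" $NAME || true
        # USR2 makes the running master re-exec a new master alongside itself
        if start-stop-daemon --stop --signal USR2 --quiet --oknodo --pidfile $PID; then
          sleep 3
          # QUIT gracefully shuts down the old master (pidfile renamed to .oldbin)
          if start-stop-daemon --stop --signal QUIT --quiet --oknodo --pidfile $PID.oldbin; then
              log_end_msg 0 || true
          else
              log_end_msg 1 || true
          fi
        else
          log_end_msg 1 || true
        fi
        ;;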

This sends the signal and then reaps the old processes when you run service unicorn upgrade.

Give it a try and let us know how it goes!

Thanks @asb, I'll try this.

I see that this script is autonomous: it sends the USR2 signal, sleeps (3 seconds should be enough for most web requests to finish), then sends a QUIT signal to the old master.

There is another approach, where Unicorn itself is responsible for killing any old master when forking workers.

before_fork do |server, worker|
  # When sent a USR2, Unicorn will suffix its pidfile with .oldbin and
  # immediately start loading up a new version of itself (loaded with a new
  # version of our app). When this new Unicorn is completely loaded
  # it will begin spawning workers. The first worker spawned will check to
  # see if an .oldbin pidfile exists. If so, this means we've just booted up
  # a new Unicorn and need to tell the old one that it can now die. To do so
  # we send it a QUIT.
  # Using this method we get 0 downtime deploys.

  old_pid = "#{server.config[:pid]}.oldbin"
  # File.exist? rather than File.exists?, which was removed in Ruby 3.2
  if File.exist?(old_pid) && server.pid != old_pid
    begin
      Process.kill("QUIT", File.read(old_pid).to_i)
    rescue Errno::ENOENT, Errno::ESRCH
      # someone else did our job for us
    end
  end
end

This snippet is in common use among Unicorn users. It is even in the examples shipped with the Unicorn source code.

I guess it might be possible to have an init script that starts by sending the USR2 signal, sleeps (maybe a bit longer), then checks whether the reload has actually happened.
Right now, I guess it fails (returning exit code 1) if the old PID file is missing (removed by Unicorn itself during the sleep).
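
As a rough sketch of that check (not tested against a real Unicorn, and the helper name upgrade_finished is made up for illustration), the init script could record the old master's PID before sending USR2, then poll until a different, live PID shows up in the pidfile and the .oldbin file is gone. This assumes the before_fork hook above is in place, so that the old master quits on its own and removes its .oldbin pidfile when it exits.

```shell
# Hypothetical helper, not part of the stock init script.
# Usage: upgrade_finished PIDFILE OLD_PID [TIMEOUT_SECONDS]
# Returns 0 once PIDFILE names a live process other than OLD_PID and
# PIDFILE.oldbin no longer exists; returns 1 if that never happens
# within TIMEOUT_SECONDS (default 30).
upgrade_finished() {
    pidfile=$1
    old_pid=$2
    timeout=${3:-30}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # The pidfile can be briefly absent while the new master boots.
        new_pid=$(cat "$pidfile" 2>/dev/null) || new_pid=""
        if [ -n "$new_pid" ] && [ "$new_pid" != "$old_pid" ] \
            && [ ! -e "$pidfile.oldbin" ] \
            && kill -0 "$new_pid" 2>/dev/null; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

In the upgrade case this would replace the fixed sleep: save the PID with old_pid=$(cat $PID), send USR2, then call upgrade_finished "$PID" "$old_pid" 60 and pass its result to log_end_msg.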

What do you think?
