I run supervisord (the latest version, 4.2.0) on my Ubuntu 18.04 droplets in different regions.
Today I saw that, at exactly the same time, CPU usage on almost all of the droplets dropped from 20-30% to almost zero. It turned out that supervisord had stopped working.
In the supervisor logs I can see that something sent SIGTERM:
2020-05-16 12:33:59,831 WARN received SIGTERM indicating exit request
The only relevant answer I found on Google is https://stackoverflow.com/questions/28440543/supervisor-gets-a-sigterm-for-some-reason-quits-and-stops-all-its-processes
However, I checked, and the date of the last unattended upgrade is different, though the minutes are the same, which is suspicious:
Start-Date: 2020-05-15 06:33:13
Commandline: /usr/bin/unattended-upgrade
Upgrade: libjson-c3:amd64 (0.12.1-1.3, 0.12.1-1.3ubuntu0.1)
End-Date: 2020-05-15 06:33:13
Notice that both the hour and the day are different.
Is it possible to figure out why supervisord stopped working, and how can I prevent this from happening in the future? This is crucial for me; I need it up and running 100% of the time :(
Hi @Akcium,
It seems supervisord was killed by the server. When a server runs out of memory, the kernel kills processes with SIGKILL. I'd recommend checking whether this is what happened by grepping /var/log/messages (on Ubuntu, /var/log/syslog or /var/log/kern.log) for oom or kill.
Most probably the time in those logs will match what you saw in your supervisord log. At that point you know the server killed the processes, but you still need to find out why. If you see oom entries, it means the server was out of memory. You can confirm this with the sar command, which shows what your memory usage was over a given period of time.
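A minimal sketch of the log check described above, assuming a stock Ubuntu 18.04 droplet where kernel messages land in /var/log/syslog and /var/log/kern.log. The function name oom_events and the search patterns are only illustrative; adjust them to whatever your kernel actually logs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print lines that look like OOM-killer activity.
# Usage: oom_events FILE...
oom_events() {
    # -i: ignore case, -H: prefix each match with its file name,
    # -E: extended regex so the alternatives can be joined with '|'.
    grep -iHE 'out of memory|oom-killer|killed process' "$@"
}

# Example invocation (paths are an assumption for Ubuntu with rsyslog):
#   oom_events /var/log/syslog /var/log/kern.log
```

If matches show up close to 12:33 on 2020-05-16, compare them against memory usage for that window. Something like sar -r -s 12:30:00 -e 12:40:00 should show it, adding -f with that day's data file (under /var/log/sysstat on Ubuntu) for a past day. This only works if the sysstat package was already installed and collecting data before the incident.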
That would be a good way to start troubleshooting.
Regards, KDSys