By: EpicCDN

PHP-FPM max children reached exactly every 10 minutes

May 4, 2015
Security PHP Nginx WordPress Caching CentOS

I run nginx + php-fpm on a client's server. It has been working fine for months, but this last week it suddenly started throwing warnings that the php-fpm pool seems busy:

[04-May-2015 10:35:26] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 2 idle, and 87 total children
[04-May-2015 10:35:27] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 16 children, there are 5 idle, and 95 total children
[04-May-2015 10:35:28] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 40 idle, and 111 total children
[04-May-2015 10:40:09] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 13 idle, and 138 total children
[04-May-2015 10:40:10] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 16 children, there are 22 idle, and 146 total children
[04-May-2015 10:40:11] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 40 idle, and 162 total children
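
To check how regular the bursts really are, the warnings can be grouped into bursts and the gaps between them measured. The following is a minimal sketch, assuming the php-fpm log format shown in the excerpt above; the function names and the 60-second grouping threshold are illustrative choices, not anything php-fpm provides.

```python
import re
from datetime import datetime

# Matches the "seems busy" warning format shown in the excerpt above.
# Note: %b (month abbreviation) is locale dependent; "May" parses in English/C locales.
WARNING_RE = re.compile(
    r"^\[(?P<ts>\d{2}-\w{3}-\d{4} \d{2}:\d{2}:\d{2})\] WARNING: "
    r"\[pool (?P<pool>[^\]]+)\] seems busy"
    r".*?spawning (?P<spawning>\d+) children, there are (?P<idle>\d+) idle, "
    r"and (?P<total>\d+) total children"
)


def parse_warnings(lines):
    """Return (timestamp, spawning, idle, total) for each 'seems busy' line."""
    events = []
    for line in lines:
        m = WARNING_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group("ts"), "%d-%b-%Y %H:%M:%S")
            events.append(
                (ts, int(m.group("spawning")), int(m.group("idle")), int(m.group("total")))
            )
    return events


def burst_starts(events, gap_seconds=60):
    """Collapse warnings that follow each other within gap_seconds into one burst.

    gap_seconds is an assumed threshold; returns the first timestamp of each burst.
    """
    starts = []
    last = None
    for ts, *_ in events:
        if last is None or (ts - last).total_seconds() > gap_seconds:
            starts.append(ts)
        last = ts
    return starts


def burst_intervals(starts):
    """Seconds between the starts of consecutive bursts."""
    return [(b - a).total_seconds() for a, b in zip(starts, starts[1:])]
```

Feeding the whole php-fpm error log through `parse_warnings`, then `burst_starts` and `burst_intervals`, gives the actual spacing between bursts, which can then be compared against cron schedules, cache TTLs, or monitoring intervals.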

This happens at exact 10-minute intervals, out of nowhere. Analytics says the number of current connections stays the same, yet with some help from netstat I see an increase in connections at the exact moment this happens:

During the busy warning:

Mon May 4 10:30:26 CST 2015
Nginx:
161
From Varnish:
124
Direct:
25
PHP-FPM:
126

And here is how it usually looks between spikes:

Mon May 4 10:33:37 CST 2015
Nginx:
58
From Varnish:
21
Direct:
22
PHP-FPM:
91

Now, I checked the cron jobs and none of the jobs we have set runs at such an interval. I also checked the Varnish servers and the increase in connections shows there too, so I think it is an external factor. But due to the high traffic, it has been hard for me to trace the source and block it.

So, any good ideas on how to trace the source when the logs don't show anything solid to follow?
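
One way to narrow this down under heavy traffic is to bucket access-log requests by minute and rank client IPs by how much more they request during spike minutes than during quiet ones. This is only a sketch under stated assumptions: it assumes the access log uses nginx's default "combined" format (client address first, timestamp in square brackets), and that the logged address is the real client rather than the Varnish server. The function names are hypothetical.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# Assumes the "combined" access-log layout: ip - user [time] "request" ...
LINE_RE = re.compile(r"^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\]")


def per_minute_counts(lines):
    """Map each minute to a Counter of client IPs seen in that minute."""
    buckets = defaultdict(Counter)
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        # Drop the timezone offset; only the local minute matters here.
        stamp = m.group("ts").split()[0]
        ts = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S")
        buckets[ts.replace(second=0)][m.group("ip")] += 1
    return buckets


def spike_suspects(buckets, spike_minutes, top=10):
    """Rank IPs by extra requests per minute during spikes versus other minutes.

    spike_minutes is a set of datetimes truncated to the minute, for example
    taken from the timestamps of the php-fpm "seems busy" warnings.
    """
    spike, quiet = Counter(), Counter()
    n_spike = n_quiet = 0
    for minute, counts in buckets.items():
        if minute in spike_minutes:
            spike.update(counts)
            n_spike += 1
        else:
            quiet.update(counts)
            n_quiet += 1
    excess = {}
    for ip, n in spike.items():
        spike_rate = n / max(n_spike, 1)
        quiet_rate = quiet[ip] / max(n_quiet, 1)
        excess[ip] = spike_rate - quiet_rate
    return sorted(excess.items(), key=lambda kv: kv[1], reverse=True)[:top]
```

The same bucketing could be keyed on the request path or user agent instead of the IP if the extra load turns out to be spread across many addresses.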
