WordPress (nginx cache + Apache + fcgid) pegging all 8 CPUs

ptn777 asked:

I have a WordPress multi-user site that pegs all of my CPUs at more than 90% usage:

top - 12:02:58 up 55 days,  5:25, 10 users,  load average: 20.51, 15.66, 14.90
Tasks: 294 total,  24 running, 270 sleeping,   0 stopped,   0 zombie
Cpu0  : 87.5%us,  8.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  4.5%si,  0.0%st
Cpu1  : 97.9%us,  1.9%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu2  : 96.0%us,  3.5%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.5%si,  0.0%st
Cpu3  : 97.6%us,  2.1%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu4  : 97.1%us,  2.7%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu5  : 97.9%us,  1.9%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu6  : 97.9%us,  1.6%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.5%si,  0.0%st
Cpu7  : 96.0%us,  3.5%sy,  0.0%ni,  0.3%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Mem:  14369424k total, 11903548k used,  2465876k free,   402360k buffers
Swap:  4063200k total,  3594784k used,   468416k free,  1484116k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
30658 apache    16   0  274m  97m 6304 R 62.1  0.7   0:12.49 php-cgi
30686 apache    16   0  213m  92m 6040 R 52.2  0.7   0:03.27 php-cgi
30685 apache    15   0  211m  87m 5764 S 50.3  0.6   0:04.50 php-cgi
28217 apache    16   0  529m 405m 6748 S 49.0  2.9   3:54.72 php-cgi
30468 apache    16   0  414m 291m 6452 R 48.5  2.1   0:49.78 php-cgi
29604 apache    15   0  258m 135m 6464 S 47.4  1.0   2:16.22 php-cgi
28308 apache    16   0  584m 408m 6724 R 43.9  2.9   3:43.07 php-cgi
28266 apache    16   0  550m 374m 6728 R 43.7  2.7   3:58.38 php-cgi
29573 apache    16   0  584m 407m 6592 R 36.8  2.9   1:59.88 php-cgi
30470 apache    16   0  219m  95m 6452 S 36.5  0.7   0:39.66 php-cgi
29138 apache    15   0  513m 334m 6528 S 33.6  2.4   2:03.14 php-cgi
30472 apache    17   0  441m 318m 6272 R 31.7  2.3   0:50.45 php-cgi
28283 apache    16   0  414m 291m 6580 R 29.3  2.1   3:53.06 php-cgi
29858 apache    16   0  251m 127m 6628 R 24.8  0.9   1:15.53 php-cgi
28253 apache    18   0  550m 374m 6580 R 24.5  2.7   4:08.05 php-cgi
30666 apache    15   0  217m  94m 5996 R 24.5  0.7   0:04.68 php-cgi
28208 apache    20   0  584m 407m 6436 R 24.2  2.9   4:36.36 php-cgi
29085 apache    25   0  358m 182m 6488 R 22.6  1.3   2:19.76 php-cgi
28258 apache    25   0  530m 407m 6512 R 22.4  2.9   3:58.70 php-cgi
29574 apache    16   0  530m 406m 6540 S 21.6  2.9   2:19.26 php-cgi
28947 apache    16   0  524m 401m 6476 R 14.1  2.9   2:32.33 php-cgi
28238 apache    15   0  488m 312m 6852 S 12.3  2.2   4:24.34 php-cgi
30464 apache    15   0  274m 151m 6176 R 11.2  1.1   0:19.67 php-cgi
28293 apache    16   0  269m 146m 6460 R  9.9  1.0   3:57.17 php-cgi
28205 apache    25   0  530m 407m 6496 R  9.6  2.9   4:05.49 php-cgi
30471 apache    19   0  263m 140m 6440 R  6.9  1.0   0:47.42 php-cgi

The output shows that the most CPU an individual process uses is ~60%, but there have been times when as many as 7 processes were each using more than 90% CPU.
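
To put a number on the listing above, here is a minimal Python sketch that totals the %CPU and RES columns for the php-cgi rows. It is only an illustration: the function names are made up for this example, the sample holds just the first three rows of the listing, and it assumes top's default per-core accounting, where 100% means one fully busy core.

```python
# Three rows copied from the top listing above; paste the full listing to total everything.
SAMPLE_TOP_ROWS = """\
30658 apache    16   0  274m  97m 6304 R 62.1  0.7   0:12.49 php-cgi
30686 apache    16   0  213m  92m 6040 R 52.2  0.7   0:03.27 php-cgi
28217 apache    16   0  529m 405m 6748 S 49.0  2.9   3:54.72 php-cgi
"""


def res_to_mib(value):
    """Convert a top memory field such as '97m' or '6304' to MiB.

    Assumption: a bare number is KiB, and the suffixes k/m/g mean KiB/MiB/GiB.
    """
    units = {"k": 1 / 1024, "m": 1.0, "g": 1024.0}
    suffix = value[-1].lower()
    if suffix in units:
        return float(value[:-1]) * units[suffix]
    return float(value) / 1024


def summarize_top_rows(text, command="php-cgi"):
    """Total %CPU and RES over the rows whose COMMAND column matches `command`."""
    count = 0
    cpu_percent = 0.0
    res_mib = 0.0
    for line in text.splitlines():
        fields = line.split()
        # A process row has 12 columns: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
        if len(fields) != 12 or fields[11] != command:
            continue
        count += 1
        cpu_percent += float(fields[8])
        res_mib += res_to_mib(fields[5])
    return {"processes": count, "cpu_percent": cpu_percent, "res_mib": res_mib}


if __name__ == "__main__":
    summary = summarize_top_rows(SAMPLE_TOP_ROWS)
    print(summary)
    # Under the per-core assumption, cpu_percent / 100 approximates the number of busy cores.
    print("approx. cores busy:", round(summary["cpu_percent"] / 100, 2))
```

Run against the whole listing, this gives a rough idea of how many cores the php-cgi pool keeps busy and how much resident memory it holds.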

The site runs as follows:

  1. nginx works as a reverse proxy, serving every static file that it can and caching pages via the proxy_cache directive.

  2. It delegates to Apache when PHP scripts are required. These are run via mod_cgi using the ExecCGI option.

  3. Both Apache and nginx compress every human-readable file.

  4. To avoid hitting MySQL all the time, we save HTML fragments in memcached, which currently caches between 2 and 4 MB, as reported by the stats command in a telnet connection.

  5. There are also some counters kept in a Redis database, mostly to count page views for every post.

  6. No WP Super Cache (nginx does the caching), no XCache.

I’m at a loss as to how to determine what exactly each php-cgi process is doing that demands so much CPU. The site was heavily modified by several different software teams before we started maintaining it.

The PHP error log shows mostly these errors:

  1. “Cannot redeclare class FacebookRestClientException”
  2. “Call to undefined function e_()”
  3. Invalid SQL syntax, mostly here: “WHERE post_id = xxxxx AND blog_id = “
  4. “Allowed memory size of 268,435,456 bytes exhausted”
  5. “Call to undefined method Services_JSON::encodeUnsafe()”

None of these actually perform any computation, so they can’t be the source of the CPU problem.

I tried tracing system calls and saw lstat, read, write and access, which would generate waiting rather than CPU load if they were the problem (correct?). There were also calls to both poll and select.
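
The per-CPU lines in the top output can be used to check that reasoning: time spent blocked on I/O shows up as %wa, kernel time as %sy, and time in the process's own code as %us. The following Python sketch, with hypothetical function names and only the first two Cpu lines as sample input, pulls those fields out and averages them.

```python
import re

# Two lines copied from the top header above; paste all eight to average every core.
SAMPLE_CPU_LINES = """\
Cpu0  : 87.5%us,  8.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  4.5%si,  0.0%st
Cpu1  : 97.9%us,  1.9%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
"""

CPU_LINE = re.compile(r"^Cpu(\d+)\s*:\s*(.*)$")
FIELD = re.compile(r"([\d.]+)%(\w+)")


def parse_cpu_lines(text):
    """Return {cpu_index: {"us": ..., "sy": ..., "wa": ..., ...}} from top's Cpu lines."""
    cpus = {}
    for line in text.splitlines():
        match = CPU_LINE.match(line.strip())
        if not match:
            continue
        fields = {name: float(value) for value, name in FIELD.findall(match.group(2))}
        cpus[int(match.group(1))] = fields
    return cpus


def mean_field(cpus, field):
    """Average one field (for example "us" or "wa") across all parsed CPUs."""
    return sum(values[field] for values in cpus.values()) / len(cpus)


if __name__ == "__main__":
    cpus = parse_cpu_lines(SAMPLE_CPU_LINES)
    for name in ("us", "sy", "wa"):
        print(name, round(mean_field(cpus, name), 1))
```

In the snapshot above %wa is 0.0 on every core while %us sits in the high 80s and 90s, which suggests the time is going into user-space PHP execution rather than into waiting on those system calls.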

Could someone give me pointers as to what to check next?

My answer:


Your problem is here:

No WP Super Cache (nginx does the caching), no XCache.

Install Zend OPcache (this answer originally said APC) and W3 Total Cache and watch your CPU usage drop back down to almost nothing.

Zend OPcache alone should give you some breathing room.
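
Why an opcode cache saves CPU: without one, PHP parses and compiles every script again on every request, and a WordPress request pulls in a lot of files. The sketch below is a Python analogy of the idea, not how OPcache is actually implemented. The class and function names are invented for illustration. It keeps compiled code keyed by file path, recompiles only when the file's modification time changes, and counts how many compilations 100 simulated requests need.

```python
import os
import tempfile


class BytecodeCache:
    """Toy opcode cache: compiled code objects keyed by path, invalidated by mtime."""

    def __init__(self):
        self._entries = {}  # path -> (mtime_ns, code object)
        self.compiles = 0   # how many times source was actually compiled

    def load(self, path):
        mtime = os.stat(path).st_mtime_ns
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]  # cache hit: skip parsing and compiling
        with open(path, encoding="utf-8") as handle:
            source = handle.read()
        code = compile(source, path, "exec")
        self.compiles += 1
        self._entries[path] = (mtime, code)
        return code


def handle_request(cache, path):
    """Simulate one request: load the (possibly cached) code, run it, call render()."""
    namespace = {}
    exec(cache.load(path), namespace)
    return namespace["render"]()


def demo(requests=100):
    """Serve `requests` simulated requests for one script and report compile count."""
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "page.py")
        with open(script, "w", encoding="utf-8") as handle:
            handle.write("def render():\n    return 'hello'\n")
        cache = BytecodeCache()
        outputs = [handle_request(cache, script) for _ in range(requests)]
        return outputs, cache.compiles


if __name__ == "__main__":
    outputs, compiles = demo()
    print(len(outputs), "requests served with", compiles, "compilation(s)")
```

The script still runs on every request; only the repeated compile step is skipped, which is the part an opcode cache removes.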

Note that W3 Total Cache is not fully multi-site aware, and so it has to be configured on each site individually. It can be set up to use your existing memcached for caching.

You can also get rid of Apache. It’s doing absolutely nothing for you.

(Note: APC is deprecated and has proven to be unreliable in practice. I currently recommend using Zend OPcache instead.)


View the full question and answer on Server Fault.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.