Load Average above 20 for single core vps on debian

ananthan asked:

OS: Debian 6.0 RAM:3072 M, CPU:single core.

top:

top - 08:56:43 up 21 days, 12:37,  1 user,  load average: 28.38, 22.48, 15.95
Tasks:   8 total,   1 running,   7 sleeping,   0 stopped,   0 zombie
Cpu(s):  6.3%us, 14.7%sy,  0.0%ni, 17.5%id, 57.0%wa,  0.1%hi,  4.4%si,  0.0%st
Mem:   3145728k total,    28144k used,  3117584k free,    10236k buffers
Swap:        0k total,        0k used,        0k free,        0k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1247 root      20   0 18932 1240 1000 R    0  0.0   0:00.06 top
    1 root      20   0  8356  724  676 S    0  0.0   7:41.97 init
 3277 root      20   0  208m  11m 5652 S    0  0.4   0:00.17 apache2
 3847 root      20   0 22420 1032  788 S    0  0.0   0:12.66 cron
 8809 www-data  20   0  208m 7400 1168 S    0  0.2   0:00.00 apache2
26429 root      20   0 70488 3368 2652 S    0  0.1   0:00.80 sshd
26539 root      20   0 19300 2124 1564 S    0  0.1   0:00.16 bash
29551 root      20   0 49168 1152  604 S    0  0.0   0:00.12 sshd

ps aux:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   8356   724 ?        Ss   Jul23   7:41 init [2]
root      3277  0.0  0.3 213808 11828 ?        Ss   08:17   0:00 /usr/sbin/apache2 -k start
root      3847  0.0  0.0  22420  1032 ?        Ss   Jul23   0:12 /usr/sbin/cron
root      5870  0.0  0.0  16332  1140 pts/10   R+   08:58   0:00 ps aux
www-data  8809  0.0  0.2 213944  7400 ?        S    08:32   0:00 /usr/sbin/apache2 -k start
root     26429  0.0  0.1  70488  3368 ?        Ss   08:13   0:00 sshd: root@pts/10
root     26539  0.0  0.0  19300  2124 pts/10   Ss   08:13   0:00 -bash
root     29551  0.0  0.0  49168  1152 ?        Ss   Jul23   0:00 /usr/sbin/sshd

How can I find out which process is causing the problem? The load average comes down after some time, but can anyone help me find the cause of this behavior?

update: load-average 233

top - 10:29:01 up 21 days, 14:09,  2 users,  load average: 237.96, 183.80, 98.76
Tasks:  15 total,   1 running,  14 sleeping,   0 stopped,   0 zombie
Cpu(s):  9.3%us, 14.2%sy,  0.0%ni,  0.0%id, 72.0%wa,  0.0%hi,  4.6%si,  0.0%st
Mem:   3145728k total,    51408k used,  3094320k free,    10272k buffers
Swap:        0k total,        0k used,        0k free,        0k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    1 root      20   0  8356  724  676 S    0  0.0   7:44.70 init
 2031 root      20   0 70592 3388 2652 S    0  0.1   0:00.14 sshd
 2664 root      20   0 19300 2120 1556 S    0  0.1   0:00.02 bash
 3277 root      20   0  210m  11m 5680 S    0  0.4   0:00.57 apache2
 3847 root      20   0 22420 1032  788 S    0  0.0   0:12.70 cron
 4041 www-data  20   0  211m 7792 1228 S    0  0.2   0:00.00 apache2
13767 root      20   0 32800 1112  812 S    0  0.0   0:00.01 cron
14742 smmsp     20   0 52508 3940 2632 D    0  0.1   0:00.00 sendmail
15769 root      20   0 69232 3092 2408 S    0  0.1   0:00.01 sshd
16154 www-data  20   0  211m 7716 1228 S    0  0.2   0:00.00 apache2
17260 sshd      20   0 50616 1372  728 S    0  0.0   0:00.00 sshd
18436 root      20   0 18932 1248 1004 R    0  0.0   0:00.02 top
26429 root      20   0 70488 3376 2652 S    0  0.1   0:01.11 sshd
26539 root      20   0 19300 2124 1564 S    0  0.1   0:00.29 bash
29551 root      20   0 49168 1152  604 S    0  0.0   0:00.14 sshd

My answer:


Your server is spending an inordinate amount of time in I/O wait.

57.0%wa

This means… disk.
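
If you want to log that figure over time instead of eyeballing it, the summary line is easy to parse. Here is a minimal sketch in Python, assuming the older procps layout shown in the question (`57.0%wa`); newer versions of top print `57.0 wa` and would need a different pattern. The function name `parse_cpu_line` is made up for illustration.

```python
import re

def parse_cpu_line(line):
    """Return {'us': 6.3, 'sy': 14.7, ...} from a top 'Cpu(s):' summary line."""
    # Each field looks like '57.0%wa' (assumed: the older procps format).
    pairs = re.findall(r"(\d+(?:\.\d+)?)%(\w+)", line)
    return {name: float(value) for value, name in pairs}

# The summary line from the first top listing in the question.
line = ("Cpu(s):  6.3%us, 14.7%sy,  0.0%ni, 17.5%id, "
        "57.0%wa,  0.1%hi,  4.4%si,  0.0%st")
cpu = parse_cpu_line(line)
print(cpu["wa"])  # prints 57.0
```

Anything that stays well above a few percent in `wa` for minutes at a time points at storage rather than CPU.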

A likely cause is that the server your VPS runs on is having problems with its disk(s). Those problems include, but are not limited to: a failing disk; non-enterprise-grade disks; or your host trying to run a VPS business on creatively recycled hardware.

It could also be that one of your own processes is causing an unusually high amount of disk activity. Unfortunately, that information isn’t reported in your top or ps listing, which makes me suspect you have a low-end OpenVZ-based VPS. (If so, that puts you back at the previous paragraph.)
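
What top does show is the state column. On Linux, tasks in uninterruptible sleep (state `D`, usually waiting on disk) count toward the load average, which is how the load can climb past 200 while nothing is using the CPU. The sketch below, in Python, scans `/proc` for such tasks. It assumes a standard Linux `/proc` layout; the helper names are hypothetical, and inside a container it will only see your own processes.

```python
import os

def parse_stat(text):
    """Split one /proc/<pid>/stat line into (pid, command, state)."""
    # The command sits in parentheses and may itself contain spaces or ')',
    # so locate the first '(' and the last ')'.
    left, right = text.index("("), text.rindex(")")
    pid = int(text[:left])
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state

def blocked_tasks(proc="/proc"):
    """Return (pid, command) for every process currently in state D."""
    if not os.path.isdir(proc):
        return []  # not a Linux system, or /proc is not mounted
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, ValueError):
            continue  # process exited while we were scanning
        if state == "D":
            found.append((pid, comm))
    return found

if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
```

Run it a few times in a row: a process that shows up in `D` repeatedly (like sendmail in the second listing) is stuck waiting on I/O, though that alone does not say whether it is the cause or a victim.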

As for resolving the problem, the first step is to rule out your own processes as the source of the disk activity. The iotop program, as mentioned by @Shi, is good for this, though my bet is you’ll find nothing. After that, contact your host and report a problem with the server’s disks; they will have to diagnose it, since you can’t see the underlying hardware from inside the container.
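
If iotop is not installed or refuses to run inside the container, a rough substitute is to sample the per-process counters in `/proc/<pid>/io` twice and compare. This is only a sketch in Python: it assumes the kernel exposes that file (it needs task I/O accounting, which may be missing on some VPS kernels), it usually needs root to read other users’ processes, and the function names are invented for illustration. It is not how iotop itself works.

```python
import os
import time

def parse_io(text):
    """Turn the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for row in text.splitlines():
        key, _, value = row.partition(":")
        if value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters

def snapshot(proc="/proc"):
    """Map pid -> bytes read from plus written to storage so far."""
    if not os.path.isdir(proc):
        return {}  # not a Linux system, or /proc is not mounted
    totals = {}
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "io")) as handle:
                io = parse_io(handle.read())
        except OSError:
            continue  # process gone, file absent, or not readable
        totals[int(entry)] = io.get("read_bytes", 0) + io.get("write_bytes", 0)
    return totals

def busiest(before, after, limit=5):
    """Rank pids by how many bytes they moved between two snapshots."""
    delta = {pid: after[pid] - before.get(pid, 0) for pid in after}
    return sorted(delta.items(), key=lambda item: item[1], reverse=True)[:limit]

if __name__ == "__main__":
    first = snapshot()
    time.sleep(2)  # sampling interval, chosen arbitrarily
    for pid, moved in busiest(first, snapshot()):
        print(pid, moved)
```

If every process reports close to zero bytes while `wa` stays high, that supports the theory that the disk trouble is outside your container.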

(And later, when you’re shopping for a new VPS provider, steer clear of any who use OpenVZ. It’s been my experience that the vast majority of them are run pretty badly.)


View the full question and answer on Server Fault.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.