Can I check why a server keeps slowing down?

Posted: Thu Jul 23, 2009 11:58 am
by coolemail
I have a server with CentOS 5, Plesk 8.6, and ASL.

The server has been great until recently, when I noticed that access to it slows right down: it takes about 30 seconds to send an email from Horde, and navigating between pages in the Plesk CP also takes about 30 seconds. Shell access takes just as long to reach the command prompt after logging in with the password.

If I reboot the server, everything gets really fast again as it should be, but within about 8 hrs, it has slowed right down again.

Can someone help me identify what is causing it to be so slow? And is there something I can do to stop it? You can imagine that it is causing a lot of grief to people each day, and I do not think that rebooting it each time is the right thing to do.

Very grateful, as ever, for any help.

Re: Can I check why a server keeps slowing down?

Posted: Thu Jul 23, 2009 1:42 pm
by nobody
When the problem occurs, run top in a shell and compare the CPU and RAM usage of the processes to identify what is going wrong.

Re: Can I check why a server keeps slowing down?

Posted: Thu Jul 23, 2009 6:43 pm
by coolemail
nobody wrote:When the problem occurs run under shell top ...
Thank you nobody. Please excuse my ignorance, but can you tell me what shell command will achieve this? Sorry to be thick!

Re: Can I check why a server keeps slowing down?

Posted: Fri Jul 24, 2009 12:07 am
by Galactic Zero
The command is "top".
So, SSH to your box and then run that.

Re: Can I check why a server keeps slowing down?

Posted: Fri Jul 24, 2009 4:31 am
by coolemail
Thank you Galactic Zero.

Interestingly, when it does go slow, it stays slow until the next reboot. I have just run top and got the following.
[root@plesk2 ~]# top
top - 08:35:58 up 1 day, 2:29, 1 user, load average: 29.64, 29.24, 28.54
Tasks: 310 total, 31 running, 256 sleeping, 0 stopped, 23 zombie
Cpu(s): 82.1%us, 17.8%sy, 0.1%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2057308k total, 2040752k used, 16556k free, 46876k buffers
Swap: 2031608k total, 9852k used, 2021756k free, 463664k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
23185 thewhite 20 0 128m 2728 1712 R 10 0.1 114:23.07 crond
15297 thewhite 20 0 128m 2728 1712 R 9 0.1 88:27.10 crond
24869 thewhite 20 0 128m 2728 1712 R 9 0.1 183:39.85 crond
30968 thewhite 20 0 128m 2728 1712 R 9 0.1 104:09.98 crond
9564 thewhite 20 0 128m 2728 1712 R 9 0.1 0:28.66 crond
27127 thewhite 20 0 128m 2728 1712 R 9 0.1 108:28.84 crond
28836 thewhite 20 0 128m 2728 1712 R 9 0.1 74:57.92 crond
2590 thewhite 20 0 128m 2728 1712 R 8 0.1 99:24.96 crond
7970 thewhite 20 0 128m 2728 1712 R 8 0.1 138:20.12 crond
10698 thewhite 20 0 128m 2728 1712 R 8 0.1 91:52.19 crond
11764 thewhite 20 0 128m 2728 1712 R 8 0.1 132:14.14 crond
15353 thewhite 20 0 128m 2728 1712 R 8 0.1 125:33.54 crond
17988 thewhite 20 0 128m 2728 1712 R 8 0.1 216:51.02 crond
18903 thewhite 20 0 128m 2728 1712 R 8 0.1 119:59.97 crond
19444 thewhite 20 0 128m 2728 1712 R 8 0.1 85:34.13 crond
25895 thewhite 20 0 128m 2728 1712 R 8 0.1 79:12.35 crond
28743 thewhite 20 0 128m 2728 1712 R 8 0.1 167:59.79 crond
15298 thewhite 20 0 154m 9308 5928 S 8 0.5 75:21.48 php
28837 thewhite 20 0 154m 9296 5928 S 8 0.5 64:58.96 php
23189 thewhite 20 0 154m 9300 5928 S 8 0.5 96:50.24 php
24870 thewhite 20 0 154m 9312 5928 S 8 0.5 152:01.51 php
30970 thewhite 20 0 154m 9304 5928 S 8 0.5 86:21.07 php
4095 thewhite 20 0 128m 2728 1712 R 7 0.1 146:58.25 crond
5466 thewhite 20 0 128m 2728 1712 R 7 0.1 2:58.57 crond
6418 thewhite 20 0 128m 2728 1712 R 7 0.1 95:45.80 crond
9565 thewhite 20 0 154m 9300 5928 S 7 0.5 0:24.12 php
15156 thewhite 20 0 128m 2728 1712 R 7 0.1 231:50.42 crond
17989 thewhite 20 0 154m 9300 5928 S 7 0.5 179:56.20 php
28748 thewhite 20 0 154m 9308 5928 R 7 0.5 147:37.55 php
2597 thewhite 20 0 154m 9308 5928 S 7 0.5 84:12.88 php
7979 thewhite 20 0 154m 9196 5928 S 7 0.4 115:38.25 php
11769 thewhite 20 0 154m 9300 5928 S 7 0.5 110:41.81 php
15363 thewhite 20 0 154m 9308 5928 S 7 0.5 105:34.15 php
18906 thewhite 20 0 154m 9312 5928 S 7 0.5 98:25.30 php
19447 thewhite 20 0 154m 9308 5928 S 7 0.5 72:13.78 php
21060 thewhite 20 0 128m 2728 1712 R 7 0.1 192:21.46 crond
23068 thewhite 20 0 128m 2728 1712 R 7 0.1 82:26.11 crond
25897 thewhite 20 0 154m 9304 5928 S 7 0.5 66:00.98 php
27128 thewhite 20 0 154m 9304 5928 S 7 0.5 91:02.06 php
32584 thewhite 20 0 128m 2728 1712 R 7 0.1 153:18.84 crond
10700 thewhite 20 0 154m 9184 5928 S 7 0.4 76:27.04 php
4096 thewhite 20 0 154m 9312 5928 S 6 0.5 124:51.54 php
21065 thewhite 20 0 154m 9308 5928 S 6 0.5 170:44.87 php
32593 thewhite 20 0 154m 9312 5928 S 6 0.5 133:57.83 php
5469 thewhite 20 0 154m 9308 5928 S 6 0.5 2:29.57 php
6424 thewhite 20 0 154m 9304 5928 S 6 0.5 81:14.23 php
15157 thewhite 20 0 154m 9300 5928 S 6 0.5 192:26.32 php
23073 thewhite 20 0 154m 9304 5928 S 6 0.5 68:14.97 php
9568 thewhite 20 0 3856 528 432 S 2 0.0 0:06.94 qmail-inject
9570 drweb 20 0 2980 1964 612 S 2 0.1 0:06.22 qmail-queue
[root@plesk2 ~]#
Then I turned off that one domain, and the difference was amazing:
[root@plesk2 ~]# top
top - 09:28:21 up 1 day, 3:21, 1 user, load average: 2.26, 2.22, 4.86
Tasks: 186 total, 4 running, 181 sleeping, 0 stopped, 1 zombie
Cpu(s): 23.1%us, 3.6%sy, 0.0%ni, 73.2%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2057308k total, 1893452k used, 163856k free, 195108k buffers
Swap: 2031608k total, 6052k used, 2025556k free, 492728k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
20969 thewhite 20 0 128m 2728 1712 R 56 0.1 15:09.35 crond
20970 thewhite 20 0 154m 9308 5928 R 48 0.5 12:51.52 php
20981 thewhite 20 0 3860 532 432 S 9 0.0 2:22.22 qmail-inject
26438 qmaild 20 0 33672 1512 1156 S 1 0.1 0:00.10 qmail-smtpd
30221 root 20 0 12764 1348 928 R 1 0.1 0:04.20 top
1 root 20 0 10364 752 628 S 0 0.0 0:01.12 init
2 root 15 -5 0 0 0 S 0 0.0 0:00.00 kthreadd
3 root RT -5 0 0 0 S 0 0.0 0:00.20 migration/0
4 root 15 -5 0 0 0 S 0 0.0 0:00.34 ksoftirqd/0
5 root RT -5 0 0 0 S 0 0.0 0:00.12 watchdog/0
6 root RT -5 0 0 0 S 0 0.0 0:00.18 migration/1
7 root 15 -5 0 0 0 S 0 0.0 0:00.08 ksoftirqd/1
8 root RT -5 0 0 0 S 0 0.0 0:00.00 watchdog/1
9 root RT -5 0 0 0 S 0 0.0 0:00.16 migration/2
10 root 15 -5 0 0 0 S 0 0.0 0:00.18 ksoftirqd/2
11 root RT -5 0 0 0 S 0 0.0 0:00.00 watchdog/2
12 root RT -5 0 0 0 S 0 0.0 0:00.24 migration/3
13 root 15 -5 0 0 0 S 0 0.0 0:00.22 ksoftirqd/3
14 root RT -5 0 0 0 S 0 0.0 0:00.00 watchdog/3
15 root 15 -5 0 0 0 S 0 0.0 0:08.78 events/0
16 root 15 -5 0 0 0 S 0 0.0 0:00.30 events/1
17 root 15 -5 0 0 0 S 0 0.0 0:00.36 events/2
18 root 15 -5 0 0 0 S 0 0.0 0:00.34 events/3
19 root 15 -5 0 0 0 S 0 0.0 0:00.00 khelper
88 root 15 -5 0 0 0 S 0 0.0 0:00.00 kintegrityd/0
89 root 15 -5 0 0 0 S 0 0.0 0:00.00 kintegrityd/1
90 root 15 -5 0 0 0 S 0 0.0 0:00.00 kintegrityd/2
91 root 15 -5 0 0 0 S 0 0.0 0:00.00 kintegrityd/3
92 root 15 -5 0 0 0 S 0 0.0 0:00.04 kblockd/0
93 root 15 -5 0 0 0 S 0 0.0 0:00.04 kblockd/1
94 root 15 -5 0 0 0 S 0 0.0 0:00.28 kblockd/2
95 root 15 -5 0 0 0 S 0 0.0 0:00.08 kblockd/3
97 root 15 -5 0 0 0 S 0 0.0 0:00.00 kacpid
98 root 15 -5 0 0 0 S 0 0.0 0:00.00 kacpi_notify
176 root 15 -5 0 0 0 S 0 0.0 0:00.00 cqueue
180 root 15 -5 0 0 0 S 0 0.0 0:00.00 ata/0
181 root 15 -5 0 0 0 S 0 0.0 0:00.00 ata/1
182 root 15 -5 0 0 0 S 0 0.0 0:00.00 ata/2
183 root 15 -5 0 0 0 S 0 0.0 0:00.00 ata/3
184 root 15 -5 0 0 0 S 0 0.0 0:00.00 ata_aux
186 root 15 -5 0 0 0 S 0 0.0 0:00.00 ksuspend_usbd
191 root 15 -5 0 0 0 S 0 0.0 0:00.00 khubd
194 root 15 -5 0 0 0 S 0 0.0 0:00.00 kseriod
260 root 15 -5 0 0 0 S 0 0.0 0:21.60 kswapd0
307 root 15 -5 0 0 0 S 0 0.0 0:00.00 aio/0
308 root 15 -5 0 0 0 S 0 0.0 0:00.00 aio/1
309 root 15 -5 0 0 0 S 0 0.0 0:00.00 aio/2
310 root 15 -5 0 0 0 S 0 0.0 0:00.00 aio/3
482 root 15 -5 0 0 0 S 0 0.0 0:00.00 scsi_eh_0
484 root 15 -5 0 0 0 S 0 0.0 0:00.00 scsi_eh_1
[root@plesk2 ~]#
Are the crond commands mail failure emails, possibly? I found a "loop" of undeliverable mails in that domain which I resolved.

Is there anything else of concern that is highlighted by the above results?

THANK YOU both for helping me identify this.
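
A quick way to see which account is responsible in a listing like the ones above is to tally the process rows by user and command. The following is a minimal Python sketch, not something anyone in the thread ran: it assumes the 12-column row layout shown in the pasted output, and `tally_top` and the shortened `sample` are illustrative names and data.

```python
from collections import Counter


def tally_top(lines):
    """Count processes and sum %CPU per (user, command) from top's process rows."""
    counts = Counter()
    cpu = Counter()
    for line in lines:
        fields = line.split()
        # A process row in this layout has 12 columns and starts with a numeric PID.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        user, pct, command = fields[1], fields[8], fields[11]
        try:
            share = float(pct)
        except ValueError:
            continue  # not a process row after all
        counts[(user, command)] += 1
        cpu[(user, command)] += share
    return counts, cpu


# A few rows copied from the listings in this thread (shortened for illustration).
sample = """\
top - 14:42:21 up 1 day, 8:35, 1 user, load average: 15.63, 15.70, 15.55
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1589 thewhite 20 0 128m 2728 1712 R 17 0.1 70:00.52 crond
30455 thewhite 20 0 128m 2728 1712 R 17 0.1 7:17.08 crond
14267 thewhite 20 0 154m 9304 5928 S 14 0.5 28:51.68 php
2254 popuser 20 0 174m 57m 2488 S 5 2.9 0:04.62 spamd
""".splitlines()

counts, cpu = tally_top(sample)
for key, n in counts.most_common():
    print(key, "processes:", n, "total %CPU:", cpu[key])
```

To feed it live data, top's batch mode (`top -b -n 1`) prints a single snapshot as plain text, which could be split into lines and passed to `tally_top`.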

Re: Can I check why a server keeps slowing down?

Posted: Fri Jul 24, 2009 8:34 am
by Highland
If you don't have your server set to deny bad addresses, your machine could be trying to bounce messages to spam addresses that don't exist.

Re: Can I check why a server keeps slowing down?

Posted: Fri Jul 24, 2009 9:43 am
by coolemail
For that domain, I have set it to REJECT mail to non-existent users. Is there another way I should be doing this?

What I found this morning was that they had mail forwarding set up from one account to another and had spelt the mailname incorrectly, so each message bounced as undeliverable, the bounce went back to the first address, and so it went on! That has now been corrected. But the number of tasks for that domain is creeping up again. Is there anything I can do to stop it?
top - 14:42:21 up 1 day, 8:35, 1 user, load average: 15.63, 15.70, 15.55
Tasks: 226 total, 17 running, 197 sleeping, 0 stopped, 12 zombie
Cpu(s): 82.7%us, 17.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2057308k total, 1506724k used, 550584k free, 57592k buffers
Swap: 2031608k total, 5992k used, 2025616k free, 180804k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1589 thewhite 20 0 128m 2728 1712 R 17 0.1 70:00.52 crond
30455 thewhite 20 0 128m 2728 1712 R 17 0.1 7:17.08 crond
5210 thewhite 20 0 128m 2728 1712 R 17 0.1 55:27.54 crond
18488 thewhite 20 0 128m 2728 1712 R 17 0.1 28:08.48 crond
20969 thewhite 20 0 128m 2728 1712 R 17 0.1 118:01.18 crond
30146 thewhite 20 0 128m 2728 1712 R 17 0.1 88:33.29 crond
2017 thewhite 20 0 128m 2728 1712 R 16 0.1 1:58.93 crond
26690 thewhite 20 0 128m 2728 1712 R 16 0.1 105:57.86 crond
9824 thewhite 20 0 128m 2728 1712 R 16 0.1 47:09.94 crond
14262 thewhite 20 0 128m 2728 1712 R 16 0.1 33:43.51 crond
22284 thewhite 20 0 128m 2728 1712 R 16 0.1 20:03.00 crond
26580 thewhite 20 0 128m 2728 1712 R 16 0.1 13:00.91 crond
14267 thewhite 20 0 154m 9304 5928 S 14 0.5 28:51.68 php
1593 thewhite 20 0 154m 9308 5928 S 14 0.5 59:33.19 php
2019 thewhite 20 0 154m 9304 5928 S 14 0.5 1:40.31 php
18491 thewhite 20 0 154m 9304 5928 R 14 0.5 23:46.57 php
20970 thewhite 20 0 154m 9308 5928 S 14 0.5 100:03.83 php
22287 thewhite 20 0 154m 9308 5928 R 14 0.5 17:41.67 php
30464 thewhite 20 0 154m 9308 5928 S 14 0.5 6:05.07 php
5212 thewhite 20 0 154m 9300 5928 S 14 0.5 47:50.79 php
30152 thewhite 20 0 154m 9300 5928 S 14 0.5 74:13.29 php
9831 thewhite 20 0 154m 9300 5928 R 13 0.5 40:52.49 php
26582 thewhite 20 0 154m 9300 5928 S 13 0.5 11:11.12 php
26697 thewhite 20 0 154m 9308 5928 S 13 0.5 88:07.66 php
2254 popuser 20 0 174m 57m 2488 S 5 2.9 0:04.62 spamd
2030 thewhite 20 0 3892 532 432 S 3 0.0 0:22.80 qmail-inject
5222 thewhite 20 0 3848 528 432 S 3 0.0 8:39.63 qmail-inject
20981 thewhite 20 0 3860 532 432 S 3 0.0 18:03.68 qmail-inject
30161 thewhite 20 0 3852 528 432 S 3 0.0 13:37.62 qmail-inject
1605 thewhite 20 0 3860 532 432 S 2 0.0 10:50.94 qmail-inject
9835 thewhite 20 0 3852 528 432 S 2 0.0 7:18.64 qmail-inject
14278 thewhite 20 0 3900 532 432 R 2 0.0 5:14.37 qmail-inject
18497 thewhite 20 0 3848 532 432 S 2 0.0 4:22.47 qmail-inject
30471 thewhite 20 0 3868 528 432 S 2 0.0 1:11.39 qmail-inject
22290 thewhite 20 0 3892 532 432 S 2 0.0 3:09.83 qmail-inject
26592 thewhite 20 0 3840 532 432 S 2 0.0 2:03.99 qmail-inject
26702 thewhite 20 0 3840 532 432 S 2 0.0 16:11.56 qmail-inject
4420 root 20 0 103m 8764 1104 S 1 0.4 1:19.29 psmon
3750 mysql 20 0 316m 45m 6836 S 0 2.3 4:42.33 mysqld
1 root 20 0 10364 752 628 S 0 0.0 0:01.22 init
2 root 15 -5 0 0 0 S 0 0.0 0:00.00 kthreadd
3 root RT -5 0 0 0 S 0 0.0 0:00.22 migration/0
4 root 15 -5 0 0 0 S 0 0.0 0:00.40 ksoftirqd/0
5 root RT -5 0 0 0 S 0 0.0 0:00.14 watchdog/0
6 root RT -5 0 0 0 S 0 0.0 0:00.20 migration/1
7 root 15 -5 0 0 0 S 0 0.0 0:00.12 ksoftirqd/1
8 root RT -5 0 0 0 S 0 0.0 0:00.00 watchdog/1
9 root RT -5 0 0 0 S 0 0.0 0:00.20 migration/2
10 root 15 -5 0 0 0 S 0 0.0 0:00.22 ksoftirqd/2
11 root RT -5 0 0 0 S 0 0.0 0:00.00 watchdog/2
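
The load average in top's header is easier to judge against the number of CPU cores. A hedged sketch: the core count of 4 is an inference from the migration/0 to migration/3 kernel threads in the earlier listing, not something stated in the thread, and `parse_load_average` is an illustrative helper.

```python
import re


def parse_load_average(header):
    """Extract the 1, 5 and 15 minute load averages from top's first line."""
    match = re.search(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", header)
    if match is None:
        raise ValueError("no load average found in: %r" % header)
    return tuple(float(value) for value in match.groups())


header = "top - 14:42:21 up 1 day, 8:35, 1 user, load average: 15.63, 15.70, 15.55"
cores = 4  # assumption: inferred from the migration/0..3 kernel threads

one_min, five_min, fifteen_min = parse_load_average(header)
per_core = one_min / cores
print("1-minute load per core: %.2f" % per_core)
```

As a rule of thumb, a sustained value well above 1.0 per core means runnable tasks are queuing for CPU time, which fits the 0.0% idle figure in the same header.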

Re: Can I check why a server keeps slowing down?

Posted: Sat Jul 25, 2009 10:26 am
by coolemail
Again this morning, every task in "top" belonged to that same domain, "thewhite". I got rid of them again by disabling and re-enabling the domain, and the server then ran at the right speed again. Does anyone know how I can find out why these keep appearing, and what I can do to stop them from reappearing every few hours?

I'm really grateful for all the help so far.

Re: Can I check why a server keeps slowing down?

Posted: Sat Jul 25, 2009 12:07 pm
by scott
I'd take a look at their cron jobs; that looks like the source that's triggering your problem.

Re: Can I check why a server keeps slowing down?

Posted: Sun Jul 26, 2009 1:08 pm
by coolemail
Thank you Scott. They only had one cron job. I disabled it and, hey presto, it has stopped. It was just an RSS newsfeed and had clearly been fine for many months, but something must have changed. THANK YOU ALL for the help in resolving this.
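
The thread does not establish why the feed job started piling up. One common pattern that matches the symptoms (new crond and php pairs appearing over time and never exiting) is a scheduled job whose runs overlap because an earlier run has hung. Below is a minimal Python sketch of a guard against that, using an advisory lock from the standard `fcntl` module on Linux. The lock file name and `try_lock` are hypothetical, and the real job was a PHP script whose details are not given here.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Return an open file holding an exclusive lock on path, or None if it is already held."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting for the holder.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle


# Hypothetical lock file for the feed job.
lock_path = os.path.join(tempfile.gettempdir(), "rss_feed_job.lock")

first = try_lock(lock_path)   # the run that is still in progress
second = try_lock(lock_path)  # the next scheduled run, started before the first finished

if second is None:
    print("previous run still active, skipping this one")
```

A wrapper built this way would exit as soon as `try_lock` returns None, so a hung run cannot accumulate copies. It does not fix whatever made the job hang in the first place; that still needs investigating in the job itself.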