Server Hang - kswapd process the cause?

Posted: Tue Jun 01, 2010 6:16 am
by coolemail
Today our servers suddenly became very slow. For a while we could not even get shell access. I think ASL sorted it out, as we had 3 entries in the Security Events:
plesk2 kernel: Out of memory: Killed process 29233 (httpd).
plesk2 kernel: Out of memory: Killed process 29232 (httpd).
plesk2 kernel: Out of memory: Killed process 29231 (httpd).

Is there any way we can find out what those were, and how we could have resolved it better? (We were about to have to get the server manually rebooted).
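One way to start answering "what were those" is to pull the kill messages out of the system log and tally them by process name. The "Out of memory: Killed process" wording is what the kernel's out-of-memory killer writes, so these lines most likely record the kernel reclaiming memory rather than an action taken by ASL itself. Below is a minimal sketch, assuming the log lines look like the three above; `find_oom_kills` and `summarize` are hypothetical helper names, not part of any existing tool.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   plesk2 kernel: Out of memory: Killed process 29233 (httpd).
OOM_RE = re.compile(r"Out of memory: Killed process (\d+) \(([^)]+)\)")

def find_oom_kills(lines):
    """Return a list of (pid, command) tuples for every OOM kill found."""
    kills = []
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills

def summarize(kills):
    """Count OOM kills per command name."""
    return Counter(command for _pid, command in kills)

# The three events quoted in this post.
events = [
    "plesk2 kernel: Out of memory: Killed process 29233 (httpd).",
    "plesk2 kernel: Out of memory: Killed process 29232 (httpd).",
    "plesk2 kernel: Out of memory: Killed process 29231 (httpd).",
]
for command, count in summarize(find_oom_kills(events)).most_common():
    print(f"{command}: killed {count} time(s)")
```

To run it against a real log, pass an open file object to `find_oom_kills` instead of the list; which file holds kernel messages depends on the distribution's syslog configuration.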

uptime and top showed the following. We could only get top running after, I guess, ASL had killed the processes above.
[root@plesk2 ~]# uptime
10:48:30 up 10 days, 22:10, 3 users, load average: 206.37, 232.76, 188.83
[root@plesk2 ~]# top
top - 11:13:09 up 10 days, 22:35, 3 users, load average: 1.06, 3.84, 44.35
Tasks: 222 total, 1 running, 221 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.9%us, 1.2%sy, 0.0%ni, 90.9%id, 2.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1784832k total, 1766232k used, 18600k free, 62668k buffers
Swap: 2031608k total, 858600k used, 1173008k free, 284104k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
18510 apache 15 0 577m 208m 6872 S 7.6 11.9 0:13.17 httpd
18501 apache 15 0 575m 205m 6776 S 6.6 11.8 0:11.84 httpd
18498 apache 15 0 519m 203m 6356 S 6.0 11.7 0:10.44 httpd
18500 apache 15 0 580m 208m 6604 S 5.3 12.0 0:11.69 httpd
18494 apache 15 0 523m 206m 7052 S 1.3 11.9 0:10.50 httpd
17429 root 15 0 13004 1316 768 R 0.7 0.1 0:04.42 top
18521 apache 16 0 578m 206m 6804 S 0.3 11.9 0:09.26 httpd
1 root 15 0 10348 108 80 S 0.0 0.0 0:24.96 init
2 root RT -5 0 0 0 S 0.0 0.0 0:04.10 migration/0
3 root 34 19 0 0 0 S 0.0 0.0 0:00.43 ksoftirqd/0
4 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
5 root RT -5 0 0 0 S 0.0 0.0 0:03.57 migration/1
6 root 34 19 0 0 0 S 0.0 0.0 0:00.31 ksoftirqd/1
7 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/1
8 root RT -5 0 0 0 S 0.0 0.0 0:03.02 migration/2
9 root 34 19 0 0 0 S 0.0 0.0 0:00.63 ksoftirqd/2
10 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/2
11 root RT -5 0 0 0 S 0.0 0.0 0:03.00 migration/3
12 root 34 19 0 0 0 S 0.0 0.0 0:00.33 ksoftirqd/3
13 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/3
14 root 10 -5 0 0 0 S 0.0 0.0 0:00.27 events/0
15 root 10 -5 0 0 0 S 0.0 0.0 0:00.03 events/1
16 root 10 -5 0 0 0 S 0.0 0.0 0:00.01 events/2
17 root 10 -5 0 0 0 S 0.0 0.0 0:00.02 events/3
18 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 khelper
19 root 10 -5 0 0 0 S 0.0 0.0 0:15.32 kthread
21 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 xenwatch
22 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 xenbus
27 root 10 -5 0 0 0 S 0.0 0.0 0:06.94 kblockd/0
28 root 10 -5 0 0 0 S 0.0 0.0 0:00.03 kblockd/1
29 root 10 -5 0 0 0 S 0.0 0.0 0:00.06 kblockd/2
30 root 10 -5 0 0 0 S 0.0 0.0 0:00.03 kblockd/3
31 root 18 -5 0 0 0 S 0.0 0.0 0:00.00 kacpid
131 root 18 -5 0 0 0 S 0.0 0.0 0:00.00 cqueue/0
132 root 19 -5 0 0 0 S 0.0 0.0 0:00.00 cqueue/1
133 root 19 -5 0 0 0 S 0.0 0.0 0:00.00 cqueue/2
134 root 19 -5 0 0 0 S 0.0 0.0 0:00.00 cqueue/3
138 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 khubd
140 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 kseriod
221 root 10 -5 0 0 0 S 0.0 0.0 10:36.78 kswapd0
222 root 18 -5 0 0 0 S 0.0 0.0 0:00.00 aio/0
223 root 18 -5 0 0 0 S 0.0 0.0 0:00.00 aio/1
224 root 18 -5 0 0 0 S 0.0 0.0 0:00.00 aio/2
225 root 18 -5 0 0 0 S 0.0 0.0 0:00.00 aio/3
363 root 11 -5 0 0 0 S 0.0 0.0 0:00.00 kpsmoused
409 root 12 -5 0 0 0 S 0.0 0.0 0:00.00 scsi_eh_0
420 root 13 -5 0 0 0 S 0.0 0.0 0:00.00 ata/0
421 root 13 -5 0 0 0 S 0.0 0.0 0:00.00 ata/1
422 root 14 -5 0 0 0 S 0.0 0.0 0:00.00 ata/2
Is there a way to get more information on a specific PID that was killed earlier, like the ones at the top that were killed by ASL? I have tried strace -p18498 on one of the running httpd processes; the output was massive and included LOADS of information from lots of domains. If someone can give me an indication of how to interpret that and take any necessary action, I'd be most grateful.
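strace shows system calls rather than memory use, so for an out-of-memory problem the RES column of top is probably the more direct evidence. The sketch below sums RES per command from rows of top's process table. It assumes the column order shown in the listing above (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND); `res_to_kib` and `resident_by_command` are hypothetical helper names. Note that RES includes pages shared between processes, so the totals overstate real usage somewhat.

```python
from collections import defaultdict

def res_to_kib(value):
    """Convert top's RES column (e.g. '208m', '1316', '1.2g') to KiB."""
    units = {"k": 1, "m": 1024, "g": 1024 * 1024}
    suffix = value[-1].lower()
    if suffix in units:
        return float(value[:-1]) * units[suffix]
    return float(value)  # plain numbers are already KiB

def resident_by_command(top_rows):
    """Sum the RES column per COMMAND for rows of top's process table."""
    totals = defaultdict(float)
    for row in top_rows:
        fields = row.split()
        # Skip headers and short rows; data rows start with a numeric PID.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        totals[fields[11]] += res_to_kib(fields[5])
    return dict(totals)

# A few rows copied from the top listing above.
sample = """\
18510 apache 15 0 577m 208m 6872 S 7.6 11.9 0:13.17 httpd
18501 apache 15 0 575m 205m 6776 S 6.6 11.8 0:11.84 httpd
18498 apache 15 0 519m 203m 6356 S 6.0 11.7 0:10.44 httpd
17429 root 15 0 13004 1316 768 R 0.7 0.1 0:04.42 top
""".splitlines()

for command, kib in sorted(resident_by_command(sample).items(), key=lambda kv: -kv[1]):
    print(f"{command}: {kib / 1024:.1f} MiB resident")
```

With only three of the httpd workers included, the resident total is already over 600 MiB on a machine with about 1.7 GiB of RAM.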

We have one large website, a big forum that gets a lot of attempted attacks, and I wonder if this could be some of those brute-force attempts?

many thanks, in advance, as ever.

EDIT: Not sure if it helps, but I ran the following to try and see who was connecting. Does that suggest anything?:
[root@plesk2 ~]# netstat -nat
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 127.0.0.1:2912 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:993 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:995 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:11111 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:199 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:106 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:587 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:110 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:3310 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:143 0.0.0.0:* LISTEN
tcp 0 0 xxx.xxx.xxx.4:80 81.105.148.54:63205 SYN_RECV
tcp 0 0 xxx.xxx.xxx.4:80 220.255.3.27:6120 SYN_RECV
tcp 0 0 xxx.xxx.xxx.4:80 220.255.7.177:25516 SYN_RECV
tcp 0 0 xxx.xxx.xxx.4:80 81.105.148.54:63207 SYN_RECV
tcp 0 0 0.0.0.0:8880 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:465 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:10001 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:16851 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:11443 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:11444 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:21 0.0.0.0:* LISTEN
tcp 0 0 192.168.122.1:53 0.0.0.0:* LISTEN
tcp 0 0 yyy.yyy.yyy.8:53 0.0.0.0:* LISTEN
tcp 0 0 xxx.xxx.xxx.6:53 0.0.0.0:* LISTEN
tcp 0 0 xxx.xxx.xxx.5:53 0.0.0.0:* LISTEN
tcp 0 0 10.10.10.10:53 0.0.0.0:* LISTEN
tcp 0 0 xxx.xxx.xxx.4:53 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:3000 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:5432 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:25 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:953 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:8443 0.0.0.0:* LISTEN
tcp 0 0 xxx.xxx.xxx.4:25 65.55.116.27:48853 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:143 87.127.76.95:1497 ESTABLISHED
tcp 0 0 127.0.0.1:43080 127.0.0.1:3306 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:43080 ESTABLISHED
tcp 0 0 xxx.xxx.xxx.4:110 88.111.1.160:51472 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 88.111.1.160:51473 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 78.32.40.209:2051 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 MY_IP_ADDRESS:53105 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 MY_IP_ADDRESS:53102 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 84.45.158.63:33212 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 MY_IP_ADDRESS:58911 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 MY_IP_ADDRESS:58904 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 MY_IP_ADDRESS:58898 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 MY_IP_ADDRESS:58894 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:587 86.177.200.101:3919 ESTABLISHED
tcp 0 0 xxx.xxx.xxx.4:110 86.158.101.86:54479 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 MY_IP_ADDRESS:58930 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 86.158.101.86:54483 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 86.158.101.86:54481 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 MY_IP_ADDRESS:58925 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 86.158.101.86:54485 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 MY_IP_ADDRESS:58921 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 MY_IP_ADDRESS:58917 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 MY_IP_ADDRESS:58914 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 212.39.166.122:49951 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 87.127.37.47:40822 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 90.194.75.207:63602 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.5:110 MY_IP_ADDRESS:58907 TIME_WAIT
tcp 0 0 xxx.xxx.xxx.4:110 82.132.248.46:59949 ESTABLISHED
tcp 0 0 :::80 :::* LISTEN
tcp 0 0 :::30000 :::* LISTEN
tcp 0 0 :::8181 :::* LISTEN
tcp 0 0 :::22 :::* LISTEN
tcp 0 0 :::443 :::* LISTEN
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:50622 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:21649 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:55443 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.45.52.129:52255 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:55962 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:9370 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:36228 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:49287 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:4481 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:24972 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:163.153.100.13:59116 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:MY_IP_ADDRESS:58878 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:MY_IP_ADDRESS:58879 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:16821 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:MY_IP_ADDRESS:58876 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:MY_IP_ADDRESS:58877 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:63387 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:28850 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.32.170.129:54445 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.32.170.129:54444 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.32.170.129:54447 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.32.170.129:54446 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:33938 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.45.52.129:53819 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.32.170.129:54443 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.32.170.129:54453 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.32.170.129:54452 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.32.170.129:54455 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.32.170.129:54454 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:62.73.169.152:50117 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.32.170.129:54449 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.32.170.129:54448 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:38027 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.45.52.129:53792 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:43170 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.32.170.129:54451 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:55688 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:50595 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:61570 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.32.170.129:54457 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.32.170.129:54456 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.45.52.129:53847 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:59388 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:5591 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:55293 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:22749 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:62199 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.45.52.129:53850 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.45.52.129:53851 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:29401 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.45.52.129:52824 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:29146 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:61124 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:51181 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:207.46.204.241:47047 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:37839 TIME_WAIT
tcp 0 7106 ::ffff:xxx.xxx.xxx.4:22 ::ffff:MY_IP_ADDRESS:55740 ESTABLISHED
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:35824 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.45.52.129:52582 TIME_WAIT
tcp 1 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:47310 CLOSE_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:25317 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:39115 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:40162 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:41453 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:59844 ESTABLISHED
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:2281 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:54763 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:28191 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:56857 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:45873 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:18948 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:46383 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50796 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50797 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50794 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50795 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50792 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50793 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50790 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50791 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50788 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:57892 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50789 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50786 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50787 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50784 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:29962 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50785 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50782 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:22 ::ffff:MY_IP_ADDRESS:55678 ESTABLISHED
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:45108 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50783 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:22 ::ffff:MY_IP_ADDRESS:55679 ESTABLISHED
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50780 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50781 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50778 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:57904 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50779 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50776 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:61490 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50777 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50774 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50775 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50772 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50770 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:217.35.85.158:1994 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50771 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50768 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50769 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50766 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:62991 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50767 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50764 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:207.46.204.241:46895 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50765 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50762 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:3360 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50763 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:59144 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50760 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50761 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50758 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50759 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:47367 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:7981 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50756 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50757 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.45.52.129:53677 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50754 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50755 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50752 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50753 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50750 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50751 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:62.73.169.152:49717 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50749 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:62.73.169.152:49969 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:220.255.0.49:20615 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:36982 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:82.94.176.145:53817 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:36465 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:10309 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:84.45.236.243:3181 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:30272 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:44352 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:62.73.169.152:49699 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:62.73.169.152:49966 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:14157 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:51277 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:MY_IP_ADDRESS:58884 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:MY_IP_ADDRESS:58882 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:44361 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:62.73.169.152:49960 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:MY_IP_ADDRESS:58880 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:220.255.0.57:30614 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:MY_IP_ADDRESS:58881 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:67.195.115.50:37357 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:67.195.115.50:33518 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:MY_IP_ADDRESS:58940 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:MY_IP_ADDRESS:58941 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:MY_IP_ADDRESS:58938 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:MY_IP_ADDRESS:58939 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:MY_IP_ADDRESS:58936 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:MY_IP_ADDRESS:58937 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:MY_IP_ADDRESS:58934 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:2941 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:MY_IP_ADDRESS:58935 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:55381 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:62.73.169.152:49689 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:15460 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:62.73.169.152:49927 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:212.183.140.33:12387 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:58695 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.249.65.136:34883 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:45706 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:46015 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:44990 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:46014 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:46007 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:45482 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:220.255.0.56:33802 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:46046 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:46041 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:46038 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:220.255.0.38:36890 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:46017 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:46016 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:46018 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:46077 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:46079 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:46073 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:46075 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:45050 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:46074 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:45805 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:45804 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:46313 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:46312 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:17643 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:45284 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:46049 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:46048 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:192.229.17.113:23435 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:46105 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:45849 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:45848 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:45850 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:45846 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:220.255.3.21:6121 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:45576 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:45579 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:45578 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:85.92.222.254:46442 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:45575 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:45105 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:45360 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:220.255.0.54:38120 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:220.255.0.36:64388 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:192.229.17.113:24019 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:220.255.0.54:34717 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:45632 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:66.235.124.132:44927 TIME_WAIT
[root@plesk2 ~]#
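A listing this long is easier to read once it is grouped by remote address and state. The sketch below does that for `netstat -nat` style lines; it assumes the six-column layout shown above, and `remote_host` and `count_connections` are hypothetical helper names. Keep in mind that TIME_WAIT entries are connections that have already closed, and that this snapshot was taken after the server had recovered, so on its own it does not obviously point at an attack.

```python
from collections import Counter

def remote_host(address):
    """Strip the port and any IPv4-mapped '::ffff:' prefix from an address."""
    host, _sep, _port = address.rpartition(":")
    if host.startswith("::ffff:"):
        host = host[len("::ffff:"):]
    return host

def count_connections(netstat_lines):
    """Count (remote host, state) pairs, skipping listening sockets and headers."""
    counts = Counter()
    for line in netstat_lines:
        fields = line.split()
        # Expected columns: Proto Recv-Q Send-Q Local Foreign State
        if len(fields) != 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] == "LISTEN":
            continue
        counts[(remote_host(fields[4]), fields[5])] += 1
    return counts

# A few lines copied from the listing above.
sample = """\
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 xxx.xxx.xxx.4:80 81.105.148.54:63205 SYN_RECV
tcp 0 0 0.0.0.0:8880 0.0.0.0:* LISTEN
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50796 TIME_WAIT
tcp 0 0 ::ffff:xxx.xxx.xxx.4:80 ::ffff:78.33.176.225:50797 TIME_WAIT
""".splitlines()

for (host, state), count in count_connections(sample).most_common():
    print(f"{count:4d}  {state:<12} {host}")
```

Running the same grouping while the server is actually struggling would be more telling: a large number of SYN_RECV or ESTABLISHED connections from a few addresses at that moment would be worth a closer look.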

Re: Security Events - Killed Process

Posted: Thu Jun 03, 2010 5:18 am
by coolemail
Can anybody help:

Further to the last post, in trying to work out what is going on, I had top running when it happened this morning, and there are two processes which appear to have caused the hang. Can anyone advise? The two processes using all the CPU are shown below.
top - 10:07:27 up 1 day, 3:18, 3 users, load average: 100.38, 60.23, 28.35
Tasks: 310 total, 106 running, 202 sleeping, 0 stopped, 2 zombie
Cpu(s): 0.2%us, 99.8%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1784832k total, 1778652k used, 6180k free, 692k buffers
Swap: 2031608k total, 2031608k used, 0k free, 11248k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3056 mysql 18 0 600m 24m 1948 S 80.8 1.4 38:38.56 mysqld
221 root 10 -5 0 0 0 D 64.1 0.0 1:11.38 kswapd0
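The summary lines in this snapshot may say more than the process list: swap is completely used and only about 6 MB of RAM is free, which would be consistent with kswapd0 being busy as a symptom of memory exhaustion rather than its cause. A small sketch for pulling the numbers out of top's header, assuming the "NNNk label" format shown above; `parse_top_summary` and `swap_used_percent` are hypothetical helper names.

```python
import re

def parse_top_summary(line):
    """Turn a top 'Mem:' or 'Swap:' line into a dict of KiB values keyed by label."""
    return {label: int(kib) for kib, label in re.findall(r"(\d+)k (\w+)", line)}

def swap_used_percent(swap_line):
    """Percentage of swap in use, from top's 'Swap:' summary line."""
    values = parse_top_summary(swap_line)
    if values["total"] == 0:
        return 0.0
    return 100.0 * values["used"] / values["total"]

# Swap lines copied from the two top snapshots posted in this thread so far.
after_recovery = "Swap: 2031608k total, 858600k used, 1173008k free, 284104k cached"
during_hang = "Swap: 2031608k total, 2031608k used, 0k free, 11248k cached"

print(f"after recovery: {swap_used_percent(after_recovery):.1f}% of swap used")
print(f"during hang:    {swap_used_percent(during_hang):.1f}% of swap used")
```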

Re: Server Hang - kswapd process the cause?

Posted: Tue Jun 08, 2010 1:42 pm
by coolemail
I've changed the Subject in the hope that someone can help, as we may not have been clear enough up to now. Every few days the server CPU load climbs so high that we cannot get web access or even shell access.

Once when we had top running, it looked like it could be the kswapd process that was doing it - see last post. In most cases we need to get the server manually rebooted to get it working again.

Can anyone advise what might be triggering it, what we might be able to do to identify any cause, and what we could/should do to get it all back working as it should?

Many thanks in advance.

Re: Server Hang - kswapd process the cause?

Posted: Tue Jun 08, 2010 3:54 pm
by mikeshinn
kswapd is the kernel's swapping thread, which is built into all Linux kernels. If it's using a lot of CPU, the system is trying hard to swap, which could mean one of three things:

1) You are out of memory because something is eating it all up; leaks can cause that too
2) You are having disk problems, and the kernel can't use the disk-based swap partition fast enough to free up memory, so it's stuck
3) I/O on your drives is so high that the system can't swap processes out to the drive(s) holding the swap partition(s), because those drives are already overloaded with other I/O

Take a look at your drives though. Failing drives will sometimes get really slow or non-responsive temporarily, driving up the load and causing back up.

Take a look at your iostat and vmstat values too, to see what's going on with your system. Top isn't going to tell you the whole story.
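If vmstat is hard to run interactively when the box is struggling, the same swap activity can be derived from the kernel's counters. The sketch below computes pages swapped in and out per second from two snapshots of /proc/vmstat style text, using the `pswpin` and `pswpout` counters. The helper names `parse_vmstat` and `swap_rates` are hypothetical, and the sample numbers are made up purely for illustration.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters

def swap_rates(before, after, seconds):
    """Pages swapped in/out per second between two counter snapshots."""
    return {
        name: (after[name] - before[name]) / seconds
        for name in ("pswpin", "pswpout")
    }

# Made-up snapshots taken 5 seconds apart (illustrative values only).
first = parse_vmstat("pgfault 1000\npswpin 100\npswpout 200\n")
second = parse_vmstat("pgfault 1800\npswpin 150\npswpout 700\n")

for name, rate in swap_rates(first, second, 5).items():
    print(f"{name}: {rate:.1f} pages/s")
```

On a live system the two snapshots would come from reading /proc/vmstat twice with a pause in between. Sustained non-zero rates in both directions suggest the machine is thrashing, which fits the first explanation in the list above better than a failing disk would.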

Re: Server Hang - kswapd process the cause?

Posted: Tue Jun 08, 2010 4:20 pm
by scott
It's not the cause; it's more like what's keeping the server from crashing constantly :P You're using 4G of RAM. I'd say run top and sort by memory usage (M) to see what is using up all the RAM on the box.

Re: Server Hang - kswapd process the cause?

Posted: Tue Jun 08, 2010 4:55 pm
by coolemail
Thanks Mike, I'll post an email to support@atomicorp.com. I hope you can find time to look at it.

Re: Server Hang - kswapd process the cause?

Posted: Tue Jun 08, 2010 5:05 pm
by coolemail
Scott, thank you for that. Most of the time, top suggests no problems at all. Then all of a sudden it crashes and we cannot even get shell access. I've had to reboot twice this evening and am currently getting
[psmon/plesk2.expat-email.co.uk] Spawned 'ossec-dbd' with '/sbin/service ossec-hids restart'

Command executed: /sbin/service ossec-hids restart
Exit value: 0
Signal number: 0
Dumped core?: 0

Shutting down ossec-hids: [  OK  ]
Starting ossec-hids: 2010/06/08 21:21:56 ossec-maild: INFO: E-Mail notification disabled. Clean Exit.
[  OK  ]

about every 3 mins :x

Re: Server Hang - kswapd process the cause?

Posted: Tue Jun 08, 2010 5:51 pm
by mikeshinn
What kernel are you running?

Re: Server Hang - kswapd process the cause?

Posted: Tue Jun 08, 2010 6:01 pm
by coolemail
I've posted CASE 3042 as I may need more help. After the big yum update, when all email stopped and we had to get Parallels to look at it, they said:
The problem is that Dr.Web daemon is killed upon start without even loading libraries:

[root@plesk2 drweb]# pwd
/opt/drweb
[root@plesk2 drweb]# strace -Ff -v -s128 ./drwebd.real
execve("./drwebd.real", ["./drwebd.real"], ["MANPATH=//man:", "HOSTNAME=plesk2.expat-email.co.uk", "TERM=screen", "SHELL=/bin/bash", "HISTSIZE=1000", "KDE_NO_IPV6=1", "SSH_CLIENT=64.131.90.27 35962 22", "QTDIR=/usr/lib64/qt-3.3", "QTINC=/usr/lib64/qt-3.3/include", "SSH_TTY=/dev/pts/1", "USER=root", "LS_COLORS=no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;32:*.cmd=0", "KDEDIR=/usr", "MAIL=/var/spool/mail/root", "PATH=/usr/lib64/qt-3.3/bin:/usr/kerberos/sbin:/usr/kerberos/bin://sbin://bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin", "INPUTRC=/etc/inputrc", "PWD=/opt/drweb", "LANG=en_US.UTF-8", "KDE_IS_PRELINKED=1", "SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass", "SHLVL=1", "HOME=/root", "LOGNAME=root", "QTLIB=/usr/lib64/qt-3.3/lib", "CVS_RSH=ssh", "SSH_CONNECTION=64.131.90.27 35962 82.197.79.4 22", "LESSOPEN=|/usr/bin/lesspipe.sh %s", "G_BROKEN_FILENAMES=1", "_=/usr/bin/strace", "OLDPWD=/root/parallels/PSA_9.2.3/dist-rpm-CentOS-5-x86_64/opt/drweb"] <unfinished ...>
+++ killed by SIGKILL +++

Such a problem may be caused by GRSecurity (which appears to be compiled into the kernel; although 'gradm' reports that it is disabled, messages are still reported from it).
I would suggest rebooting with the kernel provided by CentOS instead of the AtomicRocketTurtle one and checking whether the problem still persists. You may also try booting with the kernel that was previously installed on the server.
For now, I have removed all drweb mail handlers to avoid losing mail. They can be restored with this command after Dr.Web is fully and properly installed:
/usr/local/psa/admin/sbin/mchk --with-spam
and they then set it up with the CentOS kernel. I'd LOVE to go back to the ASL one IF we won't lose everything again, and that is why I think I need more detailed help.

Re: Server Hang - kswapd process the cause?

Posted: Tue Jun 08, 2010 6:33 pm
by coolemail
Whatever we did this evening (the only known thing was a reboot), ASL is also not working as it should. The security events are not loading at all. ALLOW_kmod_loading is set to "No" at present.

At the time of writing, top is showing nothing untoward:
[root@plesk2 ~]# top
top - 23:28:59 up 2:14, 1 user, load average: 9.74, 8.56, 5.35
Tasks: 193 total, 1 running, 191 sleeping, 0 stopped, 1 zombie
Cpu(s): 1.1%us, 0.1%sy, 0.0%ni, 73.8%id, 25.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1784832k total, 1268984k used, 515848k free, 2072k buffers
Swap: 2031608k total, 250612k used, 1780996k free, 70724k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1972 apache 16 0 503m 193m 4164 D 0.3 11.1 0:00.65 httpd
3079 mysql 15 0 323m 23m 4328 S 0.3 1.3 4:20.14 mysqld
1 root 15 0 10348 604 568 S 0.0 0.0 0:00.09 init
2 root RT -5 0 0 0 S 0.0 0.0 0:00.02 migration/0
3 root 34 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
4 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
5 root RT -5 0 0 0 S 0.0 0.0 0:00.01 migration/1
6 root 34 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/1
7 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/1
8 root RT -5 0 0 0 S 0.0 0.0 0:00.01 migration/2
9 root 34 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/2
10 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/2
11 root RT -5 0 0 0 S 0.0 0.0 0:00.01 migration/3
12 root 34 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/3
13 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/3
14 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 events/0
15 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 events/1
16 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 events/2
17 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 events/3
18 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 khelper
19 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 kthread
21 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 xenwatch
22 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 xenbus
27 root 10 -5 0 0 0 S 0.0 0.0 0:00.11 kblockd/0
28 root 10 -5 0 0 0 S 0.0 0.0 0:00.01 kblockd/1
29 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 kblockd/2
30 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 kblockd/3
31 root 17 -5 0 0 0 S 0.0 0.0 0:00.00 kacpid
131 root 17 -5 0 0 0 S 0.0 0.0 0:00.00 cqueue/0
132 root 18 -5 0 0 0 S 0.0 0.0 0:00.00 cqueue/1
133 root 18 -5 0 0 0 S 0.0 0.0 0:00.00 cqueue/2
134 root 18 -5 0 0 0 S 0.0 0.0 0:00.00 cqueue/3
138 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 khubd
140 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 kseriod
221 root 10 -5 0 0 0 S 0.0 0.0 0:04.00 kswapd0
222 root 18 -5 0 0 0 S 0.0 0.0 0:00.00 aio/0
223 root 19 -5 0 0 0 S 0.0 0.0 0:00.00 aio/1
224 root 19 -5 0 0 0 S 0.0 0.0 0:00.00 aio/2
225 root 19 -5 0 0 0 S 0.0 0.0 0:00.00 aio/3
363 root 11 -5 0 0 0 S 0.0 0.0 0:00.00 kpsmoused
370 drweb 15 0 74424 67m 316 S 0.0 3.9 0:00.00 drwebd.real

Re: Server Hang - kswapd process the cause?

Posted: Tue Jun 08, 2010 8:37 pm
by philb
Swap: 2031608k total, 250612k used, 1780996k free, 70724k cached

I'd be wondering about this. Something is pushing your server into swap. This is an indication of inadequate physical memory.

As an aside, I prefer htop to top since you can sort it by tree and in other useful ways. The tree view is very useful for figuring out what owns what.

Also, try to take a snapshot with top when the CPU is actually doing something. Your load average: 9.74, 8.56, 5.35 says your server is pretty busy, but your current snap, Cpu(s): 1.1%us, 0.1%sy, 0.0%ni, 73.8%id, 25.1%wa, 0.0%hi, 0.0%si, 0.0%st, says there isn't much going on.

Re: Server Hang - kswapd process the cause?

Posted: Wed Jun 09, 2010 1:56 am
by coolemail
When monitoring top, the server load is generally httpd processes. At the time of writing, the server load is low and all is OK, and we have:
[root@plesk2 ~]# ps axf | grep httpd
28921 pts/1 S+ 0:00 \_ grep httpd
2889 ? Ss 0:00 /var/asl/usr/sbin/asl-httpd
2910 ? S 0:09 \_ /var/asl/usr/sbin/asl-httpd
31448 ? S 0:06 \_ /var/asl/usr/sbin/asl-httpd
31452 ? S 0:05 \_ /var/asl/usr/sbin/asl-httpd
31453 ? S 0:06 \_ /var/asl/usr/sbin/asl-httpd
31834 ? S 0:06 \_ /var/asl/usr/sbin/asl-httpd
1819 ? Ss 0:01 /usr/sbin/httpd
29600 ? S 0:00 \_ /usr/sbin/httpd
29601 ? S 0:39 \_ /usr/sbin/httpd
29602 ? S 0:38 \_ /usr/sbin/httpd
29603 ? S 0:39 \_ /usr/sbin/httpd
29604 ? S 0:41 \_ /usr/sbin/httpd
29606 ? S 0:35 \_ /usr/sbin/httpd
29607 ? S 0:37 \_ /usr/sbin/httpd
29608 ? S 0:37 \_ /usr/sbin/httpd
29609 ? S 0:37 \_ /usr/sbin/httpd
29617 ? S 0:38 \_ /usr/sbin/httpd
29621 ? S 0:39 \_ /usr/sbin/httpd
29622 ? S 0:40 \_ /usr/sbin/httpd
29626 ? S 0:37 \_ /usr/sbin/httpd
29627 ? S 0:40 \_ /usr/sbin/httpd
29628 ? S 0:40 \_ /usr/sbin/httpd
29629 ? S 0:43 \_ /usr/sbin/httpd
20821 ? S 0:14 \_ /usr/sbin/httpd
20823 ? S 0:15 \_ /usr/sbin/httpd
20824 ? S 0:17 \_ /usr/sbin/httpd
[root@plesk2 ~]#
At one point we had about 60 of them, and because the server had hung we could not even kill them.
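As a rough back-of-the-envelope check on why 60 httpd processes could hang the box, the sketch below compares their worst-case memory with physical RAM. The figures are taken from the first top output in this thread (`Mem: 1784832k total`, busy httpd processes at roughly 205m RES); the function names are illustrative, and RES overstates the real cost a little because some pages are shared between workers.

```python
def workers_that_fit(total_ram_kb, rss_per_worker_mb, reserved_mb=0):
    """Rough upper bound on workers that fit in RAM, ignoring shared pages."""
    usable_mb = total_ram_kb / 1024 - reserved_mb
    return max(0, int(usable_mb // rss_per_worker_mb))

def worst_case_mb(worker_count, rss_per_worker_mb):
    """Memory needed if every worker reaches the given resident size."""
    return worker_count * rss_per_worker_mb

TOTAL_RAM_KB = 1784832    # "Mem: 1784832k total" from the first top output
RSS_PER_WORKER_MB = 205   # approximate RES of the busy httpd processes there

print("workers that fit in RAM:", workers_that_fit(TOTAL_RAM_KB, RSS_PER_WORKER_MB))
print("worst case for 60 workers (MB):", worst_case_mb(60, RSS_PER_WORKER_MB))
```

If Apache is running the prefork MPM (an assumption, the thread does not say), the MaxClients directive is the usual way to cap the number of workers at something the machine can hold.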

Re: Server Hang - kswapd process the cause?

Posted: Wed Jun 09, 2010 12:46 pm
by mikeshinn
Whatever we did this evening (only known thing was a reboot), ASL is also not working as it should. The security events are not loading at all. ALLOW_kmod_loading is set to "No" at present.
If you are getting the ossec dbd restart message over and over again and you didn't change anything in ASL, then either your ASL database is corrupt (file system errors, mysql didn't shut down cleanly and the DB is crashed, or you have a drive failing), or your ASL configs are corrupt (which would be filesystem corruption or drive failure).

Check your mysql logs to see if you have any crashed databases.

Re: Server Hang - kswapd process the cause?

Posted: Wed Jun 09, 2010 1:57 pm
by coolemail
In /var/log/mysqld.log there are a number of things. A few entries like:
100517 15:24:09 mysqld started
100517 15:24:10 InnoDB: Started; log sequence number 1 2030236305
100517 15:24:11 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.0.90' socket: '/var/lib/mysql/mysql.sock' port: 3306 Source distribution
100517 18:17:26 [Note] /usr/libexec/mysqld: Normal shutdown

100517 18:17:26 InnoDB: Starting shutdown...
100517 18:17:28 InnoDB: Shutdown completed; log sequence number 1 2030539880
100517 18:17:28 [Note] /usr/libexec/mysqld: Shutdown complete

100517 18:17:28 mysqld ended

100517 18:19:58 mysqld started
100517 18:19:59 InnoDB: Started; log sequence number 1 2030539880
100517 18:19:59 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.0.90' socket: '/var/lib/mysql/mysql.sock' port: 3306 Source distribution
100518 08:44:54 mysqld started
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
100518 8:44:55 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
100518 8:44:56 InnoDB: Started; log sequence number 1 2044513043
100518 8:44:56 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.0.90' socket: '/var/lib/mysql/mysql.sock' port: 3306 Source distribution
100518 11:46:23 [Note] /usr/libexec/mysqld: Normal shutdown

100518 11:46:24 InnoDB: Starting shutdown...
100518 11:46:27 InnoDB: Shutdown completed; log sequence number 1 2044721329
100518 11:46:27 [Note] /usr/libexec/mysqld: Shutdown complete

100518 11:46:27 mysqld ended
then about 30 entries of:
100608 19:32:27 [ERROR] /usr/libexec/mysqld: Incorrect key file for table './tortix/data.MYI'; try to repair it
then literally thousands (more than one per second) of entries of:
100609 2:38:59 [ERROR] /usr/libexec/mysqld: Table './tortix/data' is marked as crashed and should be repaired
Does that give any clue? Can you tell me what to do to repair it?

Re: Server Hang - kswapd process the cause?

Posted: Wed Jun 09, 2010 2:10 pm
by mikeshinn
That means mysql either crashed and corrupted the database, or the file system corrupted it. This is a mysql error, and you should follow the mysql process for recovering a crashed database, which is documented here:

http://dev.mysql.com/doc/refman/5.0/en/ ... epair.html

http://dev.mysql.com/doc/refman/5.1/en/ ... epair.html
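
Before starting a repair it may help to know exactly which tables the log complains about, since with thousands of entries it is easy to miss a second table. Here is a minimal sketch that counts the two error messages quoted above per table. The function name `tables_needing_repair` and the patterns are illustrative; they match only the message wording shown in this thread and may need adjusting for other MySQL versions.

```python
import re
from collections import Counter

# Patterns based on the two [ERROR] lines quoted earlier in this thread.
CRASHED = re.compile(r"Table '\./([^']+)' is marked as crashed")
BAD_KEY = re.compile(r"Incorrect key file for table '\./([^']+?)\.MYI'")

def tables_needing_repair(log_lines):
    """Count crash and bad-key-file errors per 'database/table' name."""
    counts = Counter()
    for line in log_lines:
        for pattern in (CRASHED, BAD_KEY):
            match = pattern.search(line)
            if match:
                counts[match.group(1)] += 1
    return counts

sample = [
    "100608 19:32:27 [ERROR] /usr/libexec/mysqld: Incorrect key file for table './tortix/data.MYI'; try to repair it",
    "100609 2:38:59 [ERROR] /usr/libexec/mysqld: Table './tortix/data' is marked as crashed and should be repaired",
    "100518 11:46:27 mysqld ended",
]

for table, hits in tables_needing_repair(sample).most_common():
    print(table, hits)
```

To run it on the real log, pass an open file instead of the sample list, for example `tables_needing_repair(open("/var/log/mysqld.log"))`.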