New Kernel Less Stable than Last Kernel

mikeshinn
Atomicorp Staff - Site Admin
Posts: 4155
Joined: Thu Feb 07, 2008 7:49 pm
Location: Chantilly, VA

Quote:
3644 apache 20 0 604m 241m 7164 S 2.4 6.1 0:49.65 httpd

That memory usage looks normal. 241M of resident memory is pretty tame on a box with 4GB of memory. Your box isn't even sweating. Keep in mind that's the total for all of apache, not per thread, so it's really not much used at all. The other number (604m) is virtual, not actual used memory.
premierhosting
Forum Regular
Posts: 257
Joined: Wed Aug 04, 2010 2:52 pm

Great to know that the memory number for the httpd process isn't a per thread number. Would never have guessed that!

mod_rewrite is running on a bunch of sites to get pretty URLs. I'd have a hard time eliminating those. I did some Apache performance tuning with prefork this week and have had some good results so far.
scott
Atomicorp Staff - Site Admin
Posts: 8355
Joined: Wed Dec 31, 1969 8:00 pm
Location: earth

Try disabling it sometime on a quick restart and you'll see what I mean. I've seen crazy complex htaccess and redirect files burn through more than a gig, and if written poorly, cause a memory leak that will periodically use up all available memory.
premierhosting

OK - I'll do that when nobody is looking :)

Just had another crash, here's the only thing left on the console:

Code:

PAX: execution attempt in: <anonymous mapping>, 731629893000-731629897000 731629893000
PAX: terminating task: /usr/libexec/paxtest/anonmap(anonmap):4919, uid/euid: 0/0, PC: 0000731629893000, SP: 00007fff9edbfc18
PAX: bytes at PC: c3 00 00 00 00 00 00 00 
Help at all?
mikeshinn

Quote:
here's the only thing left on the console:

Code:

PAX: execution attempt in: <anonymous mapping>, 731629893000-731629897000 731629893000
PAX: terminating task: /usr/libexec/paxtest/anonmap(anonmap):4919, uid/euid: 0/0, PC: 0000731629893000, SP: 00007fff9edbfc18
PAX: bytes at PC: c3 00 00 00 00 00 00 00
Help at all?

That console message is harmless and unrelated:

https://www.atomicorp.com/wiki/index.ph ... st_mean.3F

Quote:
Just had another crash,

So it sounds like you have console access, so can you tell me what you mean by crash? Did you get a kernel panic, was the system just unresponsive, etc.

If it was unresponsive, what was the load, I/O, CPU usage, memory, etc. on the system at the time?
premierhosting

Quote:
So it sounds like you have console access, so can you tell me what you mean by crash? Did you get a kernel panic, was the system just unresponsive, etc.

If it was unresponsive, what was the load, I/O, CPU usage, memory, etc. on the system at the time?

Would love to answer these questions. Crash (to me) means it's not accessible. Only a hard reboot will cure it. Cannot access by console, shell, http, ftp, anything. Doesn't ping.

How would I answer your second question about load, I/O, CPU, memory, etc.?
mikeshinn

Quote:
How would I answer your second question about load, I/O, CPU, memory, etc.?

Do you have sysstat or anything running on the system that records performance statistics? sysstat would be a start (it only records every 5 minutes), but that may be enough to pinpoint the cause of the non-responsiveness.

Can you post your sar data from sysstat around the time this occurs?
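
For anyone following along, here is a rough sketch of how to pull the relevant window out of a day's sar file and flag the intervals with heavy I/O wait. It assumes sysstat writes its daily files to /var/log/sa/saDD (the usual CentOS/RHEL location). The function names sar_window and flag_iowait and the 20% threshold are made up for illustration, not part of sysstat.

```shell
#!/bin/sh
# Show CPU (-u), load/run queue (-q), memory (-r) and I/O (-b) reports
# for one day's data file, limited to a start/end time window.
# Assumption: daily files live in /var/log/sa/saDD.
sar_window() {
    day="$1"; start="$2"; end="$3"
    file="/var/log/sa/sa${day}"
    for report in -u -q -r -b; do
        sar "$report" -f "$file" -s "$start" -e "$end"
    done
}

# Read "sar -u" text on stdin and print only the samples whose %iowait
# is above the given limit (default 20).
flag_iowait() {
    threshold="${1:-20}"
    awk -v limit="$threshold" '
        # Header row: remember which column holds %iowait.
        /%iowait/ { for (i = 1; i <= NF; i++) if ($i == "%iowait") col = i; next }
        # Skip the Average and RESTART lines, print rows above the limit.
        col && $1 != "Average:" && NF >= col && $col + 0 > limit { print }
    '
}

# Example usage (not run here):
#   sar_window 26 02:00:00 04:30:00
#   sar -u -f /var/log/sa/sa26 | flag_iowait 20
```

Piping the CPU report through flag_iowait leaves only the samples worth lining up against cron jobs, backups, or whatever else was running at the time.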
premierhosting

OK - sysstat is now installed, configured, and monitoring. Will post sar data next crash.
premierhosting

OK, just rebooting after a crash, ran sar and got this:
Linux 2.6.32.43-6.art.x86_64 (hostname) 08/26/2011

12:00:01 AM CPU %user %nice %system %iowait %steal %idle
12:05:01 AM all 3.01 0.00 0.87 0.88 0.00 95.24
12:10:01 AM all 1.61 0.00 0.43 0.46 0.00 97.50
12:15:01 AM all 1.28 0.00 0.36 0.28 0.00 98.08
12:20:01 AM all 0.94 0.00 0.27 0.33 0.00 98.46
12:25:01 AM all 0.95 0.00 0.26 0.41 0.00 98.39
12:30:01 AM all 1.35 0.00 0.35 0.43 0.00 97.87
12:35:01 AM all 1.28 0.00 0.32 0.33 0.00 98.08
12:40:01 AM all 0.72 0.00 0.23 0.39 0.00 98.66
12:45:01 AM all 2.17 0.00 0.31 0.33 0.00 97.19
12:50:01 AM all 0.77 0.00 0.28 0.25 0.00 98.70
12:55:01 AM all 1.04 0.00 0.32 0.36 0.00 98.27
01:00:01 AM all 1.51 0.00 0.38 0.33 0.00 97.79
01:05:01 AM all 1.62 0.00 0.82 0.59 0.00 96.98
01:10:01 AM all 0.82 0.00 0.24 0.21 0.00 98.73
01:15:01 AM all 0.83 0.00 0.26 0.29 0.00 98.61
01:20:01 AM all 1.04 0.00 0.39 1.05 0.00 97.52
01:25:01 AM all 0.95 0.00 0.26 0.27 0.00 98.53
01:30:01 AM all 1.50 0.00 0.34 0.41 0.00 97.76
01:35:01 AM all 0.76 0.00 0.29 0.35 0.00 98.61
01:40:01 AM all 1.11 0.00 0.33 0.36 0.00 98.19
01:45:01 AM all 1.26 0.00 0.31 0.39 0.00 98.05
01:50:01 AM all 1.83 0.00 0.39 0.28 0.00 97.50
01:55:01 AM all 1.23 0.00 0.34 0.43 0.00 98.00
02:00:01 AM all 0.77 0.00 0.26 0.36 0.00 98.62
02:05:01 AM all 3.80 0.00 1.11 1.16 0.00 93.93
02:10:01 AM all 1.23 0.00 0.32 0.57 0.00 97.88
02:15:01 AM all 0.83 0.00 0.30 0.33 0.00 98.54
02:20:01 AM all 5.29 0.00 1.63 3.72 0.00 89.35
02:25:01 AM all 12.21 0.00 3.18 15.28 0.00 69.32
02:30:02 AM all 10.73 0.00 3.15 17.73 0.00 68.39
02:35:02 AM all 13.53 0.00 3.67 16.76 0.00 66.04
02:40:01 AM all 21.97 0.00 5.10 11.56 0.00 61.37
02:45:01 AM all 22.73 0.00 4.59 8.87 0.00 63.81
02:50:01 AM all 16.33 0.00 4.54 15.38 0.00 63.75
02:55:01 AM all 12.76 0.00 3.78 15.91 0.00 67.55
03:00:02 AM all 16.09 0.00 4.32 17.40 0.00 62.20
03:05:01 AM all 11.52 0.00 3.27 20.32 0.00 64.88
03:10:01 AM all 12.05 0.00 3.49 19.59 0.00 64.87
03:15:01 AM all 11.73 0.00 3.32 17.34 0.00 67.61
03:20:03 AM all 9.80 0.00 5.30 39.53 0.00 45.37
03:25:22 AM all 4.49 0.00 7.34 59.02 0.00 29.15
03:30:01 AM all 7.67 0.00 9.61 31.58 0.00 51.14
03:35:01 AM all 4.17 0.00 3.27 2.34 0.00 90.22
03:40:02 AM all 3.31 0.00 3.26 2.01 0.00 91.42
03:45:01 AM all 3.96 0.00 3.51 2.05 0.00 90.47

03:45:01 AM CPU %user %nice %system %iowait %steal %idle
03:50:01 AM all 4.23 0.00 3.47 2.49 0.00 89.81
03:55:01 AM all 3.15 0.00 2.64 3.18 0.00 91.03
04:00:01 AM all 1.80 0.00 0.44 0.93 0.00 96.82
04:05:01 AM all 15.30 0.00 2.70 8.86 0.00 73.14
04:10:01 AM all 16.73 0.00 1.69 6.35 0.00 75.24
04:15:01 AM all 20.21 0.00 1.67 5.39 0.00 72.72
04:20:01 AM all 21.80 0.00 1.74 4.52 0.00 71.95
04:25:01 AM all 21.30 0.00 1.54 2.90 0.00 74.25
Average: all 6.47 0.00 1.95 6.94 0.00 84.64

07:10:56 AM LINUX RESTART
Does that help?
premierhosting

/var/log/messages
Aug 26 04:11:01 server1 xinetd[3093]: EXIT: smtp status=0 pid=30277 duration=3(sec)
Aug 26 04:11:50 server1 xinetd[3093]: START: smtp pid=30498 from=178.150.67.195
Aug 26 04:11:51 server1 xinetd[30498]: warning: /etc/hosts.deny, line 47: can't verify hostname: getaddrinfo(195.67.150.178.triolan.net, AF_INET) failed
Aug 26 04:11:55 server1 xinetd[3093]: EXIT: smtp status=0 pid=30498 duration=5(sec)
Aug 26 04:12:43 server1 clamd[25910]: SelfCheck: Database status OK.
Aug 26 04:13:06 server1 xinetd[3093]: START: smtp pid=30831 from=59.95.62.80
Aug 26 04:13:10 server1 xinetd[3093]: EXIT: smtp status=0 pid=30831 duration=4(sec)
Aug 26 04:13:33 server1 xinetd[3093]: START: smtp pid=30963 from=124.125.89.213
Aug 26 04:13:36 server1 xinetd[3093]: EXIT: smtp status=0 pid=30963 duration=3(sec)
Aug 26 04:15:01 server1 psmon[31440]: Forking background daemon, process 31442.
Aug 26 04:15:01 server1 psmon[31442]: Forking second background daemon, process 31443.
Aug 26 04:16:26 server1 xinetd[3093]: START: ftp pid=32050 from=::ffff:74.208.3.12
Aug 26 04:16:26 server1 proftpd[32050]: 74.208.198.246 (::ffff:74.208.3.12[::ffff:74.208.3.12]) - FTP session opened.
Aug 26 04:16:26 server1 proftpd[32050]: 74.208.198.246 (::ffff:74.208.3.12[::ffff:74.208.3.12]) - FTP session closed.
Aug 26 04:16:26 server1 xinetd[3093]: EXIT: ftp status=0 pid=32050 duration=0(sec)
Aug 26 04:17:06 server1 xinetd[3093]: START: smtp pid=32257 from=85.92.29.0
Aug 26 04:17:10 server1 xinetd[3093]: START: smtp pid=32275 from=178.95.7.191
Aug 26 04:17:22 server1 xinetd[3093]: EXIT: smtp status=0 pid=32275 duration=12(sec)
Aug 26 04:18:15 server1 xinetd[3093]: START: smtp pid=32607 from=94.248.20.12
Aug 26 04:18:19 server1 xinetd[3093]: EXIT: smtp status=1 pid=32607 duration=4(sec)
Aug 26 04:18:45 server1 xinetd[3093]: EXIT: smtp status=1 pid=32257 duration=99(sec)
Aug 26 04:19:43 server1 kernel: grsec: process /usr/bin/sw-engine(sw-engine:724) attached to via ptrace by /usr/bin/sw-engine[sw-engine:727] uid/euid:0/0 gid/egid:0/0, parent /usr/bin/sw-engine[sw-engine:724] uid/euid:0/0 gid/egid:0/0
Aug 26 04:19:43 server1 kernel: grsec: process /usr/bin/sw-engine(sw-engine:724) attached to via ptrace by /usr/bin/sw-engine[sw-engine:728] uid/euid:0/0 gid/egid:0/0, parent /usr/bin/sw-engine[sw-engine:724] uid/euid:0/0 gid/egid:0/0
Aug 26 04:22:43 server1 clamd[25910]: SelfCheck: Database status OK.
Aug 26 04:24:41 server1 xinetd[3093]: START: smtp pid=2150 from=115.111.46.98
Aug 26 04:24:41 server1 xinetd[2150]: warning: /etc/hosts.deny, line 47: can't verify hostname: getaddrinfo(115.111.46.98.static-delhi.vsnl.net.in, AF_INET) failed
Aug 26 04:24:44 server1 xinetd[3093]: EXIT: smtp status=1 pid=2150 duration=3(sec)
Aug 26 04:25:44 server1 xinetd[3093]: START: smtp pid=2506 from=31.171.22.193
Aug 26 04:25:48 server1 xinetd[3093]: EXIT: smtp status=1 pid=2506 duration=4(sec)
Aug 26 07:11:00 server1 syslogd 1.4.1: restart.
mikeshinn

That all looks harmless. What's the load like at the time? And memory utilization and swap?

Also, what did you do to set up marking? It would be ideal to get that working so we can pin the time down a little more.

Also, do you have any jobs that run at that time? And does this usually happen in the wee hours? That's generally when CentOS runs a lot of its housekeeping jobs. At the very least, the raid-sync job seems to not like some hardware, and there could be something else too.
premierhosting

What do you mean about the markers?

I disabled the raid-sync long ago. Plesk is backing up to remote servers during this time. The stops have happened at all hours.
mikeshinn

Syslog marking: the -m option when you start syslogd.
premierhosting

It starts in init.d with -m

Code:

[root@server1 ~]# ps fax | grep syslog
 2975 ?        Ss     0:01 syslogd -m 0
19175 pts/0    S+     0:00                      \_ grep syslog
Now what?
mikeshinn

Change the 0 to 1. The number is how many minutes go by between marks, and 0 turns marking off.
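
A minimal sketch of that change, assuming a stock CentOS 5 box where sysklogd picks up its options from /etc/sysconfig/syslog. The path and variable name are assumptions here, so check your own init script. With a mark written every minute, the last line logged before the next "restart" entry shows roughly when the box stopped responding. The helper name last_entry_before_restart is made up for illustration.

```shell
#!/bin/sh
# Assumed config change (not executed here), in /etc/sysconfig/syslog:
#   SYSLOGD_OPTIONS="-m 1"
# then restart the daemon, for example:
#   service syslog restart
# After that, "-- MARK --" lines should show up in /var/log/messages.

# Print the log line that immediately precedes each syslogd restart entry.
# Reads the files given as arguments, or stdin if there are none.
last_entry_before_restart() {
    awk '
        /syslogd .*: restart/ { if (prev != "") print prev }
        { prev = $0 }
    ' "$@"
}

# Example usage (run as root after the next hang and reboot):
#   last_entry_before_restart /var/log/messages
```

With marking off, the gap before the restart line can be hours wide. With one-minute marks, it should narrow to about a minute.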