Posted: Tue Sep 09, 2008 6:58 pm
by aus-city
Hi Scott,

This seems to help in the interim. I found that there was a rule update overnight and my server did not take it.

I did an asl -u, and after it restarted mod_security, good old rule processing failed.

So I just killed httpd, restarted it, ran asl -s -f, and it came up.

Still no news about the fix for this?

Posted: Tue Sep 09, 2008 8:48 pm
by scott
Nope, I just know it's not related to rules, because it will happen with no rules loaded at all.

Posted: Wed Sep 10, 2008 6:28 am
by faris
Is there a pattern to this at all? We are about to try upgrading from php 5.1.6 to 5.2.6, and also to upgrade suhosin, zend and so on.

I take it this is unlikely to be a factor?

The thing is, right now apache on our machine does not go down, things don't stop working. All that happens is that I see segfaults and rule processing errors in the logs. And I don't want to make it worse :-)

Faris.

Posted: Wed Sep 10, 2008 8:38 am
by scott
Same here, although once in a while it will manifest in some other ways like rule-processing failures. I can get it to happen if I throw a few ab's (apache benchmark) at a box and let it run for a while on nothing more than GET /.
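For anyone who wants to try the same kind of reproduction, here is a minimal Python sketch of throwing a few ab runs at a box in parallel. The target URL, number of runs, request count and concurrency are made-up values, not the ones used above; the only things assumed about ab are its standard -n (total requests) and -c (concurrent clients) options.

```python
import shutil
import subprocess


def build_ab_command(url, requests=100000, concurrency=50):
    """Return the argv for one ab run: -n total requests, -c concurrent clients."""
    return ["ab", "-n", str(requests), "-c", str(concurrency), url]


def run_parallel_ab(url, runs=3, requests=100000, concurrency=50):
    """Start `runs` ab processes at once and wait for all of them to finish.

    Returns the list of exit codes, one per ab process.
    """
    if shutil.which("ab") is None:
        raise RuntimeError("ab (apache benchmark) not found in PATH")
    procs = [
        subprocess.Popen(
            build_ab_command(url, requests, concurrency),
            stdout=subprocess.DEVNULL,  # discard ab's report; only load matters here
        )
        for _ in range(runs)
    ]
    return [p.wait() for p in procs]
```

Calling run_parallel_ab("http://127.0.0.1/") would then hammer GET / on the local box (the address is hypothetical; ab wants a full URL including the trailing slash). Only point this at a test machine you own.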

Posted: Wed Sep 10, 2008 9:02 am
by faris
Hmm.. and you aren't using OpenVZ or Virtuozzo, so that rules out a problem with the kernel. Yet my problems only started when I moved from a 2.4 kernel to a 2.6 (Virtuozzo) kernel.

But it is somehow related to load, and thus possibly memory in some way.

Fascinating stuff but horrendous for the people who are badly affected.

Are we still waiting for a response from the person who said he knew what it was, or .. ?

Faris.

Posted: Wed Sep 10, 2008 12:50 pm
by scott
Yeah he never responded, I think we've all emailed him 7 or 8 times each.

I've managed to reproduce this on a lot of different OS's, with a few quirks. I can say that I've seen apache on a resource-starved VPS segfault in a lot of other things too, like glibc and openssl.

Posted: Wed Sep 10, 2008 7:12 pm
by aus-city
Well, I am happy it's been reproduced, so it's being looked at. My server is a high-load one and it gets me maybe once every day or two.

If I am around I catch it and kill httpd and restart it.

Problem is when it's the middle of the night :( Hours and hours it's off.

Scott - in the interim, could psmon be patched, or a small program be written, that catches the rule processing failures (as you know, these immediately turn up in the error_log of the busiest domain on the server under plesk) and then automatically does a killall httpd, restarts httpd, and runs asl -s -f to get it all back up? The killall matters because I have found stuck dead pids; unless you killall, watchdog then complains that httpd is down.

I know the best solution is to fix it, but surely an interim monitor for it would save people a lot of hassles.

Again, it would be a voluntary package you install if you have the problem. I for one would rather have frequent restarts of httpd than sites you can't log into because rule processing is down.
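To make the idea concrete, here is a rough Python sketch of the kind of interim monitor described above. It is not part of ASL or psmon. The log path is a placeholder, the exact wording of the failure message is an assumption, and "service httpd start" is only an assumed way of starting apache again; the killall httpd and asl -s -f steps are the ones mentioned in this thread.

```python
import subprocess
import time

# Everything below is an assumption for illustration; adjust to the real setup.
ERROR_LOG = "/path/to/busiest-domain/error_log"  # hypothetical path
FAIL_MARKER = "Rule processing failed"           # assumed wording of the log message
RECOVERY_STEPS = [
    ["killall", "httpd"],           # clear stuck httpd pids
    ["service", "httpd", "start"],  # assumed way to start apache again
    ["asl", "-s", "-f"],            # as used earlier in this thread
]


def has_failure(lines, marker=FAIL_MARKER):
    """True if any line contains the marker (case-insensitive)."""
    marker = marker.lower()
    return any(marker in line.lower() for line in lines)


def recover(run=subprocess.call, steps=RECOVERY_STEPS):
    """Run each recovery step in order and return the exit codes."""
    return [run(step) for step in steps]


def watch(path=ERROR_LOG, poll_seconds=5, cooldown_seconds=300):
    """Follow the log from its current end and recover when the marker shows up."""
    with open(path, "r", errors="replace") as log:
        log.seek(0, 2)  # start at the end of the file, ignoring old entries
        while True:
            new_lines = log.readlines()
            if has_failure(new_lines):
                recover()
                time.sleep(cooldown_seconds)  # avoid restart loops
                log.seek(0, 2)                # skip lines logged during recovery
            else:
                time.sleep(poll_seconds)
```

It would need to run as root, and it does not handle log rotation (the file is opened once), so treat it as a stopgap rather than a fix. The cooldown is there so that a burst of failures does not restart httpd over and over.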

Posted: Thu Sep 11, 2008 8:38 am
by scott
No, psmon couldn't do that. It only has context on if the process is running or not.

Posted: Sat Sep 13, 2008 8:01 pm
by faris
Hey Scott -- I see you have a new version of mod_security in the atomic repo, but not in the ASL-2.0 repo.

Once it gets in to ASL-2.0 might it help stop these darned segfaults?

Faris.

Posted: Sun Sep 14, 2008 2:01 pm
by scott
No, that doesn't affect it at all, it was just some minor cleanup we did while trying to isolate the problem.

Posted: Thu Sep 18, 2008 1:21 am
by biggles
I am also experiencing the Segmentation fault on both my servers. It appeared after updating to the latest version in ASL-testing. The servers are running on Virtuozzo.

Posted: Thu Sep 18, 2008 8:25 am
by scott
Near as I can tell, this is something core to mod_security itself. I removed every rule, leaving only "SecRuleEngine On", and I can still get it to happen. As a test I also tried it with builds from Fedora and some other packagers. Again, same effect.

The only other weird bit is that I can get it to happen consistently on x86_64 systems with 1G of RAM. So maybe this is a RAM issue. For example, it's never happened on this server (32-bit, 2G), which is identical in terms of software to our x86_64 server where it does.

Posted: Thu Sep 18, 2008 7:56 pm
by aus-city
Well, at least it's good that it's a common mod_security issue regardless of the operating system, and that it seems RAM related.

Let us know if you need access to any of our servers that have this issue to help.

Posted: Mon Sep 22, 2008 3:15 am
by biggles
Any update? Anything we can do to help? The problem forces me to restart apache every other day.

Posted: Mon Sep 22, 2008 4:20 am
by aus-city
I just get phone calls when I can't baby sit the server and catch it.

If it's any help, watch the plesk error logs on your busiest domain: you can catch the rule processing failed message about 20 minutes before you see the level 12 segmentation fault.

My site gets busier and it gets worse and worse; it's almost a daily event :(