EL4 is also still on 2.6.9.
Heh, yep, and soon to be DOA, or requiring an expensive Extended Life Cycle Support (ELS) subscription (which is limited to specific packages and hardware):
http://wiki.centos.org/FAQ/General#head ... dde5b75e6d
https://access.redhat.com/support/polic ... es/errata/
And with a 7-year support cycle, that's a pretty old setup!
I'm pretty sure that Media Temple will just tell the client to get a new VPS if they want to run on a newer kernel. Can you think of any way to positively verify that the kernel is really the problem here? And how could it be that it worked before and doesn't now? The kernel didn't change.
If it's a VPS, then is it safe to assume there are other VPSes on the same system? If so, then this being transient means it's likely caused by something changing in the host node, or maybe one of the other VPSes is straining the system. That's damn hard to show from inside the VPS; you're playing inside a chroot and an abstraction. This is really something you have to look at from the host node.
The downside is that 2.6.9 is positively ancient, and all the virtualization technology built around it is also really old and many generations behind what's used now. You're very unlikely to get Parallels or Red Hat to put much effort into fixing it, but it can't hurt to ask. Just be prepared for them to suggest you upgrade, given the short window of life left on EL4.
Of course, it could just be bad tuning of the VPS or host node, or a lack of resources. Since this is transient, that would be my guess: the VPS isn't getting what it needs to perform well. In which case DEFINITELY open a case with Parallels. This might be a simple tuning problem.
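If you want something concrete to attach to the case, here is a rough sketch of one thing you can check from inside the container. It assumes the VPS is a Parallels Virtuozzo (or OpenVZ) container that exposes /proc/user_beancounters, and that you run it as root. A non-zero failcnt on a resource means the container hit that limit at some point. The function name is made up for illustration, and it assumes Python 2.6 or newer, which a stock EL4 box will not have, so you may need to run it elsewhere against a copy of the file.

```python
def find_failed_counters(text):
    """Return {resource: failcnt} for every counter with failcnt > 0.

    Assumes the usual /proc/user_beancounters layout:
    [ctid:] resource held maxheld barrier limit failcnt
    """
    failed = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The first row of a container starts with "<ctid>:"; drop that prefix.
        # (This also reduces the "Version: x.y" line to one field, so it is skipped.)
        if fields[0].endswith(":"):
            fields = fields[1:]
        # Data rows have exactly six columns; the header row has seven.
        if len(fields) != 6:
            continue
        try:
            failcnt = int(fields[5])
        except ValueError:
            continue
        if failcnt > 0:
            failed[fields[0]] = failcnt
    return failed


if __name__ == "__main__":
    try:
        with open("/proc/user_beancounters") as f:
            report = find_failed_counters(f.read())
    except (IOError, OSError) as exc:
        # Not a Virtuozzo/OpenVZ container, or not running as root.
        print("could not read /proc/user_beancounters: %s" % exc)
    else:
        if not report:
            print("no resource limits have been hit")
        for name in sorted(report):
            print("%s: failcnt=%d" % (name, report[name]))
```

If the counts climb around the time Apache hangs, that points at the VPS being starved rather than at Apache itself. If they stay at zero, it does not rule out the host node, since you still can't see that from in here.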
Or, like I said, it could be an actual bug in the kernel (all the spinlock stuff changed since 2.6.9, and for good reason) and they may not be willing (or able) to fix it without you upgrading. The code you are using is so old it's got dust on it!
Free advice: This VPS has been causing you grief for some time, and EL4 has all sorts of baggage of its own that was only fixed in EL5 and EL6. It's really a losing battle trying to pin down the cause with the OS vendors getting ready to drop support soon, and then everyone else. You're going to have to upgrade soon if you want any support, so maybe this is the kicker to start planning. I assume there are some good reasons you can't upgrade, or you would have already. Wish I had an easy answer on this one. The clock is ticking though, and eventually the day is going to come when it's just EOL. So no matter what the solution to this problem is, an upgrade is in this VPS's future.
Now if you must stick with this old code, open a case with Parallels and don't give them ammo to blow you off. It's their kernel, and if they still support it, get them looking at this. If they don't support it, well, put a fork in it, it's done. Move on to a supported platform.
Whatever you do, don't tell them "well if I disable this it goes away". Unfortunately, that's just asking for the old doctor joke response from the support person:
Patient: "Doctor, it hurts when I do this."
Doctor: "Then don't do that. That'll be $50."
Just tell them Apache isn't restarting correctly, and leave it at that. This is definitely a problem being caused down at the kernel or VPS configuration level. I can see the strace you sent and it's stuck waiting, for what, who knows (you're in the VPS, so we can't see the full system). Disabling modsec is just telling Apache to do less work, which is hiding the disease by treating the symptom. It might be a configuration problem with the VPS, it could be a kernel bug, or it could be something else. Hard to say for sure from inside the VPS; we're just staring into a mirror. We need to get behind it.
If they insist on pushing you off because you are using 2.2.17, roll back to the buggy 2.2.3, play dumb, and let us know what they say. My guess is it's a configuration problem with the VPS. It's transient, so I think the host node is just not giving the VPS enough resources or something.