Advice/opinions requested

faris
Long Time Forum Regular
Posts: 2321
Joined: Thu Dec 09, 2004 11:19 am

I've just done a test install of Scott's grsec kernel rpms on my test server. All went well. I did find that I can't use X, though: grsec stops it from starting. But that's another issue and easily sorted, I'm sure.

What I need are some opinions on whether to install it on my production servers or not.

The problem is this: After updating from Plesk 7.1.4 to 7.5.2, my primary server started freezing. No errors. Nothing. Just freezing. It would happen once a day at least, and sometimes once every few mins.

I spent some sleepless nights assuming it was something to do with the new version of Plesk or 4PSA software or something along those lines. Then various very knowledgeable people suggested that, given the lack of errors in the logs apart from some unusual messages from my hardware probes (which occurred well before the "freezes"), the issue must be a hardware failure.

But I wasn't 100% convinced. Yes, everything pointed to a hardware failure, but there just didn't seem to be any actual hardware failing. What appeared to be happening was the hardware probes failing to find the hardware to probe, yet when put through tests etc everything checked out.

So I upgraded the kernel (I realised I was one release behind the times), and re-installed the Dell OpenManage hardware monitoring/diagnostic software.

And after doing this everything was perfect. Not one freeze since then.

Now I'm convinced I won't have any problems from the grsec kernel per se, but I am half afraid that I was just lucky with my last reboot - basically that on the final reboot after updating, the particular thing that was causing things to go wrong (causing the freezes) just didn't happen. e.g. some bit of memory not written to, some this or some that happening or not happening. I don't know.

Any opinions on this? Should I risk it or not? I know that at the end of the day I have to reboot sometime! And if the primary server fails at least I have a backup ready to take over should it be necessary. But ...

Faris.
scott
Atomicorp Staff - Site Admin
Posts: 8355
Joined: Wed Dec 31, 1969 8:00 pm
Location: earth

Do you have a remote serial console you can hook up to the system? This would catch any kernel errors that wouldn't normally be getting written to the console. Here's how you enable it:

in grub.conf:
#splashimage=(hd0,0)/grub/splash.xpm.gz
# this sends grub output to the serial console
serial --unit=0 --speed=9600
terminal --timeout=15 console serial

# This sends kernel output to the serial console
title Red Hat Linux (2.6.11art-2)
root (hd0,0)
kernel /vmlinuz-2.6.11art-2 ro root=/dev/sda3 noacpi console=tty0 console=ttyS0
initrd /initrd-2.6.11art-2.img

and in /etc/inittab:
# This sets up a tty on the serial console
0:12345:respawn:/sbin/agetty ttyS0 9600

ttyS0 corresponds to the first serial port on the system, ttyS1 would be the second COM port, etc. If you're running this between two servers, you'll need to use a null modem cable, and set your terminal program to use 9600 baud.
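
If no terminal program is handy on the receiving server, the capture side can be roughed out in a few lines of Python using only the standard library. This is just a sketch under assumptions, not a replacement for a proper terminal program: it assumes the cable arrives on /dev/ttyS0 of the second machine, that the link is 9600 baud 8N1 to match the settings above, and the function names and the log file name are made up for illustration. It timestamps every line it receives, so the last kernel message before a freeze is still on disk afterwards.

```python
import os
import termios
import time


def open_serial(path="/dev/ttyS0", baud=termios.B9600):
    """Open a serial device read-only in raw 8N1 mode (assumed settings)."""
    fd = os.open(path, os.O_RDONLY | os.O_NOCTTY)
    # tcgetattr returns [iflag, oflag, cflag, lflag, ispeed, ospeed, cc]
    attrs = termios.tcgetattr(fd)
    attrs[0] = 0  # no input translation
    attrs[1] = 0  # no output processing
    attrs[2] = termios.CS8 | termios.CREAD | termios.CLOCAL  # 8 data bits, ignore modem lines
    attrs[3] = 0  # no echo, no line editing
    attrs[4] = baud  # input speed
    attrs[5] = baud  # output speed
    attrs[6][termios.VMIN] = 1  # block until at least one byte arrives
    attrs[6][termios.VTIME] = 0
    termios.tcsetattr(fd, termios.TCSANOW, attrs)
    return os.fdopen(fd, "rb", buffering=0)


def log_lines(stream, out, clock=time.strftime):
    """Copy each line from stream to out, prefixed with a timestamp."""
    for raw in stream:
        line = raw.decode("ascii", errors="replace").rstrip("\r\n")
        out.write(f"{clock('%Y-%m-%d %H:%M:%S')} {line}\n")
        out.flush()  # flush per line so nothing is lost if this box dies too
```

On the receiving machine, something like log_lines(open_serial("/dev/ttyS0"), open("serial-console.log", "a")) would start the capture and run until interrupted. The user running it needs read access to the serial device.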
faris
Long Time Forum Regular
Posts: 2321
Joined: Thu Dec 09, 2004 11:19 am

Ahhh! Thanks Scott. No, I don't, but I could arrange this, I think.

I'll need to get a null modem cable sent to the data centre and actually get someone to plug it into the right ports. I've been thinking of doing this in order to get console access to the server we have that doesn't have a DRAC card installed anyway.

I doubt I'll manage to get it done before the Easter break though, and that's when I really want to try the grsec kernel.

Oh for 48-hour days (either that or being able to function without sleep for a few weeks).

Faris.