
AOOI lilo/grub boot problems on 1&1 FC6

Posted: Mon Feb 04, 2008 4:03 pm
by ajshea
I'm trying to install CentOS using http://www.atomicorp.com/installers/aooi on an older (1/2005) 1&1 Root 1 server. I just reimaged to their FC6 minimal image from the original FC2(!) image then ran the aooi installer.

On reboot it would not boot into the new "atomic" install. Lilo does not appear to be installed -- lilo.conf is there, but I can't locate lilo itself.

Grub does appear to be installed, but not configured. I poked and 'grubbed' around for quite a while before getting "grub.conf" (/boot/grub/menu.lst) configured the way I think is correct:

Code: Select all

title atomic 
	root (hd0,0)
	kernel /boot/vmlinuz.atomic root=/dev/hda1 console=ttyS0,57600n8 console=tty1 ks=hd:hda2/ks-1and1-i386.cfg
	initrd /boot/initrd.img
I also changed "default=0" to "default=3" at the top, but this still did not reboot correctly.
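
In GRUB legacy, "default=" is a zero-based index into the "title" entries, so it is worth confirming that 3 really points at the atomic entry. A minimal sketch for checking that, assuming the menu file has one "title" line per entry; grub_entry_index is a hypothetical helper name, not something shipped with grub or aooi:

```shell
#!/bin/sh
# grub_entry_index (hypothetical helper): print the zero-based position of the
# first "title" entry in menu file $1 whose first word equals $2.
# Exits non-zero if no such entry exists.
grub_entry_index() {
    awk -v want="$2" '
        $1 == "title" {
            if ($2 == want) { print n + 0; found = 1; exit }
            n++
        }
        END { if (!found) exit 1 }
    ' "$1"
}

# Usage (path as used in this thread):
#   grub_entry_index /boot/grub/menu.lst atomic
```

If the number printed differs from the "default=" value at the top of the file, grub will boot some other entry.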

After trying a number of combinations of things (very scientific, I assure you), I think what worked was

Code: Select all

grub-install /dev/hda
but I still did not get a prompt from grub on reboot even though timeout=3 in "grub.conf".

This time I got

Code: Select all

Linux version 2.6.18-53.el5 (mockbuild@builder6.centos.org) (gcc version 4.1.2 20070626 (Red Hat 4.1.2-14)) #1 SMP Mon Nov 12 02:22:48 EST 2007
instead of

Code: Select all

Linux version 2.6.20.20-071010a (root@buildd-i386) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 SMP Wed Oct 10 14:05:50 CEST 2007
It quickly ran through lots of messages, and now has stopped at

Code: Select all

ACPI: (supports S0 S1 S4 S5)
Freeing unused kernel memory: 220k freed
Write protecting the kernel read-only data: 388k
ààààààà
for the last ten minutes. I'm about ready to force a restart into the rescue kernel, because it certainly doesn't look right, and it hasn't gone anywhere at all. I can post the message from either boot if it would help anything.

In any case I would appreciate input on the grub menu.lst config.

- Alan

Progress and syntax

Posted: Tue Feb 05, 2008 1:02 am
by ajshea
I must have interrupted the kickstart the first time around... After unsuccessfully trying to get the system to boot from that I gave up and re-imaged. This time it only took 40 minutes, as compared to nearly 60 hours over the weekend! (Something got out of sync, but as usual helpful 1&1 support couldn't get to it until Monday morning... Uptime was 2 days 12 hours when I got access back.)

I had to go through the grub.conf and grub-install as previously posted, but that worked.

However, when it rebooted it started asking me all kinds of questions (language, keyboard, where is the image, etc). I couldn't find where to exit the installer, so I had to restart into the rescue system from the control panel to edit grub.conf.

According to http://www.centos.org/docs/5/html/5.1/I ... stall.html the syntax for a kickstart kernel option is

Code: Select all

ks=hd:<device>:/<file>
where the original aooi shell script has

Code: Select all

append="ks=hd:$KSDEV/ks-1and1-$ARCH.cfg console=ttyS0,57600n8"
(missing second ":"). After making that change in grub.conf and restarting, the installer is just going along like 60...
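
The same correction can be sketched as a small filter, assuming the device is a bare name such as hda2 (no slashes). fix_ks_arg is a hypothetical helper name, not part of aooi:

```shell
#!/bin/sh
# fix_ks_arg (hypothetical helper): insert the missing ':' between the device
# and the path in a ks=hd:<device>/<file> option.
# An option already in the ks=hd:<device>:/<file> form is left unchanged.
fix_ks_arg() {
    printf '%s\n' "$1" | sed -E 's#(ks=hd:[^:/[:space:]]+)/#\1:/#'
}

fix_ks_arg 'ks=hd:hda2/ks-1and1-i386.cfg console=ttyS0,57600n8'
# prints: ks=hd:hda2:/ks-1and1-i386.cfg console=ttyS0,57600n8
```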

Now I know that the funny characters I saw, posted above, are supposed to be blocks, as in a progress bar, but the font I was using substituted an accented character instead.

How does one keep the serial console from closing while this process goes on? Just keep hitting <space> every two minutes as a keep-alive? Very tedious.

Posted: Tue Feb 05, 2008 5:11 am
by ajshea
Well, the serial console closed on me after about 20 minutes, despite my best efforts to hit space every two minutes. When I reconnected right away I got no response. I let it go almost three hours "just in case" it was really doing something. Finally I forced a reboot into the normal system, but there was no response after ten minutes.

Rebooted into the rescue system and checked parted to see what the partitions were -- they had been fixed, so something worked. And when I mounted hda1, hda2, and hda3

Code: Select all

mount /dev/hda1 /boot
mount /dev/hda2 /opt
mount /dev/hda3 /home
mount -o loop /opt/Centos-5.1.ServerCD-i386-1of6.iso /opt/1
there is about 343MB of stuff on hda3 (which would normally mount as /). I'm not sure that the install was complete -- it seems mighty small.

There was nothing in /boot/grub, so I ran

Code: Select all

grub-install /dev/hda
, which responded "success - no errors". But since the rescue system is Debian, I thought I should copy the CentOS grub info from /home/usr/share/grub/i386-redhat/* to /boot/grub.

I also created /boot/grub/menu.lst

Code: Select all

# grub menu.lst
serial --unit 0 --speed 57600
#terminal serial console --timeout=5

default=0
timeout=5

title atomic 
    root (hd0,0)
    kernel /boot/vmlinuz.atomic root=/dev/hda1 console=ttyS0,57600n8 console=tty1 ks=hd:hda2:/ks-1and1-i386.cfg
    initrd /boot/initrd.img
-- which is the same as the original kickstart grub.conf. Maybe I should remove the kickstart stuff? But in that case shouldn't there be a different vmlinuz and/or initrd.img (since the ones from the CentOS isolinux are for installing)?

EDIT: On reflection, the install couldn't have been complete as mkswap hadn't been run on hda2 and rc.local doesn't exist (two of the last things the kickstart file does).

Third time is the charm?

Posted: Tue Feb 05, 2008 7:15 am
by ajshea
Third reimage, this time the default FC6 with plesk 8.2. This time I have been able to see the boot messages as they scrolled by:

Code: Select all

 Restarting system.

LILO 22.6.1 boot: 
Loading lx......................................................
BIOS data check successful
Linux version 2.6.20.21-071108a (root@buildd-i386) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 SMP Thu Nov 8 12:56:42 CET 2007 
So the rascal IS using LILO -- but the lilo program is NOT installed on this system! Am I just missing something or is 1&1's image that bad? (Or maybe they didn't install it so that we wouldn't go screwing things up...) So that's why I can't get it to start the installer.

Strange that it says "Debian 4.1.1-21" when I asked for a Fedora install. Curiouser and curiouser.

I'll try getting grub to install instead.

Posted: Tue Feb 05, 2008 9:43 am
by scott
wow no lilo this time around? That place is a moving target. Also you might want to try the dev tree, aooi2 instead.

wget -q -O - http://www.atomicorp.com/installers/aooi2 |sh

Posted: Tue Feb 05, 2008 9:49 pm
by ajshea
Thanks, I'll try the AOOI2. I tried acpi=off as suggested to guga-nyc, but it still locks up at

Code: Select all

ACPI: (supports<6>Time: tsc clocksource has been installed.
 S0 S1 S4 S5)
Freeing unused kernel memory: 220k freed
Write protecting the kernel read-only data: 388k
I am getting a grub menu now, and it clearly is loading the atomic isolinux kernel

Code: Select all

Linux version 2.6.18-53.el5 (mockbuild@builder6.centos.org) (gcc version 4.1.2 20070626 (Red Hat 4.1.2-14)) #1 SMP Mon Nov 12 02:22:48 EST 2007
This is very frustrating -- beginning with 1&1's broken images. I'll try aooi2 and report back. I would appreciate any other suggestions...

aooi2 feedback

Posted: Tue Feb 05, 2008 10:38 pm
by ajshea
AOOI2 (v0.11.1) is some pretty good scripting. Comments:

1. aooi2 is still missing the second : in the kickstart path for the atomic boot.

Line 452 is

Code: Select all

  append="ks=hd:$KSDEV/ks-1and1-$ARCH.cfg console=ttyS0,57600n8" 
but should be

Code: Select all

  append="ks=hd:$KSDEV:/ks-1and1-$ARCH.cfg console=ttyS0,57600n8" 
2. grub/lilo detection is unreliable -- I'm using grub, but since 1&1 set up the image as lilo (without installing lilo) aooi2 tries to add an atomic entry to lilo instead of grub. Here's the output of /tmp/disk.out

Code: Select all

LILO
ZRrI
D|f1
GRUB
Geom
Hard Disk
Read
 Error 
Perhaps moving the function addkernel to line 112 (grub detected in mbr string) instead of later would help. In a case like this, just do addkernel for both lilo and grub -- and it doesn't matter which one is actually working.
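
A minimal sketch of the "do both" idea, assuming the strings pulled from the first disk block are in a file like /tmp/disk.out. detect_loaders is a hypothetical helper name, and this is not how aooi2 itself is written:

```shell
#!/bin/sh
# detect_loaders (hypothetical helper): report every boot loader signature
# found in the strings file $1, instead of stopping at the first match.
detect_loaders() {
    found=""
    grep -q 'LILO' "$1" && found="$found lilo"
    grep -q 'GRUB' "$1" && found="$found grub"
    # Trim the leading space; an empty line means neither signature was seen.
    printf '%s\n' "${found# }"
}

# Demo on a sample file shaped like the /tmp/disk.out shown above.
sample=$(mktemp)
printf 'LILO\nZRrI\nD|f1\nGRUB\nGeom\nHard Disk\nRead\n Error \n' > "$sample"
for loader in $(detect_loaders "$sample"); do
    echo "would add an atomic entry for $loader"
done
rm -f "$sample"
```

With both signatures present, the loop runs once for lilo and once for grub, so it no longer matters which loader is actually active.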

Is "root=/dev/hda1" something that should be on the kernel line in grub.conf? It's there in the default config.

Posted: Tue Feb 05, 2008 11:11 pm
by ajshea
No luck on aooi2 -- the atomic boot image just locks up at

Code: Select all

Freeing unused kernel memory: 220k freed
Write protecting the kernel read-only data: 388k
That is to say, nothing happens after that. How do I debug this or get it to keep going?

Posted: Wed Feb 06, 2008 10:28 am
by scott
All that stuff at the end is legacy from aooi1; there's actually an exit above that (around 230 or so). I don't use those ks-1and1.cfgs any more.

The fact that you've got both LILO and GRUB in the first block of the disk makes it well-nigh impossible to figure out what the boot loader is. Props for even getting grub working on a 1and1 box. That was originally how I did this years ago, but I eventually had to abandon that because finding a working grub on the box was so unreliable.

Once you're actually booting off the CentOS ISO kernel, you're in the realm of CentOS support. There's not much else I can do in AOOI to assist.

The only thing that comes to mind is perhaps the output isn't going to serial for the kernel piece. How long have you waited when you get to this point?
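
One thing that could be checked here: Linux prints kernel messages to every console= listed, but /dev/console (where init and the installer write) is tied to the last one on the command line. With "console=ttyS0,57600n8 console=tty1" that is tty1, so output after the kernel messages may never reach the serial line. Whether that explains this particular hang is only a guess. A minimal sketch that moves the serial console to the end of a parameter string; serial_console_last is a hypothetical helper name:

```shell
#!/bin/sh
# serial_console_last (hypothetical helper): reorder a kernel parameter string
# so that any console=ttyS* option comes last and therefore becomes
# /dev/console. Assumes the parameters contain no shell glob characters.
serial_console_last() {
    serial=""
    rest=""
    for word in $1; do
        case $word in
            console=ttyS*) serial="$serial $word" ;;
            *)             rest="$rest $word" ;;
        esac
    done
    out="$rest$serial"
    printf '%s\n' "${out# }"
}

serial_console_last 'root=/dev/hda1 console=ttyS0,57600n8 console=tty1 ks=hd:hda2:/ks-1and1-i386.cfg'
# prints: root=/dev/hda1 console=tty1 ks=hd:hda2:/ks-1and1-i386.cfg console=ttyS0,57600n8
```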

Posted: Wed Feb 06, 2008 11:46 am
by ajshea
Thanks for the feedback. Does it look like I have the grub config correct? I haven't used it before.
scott wrote: The only thing that comes to mind is perhaps the output isn't going to serial for the kernel piece. How long have you waited when you get to this point?
I let it sit for three hours once and nothing seemed to happen - evidenced by the fact that restarting it showed the partitions just as they were.

This did work twice, several days ago, but it wasn't getting the kickstart, and then it didn't finish somehow and I couldn't get the centos install to boot -- there was only ~ 340Mb installed, which seems low. I fought with it from the rescue console for two days and couldn't get it to boot, so reimaged. It hasn't worked since.

Posted: Wed Feb 06, 2008 3:06 pm
by scott
Wow, that makes me think you've got some sketchy hardware. One of the 1and1 guys gave me a server to play with where he can't get the network to work at all from the CentOS ISO kernel. I was about to follow up on this thread to mention that it might be the network, but the fact that you did get a network install going rules that out.

Anyway, on to fixes. With this new box I'm going to try to wrap up the local install code in aooi2, so it's an amazingly long shot, but it might be a solution for your problem.

Posted: Wed Feb 06, 2008 5:20 pm
by ajshea
Thanks. The FC6 image provided by 1&1 does work; I just really don't like being on an unsupported OS, and there are other things that just aren't right with their images (no lilo, no lspci, for example).

Posted: Wed Feb 06, 2008 7:10 pm
by scott
They didn't have fsck.xfs on there for a while either.

Re-image

Posted: Mon Feb 11, 2008 8:45 am
by guga-nyc
Well,

For me it took 18 reimages to get it right, so keep trying. Instead of acpi=off, try noapic. With acpi=off it was able to boot and operate, but I noticed some strange lockups while copying from one HD to another.

The noapic removed that behavior.
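
A minimal sketch of adding such a parameter to the kernel lines of a GRUB legacy menu.lst, assuming each entry has a single "kernel" line. add_kernel_param is a hypothetical helper name; it writes the modified config to stdout and leaves the original file untouched:

```shell
#!/bin/sh
# add_kernel_param (hypothetical helper): append parameter $2 to every
# "kernel" line of the GRUB legacy config in file $1, skipping lines that
# already carry it. The result goes to stdout.
add_kernel_param() {
    awk -v p="$2" '
        $1 == "kernel" {
            present = 0
            for (i = 2; i <= NF; i++) if ($i == p) present = 1
            if (!present) { sub(/[ \t]+$/, ""); $0 = $0 " " p }
        }
        { print }
    ' "$1"
}

# Usage (review the output before replacing the real file):
#   add_kernel_param /boot/grub/menu.lst noapic > /tmp/menu.lst.new
```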

About the 1n1 images themselves: with the last one I wasn't able to mount a CD ... it kept complaining that it did not understand iso9660.

PS: Does aooi2 (network) work in the 1n1 environment?

Posted: Mon Feb 11, 2008 1:16 pm
by scott
No, it doesn't always work as a network install at 1and1. I couldn't tell you why either; I've got their support looking into it as well.

I did implement local installs in aooi2 last week, and tested it successfully on one of the 1and1 servers where network installs did not work.