
6. Troubleshooting

If you followed the instructions and your disk won't boot, the first step is to determine where the boot is failing. In general, the further along it gets the easier it is to diagnose (though not necessarily to fix!). These steps are arranged more-or-less chronologically.

6.1 Configuration problems

These should be explained in the doc subdirectory. I try to catch as many problems as I can in the configure script, but Yard relies on many external programs, which change occasionally and introduce bugs.

6.2 Problems building the root filesystem

6.3 Problems in kernel loading

In the normal boot process you will see a message sequence like:

        LILO loading floppylinux...
        Uncompressing...done
        Now booting the kernel

If the sequence halts or displays an error somewhere in this sequence, the problem is with LILO (not necessarily in LILO, but with LILO).

With some exotic floppy disk geometries (usually involving disk capacities greater than 1440K), LILO's map compaction won't work, causing the LILO boot sequence to halt. If LILO halts and you're using a non-1440K floppy, this is likely the problem.

The fix is to remove the COMPACT line from your bootdisk's lilo.conf and run Yard again. This simply turns off LILO's map compaction, which should fix the problem (although kernel loading will slow down somewhat).
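Before rerunning Yard, it can help to confirm whether the option is actually present. The following is a minimal sketch, assuming the option sits on a line by itself; find_compact is a hypothetical helper name (not part of Yard or LILO), and ./lilo.conf is a placeholder for wherever your bootdisk's lilo.conf lives.

```shell
# Hypothetical helper: print, with line numbers, any line of a lilo.conf
# that consists solely of the "compact" keyword (matched case-insensitively).
find_compact() {
    grep -n -i '^[[:space:]]*compact[[:space:]]*$' "$1"
}

# Example call (placeholder path, adjust to your setup):
#   find_compact ./lilo.conf
```

If the helper prints nothing, the file has no standalone compact line, and the halt probably has another cause.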

If that doesn't work, switch to a 1440K floppy and try again. If you really want to puzzle it out, go read the section ``Disk Geometry'' in LILO's README file -- and good luck.

If 1440K doesn't work, something is very broken. Make sure LILO has been installed correctly. If you don't normally use LILO to boot, re-install a recent version. As a last resort, remake your kernel with ``make mrproper''.

If you're using a double-disk rescue set, both floppies must be formatted identically. The boot loader becomes confused otherwise.

6.4 Problems finding the compressed root filesystem

If the loader tells you it can't find a compressed root image, make sure you gave Yard the correct floppy device (e.g., /dev/fd0H1722 for 1722K). If you've constructed a single-disk rescue set and it prompts you to insert a root floppy disk, that's probably a Yard problem: the rdev in write_rescue_disk has failed for some reason and the failure wasn't caught by Yard. Go back and look over write_rescue_disk.log to make sure nothing failed.
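One quick way to look over a log is to search it for lines that resemble failures. This is only a sketch: the keywords are guesses, since the exact wording of Yard's log messages is not given here, and scan_logs is a hypothetical helper, not part of Yard.

```shell
# Hypothetical helper: list lines in the given log files that mention a
# likely failure. Output has the form file:line_number:text.
# The keyword list is an assumption, not Yard's actual message vocabulary.
scan_logs() {
    grep -H -n -i -E 'error|fail|warning|cannot|not found' "$@"
}

# Example call, run from the directory holding the log:
#   scan_logs write_rescue_disk.log
```

An empty result does not prove the run was clean, so reading the log itself is still worthwhile.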

6.5 Problems with init or login

  1. If the system repeatedly accepts a login name and then offers the login prompt again, this is a sign that NSS is not configured. On systems using the GNU Name Service Switch (NSS), you must explicitly include in Bootdisk_Contents a selected set of /lib/libnss_* shared libraries, as well as an /etc/nsswitch.conf file.
  2. If you get a message like:
    Id xxx respawning too fast: disabled for n minutes
    This comes from init, usually indicating that your *getty* or login is dying as soon as it starts up. (The notation *getty* will be used to mean some getty-like program, e.g. getty, agetty, mgetty or getty_ps.) Check the *getty* and login executables, and the libraries they depend upon. Make sure the invocations in /etc/inittab are correct.
  3. If halting occurs after the root filesystem is loaded, the problem is in Yard or in you. The first step is to check write_rescue_disk.log to make sure nothing failed. If you ignored warnings from make_root_fs or check_root_fs, look at them again. You should probably look over all three log files to make sure things are as they should be. make_root_fs.log has a listing of where each file was taken from -- make sure all your system files are coming from the right places on your hard disk.
  4. If you get strange messages from *getty*, it may mean the calling form in /etc/inittab is wrong. check_root_fs does not perform much inittab call checking because the options are so variable among the different *getty* programs (different versions of agetty are reported to have different, incompatible calling forms). If you're using a different call and/or program from what you use in your hard disk /etc/inittab, double check it.
  5. When you log in, you may get errors about commands that aren't present. This usually happens when an rc file uses a builtin command that does not exist in the rescue disk's shell, but did exist in the shell used to run Yard.
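For the library checks in items 1 and 2, the output of ldd can be reduced to a plain list of paths to compare against Bootdisk_Contents. This is a rough sketch: lib_paths is a hypothetical helper, and /sbin/agetty and /bin/login are assumed locations that may differ on your system.

```shell
# Hypothetical helper: read `ldd` output on stdin and print the resolved
# library paths, one per line.
lib_paths() {
    awk '
        # "libc.so.6 => /lib/libc.so.6 (0x...)": keep the resolved path.
        $2 == "=>" && $3 ~ /^\// { print $3; next }
        # "/lib/ld-linux.so.2 (0x...)": the dynamic loader itself.
        # Lines ending in ":" are the per-file headers ldd prints when
        # given several files, so they are skipped.
        $1 ~ /^\// && $1 !~ /:$/ { print $1 }
    '
}

# Example calls (assumed paths, adjust to your system):
#   ldd /sbin/agetty /bin/login | lib_paths | sort -u
#   ls /lib/libnss_*
```

Libraries that ldd reports as "not found" are left out of the list, so check the raw ldd output for those as well.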

6.6 And if none of that works...

Check the Yard webpage: http://www.croftj.net/~fawcett/yard/

On that page I'll put notes people have sent me about problems they've run into, before I've had a chance to fix and re-release Yard.

6.7 Suggestions

If you find a problem that Yard didn't catch, but which it could have caught, please let me know and I'll try to add a check to check_root_fs.

If you think it's a Yard problem you can try diagnosing it yourself. In fact, you may have better luck since you're more familiar with your setup. Otherwise, package up the three .log files along with your Bootdisk_Contents, Config.pl and Bootdisk_Contents.ls and send them to me with a reasonable description of what went wrong. This command should work:

        tar cvzf yard_bug.tgz *.log *.ls \
            /etc/yard/Bootdisk_Contents /etc/yard/Config.pl

Be sure to let me know which Yard version you're using. You might check the Yard webpage first to make sure there isn't already a newer version of Yard that fixes your problem, or a note about it there.

