Date: Tue, 6 Nov 2001 12:30:00 -0800
From: Mark Ferlatte
Subject: Linux Thingy -- Bootloaders

So, a long time ago, I used to write up and mail out stuff called
"Linux Thingys" to misc, which consisted of semi-useful information
mixed in with totally useless information about Unix and Linux.  Some
people found them to be nice and helpful, and other people undoubtedly
just deleted them.  Anyway, it seems like as good a time as any to
bring back the Linux Thingy.

However, previously the Linux Thingys were all about useful tricks and
tools from an end-user perspective.  The one after this one probably
will be, but I got all excited learning about various bootloaders this
week, and wrote this up anyway.  If you don't install your own
operating systems, this particular Thingy isn't going to be helpful,
but it might be interesting.

This particular Linux Thingy is going to be about a part of your
computer that you normally only care about when it breaks: the boot
loader.  The boot loader is responsible for loading the operating
system kernel after the system hardware is done initializing.  In the
interests of being specific, we're going to talk about what happens on
Intel architecture machines, that being what most of you have.  For
the record, just about every other architecture does a cleaner job of
booting than Intel (Sparc and PowerPC, especially, since they have
Open Firmware instead of a BIOS).

So, here's what happens:  You turn your computer on, and fairly
quickly (assuming the base hardware tests succeed), your system BIOS
is loaded into memory, and begins to execute 16-bit code.  Yep,
that's right: your shiny new Pentium IV running at 1 GHz still boots
like a 286.  What's even worse is that your BIOS isn't even doing
anything useful for you if you're running a modern operating system;
it only exists for DOS compatibility, so if you've left Windows 95,
98, and Me behind you, your BIOS is just extra time in your boot
sequence.

Once the BIOS is done doing whatever initialization that it needs to
do, it starts the loading process from the system boot device, usually
your first hard disk.  The first thing that loads is a tiny 512-byte
program called the Master Boot Record (MBR).  This program lives in
the first sector of your hard drive, and its only job is to determine
the bootable partition, and load and execute the first sector of that
partition (the stage 1 bootloader).  The stage 1 bootloader (which can
also only be 512 bytes) then loads and executes the stage 2 bootloader.
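
To make the MBR's job a bit more concrete, here is a rough Python
sketch of the "determine the bootable partition" step.  It assumes the
classic PC layout (boot code up to byte 446, four 16-byte partition
entries, and a 0x55 0xAA signature in the last two bytes); the function
names and the sample sector are made up for illustration, and a real
MBR does this in 16-bit assembly, not Python.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446    # boot code occupies the bytes before this
ENTRY_SIZE = 16                 # four entries of 16 bytes each
BOOT_SIGNATURE = b"\x55\xaa"    # last two bytes of a valid MBR
ACTIVE_FLAG = 0x80              # status byte of the bootable partition


def find_bootable_partition(mbr):
    """Return (index, start_sector, sector_count) of the first active
    partition, or None if no partition is flagged bootable."""
    if len(mbr) != SECTOR_SIZE or mbr[510:512] != BOOT_SIGNATURE:
        raise ValueError("not a valid MBR sector")
    for index in range(4):
        offset = PARTITION_TABLE_OFFSET + index * ENTRY_SIZE
        entry = mbr[offset:offset + ENTRY_SIZE]
        status = entry[0]
        # Bytes 8-11: starting sector, bytes 12-15: length (little-endian).
        start_sector, sector_count = struct.unpack_from("<II", entry, 8)
        if status == ACTIVE_FLAG:
            return index, start_sector, sector_count
    return None


def make_sample_mbr():
    """Build a fake MBR (hypothetical values) whose second partition
    entry is marked active."""
    mbr = bytearray(SECTOR_SIZE)
    entry = struct.pack("<B3sB3sII",
                        ACTIVE_FLAG, b"\0\0\0",   # status, CHS start (unused here)
                        0x83, b"\0\0\0",          # partition type, CHS end (unused)
                        63, 204800)               # start sector, sector count
    offset = PARTITION_TABLE_OFFSET + 1 * ENTRY_SIZE
    mbr[offset:offset + ENTRY_SIZE] = entry
    mbr[510:512] = BOOT_SIGNATURE
    return bytes(mbr)


print(find_bootable_partition(make_sample_mbr()))   # (1, 63, 204800)
```

If you wanted to try it on a real sector, reading the first 512 bytes
of your disk device (as root) and passing them in should work, but
treat that as an untested suggestion.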

The stage 2 bootloader can be larger than 512 bytes (but is still
pretty small... various restrictions limit a stage 2 bootloader to
about 7k), and is responsible for loading the next stage in the
boot process.  Usually, the next stage is an OS kernel of some type,
but not always, as I'll explain below.  Finally, the OS kernel is
loaded, and begins to execute, and then you're in your operating
system's boot procedure.

Windows Me, 98, 95, and MS-DOS all use the same boot loader.  The DOS
stage1 knows that the next thing to load is in a certain place on the
partition.  The "certain place" part is why you have to use the
SYS.COM command to make an MS-DOS boot disk... SYS.COM puts the
IO.SYS and MSDOS.SYS files in the correct physical location on the
disk so that the DOS loader can find them.  Once IO.SYS is loaded,
Windows goes on its merry way.

Windows NT, 2000, and XP actually have a real bootloader.  The MBR
loads a stage1 bootloader from the bootable partition, which loads a
stage2 bootloader (called NTLDR), which reads c:\boot.ini, and can
present choices to the user.  You can actually get this to boot
another OS on another partition, if you want.  NTLDR then loads the
Windows NT kernel, whose startup used to be the pretty blue text, but
is now hidden behind various splash screens.
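
For the curious, here's a small Python sketch of what "reads
c:\boot.ini and presents choices" amounts to.  The sample file contents
(the ARC-style path, the labels, the timeout) are invented for
illustration, and the parser is my own simplification, not what the NT
loader actually runs.

```python
# Sample boot.ini contents; every value here is a made-up example.
SAMPLE_BOOT_INI = r"""[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINNT
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINNT="Microsoft Windows 2000 Professional" /fastdetect
C:\="Microsoft Windows 98"
"""


def parse_boot_ini(text):
    """Split the file into sections, each a list of (key, value) pairs."""
    sections = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(";"):
            continue                      # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1].lower(), [])
            continue
        if current is not None and "=" in line:
            key, value = line.split("=", 1)   # split on the first "=" only
            current.append((key.strip(), value.strip()))
    return sections


def menu_entries(sections):
    """Return (label, target, is_default) for each bootable choice."""
    default = dict(sections["boot loader"]).get("default")
    entries = []
    for target, description in sections["operating systems"]:
        # The description is a quoted label, optionally followed by switches.
        label = description.split('"')[1] if '"' in description else description
        entries.append((label, target, target == default))
    return entries


for label, target, is_default in menu_entries(parse_boot_ini(SAMPLE_BOOT_INI)):
    marker = "*" if is_default else " "
    print(f"{marker} {label}  ->  {target}")
```

The second entry in the sample is the "boot another OS on another
partition" case: the target is just a drive rather than a Windows NT
directory.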

The most common Linux bootloader is called LILO (LInux LOader), and
it's the bootloader everyone loves to hate.  While it has a
surprisingly large amount of functionality (you can make it use a
serial console, it can display VGA graphics for the boot menu, it can
boot kernels on very large disks), it's also a pain in the ass,
because you have to run /sbin/lilo to re-install the stage1 and 2
bootloaders every time you install a new kernel.  This is an
unnecessary evil, and makes new kernel installation more error prone.
Nevertheless, almost all Linux distributions use this, so we're
stuck with it for a while.  LILO can install its stage1 in either the
MBR (overwriting the existing one), or in the boot partition.

The reason that you have to run /sbin/lilo after installing a new
kernel is that the LILO bootloader doesn't understand file systems; it
only knows about the lower level block structure of the disk.  The
/sbin/lilo program does understand file systems, and translates the
kernel's path (e.g., /boot/vmlinuz-2.2.19) into a logical block address
(e.g., 3,4,123) so that the LILO bootloader can find the kernel image to
load.  Effectively, this is a big hack.
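
Here's a toy Python model of why that hack bites you.  Everything in it
(ToyDisk, install_map, boot) is hypothetical and hugely simplified:
install_map stands in for /sbin/lilo, which understands paths, and boot
stands in for the LILO bootloader, which only ever sees block numbers.

```python
class ToyDisk:
    """A pretend disk: numbered blocks plus a directory of path -> blocks."""

    def __init__(self):
        self.blocks = {}        # block number -> bytes stored there
        self.directory = {}     # path -> list of block numbers
        self.next_free = 100    # arbitrary starting block number

    def write_file(self, path, data, block_size=4):
        """Store data in fresh blocks and point the path at them."""
        numbers = []
        for i in range(0, len(data), block_size):
            self.blocks[self.next_free] = data[i:i + block_size]
            numbers.append(self.next_free)
            self.next_free += 1
        self.directory[path] = numbers


def install_map(disk, kernel_path):
    """Role of /sbin/lilo: resolve the path to raw block numbers."""
    return list(disk.directory[kernel_path])


def boot(disk, block_map):
    """Role of the bootloader: no file system knowledge, just read blocks."""
    return b"".join(disk.blocks[n] for n in block_map)


disk = ToyDisk()
disk.write_file("/boot/vmlinuz", b"kernel 2.2.19")
block_map = install_map(disk, "/boot/vmlinuz")
print(boot(disk, block_map))    # b'kernel 2.2.19'

# Install a new kernel at the same path; it lands in different blocks.
disk.write_file("/boot/vmlinuz", b"kernel 2.2.20")
print(boot(disk, block_map))    # b'kernel 2.2.19' (stale map, old blocks)

# Re-running the map installer is what fixes it.
block_map = install_map(disk, "/boot/vmlinuz")
print(boot(disk, block_map))    # b'kernel 2.2.20'
```

In the toy version the old blocks hang around, so the stale map still
boots the old kernel.  On a real disk those blocks may get reused for
something else, which is a much less pleasant failure.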

There is, however, hope.  Newer distributions are starting to switch
to a program called GRUB (the GRand Unified Bootloader).  GRUB was
originally based upon the excellent FreeBSD bootloader, but that
proved to be too difficult to adapt to other systems, so it was
re-written from scratch.  GRUB can boot FreeBSD, Linux, and the HURD
directly, and handles OpenBSD, NetBSD, and other operating
systems via an indirection mechanism (basically, GRUB pretends that
it is the MBR, and boots the other system's stage1 loader).

In addition to an easy-to-use boot menu, GRUB also provides a
command line shell interface at boot, which allows you to do many useful
things, including booting any kernel on any readable file system
(GRUB can read ext2, JFS, XFS, ReiserFS, and FAT), so not only do you
no longer need to run something like /sbin/lilo, you don't even have
to remember to put your kernel somewhere in /boot for GRUB to be able
to find it!  GRUB even provides some emergency recovery mechanisms
(including the ability to cat a file on one of the file systems that it
can read).  For all of the sysadmins who just went "Oh, crap!  That's
awful security", GRUB also supports MD5-hashed passwords to lock
out various operations.

GRUB can also act as a network boot agent, and can be burned into a PXE
boot EEPROM (so you can use it to netboot diskless workstations).

GRUB packs in all of this functionality by having a slightly different
boot process.  After the GRUB stage1 loads (either as a replacement
for the MBR, or chainloaded by the MBR), it loads something that
GRUB calls a stage1.5 loader.  The 1.5 loader understands file system
structure, and then proceeds to look for /boot/grub/stage2, which it
simply reads off of the file system and executes.  This allows the
stage2 to be significantly larger than normal (indeed, the GRUB stage2
is about 94k), and is what allows GRUB to have the large amount of
functionality that it contains.
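
As a rough illustration of the difference this makes, here is a toy
Python sketch where files are looked up by path at boot time instead of
through a precomputed block list.  ToyFileSystem, stage1_5, and
stage2_boot are hypothetical names, the nested dict is a stand-in for a
real file system, and the kernel path is made up.

```python
class ToyFileSystem:
    """Pretend file system: nested dicts are directories, bytes are files."""

    def __init__(self, root):
        self.root = root

    def read(self, path):
        """Walk the directory tree one path component at a time."""
        node = self.root
        for part in path.strip("/").split("/"):
            node = node[part]
        return node


def stage1_5(fs, stage2_path="/boot/grub/stage2"):
    """Role of the 1.5 loader: find stage2 by name and hand it back."""
    return fs.read(stage2_path)


def stage2_boot(fs, kernel_path):
    """Role of stage2: load whatever kernel path the user asked for."""
    return fs.read(kernel_path)


fs = ToyFileSystem({
    "boot": {"grub": {"stage2": b"stage2 code"}},
    "home": {"mark": {"vmlinuz-test": b"test kernel"}},
})

print(stage1_5(fs))                                # b'stage2 code'
print(stage2_boot(fs, "/home/mark/vmlinuz-test"))  # b'test kernel'

# Replace the kernel; nothing needs to be re-installed before the next boot.
fs.root["home"]["mark"]["vmlinuz-test"] = b"newer test kernel"
print(stage2_boot(fs, "/home/mark/vmlinuz-test"))  # b'newer test kernel'
```

Because the lookup happens at boot, the kernel can live anywhere the
loader can read and can be swapped out freely.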

FreeBSD's bootloader is roughly equivalent to GRUB (the major
exception being the file systems that it understands), and has pretty
much the same set of features.  It has a different boot shell syntax,
of course, but it is as well documented as GRUB is.

In addition to all of that boot loader wackiness, the Debian project
actually re-wrote the standard (DOS) MBR program from scratch.
Initially, this was simply so that the Debian distribution would be
running on completely Free Software, but the developer actually
managed to squeeze some functionality out of those 512 bytes of
assembly.  The default operation of this MBR lets you select via a key
press which of the partitions on the disk should be booted, regardless
of the status of the bootable partition flag.  However, you can also
use this MBR to work around Y2K bugs found in various system BIOSes
(like the 1994/1995 Award BIOSes).

This was undoubtedly way more than anyone ever really wanted to know
about bootloaders, but some of you might find this useful, or at least
educational.

Have fun!

M