Date: Tue, 6 Nov 2001 12:30:00 -0800
From: Mark Ferlatte
Subject: Linux Thingy -- Bootloaders

So, a long time ago, I used to write up and mail out stuff called "Linux Thingys" to misc, which consisted of semi-useful information mixed in with totally useless information about Unix and Linux. Some people found them to be nice and helpful, and other people undoubtedly just deleted them. Anyway, it seems like as good a time as any to bring back the Linux Thingy.

However, previously the Linux Thingys were all about useful tricks and tools from an end-user perspective. The one after this one probably will be, but I got all excited learning about various bootloaders this week, and wrote this up anyway. If you don't install your own operating systems, this particular Thingy isn't going to be helpful, but it might be interesting.

This particular Linux Thingy is going to be about a part of your computer that you normally only care about when it breaks: the boot loader. The boot loader is responsible for loading the operating system kernel after the system hardware is done initializing.

In the interests of being specific, we're going to talk about what happens on Intel architecture machines, that being what most of you have. For the record, just about every other architecture does a cleaner job of booting than Intel (SPARC and PowerPC, especially, since they have Open Firmware instead of a BIOS).

So, here's what happens:

You turn your computer on, and fairly quickly (assuming the base hardware tests succeed), your system BIOS is loaded into memory and begins to execute 16-bit code. Yep, that's right: your shiny new Pentium IV running at 1 GHz still boots like a 286. What's even worse is that your BIOS isn't even doing anything useful for you if you're running a modern operating system; it only exists for DOS compatibility, so if you've left Windows 95, 98, and Me behind you, your BIOS is just extra time in your boot sequence.
Once the BIOS is done doing whatever initialization it needs to do, it starts the loading process from the system boot device, usually your first hard disk. The first thing that loads is a tiny 512-byte program called the Master Boot Record (MBR). This program lives in the first sector of your hard drive, and its only job is to determine the bootable partition, and load and execute the first sector of that partition (the stage 1 bootloader).

The stage 1 bootloader (which can also only be 512 bytes) then loads and executes the stage 2 bootloader. The stage 2 bootloader can be larger than 512 bytes (but is still pretty small... various restrictions limit a stage 2 bootloader to about 7k), and is responsible for loading the next stage in the boot process. Usually, the next stage is an OS kernel of some type, but not always, as I'll explain below.

Finally, the OS kernel is loaded and begins to execute, and then you're in your operating system's boot procedure.

Windows Me, 98, 95, and MS-DOS all use the same boot loader. The DOS stage1 knows that the next thing to load is in a certain place on the partition. The "certain place" part is why you have to use the SYS command to make an MS-DOS boot disk... SYS puts the IO.SYS and MSDOS.SYS files in the correct physical location on the disk so that the DOS loader can find them. Once IO.SYS is loaded, Windows goes on its merry way.

Windows NT, 2000, and XP actually have a real bootloader. The MBR loads a stage1 bootloader from the bootable partition, which loads a stage2 bootloader (called NTLDR), which reads c:\boot.ini, and can present choices to the user. You can actually get this to boot another OS on another partition, if you want. NTLDR then loads the Windows NT kernel, which used to be the pretty blue text, but is now hidden behind various splash screens.

The most common Linux bootloader is called LILO (LInux LOader), and it's the bootloader everyone loves to hate.
While it has a surprisingly large amount of functionality (you can make it use a serial console, it can display VGA graphics for the boot menu, it can boot kernels on very large disks), it's also a pain in the ass, because you have to run /sbin/lilo to re-install the stage1 and stage2 bootloaders every time you install a new kernel. This is an unnecessary evil, and makes new kernel installation more error prone. Nevertheless, almost all Linux distributions use this, so we're stuck with it for a while. LILO can install its stage1 in either the MBR (overwriting the existing one), or in the boot partition.

The reason that you have to run /sbin/lilo after installing a new kernel is that the LILO bootloader doesn't understand file systems; it only knows about the lower level block structure of the disk. The /sbin/lilo program does understand file systems, and translates the kernel's path (e.g., /boot/vmlinuz-2.2.19) into a logical block address (e.g., 3,4,123) so that the LILO bootloader can find the kernel image to load. Effectively, this is a big hack.

There is, however, hope. Newer distributions are starting to switch to a program called GRUB (the GRand Unified Bootloader). GRUB was originally based upon the excellent FreeBSD bootloader, but that proved to be too difficult to adapt to other systems, so it was re-written from scratch. GRUB can boot FreeBSD, Linux, and the HURD directly, and handles OpenBSD, NetBSD, and other operating systems via an indirection mechanism (basically, GRUB pretends that it is the MBR, and boots the other system's stage1 loader).
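
For the curious, the MBR's job that GRUB takes over in that last case (check the sector, find the partition marked bootable, and hand off to its first sector) can be sketched in a few lines of Python. This is only an illustration of the standard on-disk partition table layout, not GRUB's code or any real MBR's (those are 16-bit assembly), and the names find_active_partition and make_test_sector are made up for this example.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446      # the partition table starts here in the MBR sector
ENTRY_SIZE = 16         # four entries of 16 bytes each
ACTIVE_FLAG = 0x80      # status byte of a partition marked bootable
ENTRY_FORMAT = "<B3sB3sII"  # status, CHS start, type, CHS end, start LBA, sector count


def find_active_partition(sector):
    """Return (index, start_lba) of the first bootable partition, or None."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly 512 bytes")
    if sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55 0xAA boot signature")
    for index in range(4):
        offset = TABLE_OFFSET + index * ENTRY_SIZE
        entry = sector[offset:offset + ENTRY_SIZE]
        status, _chs_first, part_type, _chs_last, start_lba, _count = struct.unpack(
            ENTRY_FORMAT, entry
        )
        # A type of 0 means the slot is unused, whatever the status byte says.
        if status == ACTIVE_FLAG and part_type != 0:
            return index, start_lba
    return None


def make_test_sector():
    """Build a fake MBR whose second slot is bootable and starts at LBA 63."""
    sector = bytearray(SECTOR_SIZE)
    entry = struct.pack(ENTRY_FORMAT, ACTIVE_FLAG, b"\x00" * 3, 0x83, b"\x00" * 3, 63, 1000)
    sector[TABLE_OFFSET + ENTRY_SIZE:TABLE_OFFSET + 2 * ENTRY_SIZE] = entry
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)
```

To try it against a real disk you would read the first 512 bytes of the device (which normally needs root) and pass them to find_active_partition; the fake sector is there so you can play with it without touching your disk.
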
In addition to an easy-to-use boot menu, GRUB also provides a command line shell interface at boot, which allows you to do many useful things, including booting any kernel on any readable file system (GRUB can read ext2, JFS, XFS, ReiserFS, and FAT), so not only do you no longer need to run something like /sbin/lilo, you don't even have to remember to put your kernel somewhere in /boot for GRUB to be able to find it! GRUB even provides some emergency recovery mechanisms (including the ability to cat a file on one of the file systems that it can read). For all of the sysadmins who just went "Oh, crap! That's awful security", GRUB also supports MD5-hashed passwords to lock out various operations. GRUB can also act as a network boot agent, and can be burned into a PXE boot EEPROM (so you can use it to netboot diskless workstations).

GRUB packs in all of this functionality by having a slightly different boot process. After the GRUB stage1 loads (which can be as a replacement for the MBR, or chainloaded by the MBR), it loads something that GRUB calls a stage1.5 loader. The 1.5 loader understands file system structure, and proceeds to look for /boot/grub/stage2, which it simply reads off of the file system and executes. This allows the stage2 to be significantly larger than normal (indeed, the GRUB stage2 is about 94k), and is what allows GRUB to have the large amount of functionality that it contains.

FreeBSD's bootloader is roughly equivalent to GRUB (the major exception being the file systems that it understands), and has pretty much the same set of features. It has a different boot shell syntax, of course, but it is as well documented as GRUB is.

In addition to all of that boot loader wackiness, the Debian project actually re-wrote the standard (DOS) MBR program from scratch.
Initially, this was simply so that the Debian distribution would be running on completely Free Software, but the developer actually managed to squeeze some functionality out of those 512 bytes of assembly. The default operation of this MBR lets you select via a key press which of the partitions on the disk should be booted, regardless of the status of the bootable partition flag. However, you can also use this MBR to work around Y2K bugs found in various system BIOSes (like the 1994/1995 Award BIOSes).

This was undoubtedly way more than anyone ever really wanted to know about bootloaders, but some of you might find this useful, or at least educational.

Have fun!

M