Understanding the SPL Setup

CSE 451 - Spring 2000
by
Anonymous Lab Staff Member

This note describes the computing environment that supports CSE451. It contains much sage advice and many practical hints.

Networking

The networking environment has been explicitly designed to isolate machines in Rm. 231 (the SPL) from the Internet. (See figure below.) The machines within the SPL can talk to each other in the ways that you're accustomed to. However, the network connecting those machines is not itself directly connected to the Internet. Instead, it's connected to two gateway machines that serve as "filters"; none of the grubby traffic we're likely to generate while hacking the kernel on the SPL machines can reach the rest of the department (or world).

You can get to the world while logged into the SPL machines, though. The way to do this is via the two gateway hosts, greer and baughm. If you wish to get at anything outside the SPL (while using a machine in the SPL), you must first log in to greer or baughm and then log in (or whatever) from them to the outside world.

Login access to the gateway machines is restricted to ssh. Similarly, file transfer can be accomplished only by using scp.

File Storage

The intent is that all project code will reside on the gateway hosts. Your regular home directory will be available but you will not be able to store large amounts there due to the usual quota restrictions. Instead, each student has project space in /cse451/<login-name> that is accessible only from the gateway hosts. It is expected that all compiling and linking of projects will take place on those hosts.
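As a rough sketch of getting a file into your project space, the shell function below wraps the scp command. The function name push_to_project_space, the RUN dry-run variable and the default of greer are made up for illustration, and the sketch assumes your login name is the same on the gateway as on the machine you are copying from.

```shell
# Hypothetical helper, not part of the lab setup.
# Usage: push_to_project_space FILE [GATEWAY]
# Copies FILE into /cse451/<login-name> on a gateway host using scp.
# Set RUN=echo to print the command instead of running it.
push_to_project_space() {
    file=$1
    gateway=${2:-greer}             # greer or baughm
    login=${LOGNAME:-$(id -un)}     # assumption: same login name on the gateway
    run=${RUN:-}
    $run scp "$file" "$login@$gateway:/cse451/$login/"
}
```

For example, "RUN=echo push_to_project_space mychanges.tar baughm" prints the scp command without contacting the gateway.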

Since the number of changes to kernel source code will be extremely small relative to the total space consumed by each kernel, the /cse451 directories are not backed up. Instead, students are expected to copy only changed files to their regular home directory for backup purposes.
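One way to copy only the changed files is to keep a timestamp file and copy everything modified since it was last touched. The sketch below does that. The function name backup_changed and the stamp-file convention are assumptions made for illustration, not something the lab provides.

```shell
# Hypothetical helper, not part of the lab setup.
# Usage: backup_changed SRC_TREE STAMP_FILE DEST_DIR
# Copies every regular file under SRC_TREE that was modified more recently
# than STAMP_FILE into DEST_DIR, keeping the same relative path.
backup_changed() {
    src=${1%/}      # drop a trailing slash so the prefix strip below works
    stamp=$2
    dest=${3%/}
    find "$src" -type f -newer "$stamp" -print |
    while IFS= read -r path; do
        rel=${path#"$src"/}                     # path relative to SRC_TREE
        mkdir -p "$dest/$(dirname "$rel")" &&
        cp -p "$path" "$dest/$rel"
    done
}
```

Create the stamp file (with touch) right after unpacking the base kernel source, before you edit anything. A later call such as "backup_changed /cse451/yourlogin/linux /cse451/yourlogin/stamp $HOME/451-backup" then copies just the files you have touched since; the paths in that call are placeholders.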

Disk space on the SPL machines is completely unregulated and is not backed up. The odds are that the file systems on those machines will be blown away by you, or someone you know, on a (very) regular basis. (In fact, unless no one ever has a bug in their code, nothing on those machines can be considered to be regulated in any way.)

Security

For security reasons, you will have a different password within the SPL machine cluster. An initial password will be assigned to you. You can change it to anything that you want from one of the SPL machines. DO NOT change it to be the same as your usual password on the instructional cluster. The software environment within the SPL is completely insecure and you should not trust it any more than you absolutely must.

Configuring and Building a Linux Kernel

You will be given a base version of the kernel source code that is configured for use on the SPL machines. You will not need to reconfigure the kernel if you only modify existing source files. If you need to add a new file, do it in the following manner:

Determine which existing directory the file(s) will live in.
Edit the Makefile in that directory and add the name(s) of the .o file(s) that will be created to the definition of O_OBJS. If you are building a module, add the name(s) to M_OBJS.

To build the kernel you must be in the top-level source directory (the one containing a file named "REPORTING-BUGS"). From here the only four commands you should issue are:

"make dep" -- this will rebuild the file dependency information. You should only need to do this the first time you build a kernel or if you add new files.
"make bzImage" -- this command recompiles all necessary files and links the kernel. The compressed kernel executable file is named arch/i386/boot/bzImage.
"make modules" -- this command recompiles loadable modules used by the kernel. The modules themselves are nothing more than .o files which are inserted/deleted from the kernel by the insmod/rmmod commands discussed below.
"make clean" -- this command removes all binary files. The kernel will be recompiled from scratch.

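The usual build sequence from the commands above can be sketched as a small shell function. The function name build_kernel, its --dep flag and the MAKE override (handy for a dry run) are invented for this example. The check for REPORTING-BUGS simply mirrors the description of the top-level directory given above.

```shell
# Hypothetical helper, not part of the lab setup.
# Usage: build_kernel [--dep]
# Run from the top-level kernel source directory. Pass --dep on the first
# build or after adding new files. Set MAKE=echo to see what would be run.
build_kernel() {
    make=${MAKE:-make}
    if [ ! -f REPORTING-BUGS ]; then
        echo "build_kernel: not in the top-level kernel source directory" >&2
        return 1
    fi
    if [ "${1:-}" = "--dep" ]; then
        $make dep || return 1       # rebuild file dependency information
    fi
    $make bzImage &&                # compile and link the kernel
    $make modules                   # compile loadable modules
}
```

The function stops at the first failing step, so a compile error in bzImage will not be hidden by a later successful "make modules".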
Installing a New Linux Kernel

You need two files to install a new kernel: the System.map file from the top-level kernel source directory and the arch/i386/boot/bzImage file. Copy these files to the /boot directory of the target machine where you will run the kernel, giving them unique names on the target machine. E.g.

% scp System.map root@target:/boot/System.map-your_name
% scp arch/i386/boot/bzImage root@target:/boot/bzImage-your_name

On the target machine, edit the file /etc/lilo.conf adding the following lines:

image=/boot/bzImage-your_name
    label=linux-your_name
    append="console=ttyS0,9600 console=tty0"
    read-only
    root=/dev/sda9

There is a limit on the number of kernels that can be defined in the lilo.conf file; feel free to remove definitions left by people who have previously used the machine. DO NOT remove the lines beginning with

image=/boot/vmlinuz-2.2.14-5.0

This is a known-good kernel and can be used to reboot the machine if something goes wrong with a kernel that you install.

Finally, run the command "/sbin/lilo" to install kernel information in the boot sector. DO NOT forget to run /sbin/lilo, otherwise the kernel you just built will not be bootable. (If it seems to be booting in this case, it's because you already had a kernel with that name installed. Turning that around, if you just made a change and rebooted and it seems that change didn't happen, the odds are you forgot to run /sbin/lilo.)
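Because forgetting /sbin/lilo is such an easy mistake, it may help to bundle the copy and the lilo step together. The following is only a sketch: the function name install_kernel and the RUN dry-run variable are invented here, and running /sbin/lilo over ssh as root is an assumption (the scp commands above suggest root access is available, but that is not confirmed). It also assumes the stanza for your kernel is already in /etc/lilo.conf on the target.

```shell
# Hypothetical helper, not part of the lab setup.
# Usage: install_kernel TARGET YOUR_NAME
# Run from the top-level kernel source directory after a successful build.
# Set RUN=echo to print the commands instead of running them.
install_kernel() {
    target=$1
    name=$2
    run=${RUN:-}
    $run scp System.map "root@$target:/boot/System.map-$name" &&
    $run scp arch/i386/boot/bzImage "root@$target:/boot/bzImage-$name" &&
    # Assumption: root may run commands on the target over ssh.
    $run ssh "root@$target" /sbin/lilo
}
```

For example, "RUN=echo install_kernel seuss your_name" prints the three commands so you can check them before running anything for real.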

To boot your kernel, reboot the machine and hold down the ALT-key while the SCSI BIOS loads. You should receive a prompt of "LILO:". Enter the label you used for your kernel, e.g. "linux-your_name" as above.

If you are encapsulating your code as loadable modules, you must install and run the kernel that your modules were compiled with. However, you do not need to copy over a new kernel or reboot every time the modules you are using change. You need only copy the module .o files to the target machine and insert them into the running kernel using "/sbin/insmod dot-o-file.o". You must unload any previous versions of your module using "/sbin/rmmod dot-o-file" before loading a newer version.
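The unload-then-load cycle can be sketched as a shell function to run on the target machine. The function name reload_module and the RUN dry-run variable are made up for this example.

```shell
# Hypothetical helper, not part of the lab setup.
# Usage: reload_module PATH/TO/MODULE.o
# Unloads any previously loaded version, then inserts the new one.
# Set RUN=echo to print the commands instead of running them.
reload_module() {
    obj=$1
    run=${RUN:-}
    mod=$(basename "$obj" .o)       # rmmod takes the name without ".o"
    # rmmod fails if the module is not loaded yet; that is fine on first use.
    $run /sbin/rmmod "$mod" 2>/dev/null || true
    $run /sbin/insmod "$obj"
}
```

Note that the sketch ignores every rmmod failure, including the case where the old module is still in use; in that case the insmod that follows should fail and tell you so.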

If you are using modules, you should always use insmod/rmmod. DO NOT attempt to have modules automatically loaded since that has the potential to affect the configuration of the machine for subsequent users.

Kernel Debugging

The standard kernel source is configured to support IKD, the interactive kernel debugger. See the documentation in the Documentation/kdb directory of the kernel sources.

"Remote" Debugging

It is possible to control boot-time kernel selection and run the kernel debugger from a "remote" location; you do not have to be physically present in Rm. 231. Doing this requires that the target machine be connected by a serial line to a console server. Eight machines are configured in this manner: asimov, keegan, hawthorne, seuss, tolkien, hemingway, faulkner and wooster.

You can connect to the serial console on any of the above machines by using the rconsole program installed on the gateway machines, greer and baughm, by specifying the target machine name. E.g., "rconsole seuss". If you use the "-l file-name" option, a transcript of the terminal session will be written to file-name. Once connected, you must give a carriage-return (Enter key) to receive a login prompt.

If you reboot the machine while connected, the LILO: prompt will eventually appear and you can give the name of the kernel you wish to boot. Hitting <TAB> at the LILO: prompt will give you a list of the label-names of available kernels.

It is possible to use the kernel debugger from the serial console. See the kdb documentation mentioned above for details.

To terminate your console session, use the two-character key combination of ctrl-shift-_ followed by q. Please do not hog the serial console lines.

When Disaster Strikes

It is entirely possible that an experimental kernel will destroy some or all of the file system on a Lab machine, making it unusable. If that happens you will be expected to re-install the system. This is a relatively simple procedure that requires you to insert a floppy disk and boot from it after hitting the reset button or power cycling the machine. (A couple of installation floppies are located in a plastic sleeve taped to the side of the machine labeled loom17. The floppies are labeled "RH6.2 netboot 20000407".)

You will be presented with a Boot: prompt. Type "linux ks" followed by a carriage return. No further intervention should be necessary. The total time to re-install is less than 15 minutes. At the end of the installation, you will be presented with a screen that congratulates you on your efforts and invites you to remove the floppy disk and reboot. So do that.

If the installation fails or the new system behaves in a peculiar manner, notify your instructor.

Don't screw it up for the next guy

SPL machines will be available (to CSE451 students) on a first-come, first-served basis; it is unlikely that you will always be able to use the same machine for your experiments. Therefore you should be extremely careful not to change machine configuration information or startup scripts, or to install modified versions of standard programs. Such actions can lead to mysterious behavior that subsequent users will have to deal with; having wasted two hours on that, they will then come looking to deal with you.