Project 1 - The Shell and System Calls

Out: Monday, January 14, 2002
Due: Wednesday, January 23, 2002

Assignment Goals

Background

As we've discussed in class, the OS command interpreter is the program that people interact with in order to launch and control programs. On UNIX systems, the command interpreter is usually called the shell: it is a user-level program that gives people a command-line interface to launching, suspending, and killing other programs. sh, ksh, csh, tcsh, bash, ... are all examples of UNIX shells. (It might be useful to look at the manual pages of these shells, for example, type: "man csh".)

Every shell is structured as the following loop:

  1. print out a prompt
  2. read a line of input from the user
  3. parse the line into the program name, and an array of parameters
  4. use the fork() system call to spawn a new child process
    • the child process then uses the exec() system call to launch the specified program
    • the parent process (the shell) uses the wait() system call to wait for the child to terminate
  5. once the child (i.e. the launched program) finishes, the shell repeats the loop by jumping to 1.

Although most of the commands people type at the prompt are the names of other UNIX programs (such as ls or more), shells recognize some special commands (called internal commands) which are not program names. For example, the exit command terminates the shell, and the cd command changes the current working directory. Shells make system calls directly to execute these commands, instead of forking a child process to handle them.

This assignment consists of two parts. In the first, you will design and implement an extremely simple shell that knows how to launch new programs, and also recognizes three internal commands (exit, cd, and execcounts), which we will describe below. The first two internal commands will work by calling existing system calls (exit and chdir); the third internal command will work by calling a new system call that you will design and implement. So, in the second part of this assignment, you will design and implement the execcounts system call. This will involve making changes to the Linux kernel source code. The semantics of the execcounts system call, and some hints on how to go about implementing it, are described below.

The Assignment

Part 1: Build a new shell
Write a shell program in C which has the following features:

Part 2: Add a new system call

There are four system calls in Linux related to creating new processes: fork, vfork, execve, and clone. (The man pages describe the differences among them.) Instrument the kernel so that we can write a garden-variety user-level C program that prints the total number of times each of these four system calls has been invoked (by any process on the system).

To do this requires three things:

  1. Modify the kernel to keep track of this information.

  2. Design and implement a new system call that will get this data back to the user application.

  3. Write the user application.

We'd also like to be able to reset these statistics periodically, so we need a way to clear the counts accumulated so far. This requires either parameterizing the above system call to add a clear option, or adding another system call.

Warning 1: Remember that the Linux kernel should be allowed to access any memory location, while the calling application should be prevented from causing the kernel to unwittingly read/write addresses other than those in its own address space. Details about this are here.

Warning 2 (Hint 0): Remember that it's inconceivable that this problem (warning 1) has never before been confronted in the existing kernel.

Warning 3: Remember that the kernel must never, ever trust the application to know what it's talking about when it makes a request, particularly with respect to parameters passed in from the application to the kernel.

Warning 4: Remember that you must be sure not to create security holes in the kernel with your code.

Warning 5: Remember that the kernel should not leak memory.

SOME HINTS

You should be using the C language whenever you alter or add to the Linux kernel.

The part of the user-level application you didn't learn in CSE 142 is this:

#include <sys/syscall.h>
#define __NR_execcounts something
#include <unistd.h>
....

int ret = syscall(__NR_execcounts, ...);

(One last detail: If we were really implementing a new system call, we'd put the #define above in  <sys/syscall.h>.  But, we're better off not monkeying with that file, as it's shared among all of us.)

Recommended Procedure

I suggest you wade, rather than dive, into this.  In particular, here's a suggested set of incremental steps:

  1. Don't change any Linux code.  Figure out how to do a make of a new boot image, what file to move where so that you can boot the image you just created, how to tell the loader (LILO) that your image exists, and then how to boot your image.

  2. Now put a "printk()" somewhere in the code, and figure out how to find its output.  (Hints: /var/log and "man dmesg").

  3. Now implement a parameterless system call, whose body is just a printk() call.  Write a user-level routine that invokes it.  Check to make sure it was invoked.

  4. Now write the full implementation.

Part 3: Integrate the system call into the shell
Now that you have a working shell and an implementation of your new system call, it's time to integrate them; this should be very simple. Add a new internal command to your shell, called execcounts. The execcounts command should invoke the system call that you built in Part 2, and print out:

What to Turn In
You should print out and turn in the following:
  1. The C source code to your shell, and a Makefile that compiles the shell.

  2. The names of all of the Linux kernel source files that you modified in order to add your new system call, and a verbal description of what you did to them and why you needed to do it (i.e. why was it necessary to modify this particular file).

  3. The interface to the new system call (i.e., a miniature man page for it).

  4. The complete source code of the routine that implements the new system call in the kernel (i.e., just the new code you wrote, not the source code that was already in the kernel that got control to your new routine).

  5. A printout showing you using your new shell to invoke the /bin/date program, the /bin/cat program, and the exit and cd commands supported by your shell.  (/usr/bin/script might come in handy to generate this printout.  As always, do man script to find out how to use the command.)

  6. To attempt to achieve some sort of uniformity in results for the new system call, hand in the results obtained from the following (whose intent is to count the four system calls that occur due to a make bzImage of the kernel):
    • Get the kernel source onto the machine running your new kernel.
    • If necessary, do a make clean to remove any traces of previous compilations.
    • Invoke your shell.
    • cd to the kernel source directory.
    • Use your shell to reset the system call counts to zero.
    • Do a make bzImage.
    • Invoke execcounts and report these results.

  7. A brief (less than a page) discussion of any important design decisions that you made while implementing your system call and/or shell.