Project 1 - The Shell and System Calls
Out: April 5th
Due: April 12th, 11:00am
Assignment Goals
-
To understand the relationship between OS command interpreters (shells), system
calls, and the kernel.
-
To become familiar with the tools and skills needed to understand, modify,
compile, install, and debug the Linux kernel.
-
To design and implement a simple shell and a system call.
Background
The OS command interpreter is the program that people interact with in
order to launch and control programs. On UNIX systems, the command interpreter
is usually called the shell: it is a user-level program that gives
people a command-line interface to launching, suspending, and killing other
programs. sh, ksh, csh, tcsh, bash, ... are all examples of UNIX
shells. (It might be useful to look at the manual pages of these shells, for
example, type: "man csh".)
Every shell is structured as the following loop:
-
print out a prompt
-
read a line of input from the user
-
parse the line into the program name, and an array of parameters
-
use the fork() system call to spawn a new child process
-
the child process then uses the exec() system call to launch the specified program
-
the parent process (the shell) uses the wait() system call to wait for
the child to terminate
-
once the child (i.e., the launched program) finishes, the shell repeats the loop
by returning to the first step.
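The fork/exec/wait steps of the loop can be sketched as a single helper. This is a minimal sketch, not a reference solution: run_command is a hypothetical name, execv is one of several exec variants, and waitpid is used here as the variant of wait that names the specific child to wait for.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program named by argv[0] with the
 * NULL-terminated argument array argv, wait for it to finish, and
 * return its exit status (or -1 if it did not exit normally). */
int run_command(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the requested program. */
        execv(argv[0], argv);
        /* execv returns only if it failed. */
        perror("execv");
        _exit(127);
    }

    /* Parent (the shell): block until this child terminates. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls _exit rather than exit after a failed execv so that it does not flush a copy of the parent's buffered output.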
Although most of the commands people type at the prompt are the names of other
UNIX programs (such as ls or more), shells recognize some
special commands (called internal commands) which are not program names. For
example, the exit command terminates the shell, and the cd
command changes the current working directory. Shells directly make system
calls to execute these commands, instead of forking a child process to handle
them.
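Internal commands have to run in the shell process itself: a chdir performed in a forked child would change only the child's working directory and be lost when the child exits. A minimal sketch of that dispatch, assuming the input line has already been split into a NULL-terminated array; try_builtin is a hypothetical name:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if argv named an internal command
 * (which has then been handled inside the shell process), 0 otherwise. */
int try_builtin(char *const argv[])
{
    assert(argv != NULL);

    if (argv[0] == NULL)            /* empty input line: nothing to do */
        return 0;

    if (strcmp(argv[0], "exit") == 0)
        exit(0);                    /* terminates the shell itself */

    if (strcmp(argv[0], "cd") == 0) {
        if (argv[1] == NULL)
            fprintf(stderr, "cd: missing directory\n");
        else if (chdir(argv[1]) != 0)
            perror("cd");
        return 1;
    }
    return 0;                       /* not internal: caller should fork/exec */
}
```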
This assignment consists of two parts. In the first, you will design and
implement an extremely simple shell that knows how to launch new programs, and
also recognizes three internal commands (exit, cd, and execcounts), which we
will describe below. The first two internal commands will work by calling
existing system calls (exit and chdir); the third internal
command will work by calling a new system call that you will design and
implement. So, in the second part of this assignment, you will design and
implement the execcounts system call. This will involve making changes
to the Linux kernel source code. The semantics of the execcounts system
call, and some hints on how to go about implementing it are also described
below.
The Assignment
Part 1: Build a new shell
Write a simple shell program in C which has the following features:
-
It should recognize two internal commands, exit and cd.
-
exit terminates the shell, i.e., the shell calls the exit()
system call or returns from main.
-
cd uses the chdir system call to change to the new directory.
-
If the command line is not an exit or cd, it should assume
that it is of the form
<executable name> <arg0> <arg1> .... <argN>
Your shell should invoke the executable, passing it the command line.
Assume that the full path names, like /bin/ls, are given. Also, please use the
same prompt as in the following:
CSE451Shell% /bin/date
Sat Jan 6 16:03:51 PST 2002
CSE451Shell% /bin/cat /etc/HOSTNAME /etc/motd
spinlock.cs.washington.edu
Pine, MH, and emacs RMAIL are the supported mailers on the
instructional systems.
[and on and on...]
Note: The text following each CSE451Shell% prompt is typed in by the user.
Please take a look at the manual pages of execv, fork and wait.
To allow users to pass arguments to programs you will have to parse the input
line into words separated by whitespace (spaces and '\t' tab characters) and
place these words into an array of strings. You might try using strtok()
for this (see man strtok for a very good example of how to solve exactly
this problem with strtok).
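The tokenizing step might look like the following. This is a sketch under assumptions: split_line and its parameters are hypothetical names, and '\n' is added to the delimiter set on the assumption that the line was read with fgets, which keeps the trailing newline.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split line in place on spaces, tabs, and newlines.
 * Stores at most max_args - 1 word pointers in argv, terminates the array
 * with NULL (as the exec functions expect), and returns the word count. */
int split_line(char *line, char *argv[], int max_args)
{
    assert(max_args >= 1);

    int argc = 0;
    char *tok = strtok(line, " \t\n");
    while (tok != NULL && argc < max_args - 1) {
        argv[argc++] = tok;
        tok = strtok(NULL, " \t\n");
    }
    argv[argc] = NULL;
    return argc;
}
```

Because strtok overwrites delimiters in the buffer it is given, the pointers stored in argv refer into line, so line must stay alive and unmodified until the command has been launched.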
Then you'll need to pass the name of the command as well as the entire list of
tokenized strings to one of the other variants of exec, such as execvp().
These tokenized strings will then end up as the argv[] argument to the
main() function of the new program executed by the child process. Try man
execv or man execvp for more details.
Part 2: Add a new system call
There are four system calls in Linux related to creating new processes: fork,
vfork, execve, and clone. (The man pages will
describe for you the differences among them.) You need to edit the kernel
so that you can write a user-level program that will print the number of
times each of these four system calls has been invoked (by any process on the
system). More specifically, I want to write a garden-variety C program that
prints out the total number of invocations of each of these four system calls.
To do this requires three things:
-
Modify the kernel to keep
track of this information.
-
Design and implement a new system call, execcounts, that
will get this data back to the user application.
-
Write the user application.
We'd also like to be able to reset these statistics periodically,
so we need a way to clear the request information we've tracked so far.
This requires either parameterizing the above system call to add a clear
option, or adding another system call.
There are several different ways to approach this problem. It is
your job to analyze them from an engineering point-of-view, determine the
trade-offs, and defend the implementation you select.
Warning 1: Remember that the Linux kernel should be allowed
to access any memory location, while the calling application should be
prevented from causing the kernel to unwittingly read/write addresses other
than those in its own address space. Don't create security holes!
Details about this are here.
Warning 2: Remember that the kernel should not leak memory.
HINTS (Read Carefully)
You should be using the C language whenever you alter or add to the
Linux kernel.
Your new system call has no C library wrapper, so you can't just call it by
name from C. Instead, you need to use the syscall function and pass it the
number of your new system call. This code fragment shows you how to do that:
/*
* Set the features included by the linux libc to have the BSD extensions
*/
#ifndef _BSD_SOURCE
#define _BSD_SOURCE 1
#endif
/* Replace someNumber with the number you assigned to your system call. */
#define __NR_execcounts someNumber
#include <unistd.h>
...
int ret = syscall(__NR_execcounts, ...);
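The mechanism can be tried before the new call exists by invoking an existing system call by number. A minimal sketch, assuming a Linux system with glibc; getpid_by_number is a hypothetical name, and SYS_getpid stands in for the number of your own call:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Invoke getpid by its system call number through syscall(), the same
 * path a new call without a library wrapper would take. */
long getpid_by_number(void)
{
    long pid = syscall(SYS_getpid);
    assert(pid > 0);    /* getpid cannot fail */
    return pid;
}
```

The value returned should match what the getpid() library function reports for the same process. On failure, syscall returns -1 and sets errno, which is also what you would see for a call number the running kernel does not implement.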
Recommended Procedure
I suggest you wade, rather than dive, into this. In
particular, here's a suggested set of incremental steps:
-
If you've never compiled the kernel, go through the
lab information page or see the
Linux Kernel HOWTO. It should not take longer than an hour and it
will ensure that you are up to speed with VMware.
-
Now implement a parameterless system call, whose body is just a
printk() call. Write a user-level routine that invokes it. Check to
make sure it was invoked.
-
Now write the full implementation.
Part 3: Integrate the system call into the shell
Now that you have a working shell and an implementation of your new system
call, it's time to integrate them; this should be very simple. Add a new
internal command to your shell, called execcounts. The execcounts
command should invoke the system call that you built in Part 2, and print out:
-
the number of invocations of each of the four "process creation" Linux system
calls that have occurred since the last invocation of your "clear" command.
-
the fraction of all such calls that each of the four types represents.
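The fraction computation needs one guard: right after a clear, the total may be zero. A minimal sketch, assuming the counts have already been fetched into an array in the order fork, vfork, execve, clone; that layout and the names compute_fractions and print_counts are hypothetical, not part of the assignment:

```c
#include <assert.h>
#include <stdio.h>

enum { NUM_CALLS = 4 };

/* Hypothetical layout: counts[] holds fork, vfork, execve, clone, in that
 * order. Writes each call's share of the total into fractions[]. */
void compute_fractions(const unsigned long counts[NUM_CALLS],
                       double fractions[NUM_CALLS])
{
    assert(counts != NULL && fractions != NULL);

    unsigned long total = 0;
    for (int i = 0; i < NUM_CALLS; i++)
        total += counts[i];

    for (int i = 0; i < NUM_CALLS; i++)
        /* Avoid dividing by zero when no calls have been counted yet. */
        fractions[i] = total ? (double)counts[i] / (double)total : 0.0;
}

/* Print one line per call: name, count, and percentage of the total. */
void print_counts(const unsigned long counts[NUM_CALLS])
{
    static const char *names[NUM_CALLS] = {"fork", "vfork", "execve", "clone"};
    double fractions[NUM_CALLS];

    compute_fractions(counts, fractions);
    for (int i = 0; i < NUM_CALLS; i++)
        printf("%-6s %10lu  %5.1f%%\n", names[i], counts[i],
               100.0 * fractions[i]);
}
```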
Copy all of the following items to a directory named proj1<username>
and turn that directory in using the turnin program. In addition,
please bring a paper copy of items 3-7 (everything in the list below except
the two source-code items) to lecture on the due date.
-
The source code of your shell.
-
The source code of the files you changed in the kernel.
-
A description of the Linux source files you changed and why you had to change
them.
-
Documentation for your new system call (i.e., a miniature man page for the
system call itself, not for the shell command). A text file is fine.
-
A transcript showing you using your new shell to invoke the /bin/date program,
the /bin/cat program, and the exit and cd commands
supported by your shell. (/usr/bin/script might come in handy to
generate this printout. As always, do man script to find out how
to use the command.)
-
To attempt to achieve some sort of uniformity in results for the new system
call, hand in the results obtained from the following:
-
Invoke your shell
-
Use your shell to reset the system call counts to zero
-
Run the following command exactly as given:
/usr/bin/find /etc -type f -exec touch t ;
Note: to try that command in most "real" shells, use the following:
/usr/bin/find /etc -type f -exec touch ~/t \;
-
Invoke execcounts and report these results
-
A brief write-up with the answers to the following questions.
-
Describe how you found the information needed to complete this project. Did
your sources have the information you needed? Did you consult with any humans?
If so, what did you try first, and whom did you consult?
-
Explain the calling sequence that makes your system call work. First, a user
program calls <.....>. Then, <.....> calls <.....>. ... and
so on. You can explain this using either text or a rough (less than 15 minutes)
diagram.
-
Why do you think the designers of Linux implemented system calls the way they
did? What were they trying to achieve? What were they trying to avoid?
-
Give (in 1-2 sentences) an alternative idea for implementing system calls.
State one way your idea would be better or worse than the way it is currently
done.
-
gotos are generally considered bad programming style, yet they are used
frequently in the Linux kernel. Why could this be? This is a thinking question,
so your justification is more important than your answer.
-
What is the difference between the "clone" and "fork" system calls?
-
How could you extend your shell to support multiple simultaneous processes
(foreground and background...)?
Do not underestimate the importance of the write-up. Your project grade depends
significantly on how well you understood what you were doing, and the write-up
is the best way for you to demonstrate that understanding.
The grade on the project will be calculated as follows:
-
Shell: 15 points
-
System call: 15 points
-
Write-up: 25 points