Overview
For this assignment, you will be implementing two (hopefully simple!) pieces of systems software:
- (A) a basic UNIX shell
- (B) a basic AMTED server
The requirements for your software are specified using the following conventions:
|
|
MUST |
The minimum requirements a solution should meet to receive a passing grade. |
SHOULD |
Describes a well-executed solution. The extent to which a solution follows the SHOULD
guidelines determines the points received beyond a passing grade. |
CAN |
Optional |
Common Specification For Both A and B
|
|
MUST |
- Be implemented in C or C++, but C might be easier in some ways.
- Compile and run on the UW CSE Linux distribution. If you like, you can use one of the UW CSE
home VMs to develop on; note that VMware is free for UW students.
- Document any borrowed code.
|
SHOULD |
- Compile without warnings.
- Avoid memory errors. Use a tool such as valgrind for detection and debugging. (Look for
"definitely lost" messages from valgrind.)
- Not leak memory, processes, descriptors, etc.
- Be well-structured code: clean module interfaces, a nice decomposition, good comments.
- Check for error conditions and take reasonable action (exiting the program is generally
reasonable).
|
CAN |
Use variants of the system calls we suggest, as long as MUST (2) is met. |
Submission Specification
|
|
MUST |
- Your submission must be under a single directory called
assignment1 . This directory
must contain two directories, parta and partb , and a
README.txt file.
-
The README.txt file must include the names, and UW email address for all team
members. README.txt should also document anything of interest about your solution
such as known issues or extra features available.
parta must contain the source code for Part A, and partb must contain
the source code for Part B.
- Both
parta and partb must have their own Makefile. Running
make in parta must produce an executable called 550shell .
Running make in partb must produce an executable called
550server .
- One submission per team.
- Submission format is a tarball or a zip file named
assignment1_submission_${UWNETID}.tar.gz .
For teams a single uwnetid in the tarball name is sufficient. You can create a submission tarball by
running the following command from the parent directory of assignment1 (replacing
${UWNETID} with your email account name):
tar -cvzf assignment1_submission_${UWNETID}.tar.gz assignment1
For example, since my email account is "kheimerl", I would run the command:
tar -cvzf assignment1_submission_kheimerl.tar.gz assignment1
- Use the course canvas submission to submit the tarball or zip files.
|
Part A: Piping Shell
Part A: Overview
Your job is to implement a very simple UNIX shell capable of executing commands through a pipe. Our goal for
this assignment is two-fold:
- to exercise your C/C++/Linux skills
- and to get you to think about a clean design that handles the concurrency aspect of the shell.
Because of this, you shouldn't worry about any of the more complex aspects of shells, like supporting command
line arguments, environment variables, or other shell features (
&
,
>
, etc.).
All you need to do is handle the execution of programs and being able to pipe stdout
from one
program to the stdin
of another. More specifically, if the user enters the line
ls | wc
, your shell program should fork
off the two programs, which together
calculate the number of files in the current directory. Similarly, if the user enters
ls | head | wc
, you'll need to arrange to fork three processes.
Part A: Specification
|
|
MUST |
- Read commands to execute from
stdin . A command is terminated by a newline character
('\n') and consists of one or more sequence of programs separated by the string
" | " .
- Continue reading and executing commands until
stdin returns EOF.
- Wait for the current command to terminate before starting the next command.
- The child programs of a command must execute in parallel.
- Programs can be named by either an absolute path or just by the program name (
exec*p
should handle this for you).
|
CAN |
- Print an easy-to-parse prompt to
stdout .
- Use reasonable, hard-coded maximum sizes to simplify your code.
|
Part A: Hints
To solve this programming exercise, you'll need to use some (but perhaps not all) of the following Linux
system calls: fork, exec*p, close, pipe, dup2
, and wait
. The semantics of these
system calls are very particular, and fork/wait
have different semantics than in the Mesa paper,
so read their man pages carefully. As a hint, you should use pipe
to create a pipe,
fork
to execute the programs, and dup2
to replace stdin or stdout, and wait to wait
for the processes to terminate, probably in that order.
You might also want to look at the readline()
system call (man readline
) available
on Linux, as well as strtok()
in libC. Together, those calls will simplify the job of parsing
user input in your shell.
Part B: AMTED Server
Part B: Overview
Your job is to implement in C/C++/Linux a simple asymmetric multi-threaded event driven file server (AMTED),
similar to the AMPED server architecture described in the Flash
paper. Your server will read an absolute file path from a client, return the file contents over TCP,
then close the connection.
In a nutshell, your implementation should have three pieces:
- The main thread of your server handles all network I/O. It is responsible for accepting incoming TCP
network connections, reading the name of a file (absolute file path) from a network connection, arranging to
read the content of the file into memory, writing the file content over the network connection, then closing
the network connection.
- A threadpool implementation; this module should permit customers to dispatch a thread from the threadpool
to a worker function. The threadpool is made up of a fixed number of long-lived threads. You should use the
pthreads interface on Linux for managing threads and synchronizing access to shared data.
- A worker function that receives the name of a file, reads the file content into the address space of the
process, and arranges to return (a pointer to) the file content to whoever dispatched the thread to the
worker function.
Part B: Specification
|
|
MUST |
- Your executable
550amtedserver , must take two command line arguments that specify,
respectively, the network address (IPv4) and port that the server opens a listening TCP socket on.
For example, the command line:
./550amtedserver 127.0.0.1 9000
should run your server so that it listens to address 127.0.0.1 and port
9000 .
- The
550amtedserver protocol is as follows. After connecting, the client will send an
absolute path to a file followed by a newline character ('\n') . Each character in the
path is a single byte. The server will send the file contents in response then close the connection.
- Serve multiple clients concurrently.
- The main thread must be event-driven and use non-blocking network I/O. Disk I/O must happen in a
worker thread and should be blocking.
- Avoid concurrency bugs. Use synchronization appropriately to protect data shared between threads.
If you decide to let multiple threads read and write the same location in memory without any
synchronization, then provide evidence with references in your
README.txt for why it is
safe and correct.
|
SHOULD |
- All threads (including the main thread), should wait in a blocking call when there is nothing for
that thread to do instead of spinning.
- A thread should not continue waiting if there is an event for that thread (i.e. there is work for
that thread to do).
- Should be robust to invalid client input (say, a path that doesn't exist) by closing the client
connection instead of crashing.
- Minimize reading and copying of buffers, i.e. client messages and file data.
|
CAN |
- Use reasonable, hard-coded maximum sizes to simplify your code. Here are some example maximums
that won't interfere with our automated tests:
- 16 concurrent clients
- 254 characters in a path
- 2MB file size
- Use an open-source networking library as long as your submission contains everything needed to
compile and execute on the UW CSE Linux distribution..
|
Part B: Hints
Read the man
pages of system calls carefully to make sure you are using the correct flags and
understand the semantics. If you are unfamiliar with C network programming, we suggest starting with Beej's Guide to Network Programming.
We suggest using the program nc
to test your server.
This assignment doesn't require a massive amount of code, but if you've never implemented an event-driven
server before, it can be a little mind-bending. You need to use asynchronous I/O to handle all network
interactions, and you'll need to keep some kind of finite state machine representation of each connection to
know whether you are waiting for network input, waiting for file input from the thread pool, waiting to be
able to write data back over the network, etc.
Use Google and stackoverflow.com to get hints on all the things you'll need to do. Some hints:
- We suggest using
epoll()
or poll()
to manage a set of file descriptors and
receive event notifications associated with them.
- You can use
sigaction()
to tell Linux not to send your process SIGPIPE
signals
whenever a client closes its end of a TCP connection.
- You can use
fcntl()
twice to set a socket to non-blocking mode, so you can do asynchronous
I/O. (The first call to get some flags; then you modify the flags, and call it again to set them in the OS.)
Note that on Linux a socket created by accept()
does not inherit it's parent socket's flags.
- You'll need to read up on how to set up a listening socket, accept an incoming connection,
read/write
over a socket, and close the connection. There are many, many web pages that
describe how to do this in C, and you might be able to find some open source libraries that simplify this
task. (You'll need to make sure they support non-blocking I/O, or you'll need to enhance them to use support
it.)
- A worker thread will need to figure out how to notify the main event thread that it has finished reading a
file. One way to solve this is by using one or more pipes to communicate between threads, and adding these
pipes to the set of polled file descriptors.