CSEP551 Winter 2012 -- Programming Assignment #1

Out: Monday, January 9th, 2012
Due: Monday, January 30th, 2012, by 5pm

Overview

In this assignment, you will investigate the impact of system call overhead on application performance. Your ultimate goal is to produce a set of graphs that looks something like the following:

This graph shows application-level benchmark performance as a function of system call latency. To generate it, you will modify the Linux kernel so that you can add a configurable amount of overhead to every system call. By varying this overhead and measuring benchmark performance at each setting, you can generate the curve.

Assignment details

Here are the steps we recommend you follow:

Set up an environment in which you can install Linux and compile Linux kernels. I strongly recommend using VMware for this. (It is true that VMware will affect your benchmark performance, but let's ignore that for this assignment.) You'll need to:
- Find a computer on which you can install VMware. UW CSE students can get a one-year license of VMware for free. On Windows or Linux, you want VMware Workstation; on a Mac, you want VMware Fusion.
- Install VMware.
- Find and download a Linux virtual machine. I recommend that you use the virtual machine image that was prepared for the UW undergraduate operating systems course. You can download the image here. It's a 4GB tarball, so be forewarned that it may take some time to download, even over broadband. Once you've downloaded it, extract the archive (12GB expanded).
  (If you'd prefer to use a different Linux distribution, feel free, but you'll be on your own to figure out how to compile, install, and boot into a new kernel using it.)
Practice compiling, installing, and booting into a new kernel. For the VM we recommended, you can find instructions on how to do this here.
(If you're using your own flavor of Linux, you'll need to figure out how to do this yourself. You may need to install kernel sources, configure the kernel, and maybe even install compilation tools, depending on what you're starting with.)
You're now ready to modify the Linux kernel so that you can add overhead to every system call. To do this, you'll need to:
- Figure out how system calls work on Linux, and trace them through the Linux kernel source code. There are plenty of Web pages that should help with this, but as a hint, on 32-bit x86, Linux 2.6 can use the sysenter and sysexit instructions for efficient user-to-kernel crossings (64-bit x86 kernels use syscall and sysret instead). See, for example, here.
- Add your code to the system call path. You'll have to decide where to do this; there is a quick way that requires a small amount of assembly hacking in one location, and a more onerous way that stays in C but requires patching a bunch of files and routines.
- Your code should introduce a variable amount of overhead, presumably by doing computation in a loop, and varying the loop count.
Make it possible for user-level programs to change the amount of system call overhead introduced by your code. I suggest that you add something to the /proc virtual file system to do this -- i.e., by writing a number to a file you create in /proc, a user-level program sets the loop count, and therefore the overhead, that your code adds. There are plenty of web pages that will help you learn how to add something to the /proc filesystem; see, for example, here.
Calibrate your system call overhead by measuring the latency of a simple system call (e.g., close(100)) as a function of your variable overhead.
Measure the performance of the following three benchmarks as a function of system call overhead:
- compiling the linux kernel source tree
- the maximum throughput of Apache serving a small, static file
- something of your choice
Be careful of caching effects when running your benchmarks, particularly those of the file system buffer cache!

Bonus

If you want some extra credit, find a way to add a similar delay loop (with its own /proc interface) to the Ethernet packet reception and transmission paths. Calibrate this delay loop. Devise an experiment to measure the impact of this delay on the throughput and latency of serving a small, static web page from Apache. Devise another experiment to measure the impact of this delay on the throughput and latency of serving a large, static web page from Apache. Explain your results.

What to turn in

You should submit your assignment by emailing a single .tar.gz/.zip archive to Steve and Owen containing the following elements:

a description of your Linux environment, including whether you used VMware, which Linux kernel version you modified, and so on.
any source code you wrote, including files that you modified in the Linux kernel tree. Please don't send us the full kernel source tree; just send us the files that you modified.
a writeup that includes:
- your name
- a brief description of how you added overhead (i.e., explain the design / implementation of your code)
- a brief description of the mechanism you introduced that allows user-level code to vary the system call overhead
- a graph showing the calibration of your introduced system call overhead
- the graphs of your benchmark results, and a description of your measurement method
- a short analysis of the graphs -- why are they shaped the way they are, why are they differently shaped for the different benchmarks, and what high-level lessons do you draw from this?