CSE 451 Assignment 1: Synchronization

Due Friday at 9:00 PM April 22, 2016

Objectives

Introduction

In this assignment you will implement synchronization primitives for OS/161 and learn how to write tests for them. Then you will use your synchronization primitives to build a thread-safe data structure that you will use later in the quarter. Once you have completed the written and programming exercises, you should have a fairly solid grasp of the pitfalls of concurrent programming and, more importantly, how to avoid those pitfalls in the code you will write later this quarter.

To complete this assignment you'll need to become familiar with the OS/161 thread code. The thread system provides interrupts, control functions, and semaphores. You will implement locks, condition variables, and thread join.

Begin Your Assignment

First, "tag" your Git repository. The purpose of tagging your repository is to make sure that you have something against which to compare your final tree.

Make sure that you do not have any uncommitted updates in your repo. Now, tag your repository as shown here:

  % git tag asst1-start
Note that git, by default, does not push tags to remote repositories. To explicitly push your tags, do this:
  % git push origin <tagname>
In this case, for example, at least one person in your group would type:
  % git push origin asst1-start

Configure OS/161 for ASST1

We have provided an example framework in which you can implement and test your solution for ASST1. This framework consists of driver code, found in kern/test and a menu item that you can use to execute the test from the OS/161 kernel boot menu. You have to reconfigure your kernel before you can use this framework. The procedure for configuring a kernel is the same as in ASST0, except you will use the ASST1 configuration file:

  % cd kern/conf
  % ./config ASST1

You should now see an ASST1 directory in the compile directory.

Building for ASST1

When you built OS/161 for ASST0, you ran bmake from compile/ASST0. In ASST1, you run bmake from (you guessed it) compile/ASST1.

  % cd compile/ASST1
  % bmake depend
  % bmake
  % bmake install

If you are told that the compile/ASST1 directory does not exist, make sure you ran config for ASST1.

Command Line Arguments to OS/161

Your solutions to ASST1 will be tested by running OS/161 with command line arguments that correspond to the menu options in the OS/161 boot menu. Please DO NOT change any existing menu option strings, since we will be using them to automate testing.

"Physical" Memory

In order to execute the tests in this assignment, you may need more than the 512 KB of memory configured into System/161 by default. We suggest that you allocate at least 2MB of RAM to System/161. This configuration option is passed to the mainboard device with the ramsize parameter in your sys161.conf file, which is found in the root folder. Make sure the mainboard device line looks like the following:

  31	mainboard  ramsize=2097152  cpus=2

Note: 2097152 bytes is 2MB. You can of course add even more memory if you feel it is necessary for your testing. Adding more CPUs will probably enhance your testing as well, but it is advisable to get your code to work on one processor before trying two or more.

Concurrent Programming with OS/161

If your code is properly synchronized, the timing of context switches and the order in which threads run should not change the behavior of your solution. Of course, your threads may print messages in different orders, but you should be able to easily verify that they follow all of the constraints applied to them and that they do not deadlock.

Built-in thread tests

When you booted OS/161 in ASST0, you may have seen the options to run the thread tests (type ? at the menu for a list of commands). The thread test code uses the semaphore synchronization primitive. You should trace the execution of one of these thread tests in GDB to see how the scheduler acts, how threads are created, and what exactly happens in a context switch. You should be able to step through a call to thread_switch() and see exactly where the current thread changes.

Thread test 1 ("tt1" at the prompt or tt1 on the kernel command line) prints the numbers 0 through 7 each time each thread loops. Thread test 2 ("tt2") prints only when each thread starts and exits. The latter is intended to show that the scheduler doesn't cause starvation—the threads should all start together, spin for awhile, and then end together.

Debugging concurrent programs

thread_yield() is automatically called for you at intervals that vary randomly. While this randomness is fairly close to reality, it complicates the process of debugging your concurrent programs.

The random number generator used to vary the time between these thread_yield() calls uses the same seed as the random device in System/161. This means that you can reproduce a specific execution sequence by using a fixed seed for the random number generator. You can pass an explicit seed into random device by editing the "random" line in your sys161.conf file. For example, to set the seed to 1, you would edit the line to look like:

  28 random seed=1

We recommend that while you are writing and debugging your solutions you pick a seed and use it consistently. Once you are confident that your threads do what they are supposed to do, set the random device to autoseed. This should allow you to test your solutions under varying conditions and may expose scenarios that you had not anticipated.

Code Reading (14 points)

To implement synchronization primitives, you will have to understand the operation of the threading system in OS/161. It may also help you to look at the provided implementation of semaphores. When you are writing solution code for the synchronization problems it will help if you also understand exactly what the OS/161 scheduler does when it dispatches among threads.

Margo Seltzer has put together a video that may help explain some aspects of the implementation of OS/161 thread synchronization, here this video. You'll need to set up a mix account, but you should have access through your UW CSE gmail/exchange account. The video goes over Interrupt Protection Level (IPL), semaphores, and wait channels in OS/161. She assumes you already understand condition variables (something we'll get to) but you can understand it without that background. Make sure to watch the video before you start to work on your implementation of locks and condition variables.

Please answer the following questions and submit them with your assignment. Place the answers in a file called asst1.pdf or asst1.txt in your submit directory.

Thread and Synchronization Questions (16 points)

Question 1: What happens to a thread when it sleeps?

Question 2: What function(s) handle(s) a context switch?

Question 3: What does it mean to turn interrupts off? How is this accomplished? Why is it important to turn off interrupts in the thread subsystem code?

Question 4: What happens when a thread wakes up another thread? How does a sleeping thread get to run again?

Question 5: What happens to a thread when it exits (i.e., calls thread_exit())?

Question 6: What role does the hardware timer play in scheduling? What hardware-independent OS routine is called on a timer interrupt?

Question 7: Describe how wchan_sleep() and wchan_wakeone() are used to implement semaphores.

Question 8: Why does the lock API in OS/161 provide lock_do_i_hold(), but not lock_get_holder()?

Write Readable Code (nothing to turn in)

In your programming assignments, you are expected to write well-documented, readable code. There are a variety of reasons to strive for clear and readable code. Since you will be working with a partner (or two), it will be important for you to be able to read your partner's code. Also, since you will be working on OS/161 for the entire quarter, you may need to read and understand code that you wrote several weeks earlier. Finally, clear, well-commented code makes your TAs happy!

There is no single right way to organize and document your code; it is not our intent to dictate a particular coding style for this class. The best way to learn about writing readable code is simply to read a lot of code. Read the OS/161 code, or the source of code of some other freely available operating system. Read your partner's code, too. When you read someone else's code, note what you like and what you don't like. Pay close attention to the lines of comments which most clearly and efficiently explain what is going on. When you write code yourself, keep these observations in mind.

Here are some general tips for writing better code:

Synchronization Primitives (40 points)

We know: you've been itching to get to the coding. Well, you've finally arrived! You can implement these synchronization primitives in terms of any other primitive(s), but you should put some thought into choosing the right implementation.

Locks (10 Points)

Implement locks for OS/161. The interface for the lock struct is defined in kern/include/synch.h. Stub code is provided in kern/thread/synch.c.

Condition Variables (10 Points)

Implement condition variables with Mesa semantics for OS/161. The interface for the cv struct is also defined in synch.h and stub code is provided in synch.c.

Thread Join (10 Points)

Implement the thread_join operation to allow a parent thread to wait for a child thread to complete; only the parent thread can join, and only if it notifies the thread system when the thread is created that it plans to join. The join should return the integer returned by the forked thread. Your solution must ensure that thread control blocks are garbage collected properly in both the join and no-join case.

General Guidelines

You may NOT use semaphores for your lock and condition variable implementation; while it is possible to do that, it is likely to lead you astray. Instead, use the various operations defined on wchan (after watching the video).

You may use semaphores, locks/condition variables, or wchan for your thread_join implementation.

Unit Testing (10 points)

It's important to get in the habit of writing your own tests. The tests we provide, while hopefully helpful, do not necessarily ensure robust coverage and correctness of your code. After all, you know your code better than we do! It is also important to be able to communicate your testing needs (and other coding needs) to others. In this section, you will exercise all of these skills.

In order to verify the correctness of your synchronization primitives, we ask that you write some unit tests for your lock and condition variable implementations. Unit tests are small, modular pieces of test code intended to verify a single, atomic unit of functionality (hence the name). You should decide for yourself what sorts of things you think qualify as "atomic units of functionality" in the context of your lock, condition variable, and join implementations, and make a case for their sufficiency in the comments of your unit tests.

Unit testing is a bit trickier in the presence of asynchronicity, so take care to make sure your tests are sound -- that is, your unit tests should succeed if and only if your primitives are correct. Remember that you have thread routines like thread_yield() at your disposal.

To provide an example, we have implemented a thorough unit test for semaphores, in kern/test/semunit.c, with twenty-some separate test routines. You do NOT need to be this thorough; instead, we are looking for roughly five tests each for locks and condition variables.

In addition, carefully document each test with the constraint that you are testing.

Thread-Safe List (10 points)

For this part of the assignment, your task is to build a thread-safe linked list module using locks and condition variables. A module is thread-safe if its public routines can be safely called concurrently by any number of threads.

As a starting point, we give you a thread-unsafe linked list implementation that can be used in the OS/161 kernel. The version we provide, however, is thread-unsafe -- because the data structure is not locked, it will break in random and unpredictable ways if called concurrently.

The linked list implementation is in kern/lib/list.c. Note that the directory also holds additional (thread-unsafe) data structures related to collections: a hash table, heap, queue, and bitmap. The header files are in kern/include, the implementations are in kern/lib, and the test code is in kern/test. All take an arbitrary object as the item to be stored in the collection. There is also a thread-unsafe bitmap. You may find these useful in later assignments.

Instead of changing the existing linked list implementation, create a new version that is thread-safe. These should use the sequential implementation, rather than replace the sequential version. All of the collection modules have a "put" and a "get" operation, to put an item into the collection, or to get an item out of the collection. Your thread-safe version should implement two types of get: one that returns NULL if the collection is empty, and one that waits until an item exists, and then returns it. (Likewise, a thread-safe bitmap would have a function that waits until a bit is free before returning from set.) You will need to write your own test code for these thread-safe implementations. Describe your test strategy in comments in your source file.

General Guidelines

Your thread-safe linked list MUST:

In other words, your thread-safe linked list must NOT:

Submitting Your Assignment

By the deadline, you should push the following to your group's repo and then upload a tar file with the entire os161 directory to catalyst:

You do not need to print out anything for this assignment.

Notes: Unit Testing

A few things to keep in mind for unit tests: