CSE 451 Assignment 1: Synchronization

Due Wednesday at 9:00 PM April 23, 2014

Objectives

Understand how OS/161 implements semaphores.
Develop and implement synchronization primitives.
Conceive of and implement unit tests for said primitives.
Be able to select the appropriate primitive(s) for a given specification.
Use synchronization primitives appropriately to solve synchronization problems.

Introduction

In this assignment you will implement synchronization primitives for OS/161 and learn how to write tests for them. Then you will use your synchronization primitives to solve a few synchronization problems. Once you have completed the written and programming exercises, you should have a fairly solid grasp of the pitfalls of concurrent programming and, more importantly, how to avoid those pitfalls in the code you will write later this quarter.

To complete this assignment you'll need to become familiar with the OS/161 thread code. The thread system provides interrupts, control functions, and semaphores. You will implement locks, condition variables, and thread join.

Write readable code!

In your programming assignments, you are expected to write well-documented, readable code. There are a variety of reasons to strive for clear and readable code. Since you will be working in pairs, it will be important for you to be able to read your partner's code. Also, since you will be working on OS/161 for the entire quarter, you may need to read and understand code that you wrote several weeks earlier. Finally, clear, well-commented code makes your TAs happy!

There is no single right way to organize and document your code; it is not our intent to dictate a particular coding style for this class. The best way to learn about writing readable code is simply to read a lot of code. Read the OS/161 code, or the source of code of some other freely available operating system. Read your partner's code, too. When you read someone else's code, note what you like and what you don't like. Pay close attention to the lines of comments which most clearly and efficiently explain what is going on. When you write code yourself, keep these observations in mind.

Here are some general tips for writing better code:

Group related items together, whether they are variable declarations, lines of code, or functions.
Use descriptive names for variables and procedures. Be consistent with this throughout the program.
Comments should describe the programmer's intent, not the actual mechanics of the code. A comment which says "Find a free disk block" is much more informative than one that says "Find first non-zero element of array."

Setting up git

You will need to pull the code modifications for ASST1. Assuming your git repository is set up according to the group repository instructions, issue the following commands to grab the new code:

  % git fetch distrib
  % git merge distrib/master

To confirm you have the changes, you can list your current set of commits with:

  % git log

If the first commit listed has the commit message "ASST1", then you're good to go.

Once you have the new changes, make sure at least one person in your group updates your group repository:

  % git push

Begin Your Assignment

First, "tag" your Git repository. The purpose of tagging your repository is to make sure that you have something against which to compare your final tree.

Make sure that you do not have any uncommitted updates in your repo. Now, tag your repository as shown here:

  % git tag asst1-start

Note that git, by default, does not push tags to remote repositories. To explicitly push your tags, do this:

  % git push origin <tagname>

In this case, for example, at least one person in your group would type:

  % git push origin asst1-start

Configure OS/161 for ASST1

We have provided an example framework in which you can implement and test your solution for ASST1. This framework consists of driver code, found in kern/test/netqueuetest.c, and a menu item that you can use to execute the test from the OS/161 kernel boot menu. You have to reconfigure your kernel before you can use this framework. The procedure for configuring a kernel is the same as in ASST0, except you will use the ASST1 configuration file:

  % cd kern/conf
  % ./config ASST1

You should now see an ASST1 directory in the compile directory.

Building for ASST1

When you built OS/161 for ASST0, you ran bmake from compile/DUMBVM. In ASST1, you run bmake from (you guessed it) compile/ASST1.

  % cd compile/ASST1
  % bmake depend
  % bmake
  % bmake install

If you are told that the compile/ASST1 directory does not exist, make sure you ran config for ASST1.

Command Line Arguments to OS/161

Your solutions to ASST1 will be tested by running OS/161 with command line arguments that correspond to the menu options in the OS/161 boot menu. Please DO NOT change any existing menu option strings, since we will be using them to automate testing.

"Physical" Memory

In order to execute the tests in this assignment, you may need more than the 512 KB of memory configured into System/161 by default. We suggest that you allocate at least 2MB of RAM to System/161. This configuration option is passed to the mainboard device with the ramsize parameter in your sys161.conf file. Make sure the mainboard device line looks like the following:

  31	mainboard  ramsize=2097152  cpus=2

Note: 2097152 bytes is 2MB. You can of course add even more memory if you feel it is necessary for your testing. Adding more CPUs will probably enhance your testing as well.

Concurrent Programming with OS/161

If your code is properly synchronized, the timing of context switches and the order in which threads run should not change the behavior of your solution. Of course, your threads may print messages in different orders, but you should be able to easily verify that they follow all of the constraints applied to them and that they do not deadlock.

Built-in thread tests

When you booted OS/161 in ASST0, you may have seen the options to run the thread tests (type ? at the menu for a list of commands). The thread test code uses the semaphore synchronization primitive. You should trace the execution of one of these thread tests in GDB to see how the scheduler acts, how threads are created, and what exactly happens in a context switch. You should be able to step through a call to thread_switch() and see exactly where the current thread changes.

Thread test 1 ("tt1" at the prompt or tt1 on the kernel command line) prints the numbers 0 through 7 each time each thread loops. Thread test 2 ("tt2") prints only when each thread starts and exits. The latter is intended to show that the scheduler doesn't cause starvation—the threads should all start together, spin for awhile, and then end together.

Debugging concurrent programs

thread_yield() is automatically called for you at intervals that vary randomly. While this randomness is fairly close to reality, it complicates the process of debugging your concurrent programs.

The random number generator used to vary the time between these thread_yield() calls uses the same seed as the random device in System/161. This means that you can reproduce a specific execution sequence by using a fixed seed for the random number generator. You can pass an explicit seed into random device by editing the "random" line in your sys161.conf file. For example, to set the seed to 1, you would edit the line to look like:

  28 random seed=1

We recommend that while you are writing and debugging your solutions you pick a seed and use it consistently. Once you are confident that your threads do what they are supposed to do, set the random device to autoseed. This should allow you to test your solutions under varying conditions and may expose scenarios that you had not anticipated.

(Note that for assignment 3, the disk device has a random delay that means that even if you use the same seed, you may not get reproducible results.)

Code Reading (10 points)

To implement synchronization primitives, you will have to understand the operation of the threading system in OS/161. It may also help you to look at the provided implementation of semaphores. When you are writing solution code for the synchronization problems it will help if you also understand exactly what the OS/161 scheduler does when it dispatches among threads.

Please answer the following questions and submit them with your assignment. Place the answers in a file called asst1.pdf or asst1.txt in your submit directory.

Thread Questions

What happens to a thread when it exits (i.e., calls thread_exit())? What about when it sleeps?
What function(s) handle(s) a context switch?
What does it mean for a thread to be in each of the possible thread states?
What does it mean to turn interrupts off? How is this accomplished? Why is it important to turn off interrupts in the thread subsystem code?
What happens when a thread wakes up another thread? How does a sleeping thread get to run again?

Scheduler Questions

What function(s) choose(s) the next thread to run?
How does it (do they) pick the next thread?
What role does the hardware timer play in scheduling? What hardware independent function is called on a timer interrupt?

Synchronization Questions

Describe how wchan_sleep() and wchan_wakeone() are used to implement semaphores.
Why does the lock API in OS/161 provide lock_do_i_hold(), but not lock_get_holder()?

Synchronization Primitives (60 points)

We know: you've been itching to get to the coding. Well, you've finally arrived! You can implement these synchronization primitives in terms of any other primitive(s), but you should put some thought into choosing the right implementation.

Locks (15 Points)

Implement locks for OS/161. The interface for the lock struct is defined in kern/include/synch.h. Stub code is provided in kern/thread/synch.c.

Condition Variables (15 Points)

Implement condition variables with Mesa semantics for OS/161. The interface for the cv struct is also defined in synch.h and stub code is provided in synch.c.

Thread Join (15 Points)

Implement the thread_join operation to allow a parent thread to wait for a child thread to complete; only the parent thread can join, and only if it notifies the thread system when the thread is created that it plans to join. The join should return the integer returned by the forked thread. Your solution must ensure that thread control blocks are garbage collected properly in both the join and no-join case.

General Guidelines

You may NOT use semaphores for your lock and condition variable implementation; while it is possible to do that, it is likely to lead you astray. Instead, use the various operations defined on wchan.

You may use semaphores or locks/condition variables for your thread_join implementation, but you are welcome to use wchan instead.

Unit Testing (15 points)

It's important to get in the habit of writing your own tests. The tests we provide, while ideally useful to you, do not necessarily ensure robust coverage and correctness of your code. After all, you know your code better than we do! It's also important to be able to communicate your testing needs (and other coding needs) to others. In this section, you will exercise all of these skills.

In order to verify the correctness of your synchronization primitives, we ask that you write some unit tests for your lock, CV, and join implementations. Unit tests are small, modular pieces of test code intended to verify a single, atomic unit of functionality (hence the name). You should decide for yourself what sorts of things you think qualify as "atomic units of functionality" in the context of your lock and CV implementations, and make a case for their sufficiency in the comments of your unit tests.

Unit testing is a bit trickier in the presence of asynchronicity, so take care to make sure your tests are reliable—that is, your unit tests should succeed if and only if your primitives are correct. Remember that you have thread routines like thread_yield() at your disposal.

Synchronization Problem (30 points)

Besides being able to develop your own implementations and tests, being able to implement according to another person's specification is also a necessary skill, and one which will be the cornerstone of ASST2. For the following synchronization problem, you should write code that demonstrates a solution to the problem and then figure out a way to test that your solution works. Describe your test strategy in comments in your solution source files.

General Guidelines

Your solutions to the synchronization problem below must NOT:

busy-wait or
use spinlocks directly.

Your solution to the synchronization problem MUST:

use locks and condition variables,
either follow the best practices outlined in the textbook, or explain why they are inappropriate to this problem.

Additionally, you should stay clear of biglock solutions, where you use a single global lock to take care of synchronization for all your threads and objects. Instead, take some time to design your solution so that it uses a finer-grained synchronization scheme, or primitives which are better suited to the particular situation. That said, the problem below may have a number of valid and effective solutions—choose whichever one you think is best, and make a case for why you believe it to be so. The problem is described below at a high level; additional details can be found in comments in the driver files.

For the problem, explain which synchronization primitive(s) you chose and why you chose them. State what the correctness criteria are to solve the problem, and then write clean, simple code to achieve those criteria.

Problem: Implement a multi-threaded bounded network queue (or interprocess pipe) that stores k items and provides fair allocation among connections that share the queue. The queue is initialized with a call to void pktinit(). Threads put items into the queue (for connection c) with void putpkt(c, pkt); the network pulls packets off the queue with pkt = getpkt(). By fair allocation, we mean that getpkt() should return packets in round robin order among connections. Our test harness creates 10 connections, but your implementation should not assume it knows how many connections there will be. In particular, your solution should work when the buffer size is smaller than the number of connections.

The test harness is located in kern/test/netqueuetest.c and can be run from the kernel using the command nqu. There are stub definitions for pktinit, putpkt, and getpkt; you are allowed to implement these functions directly in netqueuetest.c.

Submitting Your Assignment

By the deadline, you should push the following to your group's repo and then upload a tar file with the entire os161 directory to catalyst:

The file asst1.pdf or asst1.txt in the submit/ directory containing answers to the questions.
Completed implementations of locks, condition variables, and thread join.
Completed solutions to the synchronization problem.
A test suite demonstrating that your implemention works.
A git tag asst1-end on the final commit of your assignment.

You do not need to print out anything for this assignment.

Notes: Unit Testing

A few things to keep in mind for unit tests:

Tests should be runnable from the menu and should be located in a separate file from the existing tests called "asst1_tests.c".
Adding a new file for the tests requires modifying other files in the kernel including conf.kern, test.h, and menu.c.
Add a single command "a1a" to the menu, mapped to a function "asst1_tests" that runs all of the tests in your test suite (that is "asst1_tests" should call in turn each of the functions in your test suite).
Each test should be for an atomic unit of functionality.
You may wish to write your unit tests before implementing your code (see Test-Driven Development)
Please provide comments in your test suite (and your code in general) explaining what you are testing and how this particular test ensures that functionality.
Test names should be descriptive.
Tests should have error and success messages.
Do not share code between groups, however you can share what functionality you are testing for. We reserve the right to use other groups test suite on your code and vice versa.