CSE 451 Autumn 2002: Project 2 Part 1, User-level thread manager

Project 2 Part 1, User-level thread manager

Dates

Out: Sunday, January 26, 2003
Scheduler Design Doc Due: Thursday, January 30, 2003
Implementation Due: Friday, February 7, 2003

Tasks:

Write a Design Doc covering how the scheduler and the process control functions will work together.
Implement the functions related to thread creation, deletion, and scheduling

Updates/Announcements:

Tuesday, Feb 04:Starter code version 1.03 has been released. I actually finally got some time to test 1.02 and found a bunch of fairly obvious mistakes in the queue implementation. (Stupid Albert) This should be a working version. At least, my tests, I think, give good coverage ;) Please update your code.
Monday, Feb 03:Starter code version 1.02 has been released. This includes fixes for some bugs in the previous code (like forgetting to put in the stubs for sthread_join and cancel), and some correction on the sthread_queue (like making the iterator actually useful by providing a remove function). This code was kind of hammered out. Though I'm reasonably sure it is bug free, as always, check and report oddities.
Sunday, Jan 26: If you find errors or what look like errors, report it immediately. This is a relatively new project, so there may still be some bugs.

The Assignment

Implement Thread Scheduling

For part 1, we give you:

An implementation of a simple queue (sthread_queue.h).
A context-switch function that, given a new stack pointer, will switch contexts (sthread_ctx.h).
A new-stack function that will create and initialize a new stack such that it is ready to run.
Skeleton routines for a user-level thread system (sthread_user.c).
Code for atomic operations and an atomic datatype.

It is your job to complete the thread system and a test suite, implementing:

Data structures to represent threads (struct _sthread in sthread_user.c).
A routine to initialize your data structures (sthread_user_init()).
A preemptive thread scheduler (sthread_user_do_scheduler(), round-robin is fine).
A thread creation routine (sthread_user_create()).
A thread destruction routine (sthread_user_exit()).
A thread join routine (sthread_user_join()).
A thread cancel routine (sthread_user_cancel()).
A mechanism for a thread to voluntarily yield, allowing another thread to run (sthread_user_yield()).
A suite of tests that you believe can prove the correctness of your thread library. This may be 1 test, it may be 10. You will have to argue completeness

The routines in sthread_ctx.h do all of the stack manipulation, PC changing, register storing, and nasty stuff like that. Rather than focusing on that, this assignment focuses on managing the thread contexts. Viewed as a layered system, you need to implement the grey box below:

At the top layer, applications will use the sthread package (through the API defined in sthread.h). sthread.c will then call either the routines in sthread_pthread.c or your routines in sthread_user.c (it chooses based on the value passed to sthread_init()). Your sthread_user.c, in turn, builds on the sthread_ctx functions (as described in sthread_ctx.h).

Applications (the top-layer) may not use any routines except those listed in sthread.h. They must not know how the threads are implemented; they simply request threads be created (after initializing the library), and maybe request yeilds/exits. For example, they have no access to sthread_queue. Nor do they keep lists of running threads around; that is the job of the grey box.

Similarly, your grey box - sthread_user.c - should not need to know how sthread_ctx is implemented. Do not attempt to modify the sthread_ctx_t directly; use the routines declared in sthread_ctx.h.

Recommended Procedure

Find a partner and read over this assignment twice
Figure out what each function you need to implement should do. Look at some of the test programs to see usage examples. Also, read the pthread man pages for descriptions of its API as sthreads is modeled after pthreads (man -k pthread)
Examine the supporting routines we've provided (primarily in sthread_queue.h and sthread_ctx.h).
Design your threading algorithm: When are threads put on the ready queue? When are threads removed? Where is sthread_switch called? How do you invoke the scheduler? How are you going to keep shared structures with your scheduler consistent?
Figure out how to start a new thread and what to do about the initial thread (the one that calls sthread_user_init).
Talk to your fellow students and the TAs about your design.
Implement it.
Test it. The test programs provided are not adequate; for full confidence in your implementation, you'll need to create some of your own. Make sure you can justify why you think your test suite is thurough

Hints

sthread_create should not immediatly run the new thread.
Use the provided routines in sthread_ctx.h. You don't need to write any assembly or try to directly manipulate registers for this assignment, nor is an understanding of the exact layout of the stack required (in fact, the routines provided go to great lengths to avoid requiring such knowledge). However, understanding why the atomic primitives work and roughly how they work may be helpful.
The start routine passed to sthread_new_ctx does not take any arguments (unlike the routine that your sthread_user_create is passed). So you can't create a new stack directly with the user-supplied routine; you need to supply a routine that takes no arguments but somehow winds up invoking the user's routine with the user's argument.
The start routine passed to sthread_new_ctx will likely be one of 2 places where context switches will place you after scheduling is finished -- specifically, execution for a newly spawned thread will probably resume here. Make sure you do things like ensure the timer that invokes the scheduler is ticking down again. (Look at setitimer routine in schedule_handler)
If the routine passed to sthread_create() finishes (returns), you need to make sure the thread's resources are marked for clean up (i.e. sthread_exit() is called). The thread's resources only *have* to be cleaned up when it is joined with sthread_join()
If a thread exits, but is not joined, then, by specification, its context may not get cleaned up. If you write a user-level program and don't join your threads, you will leak resources. We won't be happy with a solution that does that.
You should not attempt to free the stack of a running thread (note: to free a stack, use shread_free_ctx()).
Use your thread struct to pass message to your scheduler. One major technique to cope with asynchrony is message passing
Unless you can prove it, don't assume that anything will eventually execute
Dealing with the initial thread is tricky. You need to make sure that an sthread_t struct is created for it at some point, so it can be scheduled like the other threads. However, it wasn't created by your sthread_user_create() function so it isn't going to be handled by sthread_join(). Remember that, while a thread is running, the state stored in the sthread_t is garbage (though it is probably important that you have a such a struct, so you've got some place to put the state when you want to stop the thread).
Be very careful using any local variables after calling sthread_switch() as their values are quite likely different from what they were before (you're on a different stack).
While globals should be avoided in general, there will be places in this assignment where they are necessary.
To debug with gdb(1) or any other debugger, run the configure script with --disable-shared (and then re-run make from the top level). This disables shared-libraries, which cause lots of trouble for the debugger. When you run the project, gdb will stop each time you create a new thread with a message like "Program received signal SIGUSR1, User defined signal 1." This is OK - just type continue.

Written Questions

Please answer the following questions after your implementation.

What was the hardest part of the assignment?
Would the project have been easier to implement if the scheduler were not preemptive? Explain.
Give at least one usage benefit of having a preemptive scheduler. Give a usage benefit of having a non-preemptive scheduler.
If you were given a mutex type where you could say "mutex_lock" and "mutex_unlock", would the code have been easier to develop? Would it have introduced any more complexities? Explain.
List any concerns about your implementation, and explain why/how you did address them. If there are concerns that you did not address, explain why you decided to ignore them.

Extra Credit

For each extra credit extension, you will be required to write up what you did as well as show a proof-of-concept for your implementation. The following is a list of suggestions for projects. You can always propose an extension for extra-credit consideration. However, these must be approved first, so make sure you contact us before spending too much time researching a topic.

Easier (doable)

Implement a differnt scheduling algorithm than round robin. (Perhaps support priorities?) Compare the differences between the two schedulers from an implementation viewpoint as well as from a user viewpoint
Propogate child exit and cancel calls to child threads. Make this a per-thread setting. You will need to add a new API function for this to set thread attributes.
Look at pthread_detach. This sets a thread so it deallocates itself without needing to be joined. Implement this. You will need to add a new API function to set the thread attributes
Look at pthread_cleanup_pop. Create a similar API for registering cleanup handlers that will be called when a thread finishes.

Harder (likely undoable in timeframe)

Implement multiple kernel threads servicing the user threads (many to many). Discuss this with the course staff to get started in the right direction.
Implement Scheduler Activations, or something aproximating this. :P (Don't try this without consulting staff first)

Turnin

Design Doc

Bring a hardcopy of your design document with you to quiz section on Thursday, January 30, 2003. Make sure the names of you and your partner are on the top of the paper. Your design document should cover the scheduler, sthread_user_create, sthread_user_exit, and sthread_user_yield. It does not have to cover sthread_user_join or sthread_user_cancel. The design doc should contain diagrams, pictures, notes, code fragments, and anything else you need to fully specify your design. Try to make it so that if I were a coder who just received your document, I could implement your design without having to think up any algorithmic or technical modifications. This means that you must explain what data structures you will use to communicate between different asynchornous parts of your program, how you will guarantee that the asychrony will not destroy the integrity of the data structures, as well as how the functions interact with each other (who calls what and why).

Implementation

In your top-level simplethreads directory, run make dist. This will produce a file named simplethreads-X.X.tar.gz that should contain all the code in your simplethreads directory.

If you have added any files, run tar tzf simplethreads-X.X.tar.gz and check to make sure your new files are listed.

Also submit a README with the answers to written questions on this assignment. Also include in the README, an list, as well as a description of any extra-credit you might have done.

Submit this file using the turnin(1L) program under project name project2-1 by 11:59pm on the day it is due.