Thread-safe Libraries
In the beginning (and, really, up till a few years ago), UNIX machines
only allowed a single thread per process. During this period, many
libraries were written (including the standard C library) and the
authors naturally assumed that processes had one thread each. So they
stored data in global variables and didn't bother with
synchronization.
Threads were introduced, and people started writing multi-threaded
programs. Quite often, several threads would be using library
functions at the same time. This caused problems: though the
programmer had thought they were being safe, the library was
corrupting things (remember that libraries are just pieces of code
that you don't have to write; otherwise, they're the same as if you
had written the code yourself).
For example, let's imagine scanf(3) had been written as follows:
scanf() |
static char buf[256];
int scanf(char *format, ...) {
    read(STDIN, buf, 255);
    work on buf to pull out matches
    return matches
}
|
What's wrong? Well, buf
is a static variable, and is thus
shared across the entire program. If two threads call scanf at the
same time, they'll wind up overwriting buf with incorrect
values.
Now, this example is fairly trivial to fix (we can just allocate a new
buffer on each call to scanf), but others were not. The simplest
solution was often just to add a lock around the entire function:
scanf() |
static char buf[256];
static lock_t scanf_lock;
int scanf(char *format, ...) {
    acquire scanf_lock;
    read(STDIN, buf, 255);
    work on buf to pull out matches
    release scanf_lock;
    return matches
}
|
The addition of the lock makes this function thread-safe: it
will produce the correct results if it is accessed by multiple threads
simultaneously.
However, it isn't very efficient; there isn't really any reason a
thread should have to wait for all the other threads to be done before
it can use scanf (well, actually, there might be interesting problems
with read(2)
- but we'll ignore those). This means the
function is not MP-efficient.
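The trivial fix mentioned above (a new buffer per call) is what makes a function both thread-safe and MP-efficient. A minimal sketch of that idea, assuming a hypothetical my_scanf_from with the real match logic replaced by a stand-in word counter:

```c
#include <unistd.h>

/* Stand-in for the real match logic (elided in the notes): here it
 * just counts whitespace-separated words in buf. */
static int parse_matches(const char *buf)
{
    int matches = 0;
    while (*buf) {
        while (*buf == ' ' || *buf == '\n')
            buf++;
        if (*buf) {
            matches++;
            while (*buf && *buf != ' ' && *buf != '\n')
                buf++;
        }
    }
    return matches;
}

/* Reentrant sketch: each call gets its own stack buffer, so no lock
 * is needed and concurrent callers never wait on each other. */
int my_scanf_from(int fd)
{
    char buf[256];                 /* per-call, not static */
    ssize_t n = read(fd, buf, 255);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return parse_matches(buf);     /* work on buf to pull out matches */
}
```

Since nothing is shared between calls, no thread ever blocks another; the remaining sharing (the file descriptor itself) is exactly the read(2) issue we agreed to ignore.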
pthreads vs. LinuxThreads
pthreads refer to "POSIX threads," meaning threads that behave as
specified in the POSIX standard 1003.1c. LinuxThreads is a library
that provides POSIX threads for Linux. It uses the
clone(2)
syscall to create new threads and also provides
the synchronization functions needed to implement the POSIX 1003.1c
specification. While I'll use functions from pthreads, the concepts
here are general across all systems (in fact, I copied my notes from
another class which did not use pthreads and just substituted the
pthread functions :).
Synchronization Primitives
We'll discuss two concepts that are independent parts of synchronization:
- Mutual exclusion: Protection of some data from simultaneous
access by multiple threads.
- Inter-thread scheduling: Communication between threads
that some event happened; blocking for events in other threads.
General points about synchronization:
- Necessary when multiple threads have access to the same
data.
- Can't be used in interrupt handlers (except
sem_post(3)
),
because interrupt handlers must not block (same goes for UNIX
signal handlers)
- Don't forget to release the lock, re-enable interrupts,
etc. Tricky with multiple exits (e.g. exceptions).
- Hardware generally provides some help. Disable/enable interrupts, test
& set, etc.
- Synchronization bugs can be very difficult to find - so make
sure you understand your design, and try to keep it simple.
Locks/Mutexes
- Provides mutual exclusion.
- Simple to use:
pthread_mutex_lock(3)
and pthread_mutex_unlock(3)
.
- Make a critical section (note that a single critical section may
be discontiguous if a single lock is acquired/released in multiple functions).
- One of the few constructs which is actually "held" by an
identifiable thread.
- Usually 1 lock associated with each data object or collection
type (e.g. a lock per list).
- Granularity of locking and order of acquiring/releasing locks is
an issue in all systems because of deadlock.
- What happens if you call
pthread_mutex_lock
with
the same lock twice in a row?
Thread Exit/Join
- Provides inter-thread scheduling.
- Join waits for a particular thread to exit (either by
explicitly calling
pthread_exit(3)
or by returning
from the thread's initial function).
- Example: Thread A creates thread B and sets B
off computing some value. A then computes on its own, but
eventually needs the value B was computing. So it joins with B,
to make sure B has actually finished.
pthread_exit(3)
and pthread_join(3)
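The A/B example above can be sketched as follows (the function names and the "value" B computes are mine, just for illustration):

```c
#include <pthread.h>
#include <stdint.h>

/* Thread B: compute some value (here, just the sum 1..100). */
static void *compute(void *arg)
{
    (void)arg;
    long sum = 0;
    for (int i = 1; i <= 100; i++)
        sum += i;
    return (void *)(intptr_t)sum;  /* returning == implicit pthread_exit */
}

/* Thread A: create B, do its own work, then join to collect B's value. */
long run_example(void)
{
    pthread_t b;
    void *result;

    pthread_create(&b, NULL, compute, NULL);
    /* ... A computes on its own here ... */
    pthread_join(b, &result);      /* blocks until B has exited */
    return (long)(intptr_t)result;
}
```

Note that join both waits for B and retrieves the pointer B returned (or passed to pthread_exit).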
Semaphores
- Provides both mutual exclusion and inter-thread scheduling.
- Have memory of past (current value of counter).
- Other sync. constructs only know current state.
- In general, more complicated than they're worth. Why?
- Because they mix two functions (scheduling and mutual exclusion),
thus, you can run into deadlock/other problems easily.
- Monitors split these functions making them easier to debug.
sem_wait(3)
and sem_post(3)
.
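The "memory of the past" point can be sketched with POSIX unnamed semaphores (Linux-style sem_init; the names here are mine). Unlike a condition-variable signal, a post is remembered even if it happens before anyone waits:

```c
#include <pthread.h>
#include <semaphore.h>

static sem_t items;        /* counts events that have happened */
static int shared_value;

static void *producer(void *arg)
{
    (void)arg;
    shared_value = 42;
    sem_post(&items);      /* increment; also legal in signal handlers */
    return NULL;
}

/* Consumer blocks until the producer has posted at least once; if
 * the post already happened, sem_wait returns immediately. */
int consume_one(void)
{
    pthread_t p;

    sem_init(&items, 0, 0);            /* initial count 0 */
    pthread_create(&p, NULL, producer, NULL);
    sem_wait(&items);                  /* decrement, blocking at 0 */
    pthread_join(p, NULL);
    sem_destroy(&items);
    return shared_value;
}
```

This works regardless of which thread runs first, which is exactly the "memory" the other constructs lack.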
Monitors/Condition Variables
- Idea in monitors is to separate
concerns: use locks for mutual exclusion and condition variables for
scheduling constraints. (CS162 Lecture Notes, UC
Berkeley)
- Condition variable: a queue of threads waiting for something inside a critical section
- condition variable(s) + lock = monitor.
- Lock protects the data
structure; typically 1 lock per object.
- Condition variables help the scheduler synchronize the actions
on the object efficiently. Possibly multiple CVs per object (all
using the same lock).
- Can only use the methods of the CV while holding the lock.
pthread_cond_wait(3)
: Atomically do [release lock and sleep].
pthread_cond_signal(3)
: Wake a single waiter, if any
currently waiting.
pthread_cond_broadcast(3)
: Wake all currently waiting waiters.
Condition Variables - Why?
So, why do we need these condition variable things anyway? We'll use a
common example, an unlimited-size buffer, and attempt to solve it
without the condition variable.
We'll assume the following code is used to initialize the system appropriately:
Initialization |
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t data_available = PTHREAD_COND_INITIALIZER;
buffer = empty buffer;
|
Here is the correct version, with condition variables. We assume that two
threads are using the code, one calling AddToBuffer and another calling
RemoveFromBuffer.
AddToBuffer | RemoveFromBuffer |
pthread_mutex_lock(&lock);
put item in buffer;
pthread_cond_signal(&data_available);
pthread_mutex_unlock(&lock);
|
pthread_mutex_lock(&lock);
while (nothing in buffer) {
    pthread_cond_wait(&data_available, &lock);
}
remove item from buffer;
pthread_mutex_unlock(&lock);
return item;
|
First, let's try just removing the condition variable:
AddToBuffer | RemoveFromBuffer |
pthread_mutex_lock(&lock);
put item in buffer;
pthread_mutex_unlock(&lock);
|
pthread_mutex_lock(&lock);
while (nothing in buffer) {
    ;
}
remove item from buffer;
pthread_mutex_unlock(&lock);
return item;
|
Obviously no good: the while loop holds the lock, so there is no way to
run AddToBuffer, and the while loop will spin forever....
OK, so release the lock in the while loop (but remember to reacquire
it before checking if anything is in the buffer):
AddToBuffer | RemoveFromBuffer |
pthread_mutex_lock(&lock);
put item in buffer;
pthread_mutex_unlock(&lock);
|
pthread_mutex_lock(&lock);
while (nothing in buffer) {
    pthread_mutex_unlock(&lock);
    pthread_mutex_lock(&lock);
}
remove item from buffer;
pthread_mutex_unlock(&lock);
return item;
|
This code will work (surprisingly), but very slowly - if one thread
enters RemoveFromBuffer (with an empty buffer) and then another thread
enters AddToBuffer, the only way to make progress is to context switch
out of RemoveFromBuffer in-between the two lock statements. That's no good.
Well, how about we stick a sched_yield(2)
in there to speed things
up a bit - that way, at least the scheduler will have a better chance
of context switching us there. (That the algorithm is correct but
very inefficient is a good sign that we have an inter-thread scheduling
problem, not a mutual exclusion problem.)
AddToBuffer | RemoveFromBuffer |
pthread_mutex_lock(&lock);
put item in buffer;
pthread_mutex_unlock(&lock);
|
pthread_mutex_lock(&lock);
while (nothing in buffer) {
    pthread_mutex_unlock(&lock);
    sched_yield();
    pthread_mutex_lock(&lock);
}
remove item from buffer;
pthread_mutex_unlock(&lock);
return item;
|
That is looking a little better, but a yield isn't quite right - it is
quite likely we will have to go around the while loop many, many times
before some thread actually adds anything to the buffer. What we
really want is to sleep until there is data available, and have
AddToBuffer wake us up.
Let's try to do that by adding a queue to hold waiting threads
(call it waiters).
AddToBuffer will take the first thread off of the queue and wake it up.
We can then change the sched_yield(2)
to a sleep(3)
,
since AddToBuffer should wake us up...
AddToBuffer | RemoveFromBuffer |
pthread_mutex_lock(&lock);
put item in buffer;
if (waiters not empty)
    waiters.next().sendAlarm(); // wakeup time
pthread_mutex_unlock(&lock);
|
pthread_mutex_lock(&lock);
while (nothing in buffer) {
1       pthread_mutex_unlock(&lock);
2       waiters.enqueue(this thread);
3       sleep(1000000); // sleep a long time
4       pthread_mutex_lock(&lock);
}
remove item from buffer;
pthread_mutex_unlock(&lock);
return item;
|
OK, that looks good. Where is the problem? Consider a context
switch from thread A between lines 1 and 2 in
RemoveFromBuffer to thread B just entering AddToBuffer. Thread B
would successfully add the item to the buffer, and then check if any
waiters were available. Since thread A has not yet added itself to the
waiters queue, it wouldn't find any, and it would return. At some
point, thread A would run again, and would sleep - even though there
is actually data available in the buffer.
What we need is an atomic version of lines 1 through
4. Hmm, that happens to be exactly what a condition variable does.
Note that we were able to solve the locking/consistency part of the problem
without condition variables, but to get the scheduling part right, we needed
them.
You may also be wondering: when would we ever need more than one
condition variable? Here is a quick example where we add another CV,
buffer_empty:
AddToBuffer | RemoveFromBuffer | HandleAnEmptyBuffer |
pthread_mutex_lock(&lock);
put item in buffer;
pthread_cond_signal(&data_available);
pthread_mutex_unlock(&lock);
|
pthread_mutex_lock(&lock);
while (nothing in buffer) {
    pthread_cond_wait(&data_available, &lock);
}
remove item from buffer;
if (buffer empty)
    pthread_cond_broadcast(&buffer_empty);
pthread_mutex_unlock(&lock);
return item;
|
pthread_mutex_lock(&lock);
while (something in buffer) {
    pthread_cond_wait(&buffer_empty, &lock);
}
yell at roommate for hosing the network;
pthread_mutex_unlock(&lock);
|
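The two-CV version above can be sketched with real pthread calls; here the bookkeeping is reduced to a counter, and broadcast is used for buffer_empty since every watcher of that condition cares when it becomes true:

```c
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t data_available = PTHREAD_COND_INITIALIZER;
static pthread_cond_t buffer_empty = PTHREAD_COND_INITIALIZER;
static int count = 0;      /* items currently in the buffer */

void AddToBuffer(void)
{
    pthread_mutex_lock(&lock);
    count++;                               /* put item in buffer */
    pthread_cond_signal(&data_available);  /* one consumer suffices */
    pthread_mutex_unlock(&lock);
}

void RemoveFromBuffer(void)
{
    pthread_mutex_lock(&lock);
    while (count == 0)
        pthread_cond_wait(&data_available, &lock);
    count--;                               /* remove item from buffer */
    if (count == 0)
        pthread_cond_broadcast(&buffer_empty); /* wake ALL watchers */
    pthread_mutex_unlock(&lock);
}

void HandleAnEmptyBuffer(void)
{
    pthread_mutex_lock(&lock);
    while (count != 0)
        pthread_cond_wait(&buffer_empty, &lock);
    /* yell at roommate for hosing the network */
    pthread_mutex_unlock(&lock);
}
```

Both condition variables share the one lock that protects count; the CVs differ only in which predicate a waiter is waiting on.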
Basically, anytime you have multiple different conditions, use another
condition variable - pretty simple, really.