|
|
|
Lecture 26 — programming with locks and critical sections
|
|
|
|
|
race condition
|
|
|
|
|
when the result of computation depends upon scheduling
|
|
|
|
|
i.e., the order in which the processor executes instructions
|
|
|
|
|
can only exist when there are multiple threads
|
|
|
|
|
usually causes a problem by exposing intermediate state
|
|
|
|
|
bad interleavings
|
|
|
|
|
one kind of race condition
|
|
|
|
|
consider a peek method for a stack data structure:
int Stack::peek() {
    int ans = pop();
    push(ans);
    return ans;
}
|
|
|
|
|
assume pop and push themselves use appropriate locking internally
|
|
|
|
|
not great style, but correct under sequential programming
|
|
|
|
|
if we couldn’t access stack internals, we would have to do it this way
|
|
|
|
|
even though peek overall has no effect on the stack, its intermediate state is inconsistent
|
|
|
|
|
consider bad interleavings: e.g., thread A's peek pops the top element; before A pushes it back, thread B runs — B sees a stack missing its top element (isEmpty could even report a non-empty stack as empty)
|
|
|
|
|
we need to lock around the entirety of peek
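One fix, sketched here with a hypothetical array-based Stack (the names `some_mutex`, `array`, and `index` are assumptions): hold the lock for the whole of peek and read the top element directly rather than calling pop and push. Note that calling the locking pop/push from inside a locking peek would self-deadlock on a non-recursive std::mutex.

```cpp
#include <cassert>
#include <mutex>

class Stack {
    int array[100];
    int index = -1;            // index of top element; -1 means empty
    std::mutex some_mutex;     // guards: array, index
public:
    void push(int val) {
        std::lock_guard<std::mutex> lg(some_mutex);
        array[++index] = val;
    }
    int pop() {
        std::lock_guard<std::mutex> lg(some_mutex);
        return array[index--];
    }
    int peek() {
        // lock held for the entire operation, so no intermediate
        // state is ever exposed to another thread
        std::lock_guard<std::mutex> lg(some_mutex);
        assert(index >= 0);
        return array[index];
    }
};
```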
|
|
|
|
|
what if peek didn’t modify the stack?
|
|
|
|
|
bool Stack::isEmpty() { // unsynchronized
    return index == -1;
}
void Stack::push(int val) {
    std::lock_guard<std::mutex> lg(some_mutex);
    array[++index] = val;
}
int Stack::pop() {
    std::lock_guard<std::mutex> lg(some_mutex);
    return array[index--];
}
int Stack::peek() { // unsynchronized
    return array[index];
}
|
|
|
|
|
it looks like peek and isEmpty should be able to get away with not using the lock
|
|
|
|
|
WRONG: push and pop might take several steps, so peek and isEmpty could see an inconsistent intermediate state
|
|
|
|
|
this situation is the other kind of race condition: a data race
|
|
|
|
|
when, for some data, a read or write can occur at the same time as another write
|
|
|
|
|
more about why this is a problem next lecture
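A minimal sketch of the idea (the counter and iteration counts here are illustrative): two threads incrementing a plain `int` concurrently would be a data race; declaring the counter `std::atomic` is one way to remove it.

```cpp
#include <atomic>
#include <thread>

// with a plain int, the two threads' writes below would be a data race
std::atomic<int> counter{0};

void work() {
    for (int i = 0; i < 100000; ++i)
        counter.fetch_add(1);  // atomic read-modify-write: no data race
}

int run() {
    std::thread t1(work), t2(work);
    t1.join();
    t2.join();
    return counter.load();
}
```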
|
|
|
|
|
bad interleaving vs data race
|
|
|
|
|
data race: simultaneous read/write or write/write of the same memory location
|
|
|
|
|
(for mortals) always an error, due to compiler and hardware optimizations (next lecture)
|
|
|
|
|
original peek example has no data races
|
|
|
|
|
bad interleaving: despite lack of data races, exposing bad intermediate state
|
|
|
|
|
what intermediate state counts as “bad” depends on your implementation
|
|
|
|
|
original peek had several bad interleavings
|
|
|
|
|
getting it right
|
|
|
|
|
conventional wisdom for concurrent programming
|
|
|
|
|
For every memory location (e.g., object field) in your program, you must obey at least one of the following:
|
|
|
|
|
thread-local: do not use the location in more than 1 thread
|
|
|
|
|
immutable: do not write to the location
|
|
|
|
|
synchronized: use synchronization (like locks) to control access to location
|
|
|
|
|
thread-local
|
|
|
|
|
whenever possible, avoid sharing resources
|
|
|
|
|
each thread has its own copy of a resource
|
|
|
|
|
only correct if threads do not need to communicate through the resource
|
|
|
|
|
for example, random number generator
|
|
|
|
|
in typical concurrent programs, the vast majority of objects should be thread-local: shared-memory should be rare – minimize it
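The random number generator example above can be sketched like this (a minimal illustration, not the only way to do it): `thread_local` gives each thread its own generator state, so no thread ever touches another's copy and no lock is needed.

```cpp
#include <random>

// each thread gets its own generator: the state is thread-local,
// so there is no sharing and no synchronization is needed
thread_local std::mt19937 rng{std::random_device{}()};

int roll_die() {
    std::uniform_int_distribution<int> dist(1, 6);
    return dist(rng);  // uses this thread's private rng
}
```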
|
|
|
|
|
immutable
|
|
|
|
|
whenever possible, do not update objects
|
|
|
|
|
make new objects instead
|
|
|
|
|
if a location is only read, never written, then no synchronization is necessary
|
|
|
|
|
simultaneous reads are not data races, and not a problem
|
|
|
|
|
in practice, programmers over-use mutation — minimize it
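A sketch of the immutability idea (the table contents are illustrative): build the data once, then share it read-only; concurrent reads of the const table need no lock.

```cpp
#include <numeric>
#include <thread>
#include <vector>

// built once, never written again: safe to read from any number of threads
const std::vector<int> table = {1, 2, 3, 4, 5};

long sum_table() {
    return std::accumulate(table.begin(), table.end(), 0L);
}

long parallel_sums() {
    long a = 0, b = 0;
    // simultaneous reads of table: not a data race, no lock needed
    std::thread t1([&] { a = sum_table(); });
    std::thread t2([&] { b = sum_table(); });
    t1.join();
    t2.join();
    return a + b;
}
```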
|
|
|
|
|
synchronization
|
|
|
|
|
guideline #0: no data races
|
|
|
|
|
never allow two threads to read/write or write/write the same location at the same time
|
|
|
|
|
in Java or C, a program with a data race is almost always wrong
|
|
|
|
|
guideline #1: consistent locking
|
|
|
|
|
for each location needing synchronization, have a lock that is always held when reading or writing the location
|
|
|
|
|
this lock is said to guard the location
|
|
|
|
|
the same lock may (and often should) guard multiple locations
|
|
|
|
|
clearly document the guard for each location
|
|
|
|
|
consistent locking is conceptual — up to the programmer to follow it
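Consistent locking might be documented like this (a hypothetical sketch): state in a comment which lock guards which fields, then acquire that lock in every member that touches them — even plain reads.

```cpp
#include <mutex>

class Counter {
    std::mutex m;    // guards: count (acquire m before any read or write of count)
    long count = 0;
public:
    void increment() {
        std::lock_guard<std::mutex> lg(m);  // consistent locking: m held here...
        ++count;
    }
    long get() {
        std::lock_guard<std::mutex> lg(m);  // ...and here, even for a plain read
        return count;
    }
};
```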
|
|
|
|
|
lock granularity
|
|
|
|
|
coarse-grained: fewer locks (each lock guards more locations)
|
|
|
|
|
one lock for entire data structure
|
|
|
|
|
one lock for all bank accounts
|
|
|
|
|
fine-grained: more locks (each lock guards fewer locations)
|
|
|
|
|
one lock per data element
|
|
|
|
|
one lock per bank account
|
|
|
|
|
coarse vs fine is a continuum
|
|
|
|
|
exercise: trade-offs?
|
|
|
|
|
coarse-grained advantages
|
|
|
|
|
simpler to implement
|
|
|
|
|
faster/easier to implement operations that access multiple locations
|
|
|
|
|
especially operations that modify data structure shape
|
|
|
|
|
fine-grained advantages
|
|
|
|
|
more simultaneous access (i.e., better performance)
|
|
|
|
|
guideline #2: start with coarse-grained and move to fine-grained only if blocking on coarser locks becomes an issue
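Following guideline #2, a coarse-grained sketch of the bank example (a hypothetical Bank type): one mutex guards every balance. A fine-grained version would instead give each account its own mutex, allowing deposits to different accounts to proceed in parallel — worth the extra complexity only if the single lock becomes a bottleneck.

```cpp
#include <mutex>
#include <vector>

class Bank {
    std::mutex m;                 // coarse-grained: one lock guards all balances
    std::vector<long> balances;
public:
    explicit Bank(int n) : balances(n, 0) {}
    void deposit(int acct, long amount) {
        std::lock_guard<std::mutex> lg(m);
        balances[acct] += amount;
    }
    long balance(int acct) {
        std::lock_guard<std::mutex> lg(m);
        return balances[acct];
    }
};
```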
|
|
|
|
|
critical section granularity
|
|
|
|
|
how much work should be done while holding the lock?
|
|
|
|
|
too long = performance loss
|
|
|
|
|
too short = bugs
|
|
|
|
|
guideline #3: do not do expensive computations or I/O in critical sections, but also don’t introduce race conditions
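One way to follow guideline #3, sketched with a hypothetical shared results vector: do the expensive computation on local data outside the lock, and hold the lock only for the cheap shared-state update.

```cpp
#include <mutex>
#include <vector>

std::mutex results_mutex;          // guards: results
std::vector<long> results;

long expensive_compute(long n) {   // stand-in for real work; touches no shared state
    long total = 0;
    for (long i = 1; i <= n; ++i)
        total += i;
    return total;
}

void record(long n) {
    long value = expensive_compute(n);        // outside the critical section
    std::lock_guard<std::mutex> lg(results_mutex);
    results.push_back(value);                 // critical section: just the update
}
```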
|
|
|
|
|
atomicity
|
|
|
|
|
an operation is atomic if no other thread can see it partly executed
|
|
|
|
|
guideline #4: think of what operations need to be atomic
|
|
|
|
|
think about atomicity first and locks second
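Thinking atomicity-first, a sketch with a hypothetical two-account transfer: the withdrawal and the deposit must appear as one indivisible step, so one lock is held across both updates. Locking each update separately would let another thread observe the money missing from both accounts.

```cpp
#include <mutex>

class Accounts {
    std::mutex m;          // guards: a and b (both balances)
    long a = 0, b = 0;
public:
    void deposit_a(long x) {
        std::lock_guard<std::mutex> lg(m);
        a += x;
    }
    void transfer_a_to_b(long x) {
        // atomic: no thread can see the withdrawal without the deposit
        std::lock_guard<std::mutex> lg(m);
        a -= x;
        b += x;
    }
    long total() {
        std::lock_guard<std::mutex> lg(m);
        return a + b;
    }
};
```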
|
|
|
|
|
don’t roll your own
|
|
|
|
|
rare that you should write your own data structure
|
|
|
|
|
especially true for concurrent data structures
|
|
|
|
|
standard thread-safe libraries are written by world experts
|
|
|
|
|
guideline #5: use built-in libraries whenever they meet your needs
|
|
|