|
|
|
Lecture 26 — programming with locks and critical sections
|
|
|
|
|
race condition
|
|
|
|
|
when the result of computation depends upon scheduling
|
|
|
|
|
i.e., the order in which the processor executes instructions
|
|
|
|
|
can only exist when there are multiple threads
|
|
|
|
|
usually causes a problem by exposing intermediate state
|
|
|
|
|
bad interleavings
|
|
|
|
|
one kind of race condition
|
|
|
|
|
consider a peek method for a stack data structure:
int Stack::peek() {
    int ans = pop();
    push(ans);
    return ans;
}
|
|
|
|
|
assume pop and push themselves use appropriate locking internally
|
|
|
|
|
not great style, but correct under sequential programming
|
|
|
|
|
if we couldn’t access stack internals, we would have to do it this way
|
|
|
|
|
even though peek overall has no effect on the stack, its intermediate state is inconsistent
|
|
|
|
|
consider bad interleavings: e.g., thread A's peek pops the top element; before A pushes it back, thread B runs — B sees a stack missing its top element (isEmpty could even report a non-empty stack as empty)
|
|
|
|
|
we need to lock around the entirety of peek
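One fix, sketched here with a hypothetical array-based Stack (the names `some_mutex`, `array`, and `index` are assumptions): hold the lock for the whole of peek and read the top element directly rather than calling pop and push. Note that calling the locking pop/push from inside a locking peek would self-deadlock on a non-recursive std::mutex.

```cpp
#include <cassert>
#include <mutex>

class Stack {
    int array[100];
    int index = -1;            // index of top element; -1 means empty
    std::mutex some_mutex;     // guards: array, index
public:
    void push(int val) {
        std::lock_guard<std::mutex> lg(some_mutex);
        array[++index] = val;
    }
    int pop() {
        std::lock_guard<std::mutex> lg(some_mutex);
        return array[index--];
    }
    int peek() {
        // lock held for the entire operation, so no intermediate
        // state is ever exposed to another thread
        std::lock_guard<std::mutex> lg(some_mutex);
        assert(index >= 0);
        return array[index];
    }
};
```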
|
|
|
|
|
what if peek didn’t modify the stack?
|
|
|
|
|
bool Stack::isEmpty() { // unsynchronized
    return index == -1;
}
void Stack::push(int val) {
    std::lock_guard<std::mutex> lg(some_mutex);
    array[++index] = val;
}
int Stack::pop() {
    std::lock_guard<std::mutex> lg(some_mutex);
    return array[index--];
}
int Stack::peek() { // unsynchronized
    return array[index];
}
|
|
|
|
|
it looks like peek and isEmpty should be able to get away with not using the lock
|
|
|
|
|
WRONG: push and pop might take several steps, so peek and isEmpty could see an inconsistent intermediate state
|
|
|
|
|
this situation is the other kind of race condition: a data race
|
|
|
|
|
when, for some data, a read or write can occur at the same time as another write
|
|
|
|
|
more about why this is a problem next lecture
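A minimal sketch of the idea (the counter and iteration counts here are illustrative): two threads incrementing a plain `int` concurrently would be a data race; declaring the counter `std::atomic` is one way to remove it.

```cpp
#include <atomic>
#include <thread>

// with a plain int, the two threads' writes below would be a data race
std::atomic<int> counter{0};

void work() {
    for (int i = 0; i < 100000; ++i)
        counter.fetch_add(1);  // atomic read-modify-write: no data race
}

int run() {
    std::thread t1(work), t2(work);
    t1.join();
    t2.join();
    return counter.load();
}
```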
|
|
|
|
|
bad interleaving vs data race
|
|
|
|
|
data race: simultaneous read/write or write/write of the same memory location
|
|
|
|
|
(for mortals) always an error, due to compiler and hardware optimizations (next lecture)
|
|
|
|
|
original peek example has no data races
|
|
|
|
|
bad interleaving: despite lack of data races, exposing bad intermediate state
|
|
|
|
|
what intermediate state counts as “bad” depends on your implementation
|
|
|
|
|
original peek had several bad interleavings
|
|
|
|
|
getting it right
|
|
|
|
|
conventional wisdom for concurrent programming
|
|
|
|
|
For every memory location (e.g., object field) in your program, you must obey at least one of the following:
|
|
|
|
|
thread-local: do not use the location in more than 1 thread
|
|
|
|
|
immutable: do not write to the location
|
|
|
|
|
synchronized: use synchronization (like locks) to control access to location
|
|
|
|
|
thread-local
|
|
|
|
|
whenever possible, avoid sharing resources
|
|
|
|
|
each thread has its own copy of a resource
|
|
|
|
|
only correct if threads do not need to communicate through the resource
|
|
|
|
|
for example, random number generator
|
|
|
|
|
in typical concurrent programs, the vast majority of objects should be thread-local: shared-memory should be rare – minimize it
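The random number generator example above can be sketched like this (a minimal illustration, not the only way to do it): `thread_local` gives each thread its own generator state, so no thread ever touches another's copy and no lock is needed.

```cpp
#include <random>

// each thread gets its own generator: the state is thread-local,
// so there is no sharing and no synchronization is needed
thread_local std::mt19937 rng{std::random_device{}()};

int roll_die() {
    std::uniform_int_distribution<int> dist(1, 6);
    return dist(rng);  // uses this thread's private rng
}
```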
|
|
|
|
|
immutable
|
|
|
|
|
whenever possible, do not update objects
|
|
|
|
|
make new objects instead
|
|
|
|
|
if a location is only read, never written, then no synchronization is necessary
|
|
|
|
|
simultaneous reads are not data races, and not a problem
|
|
|
|
|
in practice, programmers over-use mutation — minimize it
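A sketch of the immutability idea (the table contents are illustrative): build the data once, then share it read-only; concurrent reads of the const table need no lock.

```cpp
#include <numeric>
#include <thread>
#include <vector>

// built once, never written again: safe to read from any number of threads
const std::vector<int> table = {1, 2, 3, 4, 5};

long sum_table() {
    return std::accumulate(table.begin(), table.end(), 0L);
}

long parallel_sums() {
    long a = 0, b = 0;
    // simultaneous reads of table: not a data race, no lock needed
    std::thread t1([&] { a = sum_table(); });
    std::thread t2([&] { b = sum_table(); });
    t1.join();
    t2.join();
    return a + b;
}
```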
|
|
|
|
|
synchronization
|
|
|
|
|
guideline #0: no data races
|
|
|
|
|
never allow two threads to read/write or write/write the same location at the same time
|
|
|
|
|
in Java or C, a program with a data race is almost always wrong
|
|
|
|
|
guideline #1: consistent locking
|
|
|
|
|
for each location needing synchronization, have a lock that is always held when reading or writing the location
|
|
|
|
|
this lock is said to guard the location
|
|
|
|
|
the same lock may (and often should) guard multiple locations
|
|
|
|
|
clearly document the guard for each location
|
|
|
|
|
consistent locking is conceptual — up to the programmer to follow it
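Consistent locking might be documented like this (a hypothetical sketch): state in a comment which lock guards which fields, then acquire that lock in every member that touches them — even plain reads.

```cpp
#include <mutex>

class Counter {
    std::mutex m;    // guards: count (acquire m before any read or write of count)
    long count = 0;
public:
    void increment() {
        std::lock_guard<std::mutex> lg(m);  // consistent locking: m held here...
        ++count;
    }
    long get() {
        std::lock_guard<std::mutex> lg(m);  // ...and here, even for a plain read
        return count;
    }
};
```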
|
|
|
|
|
lock granularity
|
|
|
|
|
coarse-grained: fewer locks (each lock guards more locations)
|
|
|
|
|
one lock for entire data structure
|
|
|
|
|
one lock for all bank accounts
|
|
|
|
|
fine-grained: more locks (each lock guards fewer locations)
|
|
|
|
|
one lock per data element
|
|
|
|
|
one lock per bank account
|
|
|
|
|
coarse vs fine is a continuum
|
|
|
|
|
exercise: trade-offs?
|
|
|
|
|
coarse-grained advantages
|
|
|
|
|
simpler to implement
|
|
|
|
|
faster/easier to implement operations that access multiple locations
|
|
|
|
|
especially operations that modify data structure shape
|
|
|
|
|
fine-grained advantages
|
|
|
|
|
more simultaneous access (i.e., better performance)
|
|
|
|
|
guideline #2: start with coarse-grained and move to fine-grained only if blocking on coarser locks becomes an issue
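Following guideline #2, a coarse-grained sketch of the bank example (a hypothetical Bank type): one mutex guards every balance. A fine-grained version would instead give each account its own mutex, allowing deposits to different accounts to proceed in parallel — worth the extra complexity only if the single lock becomes a bottleneck.

```cpp
#include <mutex>
#include <vector>

class Bank {
    std::mutex m;                 // coarse-grained: one lock guards all balances
    std::vector<long> balances;
public:
    explicit Bank(int n) : balances(n, 0) {}
    void deposit(int acct, long amount) {
        std::lock_guard<std::mutex> lg(m);
        balances[acct] += amount;
    }
    long balance(int acct) {
        std::lock_guard<std::mutex> lg(m);
        return balances[acct];
    }
};
```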
|
|
|
|
|
critical section granularity
|
|
|
|
|
how much work should be done while holding the lock?
|
|
|
|
|
too long = performance loss
|
|
|
|
|
too short = bugs
|
|
|
|
|
guideline #3: do not do expensive computations or I/O in critical sections, but also don’t introduce race conditions
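One way to follow guideline #3, sketched with a hypothetical shared results vector: do the expensive computation on local data outside the lock, and hold the lock only for the cheap shared-state update.

```cpp
#include <mutex>
#include <vector>

std::mutex results_mutex;          // guards: results
std::vector<long> results;

long expensive_compute(long n) {   // stand-in for real work; touches no shared state
    long total = 0;
    for (long i = 1; i <= n; ++i)
        total += i;
    return total;
}

void record(long n) {
    long value = expensive_compute(n);        // outside the critical section
    std::lock_guard<std::mutex> lg(results_mutex);
    results.push_back(value);                 // critical section: just the update
}
```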
|
|
|
|
|
atomicity
|
|
|
|
|
an operation is atomic if no other thread can see it partly executed
|
|
|
|
|
guideline #4: think of what operations need to be atomic
|
|
|
|
|
think about atomicity first and locks second
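Thinking atomicity-first, a sketch with a hypothetical two-account transfer: the withdrawal and the deposit must appear as one indivisible step, so one lock is held across both updates. Locking each update separately would let another thread observe the money missing from both accounts.

```cpp
#include <mutex>

class Accounts {
    std::mutex m;          // guards: a and b (both balances)
    long a = 0, b = 0;
public:
    void deposit_a(long x) {
        std::lock_guard<std::mutex> lg(m);
        a += x;
    }
    void transfer_a_to_b(long x) {
        // atomic: no thread can see the withdrawal without the deposit
        std::lock_guard<std::mutex> lg(m);
        a -= x;
        b += x;
    }
    long total() {
        std::lock_guard<std::mutex> lg(m);
        return a + b;
    }
};
```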
|
|
|
|
|
don’t roll your own
|
|
|
|
|
rare that you should write your own data structure
|
|
|
|
|
especially true for concurrent data structures
|
|
|
|
|
standard thread-safe libraries are written by world experts
|
|
|
|
|
guideline #5: use built-in libraries whenever they meet your needs
|
|
|