Lecture: Concurrency and scheduling
preparation
- read the xv6 book: §5, Locking and §6, Scheduling
- take a look at lab lock
administrivia
overview
- multicore CPUs
- how to use multicore CPUs correctly and efficiently
- oftentimes, things come down to this
- this class focuses on mechanisms: locks, threads, and scheduling
locks
- example: multi-threaded hash table (ph.c)
- why are keys missing?
- data races: concurrent put()s
- all try to set table[0]->next: one winner, others lose
- race detection using ThreadSanitizer
- how to fix the bug: use lock(s) to protect put()
- coarse-grained: one lock for the entire table
- fine-grained: per-bucket locks, or even per-entry
- trade-off
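The fine-grained option above can be sketched with one pthread mutex per bucket. This is a minimal sketch, not the actual ph.c: the table layout, NBUCKET, MAXTHREAD, the range struct, and count_missing() are all assumptions made for illustration. get() takes the bucket lock here as a conservative choice; whether it needs to is the question raised later in these notes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NBUCKET 5
#define MAXTHREAD 16

struct entry {
    int key;
    int value;
    struct entry *next;
};

static struct entry *table[NBUCKET];
static pthread_mutex_t bucket_lock[NBUCKET]; // fine-grained: one lock per bucket

void table_init(void)
{
    for (int i = 0; i < NBUCKET; i++) {
        table[i] = NULL;
        pthread_mutex_init(&bucket_lock[i], NULL);
    }
}

// Insert or update a key (keys are assumed non-negative).
void put(int key, int value)
{
    int i = key % NBUCKET;
    struct entry *e;

    pthread_mutex_lock(&bucket_lock[i]);
    for (e = table[i]; e != NULL; e = e->next)
        if (e->key == key)
            break;
    if (e == NULL) {
        e = malloc(sizeof(*e));
        assert(e != NULL);
        e->key = key;
        e->next = table[i]; // without the lock, concurrent inserts can
        table[i] = e;       // overwrite each other's update to the list head
    }
    e->value = value;
    pthread_mutex_unlock(&bucket_lock[i]);
}

// Returns 1 and fills *value if the key is present, 0 otherwise.
int get(int key, int *value)
{
    int i = key % NBUCKET;
    int found = 0;

    pthread_mutex_lock(&bucket_lock[i]);
    for (struct entry *e = table[i]; e != NULL; e = e->next) {
        if (e->key == key) {
            *value = e->value;
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&bucket_lock[i]);
    return found;
}

struct range { int start, end; }; // hypothetical: keys handled by one thread

static void *put_range(void *arg)
{
    struct range *r = arg;
    for (int k = r->start; k < r->end; k++)
        put(k, 2 * k);
    return NULL;
}

// Run nthread concurrent writers, then count keys that went missing.
int count_missing(int nthread, int per_thread)
{
    pthread_t tid[MAXTHREAD];
    struct range r[MAXTHREAD];
    int missing = 0, v;

    assert(nthread <= MAXTHREAD);
    table_init();
    for (int t = 0; t < nthread; t++) {
        r[t].start = t * per_thread;
        r[t].end = (t + 1) * per_thread;
        int rc = pthread_create(&tid[t], NULL, put_range, &r[t]);
        assert(rc == 0);
    }
    for (int t = 0; t < nthread; t++)
        pthread_join(tid[t], NULL);
    for (int k = 0; k < nthread * per_thread; k++)
        if (!get(k, &v) || v != 2 * k)
            missing++;
    return missing;
}
```

Calling count_missing(4, 1000) from main() should report no missing keys. Replacing the per-bucket array with a single mutex gives the coarse-grained variant: simpler, but puts to different buckets can no longer proceed in parallel.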
- why __sync_fetch_and_add and __sync_bool_compare_and_swap for done?
- would done += 1; while (done < nthread); work?
- gcc -O2: infinite loop - why?
- how about get()?
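The done counter questions above can be made concrete with a small barrier. This is a sketch under assumptions: NTHREAD, seen[], worker(), and run_barrier() are illustrative names, not the ph.c source. A plain done += 1 is a non-atomic read-modify-write, so concurrent increments can be lost. With a plain int, an optimizing compiler may also assume no other thread changes done and read it only once before the loop, which is one likely explanation for the infinite loop under -O2. The __sync builtins are atomic and act as full barriers, which avoids both problems.

```c
#include <assert.h>
#include <pthread.h>

#define NTHREAD 4

static int done;          // number of threads that have reached the barrier
static int seen[NTHREAD]; // value of done each thread observed after the barrier

static void *worker(void *arg)
{
    int id = (int)(long)arg;

    // ... this thread's share of the work would run here ...

    // Atomic read-modify-write: concurrent increments cannot be lost.
    __sync_fetch_and_add(&done, 1);

    // Compare-and-swap NTHREAD with NTHREAD: leaves done unchanged and
    // succeeds only once every thread has arrived. The builtin is a full
    // barrier, so done is re-read from memory on every iteration.
    while (!__sync_bool_compare_and_swap(&done, NTHREAD, NTHREAD))
        ;

    seen[id] = __sync_fetch_and_add(&done, 0); // atomic read of done
    return NULL;
}

// Returns 1 if every thread left the barrier only after all had arrived.
int run_barrier(void)
{
    pthread_t tid[NTHREAD];

    done = 0; // no other threads exist yet, so a plain store is fine
    for (long i = 0; i < NTHREAD; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, (void *)i);
        assert(rc == 0);
    }
    for (int i = 0; i < NTHREAD; i++)
        pthread_join(tid[i], NULL);
    for (int i = 0; i < NTHREAD; i++)
        if (seen[i] != NTHREAD)
            return 0;
    return 1;
}
```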
- locks
- mutual exclusion: only one core can hold a given lock at a time
- “serialize” critical section: hide intermediate state
- example: transfer money from account A to B
- put(a - 100) and put(b + 100) must both take effect, or neither
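The transfer example can be sketched with a single pthread mutex guarding both balances. The names bank_lock, transfer(), total(), and audit_failures() are hypothetical. The point is that an observer taking the same lock never sees the intermediate state in which the money has left A but not yet reached B.

```c
#include <assert.h>
#include <pthread.h>

static long a = 1000, b = 1000; // balances of accounts A and B
static pthread_mutex_t bank_lock = PTHREAD_MUTEX_INITIALIZER;

// Move amount from A to B as one critical section.
void transfer(long amount)
{
    pthread_mutex_lock(&bank_lock);
    a -= amount; // intermediate state: a + b is temporarily wrong
    b += amount;
    pthread_mutex_unlock(&bank_lock);
}

// Read both balances under the same lock.
long total(void)
{
    pthread_mutex_lock(&bank_lock);
    long sum = a + b;
    pthread_mutex_unlock(&bank_lock);
    return sum;
}

static void *mover(void *arg)
{
    (void)arg;
    // Alternate directions so the balances stay within a small range.
    for (int i = 0; i < 100000; i++)
        transfer(i % 2 == 0 ? 100 : -100);
    return NULL;
}

static void *auditor(void *arg)
{
    int *bad = arg;
    for (int i = 0; i < 100000; i++)
        if (total() != 2000)
            (*bad)++; // would fire if the intermediate state leaked
    return NULL;
}

// Runs a mover and an auditor concurrently; returns the number of bad audits.
int audit_failures(void)
{
    pthread_t m, au;
    int bad = 0;
    int rc;

    rc = pthread_create(&m, NULL, mover, NULL);
    assert(rc == 0);
    rc = pthread_create(&au, NULL, auditor, &bad);
    assert(rc == 0);
    pthread_join(m, NULL);
    pthread_join(au, NULL);
    return bad;
}
```

If total() read a and b without taking bank_lock, the auditor could observe a sum of 1900 or 2100 partway through a transfer.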
- lock implementation
- strawman
- hw: draw cores, caches, bus, RAM
- try it on ph.c: what can go wrong?
struct lock { int locked; };
void acquire(struct lock *l)
{
    for (;;) {
        if (l->locked == 0) { // A: test
            l->locked = 1;    // B: set
            return;
        }
    }
}
void release(struct lock *l)
{
    l->locked = 0;
}
- atomic exchange: combine test and set into one atomic step
- gcc’s __sync builtins
- __sync_lock_test_and_set(ptr, value): atomically write value to *ptr and return the old value
- __sync_lock_release(ptr): write 0 to *ptr
- show assembly code of acquire(): xchg instruction
- if l->locked was 1, xchg sets it to 1 again and returns 1
- if l->locked was 0, at most one xchg sees and returns 0
- alternatives
- RISC-V: amoswap.w instructions
- C11 atomics, assembly
- the problem is pushed down to hardware
- guess how xchg is implemented
- understand the performance overhead
- memory consistency models next week
void acquire(struct lock *l)
{
    while (__sync_lock_test_and_set(&l->locked, 1) != 0)
        ;
}
void release(struct lock *l)
{
    __sync_lock_release(&l->locked);
}
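The C11 atomics alternative listed above might look like the following sketch, which swaps the __sync builtins for <stdatomic.h> calls and adds a small counter test. The counter, incr(), and run_counter() are hypothetical additions for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MAXTHREAD 16

struct lock { atomic_int locked; };

void acquire(struct lock *l)
{
    // atomic_exchange_explicit writes 1 and returns the old value in one
    // atomic step; seeing 0 means this caller took the lock.
    while (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire) != 0)
        ;
}

void release(struct lock *l)
{
    // Release ordering makes writes in the critical section visible
    // before the lock is seen as free.
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

static struct lock counter_lock; // static storage: starts out unlocked (0)
static long counter;

static void *incr(void *arg)
{
    long n = (long)arg;
    for (long i = 0; i < n; i++) {
        acquire(&counter_lock);
        counter++; // critical section
        release(&counter_lock);
    }
    return NULL;
}

// Runs nthread threads that each increment the counter n times.
long run_counter(int nthread, long n)
{
    pthread_t tid[MAXTHREAD];

    assert(nthread <= MAXTHREAD);
    counter = 0;
    for (int t = 0; t < nthread; t++) {
        int rc = pthread_create(&tid[t], NULL, incr, (void *)n);
        assert(rc == 0);
    }
    for (int t = 0; t < nthread; t++)
        pthread_join(tid[t], NULL);
    return counter;
}
```

With the lock in place, run_counter(4, 50000) should return 200000. Substituting the strawman acquire() typically loses increments, though how often depends on timing.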
- spinlocks in xv6
- show kernel/spinlock.h, kernel/spinlock.c
- what’s push_off()/pop_off()? turn interrupts off/on in the kernel
- see Figure 2: U54 Interrupt Architecture Block Diagram of the SiFive U-54 manual
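Interrupts are turned off while a spinlock is held because an interrupt handler that tried to acquire a lock already held by the code it interrupted would spin forever. The nesting logic behind xv6's push_off() and pop_off() can be modeled in user space as below. This is a simplified sketch, not the kernel source: intr_enabled stands in for the hardware interrupt-enable bit, and noff and intena mirror what xv6 keeps per CPU.

```c
#include <assert.h>

static int intr_enabled = 1; // stand-in for the hardware interrupt-enable bit
static int noff;             // depth of push_off() nesting
static int intena;           // were interrupts on before the first push_off()?

// Disable interrupts, remembering the original state at the outermost level.
void push_off(void)
{
    int old = intr_enabled;

    intr_enabled = 0;
    if (noff == 0)
        intena = old;
    noff += 1;
}

// Undo one push_off(); re-enable interrupts only at the outermost level,
// and only if they were enabled to begin with.
void pop_off(void)
{
    assert(!intr_enabled); // a kernel would panic here instead
    assert(noff >= 1);
    noff -= 1;
    if (noff == 0 && intena)
        intr_enabled = 1;
}

int interrupts_enabled(void)
{
    return intr_enabled;
}
```

Holding two locks means two push_off() calls, so interrupts stay off until the second matching pop_off().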
threads & scheduling
- goal: virtualizing time
- thread = stack (state) + virtual CPU registers
- each thread thinks it has a dedicated CPU
- kernel runs each in turn on a physical CPU
- analogy: virtual memory vs. physical memory
- scheduling in xv6
- 1 user thread and 1 kernel thread per process
- 1 scheduler thread per processor
- locks to protect shared data structures and resources
- cooperative scheduling for kernel threads
- threads give up control by yielding
- two switches: thread 1 → scheduler → thread 2
- scheduler() in kernel/proc.c
- swtch() in kernel/swtch.S: what does ret return to?
- lab uthread
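The two-switch pattern (thread 1 → scheduler → thread 2) can be tried in user space with the ucontext API, assuming a glibc-like system that provides it. Here swapcontext() plays roughly the role of swtch(): it saves the current registers and stack pointer and resumes another context. The names below (sched_ctx, thread_main, run_threads, trace) are hypothetical, and this is neither the xv6 scheduler nor a lab uthread solution.

```c
#include <assert.h>
#include <string.h>
#include <ucontext.h>

#define NTHREADS 2
#define STACK_SIZE (64 * 1024)

static ucontext_t sched_ctx;            // the scheduler's saved context
static ucontext_t thread_ctx[NTHREADS]; // one saved context per thread
static char stacks[NTHREADS][STACK_SIZE];
static int finished[NTHREADS];
static int current;                     // index of the running thread
static char trace[64];                  // records the order threads ran in

// Cooperative yield: switch from the running thread back to the scheduler.
static void yield_to_scheduler(void)
{
    swapcontext(&thread_ctx[current], &sched_ctx);
}

static void thread_main(int id)
{
    for (int step = 0; step < 2; step++) {
        strcat(trace, id == 0 ? "A" : "B");
        yield_to_scheduler();
    }
    finished[id] = 1;
    // Returning resumes uc_link, which is the scheduler context.
}

// Round-robin over the threads until all finish; returns the trace.
const char *run_threads(void)
{
    trace[0] = '\0';
    for (int i = 0; i < NTHREADS; i++) {
        finished[i] = 0;
        getcontext(&thread_ctx[i]);
        thread_ctx[i].uc_stack.ss_sp = stacks[i];
        thread_ctx[i].uc_stack.ss_size = STACK_SIZE;
        thread_ctx[i].uc_link = &sched_ctx;
        makecontext(&thread_ctx[i], (void (*)(void))thread_main, 1, i);
    }
    while (!finished[0] || !finished[1]) {
        for (current = 0; current < NTHREADS; current++) {
            // Second half of the pattern: scheduler -> next thread.
            if (!finished[current])
                swapcontext(&sched_ctx, &thread_ctx[current]);
        }
    }
    return trace;
}
```

Each yield returns into the scheduler loop rather than directly into the other thread, which is why every hand-off costs two context switches.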
- preemptive scheduling for user threads
- how to force a thread to give up control?
- per-processor timer interrupt (every 100 ms): user → kernel
- usertrap()/kerneltrap() (kernel/trap.c) → yield() (kernel/proc.c) → sched() (kernel/proc.c)
- switch to a different thread, then kernel → user
- now you have a complete picture of how the kernel works
- user space is running process p (say sh)
- p traps into the kernel on a timer interrupt (preemptive)
- the kernel switches from p to the scheduler (cooperative)
- the scheduler switches to q (say ls)
- q returns to user space to resume execution
- scheduling policy
labs