Lecture: Concurrency and scheduling
preparation
- read the xv6 book: §5, Locking and §6, Scheduling
- take a look at lab lock
administrivia
overview
- multicore CPUs
- how to use multicore CPUs correctly and efficiently
- oftentimes, correctness and performance come down to this
- this class focuses on mechanisms: locks, threads, and scheduling
locks
- example: multi-threaded hash table (ph.c)
- why missing keys?
- data races: concurrent put()s all try to set table[0]->next; one winner, the others lose
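A minimal sketch of that race, modeled loosely on ph.c (the struct and names here are illustrative, not the handout's exact code):

```c
#include <stdlib.h>

#define NBUCKET 5

struct entry { int key, value; struct entry *next; };
struct entry *table[NBUCKET];        // one list head per bucket, shared by all threads

// Called concurrently from several threads with no lock.
void put(int key, int value) {
  int i = key % NBUCKET;
  struct entry *e = malloc(sizeof(*e));
  e->key = key;
  e->value = value;
  e->next = table[i];                // RACE: two threads can read the same head here...
  table[i] = e;                      // ...and both write it back; the slower store wins,
                                     // so the other thread's entry (and key) is lost
}
```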
- race detection using ThreadSanitizer
- how to fix the bug: use lock(s) to protect put()
- coarse-grained: one lock for the entire table
- fine-grained: per-bucket locks, or even per-entry locks
- trade-off: coarse is simpler, fine-grained allows more parallelism (see the sketch below)
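Continuing the sketch above, one fine-grained fix uses a pthread mutex per bucket (a coarse-grained version would wrap the same code in a single table-wide lock); this is a sketch, not the lab's reference solution:

```c
#include <pthread.h>

pthread_mutex_t bucket_lock[NBUCKET];    // init each with pthread_mutex_init() at startup

void put_locked(int key, int value) {
  int i = key % NBUCKET;
  struct entry *e = malloc(sizeof(*e));
  e->key = key;
  e->value = value;
  pthread_mutex_lock(&bucket_lock[i]);   // serialize updates to bucket i only;
  e->next = table[i];                    // puts to other buckets proceed in parallel
  table[i] = e;
  pthread_mutex_unlock(&bucket_lock[i]);
}
```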
- why __sync_fetch_and_add and __sync_bool_compare_and_swap for done?
- would done += 1; while (done < nthread); work?
- with gcc -O2: infinite loop - why? (see the sketch below)
- how about get()?
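A sketch of why the naive done barrier is broken and how the two builtins help: done += 1 is a non-atomic read-modify-write, and under gcc -O2 the compiler may assume done does not change inside the empty loop and hoist the load, so the loop spins forever. The builtins make the increment atomic and act as memory barriers. (Names and the exact fix are illustrative; ph.c's code may differ.)

```c
int done;                                   // shared counter of finished threads

void barrier_broken(int nthread) {
  done += 1;                                // load/add/store: concurrent increments can be lost
  while (done < nthread)                    // gcc -O2 may hoist this load out of the loop,
    ;                                       // so the thread keeps testing a stale value
}

int done2;

void barrier_fixed(int nthread) {
  __sync_fetch_and_add(&done2, 1);          // atomic increment + full memory barrier
  while (!__sync_bool_compare_and_swap(&done2, nthread, nthread))
    ;                                       // forces a fresh read of done2 each iteration
}
```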
- locks
- mutual exclusion: only one core can hold a given lock
- “serialize” critical section: hide intermediate state
- example: transfer money from account A to B: put(a + 100) and put(b - 100) must both take effect, or neither
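A sketch in pthread terms (the account array and lock here are made up for illustration): the lock hides the intermediate state in which the money has left one account but not yet arrived in the other.

```c
#include <pthread.h>

#define NACCT 16

int balance[NACCT];                                    // shared account balances
pthread_mutex_t xfer_lock = PTHREAD_MUTEX_INITIALIZER;

void transfer(int from, int to, int amount) {
  pthread_mutex_lock(&xfer_lock);    // both updates form one critical section:
  balance[from] -= amount;           // no other thread can observe the state
  balance[to]   += amount;           // between these two writes
  pthread_mutex_unlock(&xfer_lock);
}
```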
- lock implementation
- strawman: test the lock, then set it (two separate steps)
- hw: draw cores, caches, bus, RAM
- try it on ph.c: what can go wrong?
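The strawman spelled out in C (a sketch, not xv6 code): the test and the set are two separate memory operations, so two cores can both see the lock free before either marks it held.

```c
struct lock { int locked; };         // 0 = free, 1 = held

void acquire_broken(struct lock *l) {
  while (l->locked)                  // core A and core B can both read 0 here...
    ;
  l->locked = 1;                     // ...and then both store 1, so both enter the
}                                    // critical section at the same time

void release_broken(struct lock *l) {
  l->locked = 0;
}
```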
- atomic exchange: combine test and set into one atomic step
- gcc's __sync builtins
- __sync_lock_test_and_set(ptr, value): atomically write value to ptr and return the old value
- __sync_lock_release(ptr): write 0 to ptr
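Those two builtins are enough for a minimal spinlock (a sketch; xv6's acquire()/release() add interrupt handling, __sync_synchronize(), and debugging checks):

```c
struct spinlock { volatile unsigned int locked; };

void spin_acquire(struct spinlock *lk) {
  // Atomically write 1 and get the old value back; we hold the lock only if the
  // old value was 0, i.e. we were the one that flipped it from free to held.
  while (__sync_lock_test_and_set(&lk->locked, 1) != 0)
    ;
}

void spin_release(struct spinlock *lk) {
  __sync_lock_release(&lk->locked);  // write 0, with release semantics
}
```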
- show assembly code of acquire(): the xchg instruction
- if l->locked was 1, set it to 1 again & return 1
- if l->locked was 0, at most one xchg would see & return 0
- alternatives
- RISC-V: the amoswap.w instruction
- C11 atomics, assembly
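The same lock written against portable C11 atomics, as one of the alternatives above (a sketch):

```c
#include <stdatomic.h>

atomic_flag lk = ATOMIC_FLAG_INIT;

void c11_acquire(void) {
  // test-and-set is the C11 spelling of an atomic exchange on a one-bit flag;
  // acquire ordering keeps the critical section from being reordered before it.
  while (atomic_flag_test_and_set_explicit(&lk, memory_order_acquire))
    ;
}

void c11_release(void) {
  atomic_flag_clear_explicit(&lk, memory_order_release);
}
```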
- the problem is pushed down to hardware
- guess how xchg is implemented
- understand the performance overhead
- memory consistency models next week
- spinlocks in xv6
- show kernel/spinlock.h, kernel/spinlock.c
- what are push_off()/pop_off()? they turn interrupts off/on in the kernel (see the sketch after this block)
- see Figure 2: U54 Interrupt Architecture Block Diagram of the SiFive U-54 manual
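Why push_off()/pop_off() rather than plain intr_off()/intr_on(): acquires nest, so each CPU keeps a depth count and only restores interrupts when the outermost lock is released. A condensed sketch of the logic (see kernel/spinlock.c for the real code, which also panics on misuse):

```c
// Per-CPU state (struct cpu): noff = nesting depth of push_off(),
// intena = were interrupts enabled before the first push_off()?
void push_off(void) {
  int old = intr_get();              // interrupts currently enabled?
  intr_off();                        // always off while any spinlock is held
  if (mycpu()->noff == 0)
    mycpu()->intena = old;           // remember the state only at the outermost level
  mycpu()->noff += 1;
}

void pop_off(void) {
  struct cpu *c = mycpu();
  c->noff -= 1;
  if (c->noff == 0 && c->intena)
    intr_on();                       // re-enable only when the last lock is released
}
```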
threads & scheduling
- goal: virtualizing time
- thread = stack (state) + virtual CPU registers
- each thread thinks it has a dedicated CPU
- kernel runs each in turn on a physical CPU
- analogy: virtual memory vs. physical memory
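Concretely, the "virtual CPU registers" xv6 saves per kernel thread are just ra, sp, and the callee-saved registers; everything else lives on the thread's stack. A condensed copy of struct context (kernel/proc.h; uint64 is xv6's typedef for an unsigned 64-bit integer):

```c
// Saved registers for kernel context switches (swtch() saves/restores exactly these).
struct context {
  uint64 ra;                 // where swtch()'s ret will go in this thread
  uint64 sp;                 // this thread's kernel stack pointer
  // callee-saved registers; caller-saved ones are already on the stack per the C ABI
  uint64 s0, s1, s2, s3, s4, s5, s6, s7, s8, s9, s10, s11;
};
```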
- scheduling in xv6
- 1 user thread and 1 kernel thread per process
- 1 scheduler thread per processor
- locks to protect shared data structures and resources
- cooperative scheduling for kernel threads
- threads give up control by yielding
- two switches: thread 1 → scheduler → thread 2 (see the sketch below)
- scheduler() in kernel/proc.c
- swtch() in kernel/swtch.S
- what does swtch()'s ret return to?
- lab uthread
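A condensed sketch of the scheduler half of the two switches (see kernel/proc.c for the real code): a yielding thread calls sched(), which swtch()es to this CPU's scheduler context; scheduler() then picks another RUNNABLE process and swtch()es into it.

```c
// Per-CPU scheduler loop; it never returns.
void scheduler(void) {
  struct cpu *c = mycpu();
  for (;;) {
    intr_on();                               // allow device interrupts while idle
    for (struct proc *p = proc; p < &proc[NPROC]; p++) {
      acquire(&p->lock);
      if (p->state == RUNNABLE) {
        p->state = RUNNING;
        c->proc = p;
        swtch(&c->context, &p->context);     // run p; control comes back here when
        c->proc = 0;                         // p calls sched() and swtch()es away
      }
      release(&p->lock);
    }
  }
}
```

swtch() saves ra, sp, and the callee-saved registers into the old context and loads the new one, so its ret jumps to whatever return address the destination thread saved the last time it called swtch() - which answers the question above.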
- preemptive scheduling for user threads
- how to force a thread to give up control?
- per-processor timer interrupt (every 100 ms): user → kernel
- usertrap()/kerneltrap() (kernel/trap.c) → yield() (kernel/proc.c) → sched() (kernel/proc.c), condensed in the sketch below
- switch to a different thread, then kernel → user
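A condensed sketch of the preemption path in the trap handler (see kernel/trap.c for the real code; devintr() returning 2 means the trap was the timer interrupt):

```c
void usertrap(void) {
  // ... save user state, figure out why we trapped ...
  int which_dev = devintr();     // 2: this was the timer interrupt
  if (which_dev == 2)
    yield();                     // mark this process RUNNABLE and call sched(),
                                 // which swtch()es to the scheduler
  usertrapret();                 // eventually: restore user state, return to user space
}
```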
- now you have a complete picture of how the kernel works
- user space is running process p (say sh)
- p traps into the kernel on a timer interrupt (preemptive)
- the kernel switches from p to the scheduler (cooperative)
- the scheduler switches to q (say ls)
- q returns to user space and resumes execution
- scheduling policy
labs