Lecture: crash recovery

administrivia

problem: multi-step operations
- need to update multiple blocks
  - example: file create
    - allocate a new block
    - add dirent entry to parent directory
    - write file inode
  - which operations should happen first?
  - what happens if the operations don’t complete?
    - dirent points to uninitialized inode - reliability & security
    - block allocated but not used - waste
    - compare to memory bugs: dangling pointers and leaks
- how to tell if multi-block updates are incomplete?
- recovery: need to either continue with or undo incomplete updates
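
The ordering question above can be explored with a toy model. The sketch below is only an illustration, not xv6 code: the three-flag `toy_fs` struct, the `step` enum, and the helpers `run_until_crash`, `dangling`, and `leaked` are all hypothetical names, and "leaked" is simplified to mean "block allocated but no dirent reaches it". It applies the first n steps of a chosen order and then reports which kind of inconsistency a crash at that point would leave behind.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy on-disk state for one file create (hypothetical model). */
struct toy_fs {
    bool block_allocated; /* block marked in use */
    bool dirent_present;  /* parent directory names the new file */
    bool inode_written;   /* inode contents initialized on disk */
};

enum step { ALLOC_BLOCK, ADD_DIRENT, WRITE_INODE };

/* Apply the first n steps of `order` to an empty state, then "crash". */
struct toy_fs run_until_crash(const enum step order[3], int n)
{
    struct toy_fs fs = { false, false, false };
    assert(n >= 0 && n <= 3);
    for (int i = 0; i < n; i++) {
        switch (order[i]) {
        case ALLOC_BLOCK: fs.block_allocated = true; break;
        case ADD_DIRENT:  fs.dirent_present = true;  break;
        case WRITE_INODE: fs.inode_written = true;   break;
        }
    }
    return fs;
}

/* Like a dangling pointer: the dirent refers to an uninitialized inode. */
bool dangling(struct toy_fs fs)
{
    return fs.dirent_present && !fs.inode_written;
}

/* Like a memory leak: space is allocated but nothing reaches it. */
bool leaked(struct toy_fs fs)
{
    return fs.block_allocated && !fs.dirent_present;
}
```

In this model, writing the dirent first can leave a dangling entry, while writing it last can only leak. Careful ordering therefore turns the dangerous failure into the wasteful one, but it does not remove the need for recovery.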
assumptions
- the system can crash at any arbitrary point
- fail-stop: the system halts on failure; writes still in flight may never reach the disk
  - no arbitrary writes, no disk corruption
  - power failures, hardware failures, kernel panics
- single sector/block write is often atomic (by hw)
- ordering: can two writes be reordered? - more on this later
goal: crash safety
- the file system must be “usable” after reboot and recovery
- invariant: maintain consistent file system state
approaches
- write-ahead logging/journaling (xv6)
- other approaches
  - introduce redundancy: replication, checksums
  - best effort repair
  - sync metadata change + fsck (garbage collection)
  - copy on write
  - soft updates

basic idea: “transaction”
- log everything you intend to do before making any destructive changes
- mark “done” in the log
- do the changes
- delete the log
- if crash - try recovery upon reboot
  - “done” in log: redo writes from the log
  - no “done” in log: simply discard the log
- what about a crash while replaying the log?
- ensure the disk won’t reorder these steps - need to flush the disk in between
- xv6: begin_op/log_write/end_op (log.c)
example: begin_op(); write(a1, v1); write(a2, v2); end_op();
- what if the system crashes before end_op()?
- what if the system crashes after end_op()?
- what if the system crashes during recovery?
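
The three crash questions can be checked against a small in-memory model of the log / mark done / install / delete protocol. This is a minimal sketch under stated assumptions, not xv6's log.c: the names `struct disk`, `struct tx`, `tx_begin`, `tx_write`, `tx_commit`, `install`, `recover`, and `enum crash_point` are all made up for illustration. Every store into `struct disk` is assumed to be immediately durable and in order (standing in for a flush between steps), and the write of `log_n` plays the role of the single atomic "done" write.

```c
#include <assert.h>

#define NBLOCKS 8 /* data blocks on the toy disk */
#define LOGSIZE 4 /* max writes per transaction */

/* Toy "disk": the only state that survives a crash. */
struct disk {
    int data[NBLOCKS];     /* home locations */
    int log_addr[LOGSIZE]; /* logged destination block numbers */
    int log_val[LOGSIZE];  /* logged new contents */
    int log_n;             /* "done" record: 0 = none, n = n committed entries */
};

/* In-memory transaction state: lost on a crash. */
struct tx {
    int addr[LOGSIZE];
    int val[LOGSIZE];
    int n;
};

enum crash_point {
    NO_CRASH,
    CRASH_BEFORE_COMMIT, /* entries logged, "done" not written */
    CRASH_AFTER_COMMIT,  /* "done" written, nothing installed */
    CRASH_MID_INSTALL    /* "done" written, only first entry installed */
};

void tx_begin(struct tx *t) { t->n = 0; }

void tx_write(struct tx *t, int addr, int val)
{
    assert(t->n < LOGSIZE && addr >= 0 && addr < NBLOCKS);
    t->addr[t->n] = addr;
    t->val[t->n] = val;
    t->n++;
}

/* Copy up to `limit` committed log entries to their home locations.
 * Writing the same value twice is harmless, so this can be repeated. */
void install(struct disk *d, int limit)
{
    for (int i = 0; i < d->log_n && i < limit; i++)
        d->data[d->log_addr[i]] = d->log_val[i];
}

void tx_commit(struct disk *d, const struct tx *t, enum crash_point cp)
{
    for (int i = 0; i < t->n; i++) { /* 1. log the intended writes */
        d->log_addr[i] = t->addr[i];
        d->log_val[i] = t->val[i];
    }
    if (cp == CRASH_BEFORE_COMMIT) return;
    d->log_n = t->n;                 /* 2. mark "done" (commit point) */
    if (cp == CRASH_AFTER_COMMIT) return;
    install(d, cp == CRASH_MID_INSTALL ? 1 : t->n); /* 3. do the changes */
    if (cp == CRASH_MID_INSTALL) return;
    d->log_n = 0;                    /* 4. delete the log */
}

/* Run at reboot: redo a committed log, ignore an uncommitted one. */
void recover(struct disk *d)
{
    install(d, d->log_n); /* no-op when log_n == 0 */
    d->log_n = 0;
}
```

With a1 = 1 and a2 = 2 in this model: a crash before the commit point leaves both blocks at their old values after `recover`; a crash at or after the commit point leaves both at their new values; and a crash during recovery (modeled by a partial `install` followed by another `recover`) still converges, because replay only overwrites blocks with the logged values.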