Lecture: file system Q&A

preparation

do the Yggdrasil exercise

administrivia

lab X
- feel free to talk to us
- lab 6 takes time: allocate your time carefully
- demo
- no late days; see the grading and late day policy

recap

high-level structure: syscalls - FS - disk
- real-world I/O stacks are complex - e.g., Linux
syscalls (POSIX)
- semantics are tricky
- example: update a file with new contents
  - why not just write directly to the file - can leave an incomplete new file if a crash happens in the middle
  - does this guarantee that users will see either the old or the new file?
  - how many inodes are being changed in this case?
- POSIX is underspecified
  - the behavior may vary across file systems
  - safe fix: insert fsync(fd) before close
  - maybe even fsync the directory
- if you are interested, learn more about crash consistency

int fd = open("file.tmp", ...);
write(fd, newdata, newdatasize);
close(fd);
rename("file.tmp", "file");
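A minimal sketch of the "safe fix" mentioned above: fsync the temporary file before close, then fsync the containing directory after rename. The helper names atomic_replace and write_all, and their parameters, are hypothetical and not part of the lecture. Whether the directory fsync is strictly required depends on the file system, so treat this as one conservative pattern rather than the definitive recipe.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write all `len` bytes, retrying on short writes. Returns 0 on success. */
static int write_all(int fd, const char *buf, size_t len) {
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: replace `path` with `data`, using `tmp` as a scratch
 * name. Both names must live in directory `dir`. Returns 0 on success. */
int atomic_replace(const char *dir, const char *tmp, const char *path,
                   const char *data, size_t len) {
    assert(dir && tmp && path && data);

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;

    /* Force the new data to disk before the rename can expose it. */
    if (write_all(fd, data, len) < 0 || fsync(fd) < 0) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) < 0) return -1;

    /* Switch the name from the old contents to the new contents. */
    if (rename(tmp, path) < 0) return -1;

    /* Persist the directory entry change as well. */
    int dfd = open(dir, O_RDONLY);
    if (dfd < 0) return -1;
    int rc = fsync(dfd);
    close(dfd);
    return rc;
}
```

A call such as atomic_replace(".", "file.tmp", "file", newdata, newdatasize) corresponds to the four-line snippet above with the two fsync calls added.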

disk
- HDD (magnetism), SSD (NAND flash), 3D XPoint (unclear yet)
- Flash Translation Layer (FTL)
  - emulate the block device interface
  - logical (virtual) block to physical block: how does this compare to virtual memory
  - widely used in SSDs
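To make the logical-to-physical mapping concrete, here is a toy FTL sketch. All names (struct ftl, ftl_write, ftl_read) and sizes are made up for illustration. It assumes every write goes to a fresh physical page and it omits garbage collection and wear leveling, which a real FTL needs. The map array plays a role similar to a page table in virtual memory.

```c
#include <assert.h>
#include <string.h>

#define FTL_PAGE_SIZE 16   /* toy value, far smaller than real flash pages */
#define NUM_LOGICAL   4    /* blocks visible through the block interface */
#define NUM_PHYS      8    /* physical flash pages (more than logical) */
#define UNMAPPED      (-1)

struct ftl {
    char flash[NUM_PHYS][FTL_PAGE_SIZE]; /* simulated flash pages */
    int  map[NUM_LOGICAL];               /* logical block -> physical page */
    int  next_free;                      /* next never-written page */
};

void ftl_init(struct ftl *f) {
    memset(f->flash, 0, sizeof f->flash);
    for (int i = 0; i < NUM_LOGICAL; i++)
        f->map[i] = UNMAPPED;
    f->next_free = 0;
}

/* Out-of-place write: the data goes to a fresh page and the map is
 * redirected. `data` must point to FTL_PAGE_SIZE bytes. Returns the physical
 * page used, or -1 when no fresh page is left (a real FTL would
 * garbage-collect stale pages at this point). */
int ftl_write(struct ftl *f, int lba, const char *data) {
    assert(lba >= 0 && lba < NUM_LOGICAL);
    if (f->next_free == NUM_PHYS) return -1;
    int ppn = f->next_free++;
    memcpy(f->flash[ppn], data, FTL_PAGE_SIZE);
    f->map[lba] = ppn;   /* the previously mapped page, if any, is now stale */
    return ppn;
}

/* Returns the current contents of a logical block, or NULL if never written. */
const char *ftl_read(const struct ftl *f, int lba) {
    assert(lba >= 0 && lba < NUM_LOGICAL);
    int ppn = f->map[lba];
    return ppn == UNMAPPED ? NULL : f->flash[ppn];
}
```

Writing the same logical block twice lands on two different physical pages; only the map decides which one a read returns.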
file systems
- hierarchical on-disk data structure
  - superblock
  - free bitmaps
  - inodes (direct & indirect blocks)
  - files & directories
- crash safety
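The inode entry above (direct & indirect blocks) can be sketched as a lookup from a file block index to a disk block number. The layout below is an assumption for illustration: the constants, struct inode, bmap, and the in-memory disk array are all hypothetical, and real file systems use larger tables and often double or triple indirection.

```c
#include <assert.h>
#include <stdint.h>

#define NDIRECT   4    /* direct pointers stored in the inode (toy value) */
#define NINDIRECT 8    /* pointers held by one indirect block (toy value) */
#define NBLOCKS   64   /* size of the simulated disk, in blocks */

typedef uint32_t blockno_t;

struct inode {
    uint32_t  size;             /* file size in bytes */
    blockno_t direct[NDIRECT];  /* first NDIRECT file blocks */
    blockno_t indirect;         /* block of further pointers; 0 = none */
};

/* Simulated disk, with every block viewed as an array of block pointers. */
static blockno_t disk[NBLOCKS][NINDIRECT];

/* Map file block index `n` to a disk block number.
 * Returns 0 for a hole or an index beyond what the inode can address. */
blockno_t bmap(const struct inode *ip, uint32_t n) {
    if (n < NDIRECT)
        return ip->direct[n];
    n -= NDIRECT;
    if (n < NINDIRECT && ip->indirect != 0) {
        assert(ip->indirect < NBLOCKS);
        return disk[ip->indirect][n];   /* one extra disk read */
    }
    return 0;
}
```

Small files are served entirely from the direct pointers; larger files pay one extra block read per lookup through the indirect block.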
journaling
- ideas
  - step 1: write a “todo” list before destructive updates
  - step 2: replay the “todo list”
  - can you use an “undo” list instead?
- examples: ext3/ext4 (Linux), NTFS (Windows), HFS+ (macOS)
- example: LevelDB, key-value store - not a file system, but shares many ideas
- downsides
  - write twice (once in the journal and once for the actual data)
  - performance (every update goes through the log)
  - how about running LevelDB on top of ext4?
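The two journaling steps can be sketched with a simulated disk. This is a toy redo log under loud assumptions: struct disk, log_write, log_commit, and log_replay are hypothetical names, blocks are single ints, and a plain memory store stands in for an ordered disk write. On a real disk the commit record must reach the disk after the log entries, and the log may only be cleared after the home locations are written.

```c
#include <assert.h>

#define NBLOCKS 8   /* toy disk size */
#define LOGSIZE 4   /* max entries per transaction */

struct disk {
    int data[NBLOCKS];         /* home locations */
    int log_blockno[LOGSIZE];  /* journal: which block to update ... */
    int log_value[LOGSIZE];    /* ... and the value to put there */
    int log_n;                 /* number of journal entries */
    int committed;             /* commit record: 1 once the list is complete */
};

/* Step 1: append to the "todo" list; home locations stay untouched.
 * Returns 0 on success, -1 if the journal is full. */
int log_write(struct disk *d, int blockno, int value) {
    assert(blockno >= 0 && blockno < NBLOCKS);
    if (d->log_n == LOGSIZE) return -1;
    d->log_blockno[d->log_n] = blockno;
    d->log_value[d->log_n] = value;
    d->log_n++;
    return 0;
}

/* Commit point: after this, the whole transaction must survive a crash. */
void log_commit(struct disk *d) {
    d->committed = 1;
}

/* Step 2: replay the "todo" list. Run after commit and again at recovery.
 * An uncommitted list is discarded; replaying twice is harmless. */
void log_replay(struct disk *d) {
    if (d->committed) {
        for (int i = 0; i < d->log_n; i++)
            d->data[d->log_blockno[i]] = d->log_value[i];
    }
    d->log_n = 0;
    d->committed = 0;
}
```

A crash before log_commit leaves the old contents intact, and a crash after it is repaired by log_replay, so a multi-block update is all-or-nothing. The "write twice" downside is visible here: each value is stored once in the journal and once in data.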
copy-on-write (COW)
- don’t do destructive updates; reduce each update to a single write
- example: log-structured file systems
  - conceptually, the entire file system is a log
  - over-simplified example: see Figure 2
  - pros and cons?
  - see the original paper: The Design and Implementation of a Log-Structured File System, from SOSP 1991
  - also see this LWN article: Log-structured file systems: There’s one in every SSD
- examples: Btrfs (Linux), ReFS (Windows), APFS (macOS)
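A toy sketch of the copy-on-write idea: new versions of the data block and of the pointer table are written to unused blocks, and one write of the root pointer switches from the old version to the new one. Everything here (struct cow_disk, cow_prepare, cow_commit, cow_read, the sizes) is hypothetical, and it is not how Btrfs, ReFS, or APFS are laid out. Old blocks are never reclaimed in this sketch.

```c
#include <assert.h>
#include <string.h>

#define NBLOCKS 16   /* toy disk size */
#define NPTRS   4    /* words (or pointers) per block */

struct cow_disk {
    int blocks[NBLOCKS][NPTRS]; /* a block holds data words or block pointers */
    int root;                   /* block number of the current pointer table */
    int next_free;              /* bump allocator; nothing is ever freed */
};

static int alloc_block(struct cow_disk *d) {
    assert(d->next_free < NBLOCKS);
    return d->next_free++;
}

void cow_init(struct cow_disk *d) {
    memset(d, 0, sizeof *d);
    d->root = alloc_block(d);                    /* pointer table */
    for (int i = 0; i < NPTRS; i++)
        d->blocks[d->root][i] = alloc_block(d);  /* zeroed data blocks */
}

/* Build a new version in which word 0 of file block `i` is `value`.
 * No live block is modified. Returns the new table's block number. */
int cow_prepare(struct cow_disk *d, int i, int value) {
    assert(i >= 0 && i < NPTRS);
    int old_data = d->blocks[d->root][i];

    int new_data = alloc_block(d);               /* copy, then modify the copy */
    memcpy(d->blocks[new_data], d->blocks[old_data], sizeof d->blocks[0]);
    d->blocks[new_data][0] = value;

    int new_table = alloc_block(d);              /* table pointing at the copy */
    memcpy(d->blocks[new_table], d->blocks[d->root], sizeof d->blocks[0]);
    d->blocks[new_table][i] = new_data;
    return new_table;
}

/* The single write that makes the new version visible. */
void cow_commit(struct cow_disk *d, int new_table) {
    d->root = new_table;
}

/* Read word 0 of file block `i` through the current root. */
int cow_read(const struct cow_disk *d, int i) {
    assert(i >= 0 && i < NPTRS);
    return d->blocks[d->blocks[d->root][i]][0];
}
```

A crash between cow_prepare and cow_commit leaves the old root in place, so readers still see the old version; the cost is that unused old blocks accumulate until something cleans them up.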
other approaches for crash safety
- best-effort repair: synchronous metadata changes + fsck (garbage collection)
- introduce redundancy: replication, checksums
- soft updates