File System Design
Filesys Implementation
- filesys structures
- block bitmap
- tracks usage info for hundreds of millions of sectors/blocks
- a bitmap uses 1 bit to encode each block's usage status: efficient and compact
- to reduce the size of the bitmap, can use 1 bit to track multiple blocks
- is this a persistent structure? where should we store the bitmap?
- blocks taken up by the bitmap are marked as used in the bitmap!
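A minimal sketch of the block bitmap, to make the bit arithmetic concrete. The class name, the first-fit allocation policy, the 4096-byte block size, and the layout (superblock in block 0, bitmap starting at block 1) are assumptions for illustration, not something the notes specify.

```python
class BlockBitmap:
    """In-memory model of a block bitmap: bit i set means block i is in use."""

    def __init__(self, num_blocks: int):
        self.num_blocks = num_blocks
        # One bit per block, rounded up to whole bytes.
        self.bits = bytearray((num_blocks + 7) // 8)

    def is_used(self, block: int) -> bool:
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def mark_used(self, block: int) -> None:
        self.bits[block // 8] |= 1 << (block % 8)

    def mark_free(self, block: int) -> None:
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF

    def allocate(self) -> int:
        """Return the first free block and mark it used (first-fit, assumed policy)."""
        for block in range(self.num_blocks):
            if not self.is_used(block):
                self.mark_used(block)
                return block
        raise OSError("no free blocks")


def bitmap_blocks_needed(num_blocks: int, block_size: int) -> int:
    """Number of disk blocks needed to store the bitmap itself."""
    bitmap_bytes = (num_blocks + 7) // 8
    return (bitmap_bytes + block_size - 1) // block_size


# Hypothetical layout: block 0 holds the superblock, the bitmap starts at block 1.
BLOCK_SIZE = 4096
NUM_BLOCKS = 1_000_000
bitmap = BlockBitmap(NUM_BLOCKS)
bitmap.mark_used(0)
n = bitmap_blocks_needed(NUM_BLOCKS, BLOCK_SIZE)  # 125000 bytes -> 31 blocks
for b in range(1, 1 + n):
    bitmap.mark_used(b)  # the bitmap's own blocks are marked used in the bitmap
first_free = bitmap.allocate()  # 32 with this layout
```

With these numbers, one million 4 KiB blocks (about 4 GB of disk) need only 31 blocks of bitmap. In a real file system the byte array would be read from and written back to those reserved blocks so that it persists.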
- metadata (inode) table
- reserved blocks on disk that store the metadata for each file/directory
- metadata for the root directory typically has a known location in the table
- is this a persistent structure? where should we store the inode table?
- blocks taken up by the inode table are marked as used in the bitmap!
- superblock
- file system metadata, stored at known location on disk
- includes filesys size, block size, location of the bitmap and inode table
- can locate filesys structures just from the superblock
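As a rough illustration of how a superblock lets us find everything else, the sketch below packs the fields listed above into a fixed binary record and reads them back. The field order, field widths, magic number, and function names are all hypothetical. Real file systems define their own layouts.

```python
import struct

# Hypothetical on-disk layout (little-endian, no padding):
#   magic (u32), total_blocks (u64), block_size (u32),
#   bitmap_start (u64), inode_table_start (u64)
SUPERBLOCK_FORMAT = "<IQIQQ"
SUPERBLOCK_MAGIC = 0x53465331  # arbitrary value used to recognize the file system


def pack_superblock(total_blocks, block_size, bitmap_start, inode_table_start):
    """Serialize the superblock fields into bytes for a known disk location."""
    return struct.pack(SUPERBLOCK_FORMAT, SUPERBLOCK_MAGIC, total_blocks,
                       block_size, bitmap_start, inode_table_start)


def unpack_superblock(raw):
    """Parse the superblock and return the locations of the other structures."""
    size = struct.calcsize(SUPERBLOCK_FORMAT)
    magic, total_blocks, block_size, bitmap_start, inode_table_start = \
        struct.unpack(SUPERBLOCK_FORMAT, raw[:size])
    if magic != SUPERBLOCK_MAGIC:
        raise ValueError("not a recognized file system")
    return {
        "total_blocks": total_blocks,
        "block_size": block_size,
        "bitmap_start": bitmap_start,
        "inode_table_start": inode_table_start,
    }


raw = pack_superblock(total_blocks=1_000_000, block_size=4096,
                      bitmap_start=1, inode_table_start=32)
sb = unpack_superblock(raw)
# Byte offset of the inode table on disk, derived only from the superblock.
inode_table_offset = sb["inode_table_start"] * sb["block_size"]
```

At mount time only the superblock's own location needs to be known in advance. The bitmap and inode table are then found from the fields it stores.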
- data layout
- so far we have only said that metadata tracks where data is on disk, but how?
- depends on how the data is laid out on disk!
- basic techniques
- contiguous
- allocate consecutive blocks on disk
- metadata tracks the starting block number and the number of blocks
- how do you locate which block contains the ith byte of data?
- linked
- can allocate blocks anywhere on disk; each block stores data + a pointer to the next block
- metadata tracks the block number of the first data block
- how do you locate which block contains the ith byte of data?
- array
- store an array of block pointers, data blocks can be anywhere on disk
- limited by the size of array
- how do you locate which block contains the ith byte of data?
- indexed/indirection
- instead of storing the array inside the metadata itself, store it inside a block
- metadata tracks the (index) block holding an array of block pointers
- keeps the inode size small, but more disk operations are needed to find a data block
- how do you locate which block contains the ith byte of data?
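One possible way to answer the "ith byte" question for each of the four basic techniques is sketched below. The 4096-byte block size, the 4-byte pointer size, the function names, and the use of a dict to stand in for reading a block from disk are assumptions made for this example.

```python
BLOCK_SIZE = 4096
POINTER_SIZE = 4  # assumed size of an on-disk block pointer


def locate_contiguous(start_block, num_blocks, i):
    """Contiguous: pure arithmetic, no extra disk reads."""
    index = i // BLOCK_SIZE
    if index >= num_blocks:
        raise IndexError("offset past end of file")
    return start_block + index


def locate_linked(first_block, next_pointer, i):
    """Linked: follow the chain, one disk read per hop.

    next_pointer maps a block number to the next block number and stands in
    for reading the pointer stored inside each block. Part of every block is
    taken by that pointer, so each block holds slightly less data.
    """
    data_per_block = BLOCK_SIZE - POINTER_SIZE
    block = first_block
    for _ in range(i // data_per_block):
        block = next_pointer[block]
    return block


def locate_array(block_pointers, i):
    """Array: the pointers live in the metadata, so this is a direct lookup."""
    return block_pointers[i // BLOCK_SIZE]


def locate_indexed(index_block, read_block, i):
    """Indexed: one extra disk read to fetch the index block, then a lookup."""
    pointers = read_block(index_block)
    return pointers[i // BLOCK_SIZE]


i = 10_000  # falls in the third block of the file (block index 2)
a = locate_contiguous(100, 3, i)                # 102
b = locate_linked(7, {7: 42, 42: 13}, i)        # 13
c = locate_array([7, 42, 13], i)                # 13
disk = {500: [7, 42, 13]}                       # block 500 is the index block
d = locate_indexed(500, disk.__getitem__, i)    # 13
```

The linked layout is the only one whose lookup cost grows with the offset, which is why random access on a linked file is slow.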
- combined techniques
- extents
- one extent tracks a contiguous section of blocks
- track multiple extents via array and/or linked approach
- multilevel indexed pointers
- track an array of block pointers: some point to the actual data, some to an indexed block, some to a doubly indexed block
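A small sketch of the extent approach, assuming the extents are kept in an array in file order. The Extent type, the function name, and the 4096-byte block size are hypothetical.

```python
from typing import NamedTuple


class Extent(NamedTuple):
    start_block: int  # first disk block of the contiguous run
    length: int       # number of contiguous blocks in the run


def locate_in_extents(extents, i, block_size=4096):
    """Return the disk block holding byte i of a file described by extents."""
    logical = i // block_size  # block index within the file
    for ext in extents:
        if logical < ext.length:
            return ext.start_block + logical
        logical -= ext.length  # skip this extent and keep looking
    raise IndexError("offset past end of file")


# A 6-block file stored as two runs: blocks 100-103 and blocks 900-901.
extents = [Extent(100, 4), Extent(900, 2)]
first = locate_in_extents(extents, 0)               # 100
sixth = locate_in_extents(extents, 5 * 4096 + 10)   # 901
```

Two small records describe six blocks here. A file laid out in a few long runs needs far fewer entries than one pointer per block.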
Case Study: Fast File System (FFS)
- designed for good disk performance
- Linux ext2 (1993-2001) and ext3 (2001-2006) use this design
- data layout: multilevel index
- inode (metadata) stores 15 pointers to track data location
- first 12 are pointers to data blocks
- pointer 13 is a pointer to an indirect block
- pointer 14 is a pointer to a doubly indirect block
- pointer 15 is a pointer to a triply indirect block
- can locate data for small files quickly while still supporting large files
- any limitation for file size with this layout?
- what would the inode look like if the file is only 100 bytes?
- what would the inode look like if we do a write at 0, and a write at a large offset (sparse file)?
- normally an append happens when we write past the end of the file
- POSIX also has an lseek system call that lets a process set a file offset past the file's current size
- a write at a large offset past the end of the file can also extend the file
- what should happen to the gap? do we allocate blocks for it?
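To explore the questions above, here is a sketch that maps a byte offset to the inode pointer it goes through and computes the largest file the pointer structure can describe. It assumes 4096-byte blocks and 4-byte block pointers (so 1024 pointers per block). The function names are made up for this example, and real file systems may impose other limits on file size.

```python
BLOCK_SIZE = 4096
POINTER_SIZE = 4                              # assumed pointer size
PTRS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE   # 1024 pointers fit in one block
NUM_DIRECT = 12


def max_file_bytes():
    """Largest file reachable through 12 direct + 1 + 1 + 1 indirect pointers."""
    blocks = (NUM_DIRECT + PTRS_PER_BLOCK
              + PTRS_PER_BLOCK ** 2 + PTRS_PER_BLOCK ** 3)
    return blocks * BLOCK_SIZE


def classify(offset):
    """Return (pointer kind, indices to follow) for the block holding offset."""
    block = offset // BLOCK_SIZE
    if block < NUM_DIRECT:
        return ("direct", [block])
    block -= NUM_DIRECT
    if block < PTRS_PER_BLOCK:
        return ("indirect", [block])
    block -= PTRS_PER_BLOCK
    if block < PTRS_PER_BLOCK ** 2:
        return ("double", [block // PTRS_PER_BLOCK, block % PTRS_PER_BLOCK])
    block -= PTRS_PER_BLOCK ** 2
    if block < PTRS_PER_BLOCK ** 3:
        return ("triple", [block // PTRS_PER_BLOCK ** 2,
                           (block // PTRS_PER_BLOCK) % PTRS_PER_BLOCK,
                           block % PTRS_PER_BLOCK])
    raise ValueError("offset exceeds maximum file size")


small = classify(99)          # last byte of a 100-byte file: ("direct", [0])
sparse = classify(1 << 30)    # write at offset 1 GiB: ("double", [254, 1012])
limit = max_file_bytes()      # a little over 4 TiB under these assumptions
```

Under these assumptions a 100-byte file uses only the first direct pointer. For a write at offset 0 followed by a write at 1 GiB, only the first direct pointer and one path through the doubly indirect pointer are needed. A file system that supports sparse files can leave every other pointer empty and return zeros when the gap is read.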
- design for disk: locality heuristics
- accessing nearby sectors is much faster than accessing random sectors on disk because the arm moves less
- if we place sectors that are likely to be referenced together close to each other, our requests can be served faster
- block group placement
- group nearby tracks on each platter into block groups
- each group has its own inode array, inode bitmap, and data bitmaps
- place related things within the same block group and unrelated things in different ones
- what's related? data and metadata of a file, files within the same directory
- what's unrelated? files in different directories, different directories (are these always unrelated?)
- exception for large files: what may happen if we use this approach on a large file?
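The placement rules above could be sketched roughly as follows. This is a simplified model, not the actual FFS allocator: the BlockGroup fields, the function names, the directory-spreading rule, and the 1024-block chunk size are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class BlockGroup:
    free_inodes: int
    free_blocks: int
    num_dirs: int = 0


def pick_group_for_directory(groups):
    """Spread directories out: among groups with at least the average number
    of free inodes, pick the one that holds the fewest directories."""
    avg = sum(g.free_inodes for g in groups) / len(groups)
    candidates = [i for i, g in enumerate(groups) if g.free_inodes >= avg]
    return min(candidates, key=lambda i: groups[i].num_dirs)


def pick_group_for_file(groups, parent_group):
    """Keep a file in its directory's group; if that group is full,
    fall back to the next group that still has room."""
    n = len(groups)
    for step in range(n):
        i = (parent_group + step) % n
        if groups[i].free_inodes > 0 and groups[i].free_blocks > 0:
            return i
    raise OSError("file system full")


def group_for_file_block(home_group, block_index, num_groups, chunk_blocks=1024):
    """Large-file exception: after every chunk_blocks blocks, move on to the
    next group so one large file does not fill its home group."""
    return (home_group + block_index // chunk_blocks) % num_groups


groups = [BlockGroup(10, 100, 3), BlockGroup(50, 100, 1), BlockGroup(60, 100, 2)]
dir_group = pick_group_for_directory(groups)      # 1
file_group = pick_group_for_file(groups, dir_group)  # 1, same group as its directory
```

Without the chunking rule, a single large file could use up all the data blocks in its group, so later files in the same directory would have to be placed far from their metadata.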