File Systems Basics
Overview
- storage stack
- file system
- abstraction on top of disk blocks
- organize persistent data into files and directories
- provide names and permissions to persistent data
- provide simple APIs to modify persistent data
- e.g. read, write, create, delete
- hide data placement (where data is stored on disk) details
- main abstractions
- file
- a container of persistent data
- consists of metadata and data
- metadata: size, owner, permissions, type, where the data is located on disk
- a file's metadata is also known as an inode, file header, or file record
- directory
- a way to group and organize files
- if we implement directory as a file, what would its metadata & data track?
- metadata: track same set of information as file metadata
- data: track files within the directory
- any benefit to treating directory as a file?
- directory data format
- consists of an array of directory entries
- each directory entry contains (1) the file name and (2) the location of the file's metadata
- each directory has two default directory entries: . (current dir) and .. (parent dir)
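The directory data format above can be sketched in a few lines of Python. This is only an illustration: the names DirEntry, new_directory, and lookup are hypothetical, and the "metadata location" is assumed to be an integer inode number.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DirEntry:
    name: str      # file name within this directory
    inode_no: int  # assumed form of "metadata location": an inode number


def new_directory(self_inode: int, parent_inode: int) -> List[DirEntry]:
    """Create directory data holding only the two default entries."""
    return [DirEntry(".", self_inode), DirEntry("..", parent_inode)]


def lookup(entries: List[DirEntry], name: str) -> Optional[int]:
    """Linear search of the entry array; returns None if the name is absent."""
    for entry in entries:
        if entry.name == name:
            return entry.inode_no
    return None
```

Adding a file to a directory is then just appending one more DirEntry to the array.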
- path
- namespace convention
- a / separated path consists of all directories leading to the file
- all files can be found starting at root directory /
- to read /home/tom/foo.txt
- read in root's metadata, locate the data blocks for root
- read in root's data block, search for directory entry for "home"
- read in home's metadata, locate the data blocks for home
- read in home's data block, search for directory entry for "tom"
- read in tom's metadata, locate the data blocks for tom
- read in tom's data block, search for directory entry for "foo.txt"
- read in foo.txt's metadata, locate the data blocks for foo.txt
- read foo.txt's data block(s)!
- how do we know where to find root directory's metadata?
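The lookup steps for /home/tom/foo.txt can be sketched with a toy in-memory "disk". Everything here is an assumption made for illustration: the dictionaries inodes and disk, the block numbers, and the choice that the root's metadata sits at a fixed, well-known inode number (ROOT_INODE), which is one possible answer to the last question.

```python
ROOT_INODE = 0  # assumption: root's metadata lives at a well-known slot

# toy inode table: inode number -> metadata (type and data block numbers)
inodes = {
    0: {"type": "dir", "blocks": [10]},       # /
    1: {"type": "dir", "blocks": [11]},       # /home
    2: {"type": "dir", "blocks": [12]},       # /home/tom
    3: {"type": "file", "blocks": [13, 14]},  # /home/tom/foo.txt
}

# toy disk: block number -> contents (directory entries or raw bytes)
disk = {
    10: {".": 0, "..": 0, "home": 1},
    11: {".": 1, "..": 0, "tom": 2},
    12: {".": 2, "..": 1, "foo.txt": 3},
    13: b"hello, ",
    14: b"world",
}


def resolve(path):
    """Walk an absolute path one component at a time; return the inode number."""
    ino = ROOT_INODE
    for name in path.strip("/").split("/"):
        if name == "":
            continue  # path was just "/"
        meta = inodes[ino]                # read in the directory's metadata
        if meta["type"] != "dir":
            raise NotADirectoryError(name)
        for blk in meta["blocks"]:        # read in the directory's data blocks
            entries = disk[blk]
            if name in entries:           # search for the directory entry
                ino = entries[name]
                break
        else:
            raise FileNotFoundError(path)
    return ino


def read_file(path):
    """Resolve the path, then read the file's data blocks in order."""
    meta = inodes[resolve(path)]
    return b"".join(disk[blk] for blk in meta["blocks"])
```

Each path component costs at least one metadata read and one data block read, which is why real file systems cache directory lookups.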
Filesys Implementation
- filesys structures
- block bitmap
- tracks whether each sector/block on disk is in use (there can be hundreds of millions of them)
- the bitmap uses 1 bit to encode each block's usage status, which is efficient and compact
- to reduce the size of the bitmap, 1 bit can track a group of multiple blocks
- is this a persistent structure? where should we store the bitmap?
- blocks taken up by the bitmap are marked as used in the bitmap!
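A minimal sketch of a block bitmap, assuming one bit per block packed into a byte array. The class name BlockBitmap, the first-fit alloc policy, and the helper bitmap_blocks are hypothetical choices for illustration.

```python
class BlockBitmap:
    """One bit per block: 1 means used, 0 means free."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)  # round up to whole bytes

    def is_used(self, blk):
        return bool(self.bits[blk // 8] & (1 << (blk % 8)))

    def set_used(self, blk):
        self.bits[blk // 8] |= 1 << (blk % 8)

    def set_free(self, blk):
        self.bits[blk // 8] &= ~(1 << (blk % 8)) & 0xFF

    def alloc(self):
        """First-fit allocation: claim the lowest-numbered free block."""
        for blk in range(self.num_blocks):
            if not self.is_used(blk):
                self.set_used(blk)
                return blk
        raise OSError("no free blocks")


def bitmap_blocks(num_blocks, block_size):
    """Number of disk blocks the bitmap itself occupies (rounded up)."""
    nbytes = (num_blocks + 7) // 8
    return (nbytes + block_size - 1) // block_size
```

For example, with an assumed 4096-byte block, a bitmap covering 1,000,000 blocks needs 125,000 bytes, which rounds up to 31 blocks; those 31 blocks would be marked used in the bitmap when the file system is created.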
- metadata (inode) table
- reserved blocks on disk that store the metadata for files and directories
- metadata for the root directory typically has a known location in the table
- is this a persistent structure? where should we store the inode table?
- blocks taken up by the inode table are marked as used in the bitmap!
- superblock
- file system metadata, stored at known location on disk
- includes filesys size, block size, location of the bitmap and inode table
- can locate filesys structures just from the superblock
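As a sketch of how a superblock might be serialized, the following uses Python's struct module. The field order, the 32-bit field widths, and the MAGIC value are all made up for illustration; real file systems define their own layouts.

```python
import struct

SUPERBLOCK_FMT = "<5I"  # hypothetical layout: five little-endian 32-bit fields
MAGIC = 0x46534653      # made-up magic number to recognize the file system


def pack_superblock(total_blocks, block_size, bitmap_start, inode_table_start):
    """Serialize the superblock fields into bytes for a known disk location."""
    return struct.pack(SUPERBLOCK_FMT, MAGIC, total_blocks, block_size,
                       bitmap_start, inode_table_start)


def unpack_superblock(raw):
    """Parse the superblock; everything else is located from these fields."""
    magic, total, bsize, bmap, itab = struct.unpack_from(SUPERBLOCK_FMT, raw)
    if magic != MAGIC:
        raise ValueError("not a recognized file system")
    return {
        "total_blocks": total,
        "block_size": bsize,
        "bitmap_start": bmap,
        "inode_table_start": itab,
    }
```

Mounting would start by reading the block at the known location, calling unpack_superblock, and then reading the bitmap and inode table from the block numbers it returns.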
- data layout
- so far we have only said that metadata tracks where data is on disk, but how?
- depends on how the data is laid out on disk!
- contiguous
- allocate consecutive blocks on disk
- metadata tracks the starting block number and the number of blocks
- how do you locate which block contains the ith byte of data?
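For the contiguous layout, locating byte i is pure arithmetic on the two values the metadata stores. A sketch, assuming a 4096-byte block (BLOCK_SIZE and the function name are illustrative):

```python
BLOCK_SIZE = 4096  # assumed block size in bytes


def contiguous_block_of(start_block, num_blocks, i):
    """Return (block number, offset within block) holding byte i of the file."""
    idx = i // BLOCK_SIZE  # which of the file's blocks holds byte i
    if i < 0 or idx >= num_blocks:
        raise IndexError("offset beyond allocated blocks")
    return start_block + idx, i % BLOCK_SIZE
```

No disk access is needed beyond the metadata itself, which makes random access cheap; the cost is that the file needs one unbroken run of free blocks.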
- linked
- can allocate blocks anywhere on disk; each block stores data plus a pointer to the next block
- metadata tracks the block number of the first data block
- how do you locate which block contains the ith byte of data?
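For the linked layout, finding byte i means following the chain from the first block. The sketch below assumes a 4-byte next pointer stored inside each block (so each block holds slightly less than a full block of file data) and models the on-disk pointers as a dictionary; both are illustrative assumptions.

```python
BLOCK_SIZE = 4096                # assumed block size in bytes
PTR_SIZE = 4                     # assumed size of the next-block pointer
PAYLOAD = BLOCK_SIZE - PTR_SIZE  # bytes of file data per block
END = -1                         # assumed end-of-chain marker


def linked_block_of(first_block, next_ptr, i):
    """Follow the chain from first_block; next_ptr maps block -> next block."""
    if i < 0:
        raise IndexError("negative offset")
    blk = first_block
    for _ in range(i // PAYLOAD):  # one hop (one disk read) per block skipped
        blk = next_ptr[blk]
        if blk == END:
            raise IndexError("offset beyond end of file")
    return blk, i % PAYLOAD
```

The number of disk reads grows linearly with the offset, so sequential access is fine but random access into a large file is slow.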
- array
- store an array of block pointers, data blocks can be anywhere on disk
- limited by the size of array
- how do you locate which block contains the ith byte of data?
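With an array of block pointers in the metadata, byte i is found by indexing the array. A sketch, assuming a 4096-byte block; the function names are illustrative.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes


def array_block_of(pointers, i):
    """pointers: the block-pointer array stored in the file's metadata."""
    idx = i // BLOCK_SIZE
    if i < 0 or idx >= len(pointers):
        raise IndexError("offset beyond the pointer array")
    return pointers[idx], i % BLOCK_SIZE


def max_file_size(num_pointers):
    """The array length caps the file size."""
    return num_pointers * BLOCK_SIZE
```

For instance, an array of 12 pointers limits a file to 12 * 4096 = 49152 bytes under these assumptions.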
- indexed/indirection
- instead of storing the pointer array inside the metadata itself, store it inside a separate block
- metadata tracks the (index) block holding an array of block pointers
- keeps the inode small, but an extra disk operation is needed to find a data block
- how do you locate which block contains the ith byte of data?
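For the indexed layout, the pointer array must first be read from the index block. The sketch below assumes 4-byte little-endian pointers and takes a read_block callback standing in for a disk read; both are illustrative assumptions.

```python
import struct

BLOCK_SIZE = 4096                        # assumed block size in bytes
PTR_SIZE = 4                             # assumed pointer size in bytes
PTRS_PER_BLOCK = BLOCK_SIZE // PTR_SIZE  # 1024 pointers fit in one index block


def indexed_block_of(read_block, index_block_no, i):
    """read_block(n) returns the raw bytes of block n (a stand-in for disk I/O)."""
    idx = i // BLOCK_SIZE
    if i < 0 or idx >= PTRS_PER_BLOCK:
        raise IndexError("offset beyond what one index block can address")
    raw = read_block(index_block_no)  # the extra disk operation
    (ptr,) = struct.unpack_from("<I", raw, idx * PTR_SIZE)
    return ptr, i % BLOCK_SIZE
```

Under these assumptions a single index block addresses 1024 * 4096 bytes, i.e. 4 MiB, regardless of how small the inode is.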
- extents
- one extent tracks a contiguous section of blocks
- track multiple extents via array and/or linked approach
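With extents kept in an array, byte i is found by walking the extents in file order and subtracting their lengths. A sketch, assuming each extent is a (start_block, num_blocks) pair; the representation is an illustrative choice.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes


def extent_block_of(extents, i):
    """extents: list of (start_block, num_blocks) pairs in file order."""
    if i < 0:
        raise IndexError("negative offset")
    idx = i // BLOCK_SIZE  # block index within the file
    for start, length in extents:
        if idx < length:   # the target block falls inside this extent
            return start + idx, i % BLOCK_SIZE
        idx -= length      # skip past this extent
    raise IndexError("offset beyond the last extent")
```

A file stored in a few large extents needs only a few entries, so this combines the compact metadata of the contiguous layout with the flexibility of scattered allocation.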
- multilevel indexed pointers
- track an array of block pointers: some point directly to data blocks, some to an index block (singly indirect), some to a doubly indexed block (an index block of index blocks)
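To see how an offset maps onto the levels, the sketch below classifies byte i as direct, singly indirect, or doubly indirect. The counts are assumptions for illustration: 12 direct pointers, one indirect pointer, one doubly indirect pointer, and 1024 pointers per index block.

```python
BLOCK_SIZE = 4096      # assumed block size in bytes
NUM_DIRECT = 12        # assumed number of direct pointers in the metadata
PTRS_PER_BLOCK = 1024  # assumed pointers per index block (4096 / 4)


def locate(i):
    """Return which pointer level covers byte i and the indices to follow."""
    if i < 0:
        raise IndexError("negative offset")
    idx = i // BLOCK_SIZE
    if idx < NUM_DIRECT:
        return ("direct", idx)            # pointer is in the metadata itself
    idx -= NUM_DIRECT
    if idx < PTRS_PER_BLOCK:
        return ("indirect", idx)          # one index block read needed
    idx -= PTRS_PER_BLOCK
    if idx < PTRS_PER_BLOCK ** 2:
        # two index block reads: outer index block, then inner index block
        return ("double", idx // PTRS_PER_BLOCK, idx % PTRS_PER_BLOCK)
    raise IndexError("offset beyond maximum file size")
```

Small files are served entirely from the direct pointers with no extra reads, while the maximum file size under these assumptions is 12 + 1024 + 1024 * 1024 blocks.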