Outline for 2/25/98
- Last time: Hierarchical Directory Structures.
Symbolic links for convenience of short names.
- Administrative:
- Project 3 available now
- Questions on recent lectures?
- Objective: More on directory implementation,
How are files used? Implications for design and implementation.
UNIX Inodes
Pathname Resolution
What to do about long paths?
- Make long lookups cheaper - cluster inodes and data on disk to make each component resolution step somewhat cheaper
- Immediate files - meta-data and first block of data co-located
- Collapse prefixes of paths - hash table
- "Cache it" - in this case, directory info
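A minimal sketch of the "collapse prefixes" and "cache it" ideas above, assuming a toy in-memory directory tree. The class ToyFS, its method names, and the counter dir_reads are hypothetical and only for illustration; a real kernel name cache must also invalidate entries on rename and unlink, which this sketch ignores.

```python
class ToyFS:
    """Toy directory tree: each directory inode maps a name to an inode number."""

    ROOT = 2  # assumption: root inode number 2, following the usual UNIX convention

    def __init__(self):
        self.dirs = {self.ROOT: {}}  # inode number -> {name: inode number}
        self.next_ino = self.ROOT + 1
        self.dir_reads = 0           # simulated directory reads (one per component)
        self.prefix_cache = {}       # hash table: path prefix -> inode number

    def mkdir(self, parent, name):
        """Create an empty directory under `parent` and return its inode number."""
        ino = self.next_ino
        self.next_ino += 1
        self.dirs[parent][name] = ino
        self.dirs[ino] = {}
        return ino

    def lookup(self, dir_ino, name):
        """One component resolution step; stands in for reading a directory."""
        self.dir_reads += 1
        return self.dirs[dir_ino][name]

    def resolve(self, path):
        """Resolve an absolute path, skipping the longest cached prefix."""
        parts = [p for p in path.split("/") if p]
        ino, start = self.ROOT, 0
        # Probe the hash table from the longest prefix down to the shortest.
        for i in range(len(parts), 0, -1):
            prefix = "/" + "/".join(parts[:i])
            if prefix in self.prefix_cache:
                ino, start = self.prefix_cache[prefix], i
                break
        # Walk only the remaining components, caching each new prefix.
        for i in range(start, len(parts)):
            ino = self.lookup(ino, parts[i])
            self.prefix_cache["/" + "/".join(parts[:i + 1])] = ino
        return ino


fs = ToyFS()
usr = fs.mkdir(ToyFS.ROOT, "usr")
local = fs.mkdir(usr, "local")
bin_ino = fs.mkdir(local, "bin")
lib_ino = fs.mkdir(local, "lib")

fs.resolve("/usr/local/bin")   # cold: 3 directory reads
fs.resolve("/usr/local/bin")   # fully cached: no additional reads
fs.resolve("/usr/local/lib")   # prefix /usr/local cached: 1 additional read
print(fs.dir_reads)            # 4
```

In this toy, the second resolution of the same path costs no directory reads, and a sibling path pays only for its last component.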
File System Data Structures
Finally Arrive at File
- What do users seem to want from the file abstraction?
- What do these usage patterns mean for file structure and implementation decisions?
- What operations should be optimized first?
- How should files be structured?
- Is there temporal locality in file usage?
- How long do files really live?
Generalizations from UNIX Workloads
- Standard Disclaimers that you can't generalize…but anyway…
- Most files are small (fit into one disk block) although most bytes are transferred from longer files.
- Most opens are for reading, and most bytes transferred are by read operations.
- Accesses tend to be sequential and to cover the whole file (100% of its bytes).
More on Access Patterns
- There is significant reuse (re-opens) - most opens go to files that are opened repeatedly, and the re-open comes soon after the previous one. Directory nodes and executables also exhibit good temporal locality.
- Looks good for caching!
- Use of temp files is a significant part of file system activity in UNIX - very limited reuse, short lifetimes (less than a minute).
File Structure Alternatives
- Contiguous
- only 1 block pointer needed (start of the file), but causes fragmentation, and growth is a problem.
- Linked
- each block points to next block, directory points to first, OK for sequential access
- Indexed
- index structure required, better for random access into file.
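To compare the three alternatives above, here is a rough sketch of how each one locates logical block n of a file. Each function returns the disk address plus the number of extra block reads needed before the target block itself can be read. The function names and the dictionary used as the chain of next pointers are assumptions made for illustration, and the indexed case assumes a single-level index block that is not already in memory.

```python
def contiguous_block(start, n):
    """Contiguous: the address is computed directly from the start pointer."""
    return start + n, 0


def linked_block(first, next_ptr, n):
    """Linked: follow n next-pointers, reading each earlier block on the way."""
    addr, reads = first, 0
    for _ in range(n):
        addr = next_ptr[addr]  # the pointer is stored in the block just read
        reads += 1
    return addr, reads


def indexed_block(index, n):
    """Indexed: one read of the index block, then a direct table lookup."""
    return index[n], 1


# A 4-block file scattered over disk addresses 7 -> 3 -> 9 -> 1.
layout = [7, 3, 9, 1]
chain = {7: 3, 3: 9, 9: 1, 1: None}

print(contiguous_block(100, 3))    # (103, 0)
print(linked_block(7, chain, 3))   # (1, 3)
print(indexed_block(layout, 3))    # (1, 1)
```

The linked cost grows with n, which is why it is acceptable for sequential access but poor for random access. The indexed cost stays constant in this single-level sketch.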
File Buffer Cache
- Avoid the disk for as many file operations as possible.
- Cache acts as a filter for the requests seen by the disk - reads served best.
- Delayed writeback will avoid going to disk at all for temp files.
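A minimal sketch of a write-back buffer cache, to illustrate why delayed writeback can keep short-lived temp files off the disk entirely. The class BufferCache and its counters are hypothetical. The cache here is unbounded (no eviction), sync is called by hand rather than by a periodic daemon, and metadata writes are ignored.

```python
class BufferCache:
    """Toy write-back cache keyed by (file name, block number)."""

    def __init__(self):
        self.disk = {}        # simulated disk: (file, block) -> data
        self.cache = {}       # cached blocks: (file, block) -> data
        self.dirty = set()    # cached blocks not yet written to disk
        self.disk_reads = 0
        self.disk_writes = 0

    def read(self, file, blk):
        key = (file, blk)
        if key not in self.cache:      # miss: go to the disk
            self.disk_reads += 1
            self.cache[key] = self.disk[key]
        return self.cache[key]

    def write(self, file, blk, data):
        key = (file, blk)
        self.cache[key] = data         # update the cache only
        self.dirty.add(key)            # the disk write is delayed until sync()

    def delete(self, file):
        # Drop cached blocks; dirty ones are discarded without being written.
        for key in [k for k in self.cache if k[0] == file]:
            del self.cache[key]
            self.dirty.discard(key)
        for key in [k for k in self.disk if k[0] == file]:
            del self.disk[key]

    def sync(self):
        """Write all dirty blocks back to disk."""
        for key in sorted(self.dirty):
            self.disk[key] = self.cache[key]
            self.disk_writes += 1
        self.dirty.clear()


bc = BufferCache()
bc.write("/tmp/scratch", 0, b"intermediate")   # temp file: created,
bc.read("/tmp/scratch", 0)                     # read back,
bc.delete("/tmp/scratch")                      # and deleted before any sync
bc.write("report.txt", 0, b"final")
bc.sync()
print(bc.disk_reads, bc.disk_writes)           # 0 1
```

Only the block of the surviving file reaches the disk. The cost of the delay is that dirty data can be lost if the machine crashes before the next sync.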