Outline for 2/27/98
- Last time: File Workloads - How people use files.
- Administrative:
- Read Distributed System chapters
- Questions? Problems?
- Objective:
- File caches (avoiding the disk)
- Dealing with disks.
What to do about Disks?
- Disk scheduling
- Idea is to reorder outstanding requests to minimize seeks.
- Layout on disk
- Placement to minimize disk overhead
- Build a better disk (or substitute)
Disk Scheduling
- Assuming there are sufficient outstanding requests in request queue
- Focus is on seek time - minimizing physical movement of head.
- Simple model of seek performance
Seek Time = startup time (e.g. 3.0 ms) +
N (number of cylinders) *
per-cylinder move time (e.g. 0.04 ms/cyl)
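As a quick check of the model, here is a minimal sketch that plugs in the example constants from above. The function name and the choice to charge nothing for a zero-cylinder seek are assumptions made for illustration, not part of the lecture.

```python
def seek_time_ms(cylinders, startup_ms=3.0, per_cyl_ms=0.04):
    """Linear seek model: fixed startup cost plus a per-cylinder cost."""
    if cylinders == 0:
        # Assumption: if the head does not move, no startup cost is paid.
        return 0.0
    return startup_ms + cylinders * per_cyl_ms


# A 100-cylinder seek costs 3.0 + 100 * 0.04 ms under this model.
print(seek_time_ms(100))
```

Under these constants the startup term dominates short seeks, which suggests that avoiding a seek entirely matters more than shortening one by a few cylinders.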
Policies
- Generally use FCFS as baseline for comparison
- Shortest Seek Time First (SSTF) - service the request closest to the current head position
- Elevator (SCAN) - sweep in one direction, turn around when no requests beyond
- must handle the case of constant arrivals at the same position (the head could otherwise stay there and starve other requests)
- C-SCAN - sweep in only one direction, return to 0
- less variation in response time
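The policies above can be compared on a small static queue. This is a sketch under simplifying assumptions: all requests are already queued (no new arrivals), SCAN sweeps toward higher cylinders first, and the C-SCAN return is modeled as a jump straight to the lowest pending request rather than to cylinder 0. The function names and the sample queue are illustrative only.

```python
def total_movement(start, order):
    """Total cylinders crossed when servicing requests in the given order."""
    pos, total = start, 0
    for cyl in order:
        total += abs(cyl - pos)
        pos = cyl
    return total


def fcfs(start, requests):
    # Baseline: service in arrival order.
    return list(requests)


def sstf(start, requests):
    # Always pick the pending request closest to the current head position.
    pending, pos, order = list(requests), start, []
    while pending:
        nxt = min(pending, key=lambda c: abs(c - pos))
        pending.remove(nxt)
        order.append(nxt)
        pos = nxt
    return order


def scan(start, requests):
    # Sweep upward, then turn around once no requests lie beyond the head.
    up = sorted(c for c in requests if c >= start)
    down = sorted((c for c in requests if c < start), reverse=True)
    return up + down


def cscan(start, requests):
    # Sweep upward only; after the last request, wrap and sweep upward again.
    up = sorted(c for c in requests if c >= start)
    wrapped = sorted(c for c in requests if c < start)
    return up + wrapped


queue = [98, 183, 37, 122, 14, 124, 65, 67]
for policy in (fcfs, sstf, scan, cscan):
    order = policy(53, queue)
    print(policy.__name__, order, total_movement(53, order))
```

On this queue with the head starting at cylinder 53, FCFS crosses 640 cylinders and SSTF crosses 236. SCAN and C-SCAN fall in between, trading some total movement for more predictable service.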
Layout on Disk
- Can address both seek and rotational latency
- Cluster related things together
(e.g. an inode and its data, inodes in same directory (ls command), data blocks of multi-block file, files in same directory)
- Sub-block allocation to reduce fragmentation for small files
- Log-Structured File Systems
Log-Structured File Systems
- Assumption: Cache is effectively filtering out reads so we should optimize for writes
- Basic Idea: manage disk as an append-only log (subsequent writes involve minimal head movement)
- Data and meta-data (mixed) accumulated in large segments and written contiguously
- Reads work as in UNIX - once inode is found, data blocks located via index.
- Cleaning is an issue - it is needed to produce contiguous free space, correcting the fragmentation that develops over time.
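The append-only idea and the need for cleaning can be seen in a toy in-memory model. This is only a sketch: the class name, the `inode_addr` map used to find the latest inode, and the copy-everything cleaner are assumptions for illustration, and segment buffering is left out entirely.

```python
class ToyLog:
    """Toy log-structured store: blocks are only ever appended."""

    def __init__(self):
        self.log = []         # the "disk", written strictly at the tail
        self.inode_addr = {}  # hypothetical map: inode number -> log address

    def write_file(self, ino, blocks):
        # Append the data blocks, then an inode that indexes them.
        addrs = []
        for data in blocks:
            self.log.append(("data", data))
            addrs.append(len(self.log) - 1)
        self.log.append(("inode", ino, addrs))
        self.inode_addr[ino] = len(self.log) - 1

    def read_file(self, ino):
        # Once the inode is found, data blocks are located via its index.
        _, _, addrs = self.log[self.inode_addr[ino]]
        return [self.log[a][1] for a in addrs]

    def live_addresses(self):
        # Addresses still reachable from the latest inodes.
        live = set()
        for addr in self.inode_addr.values():
            live.add(addr)
            live.update(self.log[addr][2])
        return live

    def clean(self):
        # Copy only live data into a fresh log, dropping overwritten blocks.
        old_log, old_map = self.log, dict(self.inode_addr)
        self.log, self.inode_addr = [], {}
        for ino, addr in old_map.items():
            _, _, addrs = old_log[addr]
            self.write_file(ino, [old_log[a][1] for a in addrs])


fs = ToyLog()
fs.write_file(1, ["a", "b"])
fs.write_file(1, ["c", "d"])  # overwrite: old blocks become dead space
print(len(fs.log), len(fs.live_addresses()))
fs.clean()
print(len(fs.log), fs.read_file(1))
```

Overwriting a file leaves its old blocks behind in the log as dead space, which is why a cleaner is needed to recover contiguous free space.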
Build a Better Disk?
- "Better" has typically meant density to disk manufacturers - bigger disks are better.
- I/O Bottleneck - a speed disparity caused by processors getting faster more quickly than disks
- One idea is to use parallelism of multiple disks
- Striping data across disks
- Reliability issues - introduce redundancy
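Striping and redundancy can be sketched in a few lines. The assumptions here are round-robin placement of logical blocks and XOR parity as the form of redundancy (one common choice); the function names and sample data are illustrative only.

```python
from functools import reduce


def locate(block, num_disks):
    """Round-robin striping: logical block -> (disk index, offset on disk)."""
    return block % num_disks, block // num_disks


def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Consecutive logical blocks land on different disks, so they can be
# transferred in parallel.
print([locate(b, 3) for b in range(6)])

# One stripe of data blocks plus its parity block.
stripe = [b"abcd", b"efgh", b"ijkl"]
parity = xor_blocks(stripe)

# If the disk holding stripe[1] fails, XOR of the survivors and the parity
# block recovers the lost data.
rebuilt = xor_blocks([stripe[0], stripe[2], parity])
print(rebuilt == stripe[1])
```

The parity block costs one extra block per stripe and allows any single lost block in that stripe to be rebuilt.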
RAID
Redundant Array of Inexpensive Disks
RAID Level 5