Persistent Storage
Storage Devices
- persistent/nonvolatile, retains data after power down
- Hard Drive (HDD) / Spinning Disk
- large capacity at low cost, block level access, not byte addressable
- physical motion needed to read and write, access latency in milliseconds
- Solid State Drive (SSD)
- large capacity at intermediate cost (~3x HDD), block level access, not byte addressable
- no physical moving parts, microsecond access latency
Hard Drive
- what does a spinning disk look like?
- head: floats just above the platter surface (~3nm), reads data from and writes data to disk sectors
- sector: unit of reads and writes on disk, 512 bytes, contains error correcting code
- track: length varies across disk, outer tracks have more sectors
- separated by unused guard regions to reduce likelihood of neighboring corruption
- only outer half of radius is used, most sectors are in the outer half
- what happens on a disk read?
- a read request is sent to disk controller
- find the right platter and surface, arm moves to the right track
- head reads while the disk spins (desired sector will spin under the head)
- transfer the data that was read back to the host
- access latency
- total time = seek time + rotation time + transfer time
- seek time: time to move disk arm over the desired track (1-20ms)
- rotation time: time for the desired sector to rotate under the disk head (based on RPM, 4-15ms)
- e.g. 7200 RPM = 120 RPS = 0.12 rotations per ms = 8.3 ms per rotation
- on average it takes half a rotation to get to the desired sector, so 8.3 / 2 = ~4.2 ms
- transfer time: time to transfer data onto/off the disk (based on disk bandwidth, often only a few us per sector)
- e.g. 120 MiB/s bandwidth = 125829120 B/s
- sector = 512 bytes, 512 / 125829120 * 1000 = ~0.004 ms (about 4 us) per sector
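The latency formula above can be sketched in a few lines of Python. This is only a rough model: the function names are made up for illustration, and the 4 ms seek time is an assumed value picked from the 1-20ms range, not a number from a real drive.

```python
def rotation_ms(rpm: float) -> float:
    """Time for one full rotation, in milliseconds."""
    return 60_000 / rpm


def transfer_ms(num_bytes: int, bandwidth_bytes_per_s: float) -> float:
    """Time to move num_bytes on/off the disk, in milliseconds."""
    return num_bytes / bandwidth_bytes_per_s * 1000


def hdd_access_ms(seek_ms: float, rpm: float, num_bytes: int,
                  bandwidth_bytes_per_s: float) -> float:
    """total time = seek time + rotation time + transfer time.

    Rotation time is taken as half a rotation (the average case).
    """
    return (seek_ms
            + rotation_ms(rpm) / 2
            + transfer_ms(num_bytes, bandwidth_bytes_per_s))


# Assumed example values: 4 ms seek, 7200 RPM, one 512-byte sector, 120 MiB/s.
total = hdd_access_ms(seek_ms=4.0, rpm=7200, num_bytes=512,
                      bandwidth_bytes_per_s=120 * 2**20)
print(f"one-sector read: {total:.2f} ms")
```

Seek and rotation dominate: the transfer of a single sector contributes only a few microseconds to a total of several milliseconds.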
- sequential vs random access
- how many seeks would we need to do if we read 10 consecutive sectors on the same track?
- how many seeks would we need to do if we read 10 random sectors on different tracks?
- IOPS (I/O operations per second)
- # of I/O requests completed / total time taken (in seconds)
- the access pattern of the I/O requests matters
- IOPS of 10 random reads (one sector per read)?
- IOPS of 10 sequential reads (one sector per read)?
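One way to work through the two IOPS questions is the sketch below. It assumes a simplified model: every random read pays a full seek plus half a rotation, while the sequential reads pay one seek and one half rotation up front and then only transfer time. The 4 ms seek, 7200 RPM, and 120 MiB/s figures are assumed example values.

```python
SEEK_MS = 4.0                      # assumed average seek time
HALF_ROTATION_MS = 60_000 / 7200 / 2   # 7200 RPM, average rotational delay
SECTOR_BYTES = 512
BANDWIDTH_BPS = 120 * 2**20        # 120 MiB/s
TRANSFER_MS = SECTOR_BYTES / BANDWIDTH_BPS * 1000


def random_read_iops(num_reads: int) -> float:
    """Each read lands on a different track: seek + rotate + transfer every time."""
    total_ms = num_reads * (SEEK_MS + HALF_ROTATION_MS + TRANSFER_MS)
    return num_reads / (total_ms / 1000)


def sequential_read_iops(num_reads: int) -> float:
    """Consecutive sectors on one track: one seek and one rotational delay in total."""
    total_ms = SEEK_MS + HALF_ROTATION_MS + num_reads * TRANSFER_MS
    return num_reads / (total_ms / 1000)


print(f"random:     {random_read_iops(10):.0f} IOPS")
print(f"sequential: {sequential_read_iops(10):.0f} IOPS")
```

Under these assumptions the sequential case comes out roughly ten times higher (on the order of 1200 IOPS versus about 120), because the seek and rotation costs are paid once instead of ten times.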
Solid State Drive
- no moving parts, NAND-based flash
- components
- page: unit of read and write, 2-4 KB, not the same as VM pages!
- block: unit of erasure, 1-8 MB, span hundreds of pages
- can only write to an empty (erased) page; if there is no empty page, an entire block must be erased first to create empty pages
- operations
- read (a page): can read any page, fast sequential and random access, tens of microseconds
- program (a page): program a page in an erased block by setting certain bits to 0 to write data, slower than a read (typically hundreds of microseconds)
- erase (a block): erase a block by setting all bits in the block to 1, slow (a few ms)
- what happens to existing data in the block?
- once a block is erased, it's ready to be programmed
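The read/program/erase rules can be illustrated with a toy model of a single flash block. The class name, the tiny page and block sizes, and the error handling are all made up for illustration; real flash pages and blocks are far larger.

```python
PAGE_BYTES = 4         # toy size, real pages are a few KB
PAGES_PER_BLOCK = 4    # toy size, real blocks span hundreds of pages


class FlashBlock:
    """Toy model of one flash block (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.erase_count = 0
        self._reset()

    def _reset(self) -> None:
        # Erased state: every bit in every page is 1.
        self.pages = [bytes([0xFF] * PAGE_BYTES) for _ in range(PAGES_PER_BLOCK)]
        self.programmed = [False] * PAGES_PER_BLOCK

    def read(self, page: int) -> bytes:
        return self.pages[page]

    def program(self, page: int, data: bytes) -> None:
        if len(data) != PAGE_BYTES:
            raise ValueError("must program exactly one page")
        if self.programmed[page]:
            raise ValueError("page already programmed; erase the block first")
        # Programming can only flip bits from 1 to 0, modeled as a bitwise AND.
        self.pages[page] = bytes(old & new for old, new in zip(self.pages[page], data))
        self.programmed[page] = True

    def erase(self) -> None:
        # Erase works on the whole block: all existing data in it is lost.
        self._reset()
        self.erase_count += 1


block = FlashBlock()
block.program(0, bytes([0x0F] * PAGE_BYTES))
block.erase()  # page 0's data is gone; every page reads back as all 1s again
```

This also answers the question about existing data: an erase wipes every page in the block, so any data that is still live has to be copied somewhere else before the erase.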
- reliability
- a block becomes unusable after a certain number (10K-100K) of program/erase cycles
- repeated writes to the same page are bad for endurance
- wear leveling: try to spread writes across the blocks as evenly as possible
- SSD needs a way to flexibly remap logical blocks to physical blocks and pages
- Flash Translation Layer (FTL)
- allows the client (OS) to perform read and write operations on logical blocks/pages
- translates read/write requests for logical blocks/pages into read, erase, and program operations on physical blocks/pages
- allows SSD to do internal management (wear leveling, garbage collection, see OSTEP: SSD)
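A minimal sketch of the remapping idea, assuming one possible design: a page-mapped FTL that writes log-style, always to the next erased physical page. The class `TinyFTL` and its fields are hypothetical, and garbage collection and wear leveling are deliberately left out.

```python
class TinyFTL:
    """Toy page-mapped FTL (hypothetical; no garbage collection or wear leveling)."""

    def __init__(self, num_blocks: int = 4, pages_per_block: int = 4) -> None:
        self.pages_per_block = pages_per_block
        # Physical flash: flash[block][page] holds the page contents (None = erased).
        self.flash = [[None] * pages_per_block for _ in range(num_blocks)]
        self.mapping = {}      # logical page number -> (block, page)
        self.invalid = set()   # physical pages holding stale data
        self.next_free = 0     # next erased physical page, in log order

    def write(self, logical_page: int, data: bytes) -> None:
        total_pages = len(self.flash) * self.pages_per_block
        if self.next_free >= total_pages:
            # A real FTL would garbage-collect and erase blocks here.
            raise RuntimeError("no erased pages left")
        block, page = divmod(self.next_free, self.pages_per_block)
        self.next_free += 1
        old = self.mapping.get(logical_page)
        if old is not None:
            self.invalid.add(old)      # old copy becomes garbage, not overwritten
        self.flash[block][page] = data  # "program" a fresh page
        self.mapping[logical_page] = (block, page)

    def read(self, logical_page: int) -> bytes:
        block, page = self.mapping[logical_page]
        return self.flash[block][page]


ftl = TinyFTL()
ftl.write(7, b"v1")
ftl.write(7, b"v2")   # same logical page, different physical page
print(ftl.read(7), ftl.mapping[7], ftl.invalid)
```

Because an overwrite goes to a new physical page instead of the old one, repeated writes to one logical page are spread across the flash. The stale pages collected in `invalid` are what garbage collection would later reclaim by erasing whole blocks.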
- latency
- sequential access is still faster than random access, but the gap is much smaller than on a hard drive
- total time = access latency + transfer time + (erasure time if applicable)
- e.g. transfer rate of 500MiB/s = 524288000 B/s
- transfer 4KiB of data = 4096 bytes, 4096 / 524288000 * 1000000 = 7.8us
- total time for reading 4KiB of data = 10us (access latency) + 7.8us (transfer time) = 17.8us
- IOPS
- # of I/O operations / total time
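The SSD latency and IOPS formulas can be checked with a short calculation. This sketch assumes requests are served one at a time (no internal parallelism) and that no erase is needed; the function names are made up for illustration.

```python
def ssd_read_us(num_bytes: int, access_latency_us: float,
                bandwidth_bytes_per_s: float) -> float:
    """total time = access latency + transfer time (no erase on the read path)."""
    transfer_us = num_bytes / bandwidth_bytes_per_s * 1_000_000
    return access_latency_us + transfer_us


def iops(num_ops: int, total_time_us: float) -> float:
    """# of I/O operations / total time, with time converted to seconds."""
    return num_ops / (total_time_us / 1_000_000)


# Values from the example: 4 KiB read, 10 us access latency, 500 MiB/s.
per_read = ssd_read_us(4096, access_latency_us=10, bandwidth_bytes_per_s=500 * 2**20)
print(f"4 KiB read: {per_read:.1f} us")
print(f"IOPS: {iops(1, per_read):.0f}")
```

With these numbers a single 4KiB read takes about 17.8us, which works out to roughly 56,000 reads per second when issued back to back.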