## Lecture 24 (Wed 11/26/2008)

- HW #4 (optional) Due Fri Dec 5 during class
- Lab #4 Hardware Due Fri Dec 5 at 5pm
- Today: I/O!




### I/O is slow!

- How fast can a typical I/O device supply data to a computer?
  - A fast typist can enter 9-10 characters a second on a keyboard.
  - Common local-area network (LAN) speeds go up to 100 Mbit/s, which is about 12.5 MB/s.
  - Today's hard disks provide a lot of storage and transfer speeds around 40-60 MB per second.
- Unfortunately, this is excruciatingly slow compared to modern processors and memory systems:
  - Modern CPUs can execute more than a billion instructions per second.
  - Modern memory systems can provide 2-4 GB/s bandwidth.
- I/O performance has not increased as quickly as CPU performance, partially due to neglect and partially to physical limitations.
  - This is changing, with faster networks, better I/O buses, RAID drive arrays, and other new technologies.
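To put these rates on one scale, the arithmetic can be sketched in Python. The figures are the ones quoted on the slide; the `rates` dictionary, the helper `slowdown_vs_memory`, the midpoints chosen for the quoted ranges, and the use of decimal units (1 MB = 10^6 bytes, 1 character = 1 byte) are assumptions made for illustration.

```python
# Rough comparison of the data rates quoted above, in bytes per second.
# Assumptions (not from the slide): 1 character = 1 byte, decimal units
# (1 MB = 10**6 bytes), and midpoints for the quoted ranges.
MB = 10**6
GB = 10**9

rates = {
    "keyboard": 10,            # about 10 characters per second
    "lan": 100 * 10**6 // 8,   # 100 Mbit/s divided by 8 bits per byte
    "disk": 50 * MB,           # midpoint of 40-60 MB/s
    "memory": 3 * GB,          # midpoint of 2-4 GB/s
}


def slowdown_vs_memory(device: str) -> float:
    """Return how many times slower than memory the device supplies data."""
    return rates["memory"] / rates[device]


for name in ("keyboard", "lan", "disk"):
    print(f"{name:8s} {rates[name]:>12,d} B/s  "
          f"{slowdown_vs_memory(name):,.0f}x slower than memory")
```

Under these assumptions the LAN figure works out to 12.5 MB/s, matching the slide, and even the disk is well over an order of magnitude slower than memory.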















### Estimating disk times

- The overall response time is the sum of the seek time, rotational delay, transfer time, and overhead.
- Assume a disk has the following specifications.
  - An average seek time of 3 ms
  - A 6000 RPM rotational speed
  - A 10 MB/s average transfer rate
  - 2 ms of overhead
- How long does it take to read a random 1,024-byte sector?
  - The average rotational delay is half a rotation: at 6000 RPM one rotation takes 10 ms, so the delay is about 5 ms.
  - The transfer time will be about 1,024 bytes / 10 MB/s ≈ 0.1 ms.
  - The response time is then 3 ms + 5 ms + 0.1 ms + 2 ms ≈ 10.1 ms.
- How long would it take to read a whole track (512 sectors) selected at random, if the sectors could be read in any order?
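The single-sector estimate can be sketched in Python as a sum of the four components named on the slide. The function names are hypothetical, and the sketch assumes that the average rotational delay is half a rotation and that 1 MB = 10^6 bytes; the whole-track question is left open here.

```python
def rotational_delay_ms(rpm: float) -> float:
    """Average rotational delay in ms, assumed to be half of one rotation."""
    ms_per_rotation = 60_000 / rpm
    return ms_per_rotation / 2


def transfer_time_ms(num_bytes: int, rate_mb_per_s: float) -> float:
    """Time in ms to move num_bytes at the given rate (1 MB = 10**6 bytes)."""
    return num_bytes / (rate_mb_per_s * 10**6) * 1000


def response_time_ms(seek_ms: float, rpm: float, num_bytes: int,
                     rate_mb_per_s: float, overhead_ms: float) -> float:
    """Response time = seek + rotational delay + transfer + overhead."""
    return (seek_ms
            + rotational_delay_ms(rpm)
            + transfer_time_ms(num_bytes, rate_mb_per_s)
            + overhead_ms)


# The disk from the slide: 3 ms seek, 6000 RPM, 10 MB/s, 2 ms overhead.
t = response_time_ms(seek_ms=3, rpm=6000, num_bytes=1024,
                     rate_mb_per_s=10, overhead_ms=2)
print(f"{t:.2f} ms")
```

For a single small sector the seek and rotational delay dominate; the transfer itself contributes only about a tenth of a millisecond.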























### Example Bus Problems, cont.

- Assume the following system:
  - A CPU and memory share a 32-bit bus running at 100 MHz.
  - The memory needs 50 ns to access a 64-bit value from one address.
- For this system, a single read can be performed in eight cycles or 80 ns, for an effective bandwidth of $(12.5 \times 10^6 \text{ reads/second}) \times (8 \text{ bytes/read}) = 100$ MB/s.
- If the memory was widened, such that 128-bit values could be read in 50 ns, what is the new effective bandwidth?
  - The read can now be done in $(1 + 5 + 4) = 10$ cycles, or 100 ns. This yields an effective bandwidth of $(10 \times 10^6 \text{ reads/second}) \times (16 \text{ bytes/read}) = 160$ MB/s.
- What is the bus utilization (fraction of cycles the bus is used) to achieve the above bandwidth?
  - Of the 10-cycle access, sending the address takes 1 cycle and transferring the data takes 4 cycles, so the utilization is $(1 + 4)/10 = 50\%$.
- If utilization were 100% (achievable by adding additional memories), what effective bandwidth would be achieved?
  - Since we have 1 address transfer for every 4 data transfers, the effective bandwidth would be 80% of the total bandwidth: $(32 \text{ b} \times 100 \text{ MHz}) \times 80\% = (400 \text{ MB/s}) \times 0.8 = 320$ MB/s.
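The cycle counting behind these answers can be sketched in Python. The function and constant names are hypothetical; the sketch assumes one bus cycle to send the address, a memory access that occupies whole 10 ns cycles without using the bus, and one 32-bit data transfer per cycle.

```python
BUS_WIDTH_BYTES = 4           # 32-bit bus
BUS_CLOCK_HZ = 100 * 10**6    # 100 MHz
CYCLE_NS = 10                 # one cycle at 100 MHz


def read_cycles(access_ns: int, bytes_per_read: int) -> tuple[int, int, int]:
    """Return (address, access, data) cycle counts for one read."""
    address = 1                                  # assumed: 1 cycle for the address
    access = access_ns // CYCLE_NS               # memory busy, bus idle
    data = bytes_per_read // BUS_WIDTH_BYTES     # one bus word per cycle
    return address, access, data


def effective_bandwidth_mb_s(access_ns: int, bytes_per_read: int) -> float:
    """Bytes delivered per second, in MB/s (1 MB = 10**6 bytes)."""
    total = sum(read_cycles(access_ns, bytes_per_read))
    reads_per_second = BUS_CLOCK_HZ / total
    return reads_per_second * bytes_per_read / 10**6


def bus_utilization(access_ns: int, bytes_per_read: int) -> float:
    """Fraction of cycles in which the bus carries an address or data."""
    address, access, data = read_cycles(access_ns, bytes_per_read)
    return (address + data) / (address + access + data)


def full_utilization_bandwidth_mb_s(access_ns: int, bytes_per_read: int) -> float:
    """Bandwidth if every cycle carried an address or data word."""
    address, _, data = read_cycles(access_ns, bytes_per_read)
    raw_mb_s = BUS_WIDTH_BYTES * BUS_CLOCK_HZ / 10**6   # 400 MB/s
    return raw_mb_s * data / (address + data)


print(effective_bandwidth_mb_s(50, 8))           # 64-bit memory
print(effective_bandwidth_mb_s(50, 16))          # 128-bit memory
print(bus_utilization(50, 16))
print(full_utilization_bandwidth_mb_s(50, 16))
```

Separating the address, access, and data cycles makes it easy to see which term each question changes: widening the memory adds data cycles, while adding memories hides the access cycles.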