CSE 451 Lecture Notes 5/16/03
Note Takers: William Harvey, Melissa Garcia

MIDTERM RESULTS:
The midterm results are in, and you can view them at http://www.cs.washington.edu/education/courses/cse451/CurrentQtr/exams/MidtermStatistics.pdf

MAIN POINT OF LAST LECTURE:
Laziness shall be rewarded often, but sometimes you get punished.

MAIN POINT OF THIS LECTURE:
Paging is expensive, but we can reduce the pain by taking advantage of locality.

Consider the formula for effective access time for a demand-paged memory:
(effective access time) = (1 - p) * (memory access time) + p * (page fault time),
where 'p' is the probability of a page fault (between 0 and 1). The "punishment" occurs during a page fault, which will take anywhere from 10 to 100 milliseconds. To get a good feel for how much punishment this actually is, consider the following example:

A 1GHz processor executes roughly 1,000,000,000 (one billion) instructions per second. At a 100 millisecond service time, a single page fault on this processor costs the equivalent of 100,000,000 instructions. If a program would take 1 second to complete in theory (without any page faulting), but in reality causes 100 page faults, then it will take 11 seconds to run: 1 sec + 100 * 100 millisec (to service each fault) = 1 sec + 10 sec = 11 sec. The moral of the story is that you want to be careful about choosing a page replacement algorithm, because page faults are a huge performance hit.
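The arithmetic above falls directly out of the effective-access-time formula. A minimal Python sketch (the 100 ns memory access time and 10 ms fault service time are illustrative assumptions, not figures from the lecture):

```python
def effective_access_time(p, mem_ns=100, fault_ns=10_000_000):
    """EAT = (1 - p) * (memory access time) + p * (page fault time).
    All times in nanoseconds: 100 ns memory access, 10 ms page fault."""
    return (1 - p) * mem_ns + p * fault_ns

print(effective_access_time(0.0))     # no faults: 100.0 ns per access
print(effective_access_time(0.0001))  # 1 fault per 10,000 accesses: ~1100 ns
```

Even one fault in ten thousand accesses makes the average access an order of magnitude slower, which is why the fault probability p dominates everything else in the formula.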

REPLACEMENT ALGORITHMS:
Q: Which property are we going to exploit when designing our replacement algorithm?
A: Locality! (Temporal locality: if we've used a page recently, we'll likely use it again soon. Spatial locality: if we've referenced one location, locations close by will likely be referenced soon.)

Q: By the same token, which program properties will we reward?
A: Locality, and "information" (have the program tell the OS "I'm done with this phase, and I'm going to begin a different phase. Take the pages I was using and do whatever you want with them.")

Q: Doesn't informed paging expose hardware details to the programmer?
A: No. It hints to the OS about what to do, rather than forcing it to do something right this instant. The OS can deal with the problem whenever it wants to.

[Graph: program execution time as a function of program (physical) memory size]

This graph compares the Random and Optimal replacement algorithms. We will always compare algorithms against Optimal, because it is the best one could possibly (impossibly, rather) do. Note that both curves flatten out at both ends: there is a limit to how good you can get and how bad you can get. In those flat regions (when memory is so plentiful that execution time is already at its lowest, or so scarce that every algorithm thrashes), no replacement algorithm will help you, so you have to find one that does not hurt you.

For the most part, replacement algorithms are designed to take advantage of locality. Here are a few of the page replacement schemes:

OPTIMAL: "Replace the page that WILL NOT BE USED for the longest period of time."
The optimal page replacement algorithm has the lowest page fault rate of all algorithms. Unfortunately, it cannot be implemented, because it is impossible to predict the future. We can use it, though, as a yardstick against which to compare other page replacement algorithms.
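Although OPT cannot run online, it is easy to simulate offline once the whole reference string is known, which is exactly how it gets used as a yardstick. A rough Python sketch (the reference string and the helper name `optimal_faults` are made-up examples):

```python
def optimal_faults(refs, nframes):
    """Count page faults under OPT: on a fault, evict the resident page
    whose next use lies farthest in the future (or never comes)."""
    frames, faults = [], 0
    for i, page in enumerate(refs):
        if page in frames:
            continue  # hit: no fault
        faults += 1
        if len(frames) < nframes:
            frames.append(page)  # free frame available
        else:
            def next_use(p):
                try:
                    return refs.index(p, i + 1)  # index of next reference
                except ValueError:
                    return float('inf')          # never used again
            victim = max(frames, key=next_use)
            frames[frames.index(victim)] = page
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(optimal_faults(refs, 3))  # 7 faults with 3 frames
```

No real algorithm on this string with 3 frames can beat those 7 faults; that floor is what the comparisons below are measured against.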

RANDOM: "Pick a page at random to replace."
Random works well when there is plenty of free memory and also when there is no free memory at all. In between those two extremes, however, random leaves a lot to be desired. We will treat the random page replacement algorithm as another benchmarking tool: we don't want our clever replacement algorithms to do worse than random.

FIFO: "Replace the oldest page."
FIFO stands for First In, First Out. As pages get paged in, maintain a queue containing those pages. The page at the front of the queue is the one that has been resident the longest; it is the "oldest" page of them all. This algorithm is based on two thoughts:
1. If a page is old, then it's probably not being used.
2. "Not used" eventually means old.
Problem: Many "old" pages may get used all the time, but this algorithm would replace those pages anyway. Also, the fault rate might actually increase when the algorithm is given more physical memory (Belady's anomaly).
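Belady's anomaly is easy to reproduce with a short simulation. A Python sketch (the reference string here is the classic textbook example, not one from the lecture):

```python
from collections import deque

def fifo_faults(refs, nframes):
    """Count page faults under FIFO with nframes physical frames."""
    frames, queue, faults = set(), deque(), 0
    for page in refs:
        if page in frames:
            continue  # hit: no fault
        faults += 1
        if len(frames) == nframes:
            frames.discard(queue.popleft())  # evict the oldest page
        frames.add(page)
        queue.append(page)
    return faults

# Belady's anomaly: adding a frame makes FIFO fault MORE often.
refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))  # 9 faults
print(fifo_faults(refs, 4))  # 10 faults -- more memory, more faults!
```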

LRU: "Replace the least recently used page."
LRU stands for Least Recently Used. Since we can't predict the future, we will use the past to make an educated guess about what is going to happen.
Q: How do we determine which page was least recently used?
A: Record/update a time stamp (within the PTE) every time a page is accessed.
Problem: Checking the time stamp for every PTE takes lots of time and is way too expensive.
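To make that cost concrete, here is a rough Python sketch of timestamp-based LRU (the function name and reference string are illustrative). Every eviction scans all resident pages' stamps, and the stamp is rewritten on every single access; doing either per-reference in software is the expense the lecture warns about:

```python
def lru_faults(refs, nframes):
    """LRU via per-page timestamps (like a stamp kept in each PTE)."""
    stamp, faults = {}, 0  # page -> time of most recent access
    for t, page in enumerate(refs):
        if page not in stamp:
            faults += 1
            if len(stamp) == nframes:
                # Scan every resident page's stamp to find the LRU victim.
                victim = min(stamp, key=stamp.get)
                del stamp[victim]
        stamp[page] = t  # update the "PTE" timestamp on EVERY access
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(lru_faults(refs, 3))  # 10 faults (vs. 7 for OPT on this string)
```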

LRU is a good example of PRECOMPUTATION. Although precomputation is great for algorithms, it's bad for paging itself. As we learned previously, "being lazy pays off," and computing/keeping information the OS doesn't really need is aggressive (not lazy).
One exception to the rule "never precompute" is the MATTRESS RULE metaphor:
1. If I need $3 for the paper boy, I can look under the mattress for the $3 that I previously put there.
2. When the paper boy leaves, I assume that no one else will be stopping by looking for money.

NEXT LECTURE: More about the LRU page replacement algorithm, and how to use a "free pool" to satisfy a page fault immediately by allowing the consumer of free pages to run asynchronously with the producer of free pages. The pool acts as a buffer.