Homework #3, Virtual Memory

This problem investigates the interaction between memory allocators and virtual memory. We will use the BufferPool as our working example (talk about code reuse!).

Let's assume that multiple applications are allocating buffers using a buffer pool. Each process/application has its own buffer pool. Our goal is to design the buffer pool so that we make the most efficient use of memory resources. And, we want to avoid catastrophically bad performance (i.e., thrashing).

Let us also assume that the application decides to allocate an enormous number of buffers up front. Remember: this should be OK, because we're only allocating virtual memory at this stage. So, we can afford to be pretty generous (virtual memory is kinda like Monopoly money...). During actual use, the expected working set of the application is much smaller than the buffer pool size. Although the working set is fairly small, the application frequently performs allocations and deallocations, so the set of buffers in use is changing all the time.

[An "application" that behaves this way is a network router, which is constantly allocating and deallocating buffers for holding network packet data. However, the number of deallocations is on par with the number of allocations, so the working set size stays bounded.]

For these problems, you may assume that the buffer size is 512 bytes, the page size is 4k, there are 4 million buffer entries (BUFFER_POOL_SIZE = 2^22), and the working set size of the application is a half million (2^19) buffers. Finally, you can assume that the buffers are allocated in-order from a large, sequential range of virtual memory space. The number of applications (each using a buffer pool) is variable (as described below).

Part A: The BufferPool uses a linked list data structure to maintain the set of free buffers. Do you think it would be better to implement this data structure like a stack (LIFO) or a queue (FIFO)? Why?

Part B: Using the strategy you selected in part A, how many applications can the system support before thrashing? Assume that there is 2 Gigabytes of physical memory dedicated to buffers (for all applications).

Part C: Using the strategy you did not select in part A, what is the maximum number of applications the system can support before thrashing?

Part D: Can you think of a better data structure than a linked list for storing these buffers? (Hint, assume we are in a language like C, which exposes the virtual memory addresses of pointers). (Even bigger hint: this problem came up in a research project with your esteemed teacher: See section 4.3.1 of this paper ). Why does this approach provide a benefit?

(Note: the acronym "VM" in that paper means virtual machine, not virtual memory. When acronyms start colliding, it's probably time to change fields...)