Project 3 - Virtual Heap Allocator

Version 1.0

Due dates

Out:Saturday, November 9, 2002
Due:Wednesday, November 20, 2002 Friday, November 22, 2002 11:59 PM

Tasks:

Updates:

Assignment Goals

Background

Sometimes, when using a computer that is shared with other users, you might want to limit the size of your process. If your process allocates too much memory, then it might adversely affect other users of the system. Linux and other Unices provide mechanisms for limiting the resources that a process can use. You can set limits on the size of the process' stack, the size of its heap, the number of file handles it can have open, and several other things. To see your current limits, type: ulimit -a from a bash prompt. You can set limits for yourself, as a regular user, and the system administrator can set hard limits that you must stay under. Take a look at the man page for ulimit for more on this.

The trouble is that if your process reaches one if its limits, it might not be able to continue running. For example, if you limit your heap size to 16K, and your program tries to allocate 64K with malloc, the call to malloc will just fail and return a NULL pointer. To see this, type the following (minus the comments) from a bash prompt.


ulimit -S -d 1           # Limit the heap size to 1K (soft limit)
man ulimit               # Try to start a program.  This will fail.
ulimit -S -d umlimited   # Restore your limit to what it was before

This problem means that you cannot limit your process' heap size and still allow your program to continue if it needs more heap space.

In this assigment, you will solve this problem by writing your own heap allocator. The heap allocator you probably use most often is malloc which is part of the standard C library. On Linux, malloc comes by way of glibc, the GNU implementation of the standard C library. Note that malloc, itself, is not part of the kernel. Your heap allocator will be quite a bit different from malloc. In particular, it will allow a process to specify a limit to its total heap usage without causing the application to terminate if it exceends the limit. It will do this by utilizing some of the techniques used by the virtual memory system to make the process think it has an infinitely large heap.

For background on malloc, I strongly suggest reading this paper by Poul-Henning Kamp about the FreeBSD implementation of malloc. It's only 6 pages, so it should be a quick and fruitful read. For background on the binary buddy strategy for sub-page allocation, take a look at this 1193-word article on allocation.

lmalloc

Your heap allocator, called lmalloc, will consist of the following interface.

void lmalloc_init(int size_of_heap, int page_size, enum page_replacement_strategy strategy, char* pagefilename);
Initialize the virtual heap so lmalloc can work. size_of_heap is given in bytes. Signal that the system should use a particular page replacement strategy. (Tiny hint: lmalloc_init(..) will call the standard malloc(..).)
lhandle lmalloc(int bytes_to_allocate);
lmalloc allocates a block of memory for the application on the virtual heap. If there is not enough space on the virtual heap to fulfill the request, it will use page replacement to pick a block from the virtual heap and page (write) it out to disk to make room for the allocation request. Also, if the request is at least half a page (as specified in the lmalloc_init(..) call), it should just allocate a whole page. If the request is less than half a page, it should use sub-page allocation to fulfill the request.
void lfree( lhandle victim );
Deallocate a block of memory that was previously allocated by lmalloc. (Tiny hint: This should not call free.)
void* lmem( lhandle target );
lmem will take a memory handle and return a pointer to real memory. lmem should definitely be written as a macro, not a real function. It is written here this way for clarity. lmem will first check to see if the requested block is in a page that is currently in memory. If not, it will go to the page file on disk and retrieve it. Then, it will return the address in memory where the block is located.
void* lmem_d( lhandle target );
lmem_ddoes the same thing as lmem, except that it notifies the lmalloc system that the value has been changed. Thus, your system will need to set a dirty bit somewhere.
typedef int lhandle;
Actually, you can typedef it to anything you want, as long as it is lightweight. Specifically, you may not store the application's data in a struct that is part of lhandle.
enum page_replacement_strategy;
Just an enumeration that lets the application select between the four available page replacement strategies.
lmalloc_end()
Do any deallocation or clean-up tasks that need to be done.

Restrictions on the application developer

This system is not interreplaceable with the standard malloc. In other words, you could not just replace malloc calls with lmalloc calls in a working program and expect it to work. It should allow the application developer to do anything (s)he could do with malloc but it requires a few special considerations. Theoretically, you could use lmalloc and standard malloc in the same program, but that would defeat the purpose of lmalloc.

Tests

To verify your implementation, you will write a total of five programs that use your library for heap allocation:

Additional Requirements

Partners

For this project, you will need a partner. Consider working with somebody you don't know very well. This often leads to better productivity. It always leads to making connections with people with similar professional interests who you might not otherwise get to know.

Collaboration can be a great thing in a project because it lets you tackle a bigger problem, learn more, and learn how to express your technical ideas. Being able to clearly express technical ideas is a super-important for all scientists and engineers. Collaboration also solves the usual problem of wanting to discuss your solution with somebody without spoiling somebody else's fun. However, problems can arise. Maybe one person is contributing more than the other. Maybe one person is over-committed and unavailable. Maybe one person over-writes the other's files. Maybe the duo cannot decide on an acceptable solution. Usually these problems are few and far between, but if you have any such trouble, please get in touch with one of the TAs as soon as possible. All we can really do is try to facilitate communication, but usually that's all it takes.

As far as grading is concerned, both members of the duo are playing the same song, so generally we will award one score per performance.

Write-up

Just merge all of these into one write-up file and turn it in electronically with your code. The following formats are okay: PDF, HTML, text file, Word document, StarOffice/OpenOffice document. The whole thing will probably be between one and two pages.

  1. Make a brief (e.g. maybe 2-6 sentences each) man page for each function in your interface, specifically lmalloc_init, lmalloc, lfree, lmem, lmem_d, and lmalloc_end. It should explain to an application developer how to use your library. It should also explain to the application developer any constraints he/she should be aware of, such as a minimum/maximum allocation size, minimum/maximum page size, etc.
  2. Justify any constraints you place on the application developer, as though you were talking to a classmate familiar with the project. One or two sentences per constraint is fine.
  3. How do you decide how much space in memory to set aside for metadata? How did you arrive at this scheme?
  4. Give the complexity of the following events in Big-O notation in terms of "page_size" and "vheap_size". Assume the standard malloc runs in constant time. No explanation is needed.
  5. Suppose you were able to modify the Linux kernel to support your library, as needed. This would allow you to get into the address translation mechanism and put in your own layer of translation, as you see fit. How could the system be improved?
  6. What do you like best about your design? How would you improve it, if you had another week to work on it?
  7. Is there anything unusual to know about your system when testing it?
  8. Is there anything you were unable to finish? If so, explain how you would have finished it, including any details of your design that you have worked out.

Turnin

Turnin is enabled. To turnin your files, assuming they are in a directory called proj3 and you are in section AA, enter...

turnin -ccse451=AA -v -pproject3 proj3

You should have..

  1. Source code for your library (e.g. lmalloc.c).
  2. Source code for all five test programs.
  3. A Makefile to build executables for the test programs and generate your object file, named lmalloc.o.
  4. One file for your write-up.

Please, turn in only one submission per group. If there is any confusion, please send mail to Alex.

Last updated 11/19/02 by Alex Quinn (aquinn@cs)