CSE 374, Lecture 18: HW6

NOTE: this is an abbreviated version of the much longer homework 6 writeup.

Memory management and system calls

We use "malloc" and "free" in C in order to allocate space on the HEAP for data - strings, arrays, structs, etc. We understand that malloc somehow makes a "reservation" - it marks the block of memory as "taken" - and free somehow "cancels the reservation" - marking the block of memory as available again for another malloc reservation. But how do malloc and free actually work?

At this point, remember that your C program is running in a "virtual address space": the operating system is managing the program's address space to make sure that it won't collide with any other program's address space. Since the program doesn't ACTUALLY own any real address space, in order for the C program to get heap space to satisfy your malloc call, malloc has to "ask" the underlying operating system for some memory to use in its virtual address space. This is called a SYSTEM CALL.

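A system call is a function that programs can use to request things from the operating system. Some examples of what system calls do: opening or reading a file, creating a new process, and obtaining more memory for the program's address space.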

Why don't we just use the raw system call instead of malloc and free? The interface of the raw system call isn't very user-friendly. So malloc and free form a wrapper around the underlying system call implementation such that it is easy to use.

You will be doing the same thing in HW6, except that you will use malloc as a stand-in for the underlying system call.

HW6

The core part of HW6 is to implement the equivalents of malloc and free:

        void* getmem(uintptr_t size)  aka malloc
        void freemem(void* p)         aka free

How do we do this?

How do we keep track of all of the available chunks vs reserved chunks? We'll use something called a "free list", which is a linked list of nodes that store information about available chunks. This free list data structure will be shared by both getmem and freemem. Each block on the free list starts with a uintptr_t integer that gives its size, followed by a pointer to the next block on the free list. To help keep data in dynamically allocated blocks properly aligned, we require that the size of every block be a multiple of 16 bytes, and that every block address also be a multiple of 16 (this matches what the built-in malloc does on typical 64-bit systems).
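As a rough illustration of the layout just described, here is a minimal sketch of a free list node and the 16-byte rounding rule. The names free_node, ALIGNMENT, round_up16, and is_aligned16 are hypothetical (they are not part of the HW6 spec), and your own layout may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical node layout: the size comes first, then the link. */
typedef struct free_node {
  uintptr_t size;          /* size of this block in bytes */
  struct free_node* next;  /* next block on the free list, or NULL */
} free_node;

#define ALIGNMENT 16

/* Round n up to the next multiple of 16 (n itself if already a multiple). */
static uintptr_t round_up16(uintptr_t n) {
  assert(n <= UINTPTR_MAX - (ALIGNMENT - 1));  /* guard against overflow */
  return (n + (ALIGNMENT - 1)) & ~(uintptr_t)(ALIGNMENT - 1);
}

/* Nonzero when the address p is a multiple of 16. */
static int is_aligned16(const void* p) {
  return ((uintptr_t)p % ALIGNMENT) == 0;
}
```

With this sketch, a request for 100 bytes would be rounded up to 112 before searching the free list.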

When a block is requested from getmem, it should scan the free list looking for a block of storage that is at least as large as the amount requested, delete that block from the free list, and return a pointer to it to the caller. When freemem is called, it should return the given block to the free list, combining it with any adjacent free blocks if possible to create a single, larger block instead of several smaller ones.
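The scan-and-delete step in getmem could be sketched as follows. This assumes a "first fit" policy (take the first block that is big enough), which is only one possible choice; the names free_node and take_first_fit are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct free_node {
  uintptr_t size;
  struct free_node* next;
} free_node;

/* Find the first block with size >= wanted, unlink it from the list,
   and return it. Returns NULL if no block is large enough. */
static free_node* take_first_fit(free_node** head, uintptr_t wanted) {
  assert(head != NULL);
  free_node** link = head;  /* the pointer that leads to the current node */
  while (*link != NULL) {
    free_node* cur = *link;
    if (cur->size >= wanted) {
      *link = cur->next;    /* unlink cur; also updates *head if cur was first */
      cur->next = NULL;
      return cur;
    }
    link = &cur->next;
  }
  return NULL;
}
```

Walking the list with a pointer-to-pointer avoids a special case for deleting the first node.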

The actual implementation needs to be a bit more clever than this. In particular, if getmem finds a block on the free list that is substantially larger than the storage requested, it should divide that block and return a pointer to a portion that is large enough to satisfy the request, leaving the remainder on the free list. But if the block is only a little bit larger than the requested size, then it doesn't make sense to split it and leave a tiny chunk on the free list that is unlikely to be useful in satisfying future requests.
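One way to express the "split only if it is worth it" rule is sketched below. The cutoff SPLIT_THRESHOLD and the names free_node and split_block are assumptions made for illustration; the right threshold is a design decision you would tune yourself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct free_node {
  uintptr_t size;
  struct free_node* next;
} free_node;

/* Hypothetical cutoff: only split when at least this many bytes are left. */
#define SPLIT_THRESHOLD 32

/* If block is substantially larger than wanted, shrink it to wanted bytes
   and return the leftover piece as a new node (not yet linked anywhere).
   Otherwise return NULL and leave the block whole.
   Assumes wanted is already a multiple of 16. */
static free_node* split_block(free_node* block, uintptr_t wanted) {
  assert(block != NULL && wanted <= block->size);
  uintptr_t leftover = block->size - wanted;
  if (leftover < SPLIT_THRESHOLD) {
    return NULL;  /* too small to be useful: hand out the whole block */
  }
  free_node* rest = (free_node*)((uintptr_t)block + wanted);
  rest->size = leftover;
  rest->next = NULL;
  block->size = wanted;
  return rest;
}
```

The caller is responsible for putting the returned leftover piece back on the free list.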

If no block on the free list is large enough to satisfy a getmem request, getmem needs to acquire a good-sized block of storage from the underlying system, add it to the free list, then split it up, yielding a block that will satisfy the request, and leaving the remainder on the free list. Since requests to the underlying system are (normally) relatively expensive, they should yield a reasonably large chunk of storage, say at least 4K or 8K or more, that is likely to be useful in satisfying several future getmem requests.

A request for a large block will happen the very first time getmem is called. When a program that uses getmem and freemem begins execution, the free list should be initially empty. The first time getmem is called, it should discover that the (empty) free list does not contain a block large enough for the request, so it will have to call the underlying system to acquire some storage to work with.
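Acquiring a good-sized chunk from the underlying system (malloc, in HW6) might look like the sketch below. The minimum size MIN_CHUNK and the names free_node and acquire_chunk are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct free_node {
  uintptr_t size;
  struct free_node* next;
} free_node;

/* Hypothetical minimum size for each request to the underlying system. */
#define MIN_CHUNK 8192

/* Get a new chunk of at least wanted bytes (and at least MIN_CHUNK bytes)
   from malloc and set it up as a free list node.
   Returns NULL if malloc fails. */
static free_node* acquire_chunk(uintptr_t wanted) {
  assert(wanted > 0);  /* zero-byte requests should be handled by the caller */
  uintptr_t size = wanted > MIN_CHUNK ? wanted : MIN_CHUNK;
  free_node* chunk = malloc(size);
  if (chunk == NULL) {
    return NULL;
  }
  chunk->size = size;
  chunk->next = NULL;
  return chunk;
}
```

On the very first call the free list is empty, so getmem would end up here, then add the new chunk to the free list and carve the requested block out of it.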

When freemem is called, it is passed a pointer to a block of storage and it needs to add this storage to the free list, combining it with any immediately adjacent blocks that are already on the list. What freemem isn't told is how big the block is. In order for this to work, freemem somehow has to be able to find the size of the block. The usual way this is done is to have getmem actually allocate a block of memory that is a bit larger than the user's request, store the free list node or just the size of the block at the very beginning of that block, and return to the caller a pointer to the storage that the caller can use, but which actually points a few bytes beyond the real start of the block. Then when freemem is called, it can take the pointer it is given, subtract the appropriate number of bytes to get the real start address of the block, and find the size of the block there.
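The pointer adjustment described above can be sketched like this. It assumes a 16-byte header (HEADER_SIZE) so that the pointer handed to the caller stays 16-byte aligned; that constant and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical header size: a multiple of 16 keeps user pointers aligned. */
#define HEADER_SIZE 16

/* Record the block's total size at its real start, and return the
   pointer that getmem would hand to the caller. */
static void* block_to_user(void* block, uintptr_t block_size) {
  assert(block_size >= HEADER_SIZE);
  *(uintptr_t*)block = block_size;
  return (unsigned char*)block + HEADER_SIZE;
}

/* What freemem would do first: step back to the real start of the block. */
static void* user_to_block(void* p) {
  return (unsigned char*)p - HEADER_SIZE;
}

/* Read the size that was stored in the header. */
static uintptr_t block_size_of(void* p) {
  return *(uintptr_t*)user_to_block(p);
}
```

Note that the stored size here is the size of the whole block, header included, so getmem would need to request HEADER_SIZE bytes more than the caller asked for.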

How is freemem going to find nearby blocks and decide whether it can combine a newly freed block with the one(s) adjacent to it? One way is for getmem and freemem to keep the blocks on the free list sorted in order of ascending memory address. The block addresses plus the sizes stored in the blocks can be used to determine where a new block should be placed in the free list and whether it is, in fact, adjacent to another one: two blocks are adjacent when the address of the first plus its size equals the address of the second.
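A minimal sketch of inserting into an address-sorted free list and merging with neighbors is shown below. It assumes each node's size covers the whole block, and the names free_node, adjacent, and insert_and_coalesce are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct free_node {
  uintptr_t size;
  struct free_node* next;
} free_node;

/* True when block a ends exactly where block b begins. */
static int adjacent(const free_node* a, const free_node* b) {
  return (uintptr_t)a + a->size == (uintptr_t)b;
}

/* Insert block into the address-sorted list at *head, merging it with
   the following and/or preceding block when they touch. */
static void insert_and_coalesce(free_node** head, free_node* block) {
  assert(head != NULL && block != NULL);
  free_node* prev = NULL;
  free_node* cur = *head;
  while (cur != NULL && (uintptr_t)cur < (uintptr_t)block) {
    prev = cur;
    cur = cur->next;
  }
  /* Link block between prev and cur. */
  block->next = cur;
  if (prev == NULL) {
    *head = block;
  } else {
    prev->next = block;
  }
  /* Absorb the following block if it starts where block ends. */
  if (cur != NULL && adjacent(block, cur)) {
    block->size += cur->size;
    block->next = cur->next;
  }
  /* Let the preceding block absorb block if they touch. */
  if (prev != NULL && adjacent(prev, block)) {
    prev->size += block->size;
    prev->next = block->next;
  }
}
```

Merging with the following block first means that a freed block sitting between two free neighbors collapses all three into one.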

Where does the free list head pointer live? The variable that stores the beginning of the free list will need to be accessible in both the getmem and freemem implementation .c files. Just as with functions, to share a variable between files we need to declare it in a shared header file. However, variables are different - if we create a variable like this:

    int x;

then we have both DECLARED and DEFINED the variable! We haven't initialized it (i.e., x = 0), but we have allocated space for it. This is bad in header files: every .c file that includes the header would define its own x, which can lead to multiple-definition errors when the files are linked together. We just want to DECLARE it and not define it. We can do this with the keyword "extern", which tells the compiler that the variable exists but is defined somewhere else:

    extern int x;

Then in exactly one .c file, you actually define it (for example, "int x = 0;"). Every other file that includes the header can use x without creating a second copy.
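Applied to the free list head, the split between declaration and definition might look like the sketch below. It is shown as a single listing, with comments marking which file each part would live in; the file name mem_impl.h and the names free_node and free_list are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* ---- would live in the shared header, e.g. mem_impl.h ---- */
typedef struct free_node {
  uintptr_t size;
  struct free_node* next;
} free_node;

/* Declaration only: no storage is allocated here. */
extern free_node* free_list;

/* ---- would live in exactly ONE .c file ---- */
/* The single definition: this is where the storage is allocated. */
free_node* free_list = NULL;
```

Both the getmem and freemem .c files would include the header, but only one of them (or a third .c file) would contain the definition.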

uintptr_t

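We refer to the type "uintptr_t" in the HW6 specification - this is an unsigned integer type, defined in <stdint.h>, that is large enough to hold a pointer value. You can cast addresses to uintptr_t before doing arithmetic on them, so that the arithmetic is ordinary integer arithmetic in units of bytes rather than pointer arithmetic, which is scaled by the size of the pointed-to type.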
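To make the difference concrete, here is a small sketch of byte-level address arithmetic using uintptr_t. The helper names byte_distance and advance_bytes are hypothetical.

```c
#include <stdint.h>

/* Number of bytes from address lo up to address hi (assumes hi >= lo). */
static uintptr_t byte_distance(const void* lo, const void* hi) {
  return (uintptr_t)hi - (uintptr_t)lo;
}

/* The address n bytes past p, as an integer. */
static uintptr_t advance_bytes(const void* p, uintptr_t n) {
  return (uintptr_t)p + n;
}
```

For an array "int a[4]", the pointer expression a + 1 moves forward by sizeof(int) bytes, whereas advance_bytes(a, 1) moves forward by exactly one byte. The second behavior is what you want when adding a block's size in bytes to its address.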