Due: Monday, November 18, 2024, at 11:59pm
Goals
Synopsis
Set-Up
Provided Files
Your Task
Background Information
Assessment
Turn-in instructions
In this assignment you will implement and test a memory management package that
produces the same functionality as the standard library malloc
and
free
functions. In this process you will gain further practice
with programming in C, manipulating pointers, and using a linked list
data structure. You will exercise your skills using a benchmarking
program (bench
), using gdb, and using make
.
In this assignment you will develop and benchmark a memory management
package. Specifically, you will implement the back end for a getmem
function that serves the job of malloc
, and implement a
freemem
function that serves the job of free
.
Starter code will be provided for you to use, and is available through the
CSE374-materials git repository. This includes a version of bench.c
that may be used as a bench tester for your solution.
You will be completing the functions declared in memory.c. This includes your
own implementation of a freemem
function, and the helper functions
required to manage the memory allocated by getmem
and de-allocated by
freemem
Before you get started, ensure that your set-up is up-to-date and appropriate.
cancun
, including gcc
git status
to determine if there are outstanding changes, and git add
and git commit
if you are unsure.)git pull upstream main
to pull the newest commit from the upstream repository.
This will give you access to the hw7
folder containing the materials for this
assignment. (upstream
specifies that you want to pull from the upstream
repository, and main
specifies the branch. You may see a text editor open to allow
you to edit the merge message. This will likely be Vim, so you can edit it and then save,
or just accept the current text and exit using :q!
.)You will do this assignment in your new hw7
folder.
You should get into the habit of committing and pushing code frequently.
Your hw7 folder contains the following files:
mem.h
: The header describing your memory programs interface.mem_internal.h
: The header describing private memory functions.memory.c
: You will modify this program to implement your memory library.bench.c
: This is a bench tester for your implemented memory system.Makefile
: A Makefile to compile your code and run a bench test.clint.py
: The linter you will use to check your coding style.The memory management package includes a header file (mem.h
)
and C implementation file (memory.c
) that specify and
implement the following functions.
Function | Description |
---|---|
void* getmem(uintptr_t size) |
Return a pointer to a new block of storage with at least |
void freemem(void* p) |
Return the block of storage
at location |
In order to get getmem
and freemem
to work, you will
want to implement a number of helper functions. These functions have been declared
in a mem_internal.h
file. It is your job to implement these helper
functions. You will be able to find them easily by searching for
COMPLETEME
in the module file. Each of these functions is prototyped,
and documented with its specification within the header file.
You may create additional helper functions if you desire. Follow the standards to prototype them and document them. These functions would not be exposed to the user, so would be placed in the internal header file.
As with any program you write, your code should be readable and understandable to anyone who knows C. In particular, for full credit your code must observe the following requirements.
As with the previous assignment, you should use the cpplint.py style checker to review your code. While this checker may flag a few things that you wish to leave as-is, most of the things it catches, including whitespace errors in the code, should be fixed. We will run this style checker on your code to check for any issues that should have been fixed. Please use the class discussion board if you have questions about any of clint's warnings and whether they can be ignored.
In addition to the required work above, a number of utility functions have been declared and implemented for you. You may modify them if you wish. These include:
Function | Description |
---|---|
void get_mem_stats( |
Places values at the specified locations corresponding to
|
void print_heap(FILE * f) |
Print a formatted listing on file
f showing the blocks on the free list. Each line of
output should describe one free block and begin with two
hexadecimal numbers (0xdddddddd , where d
is a hexadecimal digit) giving the address and length of that
block.
|
void check_heap() |
Check for possible problems with the
free list data structure. When called this function should
use assert s to verify that the free list has the
following properties:
assert
should fail and cause the program to terminate at that point. Calls
to check_heap should be included in other functions to
attempt to catch errors as soon as possible. In particular,
include calls to check_heap at the beginning
and end of functions getmem
and freemem . Include additional calls to check_heap
wherever else it makes sense.
|
You have been provided with a C module called bench.c,
.
This modules provides a bench testing module which will exercise your
memory implementation. You should take advantage of the input arguments
to control your test runs.
You can use your bench program to test the memory system as you go.
Once you think you have all the memory fucntions in working order, you might want to
rerun bench
after recompiling the code with -DNDEBUG
to turn off the assert
tests in check_heap
to see how
much faster the code runs without them. However, leave the check_heap
tests on while developing and debugging your code since this will be a big help
in catching errors.
You have been provided with a Makefile. You should take a look at this, and notice that there are a few different ways to compile the resulting code. In addition, a testing target has been created for you. You are welcome to modify this file as you see fit.
The above sections describe what you need to do. This section gives some ideas about how to do it. We discuss this further in class, and you should take advantage of the Ed discussion page to trade questions, ideas, and suggestions.
The basic idea behind the memory manager is fairly simple. At the
core, the getmem
and freemem
functions
share a single data structure, the free list, which is just
a linked-list of free memory blocks that are available to satisfy
memory allocation requests. Each block on the free list starts with
an uintptr_t
integer that gives its size followed by a
pointer to the next block on the free list. As each of these values
requires 8 bytes, we require that all blocks be a multiple of 16 bytes
in size, with all addresses multiples of 16 bytes. This helps ensure that
dynamically allocated blocks are properly aligned.
When a block is requested from getmem
, it should scan
the free list looking for a block of storage that is at least as
large as the amount requested, delete that block from the free
list, and return a pointer to it to the
caller. When freemem
is called, it should return the
given block to the free list, combining it with any adjacent free
blocks if possible to create a single, larger block instead of
several smaller ones.
The actual implementation needs to be a bit more clever than this.
In particular, if getmem
finds a block on the free list
that is substantially larger than the storage requested, it should
divide that block and return a pointer to a portion that is large
enough to satisfy the request, leaving the remainder on the free
list. But if the block is only a little bit larger than the
requested size, then it doesn't make sense to split it and leave a
tiny chunk on the free list that is unlikely to be useful in
satisfying future requests. You can experiment with this threshold
and see what number is large enough to prevent excessive
fragmentation, without wasting too much space that could have been
used to satisfy small requests.
What if no block on the free list is large enough to satisfy
a getmem
request? In that case, getmem
needs to acquire a good-sized block of storage from the underlying
system, add it to the free list, then split it up, yielding a block
that will satisfy the request, and leaving the remainder on the free
list. Since requests to the underlying system are (normally)
relatively expensive, they should yield a reasonably large chunk of
storage, say at least 4K or 8K or more, that is likely to be useful
in satisfying several future getmem
requests. Normally
the same amount is acquired each time it is necessary to go to the
underlying system for more memory. But watch out for really
big getmem
requests. If getmem
is asked
for, say, a 200K block, and no block currently on the free list is that large,
it needs to get at least that much in a single request
since the underlying system cannot be relied on to
return adjacent blocks of storage on successive calls.
So what is "the underlying system"? For our purposes, we'll
use the standard malloc
function! Your memory manager
should acquire large blocks of storage from malloc
when
it needs to add blocks to its free list. malloc
normally guarantees that the storage it returns is aligned on
16-byte or larger boundaries on modern systems, so we won't
worry about whether the block we get from
malloc
is properly aligned.
Notice that a request for a large block will happen the very first
time getmem
is called(!). When a program that
uses getmem
and freemem
begins execution,
the free list should be initially empty. The first
time getmem
is called, it should discover that the
(empty) free list does not contain a block large enough for the
request, so it will have to call the underlying system to acquire
some storage to work with. If implemented cleanly, this will not be a
"special case" in the code -- it's just the normal
action taken by getmem
when it needs to get new
blocks for the free list!
What about freemem
? When it is called, it is passed a pointer
to a block of storage and it needs to add this storage to the free list, combining
it with any immediately adjacent blocks that are already on the list. What
freemem
isn't told is how big the block
is(!). In order for this to work, freemem
somehow has
to be able to find the size of the block. The usual way this is
done is to have getmem
actually allocate a block of
memory that is a bit larger than the user's request, store the
block size at the beginning of the block, and return to the caller
a pointer to the storage that the caller can use, but which actually points
a few bytes beyond the real
start of the block. Then when freemem
is called, it
can take the pointer it is given, subtract the appropriate number
of bytes to get the real start address of the block, and find the
size of the block there. Our list nodes are set up to hold this size.
How is freemem
going to find nearby blocks and decide
whether it can combine a newly freed block with one(s) adjacent to
it? There are various ways to do this (as usual), but a good basic
strategy is for getmem
and freemem
to keep
the blocks on the free list sorted in order of ascending memory
address. The block addresses plus the sizes stored in the blocks
can be used to determine where a new block should be placed in the
free list and whether it is, in fact, adjacent to another one.
It could happen that a request to freemem
would result
in one of the underlying blocks obtained from the system (i.e., from
malloc
) becoming
totally free, making it possible to return that block to the
system. But this is difficult to detect and not worth the trouble in
normal use, so you shouldn't deal with this possibility in your
code.
For more information, in addition to Google and Wikipedia, an authoritative discussion is in Sec. 2.5, Dynamic Storage Allocation, in The Art of Computer Programming, Vol. I: Fundamental Algorithms, by Donald Knuth.
Here are a few ideas that you might find useful. Feel free to use or ignore them as you wish, although you do need to use the 64-bit pointer types correctly.
Your code should work on, and we will evaluate it on, the CSE
Linux systems (cancun
and the CSE virtual
machine). These are 64-bit machines, which means pointers and
addresses are 64-bit (8-byte) quantities. Your code will probably work
on other 64-bit machines, and, if you're careful, might work
on 32-bit machines if it is recompiled, although we won't test
that.
One thing that is needed in several places is to treat pointer
values as unsigned integers so we can do arithmetic to compute memory
block addresses and sizes. We need to be able to cast 64-bit values
between integer and pointer types without losing any
information. Fortunately the library <inttypes.h>
contains a number of types and macros that make the job easier (and
fairly portable!). The main type we want to use
is uintptr_t
, which is a type that is guaranteed to be
the right size to hold a pointer value so that we can treat it as an
unsigned integer. A pointer value (void*
or any other
pointer type) can be cast to uintptr_t
to create an
integer value for arithmetic, and uintptr_t
values can be
cast to pointers when they hold integers that we want to treat as
addresses. (There is also an intptr_t
type that is a
signed integer type of the right size to hold a pointer, but for our
project it would be best to stick with unsigned values.)
You can print pointers and uintptr_t
values
with printf
. Use format %p
to print a
pointer value, e.g., printf("%p\n",
ptr);
. For uintptr_t
values, since these are
stored as long, unsigned integers on our 64-bit systems, they can be
printed as decimal numbers using the %lu
format
specifier: printf("%lu\n",uintvalue);
. It turns
out that <inttypes.h>
defines string macros that
make it possible to print values without knowing the actual size of
the underlying type. The magic incantation to print
an uintptr_t
value ui
is printf("%" PRIuPTR "\n",
ui);
. There are other formatting macros to do things like print
signed integer pointer values as decimal numbers
(PRIdPTR
) or in hex (PRIxPTR
). See a good C
reference for details.
As with all projects, you should start (very) small and incrementally build up the final project. Here are some ideas:
getmem
first by itself. Just
have freemem
return without doing
anything. Get freemem
working later. For this
homework, getmem
has been completed, but it is calling
helper functions. It is your job to implement these helper functions.
If you do this well, implementing freemem
afterwards
will require on a few lines of code.getmem
/freemem
requests
when you are first testing the memory manager routines.print_heap
function can be very helpful during
debugging. Also, gdb
can be useful for exploring the
free list (expecially gdb's
x
command) and for examining the
operation of your code.gdb
to check that it really works as you
expect.Makefile
to compile and run small test programs,
or run the benchmark program with various argument values. If you
find yourself typing the same command more than a few times to run
a test, add it to your Makefile
as the command for a
target with a suitable name
(e.g., testonlygetmem
, testbigmemory
,
testmanysmallcalls
, etc.).get_mem_stats
function may be useful during
debugging to see the effect on the free list of various patterns
of getmem
and freemem
requests. Don't
feel constrained to use it only to produce the required benchmark
program reports.check_heap()
and other assert
s in
your program. These can be particularly useful while you are
testing and debugging, especially to check that pointers are
not NULL
when they shouldn't be and that the heap
data structures have not been corrupted. In particular, include
calls to check_heap()
at the beginning and end
of getmem
and freemem
to verify that
those functions don't introduce any obvious, checkable errors in
the free list. Leave the assert
s
and check_heap()
calls in your code even after
things seem to be working. You can always
put -DNDEBUG
in a gcc
command in
some Makefile
target to disable asserts if you want
to run your code without them.valgrind
is unlikely to be particularly helpful
for this assignment. We are manipulating pointers in non-standard ways
and valgrind
will probably report many spurious problems
that are not really errors given what the code needs to do.Your code will be primarily evaluated by an autograder. You are welcome to submit again in order to achieve a better score, but, you will be well served by using your bench tester to evaluate your code on your own first. We are expecting the specified helper functions to be implemented to spec, and details about these functions will be examined along with the overall memory allocation performance. There will be additional manual grading to look at code quality.
You will submit this homework to the Gradescope HW7: C Memory Implementation, via Gitlab.
You will first update your Gitlab repository. Use git add
and git commit
to ensure that your updated t9_tests.c
, and any required additional
&ldquot;.txt&rdquot; files, are committed. This will be, at minimum, your memory.c
.
These should be located in the hw7
folder, located at the top
level of your repository. User git push
to bring the origin/remote repository up-to-date.
Once you locate the Gitlab assignment you will tap the "GitLab" button on the bottom:
Once you submit your code the autograder may take some time to run. You may resubmit for a higher grade, but your should always do as much testing as possible on your own platform before resubmitting.