PRACTICE: Homework 6A: Bench Tester

Due: Friday, May 19, 2023, at 11:59pm

Before you start:
Submit Canvas comment specifying whether you wish to work with a partner

Do a git pull on cse374-materials to get the updated starter code (in HW6)

Assignment Goal

Homework 6 is a project that continues our exploration of procedural programming, memory management, and software tools. In particular in this assignment you will: implement and test a memory management package that has the same functionality as the standard library malloc and free functions. In Part A (this section) you will focus on developing your software testing skills by creating a bench tester. In Part B you will use this bench tester to evaluate your memory management package.

Synopsis

In this assignment you will develop benchmarking system for a memory management package. Starter code will be provided for you to use, and is available through the CSE374-materials git repository. This includes a pre-compiled object file of the memory system for use in Part A, and a pre-compiled object file of the bench tester for Part B.

This portion of the project (Part A) is worth a total of 20 points. You will be evaluated on the correctness of your bench tester, your use of the Makefile to compile and run your tester, and the contents of a Readme file. Notice that Part A is treated as a 'Practice' homework, and is therefor eligible to be dropped at the end of the quarter.

Team Work

You may work in a team for the duration of HW6 Parts A and B. If you wish to work with a team mate you must submit a request via Canvas by Wednesday, May 10th. We will create a shared gitlab repository for your team to use. Please notice that each participating team member will recieve the same grade at the completion of this project. In order to be considered a participating team member you must have commits on the gitlab repository. If you are part of a team for HW6 Part A, you will expected to continue working with that partner for HW6 Part B.

GitLab

You should use your git repository on the CSE GitLab server for this assignment; you cannot use another repository elsewhere. (And, as is true of all assignments, your solution code should not be publicly available on any repository where it could be accessed by other students in the class this quarter or in the future.)

Requirements

This project consists of two main technical pieces: a memory management package, and a program to exercise it and report statistics. While starter code is provided to give you a jump on development, you are ultimately responsible for both portions of the code and for making them work together.

For this portion of the project (Part A), you are provided with some starter code, as well as an object file memory.o containing a functional memory management system. You will modify the Makefile, a Readme, and the bench.c code. For the second portion of the project (Part B), you will modify the code to produce your own version of the memory management system, which can then be tested using your bench testing code.

Memory Management

The memory management portion of the includes a header file mem.h and definition of the following four functions. In Part A you will be provided with an object file (memory.o) that contains a compiled implementation of these functions.

Function	Description
`void* getmem(uintptr_t size)`	Return a pointer to a new block of storage with at least `size` bytes of memory. The value `size` must be greater than 0. If `size` is not positive, or if for some reason `getmem` cannot satisfy the request, it returns `NULL`. Type `uintptr_t` is an unsigned integer type that can hold a pointer value (i.e., can be converted to or from a pointer of type `void *`, or other pointer type, exactly). It is defined in header `<inttypes.h>` (and also `<stdint.h>`). See discussion below.
`void freemem(void* p)`	Return the block of storage at location `p` to the pool of available free storage. The pointer value `p` must be one that was obtained as the result of a call to `getmem`. If `p` is `NULL`, then the call to `freemem` has no effect and returns immediately. If `p` has some value other than one currently allocated by `getmem` the operation of `freemem` is undefined.
`void get_mem_stats( uintptr_t* total_size, uintptr_t* total_free, uintptr_t* n_free_blocks)`	Store statistics about the current state of the memory manager in the three integer variables whose addresses are given as arguments. The information stored should be as follows: `total_size`: total amount of storage in bytes acquired by the memory manager so far to use in satisfying allocation requests. (In other words, the total amount requested from the underlying system via malloc.) `total_free`: the total amount of storage in bytes that is currently stored on the free list, including any space occupied by header information or links in the free blocks. `n_free_blocks`: the total number of individual blocks currently stored on the free list. See the discussion in Part B outlining the implementation of the memory manager for more details about these quantities.
`void print_heap(FILE * f)`	Print a formatted listing on file `f` showing the blocks on the free list. This is used in Part B.

Test and Benchmark

In this portion of the project you will implement a program named bench, whose source code is stored in a file bench.c. When this program is run, it should execute a large number of calls to functions getmem and freemem to allocate and free blocks of random sizes and in random order. This program should allow the user to specify parameters that control the test. The command-line parameters, and their default values are given below. Trailing parameters can be omitted, in which case default values should be used. Square brackets [] mean optional, as is the usual convention for Linux command descriptions.

Synopsis: bench [ntrials] [pctget] [pctlarge] [small_limit] [large_limit] [random_seed]

Parameters:

ntrials: total number of getmem plus freemem calls to randomly perform during this test. Default 10000.
pctget: percent of the total getmem/freemem calls that should be getmem. Default 50.
pctlarge: percent of the getmem calls that should request "large" blocks with a size greater than small_limit. Default 10.
small_limit: largest size in bytes of a "small" block. Default 200.
large_limit: largest size in bytes of a "large" block. Default 20000.
random_seed: initial seed value for the random number generator. Default: some more-or-less random number such as the the system time-of-day clock (or bytes read from /dev/urandom if you're feeling adventurous).

(The parameter list is, admittedly, complex, but the intent is that this program will be executed by various commands in your Makefile(s), so you will not have to repeatedly type long command lines to run it.)

When bench is executed, it should perform ntrials memory operations. On each operation, it should randomly decide either to allocate a block using getmem or free a previously acquired block using freemem. It should make this choice by picking a random number with a pctget chance of picking getmem instead of freemem. If the choice is to free a block and all previously allocated blocks have already been freed, then there is nothing to do, but this choice should be counted against the ntrials total and execution should continue.

If the choice is to allocate a block, then, if the pointer returned by getmem is not NULL, the bench program should store the value 0xFE in each of the first 16 bytes of the allocated block starting at the pointer address returned by getmem. If the requested block size is smaller than 16 bytes, all of the requested bytes should be initialized to 0xFE.

If the choice is to free a block, one of the previously allocated blocks should be picked randomly to be freed. The bench program must pick this block and update any associated data structures used to keep track of allocated blocks in amortized constant (O(1)) time so that the implementation of the bench program does not have unpredictable effects on the processor time needed for the test.

The next three parameters are used to control the size of the blocks that are allocated. In typical use, memory managers receive many more requests for small blocks of storage than large ones, and the order of requests is often unpredictable. To model this behavior, each time a new block is allocated, it should be a large block with probability pctlarge; otherwise it should be a small block (use a random number generator to make this decision with the specified probability). If the decision is to allocate a small block, request a block whose size is a random number between 1 and small_limit. If the decision is to allocate a large block, request a block whose size is is a random number between small_limit and large_limit.

While the test is running, the benchmark program should print the following statistics to stdout:

Total CPU time used by the benchmark test so far in seconds (show enough fractional digits to provide useful information if possible, although the granularity of the system clock may be too large for this to be meaningful for short tests).
Total amount of storage acquired from the underlying system by the memory manager during the test so far (e.g., the total_size quantity from get_mem_stats, above).
Total number of blocks on the free storage list at this point in the test.
Average number of bytes in the free storage blocks at this point in the test.

The program should print this 10 times during execution, evenly spaced during the test. In other words, the first report should appear after 10% of the total getmem/freemem calls have executed, then after 20%, 30%, etc., and finally after the entire test has run. You may format this information however you wish, but please keep it brief and understandable - one line for each set of output numbers should be enough.

Additional Requirements

Besides the software specifications above, you must meet the following requirements for this assignment.

You should use either your individual or your team's repository in Gitlab. Be sure to commit and push early and often.
You should create a Makefile with at least the following targets:
- bench (this should be the default target). Generate the bench executable program.
- test. Run the bench test program with default parameters. This should recompile the program first if needed to bring it up to date.
- clean. Remove only the bench.o files, bench executable, emacs backup files (*~). You will not want to remove the memory.o file.
You may add any additional targets that you wish for your convenience. There are some ideas for useful targets in the Implementation Suggestions section, below.
You should create a Readme file to accompany your work This file should give a brief summary of:
- Your name(s), and repository url (if used)
- A summary of the results you observed on several runs of your bench program. This does not need to be exhaustive (and should not be exhausting), but it should give the reader an idea of how the provided object code worked, how fast it was, and how efficient it was in its use of memory. This will serve as a record for evaluating your own system in Part B.
- A summary of any resources you consulted during this development. Your code, of course, must be your own, but feel free to research ideas along the way.
Finally, your code should be of the usual high quality, with clean layout, good comments, and so forth. Use cpplint to check for possible style issues that may need correcting. (Note: cpplint may flag rand and ask you to use the multi-threaded version. For this reason we have included a version of cpplint in the hw6 folder that removes this flag.)

Repository Notes

You have a repository on gitlab you may use for this assignment on the CSE gitlab (https://gitlab.cs.washington.edu). If you are working with in a team a new repository will be created for which you will share access. You should use your experience with gitlab to access this resource.

Memory Management

Details about developing a memory management system that complies with the functions listed above are included in Part B of this project. Feel free to read ahead by referring to that assignment.

Implementation Suggestions

Here are a few ideas that you might find useful. Feel free to use or ignore them as you wish, although you do need to use the 64-bit pointer types correctly.

64-bit Pointers and ints

Your final memory management code should work on, and we will evaluate it on, the CSE Linux system (Seaside. These are 64-bit machines, which means pointers and addresses are 64-bit (8-byte) quantities. Your code will probably work on other 64-bit machines, and, if you're careful, might work on 32-bit machines if it is recompiled, although we won't test that.

One thing that is needed in several places is to treat pointer values as unsigned integers so we can do arithmetic to compute memory block addresses and sizes. We need to be able to cast 64-bit values between integer and pointer types without losing any information. Fortunately the library <inttypes.h> contains a number of types and macros that make the job easier (and fairly portable!). The main type we want to use is uintptr_t, which is a type that is guaranteed to be the right size to hold a pointer value so that we can treat it as an unsigned integer. A pointer value (void* or any other pointer type) can be cast to uintptr_t to create an integer value for arithmetic, and uintptr_t values can be cast to pointers when they hold integers that we want to treat as addresses. (There is also an intptr_t type that is a signed integer type of the right size to hold a pointer, but for our project it would be best to stick with unsigned values.)

You can print pointers and uintptr_t values with printf. Use format %p to print a pointer value, e.g., printf("%p\n", ptr);. For uintptr_t values, since these are stored as long, unsigned integers on our 64-bit systems, they can be printed as decimal numbers using the %lu format specifier: printf("%lu\n",uintvalue);. It turns out that <inttypes.h> defines string macros that make it possible to print values without knowing the actual size of the underlying type. The magic incantation to print an uintptr_t value ui is printf("%" PRIuPTR "\n", ui);. There are other formatting macros to do things like print signed integer pointer values as decimal numbers (PRIdPTR) or in hex (PRIxPTR). See a good C reference for details.

The Benchmark Program

The command line can contain several integer parameters. These need to be converted from character strings ("500") to binary int values. There are various library functions that are useful: look at atoi and related ones. Take advantage of the Linux getopt library function if it helps.

The benchmark program relies heavily on random numbers. The standard library function rand can be used to generate sequences of pseudo-random numbers. Given a particular starting number (the seed), rand (or any pseudo-random number generator) will always generate the same sequence of numbers on successive calls. This can be very helpful during testing (i.e., things are basically random, but the sequence is reproducible). If you want to generate a different sequence of numbers each time the program is executed, you can set the seed to some quantity that is different on each run -- the system time-of-day clock is a frequent choice -- and a different value for each execution should be the default if no seed is given on the benchmark program command line. Alternatively, modern Linux systems provide a special file /dev/urandom that returns random bytes whenever it is read, and you can read bytes from here to get a random starting value.

One of the benchmark quantities that should be printed is the processor time used. The clock library function can be used to measure this. Store the time right before starting the tests, then subtract this beginning time from the current clock time whenever you need to get the elapsed time. Unfortunately, on many Linux systems clock is updated infrequently. If your test is fast enough that clock has the same value before and after the test, don't worry about it. Alternatively you can explore whether there are better timing functions available. If you use one of these please be sure it is available on the CSE Linux machines so the program will work when we run it. (This has been a problem in the past when people developed the code using other systems only to have their entire project fail to compile because they were using a timing function or header that was not portable and not found on the CSE machines.)

Finally, the benchmark program needs to keep track of all of the pointers returned by getmem but not yet freed, and randomly pick one of these to free when the "coin toss" says to free some storage. The obvious way to handle this is to allocate a "big enough" array using malloc (not using getmem! Why?) and store the pointers there. When a pointer is picked randomly to be freed, you can move another pointer from the end of the list to the spot occupied by the freed pointer and reduce the size of the list by 1. That way, picking the pointer and updating the list can be done in O(1) (constant) time, so the order in which the pointers are picked won't affect the time needed by the benchmark program itself to run the tests.

Developing and Testing

Don't be shy about adding lots of targets to your Makefile to compile and run small test programs, or run the benchmark program with various argument values. If you find yourself typing the same command more than a few times to run a test, add it to your Makefile as the command for a target with a suitable name (e.g., testnoargs, testallargs, testsetrandomseed, etc.).

cpplint.py is provided. You should use this version early to check your code for styles issues.

Submission / Turning-In

For this assignment, you will turn in the code via Gradescope as usual. You should be able to submit it directly from Gitlab. Please make sure that all the code associated with this assignment is in a folder called HW6A. If you are working as a team we will double check your gitlab commits to ensure that each team member is active.

You should submit a folder called HW6A containing a Readme, Makefile, and bench.c. For testing we will use your Makefile to compile and run the code, using our own memory.o file.

PRACTICE: Homework 6A: Bench Tester

Assignment Goal

Jump To:

Synopsis

Team Work

GitLab

Requirements

Memory Management

Test and Benchmark

Additional Requirements

Repository Notes

Memory Management

Implementation Suggestions

64-bit Pointers and ints

The Benchmark Program

Developing and Testing

Submission / Turning-In