Out: Friday, June 28, 2019
Due: Thursday, July 11, 2019 by 11:59 pm
You will finish our implementation of two C data structures: a doubly-linked list (Part A) and a chained hash table (Part B).
Please read through this entire document before beginning the assignment, and please start early! This assignment involves messy pointer manipulation and malloc/free puzzles, and these can cause arbitrarily awful bugs that take time and patience to find and fix.
All CSE 333 homeworks are only supported on Attu and the CSE Home VM. We do not support building and running this assignment in any other work environments.
If you've programmed in Java, you're used to having a fairly rich library of elemental data structures upon which you can build, such as vectors and hash tables. In C, you don't have that luxury: the C standard library provides you with very little. In this assignment, you will add missing pieces of code in our implementation of a generic doubly-linked list.
At a high-level, a doubly-linked list looks like this:
Each node in a doubly-linked list has three fields: a payload, a
pointer to the previous element in the list (or NULL
if there is no
previous element), and a pointer to the next element in the list.
If the list is empty, there are no nodes. If the list has a single
element, both of its next and previous pointers are NULL
.
So, what makes implementing this in C tricky? Quite a few things:
Given all of these complications, our actual linked list data structure ends up looking like this:
Specifically, we define the following types and structures:
You should follow these steps to do this assignment:
git pull
. (See
the CSE 333 Git Tutorial for some tips if
the pull
command fails because you have unstaged
changes or other problems.) After the pull
command
finishes you should see the following directories and files in your
repository:bash$ ls clint.py gtest hw0 hw1
Makefile
: a makefile you can use to compile the
assignment using the Linux command make all on Attu and the VM.
LinkedList.h
: a header file that defines and documents the API to
the linked list. A customer of the linked list includes this
header file and uses the functions defined within in. Read
through this header file very carefully to understand how the
linked list is expected to behave.
LinkedList_priv.h
, LinkedList.c
: LinkedList_priv.h is a
private header file included by LinkedList.c; it defines the
structures we diagrammed above. LinkedList.c contains the
partially completed implementation of our doubly-linked list.
Your task will be to finish the implementation. Take a minute
and read through both files; note that there are a bunch of
places in LinkedList.c that say "STEP X:" these labels identify
the missing pieces of the implementation that you will finish.
example_program_ll.c
: this is a simple example of how a
customer might use the linked list; in it, you can see the
customer allocating a linked list, adding elements to it,
creating an iterator, using the iterator to navigate a bit, and
then cleaning up.
test_linkedlist.cc
: this file contains unit tests that we wrote
to verify that the linked list implementation works correctly.
The unit tests are written to use the Google Test unit
testing framework, which has similarities to Java's JUnit testing
framework. As well, this test driver will assist the TA in
grading your assignment: as you add more pieces to the
implementation, the test driver will make it further through the
unit tests, and it will print out a cumulative score along the
way. You don't need to understand what's in the test driver for
this assignment, though if you peek inside it, you might get
hints for what kinds of things you should be doing in your implementation!
solution_binaries
: in this directory, you'll find some Linux
executables, including example_program_ll
and test_suite
.
These binaries were compiled with a complete, working version of
LinkedList.c; you can run them to explore what should be displayed
when your assignment is working!
LinkedList.c
.
Go through LinkedList.c, find each comment that says "STEP X", and replace
that comment with working code. The initial steps are meant to
be relatively straightforward, and some of the later steps are
trickier. You will probably find it helpful to read through
the code from top to bottom to figure out what's going on.
You will also probably find it helpful to recompile frequently
to see what compilation errors you've introduced and need to fix.
When compilation works again, try running the test driver to
see if you're closer to being finished.
static
) helper
functions in your implementation, and you should do that when it
improves modularity.
valgrind --leak-check=full ./solution_binaries/example_program_llNote that Valgrind prints out that no memory leaks were found. Similarly, try running the test driver under Valgrind:
valgrind --leak-check=full ./solution_binaries/test_suiteand note that Valgrind again indicates that no memory leaks were found.
A chained hash table is a data structure that consists of an array of buckets, with each bucket containing a linked list of elements. When a user inserts a key/value pair into the hash table, the hash table uses a hash function to map the key into one of the buckets, and then adds the key/value pair onto the linked list. (As an important corner case, if the key of the inserted key/value pair already exists in the hash table, the hash table replaces the existing key/value pair with the new one, and returns the old key/value pair to the customer.)
So, over time, as more and more elements are added to the hash table, the linked lists hanging off of each bucket will start to grow. As long as the number of elements is a small multiple of the number of buckets, lookup time is small: you hash the key to find the bucket, then iterate through the chain (linked list) hanging off the bucket until you find the key. As the number of elements gets longer, lookup gets less efficient, so our hash table includes logic to resize itself to maintain short chains.
As with the linked list in Part A, we've given you a partial implementation of a hash table. Our hash table implementation looks approximately like this:
Specifically, we defined the following types and structures:
You should follow these steps to do this assignment:
HashTable.c
contains the
implementation you need to finish, HashTable.h
contains the public
interface to the hash table and documents all of the functions
and structures that customers see, and HashTable_priv.h
contains
some private, internal structures that HashTable.c
uses.
HashTable.h
first to get a sense of what the hash
table interface semantics are. Then, take a look at
example_program_ht.c
; this is a program that uses the hash table
interface to insert/lookup/remove elements from a hash table,
and uses the iterator interface to iterate through the elements
of the hash table.
test_hashtable.cc
contains our Google Test unit tests for the
hash table. As before, run this (on its own, and using valgrind)
to see how close you are to finishing your hash table implementation.
HashTable.c
, find all of the missing pieces
(identified by STEP X comments, as before), and implement them.
solution_binaries
, we've provided you with linux
executables (example_program_ht
and the same test_suite
as before)
that were compiled with our complete, working version of
HashTable.c
You can run them to explore what should be displayed
when your part B implementation is working
and look at the source code for examples of how to use the data structures.
You'll note that we provided a second Makefile called Makefile.coverage
.
You can use it to invoke the "gcov" code coverage generation tool. Figure
how to (a) use it to generate code coverage statistics for
LinkedList.c
and HashTable.c
, (b) note that the code coverage for HashTable
is worse than that for the LinkedList, and (c) write additional HashTable
unit tests to improve HashTable's code coverage.
The bonus task is simple, but we're (deliberately) providing next to no detailed instructions on how to do it – figuring out how is part of the bonus task!
Please make sure your additional unit tests don't change the scoring mechanism that we use, obviously. (We'll be checking that.) Place your additional unit tests in a separate file from the original test suite. That will make it easier for us to find and evaluate your tests.
When you are ready to turn in your assignment, you should follow
exactly the same procedures you used in hw0, except this time tag
the repository with hw1-final
(instead
of hw0-final
). Remember to clean up and commit and
push all necessary files to your repository before you add and
push the tag.
After you have created and pushed the tag, be
absolutely sure to test everything ON ATTU OR THE VM by creating a new
clone of the repository in a separate, empty directory, checkout
the hw1-final
tag, and verify that everything works
as expected. Refer to the hw0 submission instructions for details
and follow those steps carefully.
If you fail to check your work and your project doesn't build properly when the same steps are done by the course staff to grade it, you may lose a huge amount of the possible credit for the assignment even if almost absolutely everything is actually correct.
We will be basing your grade on several elements:
test_linkedlist.cc
and test_hashtable.cc
. If your code fails a
test, we won't attempt to understand why: we're planning on just
including the number of points that the test drivers print out.