CSE 333 15au Homework #1

Out: Monday, October 12, 2015
Due: Monday, October 26, 2015 by 11:59 pm.

[ summary | part a | part b | optional bonus | how to submit | grading ]

Summary
For homework #1, you will finish our implementation of two C data structures: in part A, a doubly-linked list, and in part B, a chained hash table.

Please read through this entire document before beginning the assignment, and please start early! This assignment involves messy pointer manipulation and malloc/free puzzles, and these can cause arbitrarily awful bugs that take time and patience to find and fix.

Part A -- doubly linked list
Context.

If you've programmed in Java, you're used to having a fairly rich library of elemental data structures upon which you can build, such as vectors and hash tables. In C, you don't have that luxury: the C standard library provides you with very little. In this assignment, you will add missing pieces of code in our implementation of a generic doubly-linked list.

At a high-level, a doubly-linked list is incredibly simple; it looks like this:


Each node in a doubly-linked list has three fields; a payload, a pointer to the previous element in the list (or NULL if there is no previous element), and a pointer to the next element in the list. If the list is empty, there are no nodes. If the list has a single element, both of its next and previous pointers are NULL.

So, what makes implementing this in C tricky? Quite a few things:

Given all of these complications, our actual linked list data structure ends up looking like this:


Specifically, we define the following types and structures:

What to do.

You should follow these steps to do this assignment:

  1. Make sure you are comfortable with C pointers, structures, malloc, and free. We will cover them in detail in lecture, but you might need to brush up and practice a bit on your own; you should have no problem Googling for practice programming exercises on the Web for each of these topics.

  2. Get the source files for hw1. Navigate to a directory that contains a checked-out copy of your cse333 Git repository and run the command git pull. (See the CSE 331 Git Tutorial for some tips if the pull command fails because you have unstaged changes or other problems.) After the pull command finishes you should see the following directories and files in your repository:
    bash$ ls
    clint.py  gtest  hw0  hw1 LICENSE.TXT
    
    (You can delete the hw0 directory if you want, but everything else needs to remain in place for this and later parts of the project over the rest of the quarter.)

  3. Look inside the hw1 directory. You'll see a number of files and subdirectories, including these that are relevant to Part A:

  4. Run "make" to verify that you can build your own versions of example_program_ll and test_suite. Make should print out a few things, and you should end up with new binaries inside the hw1 directory.

  5. Since you haven't yet finished the implementation of LinkedList.c, the binaries you just compiled won't work correctly yet. Try running them, and note that example_program_ll halts with an assertion error or a segfault and test_suite prints out some information indicating failed tests, and may crash before terminating.

  6. This is the hard step: finish the implementation of LinkedList.c. Go through LinkedList.c, find each comment that says "STEP X", and replace that comment with working code. The initial steps are meant to be relatively straightforward, and some of the later steps are trickier. You will probably find it helpful to read through the code from top to bottom to figure out what's going on. You will also probably find it helpful to recompile frequently to see what compilation errors you've introduced and need to fix. When compilation works again, try running the test driver to see if you're closer to being finished.

  7. We'll also be testing whether your program has any memory leaks. We'll be using Valgrind to do this. To try out Valgrind for yourself, do this:

Part B -- a chained hash table
Context.

A chained hash table is also a fairly simple data structure. It consists of an array of buckets, and each bucket contains a linked list of elements. When a user inserts a key/value pair into the hash table, the hash table uses a hash function to map the key into one of the buckets, and then adds the key/value pair onto the linked list. (As an important corner case, if the key of the inserted key/value pair already exists in the hash table, the hash table replaces the existing key/value pair with the new one, and returns the old key/value pair to the customer.)

So, over time, as more and more elements are added to the hash table, the linked lists hanging off of each bucket will start to grow. As long as the number of elements is a small multiple of the number of buckets, lookup time is small: you hash the key to find the bucket, then iterate through the chain (linked list) hanging off the bucket until you find the key. As the number of elements gets longer, lookup gets less efficient, so our hash table includes logic to resize itself to maintain short chains.

As with the linked list in Part A, we've given you a partial implementation of a hash table. Our hash table implementation looks approximately like this:


Specifically, we defined the following types and structures:

What to do.

You should follow these steps to do this assignment:

  1. The code you fetched in Part A also contains the files you'll need to complete your hash table implementation and test it. Similar to the linked list, the hash table implementation is split across a few files: HashTable.c contains the implementation you need to finish, HashTable.h contains the public interface to the hash table and documents all of the functions and structures that customers see, and HashTable_priv.h contains some private, internal structures that HashTable.c uses.

  2. Read through HashTable.h first to get a sense of what the hash table interface semantics are. Then, take a look at example_program_ht.c; this is a program that uses the hash table interface to insert/lookup/remove elements from a hash table, and uses the iterator interface to iterate through the elements of the hash table.

  3. test_hashtable.cc contains our Google Test unit tests for the hash table. As before, run this (on its own, and using valgrind) to see how close you are to finishing your hash table implementation.

  4. Look through HashTable.c, find all of the missing pieces (identified by STEP X comments, as before), and implement them.

  5. As before, in solution_binaries, we've provided you with linux executables (example_program_ht and the same test_suite as before) that were compiled with our complete, working version of HashTable.c You can run them to explore what should be displayed when your part B implementation is working and look at the source code for examples of how to use the data structures.

Bonus
You'll note that we provided a second Makefile called "Makefile.coverage". You can use it to invoke the "gcov" code coverage generation tool. Figure how how to (a) use it to generate code coverage statistics for LinkedList.c and HashTable.c, (b) note that the code coverage for HashTable is worse than that for the LinkedList, and (c) write additional HashTable unit tests to improve HashTable's code coverage.

The bonus task is simple, but we're (deliberately) providing next to no detailed instructions on how to do it -- figuring out how is part of the bonus task!

Please make sure your additional unit tests don't change the scoring mechanism that we use, obviously. (We'll be checking that.) Place your additional unit tests in a separate file from the original test suite. That will make it easier for us to find and evaluate your tests.

What to turn in

When you are ready to turn in your assignment, you should follow exactly the same procedures you used in hw0, except this time tag the repository with hw1-final (instead of hw0-final). Remember to clean up and commit all necessary files to your repository before you add the tag. After you have created and pushed the tag, be absolutely sure to test everything by creating a new clone of the repository in a separate, empty directory, checkout the hw1-final tag, and verify that everything works as expected. Refer to the hw0 turnin instructions for details, and follow those steps carefully.

Grading

We will be basing your grade on several elements: