CSE 490c - Homework 4b

Due: Friday, May 14, 2:30 pm, over the web (see turn-in details below).

In this assignment, you will begin working on the group projects, as introduced in HW 4a. Before getting into the C implementation, you will set up your CVS repository and write acceptance tests.

Overview

Your database program will accept a series of input commands on STDIN and produce output on STDOUT. It will have the ability to store "rows" of data in memory, which will simply be lists of key/value pairs, and it will be able to look up and delete particular rows by matching against some subset of the keys and values. (There is no restriction that all rows must have the same keys.) Your database will also support commands to save and load its data to and from disk. The query language you will support and sample inputs and outputs are provided below.

Query Language:
insert key="value" ...
The insert command takes a list of key/value pairs and inserts a row into the database. Items in the list are separated by spaces, and you should put quotes around your values (in case they have spaces or other symbols). Key names should only contain letters and numbers. The insert command requires at least one key/value pair; your program must print an error message if none is given.
lookup key="value" ...
The lookup command takes a list of key/value pairs as input and prints a list of all rows that contain all of the specified key/value pairs. (Each row may have additional keys and values, and the rows that are returned may differ in structure.) If no rows are found, a message indicating this is printed. If no key/value pairs are given, the entire database is printed.
delete key="value" ...
The delete command behaves similarly to lookup, except that it deletes the rows in addition to printing the output list. If no key/value pairs are given, the entire database is deleted.
save filename
The save command writes the current contents of the database to the given filename on disk, overwriting it if it exists.
load filename
The load command reads in the contents of the given saved database and adds them to the current database in memory. If the file exists, you may assume that it was produced by a save command, and thus has valid contents.
# comment text
Any line beginning with a hash character (#) should be considered a comment and should thus be ignored by your program. No output should be produced for comments.

Note that you should print an error message if you encounter syntax errors in the input or if the expectations stated above are not met (i.e., insert must have at least one key/value pair, and the file being loaded must exist).
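
As an illustration of the error cases above, the following shell sketch writes a small input session in which every non-comment line should make your program print an error message. The function name write_bad_input and the file name no-such-file.db are made up for this example, and treating a missing closing quote as a syntax error is an assumption; the exact set of syntax errors you detect is up to your group.

```shell
#!/bin/sh
# write_bad_input FILE
# Writes a short session to FILE. Each command in it breaks one of the
# rules of the query language, so each should yield an error message.
write_bad_input() {
    cat > "$1" <<'EOF'
# insert needs at least one key/value pair
insert
# key names may contain only letters and numbers
insert first-name="John"
# assumed syntax error: the closing quote is missing
insert name="John Doe
# the file being loaded must exist (assumes no file has this name)
load no-such-file.db
EOF
}
```

Calling write_bad_input input.99 would create the file in the current directory. The comment lines inside the session are ignored by the database, so they can document what each command is meant to test.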

Sample input session:

insert name="John Doe" accountType="Checking" balance="100"
insert name="Jane Doe" accountType="Checking" balance="200"
insert name="John Doe" accountType="Savings" balance="50"
insert vaultBalance="10000"
lookup accountType="Checking"
# Delete Jane's account
delete name="Jane Doe"
delete name="Charlie" branch="WA"
save myDatabase.db

Possible corresponding output:

Record inserted.

Record inserted.

Record inserted.

Record inserted.

Lookup results:
name: John Doe, accountType: Checking, balance: 100
name: Jane Doe, accountType: Checking, balance: 200

Deleted rows:
name: Jane Doe, accountType: Checking, balance: 200

No rows deleted.

Database saved in myDatabase.db.

Note that a blank line separates the output of one command from the next. You may choose to do this or use some other delimiter (e.g., ---------------), but you need to have some clear marker to distinguish the output of each command.


Tasks

  1. Create a CVS repository. We will send each group an email with your group name (e.g., c490c-a), and we have created group-writable project directories for each group in /projects/instr/04sp/cse490c/ where you should store your CVS repository. Inside your group's directory, you should create a directory for each member of your group, plus your CVS repository, as described below. You can then each check out a copy of the repository to work on, with the ability to merge your changes through CVS. Note: you may also check out a copy of your repository in your home directory or somewhere else convenient, if you prefer to do your work there.

    1. One person in your group should create a directory called cvsroot in your group's project directory. Start a repository in this directory using the syntax in the lecture slides, where your cvsDir argument will be:

      /projects/instr/04sp/cse490c/<group_name>/cvsroot
      
      Then create a project in your repository with a name of your choice (probably something simple like "db", but you can be creative). Note that it's important to run the import command from an empty directory, which you can remove afterwards.

    2. After the above steps are complete, each person in your group should do the following steps. Create a directory with the same name as your username in your group's project directory. For example, if I were in group c490c-z, I would create a directory named:

      /projects/instr/04sp/cse490c/c490c-z/creis
      

    3. Inside this directory, check out the project from your CVS repository, using the syntax from the lecture slides. This will create an essentially empty directory with the same name as your project. (The only thing inside it should be a bookkeeping directory named CVS; you shouldn't touch anything inside here.)

    4. To get a feel for how CVS works, create a text file in your project directory, add it to your repository, commit it, and see if other people in your group can get a copy of it by updating. Feel free to remove this file from the repository as well, to get practice before working on important files.

    5. For the remainder of the project, all of your files and directories (with the exception of temporary or scratch files) should be added to your CVS repository, so that all of your group members can contribute to them.
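
The repository steps above can be sketched in shell as follows. The lecture slides remain the reference for the exact syntax; this sketch only uses standard CVS subcommands (init, import, checkout, add, commit, remove). GROUP, PROJECT, the function names, the "vendor" and "start" import tags, and practice.txt are all placeholders chosen for illustration.

```shell
#!/bin/sh
# Placeholders: substitute your own group name and project name.
GROUP=${GROUP:-c490c-z}
PROJECT=${PROJECT:-db}

# cvsroot_for GROUP: prints the repository path for that group (step 1).
cvsroot_for() {
    echo "/projects/instr/04sp/cse490c/$1/cvsroot"
}

# Step 1 (one group member only): create the repository and the project.
create_repository() {
    root=$(cvsroot_for "$GROUP")
    mkdir -p "$root"
    cvs -d "$root" init
    # Import from an empty scratch directory, then remove that directory.
    scratch=$(mktemp -d)
    ( cd "$scratch" && cvs -d "$root" import -m "Initial import" "$PROJECT" vendor start )
    rmdir "$scratch"
}

# Steps 2 and 3 (every member): check out a working copy inside a
# directory named after your username.
checkout_project() {
    mine="/projects/instr/04sp/cse490c/$GROUP/$(id -un)"
    mkdir -p "$mine"
    ( cd "$mine" && cvs -d "$(cvsroot_for "$GROUP")" checkout "$PROJECT" )
}

# Step 4: run inside the checked-out project directory. No -d is needed
# there, because the CVS bookkeeping directory records the repository.
practice_commit() {
    echo "hello" > practice.txt
    cvs add practice.txt
    cvs commit -m "Add practice file" practice.txt
}

# Step 4, continued: delete the practice file and record the removal.
practice_remove() {
    cvs remove -f practice.txt
    cvs commit -m "Remove practice file" practice.txt
}
```

After one member runs practice_commit, the others can run "cvs update -d" in their own working copies to receive the new file.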

  2. Write tests. As discussed in class, writing some of the tests before starting implementation can be a useful technique that helps you think about how your program needs to behave in common and uncommon cases. By writing unit tests first, you can easily run them after writing each part of the code to see if you met your own expectations. (It's still important to add more tests as you write the code, though, to cover things you didn't think about beforehand.)

    You can use this strategy of unit testing for units of any size: individual functions, modules, or whole programs. For this part of the assignment, you will write sample inputs and the expected outputs that your final program should produce. (This level of granularity is called acceptance testing.)

    1. Create a tests directory in your project directory where you can store your test files. Be sure that exactly one person adds each file and directory you create to your CVS repository; your other group members may need to use "cvs update -d" to get new directories that have been added to the repository.

    2. You should create several input files that each contain a sample session (a collection of commands to send to the database on STDIN). Each file should be named input.NN, where NN is a unique number. Your sample input sessions should exercise the common and uncommon uses of your program, and they should include both well-formed and badly-formed inputs. (Specifically, you should include test inputs that do not meet the expectations stated in the query language description, so that you can be sure your error handling works.)

      It may help to have both small and reasonably sized sessions, so that at least some tests will pass early on. Also, you may not want to have different sample input files depend on data saved in other input files, since one test's failure might then affect another test.

    3. For each input.NN file that you create, create a corresponding output.NN file with the expected output for each sample input session. You should think carefully about the exact syntax you want your program to use for output; remember to keep it simple.

    4. Create a README file in your tests directory. In this file, briefly describe which aspects of your program you chose to test and why, and give brief arguments for how your sample input files test those aspects. You may also choose to put some of this information in the test files themselves as comments, in which case this file can simply be an overview of what was tested and what was not.

    5. Write a script that takes two arguments (the name of the "sample inputs" file and the name of the "expected outputs" file) and runs your program on the input file. The script should produce an actual.NN file (overwriting it if it exists) with the actual output of your program, and compare the results to the expected output (possibly using diff and/or cmp). It should print a clear message telling the user whether the test passes or fails, and what the difference is if the test fails.

      Note: Obviously, your program does not exist yet, so the script should always print a failure! You may want to define the name of your program in a variable to make it easy to change once you write your program. Feel free to write a "dummy" script that stands in for your program and produces either no output or hardcoded output, just to see whether your test script works on simple cases.

    6. It would be tedious to run the test script for each pair of input/output files every time you make a change to your program. Instead, write another script (or modify your first one) to loop over every pair of input/output files in the directory. You should have a rule in your Makefile for this directory that runs this script over all your tests.
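
One possible shape for the scripts in steps 5 and 6 is sketched below as a single shell script. It is only a starting point under some assumptions: the program name ./db, the function names, and the PASS/FAIL message wording are all invented for this example, and only STDOUT is captured in actual.NN.

```shell
#!/bin/sh
# PROGRAM is a placeholder name; point it at your database executable
# (or a dummy script) once you have one.
PROGRAM=${PROGRAM:-./db}

# run_test INPUT EXPECTED
# Feeds INPUT to $PROGRAM on stdin, writes the result to actual.NN beside
# INPUT (overwriting any old copy), and compares it with EXPECTED.
# Returns 0 if the test passes and 1 if it fails.
run_test() {
    input=$1
    expected=$2
    actual="$(dirname "$input")/actual.${input##*.}"

    : > "$actual"                       # start from an empty actual.NN
    if ! command -v "$PROGRAM" > /dev/null 2>&1; then
        echo "FAIL: $input (cannot run $PROGRAM)"
        return 1
    fi
    "$PROGRAM" < "$input" > "$actual"   # append 2>&1 to capture stderr too

    if cmp -s "$expected" "$actual"; then
        echo "PASS: $input"
        return 0
    fi
    echo "FAIL: $input (differences, expected vs. actual):"
    diff "$expected" "$actual"
    return 1
}

# run_all DIR
# Runs run_test on every input.NN / output.NN pair in DIR, prints a
# summary line, and returns 0 only if every test passed.
run_all() {
    passed=0
    failed=0
    for in_file in "$1"/input.*; do
        [ -e "$in_file" ] || continue   # the glob matched nothing
        out_file="$1/output.${in_file##*.}"
        if run_test "$in_file" "$out_file"; then
            passed=$((passed + 1))
        else
            failed=$((failed + 1))
        fi
    done
    echo "$passed passed, $failed failed"
    [ "$failed" -eq 0 ]
}

# Two arguments: run one test. One argument: run every pair in that directory.
case $# in
    2) run_test "$1" "$2" ;;
    1) run_all "$1" ;;
esac
```

Saved under a name of your choosing, the script runs a single test when given an input file and an expected-output file, and runs the whole directory when given just the directory name. A Makefile rule for the tests directory could invoke it in the second form. Until the real program exists, every test reports a failure, which matches the note in step 5.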


What to Turn In

One person from each group should turn in a compressed hw4b.tar.gz file containing the contents of your project directory. Use E-submit, not email, to submit the file.
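
A minimal sketch of building the archive with tar is shown below. The function name make_tarball is made up for this example, and the sketch assumes your project directory is an ordinary directory that you can read.

```shell
#!/bin/sh
# make_tarball PROJECT_DIR [ARCHIVE]
# Packs PROJECT_DIR (the directory itself plus everything in it) into a
# gzip-compressed tar archive, hw4b.tar.gz by default.
make_tarball() {
    archive=${2:-hw4b.tar.gz}
    # -C switches to the parent directory first, so the archive holds
    # relative paths that start with the project directory's name.
    tar -czf "$archive" -C "$(dirname "$1")" "$(basename "$1")"
}
```

Running "tar -tzf hw4b.tar.gz" afterwards lists the archive's contents, which is a quick way to confirm that your tests directory made it in before you submit.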