CSE 331 Homework 3: Designing a Graph ADT

Contents:

Introduction
Problem 1: Written Exercises
Building a Graph
Test script file format
- Sample input and output
Hints
- Writing Specifications
- Designing Tests
What to Turn In
Q & A

Introduction

In hw3, you will design a graph ADT and write tests for it. (In hw4, you will implement the graph ADT.) You get to specify the ADT (choose its operations), and write tests to write to demonstrate that an implementation meets the specification.

Although this is an individual assignment, in the design stage you will learn more and produce a better design by bouncing your ideas off others and incorporating their own ideas. To encourage discussion, we are relaxing the standard collaboration policy somewhat for Problem 2 (the interface design stage). You must first attempt to come up with an interface design on your own, but then you are strongly encouraged to discuss your design with your classmates who have also attempted a design (as well as the course staff during office hours) as much as you like, without restrictions on what materials you can bring to or from meetings. On the other parts of this assignment, the standard collaboration policy still applies.

This assignment is the first of a multi-part project. Think about building a campus map to get an idea of one application for your graph ADT. By the end of the quarter, you will build a full application for getting directions to travel between two buildings on the UW campus.

Problem 1: Written Exercises

This part is designed to test your understanding of some of the ADT concepts from lecture. To get started, update your working copy of your repository using git pull to get the files for this assignment. Place your answers in the file hw3/answers.txt.

For each of the classes below, write the abstraction function and representation invariant, referring to the source code provided in the hw3/problem1 package of your project.
1. IntQueue1
2. IntQueue2
Below are several snapshots of an IntQueue2 object's internal state at different points in a program. Which snapshots are equivalent to each other at the abstract level? In other words, partition the snapshots into groups based on which are identical to each other from the client's perspective.
Below are signatures for various methods and constructors. For each, state and justify in 1-2 sentences (per problem) whether the method or constructor could possibly expose the representation, given the information available. Explain any assumptions you made.
1. public int solveEquations(int x, int y, int z)
2. public String[] decode(boolean slowly)
3. private Date myBirthday()
4. public String toString()
5. public Iterator<Integer> elements()
6. public Deck(List<Card> cards)

Building a Graph

Definitions and Terminology

In the rest of this assignment you will design a directed labeled multi-graph and write tests for it, but you will not implement it.

A graph is a collection of nodes (also called vertices) and edges. Each edge connects two nodes. In a directed graph, edges are one-way: an edge e = ⟨A,B⟩ indicates B that is directly reachable from A. To indicate that B is directly reachable from A and A from B, a directed graph would have edges ⟨A,B⟩ and ⟨B,A⟩.

The children of node B are the nodes to which there is an edge from B. In Fig. 1, the children of B are A and C. Similarly, the parents of B are the nodes from which there is an edge to B. In Fig. 1, B has only one parent, A.

A path is a sequence of edges ⟨node1,node2⟩, ⟨node2,node3⟩, ⟨node3,node4⟩, .... In other words, a path is an ordered list of edges, where an edge to some node is immediately followed by an edge from that node. In Fig. 1, one possible path is ⟨B,A⟩,⟨A,B⟩,⟨B,C⟩. This path represents traveling from B to A to B to C. A path may traverse a particular edge any number of times.

Figure 1: A simple directed graph with four nodes.

Figure 2: A directed multigraph.

Figure 2: A directed labeled multigraph.

In a multigraph, there can be any number of edges (zero, one, or more) between a pair of nodes. Fig. 2 shows a multigraph with 2 edges from A to C.

In a labeled graph (Fig. 3), every edge has a label containing information of some sort. Labels are not unique: multiple edges may have the same label. A multigraph can contain two "identical" edges with the same starting point, ending point, and label. For this project, you you may decide whether or not to allow identical edges in your hw4 Graph specification — that is, you decide whether to specify a multigraph or an ordinary graph. (Whatever you decide, make sure it is clearly documented so clients understand how they can use your ADT.)

If you want to learn more, read Wikipedia's definition of a graph. Graphs are also a major topic in CSE332. If you still have questions, ask the course staff.

Many interesting problems can be represented with graphs. For example:

A graph can represent airline flights between cities, where each city is a node and an edge ⟨A,B⟩ indicates that there is a flight from A to B. The edge label might represent the cost in money (airfare), time (length of flight), or distance.
To find walking routes across the UW campus, you can build a graph where nodes represent buildings and other locations and edges represent walking paths connecting two locations. The label/cost of an edge is the physical length of that path.
The Web can be modeled as a graph with node for every webpage and an edge ⟨A,B⟩ if page A links to page B. The label could indicate the anchor text for a link on page A, or the number of links from page A to page B.
Facebook is essentially a giant graph with nodes for users and edges between friends. (You can see a visualization of the Facebook graph.)

Problem 2: Write a Specification for Graph

To start, you will specify the API for a Java class or classes representing a directed labeled multigraph. The API, or public interface, is the collection of public classes and methods that clients of your graph ADT can use. We recommend that you work in iterative layers of detail. Start rough — preferably with pencil and paper — by identifying what operations the ADT needs to support, how it might be logically broken into classes and interfaces, and how these classes and interfaces rely on each other. Then, jot down a list of methods for each class that provide these operations. Think through some possible client applications, particularly the application in UW campus map, to get an idea for what operations are needed. Perhaps write some pseudocode (or even real code) for the application and make sure your graph interface meets its needs. (Note: don't worry about implementing the search algorithm. We prefer that you focus on the lower-level operations needed to build the graph and to be able to perform some search.)

Keep your initial design rough — don't write formal class and method specifications with all the proper clauses right away. Your design will likely go through multiple iterations, and it's easier to throw away parts if you have invested less effort in them.

Once you are satisfied with your high-level design, write a Javadoc specification for each class and method. Follow the format we have been using in CSE 331, remembering to use both standard Javadoc tags (param, returns, throws) and ones introduced for this course (requires, effects, modifies). A good approach is to create skeleton implementations of your classes containing only method “stubs”, then write your specifications in the right place in the source file. A stub is a not-yet-implemented method whose body simply throws an exception, as you saw in the Homework 2 starter code. Stub methods give you the flexibility to run client code and tests before all your methods are implemented correctly.

As you write the JavaDoc specifications and create the stub methods, take advantage of the JavaDoc tool to generate formatted JavaDoc web pages. Reviewing specifications in this format can provide useful feedback on the quality and completeness of the work you've done. There are detailed JavaDoc instructions in the handout on running Java programs.

You may assume nodes are uniquely identified by their data contents: that is, no two nodes store entirely equal data.

For this assignment, you may restrict your graph to store the data in nodes and edge labels as Strings. We strongly recommend you DO NOT use Java generics for this assignment. In a future assignment, you will use generics to make your ADT work with other data types — text, integers, doubles, etc.

Implementation details, such as fields, representation invariants, and abstraction functions, are not part of the public specification. Therefore, they should not be part of what you submit for this assignment.

Design problems such as this one are open-ended by nature: we are not looking for one “right” design. There are principles of good design you have learned in lecture, however. You will also find suggestions of what to consider and how to approach the problem in the hints section. Also, designs naturally evolve as requirements change, so try to make your code easy to expand without making it overly general right away.

In hw3/answers.txt, write a brief description (one to two sentences for each operation) of why you included the operations you did and why you feel they are a sufficient interface to a graph. If your design includes multiple classes or interfaces, explain why you included each one; if not, explain whether you considered additional classes and why you decided not to include them. So your total work in this problem is this brief write-up plus the Javadoc class and method specifications as they appear in the source code; you do not need to submit your Javadoc in a separate file.

Problem 3: Write Tests for Graph

Write a high-quality test suite for your Graph specifications. It's important to write your tests before your code.

Make sure you are familiar with the difference between implementation and specification tests, described in the testing handout. Specifically, specification tests must satisfy the staff-provided format and layout; in other words, they must be valid tests for any solution to this assignment. By contrast, implementation tests are unit tests for methods from the unique specification you designed in Problem 2.

Specification Tests: Because we didn't specify any of the class or method names you need to write for this assignment, specification tests cannot test your interface directly. Instead, you will construct specification test cases in the format specified in the Test Script File Format section. Each test case should consist of a “test” file containing a series of commands, and an “expected” file containing the output that the commands should produce. The file names should have the same base name but end with .test and .expected, respectively. For example, you may have a command file named testAdd.test with expected output in testAdd.expected. These files must be in the test directory alongside ScriptFileTests.java. They should have descriptive file names and comments that explain what case is being tested, and just like methods in unit tests, each test file should test only one distinct condition.
The staff supplied HW3TestDriver will read from these files using the format described and "run the tests". For now, you don't have to implement the driver.

When you run the provided JUnit Test ScriptFileTests (which is also run by the JUnit Test Suite SpecificationTests), it will find all of the test files in that directory and run them, saving their output as "testAdd.actual" (for example) in the build/ directory of your project. It then compares the actual file to the expected file and fails if they are different.
Implementation Tests: Write unit tests for the methods of your classes in one or more JUnit test classes, just like you saw in Homework 2. Create one test class per public ADT class, and be sure to write tests for every public method. Add your test classes to test/ImplementationTests.java, mimicking the structure of the provided SpecificationTests.java from Homework 1 and 2.
Documentation: In hw3/answers.txt, write a few paragraphs documenting and explaining your testing strategy. Mention how your specification tests differ from your implementation tests (if they do) and why.

Note that both your specification tests and your implementation tests will fail, because you have not yet implemented your graph. They should succeed after you implement your graph in HW4.

It is important to check every aspect of the output files to make sure that your tests are running correctly, including whitespace. We use an automated script to compare the expected and actual output of your files and report if they don't match.

What to turn in: you are required to commit any new JUnit test classes you add, your updated version of ImplementationTests.java listing these new classes, and the .test and.expected files you write. You should not modify SpecificationTests.java.

Test script file format

Because you and your classmates will have different specifications for the class(es) in this assignment, it is important that there is a standardized interface to use and test your code. To that end, we specify a text-based scripting format used to write instructions that will be executed by your graph.

The testing script is a simple text file with one command listed per line. Each line consists of words separated by white space. The first word on each line is a command name. The remaining words are arguments to that command. To simplify parsing the file, graph names and node and edge data may contain only alphanumeric ASCII characters (in other words, numbers and English letters (upper or lower case)).

There are example programs in the test/java/hw3 directory.

The following is a description of the valid commands. Each command has an associated output which should be printed to standard output (typically, the console) when the command is executed. Lines that have a hash (#) as their first character are considered comment lines and are only echoed to the output when running the test script. Lines that are blank should cause a blank line to be printed to the output. These commands were chosen for ease of testing and are not meant to suggest what methods you should include in your graph specifications or how you should implement them. For example, it is unlikely to make sense for your graph ADT to store a name for the graph.

CreateGraph graphName

Creates a new graph named graphName. The graph is initially empty (has no nodes and no edges). The command's output is:

created graph graphName

If the graph already exists, the output of this command is not defined.

Note that graph names are used purely in the test script; it is unlikely to make sense for your graph ADT to store a name.

AddNode graphName nodeData

Adds a node represented by the string nodeData to the graph named graphName. The command's output is:

added node nodeData to graphName

If a node with this data is already in the graph, the output of this command is not defined.

AddEdge graphName parentNode childNode edgeLabel

Creates an edge from parentNode to childNode with label edgeLabel in the graph named graphName. The command's output is:

added edge edgeLabel from parentNode to childNode in graphName

If either node does not exist in the graph, the output of this command is not defined. If an identical edge (same parent, child, and label) already exists, the output of this command is not defined either, as it is left to your discretion whether to allow identical edges in your implementation.

ListNodes graphName

This command has no effect on the graph. Its output starts with:

graphName contains:

and is followed, on the same line, by a space-separated list of the node data contained in each node of the graph. The nodes should appear in alphabetical order. There is a single space between the colon and the first node name, but no space if there are no nodes.

ListChildren graphName parentNode

This command has no effect on the graph. Its output starts with:

the children of parentNode in graphName are:

and is followed, on the same line, by a space-separated list of entries of the form node(edgeLabel), where node is a node in graphName to which there is an edge from parentNode and edgeLabel is the label on that edge. If there are multiple edges between two nodes, there should be a separate node(edgeLabel) entry for each edge. The nodes should appear in alphabetical order by node name and secondarily by edge label, e.g. firstNode(someEdge) secondNode(edgeA) secondNode(edgeB) secondEdge(edgeC) thirdNode(anotherEdge). There is a single space between the colon and the first node name, but no space if there are no children.

Sample input and output

We have provided example input and output files in your test/java/hw3 directory. You may assume the input files are well-formed.

Hints

Writing Specifications

To give you some sense of the kinds of issues you should be considering in your design, here are some questions worth considering. These don't in general have simple answers. You'll need to exercise judgment, and think carefully about how different decisions may interfere with each other.

Will the graph be mutable or immutable?
Will the graph be implemented as a single class, or will there be a Java interface for the Graph specification and a separate class for the implementation?
Will edges be objects in their own right? Will they be visible to a client of the abstract type?
Will nodes be objects in their own right? Will they be visible to a client of the abstract type?
When will the user specify the nodes and/or edges in the graph? (In the constructor? With an insertion method? Both? Can the user add multiple nodes and/or edges at once?)
What kind of iterators will the type provide?
Will the type provide any views, like the set view returned by the entrySet method of java.util.Map?
Will the type implement any standard Java collection interfaces?
Will the type use any standard Java collections in its implementation?

In choosing what operations/methods to include, strive to include enough that the ADT will be convenient and useful for a client, but avoid the temptation to write an “everything but the kitchen sink” API. Generally speaking, it is better to design a minimal than a maximal API. In the real world, you can always add methods later. However, you can never remove them from a published API, and such methods may over-constrain the implementation in the future.

Make good use of the course staff. If you have concrete questions, then take your specification to office hours for feedback on your design and style. Doing so could save you a lot of time.

Designing Tests

It can be difficult to come up with a good test suite. You would like to test a variety of “interesting” graphs, but what are interesting graphs? One possible approach is a “0, 1, 2” case analysis: test scripts with 0, 1, and 2 graphs are interesting; graphs with 0, 1, and 2 nodes and 0, 1, and 2 edges are interesting. For each method, 0, 1, and 2 parameters and 0, 1, and 2 results are interesting; for example: AddEdge on nodes that currently have 0, 1, and 2 children; ListChildren on nodes with 0, 1, and 2 children; etc. This approach, while certainly not required, can give a good way to structure your tests to cover many important cases without too much redundancy.

What to Turn In

You should add, commit, and push the following files to your CSE 331 GitLab repository:

main/hw3/answers.txt
main/hw3/*.java [Java classes for your graph design]
test/hw3/*.test
test/hw3/*.expected
test/hw3/*Test.java [Other JUnit test classes you create]

Additionally, be sure to commit and push any updates to hw3/test/ImplementationTests.

When you have committed and pushed all of your changes and are done with the assignment, you should create a git tag in your repository named hw3-final and push that tag to your repository. Once you have committed and pushed that tag, your assignment has been submitted and the staff will grade the files in your repository that are labeled with that tag.

Refer to assignment submission handout for complete turin instructions.

Q & A

Q: Does the graph ADT have to support an edge connecting a node to itself?

A: Yes. It's common for graphs to have these kind of nodes, so your ADT should allow them.

Q: Does the graph ADT have to support multiple edges between the same nodes with the same text?

A: No, you can choose how to handle this case.

Homework 3: Designing an ADT

Due: Thursday, October 18 at 11:59 PM