CSE logo University of Washington Computer Science & Engineering
 CSE 331 Homework 4 - Winter 2012
  CSE Home   About Us    Search    Contact Info 

   

Useful Links

Javadoc specification for the calculator package.

Java 6 API

Java coding standards

Remote access: accessing attu and your personal files from home.

Homework 4: ADTs and class design

Due: Thursday, February 9, 2012 at 11pm

In this assignment you will practice designing and implementing an ADT. For simplicity, the assignment is broken into several parts: writing the public specification, implementing it and testing it, and writing a client application. In reality, these parts will likely overlap as later parts suggest improvements to your earlier design. You will also learn to use version control software, an important tool for managing large projects.

Although this is an individual assignment, you will learn more and ultimately produce a better design by bouncing ideas off others, getting feedback and sharing ideas. You should begin by sketching out a design alone, but you are strongly encouraged to then discuss your design from Part 2 with your classmates as well as your TAs during office hours. The standard collaboration policy still applies to your implementation in Part 3, however: you should not show other students your code. Please include a list of everyone with whom you discussed your design in your answers.txt/answers.pdf file.

This project was adapted from an assignment in an MIT course taught by Profs. Daniel Jackson and Srinivas Devadas.


Errata

Feb. 5: AllTests.java was omitted from your repository. Do an update (Right-click project > Team > Update to HEAD) and it should show up. Note that if your graph package is empty, that package may no longer show up until you add some classes to it, but it's still there.

Feb. 5: I should add that if you had unit tests in graph.tests already, there's a possibility that my update could clobber your unit tests, especially if you hadn't committed your own tests yet. I tried to check for this carefully, but I can't guarantee that it won't happen. If Subversion gives you merge conflicts, you don't need to use my test suite - tell SVN to resolve the conflict using your copy and then write your own test suite from the template in HW2. If you do have to handle a merge conflict and you've written your own unit tests, you should make a backup of those files before resolving the conflict just to be safe.

Feb. 6: The released version of MarvelParser is missing an else block at line 63 that should read:

  else {
    bookChars.add(character);
  }


Setup

In this project you will learn to use version control software, in particular a flavor called Subversion (SVN). We will discuss what version control is, common terminology, and how to use SVN/Subclipse during section. You can also refer to this SVN cheat sheet.

  1. Install Subclipse, an Eclipse plugin that integrates SVN support into the IDE. Follow Step 1 under these instructions from spring quarter.
  2. Check out your repository ("repo"), where the central copy of your code is stored. Normally you need to create a repository first, but we've already created your repository and added some starter code for you.
    • In Eclipse, choose File >> New >> Project... >> SVN >> Checkout Projects from SVN. Hit Next.
    • Select "Create a new repository location" and enter svn+ssh://YOUR_CSE_NETID@attu.cs.washington.edu/projects/instr/12wi/cse331/YOUR_DIRECTORY, replacing YOUR_CSE_NETID as appropriate and YOUR_DIRECTORY with your unique one- or two-letter directory name, which we have published in Gradebook under "Project directory." Hit Next.
    • In the Enter SSH Credentials panel, enter your Linux/attu password. Hit OK.
    • In the Select Folder panel, select the GraphApplications project. Select Finish.

The GraphApplications project should now appear in Package Explorer with the rest of your projects.


Part 1: Written Exercises

This part is designed to test your understanding of some of the ADT concepts from lecture. Place your answers in a file named answers.txt or answers.pdf at the top level of your GraphApplications project, i.e. at the same level as the src/ directory. (As usual, standard file formats besides txt and pdf are accepted but strongly discouraged.)

  1. For each of the classes below, write the abstraction function and representation invariant, referring to the source code provided in the hw4exercises package of your project.
    1. Queue1
    2. Queue2
  2. Below are several snapshots of Queue2 object's internal state at different points in a program. Which are equivalent at the abstract level? (In other words, which ones look identical to the client?)
  3. Below are signatures for various methods and constructors. From the information available, which of these methods could possibly expose the representation? For each example, justify your answer and explain any assumptions you made in 1-2 sentences.
    1. public int solveEquations(int x, int y, int z)
    2. public String[] decode(boolean slowly)
    3. private Date myBirthday()
    4. public String toString()
    5. public Iterator elements()
    6. public Deck(List cards)

Parts 2-4: Writing a graph ADT

In the rest of this assignment, you will design, implement, and use a graph ADT - specifically, a directed labeled multi-graph. Given an abstract, high-level description of the ADT, you will create and test a working Java implementation. When a client application has a need for this ADT, it can construct an instance of your representation. You will decide what classes and public methods/constructors to expose to the client (the public interface) and what instance fields, etc. your classes will need to accomplish what it promises (the internal representation).

Here is more information on a directed labeled multi-graph:

  • A graph is a collection of nodes (also called vertices) and edges. Each edge connects two nodes.
  • In a directed graph, or digraph, edges are one-way: an edge e = <A,B> means there is a path from node A to node B but doesn't say whether there is a path from B to A. To indicate that there is a path from A to B and from B to A, a digraph would have edges <A,B> and <B,A>.
  • Any two nodes A and B are neighbors only if there is an edge e = <A,B>.
  • A path is a sequence of edges <A,B>, <B,C>, <C,D>, .... In other words, a path is an ordered list of edges, where an edge entering into a node is immediately followed by an edge exiting that node.
  • In a multigraph, there can be any number of edges (zero, one, or more) between a pair of nodes.
  • In a labeled graph, every edge has a label. Distinct edges may have the same label.
Some simple graphs

Many diverse and interesting problems can be represented with graphs. Often the labels on edges represent the "cost" of traversing a given edge. For example:

  • A graph can represent airline flights between cities, where each city is a node and an edge <A,B> indicates that there is a flight from A to B. The edge label might represent the cost in money (airfare), time (length of flight), or distance.
  • Facebook is essentially a giant graph with nodes for users and edges between friends. (You can see a visualization of the graph here.)
  • The Web can be modeled as a graph with node for every webpage and an edge <A,B> if page A links to page B.
  • To find walking routes across the UW campus, you can build a graph where nodes represent buildings and other locations and edges represent walking paths connecting two locations. The cost of an edge is the physical length of that path. To find the optimal walking route between two buildings, you can use an algorithm known as Dijkstra's algorithm to find the path of lowest cost between two nodes.

The two most common ways to represent a graph are with adjacency lists or an adjacency matrix. For this assignment, we recommend adjacency lists, in which each node is associated with a list containing all its outgoing edges.


Part 2: Design

Approach

You will start by designing the public interface for a Java class or classes that represent a directed labeled multigraph as described above. Your design should be general enough to serve as the main data structure for a variety of applications requiring graphs, including the one you will implement in Part 4 and the examples given earlier. In addition to edge labels, each of your nodes will need to store some sort of data/value. For example, in the airline flight graph described earlier, a node might store the name of a city or airport.

Design problems such as this one are open-ended by nature. A good design provides the functionality the client needs in an interface that is easy to use and supports clean, readable, and maintainable code, both in the ADT itself and in the client code. There is no one "right" design, but there are general principles of good design to which you will be introduced in lecture.

We recommend that you work in iterative layers of detail. Start rough - preferably with pencil and paper. Brainstorm what operations the ADT needs to support, how it might be logically broken into modules (classes/interfaces), and how these modules connect to each other. Then jot down a list of methods for each class that provide these operations. "Test" your design by thinking through some possible client applications, particularly the application you will implement in Part 4. Write some pseudocode (or even real code) for the application and make sure your graph interface meets its needs. Keep your design rough at first: avoid writing extensive descriptions of your methods or formal Java method signatures, let alone compilable Java code. Your design will likely go through multiple iterations, and it's easier to throw away parts before you've invested too much effort in them.

Deliverables

Once you are satisfied with your high-level design, add detail and structure by writing Javadoc pages. Return to the computer and write a skeleton for each class or interface, placing them in the graphs package. Write a brief description above each class in a Javadoc comment and add stubs for all constructors and methods, leaving the bodies empty (or with just a dummy return statement) but writing complete signatures and public comments.

Your generated Javadoc pages should go in the doc/graph directory in the top level of your project (i.e. doc/ should be at the same level as src/). In the Generate Javadoc window, verify that the path under "Use standard doclet" ends with GraphApplications/doc (or GraphApplications\doc), correcting it if not.

We recommend you generate the Javadoc pages before going on to Part 3. Seeing the "client contract" alone without the source code may help you mentally separate the external specification and internal implementation. You are welcome to make adjustments to your ADT in Parts 3 and 4. We recommend regenerating and reviewing the Javadoc pages anytime you make significant changes and generating a final copy just before turnin.

In addition to Javadoc, include a short writeup describing your final design and design process. This should include:

  • An overview of your design
  • A description of alternative designs you considered and why you rejected them
  • Some of the questions and tradeoffs you considered, and why you made the decisions you did
  • Any assumptions you made when designing your ADT

Because your design is likely to change in Parts 3 and 4, we recommend that you be sure to check this summary and revise it just before turnin. Keep notes as you go to remind yourself of the process. Your writeup should be brief, just one or a few short paragraphs. We are grading for content, not length, and one concise paragraph will be favored over several paragraphs that say the same thing in twice as many words. Place this paragraph in your answers file from Part 1. You should not write about implementation details at this point.

Hints & Guidelines

As mentioned earlier, you should think through some client applications to determine what functionality your class needs. Operations a client will surely want include the ability to:

  • Add nodes to a graph
  • Get a list of all the nodes in a graph or iterate over the nodes
  • Get a node's neighbors and the labels on edges from that node to its neighbors
  • Find/follow path through the graph

Should the client be able to remove nodes from a graph? get a list of all edges? Should path-finding operations be included as methods of the graph, or should they be implemented in client code on top of the graph? These, along with the questions below, are examples of what you should be considering as you think through your design.

Generally the questions raised here don't have clear, simple answers. In making a decision, you'll need to weigh tradeoffs, consider the unique requirements of your design, and think carefully about how decisions will interfere with each other.

  • Should the graph be implemented as a single class, or will there be a separate Java interface for the Graph specification and a class for the implementation?
  • How are nodes stored in the graph? Will there be a Node class? a Node interface that clients implement? or can any object be a node? (e.g. if you wanted a City object associated with each node, the Node class could store a reference to the data object, City could implement Node, or the graph could store City objects directly.)
  • Should there be an Edge class? Will it be visible to the client or should it be a private inner class? (Read about inner classes here.)
  • If there are Node and Edge classes, do the clients construct these objects themselves or just pass the data to be stored in the node/edge into a method in the graph?
  • Will edge labels be strings or generic objects? If objects, will it be OK to use a node or an edge as a label? or even a graph?
  • Where the user specify the nodes and/or edges in the graph? when invoking the constructor? by calling an insertion method later? both? Can the user add multiple nodes or edges in a single call to the insertion method?
  • Does the user add a node's outgoing edges when adding the node to the graph? or does the user add the node first and edges later?
  • If there are separate Node objects, will it be possible to find the successor of a node from the node alone, or will the graph be needed too? Can a node belong to multiple graphs?
  • Will the type implement any standard Java collection interfaces?
  • Will the type use any standard Java collections in its implementation?

Part 3: Implementation & testing

Once you are satisfied with your design, implement it! Include an abstraction function, representation invariant, and checkRep() function in each class, calling checkRep() where you deem appropriate (particularly in mutator methods - generally, anywhere the RI might have been broken).

Your graph implementation should have reasonable performance for graphs with thousands (but not millions) of nodes and edges, such as the application in Part 4. You should not assume that the graph will be sparse (that is, containing very few edges compared to nodes) or dense (that is, with most node pairs connected). For instance fields, try to choose data structures best suited for the given purpose.

Also write JUnit tests for your implementation and put these in the graph.tests package. (Eclipse may show this as a tests subdirectory within graph.) Modify the provided AllTests suite so it runs all your tests, which should be organized into separate test classes for each public class. Your tests should be comprehensive enough that you and your graders are reasonably convinced of your implementation's correctness. This means you should test all public methods under a range of inputs. Follow the guidelines from lecture and section slides for writing good tests, including the IntArrayTest example.

We recommended that you try a test-driven development approach, writing the tests before the implementation. You might write the entire test suite first or you might develop incrementally, writing the tests for a method just before implementing that method.

Lastly, submit a brief paragraph describing your implementation experience. What worked as you expected? What changes did you make along the way? How did your implementation experience affect the ultimate design, if it did? What were the biggest surprises? What alternative implementations did you consider, and why did you choose the one you did?

If you tried a test-driven development approach, feel free to mention your impressions of this approach.


Part 4: Using your ADT

In this part, you will use your graph ADT to model the Marvel Comics universe! In this application, nodes represent characters, and edges represent appearances of two characters in the same comic book. For example, if Zeus and Hercules both appear in book "T 221," then there will be an edge from Zeus to Hercules with label T 221, and a second edge from Hercules to Zeus also with label T 221.

You will write a class MarvelPaths for finding paths between characters. Your class should have a main method that accepts the names of two characters as command-line arguments. There should be no other public methods, but private methods are strongly encouraged to keep main() smaller. It searches for a path through the graph connecting the two characters and prints the path in the form

    Char1	Book1	Char2
    Char2	Book2	Char3
    ...
    CharN-1	BookN-1	CharN
where Char1 is the first command-line argument, CharN is the second argument, Char_k is the name of some character, and Book_k is the title of some book for all k. Each character and book is separated by one tab. This path indicates that Char1 appeared in Book1 with Char2, Char2 appeared in Book2 with Char3, and so on, ending with CharN-1 appearing in BookN-1 with CharN.

Not all characters have a path between them. If the user gives two valid character names with no path, print No path found. If either of the character names are unknown, i.e. they were not in the original dataset, print Unknown character(s). If the user gives fewer than two arguments, print Incorrect number of arguments.

Command-line arguments are simply parameters specified when the program is launched, and they are accessible via the args[] parameter in main(). If you are running your program through Eclipse, you can specify these arguments by going to Run >> Run Configurations. In the window that appears, check that your MarvelPaths class is listed under "Main class," select the "Arguments" tab, enter your arguments separated by a space under "Program arguments," and hit "Run." Because character names can contain spaces, you should surround each argument by quotes; otherwise, the parts of the name separated by whitespace will be treated as separate arguments. Also be aware that some names have an extra space at the end.

For example, to find a path from FLYING FOX to FLYING TIGER, you would enter as the program arguments:

    "FLYING FOX" "FLYING TIGER"
and your program should print a list such as:
    FLYING FOX	SSU 1	BOVA
    BOVA	M/TIO 74	HUMAN TORCH/JOHNNY S
    HUMAN TORCH/JOHNNY S	ASPOT 29/2	FLYING TIGER

Your program should print the shortest path between two characters, but if there are several shortest paths it can print any one. We recommend that you use a breadth-first search (BFS) for pathfinding. A BFS from node u to node v, visits all of n's neighbors first, then all of n's neighbors' neighbors', then all of n's neighbors' neighbors', and so on until v is found or all nodes with a path from u have been visited. Below is a pseudocode algorithm to find the shortest path from u to v in a graph G:

    u = starting node
    v = destination node
    M = map from visited nodes to paths: initially empty.
        paths should be stored as lists of nodes or edges 
        (or perhaps just the node data or edge labels).
    Q = queue of nodes to visit: initially empty
    
    Add u to Q
    Add u->[] to M (u mapped to an empty list)
    while Q is not empty:
        dequeue next node n
        if n == v, lookup n in M and return the associated path
        for each edge e=<n,m>:
            if m is not in M, i.e. m has not been visited:
                add m to Q
                let p be the path n maps to in M
                let p' be the path formed by appending e to p
                add m->p' into M
            
    if loop terminated, no path exists from u to v 
    (return some value that indicates this)

In the top level of your project is a file marvel.tsv containing all the data you need. Each line is of the form

    [character] [book]
where [character] is the name of a character, [book] is the title of a comic book he/she/it appeared in, and the two fields are separated by a tab. The file is sorted by character name, so all entries for a given character are consecutive.

We provide you with a class MarvelParser to help load the data from marvel.tsv. This class has one static method, parseData(), which reads marvel.tsv to construct a list of all characters and a map from each book to a list of all characters in that book. Feel free to use and modify this class however you wish. You may wish to update parseData()'s method signature and/or body, especially at the "YOUR CODE GOES HERE" comment. You are also welcome to rewrite the provided code or even write your own parser class.


Turnin

Rather than submitting files to the dropbox, you will commit them to your repository, which we will check out to grade. Be aware that new files are not added to the repository until you tell Subversion to do so. If a file isn't added, it will appear with a question mark icon in Package Explorer, and we won't see it when we check out your repo. To add the file, right-click on the file or directory and choose Team >> Add to Version Control. The added files will now appear with a plus sign icon until your next commit.

If you are unsure whether you have added and committed everything, you can to check out a second copy of your repo in a different workspace or a project with a different name and verify that it contains everything you expect.

You should add the following files to your repo:

  • Your Javadoc HTML pages. For simplicity, add the entire doc/ directory to version control. Remember to regenerate the Javadoc pages if you changed your classes' public specification in parts 3 or 4.
  • answers.txt or answers.pdf, containing your answers to Part 1, brief paragraphs from Parts 2 and 3, and list of students with whom you discussed your design. This should go in top level of project, at the same level as doc/ and src/.
  • All source code from Parts 3 and 4, including your JUnit tests from Part 3. Make sure you have written an abstraction function, representation invariant, and checkRep() for each class that is part of your ADT. (Neither JUnit tests nor the AF/RI/checkRep are necessary in Part 4.) Also make sure you have updated the test suite to run all your tests.

LATE DAYS: As usual, you may use up to two late days on this assignment. We will check out your repository as of the deadline for grading, which means we will assume you have have NOT used late days unless we hear otherwise. Please email the coures staff if you are using a late day, and we will check out your repository 24 or 48 hours after the deadline..


CSE logo Computer Science & Engineering
Box 352350, University of Washington
Seattle, WA  98195-2350
(206) 543-1695
[comments to Hal Perkins]
Privacy policy and terms of use