Handout P4

Quick links:

Contents:

We recommend that you read the entire problem set before you begin work.

Purpose

This assignment gives you additional practice in selecting designs, writing specifications, and testing and implementing those specifications. You will also gain an increased appreciation for the utility of well-chosen interfaces, when you see that your graph and path-finding code can be reused with nodes and paths of a different variety. You will be able to judge the extensibility of the test file driver of Problem Set 3 by adapting it for a related purpose. You will practice writing abstraction functions and representation invariants, and checking the invariants via the checkRep() method.

Getting Started

Update the cse331 directory in your SVN repository. This will create a cse331/src/ps4 directory, and will also update various other files that you need. Then, work through the problems below. Use the provided specifications to help you fill in the missing methods.

You are free to work from home if that makes your life easier, but keep in mind that your final code must run correctly on attu. Once everything is in order, commit your changes. You should also run ant validate (on attu) as a double-check after committing.

These skeleton classes are provided to you to help get you started. Take a look at the Javadoc specifications for more information.

By the end of the problem set, you should have the following files committed to SVN in your ps4 directory:

Introduction

In this problem set, you will design and implement abstract data types for representing streets (StreetSegment and StreetNumberSet), you will read commands from a file to join these street segments into a graph, and you will create a new variety of Path that enables your previous PathFinder implementation to be used, without change, to find paths along streets represented as a sub-class of GeoSegment. When you finish this problem set, you will have a complete, albeit rudimentary, version of Husky Maps that can print directions between specified street segments. In Problem Set 6, you will construct the street graph from the US Census Bureau database, look up which street segment contains a given address, refine the directions-finding algorithm, and provide a user interface.

The StreetSegment class, which you will define in this problem set, builds on GeoSegment by adding some new functionality. Among other things, each StreetSegment has associated with it certain StreetNumberSet objects that indicate which addresses (building numbers) appear on the Segment. StreetNumberSet can be queried to see where a particular building number appears (for instance, how far from one end of the street that building stands).

This problem set requires you to do a fair amount of design. Writing code to a careful specification is fairly easy; writing code to an incomplete specification is harder; and writing a specification is hardest of all. However, it can also be much more interesting. When you select a design, you should first sketch out the design space. Brainstorm all the different ways you might imagine satisfying the specification. (Some of these might seem silly, but don't reject them out of hand.) Almost every set of requirements has multiple designs with different properties, and the ones we give you are no exception. When you write up your design, talk about different designs, tradeoffs they make, in what circumstances each is good, and why you chose the one you did. There is no one correct answer, no one "perfect" design that the CSE 331 staff is waiting for you to discover. Rather, we want you to think about the problem, consider the alternatives, and make a cogent case for the one that you choose.

A substantial part of your grade will be based on your writeup. Your code makes up a minority of the points for this submission. Documentation, analysis, testing, and so forth are also important; in fact, in the real world, code alone without those other deliverables is generally useless, as it can't be reused, safely modified, or even understood.

GeoSegment Meets the Real World

In Problem Set 2, you implemented GeoSegment, a class that represents a generic connection between two points on the surface of the earth. This problem set will ask you to extend GeoSegment to design and implement a StreetSegment representing a street in the real world.

Consider the following segment of a fictitious street:

Broadway Map

Note that one side of the street is in Somerville, and the other is in Medford. The two sides of the street have disjoint numbers, as well as different ZIP codes. The street also has a classification: it is a local road. For street segments, we will only have four possible classifications:

Street Segments

The StreetSegment class should be able to accommodate street segments of this form. Like a GeoSegment, the segment has a start point, an end point, and a name. The class should also record the ZIP code on each side of the street, the set of street numbers present on each side of the street, and the street classification. You will use the staff-provided class StreetClassification to keep track of the classification of a StreetSegment. Read the Javadoc specifications to understand how to use it.

Street Numbers

In order to determine which segment of a street contains a particular street number, Husky Maps must maintain the collection of street numbers that appear on each side of each segment. You will implement a data structure called StreetNumberSet that will store the street numbers for one side of a street segment.

A typical street has even street numbers on one side and odd street numbers on the other side, and each side contains all of the (even or odd) street numbers in a range. However, this is not the case for all streets: one side of a street may contain both odd and even numbers and/or be missing some numbers in a range. For example, one side of a street might contain 2, 4, 6, 8, 10, 12, 14, 17, 18, 20, and 22. Your design and implementation will need to be able to accommodate this most general case.

Problems

Problem 1: StreetNumberSet (25 points)

StreetNumberSet represents an arbitrary set of integers. We do not provide you with a complete specification for StreetNumberSet. However, we require you to provide at least one specific public constructor and two methods, contains and orderStatistic, along with any others of your devising. You also need to override equals (and hashCode, of course). The contains and orderStatistic methods will be needed in Problem Set 6. Although they are not used in this problem set, you should still fully specify, document, implement, and test them (not only because we will be grading them, but also to save yourself debugging time on Problem Set 6).

There are many good and correct implementations of StreetNumberSet. Program correctness is the most important goal, but we suggest that you choose a design that uses less memory relative to other options. We do not expect you to make heroic optimizations or use complex compression algorithms; a compact underlying data structure will achieve this goal. (By contrast, representing StreetNumberSet in terms of a Set, List, array or other enumeration of Integers will not, even though the implementation would be very simple; a design using a set (or list, array, etc.) of integers is not a permissible answer to this problem.) The database you will use in Problem Set 6 will involve hundreds of thousands of street segments — and a million or more unique street numbers — so an implementation that is memory-intensive will degrade the performance of your program. Correctness is more important than memory usage; however, both are required to receive a significant amount of credit.

One useful observation about street number sets is that they often contain long sequences of same-parity numbers. For example, the sequence {2, 4, 6, 8, 10, 12, 14, 17, 18, 20, and 22} contains even numbers from 2-14 and 18-22, and the odd number 17. More common cases will be a single range, such as even numbers from 22-140, or the empty set. For the purpose of storage space optimizations, you may assume that an input string containing multiple contiguous ranges is uncommon (e.g. "1-3,5-7,9-11"), and would instead be given as a single range "1-11". The StreetNumberSet constructor requires that when multiple ranges are provided (separated by commas), those ranges must be disjoint; we will obey that pre-condition.

Please put any written answers to this problem in a file named problem1.txt.

  1. Write a complete specification for StreetNumberSet. Remember to include both method specifications and specification fields (specfields). Your specification should include the public constructor and the contains and orderStatistic methods exactly as given in the partial specification. Write your complete specification both as Javadoc comments in your code and in the problem1.txt file.
  2. Write test cases for your specification. You should use JUnit as your test framework. Be sure to justify, in a brief written explanation, your testing strategy and why you believe that your test cases are adequate but not excessive. (As usual, you should use test/SpecificationTests.java as a wrapper for the test suites of the partial specifications that we provide for you, because those can be fairly run against other students' implementations. You should use test/ImplementationTests.java as a wrapper for the test suites of all other methods beyond the required ones, because those will differ from student to student and thus cannot be run against others' implementations.)

    It is very important for the purposes of this problem set - and it is often good programming practice in many real-world projects - that you write your (black-box) test suite BEFORE your implementation. This will help you to better understand and refine your specification before you spend lots of time coding.

  3. Decide upon a representation for your implementation. Write down the representation invariant and corresponding abstraction function as comments in your source code. (See the Writing Abstraction Functions & Rep Invariants handout for information about how to write good AFs and RIs.)

    In problem1.txt, discuss your particular choice of representation and compare it against at least one other representation that you considered. Write briefly about the relative merits of each choice and why you chose the one that you did.

  4. Implement your specification and test your implementation against your specification test suite. Include a checkRep() method that confirms that the representation invariant is satisfied, and call this method at appropriate points in your code.

    We highly encourage you to write tests for your particular implementation and place them in test/ImplementationTests.java. Make sure you understand how these internal tests are different than your specification tests.

    After testing, you may disable any checkRep() calls if they create a severe and measurable performance slowdown. (You can do so by disabling asserts if you use assert checkRep() or by commenting out the checkRep() calls or by surrounding them with if statements guarded by a final boolean variable. If that variable is set to false, the compiler will optimize away all of the if statements it guards, thus incurring no runtime overhead, as long as it is final.) Do not simply delete your checkRep() calls from your source code altogether; for grading purposes, we still need to be able to see where you would have placed those calls. (If your ADT does not call checkRep() for certain methods, then the test driver should call checkRep() while testing those methods, because during testing, correctness is generally more important than speed.)

Problem 2: StreetSegment (20 points)

We do not provide you with a complete specification for StreetSegment. However, we require you to provide a specific public constructor, and a fractionDist method. We also recommend that you implement a reverse() method that overrides GeoSegment.reverse(), as you will need that method in Problem Set 6. (Since this is only a recommendation, any tests for it belong in ImplementationTests, not in SpecificationTests. It is interesting to note that different choices of representation make the implementation of reverse() either trivial or complicated. What are the other tradeoffs of the designs?

As you did for StreetNumberSet in Problem 1, complete the specification, testing, documentation, and implementation of StreetSegment, following the exact same steps as described by Problem 1. Please put any written answers to this problem in a file named problem2.txt (which should be analogous to problem1.txt, except that the answers should be for StreetSegment).

Problem 3: RoutePath, A*, and Finding Directions (28 points)

In this problem, you will design and implement the remaining parts of your system that are necessary for finding directions between geographic locations. In order to do so, you will adapt the path-finding algorithm you wrote for Problem Set 3 in two ways: The first and most important extension is that its nodes will be GeoSegments rather than WeightedNodes. This should require no changes to your graph implementation if you made it sufficiently generic. The second extension is that your path-finding algorithm will implement the A* heuristic. Because our test cases will work with a large number of nodes, you may want to revise your Graph implementation from Problem Set 3 if it cannot lookup the children of a node, add a node, or add an edge in expected constant time. The HashMap class in the JDK is very useful for doing this.

RoutePath: A Path of GeoSegments directed towards goals

In order to make your path-finding implementation work over GeoSegments, you will need to create a new class RoutePath that implements the Path interface and makes use of Route in an appropriate manner to be able to find the shortest-length routes between destinations. We have purposely left this class under-specified in order to give you practice in designing an ADT.

As you did for StreetNumberSet in Problem 1, create the specification, documentation, and implementation of RoutePath, following the exact same steps as described by Problem 1 (except for the testing step, which you can ignore because you will be performing integration testing instead of unit testing for this problem). You do not need to write JUnit specification tests for RoutePath; you should be able to test it using the test file format in the integration tests. (However, you are welcome to write your own private tests for RoutePath in ImplementationTests to assist you in debugging.) Please put any written answers to this problem in a file named problem3.txt (which should be analogous to problem1.txt, except that the answers should be for RoutePath).

Hint: In the full Husky Maps system of Problem Set 6, you will be using your PathFinder (from Problem Set 3) to find and return RoutePaths. You will need a way to take a RoutePath and produce directions from it. Be sure to consider this usage while designing your RoutePath.

A* heuristic

The A* heuristic reduces the search space of a path-finding algorithm; it uses lower-bound estimates to avoid consideration of unpromising paths. Here is a brief explanation:

The search algorithm of Problem Set 3 works not only if paths are sorted (in the priority queue) by their lengths so far, but also if paths are sorted by any value that is at least as large as the length of the path so far and no greater than the true total length of the path once it is extended to the goal. In other words, it works to use not the cost of the path so far, but an underestimate of the total cost of any goal-reaching path with the given path as a prefix. One simple way to do this is to add, to the length so far, the straight-line distance to the goal. (When there are multiple goals, choose the minimum straight-line distance to any of the goals.) Since any real path to the goal is at least as long as the straight-line distance, this value falls between the distance so far and the actual distance.

For more details about A* search, see Wikipedia or an AI textbook such as Artificial Intelligence, by Patrick Winston.

You can implement this heuristic very simply: instead of using the route length as the cost of the path, your RoutePath defines its cost to be the sum of the route length so far plus the distance remaining to the nearest goal. Note that this heuristic requires no changes to PathFinder; it is just an efficient definition of the cost function.

Integration Testing

You will perform integration testing on your implementation using files that have a format similar in spirit to those of Problem Set 3. The format in defined in the test file format section below.

You should create a class named PS4TestDriver that runs tests in the same way that PS3TestDriver did in Problem Set 3. A skeleton class already exists in test/, but you can replace it. You have two choices as to your approach:

  1. As one option, you may copy your ps3/PS3TestDriver.java source file to ps4/PS4TestDriver.java, change it to package ps4, and revise it to match this problem set's file format specification. This will cause duplicated code between the two problem sets, but avoids some of the tricky issues with subclassing. This is probably the easier option.
  2. Alternatively, you may choose to write a new PS4TestDriver class that extends the PS3TestDriver and reuses some of its code. Depending on the extensibility of your PS3TestDriver implementation, this approach may or may not be easier. (You may go back and edit your PS3TestDriver to make it more extensible, if you wish, as long as the PS3TestDriver still meets the requirements for ps3.)

    In order to make PS3TestDriver more extensible, you will need to make some of its methods protected instead of private. Also, static methods may not have the inheritance properties you want. You may find it easiest to just copy the main method from PS3TestDriver, and change the references from PS3 to PS4.

Once your test driver is ready, you should create one or more sets of test cases, in files named ps4*.test, and provide their expected output in ps4*.expected. (Your test output must pass JUnit tests with no differences to the .expected files. If you decide to use the UNIX diff tool for debugging, consider using the -b option to ignore whitespace differences, and don't worry about reports of strange "Revision" differences.) Recall that ScriptFileTests is called from SpecificationTests.

Again, we plan to run each student's SpecificationTests against each other student's assignment; how you fare will determine a small part of your grade. (It would not make sense for us to do this cross-checking with ImplementationTests. If you don't know why, please review the difference between the two varieties of tests until you understand this point.)

In problem3.txt, include an explanation of your integration testing strategy and an argument for why you feel that your tests are sufficient, but not excessive (notice that this takes the place of the unit testing step on RoutePath described in Problem 1).

See the sample testing file section for an example of a test file.

Collaboration (1 point)

Please answer the following questions in a file named collaboration.txt in your answers/ directory.

The standard collaboration policy applies to this problem set.

State whether or not you collaborated with other students. If you did collaborate with other students, put their names and a brief description of how you collaborated.

Reflection (1 point)

Please answer the following questions in a file named reflection.txt in your answers/ directory.

  1. How many hours did you spend on each problem of this problem set? (Only include time when you are completely focused on CSE 331. For example, if you worked an hour in your noisy dorm lounge, or you were periodically reading email or IMs, don't count that hour.)
  2. In retrospect, what could you have done better to reduce the time you spent solving this problem set?
  3. What could the CSE 331 staff have done better to improve your learning experience in this problem set?
  4. What do you know now that you wish you had known before beginning the problem set?

Returnin (25 points)

After you receive your corrected problem set back from your TA, you will have until February 16, 2011 before you have to resubmit it electronically. The procedure for this is detailed in the General Information document.

You will only receive credit if you pass all the tests, and you have a fixed number of trials without penalty. See the returnin331 documentation for details.

Test file format

The format of the test file is similar to the format described in Problem Set 3, though not identical.

Each input file has one command per line, and each command consists of whitespace-separated words, the first of which is the command name and the remainder of which are arguments. Lines starting with # are considered comment lines and should be echoed to the output when running the test script. Lines that are blank should cause a blank line to be printed to the output.

The test driver manages a collection of named Graphs; graph nodes are GeoSegments. Each GeoSegment is given a unique name to permit them to be connected to one another. Names of nodes and graphs contain only alphanumeric characters.

The following is a list of the valid commands, and the meaning of their arguments. The commands are similar to that used in Problem Set 3, with the following exceptions: CreateNode is removed and replaced with CreateGeoNode; FindPath is removed and replaced with FindGeoPath; ListNodes and ListChildren are removed. (Therefore, CreateGraph, AddNode, and AddEdge are the same.) Each command has an associated output that should be printed to standard out (the console) when it is executed.

CreateGraph graphName
Creates a new graph that shall be named graphName. The graph is initially empty (has no nodes and no edges). The command's output is
created graph graphName
CreateGeoNode nodeName lat1 long1 lat2 long2 segName
Creates a new GeoSegment with the specified arguments (the latitudes and longitudes are in millionths of a degree) and assigns it the name nodeName. The node can be referred to by nodeName for the rest of the file. The command's output is
created node nodeName: lat1 long1 lat2 long2 segName
AddNode graphName nodeName
Adds a node identified by the string nodeName to the graph named graphName. The command's output is
added node nodeName to graphName
AddEdge graphName parentNode childNode
Creates an edge in graph graphName from parentNode to childNode. The command's output is
added edge from parentNode to childNode in graphName
FindGeoPath graphName from1 [from2 [from3 ... ] ] -> to1 [ to2 [ to3 ... ] ]
Finds the shortest path amongst all of the possible paths from any of the from nodes to any of the to nodes. The command's output starts with:
shortest path in graphName:

on a line by itself. (Don't forget the colon.) Then the path should be printed according to the specification of DrivingRouteFormatter, using an initial heading of zero (0.0). The path should be printed starting on a new line. If no path exists, the standard prefix should be followed by no path found; for example:

shortest path in sampleGraph: no path found

The behavior of the testing driver on malformed input files is not defined; you may assume the input files are well-formed.

Sample testing file

A sample testing file is provided in test/campus.test. It refers to the map below that roughly describes some streets around the UW campus.

UW campus map

The expected output is provided in test/campus.expected.

Hints

See the problem set guidelines and advice.

We recommend that you design all the classes first, rather than going class-by-class doing design, specification, testing, documentation, and implementation; similarly, you may find it helpful to write all the specifications before doing any testing or implementation, because you may find while writing the specifications that you need to make changes to your design.

You will need to implement the equals(Object) method for a number of classes. When you override Object.equals(Object), you usually need to override Object.hashCode() as well (See Item 9 on page 45 of Effective Java).

Because StreetNumberSet models a set, it should not contain duplicate street numbers. The StreetNumberSet constructor contains a @requires clause to that effect.

You should not need to modify your PathFinder or Graph implementations. In particular, they should not mention WeightedNode, StreetSegment, or the like. If you made an implementation error by including such references, remove them, but ensure that your PathFinder and Graph classes continue to work for Problem Set 3 as well as for Problem Set 4; both problem sets should use the same version of those classes.

Tracking down bugs in the formatting for the script-test output can be tricky, as the results need to match character for character. Eclipse has a GUI that can help you find these subtle differences in your .actual and .expected files. To use it, select the two files you want to compare in the package explorer, right click, and choose "Compare With ... Each Other" from the context menu.