Project 2 - The search for a-MAZE-ing donuts!

Phase B

Important Deadlines

Phase B Due

Wed, Nov 7

Electronic turnin (code + README.txt) by 11:59 pm

 

Thurs, Nov 8

Paper turnin due by beginning of Quiz Section

 

Outline


First, right after you read the rest of the assignment, but not later, Open your README file and answer question #3 below.


OK, now you're ready to get to some coding....

VII. More Provided Code

For the second phase of this project, we are providing code to read the input maze given as a text file, parse it, and build a Maze class object from it, which your algorithm will operate on. We are also providing the code for a simple maze runner, RandomMazeRunner, that picks the direction to explore in a random fashion. Your own MazeRunner classes should inherit from the abstract MazeRunner class. You may modify the provided code if you wish, but very little (if any) should be necessary, and you should justify any modifications in your README.

The provided code and sample data files are located in this archive, which you should download onto your own computer and expand in your working directory.

Further documentation on the Maze code architecture and how to get started can be found here. If you find the large codebase a bit overwhelming, this is good place to start.

VIII. Maze Input and Output

INPUT: The SquareCellMazeFactory class reads a maze from a text file specified on the command prompt. The file format is a picture of the maze. There is also some header information at the front that gives the dimensions of the maze (width, then height). In the maze, '-' and '|' are used for walls, ' ' for rooms and unblocked passages, '+' for the corners of the rooms, '*' for the entrance, and 'O' for the donut. Once you find the donut, you get to chow down and then exit out of the maze. For example, the following is a legal maze : (Note: entrance and donut can be in any cells in the maze, not just the corners)

7 5
+-+-+-+-+-+-+-+
|*|   | | |   |
+ +-+ + + +-+ +
|   |   |     |
+ + + +-+ +-+ +
| |   |     | |
+ +-+-+ +-+-+ +
|       |     |
+-+ + +-+-+ +-+
|   |        O|
+-+-+-+-+-+-+-+

There are some sample input files in the starter files mentioned above that you can use to test your program. You are highly encouraged to write (and share with your classmates!) other test mazes to run your MazeRunner on. Send them to us or to the 326 discussion list or post them on the message board.

OUTPUT: After solving a maze, your MazeRunner should output a solution path calculated by the search. The path should be given on a single line and should consist of the printed locations (using the built-in print method) from the start to the exit, separated by single spaces. On the next line should be the length of the solution path, and on the following line should be the number of nodes visited and expanded in the search. The solution should be preceeded by a line giving the name of the search algorithm used. For example, the output for the above maze might be:

Random
(0,0) (1,0) (2,0) (3,0) (3,1) (3,2) (4,2) (4,3) (4,4) (4,5) (4,6) 
10
592

The word "Random" should be replaced with "DFS", "BFS", or "BestFS" as appropriate. You do not need to distinguish in your output what particular heap structure you used. The 3rd line is a number representing the length of the path, and the 4th line is a number representing the number of nodes visited (that is, the number of nodes that you have checked to see if they are the donut, including the donut itself).

The solution displayed above corresponds to the solution path shown below with dots:

   0 1 2 3 4 5 6
  +-+-+-+-+-+-+-+
0 |*|   | | |   |
  +.+-+ + + +-+ +
1 |.  |   |     |
  +.+ + +-+ +-+ +
2 |.|   |     | |
  +.+-+-+ +-+-+ +
3 |.....  |     |
  +-+ +.+-+-+ +-+
4 |   |........O|
  +-+-+-+-+-+-+-+

Note that your MazeRunner does not need to output a text-graphical solution such as the one displayed above.

Note also that a search very likely involves visiting many nodes that are not on the final solution path. You will have to explore ways of extracting the path from the collection of nodes you visited. RandomMazeRunner gives an example of how this might be done. You will need to modify this technique to work for your algorithms.

IX. Getting Started

After combining your code from Phase A and the new provided code for Phase B, compile everything and try to run the provided random mazeRunner from the launcher class as follows:

       % java MazeRunnerLauncher -r maze0.txt
The "-r" option invokes the random maze runner. Take a look at RandomMazeRunner.java to get a general idea of how things work. After understanding this file, make a copy (say BFSMazeRunner.java) and try implementing a better algorithm.

Graph Traversals for Maze Search

A search is an algorithm that traverses a graph in search of one or more goal nodes. As we will discover in a few weeks, a maze is a special instance of the mathematical object known as a "graph". In the meantime, however, we will use "maze" and "graph" interchangeably. We are going to look at three graph traversal strategies for quickly finding donuts in a maze. The basic idea behind all three is the same: they all visit the node that is “next” in a data structure they maintain (a stack, queue or priority queue), mark it, and then add its unvisited neighbors to the data structure. The traversals differ in which nodes they explore first and which they leave for later.

The basic algorithm used by all three searches is as follows. The type Set is implemented by one of the three ADTs, and affects the order in which nodes are explored.

Search( Maze m )
    Mark m.StartNode "Visited"
    Set.Insert( m.StartNode )
    While Set.NotEmpty 
        c <- Set.Remove
        If c is the goal
            Exit
        Else
            Foreach neighbor n of c
                If n "Unvisited"
                    Mark n "Visited"                    
                    Set.Insert( n )
            Mark c "Examined"                
End procedure
 

This algorithm uses three states for each cell.

I. Depth First Search (DFS)

The defining characteristic of this search is that, whenever DFS visits a maze cell c, it next searches the sub-maze whose origin is c before searching any other part of the maze. This is accomplished by using a Stack to store the nodes. Set.Insert corresponds to a Push, and Set.Remove to a Pop.

The end result is that DFS will follow some path through the maze as far as it will go, until a dead end or previously visited location is found. When this occurs, the search backtracks to try another path, until it finds an exit.

See this page for more information on DFS and BFS, and this page for a graphical comparison of BFS and DFS.

II. Breadth First Search (BFS)

The defining characteristic of this search is that, whenever BFS examines a maze cell c, it adds the neighbors of c to a set of cells which it will to examine later. In contrast to DFS, these cells are removed from this set in the order in which they were visited; that is, BFS maintains a queue of cells which have been visited but not yet examined (an examination of a cell c consists of visiting all of its neighbors). BFS is implemented using a Queue for the set. Set.Insert corresponds to an enQueue, and Set.Remove to a deQueue.

The end result is that BFS will visit all the cells in order of their distance from the entrance. First, it visits all locations one step away, then it visits all locations that are two steps away, and so on, until an exit is found. Because of this, BFS has the nice property that it will naturally discover the shortest route through the maze.

See this page for more information on DFS and BFS, and this page for a graphical comparison of BFS and DFS.

III. Best First search

The defining characteristic of this search is that, unlike DFS or BFS (which blindly examines/expands a cell without knowing anything about it or its properties), best first search uses an evaluation function (sometimes called a "heuristic") to determine which object is the most promising, and then examines this object. This "best first" behavior is implemented with a PriorityQueue for the set. Set.Insert corresponds to an Insert, and Set.Remove to a DeleteMin.

For our maze runners, the objects which we will store in the PriorityQueue are maze cells, and our heuristic will be the cell's "Manhattan distance" from the exit. The Manhattan distance is a fast-to-compute and surprisingly accurate measurement of how likely a MazeCell will be on the path to the exit. Geometrically, the Manhattan distance is distance between two points if you were only allowed to walk on paths that were at 90-degree angles from each other (similar to walking the streets of Manhattan). Mathematically, the Manhattan distance is:

| cellx - exitx | + | celly - exity |

(Sometimes, the Manhattan distance is scaled up by a constant factor, to ensure unique values for each cell.)

The end result is that best first search will visit what it thinks are the most promising cells first, which gives best first some of the nice properties of both BFS and DFS. However, this leaves best first search vulnerable to bad heuristics, or certain types of mazes which exploit weaknesses of certain heuristics.

X. Detailed Requirements

In Phase B you are required to:

Electronic Turnin

Include the following files plus any others you find necessary:

README Questions

Your README should contain the following information:

  1. The names of your team members and your team's name
  2. A breakdown of the work - a brief who-did-what description.
  3. (Answer this question before you begin) How long do you think this project will take you to finish?
  4. How much time did you actually spend on this project?
  5. Acknowledgement of any assistance you received from anyone but your partner, the course staff, or the Weiss book.
  6. A brief description of how your project goes "above and beyond" the basic requirements, if it does.

Answer the following questions:

  1. Why is it more important for DFS than BFS to check whether a cell has been visited yet?
  2. If you gave your MazeRunners a maze that was too large to fit in main memory, which would likely run faster, BFS or DFS? (Any reasonable answer -- as long as it is justified -- will be accepted.)
  3. In what ways are DFS and Best-first search similar? How are BFS and Best-first similar?
  4. Why is it important that a heuristic be easy to compute? Why is it important that a heuristic yield unique (or almost unique) values?
  5. Why is BFS guaranteed to find the shortest path? Are the other two algorithms guaranteed to find the shortest path? If yes say why, if not, give a counter example.
  6. What are the tradeoffs between a pointer vs. array-based heap? In particular, what are the tradeoffs between a binary heap, a d-heap, and a pointer-based heap? Some possible topics to cover include runtime, coding complexity, and memory usage. Did you observe any differences (for example in runtimes or memory use) between using these three for the larger maze inputs?  (Just note if you noticed anything, it is fine if you did not.  You are welcome to explore this question in more detail (timings, measurements, calculations) for extra credit.)
  7. In general (not necessarily in the context of this project), why could it be better to use a sorted array implementation of a priority queue versus a binary heap? Why could it be worse?
  8. What did you enjoy about this assignment? What did you hate? Could we have done anything better?

XI. Grading Breakdown

The amount of code required for this project is actually very small. A significant part of your grade will be based on your program design and your README.

Look at the course grading policy for a more detailed explanation on grading guidelines.

XII. Going Above and Beyond

The following suggestions are meant for you to try if you finish the requirements early. Be sure to clearly describe what you did and how to run it in your README. Recall that any extra-credit points you earn for these are kept separate from your assignment score and will be used to adjust your grade at the end of the quarter, as detailed in the course grading policy.

XIII. Phase B Turnin

For Phase B, turn in all your code for this assignment (from both Phase A and Phase B).

Only one person from each team should submit. The same person who submitted Phase A should submit Phase B. Use the assignment dropbox. You should submit electronically:

  1. All of the Java code (use *.java) that you need to compile your program, except for unmodified files that we have provided. We already have those are on the server.
  2. The README.txt file

For the hardcopy, please place the README.txt file in front, and include any code that you have written or modified. You do not have to hand in printouts of files that we provided that you did not modify (eg. MazeRunner.java). Write both partners’ names and UW email ids on the front of the README file.