CSE 326 Project 3, Winter 2008

Code due 11:00 PM Tuesday, March 11

Writeup due in class Wednesday, March 12

I. Introduction

It is an incredibly important day for you at work at Minotaur, Inc. You've managed to arrive before any of your fellow co-workers, meaning you have first dibs on some freshly-baked donuts IF you can get there in time. Unfortunately, your pointy-haired boss had the cubicles rearranged during the night, turning the whole office space into a complex and mystifying maze. Since you doubt you can find the true path in time, you decide to let your computer solve the maze for you.

Graph search algorithms easily lend themselves to solving mazes. Each square of a maze can be viewed as a node in a graph. Two nodes are connected by an edge if the squares they represent are adjacent in the maze and a wall is not between them. You will implement several maze solvers that use different graph search techniques.

II. The Assignment

Part A: You will be required to implement a MazeRunner class, which will run through a maze using different search techniques. The MazeRunner class should be run from the command line in the following form:

java MazeRunner alg -input infile [-final picfile]

where alg is either "dfs", "bfs", "best", or "astar", infile is an input file containing maze information, and picfile is a file to output a picture the final state of the maze to. Note that the "-final picfile" parameter is optional. The MazeRunner class will read the maze file contained in infile, run through it using the search algorithm specified by alg, and, if the final option is given, print out a picture of the resulting maze to the file picfile.

The program should print out, in order:

the name of the input file,
the type of search algorithm used,
the path found as a list of coordinates in the form (x,y), beginning with the entrance node and ending with the exit node. Do not print from the exit to the entrance.
the length of the path length found,
the number of nodes expanded in the search for a solution, and finally
a blank line.

Do not print out anything else. You will lose points if you do. Any debugging output should be removed before you turn in. Your command line should work exactly as described above, even if you have extra options used with extra-credit.

A sample output:

mazefile.txt

A* Search

(0,0) (0,1) (0,2) (0,3) (1,3) (2,3) (2,4) (3,4) (4,4) (4,5) (4,6)

To perform the maze runs, you will need to implement the following search algorithms to run through the maze:

Depth-First Search (DFS). Note that the maze might contain cycles, so you have to avoid going into an infinite loop!
Breadth-First Search.
Best-First Search.
Best-first search means you always extend the path that has the smallest estimated distance to the goal. To determine the estimated distance to the goal, we want you to use what is known as the Manhattan distance, which is calculated as:

key(node) = |node_x - exit_x| + |node_y - exit_y|

The Manhattan distance is the true distance only if there are no walls in the way between the current position and the goal. It is always a lower-bound on the true distance to the goal.

The algorithm for Best-First Search is: insert the entrance location into a priority queue, where its key (priority value) is its Manhattan distance to the goal. Then, until the queue is empty or the exit is found, delete the location with the minimum key from the priority queue, find its neighbors, mark each neighbor with its Manhattan distance to the exit, and insert each neighbor location into the queue as long as it has not already been visited.

A* Search.
A* Search combines the best features of Breadth-first search (always finds optimal paths) and Best-First search (visits fewer nodes). A* is like Best-First search, except that the priority for a node is the sum of the actual distance from the start to the node and the estimated distance to the goal:

key(node) = distance_from_start_to_node + |node_x - exit_x| + |node_y - exit_y|

You will need to keep track of the true distance from the start in each node. Then, when find the neighbors of a node, their distance is the distance of the current node plus one.

There are also some other minor but important modifications you need to make to the Best-First algorithm:

After you first see a node, you might see it again but along a shorter path. Therefore, instead of just checking whether a node has been visited or not before adding it to the queue, you have to check whether it has been visited and whether the distance from the start to that node previously calculated is the same or shorter than the distance along the current path. If both conditions hold, then you don't add it to the queue. But, if you have found a shorter path to the node, go ahead and insert it into the queue again. It is okay to have several copies of the same node, but with different priorities, in the queue.
Do not stop the search until the goal is first removed from the queue, not when it is first inserted in the queue. This is because the first time you insert the goal you may not have found the shortest path to it.

We have supplied the declarations for rooms, nodes, and mazes, as well as the input routine for mazes. You may use any priority queue implementation you like, such as one of Java's built in classes or the code from Weiss (which is available on his home page.) Be sure to say what you use for the queue. Everything else will be coded by you. You may do the project in C if you wish, however you will have to implement all the code we gave you in C yourself. Your implementation needs to take the exact same command-line arguments as described above.

Part B: Using the test mazes in the Examples.zip file:

Run each of the four search algorithms on each maze. Record the path length found and the number of nodes each search algorithm had to expand in a separate file named results. (This file can be automatically generated from the output of the program.)
Include in the turnin a set of the final solution picture files: for each of the input files from part B, include a file named

RUNNER-FILENAME

where RUNNER is the name of the algorithm used and FILENAME the name of the maze file. For example,

java MazeRunner astar -input bigmaze1.txt -final astar-bigmaze1.txt >> results

creates astar-bigmaze1.txt. Note that it is easy to run all these experiments by writing a simple script. For example, assuming the input files are in the current directory,

  echo > results

  for f in maze*.txt; do

    for a in dfs bfs best astar; do

       java MazeRunner $a -input $f -final $a-$f >> results

    done

  done

Create a project writeup file that includes the following:

The name(s) of the students on your team.
An introduction that briefly describes this project and your design decisions.
Answers to questions:

How did the algorithms rank in terms of quality (length) of solutions on the mazes with and without cycles?
How did the algorithms rank in terms of number of nodes visited on the mazes with and without cycles?
What was the average savings (in percent) in terms of number of nodes visited by A* compared to breadth first?
What was the average improvement (in percent) in terms of the length of solutions found by A* as compared to best first?

A conclusion that summarizes what you discovered in doing this project.

III. Provided Code:

In the file MazeJavaCode.zip are:

EasyReader.java

The EasyReader class is used by the Maze class, and is designed to provide simple access to the I/O console. (Information about it may be found here.)

Maze.java

The Maze class itself. It contains inner classes RoomNode (representing a room in the maze) and RoomList (representing a list of rooms). The Maze constructor takes the name of a maze file as a parameter, and parses it to create the maze. It also contains a blank printMaze() routine (guess what that does?).

VisitedState.java

A short class which serves as a container for some constants relating to the state of a particular room in the maze.

The file Examples.zip contains the test mazes. Each file begins with the dimensions of the maze. In the maze, '-' and '|' are used for walls, ' ' for rooms and unblocked passages, '+' for the corners of the rooms, '*' for the entrance, and 'X' for the exit. For example:

7 5

      +-+-+-+-+-+-+-+

      |*|   | | |   |

      + +-+ + + +-+ +

      |   |   |     |

      + + + +-+ +-+ +

      | |   |     | |

      + +-+-+ +-+-+ +

      |       |     |

      +-+ + +-+-+ +-+

      |   |        X|

      +-+-+-+-+-+-+-+

Want to see how the mazes were generated? Generator.zip contains the program used to create the mazes.

IV. Odds and Ends

Note that a search might involve visiting many nodes not on the solution path. You will have to find a means of extracting the path from the collection of nodes you visited.
The count of the number of nodes expanded is incremented when a node is removed from the queue.

V. Extra credit

Visualization: Create a visualization routine that runs as the maze is being solved. Pop up a window that shows the graph, with different colors used to indicate the state of each room. Drawing routines will need to be invoked when the maze is create and from setState. Include instructions in your README file on how to run the visualization as a Java program and/or as an applet.
Add a maze generator to your program or applet, either by translating our generator from c to java or by starting from scratch.

· Come up with an A* heuristic that expands fewer nodes than the one specified above, and still guarantees optimal solutions. You can demonstrate your heuristic on graphs you create yourself that are designed to show that is it better.

Recall that any heuristic which is a lower bound on the true distance will still guarantee optimal solutions. So to improve the one specified above, you need to come up with a heuristic which is always smaller than the true distance, but is "smarter" than the one above. To show that your heuristic is smarter, you need to come up with input where your heuristic does better than the fist one.

Here is one approach that might work. If the current and goal nodes are on the same row or column, then the Manhattan distance is the exact distance to the solution only if there are no walls between the current position and goal. If there is a wall, then you can add 2 to the estimated distance to the goal, and have a better lower bound on the actual distance. Can you generalize this idea, by having the heuristic "look ahead" in order to improve its estimate?

Add an option to allow the mazerunner to knock down (additional) walls at a given cost. The command line option
-knock INTEGER
specifies the cost of knocking down a wall. For example,
-knock 3
means that you can walk through walls, but the distance through a wall is considered to be 4 (=3+1) rather than 1.
Come up with entirely new feature to add to mazes and to the solver. For example, "portals" -- if two squares on the input are labeled with the same letter, you are allowed to move between them in a single step. Go all out: We're letting you express your creative side here. How about mazes with multiple floors? Or rendering the maze in 3-D? If you do anything above the basic design specifications, document what you did and we will be curious to see your artistry. Of course, your code should still meet the basic requirements.