Kruskal's Algorithm

Navigating the Codebase (Part 2)¶

More on Graphs¶

Here are more data structures and algorithms that you will need for the next part of the assignment.

graphs.Graph: a basic directed graph, with generic type parameters for vertex and edge types.
- graphs.KruskalGraph: extends Graph to be undirected, and adds a few more methods required by Kruskal’s algorithm.
  - graphs.AdjacencyListUndirectedGraph: an adjacency list implementation of KruskalGraph. MazeGraph extends this class.
- There are additional Graph implementations in the test module.
graphs.BaseEdge: the base edge class; always directed and weighted.
- graphs.Edge: the primary implementation of BaseEdge.
- graphs.EdgeWithData: a BaseEdge that additionally contains some extra, generic data; used by MazeGraph.
graphs.minspantrees: contains algorithms for finding minimum spanning trees

Disjoint Sets¶

Lastly, the disjointsets package contains the disjoint sets data structures required by Kruskal’s algorithm. A default, inefficient implementation is included in the skeleton. Later, you will implement an optimized version.

Generating Mazes¶

Background

Understand the interfaces involved.

We saw earlier that the “remove random walls” algorithms usually ended up generating pretty poor mazes—they either removed too many walls and created trivial mazes, or removed too few and created impossible ones.

What we really want is an algorithm that:

generates a random-looking maze
makes sure the maze is actually solvable
removes as few walls as possible

It turns out that we can use MST algorithms such as Prim’s and Kruskal’s to do exactly that! We’ll start this portion of the assignment by implementing Kruskal’s algorithm, and afterwards you’ll use it to generate better mazes.

Minimum-Spanning-Tree Finder¶

Background

Much like ShortestPathFinder, MinimumSpanningTreeFinder describes an object that simply computes minimum spanning trees.

The interface also includes the same gross generic definitions as ShortestPathFinder, but once again, you should be able to safely ignore them—the important takeaway is that G is a Graph, V can be any object, and E is a BaseEdge.

Signature	Description
`MinimumSpanningTree<V, E>` `findMinimumSpanningTree(G graph)`	Finds and returns a minimum spanning tree for the given graph.

MinimumSpanningTree is another container for edges, but unlike ShortestPath, the edges are unordered (since the edges of an MST don’t have any particular ordering like the edges of a path do).

Kruskal Graph Interface¶

Background

Kruskal’s algorithm requires some extra functionality from its graphs beyond the basic Graph interface, so we introduced another graph interface - the KruskalGraph interface:

Signature	Description
`Collection<V> allVertices()`	Returns an unmodifiable collection of all vertices in the graph.
`Collection<E> allEdges()`	Returns an unmodifiable collection of all edges in the graph.

Disjoint Sets¶

Background

Kruskal’s algorithm also uses the disjoint sets ADT:

Signature	Description
`void makeSet(T item)`	Creates a new set containing just the given item and with a new integer id.
`int findSet(T item)`	Returns the integer id of the set containing the given item.
`boolean union(T item1, T item2)`	If the given items are in different sets, merges those sets and returns `true`. Otherwise does nothing and returns `false`.

The skeleton includes a naive implementation, QuickFindDisjointSets, which you can use to start. You’ll write a faster implementation later.

Kruskal MST Finder¶

Task

Complete KruskalMinimumSpanningTreeFinder, using Kruskal’s algorithm to implement the MinimumSpanningTreeFinder interface.

Implementation notes:

The generic type bounds on this class require G to be a subtype of KruskalGraph.
The skeleton code includes a snippet of code that sorts the edges of the given graph based on their weights, so you don’t need to worry about figuring out how to do that.
Unlike the pseudocode from disjoint set lecture, the findShortestPath must be able to detect when no MST exists and return the corresponding MinimumSpanningTree result.

Graphs used in tests:

Tree graph:

Tree graph

Graph with cycle:

Graph with cycle

Disconnected graph:

Disconnected graph

Graph with self-loop edge:

Graph with self-loop edge

Carving Mazes using Kruskal’s¶

Task

Implement KruskalMazeCarver using KruskalMinimumSpanningTreeFinder.

MazeCarver is used to determine which walls to remove from an initial maze, where there exists a wall between all adjacent rooms. After the removal, a real maze can be formed. The MazeCarver requires subclasses to implement a single method:

Signature	Description
`Set<Wall>` `chooseWallsToRemove(Set<Wall> walls)`	Given a set of walls separating rooms in a maze base, returns a set of every wall that should be removed to form a maze.

Recall our criteria from above:

generates a random-looking maze
makes sure the maze is actually solvable
removes as few walls as possible

Here’s the trick: we take the maze and treat each room as a vertex and each wall as an edge, much like we would when solving the maze (the only difference being that edges now represent walls instead of pathways). Then, we can assign each wall a random weight, and run any MST-finding algorithm.

For example, here’s a diagram of an MST that might be output for a grid-shaped maze:

Example MST overlaid over a grid base

By removing any wall that was a part of that MST, we end up satisfying all three criteria! By randomizing the wall weights, we remove random walls which satisfy criterion 1. An MST, by definition, will include a path from every vertex (every room) to every other one, satisfying criterion 2. And finally, because the MST will not have cycles, we avoid removing unnecessary edges and end up with a maze where there really is only one solution, satisfying criterion 3.

Implementation notes:

If you aren’t sure where to start your implementation, take a look at MazeGraph, Wall, and related classes. To reiterate, Rooms will be the vertices in the graph, and edges will represent Walls.
Use the Random object in the provided rand field to generate the random edge weights.
Use the MinimumSpanningTreeFinder object in the provided minimumSpanningTreeFinder field to find the minimum spanning tree.

After you finish, you can try using your code to generate some mazes by running the program and using the “Run (randomized) Kruskal” option. You should notice that although the mazes generated look much better than before, they take a bit longer to generate—we’ll address this by creating a faster disjoint sets implementation.

A Better Disjoint Sets Implementation¶

Task

Implement UnionBySizeCompressingDisjointSets, and use it to speed up KruskalMinimumSpanningTreeFinder.

You may want to review the lecture slides and section materials before implementing your disjoint sets since your implementation will need to use the union-by-size, path compression, and array representation optimizations that they cover. Also make sure to store your array representation in the provided pointers field—the grader tests will inspect it directly.

After modifying your KruskalMinimumSpanningTreeFinder to use this class, you should notice that maze generation using KruskalMazeCarver becomes significantly faster—almost indistinguishable from the time required by the RandomMazeCarver.

Submission¶

Task

Commit and push your changes to GitLab before submitting to Gradescope.