CSE 331 Problem Set 5: Abstract Reasoning and Graph Building

Handout P5

Contents:

Purpose
Getting Started
Graph ADT
- Problem 1: Abstraction Functions and Representation Invariants in Graph (11 points)
Graph Subtypes
- Problem 2: Subtypes of Graph (12 points)
Induction
- Problem 3: Proving correctness (31 points)
Graph Building
Errata
Q & A

Purpose

Abstract reasoning is important throughout all stages of developing a software system. This problem set starts off by developing your ability to reason abstractly about the properties of the code that you write.

Then you will be challenged to think ahead about Husky Maps's graph-building algorithm.

Getting Started

There is no code provided in this problem set, but you will want to update the cse331 directory in your SVN repository to create the cse331/src/ps5 directory.

By the end of the problem set, you should have the following files committed to SVN in your ps5 directory:

answers/<Your Graph Implementation From ps3>.java (see problem 1)
answers/problem1.txt
answers/problem2.txt
answers/problem3.txt
answers/problem4.txt
answers/collaboration.txt
answers/reflection.txt

Graph ADT

Problem 1: Abstraction Functions and Representation Invariants in Graph (11 points)

In Problem Set 3, you specified and implemented a generic graph abstraction. Answer the following questions in a file called problem1.txt. Place it in your answers/ folder and add it to SVN:

Restate your abstraction function and representation invariant for your graph ADT from Problem Set 3. By "restate," we mean to just copy and paste what you have used for PS3. Ensure that your overview and specfields correspond to a graph abstraction and not to your particular implementation. Include a copy of your Graph implementation in your answers/ folder and add it to SVN so your TA can reference it easily.
Argue that your implementation of Graph preserves its representation invariant. Please be as concise as possible. A good argument will explain why the representation invariant is not broken by each public method in the class.
Is it possible for two concrete states of your implementation of Graph to represent the same abstract state? If so, give an example; if not, briefly explain why this is impossible.

Graph Subtypes

Now we add a class Tree that extends Graph. A tree is a connected graph in which every node A has only one edge coming in from another node B. A is the child of B and B is the parent of A. All nodes in a tree, except the root, have exactly one parent. A tree must not contain cycles, meaning that if B is the parent of A, B cannot be a child or a descendant of A.

When adding a node to a tree, you must specify the parent of the node as well. We do not allow new edges to be added other than those added when adding new nodes.

A simple implementation of Tree could inherit its constructors and observers from the Graph class, and overwrite the addNode method such that it adds the new node and an edge to its parent at the same time.

class Graph<N> {
  // Adds a node to the graph
  void addNode(N node);

  // Adds an edge from parent to child
  void addEdge(N parent, N child);

  // Other additional methods, including a constructor that
  // takes no parameters and returns an empty graph.
}

class Tree<N> extends Graph<N> {

  /* @requires root != null
   * @effects Constructs a tree with a single node, root.
   */
   public Tree(N root) {
     super();
     super.addNode(root);
   }

  /* @requires parent != null && child != null && parent is in graph
   * @effects Add the node child to graph, and add an edge from parent to child
   */
  void addNode(N parent, N child) {
    if (parent not in graph) {
      // Error handling code
    } else {
      super.addNode(child);
      super.addEdge(parent, child);
    }
  }

  // Unsupported in Tree
  void addNode(N node) throws UnsupportedOperationException {
    // Throws exception regardless of input
  }

  // Unsupported in Tree
  void addEdge(N parent, N child) throws UnsupportedOperationException {
    // Throws exception regardless of input
  }

  // Other additional methods

}

Problem 2: Subtypes of Graph (12 points)

Answer the following questions in problem2.txt. Place the file in your answers/ folder and add it to SVN.

Is Tree a true subtype of Graph? Why?
Is it a good idea for Tree to subclass Graph? Why or why not?

Induction

This problem lets you practice using induction over operators to prove correctness. We will work with a representation of a tree data structure. Recall that a tree is a connected graph that contains no cycles. Our tree's nodes are identified by integer values. A node's "descendants" include any node that can be reached by traversing edges to successive children.

Remember that the concept of correctness is undefined without a specification and representation invariant. The specification is:

class Tree {
  public Tree()
    effects: creates an empty Tree that contains zero nodes and edges.

  public void addNode(int n)
    modifies: this
    effects:  this' = this + {TreeNode(n)} where this' is the state of the object
              after applying the method

  public void deleteNode(int n)
    modifies: this
    effects:  this' = this - {TreeNode(n)}

  public boolean containsNode(int n)
    returns:  this.contains(TreeNode(n))
}

class TreeNode {
  public TreeNode(int v)
    effects: create a node with the given value

  public void addChild(TreeNode n)
    modifies: this
    effects:  this'.children = this.children + n

  public void removeChild(TreeNode n)
    modifies: this
    requires: this.children.contains(n)
    effects:  this'.children = this.children - n

  public int getValue()
    returns: this.value

  public TreeNode getInsertableNode()
    returns: any descendent node with fewer than 2 children

  public TreeNode findParentOf(int n)
    returns: the parent of any node with the value n

  public TreeNode findNode(int n)
    returns: any node, from among all descendants, with the value n

  public List getChildren()
    returns: this.children
}

Here is pseudo-code for an implementation along with a representation invariant with two clauses. Note that this specification is different from the one given in Problem 2.

class Tree {
  private TreeNode rootNode;

  // Representation invariant:
  //  - there are no cycles in the tree,
  //  - there is exactly one path from the root to any node in the tree.

  public Tree() {
    // set rootNode to null
  }

  public void addNode(int n) {
    // if the rootNode is null,
    //   set it to TreeNode(n)
    // otherwise,
    //   create a new node with value n.
    //   find a descendant with fewer than 2 children.
    //   and add a new node as its child.
    //   (note: one or more other TreeNode objects with the same
    //          value might already be in the tree.)
  }

  public void deleteNode(int n) {
    // if the root node isn't null:
    //   if the root node has value n
    //     make one of the root's children the new root
    //     and make all the other children be children
    //     of the new root
    //     delete the node
    //   else
    //     find the node and its parent.
    //     make every child of the node be a child of the node's parent.
    //     delete the node
  }

  public boolean containsNode(int n) {
    // if the root node isn't null:
    //   if the node is the root,
    //     return true
    //   else
    //     if a descendant node has value n:
    //       return true
    //     else
    //       return false
  }
}


class TreeNode {
  List<TreeNode> children;
  int    value;

  public TreeNode(int v) {
    value = v;
    children = new ArrayList<TreeNode>();
  }

  public void addChild(TreeNode n) {
    // insert n into the children List
  }

  public void removeChild(TreeNode n) {
    // remove n from the children List
  }

  public int getValue() {
    return value;
  }

  public TreeNode getInsertableNode() {
    // do a depth-first search of the
    // children and find a node with fewer than 2 children
  }

  public TreeNode findParentOf(int n) {
    // do a depth-first search of the children,
    // stopping at a node that has a child with value n
  }

  public TreeNode findNode(int n) {
    // if value == n
    //   return this
    // else
    //   do a depth-first search of the of children,
    //   stopping at the node that has value n
  }

  public List<TreeNode> getChildren() {
    return Collections.unmodifiableList(children);
  }
}

Using this specification and implementation outline, let's reason about the representation invariants:

All induction proofs start with a base case. Here the base case is proving that the tree starts with no cycles. Our constructor creates a tree with no nodes, and hence no cycles, so this is clearly true.
Now, let's use induction over operators to prove that none of the other methods break the invariant. To do so, assume the operator is called on a Tree that has no cycles. We must prove that each operator does not induce a cycle. If [the proof that operator X does not induce a cycle] depends on [the proof that operator Y does not induce a cycle], that's okay, as long as you eventually prove that operator Y does not induce a cycle. This could happen, for example, if one of your operators calls one of your other operators. Further, we will only give proofs for methods in Tree, not TreeNode. The method addNode(int n) has two cases:
- First, when we are inserting the first node we clearly cannot have a cycle.
- Second, consider when at least one node exists. We first find a descendant node that has less than 2 children, which does not change the tree, and hence doesn't induce a cycle. Next, add the new node as a child of the found node. Since the added node is a new node with no descendants, adding it cannot induce a cycle.

Problem 3: Proving correctness (31 points)

Answer the following questions in problem3.txt. Place it in your answers/ folder and add it to SVN. For your proofs you may assume that no two nodes in the tree have the same value.

Follow the examples above and prove that deleteNode(int n) does not induce any cycles in the tree. Your proof should not be long. You should break the proof into cases that mirror the cases in the pseudo-code.
Give an inductive proof for the second invariant: there is a path from the root to any node in the tree. Break your argument into these parts:
1. What is the base case? (Do not write more than one sentence.)
2. Show the inductive step for addNode(int n). (Hint: you can assume that you have a path from the root to the node where you're inserting the new node.)
3. Show the inductive step for deleteNode(int n).
4. Show the inductive step for containsNode(int n).
Describe why it is not possible to prove that no node ever has more than 2 children. Please be brief.

Graph Building

In Problem Set 6, you will build a graph-based representation of the streets in the Husky Maps database.

Consider the following pseudo-code to build such a graph from a collection of StreetSegments:

for each segment s in the collection of segments
  s_r <- reverse of s
  add s to the set of nodes
  add s_r to the set of nodes

for each segment s in the set of nodes
  for each segment s' in the set of nodes
    if s.p2 = s'.p1
      add an edge from s to s'

Note: We are ignoring the issue of one-way streets. We are assuming that all street segments in the database allow bidirectional travel and insert both a segment and its reverse in the set of nodes.

Problem 4: Finding an efficient graph building algorithm (22 points)

Unfortunately, this algorithm requires quadratic time in the number of segments. For Husky Maps, this is unacceptable due to the large number of segments in the database.

Give a more efficient algorithm to build the graph. Your algorithm should run in time O(n*outDegree) where n is the number of segments and outDegree is the maximum number of segments a single segment connects to. Describe your algorithm in problem4.txt and argue that it meets the specified time bound. Place the file in your answers/ folder and add it to SVN.

Feel free to use pseudo-code and to introduce auxiliary data structures or helper procedures as necessary. However, please try to keep your answer concise.

Collaboration (1 point)

Please answer the following questions in a file named collaboration.txt in your answers/ directory.

The standard collaboration policy applies to this problem set.

State whether or not you collaborated with other students. If you did collaborate with other students, put their names and a brief description of how you collaborated.

Reflection (1 point)

Please answer the following questions in a file named reflection.txt in your answers/ directory.

How many hours did you spend on each problem of this problem set?
In retrospect, what could you have done better to reduce the time you spent solving this problem set?
What could the CSE 331 staff have done better to improve your learning experience in this problem set?

Errata

None yet!

Q & A

This section will list clarifications and answers to common questions about problem sets. We'll try to keep it as up-to-date as possible, so this should be the first place to look (after carefully rereading the problem set handout and the specifications) when you have a problem.

None yet!

Problem Set 5: Abstract Reasoning and Graph Building

Due: May 4, 2010 at 8pm

Handout P5