Home Trees
Modifying trees
You should use the x = change(x)
pattern when modifying trees
and rearranging nodes.
In particular, you should never try and get the "parent" of a tree node. Tree nodes should contain references only to their children, never to their parent.
Suppose we wanted to write a tree manipulation method that replaces every node in a tree containing an odd number with a node containing "-1".
Naively, we might try modifying the data field of the tree, like so:
public void replaceOdds() { this.replaceOddsHelper(this.root); } private void replaceOddsHelper(TreeNode current) { if (current != null) { if (current.data % 2 == 1) { current.data = -1; } this.replaceOddsHelper(current.left); this.replaceOddsHelper(current.right); } }
However, in this class, we are not allowed to modify the data field(s) of
a node once they've been set, since it defeats the point of the exercise. In
this class, the data field will also always be marked final
, which
would prevent this code from even compiling in the first place.
We might then try and do something like this:
public void replaceOdds() { this.replaceOddsHelper(this.root); } private void replaceOddsHelper(TreeNode current) { if (current != null) { this.replaceOddsHelper(current.left); this.replaceOddsHelper(current.right); if (current.data % 2 == 1) { // Create new node TreeNode newCurrent = new TreeNode(-1); newCurrent.left = current.left; newCurrent.right = current.right; // Try and modify the parent to point at this new one TreeNode parent = /* Get parent node somehow */ if (current == parent.left) { parent.left = newCurrent; } else { parent.right = newCurrent; } } } }
However, this is very inelegant and hacky. For one, we don't have any obviously clean way of finding the parent. For another, having to determine if the current node was the left or right child of the parent is somewhat hacky – it seems like a lot of needless effort.
What we should do instead is use the x = change(x)
pattern. The basic idea
is that instead of having our recursive method be in charge of setting the parent, it should
instead only return a changed node and let the parent take charge of setting the
parent:
public void replaceOdds() { this.root = this.replaceOddsHelper(this.root); } private TreeNode replaceOddsHelper(TreeNode current) { if (current != null) { current.left = this.replaceOddsHelper(current.left); current.right = this.replaceOddsHelper(current.right); if (current.data % 2 == 1) { // Create new node TreeNode newCurrent = new TreeNode(-1); newCurrent.left = current.left; newCurrent.right = current.right; // We will eventually return the new current current = newCurrent; } } return current; }
Now, our caller (which could either be the recursive method or the original public pair) takes charge of setting and changing references, simplifying the code.
As a brief note, we can actually write this example a bit more concisely, using the three-argument constructor for TreeNodes:
public void replaceOdds() { this.root = this.replaceOddsHelper(this.root); } private TreeNode replaceOddsHelper(TreeNode current) { if (current != null) { current.left = this.replaceOddsHelper(current.left); current.right = this.replaceOddsHelper(current.right); if (current.data % 2 == 1) { current = new TreeNode(-1, current.left, current.right); } } return current; }
Looking ahead in trees
When writing a recursive tree method, you should almost never "look ahead" – the current recursive call should only ever need to look at the data field(s) of the current node. It is very rarely the case that it's necessary to look at the data field(s) of the left and right subtrees.
The problem with "looking ahead" is that you often end up duplicating a fair amount of your logic. Let's suppose that we want to write a method that replaces every tree node containing the number "10" with the number "20". Naively, we might try writing something like this:
public void replaceTenWithTwenty() { this.replace(this.root); if (this.root != null && this.root.value == 10) { this.root = new TreeNode(20, this.root.left, this.root.right); } } private void replace(TreeNode curr) { if (curr != null) { this.replace(curr.left); this.replace(curr.right); if (curr.left != null && curr.left.value == 10) { curr.left = new TreeNode(20, curr.left.left, curr.left.right); } if (curr.right != null && curr.right.value == 10) { curr.right = new TreeNode(20, curr.right.left, curr.right.right); } } }
This code "looks ahead" – within the recursive method, it looks ahead to the contents of the left and right subtrees to manipulate them.
The main reason why this is bad is due to redundancy – we've needed to repeat the "node is not null" check in four different places, and needed to repeat the "replace if 10" code in three different places.
We can entirely eliminate this redundancy by using the x = change(x)
pattern:
public void replaceTenWithTwenty() { this.root = this.replace(this.root); } private TreeNode replace(TreeNode curr) { if (curr != null) { curr.left = this.replace(curr.left); curr.right = this.replace(curr.right); if (curr.value == 10) { curr = new TreeNode(20, curr.left, curr.right); } } return curr; }
This code is much cleaner. The redundancy is entirely eliminated.
You will occasionally run into cases where you need to look ahead, usually when working with more complex trees outside of CSE 143. These instances are very rare within CSE 143 – it is nearly always possible to avoid looking ahead in trees within this class.
Designing custom tree nodes
When designing custom treenode-like objects, you should take care to keep your nodes
simple, and handle only node-specific logic. Any fields that contain "data" should be marked
as final
.
As a rough rule of thumb, if your custom node ends up looking up information or "data" stored in other nodes, it's doing too much.
Suppose we were trying to write a custom tree node that represents an arithmetic operation – we want to represent mathematical expressions like "(a + b) * c" as a tree, where "a", "b", and "c" are variables.
Here is one possible example of how we might design this kind of node. We are assuming we
are making this class a private static
inner class, to mirror your homework
assignments.
// A math expression private static class MathNode { public String value; public MathNode parent; public MathNode left; public MathNode right; public MathNode(String value, MathNode parent, MathNode left, MathNode right) { this.value = value; this.parent = parent; this.left = left; this.right = right; } // Returns a String representation of this expression public String toString() { if (this.left == null && this.right == null) { return this.value; } else { String leftExpr = this.left.toString(); String rightExpr = this.right.toString(); return "(" + leftExpr + " " + this.value + " " + rightExpr + ")"; } } // Replaces all instances of a variable in this expression with a new one public void replaceVariables(String oldVarName, String newVarName) { if (this.left == null && this.right == null) { if (this.value.equals(oldVarname)) { boolean iAmLeft = this.parent.left == this; if (iAmLeft) { this.parent.left = new MathNode(newVarName, null, null); } else { this.parent.right = new MathNode(newVarName, null, null); } } } else { this.left.replaceVariables(String oldVarName, String newVarName); this.right.replaceVariables(String oldVarName, String newVarName); } } }
This class has a number of flaws:
- Our
value
field stores data, but is not marked final. - Storing the
parent
makes our class too complex. We never need to store the parent (we can always use thex = change(x)
pattern instead), so why bother? - Notice that if we ever want to create a variable using our class,
we need to do
new MathNode(varName, null, null);
. This can potentially create a lot of redundancy, so it'd be cleaner to create a second constructor instead. (That said, you should never create constructors or methods that you never actually plan on using). - The
toString()
andreplaceVariables
methods are too complex. Both methods are looking at more then just this node. - Our comments are very minimal, and are sometimes non-existant.
- The
replaceVariables
method is disgusting – it would be much cleaner if we used thex = change(x)
pattern instead. - We repeat the
this.left == null && this.right == null
check in multiple places. We can refactor this redundancy.
Here is an example of how we could repair the above flaws:
// Represents a part of a mathematical expression. A MathNode can // either be a variable, or a binary operation on two math expressions. private static class MathNode { public final String value; public MathNode left; public MathNode right; // Initializes this MathNode to represent a binary operation. // // We assume no parameter is null, and that the 'operator' param is // a valid operator (such as +, -, *, /, %, etc). // // For example, running: // // new MathNode("+", new MathNode("a"), new MathNode("b")); // // ...corresponds to "(a + b)". public MathNode(String operator, MathNode left, MathNode right) { this.value = operator; this.left = left; this.right = right; } // Initializes this MathNode to represent a variable. // // We assume the 'variable' param is not null, and corresponds to a valid // variable. public MathNode(String variable) { this(variable, null, null); } // Returns 'true' if this node is a variable, and 'false' otherwise. public boolean isVariable() { return this.left == null && this.right == null; } }
This class fits all of the criteria listed previously. It's minimal, the
field to store the operator type or the variable name is marked as final, and
we only ever look at data stored within this particular node. Creating the
isVariable
method within the node is fine, since that method ends
up looking at only information stored within the current node.
In contrast, we would then pull the toString
and
replaceVariables
code out of this class, modify them to use
public/private pairs and the x = change(x)
pattern, and make them
a part of the containing class instead:
public class MathExpression { private MathNode root; /* [Code omitted] */ // Returns a String representation of this math expression. // Each operator is surrounded by parenthesis to make order of operations // explicit. For example: // // (a + (b * c)) // // If this math expression is just a variable, the output contains no // parenthesis. public String toString() { return this.toStringHelper(this.root); } // Returns the string representation of the current node. // We assume the 'curr' parameter is not null. // // Each operator is surrounded by parenthesis to make order of operations // explicit. For example: // // (a + (b * c)) // // If this math expression is just a variable, the output contains no // parenthesis. private String toStringHelper(MathNode curr) { if (curr.isVariable()) { return curr.value; } else { String leftExpr = this.toStringHelper(curr.left); String rightExpr = this.toStringHelper(curr.right); return "(" + leftExpr + " " + curr.value + " " + rightExpr + ")"; } } // Replaces all instances of a variable in this expression with the new one. // For example, if the expression was originally "((a - b) + (b * c))", and // we want to replace all instances of "b" with "z", the new expression // would be "((a - z) + (z * c))". // // We assume neither parameter is null, and newVarName is a valid name. // This method changes this MathExpression in place. public void replaceVariables(String oldVarName, String newVarName) { this.root = this.replace(this.root, oldVarName, newVarName); } // Returns a MathNode where all instances of the old variable have been // replaced with the new one. // // We assume the current node is never null, and both variable names // are valid and non-null. private MathNode replace(MathNode curr, String oldVarName, String newVarName) { if (curr.isVariable()) { if (curr.value.equals(oldVarname)) { curr = new MathNode(newVarName); } } else { curr.left = this.replace(curr.left, oldVarName, newVarName); curr.right = this.replace(curr.right, oldVarName, newVarName); } return curr; } /* [Code omitted] */ private static class MathNode { /* ... */ } }