Project 1, part 2: Abstract syntax trees

Due

In this half of the assignment, you will help implement a part of a symbolic algebra calculator. A symbolic algebra calculator is a kind of calculator that does not immediately evaluate the expressions you type in but instead lets you manipulate them symbolically.

You will implement the logic for manipulating and evaluating mathematical expressions.

Overview

To get us started, let's look at what your finished program will look like.

Screenshot of calculator

The user can interactively enter commands in the bottom pane. These commands can do different things, ranging from just evaluating expressions to graphing actual output. The lines that start with >>> are lines where the user typed in input; the following lines are the output of that operation.

Our code will manage most of the logic in this pane for you. Your job is to simply focus on evaluating expressions.

Example 1: Basic usage and variables

Let's take a closer look at an example run:

>>> 3 + 2 * 7
17

>>> x + y
x + y

>>> x := 1
1

>>> x + y
1 + y

>>> y := 2
2

>>> x + y
3

The users type input into the lines that start with >>>. The line that follows immediately after is the output.

This example demonstrates the two core features we can perform using this calculator: evaluate math expressions (line 1) and symbolically work with variables (lines 7 - 17).

When we first typed in x + y, we were unable to evaluate or simplify the expressions because the variables x and y are currently unknown. Once both variables are defined, we can evaluate this function.

To assign variables, we do varName := expression. This differs from how we assign variables in Java: we use := instead of =, and we don't need to assign a type to this variable. (All variables in our language are just mathematical expressions, so there's no need).

Example 2: Variable redefinition

Here is another example run:

>>> y := x + 3
x + 3

>>> y
x + 3

>>> x := 4
4

>>> y
7

>>> x := 8
8

>>> y
11

We define y to be equal to the expression x + 3. What exactly that expression evaluates to depends on the exact definition of x.

Example 3: Symbolic evaluation

Here is another example:

>>> y := x + 2 + 3 * sin(x) + (20 + 30)/70
x + 2 + 3 * sin(x) + 50 / 70

>>> y
x + 2 + 3 * sin(x) + 50 / 70

>>> x := 4
4

>>> y
6 + 3 * sin(4) + 50 / 70

>>> toDouble(y)
4.44387822836193

Notice that when we define y and display it, we don't evaluate the expression fully. By default, our calculator will perform simplification only if it can get an exact result. For example, we know evaluating sin and division generally does not produce exact results, so do not attempt to simplify those particular expressions.

In this case, we could have simplified 50 / 70 further. However, for the sake of simplicity, we will follow a simpler rule: we do not attempt to simplify division.

To get back an exact result, we need to use the toDouble(y) expression.

Example 4: AST transformation

Finally, the payoff: by deferring evaluating the expression, we can manipulate expressions in interesting ways we wouldn't be able to do otherwise. For example:

>>> y := x^2 + 7*x + 10
x ^ 2 + 7 * x + 10

>>> plot(y, x, 0, 10, 0.5)  // Plot y by varying x from 0 to 10 in steps of 0.5
// Output plot omitted

>>> derive(y, x)            // Derive y with respect to x (extra credit)
2 * x + 7

These functions all work by taking the underlying expressions and manipulating them in some way.

Notes about implementation

Representing expressions as data structures

So, how do we go about actually representing expressions like x + 2 * sin(y) in our code?

It turns out the most convenient representation is to store this as a tree:

AST representation

This lets us easily preserve things like nesting and order-of-operations. We can also evaluate or manipulate this tree using the same kind of tree-manipulation algorithms you studied in CSE 143!

We call this sort of data structure an abstract syntax tree (AST). Each individual node in this tree is an AST node. An abstract syntax tree is similar to the binary trees that you've studied in CSE 143, with a few key differences:

  1. We store a list of children, rather then points to the left and right children.
  2. We store extra metadata on whether a node is meant to represent a number, variable, or an operation. An "operation" node is a node that represents any kind of function call (like sin(...)) or any node that contains children (like a + b).

You should read the source code of AstNode.java for more information on what methods are available for you to use.

Example: Creating and using AST nodes

You are not required to know how to build an AST expression to complete this assignment. However, if you're curious, here's how we would build the AST representing the expression x + 2 * sin(y):

// Create 'sin' node
IList<AstNode> sinParams = new DoubleLinkedList<>();
sinParams.add(new AstNode("y"));
AstNode sinNode = new AstNode("sin", sinParams);

// Create the 'multiply' node
IList<AstNode> multiplyParams = new DoubleLinkedList<>();
multiplyParams.add(new AstNode(2.0));
multiplyParams.add(sinNode);
AstNode multiplyNode = new AstNode("*", multiplyParams);

// Create the final 'add' node:
IList<AstNode> addParams = new DoubleLinkedList<>();
addParams.add(new AstNode("x"));
addParams.add(addNode);
AstNode finalNode = new AstNode("+", addParams);

Background: converting a string into an AST

You may be wondering how exactly we go about converting arbitrary strings into an AST.

We've provided you the code to do this, and understanding how this process works is beyond the scope of this class, but if you're curious, this process is known as "lexing" and "parsing". A "lexer" takes a string and outputs a list of each distinct "token" in the stream. A parser takes this list of tokens and figures out how to convert it into a tree form.

We use a library called ANTLR which will automatically generate a lexer and a parser for us based on a grammar that describes what our language looks like. See the files in src/main/antlr if you want to know what our grammar looks like.

If you'd like to learn more about this whole process, feel free to ask the course instructor.

Types of AstNodes

So, we can represent expressions as a tree of AstNodes, but what exactly does an AstNode look like?

There are three types of AstNodes:

  1. Numeric nodes
  2. Variable nodes
  3. Operation nodes

Numeric nodes and variable nodes ought to be self-explanatory: they represent numbers and variables respectively. These types of nodes are guaranteed to always be leaf nodes.

Operation nodes are a little more abstract: they represent any kind of node that might have one or more children. For example, something like 3 + x would be represented as an operation node (+) with two children: a numeric node (3) and a variable node (x).

The AstNode class contains methods you can call to determine what type of node it is and to extract relevant pieces of information.

Storing variables

We can represent expressions using abstract syntax trees, but how do we store variables?

We use a dictionary! Whenever the user defines a variable, we store that variable name as a string and the corresponding expression in your ArrayDictionary. If a variable does not exist in that dictionary, we say it's an unknown variable.

We will provide you with code that handles variable assignment: you only need to use the pre-populated dictionary you were given.

Part 2a: Implement toDouble(...)

Task: Implement ExpressionManipulator.toDouble(...).

(Note: you can find ExpressionManipulator in the calculator.ast package.)

First, you will implement toDouble(...), a function that attempts to fully evaluate the given AST into just a single double.

For more details about the expected behavior of this function, see the comments in the method you need to implement.

You may assume that the AST you are given will contain only the following types of nodes:

Part 2b: Implement simplification

Task: Implement ExpressionManipulator.simplify(...).

This method is responsible for simplifying an AST node. For more implementation details, see the comments in the simplify(...) method. You may make the same assumptions about the AST as you did in part 2a. Note: the calculator will call your simplify(...) function many times while evaluating a single expression, so make sure it's relatively efficient.

Part 2c: Implement plot(...)

Task: Implement ExpressionManipulator.plot(...).

As before, see the source code for additional implementation nodes. You may make the same assumptions about the AST as you did in 3a and 3b.

Note: you are not responsible for drawing the actual chart yourself. Instead, you should call the drawScatterPlot(...) method (see ImageDrawer).

Background: drawing custom visualizations and charts

If you want to try implementing extra visualizations or charts (perhaps for extra credit? See below), here are some additional pieces of information that may help.

The calculator itself is built on top of Java Swing, which is the same library that the DrawingPanel class from CSE 142 uses. This means that fundamentally, the way we draw things is the same: we get and use a Graphics object. See ImageDrawer.getGraphics() for more details.

We use a library called "JFreeChart" to draw our different charts. You can find documentation on how to use this library online, and an example in the ImageDrawer class.

Part 2d: Complete writeup

Task: submit answers to the following questions.

Each group member should answer and submit these questions independently at this canvas page.

You must submit your answers in either .txt or .pdf form.

  1. Questions about the project:
    1. What is your name and your partner's name?
    2. How was the project? Did you enjoy it?
    3. Do you have any suggestions for improving the project?
  2. Questions about your partner (skip if you worked solo):
    1. Which parts of the project did you work on, and which parts did your partner program? Did you try pair programming?
    2. How was your partnership? Do you feel you and your partners contributed equally to this project?

Part 2e: Extra credit

You can earn extra credit by extending your calculator in different ways. See the extra credit page for more details.

If you decide to complete any of the extra credit, you should create a folder located at src/writeup and add a file named extra-credit.txt inside that describes what you implemented.

Deliverables

The following deliverables are due on .

A single person from your partnership should submit the entire contents of your src folder using the submit tool.

Make sure that the src folder contains the following:

Both partners should turn in a .txt or .pdf file containing their answers to part 2d at this canvas link, if you haven't already. This is due at the same time as the rest of your project.