CSE 373, Summer 2018: Project 1: Part 2

Summary

Due Friday July 13 at 11:59pm

In this half of the assignment, you will help implement a part of a symbolic algebra calculator. A symbolic algebra calculator is a kind of calculator that does not immediately evaluate the expressions you type in but instead lets you manipulate them symbolically.

You will implement the logic for manipulating and evaluating mathematical expressions.

You will be modifying the following file:

  • main.java.calculator.ast.ExpressionManipulators.java

Additionally, here are a few more files that you might want to review while completing the assignment (note that this is just a starting point, not necessarily an exhaustive list):

  • test.java.calculator.TestCalculator.java
  • main.java.datastructures.interfaces.IList.java
  • main.java.datastructures.interfaces.IDictionary.java
  • main.java.experiments.*
  • main.java.calculator.ast.AstNode.java
  • main.java.calculator.gui.ImageDrawer.java
  • main.java.calculator.interpreter.Environment.java

Overview

To get us started, let's look at what your finished program will look like.

Screenshot of calculator

The user can interactively enter commands in the bottom pane. These commands can do different things, ranging from just evaluating expressions to graphing actual output. The lines that start with >>> are lines where the user typed in input; the following lines are the output of that operation.

Our code will manage most of the logic in this pane for you. Your job is to simply focus on evaluating expressions.

Example 1: Basic usage and variables

Let's take a closer look at an example run:

>>> 3 + 2 * 7
17

>>> x + y
x + y

>>> x := 1
1

>>> x + y
1 + y

>>> y := 2
2

>>> x + y
3

The users type input into the lines that start with >>>. The line that follows immediately after is the output.

This example demonstrates the two core features we can perform using this calculator: evaluate math expressions (line 1) and symbolically work with variables (lines 7 - 17).

When we first typed in x + y, we were unable to evaluate or simplify the expressions because the variables x and y are currently unknown. Once both variables are defined, we can evaluate this function.

To assign variables, we do varName := expression. This differs from how we assign variables in Java: we use := instead of =, and we don't need to assign a type to this variable. (All variables in our language are just mathematical expressions, so there's no need).

Example 2: Variable redefinition

Here is another example run:

>>> y := x + 3
x + 3

>>> y
x + 3

>>> x := 4
4

>>> y
7

>>> x := 8
8

>>> y
11

We define y to be equal to the expression x + 3. What exactly that expression evaluates to depends on the exact definition of x.

Example 3: Symbolic evaluation

Here is another example:

>>> y := x + 2 + 3 * sin(x) + (20 + 30)/70
x + 2 + 3 * sin(x) + 50 / 70

>>> y
x + 2 + 3 * sin(x) + 50 / 70

>>> x := 4
4

>>> y
6 + 3 * sin(4) + 50 / 70

>>> toDouble(y)
4.44387822836193

Notice that when we define y and display it, we don't evaluate the expression fully. By default, our calculator will perform simplification only if it can get an exact result. For example, we know evaluating sin and division generally does not produce exact results, so do not attempt to simplify those particular expressions.

In this case, we could have simplified 50 / 70 further. However, for the sake of simplicity, we will follow a simpler rule: we do not attempt to simplify division.

To get back an exact result, we need to use the toDouble(y) expression.

More specifically, we attempt to simplify expressions involving addition, subtraction, and multiplication of two numbers. If you are interested in implemented more complex kinds simplification operations, you can do so for extra credit.

Example 4: AST transformation

Finally, the payoff: by deferring evaluating the expression, we can manipulate expressions in interesting ways we wouldn't be able to do otherwise. For example:

>>> y := x^2 + 7*x + 10
x ^ 2 + 7 * x + 10

>>> plot(y, x, 0, 10, 0.5)  // Plot y by varying x from 0 to 10 in steps of 0.5
// Output plot omitted

>>> derive(y, x)            // Derive y with respect to x (extra credit)
2 * x + 7

These functions all work by taking the underlying expressions and manipulating them in some way.

Notes about implementation

Representing expressions as data structures

So, how do we go about actually representing expressions like x + 2 * sin(y) in our code?

It turns out the most convenient representation is to store this as a tree:

AST representation

This lets us easily preserve things like nesting and order-of-operations. We can also evaluate or manipulate this tree using the same kind of tree-manipulation algorithms you studied in CSE 143!

We call this sort of data structure an abstract syntax tree (AST). Each individual node in this tree is an AST node. An abstract syntax tree is similar to the binary trees that you've studied in CSE 143, with a few key differences:

  1. We store a list of children, rather then points to the left and right children.
  2. We store extra metadata on whether a node is meant to represent a number, variable, or an operation. An "operation" node is a node that represents any kind of function call (like sin(...)) or any node that contains children (like a + b).

You should read the source code of AstNode.java for more information on what methods are available for you to use.

Example: Creating and using AST nodes

You are not required to know how to build an AST expression to complete this assignment. However, if you're curious, here's how we would build the AST representing the expression x + 2 * sin(y):

// Create 'sin' node
IList<AstNode> sinParams = new DoubleLinkedList<>();
sinParams.add(new AstNode("y"));
AstNode sinNode = new AstNode("sin", sinParams);

// Create the 'multiply' node
IList<AstNode> multiplyParams = new DoubleLinkedList<>();
multiplyParams.add(new AstNode(2.0));
multiplyParams.add(sinNode);
AstNode multiplyNode = new AstNode("*", multiplyParams);

// Create the final 'add' node:
IList<AstNode> addParams = new DoubleLinkedList<>();
addParams.add(new AstNode("x"));
addParams.add(multiplyNode);
AstNode finalNode = new AstNode("+", addParams);

Background: converting a string into an AST

You may be wondering how exactly we go about converting arbitrary strings into an AST.

We've provided you the code to do this, and understanding how this process works is beyond the scope of this class, but if you're curious, this process is known as "lexing" and "parsing". A "lexer" takes a string and outputs a list of each distinct "token" in the stream. A parser takes this list of tokens and figures out how to convert it into a tree form.

We use a library called ANTLR which will automatically generate a lexer and a parser for us based on a grammar that describes what our language looks like. See the files in src/main/antlr if you want to know what our grammar looks like.

If you'd like to learn more about this whole process, feel free to ask the course instructor.

Types of AstNodes

So, we can represent expressions as a tree of AstNodes, but what exactly does an AstNode look like?

There are three types of AstNodes:

  1. Numeric nodes
  2. Variable nodes
  3. Operation nodes

Numeric nodes and variable nodes ought to be self-explanatory: they represent numbers and variables respectively. These types of nodes are guaranteed to always be leaf nodes.

Operation nodes are a little more abstract: they represent any kind of node that might have one or more children. For example, something like 3 + x would be represented as an operation node (+) with two children: a numeric node (3) and a variable node (x).

Another way of looking at operation nodes would be to consider them exactly equivalent to function calls. So, the expression sin(3) would be represented as an operation node – as a function call – with one argument.

(This means doing (3 + 2) is really equivalent to calling a function named +, passing in two arguments! This trick of treating everything as function calls helps us simplify our overall language.)

The AstNode class contains methods you can call to determine what type of node it is and to extract relevant pieces of information.

Storing variables

We can represent expressions using abstract syntax trees, but how do we store variables?

We use a dictionary! Whenever the user defines a variable, we store that variable name as a string and the corresponding expression in your ArrayDictionary. If a variable does not exist in that dictionary, we say it's an unknown variable.

We will provide you with code that handles variable assignment: you only need to use the pre-populated dictionary you were given.

Part 2a: Implement toDouble(...)

Task: Implement ExpressionManipulators.handleToDouble(...).

(Note: you can find ExpressionManipulators in the calculator.ast package.)

First, you will implement handleToDouble(...), a function that attempts to fully evaluate the given AST into just a single double.

This method will receive an operation node named toDouble. This node will contain a single child, which is the expression you are to convert into a double. (So, this method might receive the AST node corresponding to toDouble(3.0 + 4.0). You would then return the AST node corresponding to 7.0).

For more details about the expected behavior of this function, see the comments in the method you need to implement.

Furthermore, you may assume that the expression AST you must convert to a double will consist only of the following types of nodes:

  • Numeric nodes
  • Variable nodes
  • The following operation nodes:
    • Basic math operators: +, -, *, /, and ^ (addition, subtraction, multiplication, division, and exponentiation)
    • The negate operator. The "negate" operator is used to represent expressions like -(x + 3) (here, we negate x + 3). We use the string "negate" in these cases instead of "-" to avoid clashing with the subtraction operation.
    • Two basic trig functions: sin and cos.

For example, the input to represent toDouble(3 + 4) would be

AST representation of toDouble(3 + 4)

Which would return a new node containing the value 7.0. The representation of toDouble(-(a + b)) would be:

AST representation of toDouble(-(a + b))

Which would return a node with the numeric value -(a + b), if a and b are defined.

Part 2b: Implement simplification

Task: Implement ExpressionManipulators.handleSimplify(...).

This method is responsible for simplifying an AST node. For more implementation details, see the comments in the handleSimplify(...) method as well as the unit tests you have been provided.

You may make the same assumptions about the AST as you did in part 2a – the only difference is that instead of being given an operation node named "toDouble" that contains a single child, you will be given an operation named "simplify" that contains a single child.

Notes:

  • The calculator will call your handleSimplify(...) function many times while evaluating a single expression, so make sure it's relatively efficient.
  • You are required to simplify only expressions of the form NUM + NUM, NUM * NUM, and NUM - NUM. You may implement additional simplifications for extra credit – the tests you were provided will accept a range of different answers in case you decide to do this.

For example, to evaluate simplify((5 + 6) + (7 - a)) with no variables defined, we would pass in simplify in the following diagram to handleSimplify(...):

AST representation of simplify((5 + 6) + (7 - a))

Which will simplify the expression to a new tree containing the following:

AST representation of simplify(11 + (7 - a))

Note that the method was not required to simplify the expression further down to 18 - a.

Part 2c: Implement plot(...)

Task: Implement ExpressionManipulators.plot(...).

As before, see the source code for additional implementation nodes. Similar to before, you will receive an AST operation node named "plot". However, this time, the node will have five children – see the comments associated with plot(...) for more details. The representation of the node produced by plot(3 * x, x, 2, 5, 0.5 + a) is shown below (this would necessitate that the variable a has been previously defined):

AST representation of plot(3 * x, x, 2, 5, 0.5 + a)

You may make the same assumptions about each child as you did in 3a and 3b.

Note: you are not responsible for drawing the actual chart yourself. Instead, you should call the drawScatterPlot(...) method (see ImageDrawer).

Background: drawing custom visualizations and charts

If you want to try implementing extra visualizations or charts (perhaps for extra credit? See below), here are some additional pieces of information that may help.

The calculator itself is built on top of Java Swing, which is the same library that the DrawingPanel class from CSE 142 uses. This means that fundamentally, the way we draw things is the same: we get and use a Graphics object. See ImageDrawer.getGraphics() for more details.

We use a library called "JFreeChart" to draw our different charts. You can find documentation on how to use this library online, and an example in the ImageDrawer class.

Part 2d: Complete group writeup

Task: complete a writeup containing answers to the following questions.

Note: your completed writeup MUST be in PDF format. You will not submit this writeup via Canvas – instead, add it to your src/writeup folder and push to Gitlab.

Your writeup will be graded partially on whether or not you produced a reasonable answer, and partially based on the clarity of your explanations and analysis.

  1. While testing your DoubleLinkedList and ArrayDictionary classes, you should have discovered that your data structure needs to handle both null and non-null entries or keys. Explain how you modified your code so it could handle these two different types of inputs.

    Justify the design decisions you made. (Your answer should be at most 1 to 2 paragraphs long).

  2. In this question, you will run a variety of experiments and report back on the resulting behavior. Since this is the first assignment, we have written the assignments for you so all you need to do is run them. You will be asked to write your own experiments in future projects, so be sure to take some time to read the provided code and understand what's going on.

    You can find the experiments in the analysis.experiments package. For each of the four experiments, answer the following:

    1. Briefly, in one or two sentences, summarize what this experiment is testing. (Note: the experimental code deliberately has minimal comments. Treat this as an exercise in reverse-engineering unknown code.)

    2. Predict what you think the outcome of running the experiment will be.

      Your answer should be around one or two paragraphs. There is no right or wrong answer here, as long as you thoughtfully explain the reasoning behind your hypothesis.

    3. Run the experiment code. Each Experiment*.java file should generate a CSV file in the experimentdata folder (it is possible that the 4th experiment will produce a warning; this is can safely be ignored, as long as the CSV file in generated as described). Import the CSV file into Microsoft Excel, Google Sheets, or any other analysis of your choice and generate a plot of the data. Include this plot as a image inside your writeup.

      If the CSV file contains multiple result columns, plot them all in the same chart.

      You will be graded based on whether you produce a reasonable plot. In particular, be sure to give your plot a title, label your axes (with units), include a legend if appropriate, and relabel the columns to something more descriptive then "TestResult".

    4. How do your results compare with your hypothesis? Why do you think the results are what they were?

What is a CSV file?

A CSV file is a kind of text file formatted to contain tabular data. Each row of data goes on a separate line, and each cell in a row is separated by commas. (CSV stands for "comma separated value").

Since CSV files are just regular text files, you can open and view them using any text editor such as Eclipse. This makes CSV files a fairly popular way of exchanging data – they're easy to generate programatically, yet are understood by a wide variety of tools (such as Excel or Google Sheets).

Part 2e: Complete individual writeup

Task: submit answers to the following questions.

Each group member should answer and submit these questions independently at this canvas page.

You must submit your answers in either .txt or .pdf form.

  1. Questions about the project:
    1. What is your name and your partner's name?
    2. How was the project? Did you enjoy it?
    3. Do you have any suggestions for improving the project?
  2. Questions about your partner (skip if you worked solo):
    1. Which parts of the project did you work on, and which parts did your partner program? Did you try pair programming?
    2. How was your partnership? Do you feel you and your partners contributed equally to this project?

Part 2f: Extra credit

You can earn extra credit by extending your calculator in different ways. See the extra credit page for more details.

If you decide to complete any of the extra credit, describe what you implemented at the end of your group writeup.

Deliverables

The following deliverables are due on Friday July 13 at 11:59pm.

Code submission will be the same as homework 1, by pushing and taging your code. Make sure your group writeup is included! Only one partner needs to tag the code, since the repository is the same for both of you. Before turning in your code, run checkstyle and correct any errors it finds. You should be committing your code often to the repository. When you and your partner are done with your code, commit and push it, then tag that commit as SUBMIT. To tag a commit in Eclipse, right-click your project and select "Team > Advanced > Tag", then write SUBMIT in the "Tag Name" field. The "Tag Message" can be whatever you want - it is like a commit message. Then click Create Tag and Start Push. It is very important that you do this instead of just "Create Tag" so that it gets pushed back to Gitlab instead of staying on your machine. If you forget to push the tag, you can do so via "Team > Remotes > Push Tags..."

Both partners should turn in a .txt or .pdf file containing their answers to part 2e on Canvas if you haven't already. This is due at the same time as the rest of your project. Note that if you wish to earn back credit from part one as noted in the syllabus, those updates must be submitted within the final commit described above - we will re-evaluate part 1 based off this submission.