CSE 501: Implementation of Programming Languages

Course Project Information

Your project will consist of extending the Whirlwind compiler developed here at UW with some interesting analyses and transformations. The Whirlwind compiler is written in the Cecil language, and the compiler for Cecil is called Vortex.

Milestone 1 (due start of class on Monday Jan 13th)

Task 1: Form groups

Form a group of 2 or 3 people, and send Craig and Sorin an email with the names of the people in your group.

Task 2: Experiment with Cecil (to be done individually)

For small experimentation with simple Cecil code, the Cecil evaluator is a fine option. First, you need to log on to one of the solaris8 machines (monet, matisse, vangogh, klee, rousseau). Now, you can run the Cecil evaluator by executing /projects/cecil9/vortex-play/run-cecil. At the Cecil> prompt, enter expressions you'd like to evaluate. You can invoke any methods and instantiate any objects from the Cecil standard library. You also can define your own classes, methods, fields, and variables. Note, however, that the Cecil evaluator performs no static type checking; type declarations are allowed, but ignored. Also, the Cecil evaluator's performance is low. A little additional information is available in /projects/cecil9/vortex-play/vortex/notes/QUICK-START-CECIL;$VORTEX_HOME in that file refers to /projects/cecil9/vortex-play/vortex.

You should also look at the following manuals to get yourself familiar with Cecil:

The Cecil language reference manual: HTML,postscript,gzipped postscript
The Cecil standard library reference manual: HTML,postscript,gzipped postscript
A quickie Cecil tutorial is available here.

Task 3: Get a copy of the Vortex tree, run Vortex, and compile a program in Vortex

The interpreter is useful for experimenting with small Cecil programs, but for larger programs (like Whirlwind), or for performing static typechecking, the Vortex compiler is the best tool. Vortex is an optimizing compiler for Cecil, as well as a number of other object-oriented languages like Java, C++, Modula-3, and Smalltalk. (The Vortex compiler includes the functionality of the Cecil evaluator, too, so if you start up Vortex, you can do all the things at the Vortex> prompt that you could do at the Cecil> prompt.)

First, each group needs to install a copy of the Vortex tree in a cvs repository, and then each group member needs to checkout a copy of the Vortex tree from that repository. Instructions on how to do this are here.

Now that each student has a copy of Vortex, you can start experimenting. To compile a Cecil program using Vortex, follow these steps (look at the How to Use the Vortex Compiler manual for more information, or look at $VORTEX_HOME/notes/QUICK-START-VORTEX):

Create one (or more) files with Cecil code in some directory. Let's say the directory is MYDIR and the main file is called mytest.cecil .

There is a lot of Cecil code to look at for examples. Some of it is under $VORTEX_HOME/Cecil/src . You can find all of it under /projects/cecil/vortex-root/vortex/Cecil/src (a subset of these are also available under $VORTEX_HOME/Cecil/src). You may want to start by browsing benchmarks/tiny, contrib, and stdlib (sources for the standard library except the evaluator). The Vortex compiler source code is in the compiler subdirectory. The Cecil mode for emacs is here.

In a shell window, cd to MYDIR .
Start Vortex by running run-vortex.
To typecheck your program, type typecheck mytest (or tc mytest). After doing program editing, you can do typecheck again to typecheck only the files you changed, or fulltypecheck to retypecheck everything. Although Vortex does not require you to typecheck the code before compiling, it's always a good idea to typecheck first, since it will help you catch errors earlier.
At the Vortex> prompt, type make. This will compile your program. You do not need to provide the name of the program (mytest) anymore, because Vortex remembers it from the typecheck command. After this, to recompile the program if there are any errors or when you make modifications, just type make.
Run your program by typing run at the Vortex> prompt. Alternatively, you can run the program by going to another shell, and running the executable MYDIR/gen/mytest. Hope it works!

A good way to debug your program is, well, to use the debugger. Put breakpoint(); as the last line in your program (or wherever you want to suspend execution). When this expression is executed, the program will give you the debugging prompt ( debug> ), and you can use the full capabilities of the Cecil debugger/evaluator to interact with your program. In particular, you can type in Cecil expressions and see their values. Moreover, you can type in (or paste) the methods you want to add (or replace) to your program, and they will be available to you without recompiling the program. Similarly, you can add variable declarations. Press Ctrl-D to exit the debugger (or type help for more options). The print_line method, which takes a string argument and prints it, followed by a newline, is another handy, low-tech debugging tool.

The way the evaluator prints out values is by sending them the print_string message. By convention, this message returns a string that's a printable representation of its argument, but does not print anything itself. The debugger prints that string out. So if you define new kinds of objects in your program, you may also want to define the print_string methods so they print out nicely. Here's an example:

method print_string(p@:pair[`A,`B]):string {
 "(" || p.first.print_string || "," || p.second.print_string || ")" }

Remember that the debug> prompt is provided by your running program (and this is the place to add new methods and evaluate expressions), while the Vortex> prompt is printed by the compiler, and you should only issue compiler commands (like make) at that prompt.

You can type help at the Vortex prompt to get a list of commands that you can run.
You can quit Vortex by typing quit or Ctrl-D at the Vortex prompt.
Vortex allows you to save a checkpoint (the state of Vortex) to a file. Checkpointing is very important! Once Vortex has compiled a program, it has done a lot of work that you don't want to repeat (for example parsing the files and compiling them). If you simply quit Vortex, and then run it again, Vortex will start everything from scratch. So before you quit Vortex, you should checkpoint the state of the compiler, so that you can restore it later. The command save will save the state of the compiler to the file mytest.db, assuming that the program being compiled is mytest. When you restart vortex later, the command load mytest.db will restore the state of the compiler. Because you are sharing the solaris8 machine with others (including graduate students who are doing their research), you should save and quit vortex when you are done using it, and then restore the state later when need to run Vortex again. However, you can keep Vortex running while you debug and edit your program; you don't need to exit Vortex after each make. Vortex has incremental compilation: if you change a file in your source program, and type make or tc, it will reparse/retypecheck/recompile only those parts that have changed.
There are several variations on the basic make command. The makeo1 and makeo2 commands set of the optimization level to 1 and 2 respectively. Prepending the letter p to any make command (for example pmakeo2) will perform the C compilation phase in parallel on several machines. Type help make for more details.

You should look at the following manual for more details about how Vortex works.

"How to Use the Vortex Compiler": HTML,postscript,gzipped postscript

Task 4: Compile Whirlwind

You are now ready to compile Whirlwind. The source for whirlwind is located at $VORTEX_HOME/Cecil/src/whirlwind. However, vortex already has this directory in its source path list, so you can compile Whirlwind from any directory. Thus pick a directory where you want your generated C files and object files to be stored for whirlwind. Let's say you choose /projects/instr/03wi/cse501/cse501a/lerns/objs/whirlwind. Now run:

% cd /projects/instr/03wi/cse501/cse501a/lerns/objs/whirlwind % run-vortex Vortex> tc whirlwind Vortex> make

You can now run /projects/instr/03wi/cse501/cse501a/lerns/objs/whirlwind/gen/whirlwind.

Task 5: Compile a WIL program using Whirlwind

Browse through some of the code in $VORTEX_HOME/Cecil/src/whirlwind/tests.The file $VORTEX_HOME/Cecil/src/whirlwind/notes/ir.txt contains documentation about WIL. Now, compile the WIL files in that directory by running the following commands:

% cd /projects/instr/03wi/cse501/cse501a/lerns/objs/test
% /projects/instr/03wi/cse501/cse501a/lerns/objs/whirlwind/gen/whirlwind
Whirlwind> makeo2 test

The makeo2 command compiles with optimization level 2. Anything above 0 will turn optimizations on.

Deliverables for milestone 1

Each student should send an email to Craig and Sorin by 4:30 pm on Monday Jan 13th confirming that they are fully set-up and that they were able to run Vortex, compile Whirlwind, and compile the given Wil file.

Milestone 2 (due Tuesday Jan 21st) To be done individually!

Task 1: Implement a binary search tree in Cecil

Implement in Cecil a binary search tree data structure which stores an integer at each node. Implement the following functions:

-- Creates an empty binary search tree. signature create_bin_search_tree():BinSearchTree;-- Inserts the given integer in the binary search tree, and returns the resulting binary search tree. signature insert(BinSearchTree, int):BinSearchTree;-- Returns whether or not the given integer is in the tree. signature is_in_tree(BinSearchTree, int):bool;

Test your program out by inserting breakpoint(); at the end of your program, and running it. When you get to the debug> prompt, you can type commands to test your binary search tree. Try at least the following commands:

debug> let t := create_bin_search_tree().insert(10).insert(23).insert(-5).insert(3)debug> t.is_in_tree(10)debug> t.is_in_tree(23)debug> t.is_in_tree(-5)
debug> t.is_in_tree(3)debug> t.is_in_tree(11)
debug> t.is_in_tree(100)

Task 2: Implement factorial in Wil

Write the factorial function in Wil. You should start with the following file. To this file you should add:

A factorial function. You can write it either recursively or iteratively.
Some code in main that continually asks the user for a number n, and prints "n! = ..." until the number n entered by the user is -1.

You should not use any prim statements in the code that you add. Compile your program using Whirlwind, and test it out.

Deliverables for milestone 2

Each student should send an email to Craig and Sorin by the beginning of class on Tues. Jan 21st with the following attachments:

The cecil file containing your binary search tree implementation
A file containing a transcript of your debug session used to test your binary search tree implementation.
The wil file containing your factorial implementation
A file showing some sample input/output runs for factorial program

Milestone 3 (due Tuesday Feb 4th)

Task 1: Implement an optimization in Whirlwind (To be done individually!)

Implement common subexpression elimination in Whirlwind. You should follow the instructions from the code-update site to get a version of constant prop that you will start from. Once you followed the update instructions, you should take a look at the file $VORTEX_HOME/Cecil/src/whirlwind/optimizations/constant-prop-cfg.cecil. This file implements constant propagation over the CFG. You should use this as a starting point for implementing your CSE pass.

You should also get familiar with Whirlwind's graph display features. These are EXTREMELY USEFUL. For each kind of graph in Whirlwind's IR, there is a boolean flag called show_dot_graphkind and print_graphkind. For example, for CFGs, you have print_cfgs and show_dot_cfgs. The print_graphkind flags determine if graphs are printed in text format, and the show_dot_graphkind flags determine if graphs are displayed graphically using dot.

By default these flags are false, but when you start Whirlwind you can set them to true. For example, you can type the following line at the Whirlwind prompt (or you can add it to $HOME/.whirlwindrc): print_cfgs true. If print_cfgs is set to true, then each time check_slice is called on a CFG, the CFG will be printed in text format. If show_dot_cfgs is true, then each time check_slice is called on a CFG, you will get a prompt similar to the following:

Show cptest1_test2-CFG-DFG.dot [return=no, d=debugger, 1-9=size, other=yes]?

Whirlwind is asking if you want to display the CFG for the WIL function cptest1_test2. For now, you don't have to worry about the last acronym in the name (DFG here). This last acronym will either be DFG or AST, and you can ignore it.

You should write a small WIL program, with at least one conditional, one loop in it, and an opportunity for doing constant prop. Then pass it through the CFG constant propagation pass that you just installed (this pass is off by default, so you should turn it on by typing cfg_constant_prop on at the Whirlwind prompt). Take a look at the dot CFG display and the CFG text printout before and after the CFG constant prop pass.

When you are done writing the CSE pass, you should test it. Write some appropriate test cases, and look at the CFG before and after your pass to make sure your optimizations is doing the right thing.

Task 2: Understand Java WIL (To be done individually!)

Explain in your own words how Java objects are represented in Wil. To learn how Java is translated into WIL, take a look at the simple Java example in $VORTEX_HOME/Java/src/example/src. The WIL files generated for this example by the Java-to-WIL front-end are in /projects/cecil/vortex-root/vortex/Java/src/example/wil (just take a look at the *.wil files in this directory). Don't forget to also look at $VORTEX_HOME/Cecil/src/whirlwind/notes/ir.txt for documentation on the WIL intermediate language.

Task 3: Write a project proposal

Write a one-page project proposal. You can pick from one of the following suggestions or come up with your own project.

Error detection tool that tries to find finite state machine (FSM) protocol violations. This requires you to do the following:

Identify the entities you want to track. You can do this by implementing an unification based pointer analysis to identify locations to track (see papers below). Or you can create your own interprocedural reaching defs analysis to determine which creation points reach which uses.

write an analysis that pushes an FSM for each entity you are tracking.

write a transformation pass that inserts run-time checks when the analysis indicates that there may be an error.

Papers to look at for more information. The first two papers describe research projects that check for FSM protocol violations. The third paper describes a unification based pointer analysis that you can use to identify locations to track.

ESP: Path-Sensitive Program Verification in Polynomial Time. Manuvir Das, Sorin Lerner and Mark Seigle (PLDI 2002)
Automatically Validating Temporal Safety Properties of Interfaces. Thomas Ball and Sriram K. Rajamani (SPIN 2001)
Points-to Analysis in Almost Linear Time. B. Steensgaard (POPL 96)

Constraint based class analysis and constant propagation. This requires you to do the following:

write a general constraint solver.

use the constraint solver to implement class analysis.

use the results of class analysis to build a call graph.

use the call graph and the constraint solver to implement a constraint-based inter-procedural constant propagation.

Papers to look at for more information. The first paper is an introduction to set constraint-based program analysis. The second describes efficient ways of solving constraints.

Introduction to Set Constraint-Based Program Analysis. A. Aiken (Science of Computer Programming, 35(1999):79-111)
Partial Online Cycle Elimination in Inclusion Constraint Graphs. M. Faehndrich, J. Foster, Z. Su and A. Aiken (PLDI 98)

Guidelines for your own project:

must perform some transformations.

must have some sort of inter-procedural analysis.

Deliverables for Milestone 3

By the beginning of class on Feb 4th:

Each student should email Craig and Sorin their file which implements CSE. Also, each student should hand-in a print-out of the CSE code, and their descriptions of Java WIL.
One person from each group should handin a project proposal.

Milestone 4 (due Tuesday Feb 25th)

Each group should hand-in a one page summary of what you've been up to. You should mention in your summary what papers you've read, what you understood of them, and how the implementation is going along. Also, in general, you should not hesitate to ask questions. If you encounter problems with the implementation, or have trouble understanding a paper, you should ask for help. You should not waste hours trying to understand a paper or figuring out how to do XYZ in Cecil, when with some help it would take you much less time. On the other hand, coming to us for help should not be a substitute for doing the work yourself. In other words, please read the papers/documentation carefully before asking questions...

Each group should also meet with Sorin and/or Craig to discuss your progress.

Milestone 4.5 (to be done by Tuesday March 4th)

Each group should meet with Sorin to discuss your progress.

Milestone 5 (to be done by Tuesday March 11th)

By March 11th, you should setup a meeting time somewhere in the week of March 11th with Craig and Sorin to demo your compiler.

Milestone 6 (due March 18th)

Write a final report. The report should not be longer than 10 pages single spaced. The final report should contain at least the following (these are listed here in no particular order):

A description of the problem you are trying to solve.
A decsription of how you are solving the problem. You should describe the analyses you have designed, and the algorithms you have implemented. This description should not talk about the details of implementing your algorithms in Whirlwind. Also, someone who knows what was taught in class should understand your algorithms without having to read other papers. This means that you can't just say "look at paper blah, that's what we did". This will require you to reexaplain some things that you've read from papers, and it will be a good test to see if you understood what you read.
An overview of how your algorithms are implemented in Whirlwind. You'll have to strike the right balance between detail and readability. Putting a printout of your code in your report is not appropriate. Saying "we implemented this in Whirlwind by adding a module that does the analysis" is not appropriate either. You should give enough detail so that someone who knows Whirlwind and understands your algorithm can go and implement it in Whirlwind, and get a very similar implementation to yours.
Some examples showing how your analysis runs.
A description of the cases that your analysis does not handle. From your demos today, I know that you were not able to handle all of the cases. So make it clear what you don't handle.
A description of where your analysis looses precision, with some examples.
A description of future work, including (but not limited to) how to scale to large programs and how to make your analysis more precise.