CSE 501: Implementation of Programming Languages
Course Project Information
Your project will consist of extending the Whirlwind
compiler developed here at UW with some interesting analyses and transformations.
The Whirlwind compiler is written in the Cecil
language, and the compiler for Cecil is called Vortex.
Milestone 1 (due start of class on Monday Jan
13th)
Task 1: Form groups
Form a group of 2 or 3 people, and send Craig and Sorin an email
with the names of the people in your group.
Task 2: Experiment with Cecil (to be done individually)
For small experimentation with simple Cecil code, the Cecil evaluator is
a fine option. First, you need to log on to one of the solaris8 machines (monet,
matisse, vangogh, klee, rousseau). Now, you can run the Cecil evaluator by
executing /projects/cecil9/vortex-play/run-cecil. At the Cecil>
prompt, enter expressions you'd like to evaluate. You can invoke
any methods and instantiate any objects from the Cecil standard library.
You also can define your own classes, methods, fields, and variables.
Note, however, that the Cecil evaluator performs no static type checking;
type declarations are allowed, but ignored. Also, the Cecil evaluator's
performance is low. A little additional information is available
in /projects/cecil9/vortex-play/vortex/notes/QUICK-START-CECIL;$VORTEX_HOME
in that file refers to /projects/cecil9/vortex-play/vortex.
You should also look at the following manuals to get yourself familiar
with Cecil:
Task 3: Get a copy of the Vortex tree, run Vortex, and compile a program
in Vortex
The interpreter is useful for experimenting with small Cecil programs,
but for larger programs (like Whirlwind), or for performing static typechecking,
the Vortex compiler is the best tool. Vortex is an optimizing compiler
for Cecil, as well as a number of other object-oriented languages like
Java, C++, Modula-3, and Smalltalk. (The Vortex compiler includes
the functionality of the Cecil evaluator, too, so if you start up Vortex,
you can do all the things at the Vortex> prompt that you could
do at the Cecil> prompt.)
First, each group needs to install a copy of the Vortex tree in a cvs repository,
and then each group member needs to checkout a copy of the Vortex tree from
that repository. Instructions on how to do this are here.
Now that each student has a copy of Vortex, you can start experimenting.
To compile a Cecil program using Vortex, follow these steps (look at the
How
to Use the Vortex Compiler manual for more information, or look at
$VORTEX_HOME/notes/QUICK-START-VORTEX):
- Create one (or more) files with Cecil code in some directory.
Let's say the directory is MYDIR and the main file is called
mytest.cecil .
There is a lot of Cecil code to look at for examples. Some of it is under
$VORTEX_HOME/Cecil/src . You can find all of it under /projects/cecil/vortex-root/vortex/Cecil/src
(a subset of these are also available under $VORTEX_HOME/Cecil/src).
You may want to start by browsing benchmarks/tiny, contrib,
and stdlib (sources for the standard library except the evaluator).
The Vortex compiler source code is in the compiler subdirectory.
The Cecil mode for emacs is here.
- In a shell window, cd to MYDIR .
- Start Vortex by running run-vortex.
- To typecheck your program, type typecheck mytest
(or tc mytest). After doing program editing, you can do
typecheck again to typecheck only the files you changed, or fulltypecheck
to retypecheck everything. Although Vortex does not require you to typecheck
the code before compiling, it's always a good idea to typecheck first,
since it will help you catch errors earlier.
- At the Vortex> prompt, type make. This
will compile your program. You do not need to provide the name of the program
(mytest) anymore, because Vortex remembers it from the typecheck
command. After this, to recompile the program if there are any errors
or when you make modifications, just type make.
- Run your program by typing run at the Vortex>
prompt. Alternatively, you can run the program by going to another shell,
and running the executable MYDIR/gen/mytest. Hope it works!
A good way to debug your program is, well, to use the debugger. Put
breakpoint(); as the last line in your program (or wherever you
want to suspend execution). When this expression is executed, the program
will give you the debugging prompt ( debug> ), and you can
use the full capabilities of the Cecil debugger/evaluator to interact with
your program. In particular, you can type in Cecil expressions and see
their values. Moreover, you can type in (or paste) the methods you want to
add (or replace) to your program, and they will be available to you without
recompiling the program. Similarly, you can add variable declarations. Press
Ctrl-D to exit the debugger (or type help for more options).
The print_line method, which takes a string argument and prints
it, followed by a newline, is another handy, low-tech debugging tool.
The way the evaluator prints out values is by sending them the print_string
message. By convention, this message returns a string that's a printable
representation of its argument, but does not print anything itself. The
debugger prints that string out. So if you define new kinds of objects
in your program, you may also want to define the print_string
methods so they print out nicely. Here's an example:
method print_string(p@:pair[`A,`B]):string {
"(" || p.first.print_string || "," || p.second.print_string || ")" }
Remember that the debug> prompt is provided by your running
program (and this is the place to add new methods and evaluate expressions),
while the Vortex> prompt is printed by the compiler, and
you should only issue compiler commands (like make) at that prompt.
- You can type help at the Vortex prompt to get a list
of commands that you can run.
- You can quit Vortex by typing quit or Ctrl-D at the
Vortex prompt.
- Vortex allows you to save a checkpoint (the state of Vortex)
to a file. Checkpointing is very important! Once Vortex has compiled a
program, it has done a lot of work that you don't want to repeat (for
example parsing the files and compiling them). If you simply quit Vortex,
and then run it again, Vortex will start everything from scratch. So before
you quit Vortex, you should checkpoint the state of the compiler, so that
you can restore it later. The command save will save the state
of the compiler to the file mytest.db, assuming that the program
being compiled is mytest. When you restart vortex later, the
command load mytest.db will restore the state of the compiler.
Because you are sharing the solaris8 machine with others (including graduate
students who are doing their research), you should save and quit vortex
when you are done using it, and then restore the state later when need
to run Vortex again. However, you can keep Vortex running while you debug
and edit your program; you don't need to exit Vortex after each make. Vortex
has incremental compilation: if you change a file in your source program,
and type make or tc, it will reparse/retypecheck/recompile
only those parts that have changed.
- There are several variations on the basic make command. The makeo1
and makeo2 commands set of the optimization level to
1 and 2 respectively. Prepending the letter p to any make command
(for example pmakeo2) will perform the C compilation phase in
parallel on several machines. Type help make for more details.
You should look at the following manual for more details about how Vortex
works.
Task 4: Compile Whirlwind
You are now ready to compile Whirlwind. The source for whirlwind
is located at $VORTEX_HOME/Cecil/src/whirlwind. However,
vortex already has this directory in its source path list, so you can
compile Whirlwind from any directory. Thus pick a directory where you want
your generated C files and object files to be stored for whirlwind. Let's
say you choose /projects/instr/03wi/cse501/cse501a/lerns/objs/whirlwind.
Now run:
% cd /projects/instr/03wi/cse501/cse501a/lerns/objs/whirlwind
% run-vortex
Vortex> tc whirlwind
Vortex> make
You can now run /projects/instr/03wi/cse501/cse501a/lerns/objs/whirlwind/gen/whirlwind.
Task 5: Compile a WIL program using Whirlwind
Browse through some of the code in $VORTEX_HOME/Cecil/src/whirlwind/tests.The
file $VORTEX_HOME/Cecil/src/whirlwind/notes/ir.txt contains
documentation about WIL. Now, compile the WIL files in that directory
by running the following commands:
% cd /projects/instr/03wi/cse501/cse501a/lerns/objs/test
% /projects/instr/03wi/cse501/cse501a/lerns/objs/whirlwind/gen/whirlwind
Whirlwind> makeo2 test
The makeo2 command compiles with optimization level 2. Anything
above 0 will turn optimizations on.
Deliverables for milestone 1
Each student should send an email to Craig and Sorin by 4:30 pm
on Monday Jan 13th confirming that they are fully set-up and that
they were able to run Vortex, compile Whirlwind, and compile the given
Wil file.
Milestone 2 (due Tuesday Jan 21st) To be done individually!
Task 1: Implement a binary search tree in Cecil
Implement in Cecil a binary search tree data structure which stores
an integer at each node. Implement the following functions:
-- Creates an empty binary search tree.
signature create_bin_search_tree():BinSearchTree;
-- Inserts the given integer in the binary search tree, and returns
the resulting binary search tree.
signature insert(BinSearchTree, int):BinSearchTree;
-- Returns whether or not the given integer is in the tree.
signature is_in_tree(BinSearchTree, int):bool;
Test your program out by inserting breakpoint(); at the
end of your program, and running it. When you get to the debug>
prompt, you can type commands to test your binary search tree. Try
at least the following commands:
debug> let t := create_bin_search_tree().insert(10).insert(23).insert(-5).insert(3)
debug> t.is_in_tree(10)
debug> t.is_in_tree(23)
debug> t.is_in_tree(-5)
debug> t.is_in_tree(3)
debug> t.is_in_tree(11)
debug> t.is_in_tree(100)
Task 2: Implement factorial in Wil
Write the factorial function in Wil. You should start with the following
file. To this file you should add:
- A factorial function. You can write it either recursively
or iteratively.
- Some code in main that continually asks the user for a number
n, and prints "n! = ..." until the number n entered by the user is -1.
You should not use any prim statements in the code that you
add. Compile your program using Whirlwind, and test it out.
Deliverables for milestone 2
Each student should send an email to Craig and Sorin by the beginning
of class on Tues. Jan 21st with the following attachments:
- The cecil file containing your binary search tree implementation
- A file containing a transcript of your debug session used to
test your binary search tree implementation.
- The wil file containing your factorial implementation
- A file showing some sample input/output runs for factorial
program
Milestone 3 (due Tuesday Feb 4th)
Task 1: Implement an optimization in Whirlwind
(To be done individually!)
Implement common subexpression elimination in Whirlwind. You should
follow the instructions from the code-update
site to get a version of constant prop that you will start from. Once
you followed the update instructions, you should take a look at the file $VORTEX_HOME/Cecil/src/whirlwind/optimizations/constant-prop-cfg.cecil.
This file implements constant propagation over the CFG. You should use
this as a starting point for implementing your CSE pass.
You should also get familiar with Whirlwind's graph display features.
These are EXTREMELY USEFUL. For each kind of graph in Whirlwind's
IR, there is a boolean flag called show_dot_graphkind and print_graphkind.
For example, for CFGs, you have print_cfgs and show_dot_cfgs.
The print_graphkind flags determine if graphs are printed in text
format, and the show_dot_graphkind flags determine if graphs are
displayed graphically using dot.
By default these flags are false, but when you start Whirlwind you can
set them to true. For example, you can type the following line at the Whirlwind
prompt (or you can add it to $HOME/.whirlwindrc): print_cfgs
true. If print_cfgs is set to true, then each time check_slice
is called on a CFG, the CFG will be printed in text format. If show_dot_cfgs
is true, then each time check_slice is called on a CFG, you will
get a prompt similar to the following:
Show cptest1_test2-CFG-DFG.dot [return=no, d=debugger, 1-9=size,
other=yes]?
Whirlwind is asking if you want to display the CFG for the WIL function
cptest1_test2. For now, you don't have to worry about the last acronym
in the name (DFG here). This last acronym will either be DFG or AST, and
you can ignore it.
You should write a small WIL program, with at least one conditional,
one loop in it, and an opportunity for doing constant prop. Then pass it
through the CFG constant propagation pass that you just installed (this
pass is off by default, so you should turn it on by typing cfg_constant_prop
on at the Whirlwind prompt). Take a look at the dot CFG display and
the CFG text printout before and after the CFG constant prop pass.
When you are done writing the CSE pass, you should test it. Write some
appropriate test cases, and look at the CFG before and after your pass
to make sure your optimizations is doing the right thing.
Task 2: Understand Java WIL (To be done individually!)
Explain in your own words how Java objects are represented in Wil. To
learn how Java is translated into WIL, take a look at the simple Java example
in $VORTEX_HOME/Java/src/example/src. The WIL files generated
for this example by the Java-to-WIL front-end are in /projects/cecil/vortex-root/vortex/Java/src/example/wil
(just take a look at the *.wil files in this directory). Don't forget to
also look at $VORTEX_HOME/Cecil/src/whirlwind/notes/ir.txt for
documentation on the WIL intermediate language.
Task 3: Write a project proposal
Write a one-page project proposal. You can pick from one of the
following suggestions or come up with your own project.
- Error detection tool that tries to find finite state machine
(FSM) protocol violations. This requires you to do the following:
- Identify the entities you want to track. You can do this
by implementing an unification based pointer analysis to identify locations
to track (see papers below). Or you can create your own interprocedural
reaching defs analysis to determine which creation points reach which uses.
- write an analysis that pushes an FSM for each entity you
are tracking.
- write a transformation pass that inserts run-time checks
when the analysis indicates that there may be an error.
- Papers to look at for more information. The first two papers describe
research projects that check for FSM protocol violations. The third paper
describes a unification based pointer analysis that you can use to identify
locations to track.
ESP: Path-Sensitive Program Verification in
Polynomial Time. Manuvir Das, Sorin Lerner and Mark Seigle (PLDI 2002)
Automatically Validating Temporal Safety
Properties of Interfaces. Thomas Ball and Sriram K. Rajamani (SPIN 2001)
Points-to
Analysis in Almost Linear Time. B. Steensgaard (POPL 96)
- Constraint based class analysis and constant propagation. This
requires you to do the following:
- write a general constraint solver.
- use the constraint solver to implement class analysis.
- use the results of class analysis to build a call graph.
- use the call graph and the constraint solver to implement
a constraint-based inter-procedural constant propagation.
- Papers to look at for more information. The first paper is an
introduction to set constraint-based program analysis. The second describes
efficient ways of solving constraints.
Introduction
to Set Constraint-Based Program Analysis. A. Aiken (Science of Computer
Programming, 35(1999):79-111)
Partial
Online Cycle Elimination in Inclusion Constraint Graphs. M. Faehndrich,
J. Foster, Z. Su and A. Aiken (PLDI 98)
- Guidelines for your own project:
- must perform some transformations.
- must have some sort of inter-procedural analysis.
Deliverables for Milestone 3
By the beginning of class on Feb 4th:
- Each student should email Craig and Sorin their file which implements
CSE. Also, each student should hand-in a print-out of the CSE code, and
their descriptions of Java WIL.
- One person from each group should handin a project proposal.
Milestone 4 (due Tuesday Feb 25th)
Each group should hand-in a one page summary of what you've been up to.
You should mention in your summary what papers you've read, what you understood
of them, and how the implementation is going along. Also, in general, you
should not hesitate to ask questions. If you encounter problems with the implementation,
or have trouble understanding a paper, you should ask for help. You should
not waste hours trying to understand a paper or figuring out how to do XYZ
in Cecil, when with some help it would take you much less time. On the other
hand, coming to us for help should not be a substitute for doing the work
yourself. In other words, please read the papers/documentation carefully
before asking questions...
Each group should also meet with Sorin and/or Craig to discuss your progress.
Milestone 4.5 (to be done by Tuesday March 4th)
Each group should meet with Sorin to discuss your progress.
Milestone 5 (to be done by Tuesday March 11th)
By March 11th, you should setup a meeting time somewhere in the week of March
11th with Craig and Sorin to demo your compiler.
Milestone 6 (due March 18th)
Write a final report. The report should not be longer than 10 pages
single spaced. The final report should contain at least the
following (these are listed here in no particular order):
- A description of the problem you are trying to solve.
- A decsription of how you are solving the problem. You should describe
the analyses you have designed, and the algorithms you have implemented.
This description should not talk about the details of implementing your
algorithms in Whirlwind. Also,
someone who knows what was taught in class should understand your algorithms
without having to read other papers. This means that you can't just say
"look at paper blah, that's what we did". This will require you
to reexaplain some things that you've read from papers, and it will
be a good test to see if you understood what you read.
- An overview of how your algorithms are implemented in Whirlwind. You'll have to strike the right balance between detail and readability. Putting a printout of your code in your report is not appropriate.
Saying "we implemented this in Whirlwind by adding a module that does the analysis" is not appropriate either. You should give enough detail so that someone
who knows Whirlwind and understands your algorithm can go and implement it
in Whirlwind, and get a very similar implementation to yours.
- Some examples showing how your analysis runs.
- A description of the cases that your analysis does not handle.
From your demos today, I know that you were not able to handle all of the cases. So make it clear what you don't handle.
- A description of where your analysis looses precision, with some examples.
- A description of future work, including (but not limited to)
how to scale to large programs and how to make your analysis more precise.