501 is a class that focuses on the techniques that can be used to understand programs. These techniques serve as the foundations of a diversity of tools, from compilers to program analyzers, bug finders, malware detectors, etc. It is not a class that focuses on building a classical compiler from start to finish. Instead, the goal of the class is to supply students with a toolset that they can use for their own research, whenever they need to build tools that can understand programs.
The class is a project-based one where students come up with their project ideas that they work on throughout the quarter. We will have a few initial assignments to familarize students with available tools for analyzing programs. More information about the project will be provided later in the quarter.
We will discuss classical and recent research papers during each class meeting. We expect you to have read the paper before each meeting and be ready to participate in the discussion.
There are no formal prerequisites for the course. However, we expect you to have an understanding of data structures and mathematical logic, and know how to program. You do not have to have already taken a compiler or a programming languages class. However, undergraduates are recommended to do so.
Assignments: 30%
Project: 50%
Paper commentaries and class participation: 20%
As this is a graduate class,
the class will be A-centered, but students are not guaranteed a grade of A.
We will discuss 1-2 papers during each lecture. To help you prepare, you will write a one-paragraph commentary on each paper, and submit it at least 24 hours before the class meets. We will post all summaries before class online. The commentary should reflect your understanding and analysis of the issues raised by the paper, and should also help direct (both your and others') preparation for in-class discussion.
One suggestion for the format for the commentary is to critique the paper, listing the following three points: its biggest contribution (and, briefly, why that result was not already obvious), its biggest mistake (in motivation, methodology, algorithm, data analysis, conclusions, or some other area), and the biggest question that it raises (or the most interesting and important follow-on work that it suggests). Another acceptable format is to summarize the paper, describing its thesis, approach, and conclusions, and stating why it is significant. The commentary should also list questions that you have about the paper, such as about technical points or connections to related work.
We encourage you to use the commentaries to ask questions or comment on those posted by others. If you have a question, it is likely that many other people have the same question but are too shy (or vain, or insecure) to ask it; they will appreciate your raising the point. However, do come to class prepared: carefully read the paper and get as much as you can out of it on your own. Doing so will make the class time that much more productive.
You will have access to all the other students’ submissions. Please read them, because reading the other summaries is a good way for you to get perspective. You can see what you missed (or what other people missed), and whether you agree with their take on the key ideas. It will help to make the class sessions more productive.
To prepare students for conference presentations / quals talk / thesis defense, all students should expect to present one of the application papers during the quarter, and lead the in-class discussion. One way is to structure the discussion as a paper review seeking to answer the following questions:
Date | Topic | Reading | Slides |
---|---|---|---|
March 31 | Introduction | lecture 1 slides | |
Dataflow and Abstract Interpretation |
|||
April 2 | Dataflow frameworks 1 |
|
lecture 2 slides |
April 7 | Dataflow frameworks 2 |
|
lecture 3 slides |
April 9 | Abstract interpretation |
|
lecture 4 slides |
April 13 | Project Proposal due | ||
April 14 | Interlude: Introduction to pointer analysis |
|
lecture 5 slides |
April 16 | Applications: pointer and shape analysis |
|
lecture 6 slides |
April 21 | Applications: information flow |
|
lecture 7 slides |
April 23 | Homework help session | ||
April 28 | Applications: phone malware |
|
lecture 8 slides (FlowDroid) lecture 8 slides (Apposcopy) |
Language Issues |
|||
April 30 | Languages for high-performance computing |
|
lecture 9 slides |
May 5 | Languages for parallelization |
|
lecture 10 slides |
May 7 | Dynamic languages |
|
lecture 11 slides |
Program Verification |
|||
May 12 | Axiomatic semantics |
|
lecture 12 slides |
May 12 | Project Milestone Report due | ||
May 14 | Applications: Finding invariants |
|
lecture 13 slides |
May 19 | Applications: Verified compilers |
|
|
Dynamic Analysis |
|||
May 21 | Model checking |
|
lecture 15 slides |
May 26 | Applications: concolic testing |
|
lecture 16 slides (DART) lecture 16 slides (KLEE) |
(New Ways of) Constructing Compilers |
|||
May 28 | Superoptimization |
|
lecture 17 slides |
June 2 | Synthesis enabled translation |
|
lecture 18 slides |
June 4 | Project presentations | ||
June 9 | Project Final Report due |