501 is a graduate level, project based class in programming languages implementation. This quarter, we will study techniques and algorithms that can be used to understand programs. We will also look into applications of these techniques across a wide-variety of fields, from classical program optimization to security and utilizing novel hardware.

Administrative Information

Instructor: Alvin Cheung, office hours: by appointment.
Course Assistant: Andre Baixo, office hours: [TBD]
Contact: cse501-staff@cs.washington.edu
Class meetings: Tuesdays & Thursdays 11:00am-12:20pm, EEB 025.

Course Information

501 is a class that focuses on the techniques that can be used to understand programs. These techniques serve as the foundations of a diversity of tools, from compilers to program analyzers, bug finders, malware detectors, etc. It is not a class that focuses on building a classical compiler from start to finish. Instead, the goal of the class is to supply students with a toolset that they can use for their own research, whenever they need to build tools that can understand programs.

Format

The class is a project-based one where students come up with their project ideas that they work on throughout the quarter. We will have a few initial assignments to familarize students with available tools for analyzing programs. More information about the project will be provided later in the quarter.

We will discuss classical and recent research papers during each class meeting. We expect you to have read the paper before each meeting and be ready to participate in the discussion.

Prerequisites

There are no formal prerequisites for the course. However, we expect you to have an understanding of data structures and mathematical logic, and know how to program. You do not have to have already taken a compiler or a programming languages class. However, undergraduates are recommended to do so.

Optional references

  • Nielson, Nielson, and Hankin, Principles of Program Analysis, Springer, 2004.
  • Muchnick, Advanced Compiler Design and Implementation, Morgan Kaufmann, 1997.
  • Allen and Kennedy, Optimizing Compilers for Modern Architectures, Morgan Kaufmann, 2001.
  • Aho, Ullman, Lam, and Sethi, Compilers: Principles, Techniques, and Tools (2nd ed.), Addison Wesley, 2006 (aka the dragon book).

Grading

Assignments: 30%
Project: 50%
Paper commentaries and class participation: 20%
As this is a graduate class, the class will be A-centered, but students are not guaranteed a grade of A.

Paper reading and commentaries (with writing suggestions from Prof. Ernst)

We will discuss 1-2 papers during each lecture. To help you prepare, you will write a one-paragraph commentary on each paper, and submit it at least 24 hours before the class meets. We will post all summaries before class online. The commentary should reflect your understanding and analysis of the issues raised by the paper, and should also help direct (both your and others') preparation for in-class discussion.

One suggestion for the format for the commentary is to critique the paper, listing the following three points: its biggest contribution (and, briefly, why that result was not already obvious), its biggest mistake (in motivation, methodology, algorithm, data analysis, conclusions, or some other area), and the biggest question that it raises (or the most interesting and important follow-on work that it suggests). Another acceptable format is to summarize the paper, describing its thesis, approach, and conclusions, and stating why it is significant. The commentary should also list questions that you have about the paper, such as about technical points or connections to related work.

We encourage you to use the commentaries to ask questions or comment on those posted by others. If you have a question, it is likely that many other people have the same question but are too shy (or vain, or insecure) to ask it; they will appreciate your raising the point. However, do come to class prepared: carefully read the paper and get as much as you can out of it on your own. Doing so will make the class time that much more productive.

You will have access to all the other students’ submissions. Please read them, because reading the other summaries is a good way for you to get perspective. You can see what you missed (or what other people missed), and whether you agree with their take on the key ideas. It will help to make the class sessions more productive.

Application paper presentations

To prepare students for conference presentations / quals talk / thesis defense, all students should expect to present one of the application papers during the quarter, and lead the in-class discussion. One way is to structure the discussion as a paper review seeking to answer the following questions:

  • Motivate the paper: what is problem being tackled, and why is it important?
  • What has been previously done (if any) to solve the problem? Why is it not solved?
  • What is the proposed solution? How is it described in the paper? Does it leverage existing techniques or invent new ones? How are they significant?
  • What are the limitations of the techniques proposed? Did the authors address them?
  • How was the proposed solution evaluated? Was it sufficient?
  • Does the paper open up new areas of research? Does it solve the problem for once and for all? What will happen if the proposed solution is widely adopted (if not so already)?
Some suggestion for doing paper reviews: Please sign up for paper presentations here (up to two students per paper). Once you signed up, email the instructor a few days before your slot to set up a time to discuss your slides.

Schedule

Date Topic Reading Slides
March 31 Introduction   lecture 1 slides

Dataflow and Abstract Interpretation

April 2 Dataflow frameworks 1 lecture 2 slides
April 7 Dataflow frameworks 2 lecture 3 slides
April 9 Abstract interpretation lecture 4 slides
April 13 Project Proposal due    
April 14 Interlude: Introduction to pointer analysis lecture 5 slides
April 16 Applications: pointer and shape analysis lecture 6 slides
April 21 Applications: information flow lecture 7 slides
April 23 Homework help session    
April 28 Applications: phone malware lecture 8 slides (FlowDroid)
lecture 8 slides (Apposcopy)

Language Issues

April 30 Languages for high-performance computing lecture 9 slides
May 5 Languages for parallelization lecture 10 slides
May 7 Dynamic languages lecture 11 slides

Program Verification

May 12 Axiomatic semantics lecture 12 slides
May 12 Project Milestone Report due    
May 14 Applications: Finding invariants lecture 13 slides
May 19 Applications: Verified compilers  

Dynamic Analysis

May 21 Model checking lecture 15 slides
May 26 Applications: concolic testing lecture 16 slides (DART)
lecture 16 slides (KLEE)

(New Ways of) Constructing Compilers

May 28 Superoptimization For those who are not familiar with the structure of classical optimizing compilers here are some references: lecture 17 slides
June 2 Synthesis enabled translation Introduction to synthesis for those who are interested: lecture 18 slides
June 4 Project presentations    
June 9 Project Final Report due    

Assignments