CSE590dg, Winter 2004

CSE590dg Language-Based Techniques for Improving C-Level Software Quality

Meetings: TTh 12:00-1:20, MEB 245
Instructor: Dan Grossman
(careful: grossman@cs goes to a different person)
Allen Center 556

This course surveys recent language-based approaches for finding software defects that are endemic to programs written in C. Emphasis is on techniques that leverage language implementations and type-checking and that (a) prove the absence of certain errors and (b) do not (necessarily) treat C as though it were a higher-level language. Students design and complete projects aimed at finding errors in C programs automatically. The course should have plenty to offer both programming-language students and students who write C-level systems (operating systems, networking code, embedded systems, etc.); the latter are particularly encouraged to attend.

Evolving Schedule:

January 6: The C level of abstraction and approaches to safety
January 8: Catalog of "implementation defined" behavior in C and tools for analyzing C-level code
January 20: Type casts, approaches to implementing parametric polymorphism
January 22: C-level parametric polymorphism, memory kinds
January 27: Existential types, order-of-evaluation
January 29: More order-of-evaluation, dangling pointers
February 3: Conservative garbage collection
February 5: Michael Hicks guest lecture (Cyclone memory management)
February 10: Lexically scoped regions
February 12: C-level LIFO regions; avoiding explicit effects
February 17: Subtyping (layout, nullability, and const)
February 19: Daniel Weise guest lecture (How to Get Annotations and Specifications into Industrial Code: Three easy lessons)
February 26: Software fault isolation, program shepherding, and type homogeneity
March 2: Static bug-finding
March 4: CSSV (static string checking)
March 9: Limiting aliasing: restrict and uniqueness
March 11: Project presentations and wrap-up

Here is some useful information for course participants. (Access restricted to UWCSE for the time being.)

Probable topics (subject to modification based on participants' interests):

Introduction: Why is the C level important? Why is it so hard to write safe programs? How do hacks work? What automatic techniques can detect errors? How should we judge techniques?
Type casts, unions, and polymorphism
Memory management
NULL pointers
Uninitialized memory
Aliases
Under-specified evaluation order
Out-of-bounds array indexing
NUL-terminated strings
Multithreading
User-defined properties

Notice that the topics are organized by error (problem) rather than by automated approach (solution). For each problem, there may be solutions based on:

Run-time checking
Testing
Dataflow analysis
Types
Ad hoc bug-detection and statistical techniques

We can judge approaches by what they guarantee (what errors they detect and how programs behave), when they provide a guarantee, how much effort they require from programmers, what assumptions they make, and how they interact with other solutions.
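
As a small illustration of pairing one problem with one solution, the sketch below applies run-time checking to out-of-bounds array indexing. It is not taken from the course materials; the names checked_array and checked_get are hypothetical, and the design (a pointer bundled with its length, checked on every access) is only one assumed way to do it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "fat" array: a pointer bundled with its element count,
 * so that every access can be checked against the bound. */
typedef struct {
    int   *data;
    size_t len;
} checked_array;

/* Copy element i into *out and return true.
 * Return false, leaving *out untouched, if any pointer is NULL
 * or if i is outside [0, len). */
static bool checked_get(const checked_array *a, size_t i, int *out)
{
    if (a == NULL || a->data == NULL || out == NULL || i >= a->len)
        return false;
    *out = a->data[i];
    return true;
}
```

Judged by the criteria above, this sketch catches an out-of-bounds read only at run time, costs one comparison per access, and relies on programmers routing every access through checked_get and keeping len accurate.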

Last updated: 10 February 2003