CSE 413 Spring 2000

Introduction to Compilers

Programming Language Implementation Techniques

Why Study Compilers?

A (very short) History

Basic Structure of a Compiler

Front end - analysis

Back End - synthesis

There are many choices for how to structure a compiler.  Many commercial systems have a suite of front ends (C/C++, Fortran, Pascal, Java, ...), a common middle end (optimization and transformations that are independent of the source language and machine), and multiple back ends (code generators for different targets - x86, MIPS, SPARC, PowerPC, Java VM, ...)

Interfaces between phases:  Variety.  Some may be function calls, others actual data structures giving explicit intermediate representation (streams, abstract syntax trees, etc.).

Example

What we have:

     if (x > y) 
        y = 42 * x; 
     else 
        x = f(x+17);

What we want:

     load  x     ; jump to L1 if x <= y
     cmp   y
     ble   L1
     load  42     ; y = 42 * x;
     mul   x
     store y
     jmp   l2
 L1: load  x     ; x = f(x+17)
     add   17
     push
     call  f
     store x
 L2:

How do we get from here to there?  The first task is to analyze the input and reconstruct the program structure.

Lexical analysis: Break the program into a token stream, hiding the details of reading the input and source program layout, and discarding whitespace.  This is often called scanning.  Result:

   IF LPAREN ID(x) GT ID(y) RPAREN ID(y) GETS INT(42) 
   TIMES ID(x) SCOLON ELSE ID(x) GETS ID(f) LPAREN ID(x)
   PLUS INT(17) RPAREN SCOLON EOF

Parsing: Read the token stream and analyze the underlying grammatical structure of the input.  Most production compilers build explicit parse trees.  For the project in CSE 413, we won't build the parse tree explicitly, but we will still analyze the phrase structure of the program.  

Usual interface with between parser and scanner: parser calls method in the scanner when it needs to read the next token.

When we've gotten this far, we can generate code based on the information discovered during the analysis.

References

There are many books about programming languages and compilers.  Here are a couple you might find useful.

Compilers: Principles, Techniques, and Tools, Alfred Aho, Ravi Sethi, and Jeffrey Ullman; Addison-Wesley, 1986 (corrected printing 1988).  This has been the standard text for compiler courses for many years.  Although it's missing recent developments in the field, it is still a very solid introduction.  Chapters 1 & 2 cover most of what we will see in CSE 413.

Programming Language Pragmatics, Michael L. Scott; Morgan Kaufmann, 2000.  This would be an ideal textbook for CSE 413, but it's expensive, and covers far more, and in more depth, than we can possibly manage in 10 weeks.  Good source for supplemental reading though.