This was the first of 2 midterms I gave in FQ89. I was unable to find the questions; all I had were the answers. So I played exam "Jeopardy", and (re)wrote the questions today.

Q: What is the strategy phase of a compiler?

A: The strategy phase is a tree to tree translator responsible for converting source IR trees (closely modelling the statements and expressions in the bodies of procedures) into target IR trees. The target IR trees are at a lower level than the source IR trees, and they expose all computations, including control flow, that are hidden or implicit in the source trees.
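A minimal sketch of such a lowering (the node shapes and tuple opcodes are hypothetical, not from any particular compiler): a source-level "while" tree, whose control flow is implicit, becomes a flat sequence of labels and jumps.

```python
# Hypothetical source IR node: while cond do body
class While:
    def __init__(self, cond, body):
        self.cond, self.body = cond, body

counter = 0
def new_label():
    global counter
    counter += 1
    return f"L{counter}"

def lower(node):
    """Translate a source IR tree into a flat list of target IR tuples."""
    if isinstance(node, While):
        top, out = new_label(), new_label()
        return ([("label", top),
                 ("jump_if_false", node.cond, out)]
                + lower(node.body)
                + [("jump", top), ("label", out)])
    return [node]                    # leaves pass through unchanged

prog = lower(While("i < 10", ("call", "body")))
# the loop's control flow is now explicit jumps between labels
```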

Q: Describe 2 ways to implement FSMs.

A: One first has to decide which FSM one wants, perhaps by constructing one from regular expressions. (1) You can convert the FSM into a two dimensional table indexed by state and input symbol and containing new states; the table is a "program" for a generic interpreter. (2) You can convert the FSM into a piece of executable hard code wherein the state of the FSM is implicit in which piece of code is being executed.
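Both implementations can be sketched side by side. The FSM below (accepting binary strings with an even number of 1s) is an illustrative assumption, not from the exam:

```python
# (1) Table-driven: a 2-D table indexed by (state, symbol), run by a
#     generic interpreter loop that knows nothing about this FSM.
table = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd",  "0"): "odd",  ("odd",  "1"): "even"}

def run_table(s):
    state = "even"
    for ch in s:
        state = table[(state, ch)]
    return state == "even"

# (2) Hard-coded: the current state is implicit in which piece of code
#     is executing (here, which function we are in).
def run_coded(s):
    def even(i):
        if i == len(s): return True
        return odd(i + 1) if s[i] == "1" else even(i + 1)
    def odd(i):
        if i == len(s): return False
        return even(i + 1) if s[i] == "1" else odd(i + 1)
    return even(0)
```

The table version is compact and easy to generate mechanically; the hard-coded version usually runs faster because it replaces table lookups with direct control transfers.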

Q: What does it mean for a compiler to be "syntax directed"?

A: The events taking place inside of a syntax directed compiler are sequenced by the syntax of the program being compiled. For example, declaration processing (declaration attribution) is invoked when a declaration is seen on the input, and so forth.

Q: Describe 3 software development tools that are used after the compiler is run, and that require some assistance from the compiler.

A: Debuggers can be used to step through an executing program so that you can examine the dynamic structure of the program, such as the call paths, the values of variables, and so forth. Profilers gather information on which routines call each other, and how much time each routine takes to execute. Indexers build a map from names to declarations, modifications or uses, and help one discover how and when values may be set and used.

Q: Ollie is a rambunctious compiler writer who decides to make the attributer aggressively restructure expressions. For example, he changes the attributer so that the expression "a+b-b" is turned into just "a". Is this a good idea?

A: Clearly, Ollie has not listened to his Professor, who would claim that almost all algebraic transformations on expressions are bound to be wrong, since the algebra of computer numbers is not the same as the algebra that we are familiar with. The example expression associates as "((a+b)-b)"; Ollie assumed that machine addition is associative, which it is not, and so the transformation might yield different answers. Ollie would be especially stupid if he did these transformations on floating point numbers, or if overflow/underflow detection must be preserved at run time.
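A two-line demonstration of the floating point case, using IEEE double precision (the values are chosen so that a is smaller than one unit in the last place of b):

```python
# With IEEE doubles, (a + b) - b need not equal a: adding 1.0 to 1e16
# rounds away the 1.0 entirely, so Ollie's rewrite changes the answer.
a, b = 1.0, 1e16
print((a + b) - b)   # prints 0.0
print(a)             # prints 1.0
```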

Q: What is nondeterminism useful for?

A: Nondeterminism is useful because it lets us write simple automata that obviously correspond to another kind of specification. It is not used in practice because it entails guessing, and if a guess turns out to be wrong one has to use backtracking; backtracking is quite expensive. One can write nice nondeterministic algorithms in Prolog, but at some cost in efficiency.
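One standard way to avoid the backtracking is to simulate the nondeterministic automaton on a set of current states, following every guess at once. A sketch, using a small hypothetical NFA for strings ending in "ab" (the NFA guesses where the final "ab" starts):

```python
# NFA transitions: (state, symbol) -> set of possible next states.
nfa = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
accepting = {2}

def accepts(s):
    states = {0}                      # all states the NFA could be in
    for ch in s:
        states = set().union(*(nfa.get((q, ch), set()) for q in states))
    return bool(states & accepting)   # accept if any guess worked out
```

This trades the exponential worst case of backtracking for a cost bounded by the number of states per input symbol.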

Q: What is the difference between error detection, error recovery and error repair?

A: Error detection entails finding the error in the first place. Error recovery in phase X tries to adjust things so that X can get back on track and continue. Error repair attempts to (effectively) correct the input in some way by guessing what the user had in mind and fixing what was wrong.

Q: What are the advantages/disadvantages of putting limits on the kind of source programs that the compiler can compile?

A: Limits on the size of the source program translate into limits in the compiler, which makes the compiler writer's task much easier, since one can then write the compiler with statically declared data structures, and will not have to deal with dynamically allocated memory. However, the compiler's user may submit a perfectly legal program that breaks the limits, and so will be frustrated by the compiler's limits.

Q: Write a grammar for a language with right associativity, no operator precedence, with infix operator "/", unary prefix operator "rho", distfix paired operator "(" and ")", and identifiers.

A: E : P / E | rho E | P
   P : id | ( E )

(The right recursion in "P / E" gives right associativity; factoring the operands into P lets "( E )" and identifiers appear on either side of "/".)
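A recursive descent parser for a grammar of this shape can be sketched directly (a toy: tokens are pre-split strings, "rho" is a keyword, anything else is an identifier, and error handling is omitted):

```python
# Grammar sketched:  E : P / E | rho E | P     P : id | ( E )
def parse(tokens):
    pos = 0
    def E():
        nonlocal pos
        if tokens[pos] == "rho":
            pos += 1
            return ("rho", E())
        left = P()
        if pos < len(tokens) and tokens[pos] == "/":
            pos += 1
            return ("/", left, E())   # right recursion => right assoc
        return left
    def P():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1
            e = E()
            assert tokens[pos] == ")"
            pos += 1
            return e
        tok = tokens[pos]
        pos += 1
        return ("id", tok)
    return E()

parse(["a", "/", "b", "/", "c"])
# -> ('/', ('id', 'a'), ('/', ('id', 'b'), ('id', 'c')))
```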

Q: When testing compilers why is it important to separate the roles of the tester from developer?

A: This is true for any kind of software testing. Separating the tester from the developer reduces the chances for so called "common mode" failure. If the tester and developer communicate informally and frequently, or are one and the same person, then they may both acquire the same world view, and so may blind themselves to potential faults. Different world views promote robustness, and the chance of things falling through the cracks is reduced. Testers will end up taking perverse pleasure in finding bugs, and so the UUT (unit under test) is more likely to be thoroughly exercised. Remember: testing shows the presence of bugs, and never their absence.

Q: Why doesn't the parser deal with comments?

A: In almost all languages comments are equivalent to white space, and so can appear between any two tokens. While this can be described grammatically, that would yield an unreadable grammar that is likely to be full of errors. It is much easier to have the scanner filter out the comments.
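A minimal sketch of a scanner loop doing that filtering (assuming Pascal-style "{ ... }" comments and a toy tokenizer; a real scanner would be character-driven and handle errors):

```python
import re

def tokens(src):
    out, i = [], 0
    while i < len(src):
        if src[i].isspace():
            i += 1
        elif src[i] == "{":             # comment: skip to matching "}"
            i = src.index("}", i) + 1   # discarded; parser never sees it
        else:
            m = re.match(r"[A-Za-z0-9]+|.", src[i:])
            out.append(m.group())
            i += len(m.group())
    return out

tokens("x := { add one } x + 1")   # ['x', ':', '=', 'x', '+', '1']
```

The parser's grammar never has to mention comments at all.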

Q: Jack and Jill are working on the pl/0 compiler together. Jack and Jill don't talk to each other. Jack is doing sets; Jill is doing switches. What kinds of things can go wrong in this approach?

A: If they don't talk to each other, how can they expect that Jack's additions to the symbol table (say) will support Jill's desire for handling switches? What if the two of them edit the same file simultaneously? What if they both add procedures that do almost the same task?

Q: Describe the kinds of things you, as a compiler writer, must worry about if your primary users are: (1) students in an introductory compiler course; (2) people at the National Weather Service doing weather forecasting; (3) people in the military services building missile defense systems; (4) researchers designing an experimental language; (5) a compiler for a product that will have 100 million copies outstanding.

A1: A compiler for an introductory programming course will be invoked many times with many buggy programs; the object code that is produced will be executed infrequently and only for a short period of time. Spend time on the diagnostic ability (error detection, recovery and repair), compiler correctness (you NEVER want the students to be able to blame the compiler for why their programs fail), and then on compiler speed.

A2: A compiler for the weather forecasting codes will compile very large programs written by experts; such programs run for a very long time on very fast computers. You want to concentrate on generating efficient object code. Compile time is not important. Compiler diagnostics are not that important; the users will even live with (eg, work around) compiler bugs.

A3: A compiler for missile guidance codes wants to be correct; you don't really want the compiler to introduce bugs into the programs that are (ostensibly) saving your life. It is not that unusual for compilers in these applications to be proved correct using formal techniques. (But what about the compiler's compiler?) The compiler must also point out all language nonconformities in the source program. Everything else is secondary to these goals.

A4: A compiler for an experimental language is bound to change as the language designers fiddle around. Best to make the compiler easy to develop and secondarily make it isolate problems in the source language. Execution speed isn't important.

A5: A compiler shipped in these volumes had better be correct, since field upgrades are out of the question. Execution speed should be good, but need not be stellar, although it is unlikely that you can convince the market that your compiler is better at the things they really should be caring about.