Write these up and submit them to the class dropbox before lecture.

===========================================================================

Inadequacy of simple domains

Consider the following code.

  x = 0;
  y = read_even_value(); // produces an even value
  x = y+1;
  y = 2*x;
  x = y-2;
  y = x/2;

Assume all values are integers, there is no overflow, etc.

What do you know about the variable values at the end of execution? For example, are they positive? Are they even? Do you know other facts? What is the most precise and complete information that you can deduce? Think about this for a moment before reading forward.

The most accurate result is that y has the same value as its initial value that was read from input, and x is twice that. If you didn't already come up with this, please convince yourself of those facts.

Now, reflect on how you determined this information. However you determined it, a computer can do the same. An effective way to produce a program analysis is to observe how you deduced a fact, and then automate your mental process.

One way to determine the most precise information is via symbolic execution: for each variable, determine an algebraic formula that represents its value.

It's also a fact that x and y are both even. However, the abstract domain that consists of "even", "odd", and "unknown" cannot determine that y is even! It loses information, and the final value for y is "unknown" instead of "even". Work through this program in the abstract domain to verify the result.

Give an abstract domain that can be used to determine that the final value for y is even. The abstract domain only has to work for this particular program and property. Also give the transfer functions for +, *, -, and /.

(There is nothing to do for the two paragraphs that close this item; the only requirement for this item is giving the abstract domain.)
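As an aid for the "work through this program" step above, here is a minimal sketch of the even/odd/unknown domain in Python. The function names (parity_of, add, sub, mul, div, analyze) are made up for this illustration and are not part of the assignment. It only reproduces the loss of precision described above; it is not the improved domain that the exercise asks for.

```python
# Parity abstract domain: each variable is "even", "odd", or "unknown".
EVEN, ODD, UNKNOWN = "even", "odd", "unknown"

def parity_of(n):
    """Abstract a concrete integer constant to its parity."""
    return EVEN if n % 2 == 0 else ODD

def add(a, b):
    # The parity of a sum is known only when both operand parities are known.
    if UNKNOWN in (a, b):
        return UNKNOWN
    return EVEN if a == b else ODD

sub = add  # subtraction affects parity the same way as addition

def mul(a, b):
    # An even factor forces an even product, even if the other factor is unknown.
    if EVEN in (a, b):
        return EVEN
    if UNKNOWN in (a, b):
        return UNKNOWN
    return ODD

def div(a, b):
    # Integer division loses parity: 4/2 is even but 6/2 is odd.
    # Always answering "unknown" is sound, though not always the most precise.
    return UNKNOWN

def analyze():
    """Run the example program on abstract values and return (x, y)."""
    x = parity_of(0)            # x = 0;
    y = EVEN                    # y = read_even_value();
    x = add(y, parity_of(1))    # x = y+1;   even + odd  -> odd
    y = mul(parity_of(2), x)    # y = 2*x;   even * odd  -> even
    x = sub(y, parity_of(2))    # x = y-2;   even - even -> even
    y = div(x, parity_of(2))    # y = x/2;   parity is lost here
    return x, y

print(analyze())  # ('even', 'unknown')
```

The information is lost at the final division: the domain remembers that x is even, but not that x is twice an even number.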
The field of static analysis is primarily about choosing an appropriate abstraction: one that is simple enough for efficient computation, but expressive enough to retain precision.

Abstract interpretation is useful in program analysis and optimizations. It has been used for decades at Airbus to verify safety of avionics systems.

===========================================================================

Termination of abstract interpretation

Abstract interpretation runs a program using abstract values. The abstract values are an overapproximation to the real values that the program might compute.

Suppose that a program contains a loop L that can be entered and that terminates. In other words, at run time it is possible for the loop condition to be true, and it is possible for the loop condition to be false.

If the abstract interpretation is sound, then the abstract value for the loop condition is { true, false }. This means that when abstract interpretation reaches the loop condition, it will proceed along two paths: it will execute the loop body, and separately it will execute the code after the loop. When the path inside the loop body reaches the loop condition again, it will again execute the loop body and the code after the loop. Since there are infinitely many looping paths through the program, it seems that abstract interpretation will never terminate.

Briefly (in 1-3 paragraphs) explain why abstract interpretation of a program that contains a loop will terminate.

===========================================================================

Here is some jargon that you should know in order to be prepared for the next class. If you don't know it, please look it up or ask the course staff a question about it. If there is some other term that appears in the reading, please tell the staff so that they can add it to this list or so they can explain it to you.

Most important:
  AST (and distinction from a parse tree): https://en.wikipedia.org/wiki/Abstract_syntax_tree
  control flow graph: https://en.wikipedia.org/wiki/Control_flow_graph
  basic block: https://en.wikipedia.org/wiki/Basic_block
  3-address form: https://en.wikipedia.org/wiki/Three-address_code
  lattice: https://en.wikipedia.org/wiki/Lattice_(order)

Less important:
  SSA (static single assignment form)
  SCC (strongly connected component)
  forward edges
  back edges
  cross edges
  spanning tree: useful in profiling
  dominators: useful in optimization

Not important (you can ignore it):
  Scott domain
  Galois connection

There is nothing to turn in for this part.

===========================================================================

end.