===========================================================================

Static analysis teaser

Consider the following code.

x = 0;
y = read_even_value();
x = y+1;
y = 2*x;
x = y-2;
y = x/2;

Assume all values are integers, there is no overflow, etc.
What do you know about the variable values at the end of execution?
For example, are they positive?  Are they even?  Do you know other facts?

The most accurate result is that
  y has the same value as the one originally read from input, and
  x is twice that.
We can determine this by doing symbolic execution:
for each variable, derive an algebraic formula (in terms of the input) that represents its value.
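
The symbolic execution above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not a general symbolic executor:
each value is a pair (a, b) standing for a*y0 + b, where y0 is a hypothetical name
for the value returned by read_even_value(), and the division by 2 is treated as exact
(which holds here because the dividend is 2*y0).

```python
from fractions import Fraction

def run_symbolically():
    # Each value is a pair (a, b) meaning a*y0 + b, where y0 is the
    # (hypothetical) name for the value returned by read_even_value().
    x = (Fraction(0), Fraction(0))   # x = 0;
    y = (Fraction(1), Fraction(0))   # y = read_even_value();   y = y0
    x = (y[0], y[1] + 1)             # x = y+1;                 x = y0 + 1
    y = (2 * x[0], 2 * x[1])         # y = 2*x;                 y = 2*y0 + 2
    x = (y[0], y[1] - 2)             # x = y-2;                 x = 2*y0
    # Assumption: the division is exact, so dividing both coefficients is valid.
    y = (x[0] / 2, x[1] / 2)         # y = x/2;                 y = y0
    return x, y

x, y = run_symbolically()
print("x =", x, " y =", y)
```

The final pairs are (2, 0) for x and (1, 0) for y, that is, x = 2*y0 and y = y0.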

It's also a fact that x and y are both even.
However, suppose that we instead analyzed the program using a simpler abstraction
(a simpler abstract domain), where each value is "even", "odd", or "unknown".
This abstraction is simpler and faster to compute, but it loses information:
the parity of x/2 cannot be determined from the parity of x alone,
so the final value for y is "unknown" instead of "even".
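
Here is a minimal sketch of the even/odd/unknown analysis applied to the teaser program.
The helper names (add, mul, div, analyze) are made up for illustration.
The rule for division is deliberately the coarsest sound one: it always answers "unknown".

```python
EVEN, ODD, UNKNOWN = "even", "odd", "unknown"

def add(p, q):
    # Also used for subtraction: the parity rule is the same.
    if UNKNOWN in (p, q):
        return UNKNOWN
    return EVEN if p == q else ODD

def mul(p, q):
    if EVEN in (p, q):
        return EVEN          # an even factor makes the product even
    if UNKNOWN in (p, q):
        return UNKNOWN
    return ODD               # odd * odd

def div(p, q):
    # The parity of a quotient is not determined by the parities of the
    # operands (4/2 is even, 6/2 is odd), so this domain must give up.
    return UNKNOWN

def analyze():
    x = EVEN                 # x = 0;
    y = EVEN                 # y = read_even_value();
    x = add(y, ODD)          # x = y+1;   odd
    y = mul(EVEN, x)         # y = 2*x;   even
    x = add(y, EVEN)         # x = y-2;   even
    y = div(x, EVEN)         # y = x/2;   unknown
    return x, y

print(analyze())
```

The analysis still proves that x is even, but it can no longer say anything about y.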

The field of static analysis is primarily about choosing an appropriate abstraction:  one that is simple enough for efficient computation, but expressive enough to retain precision.

Abstract interpretation (analysis over an abstract domain, such as the even/odd domain above) is useful in program analysis and optimization.
It has been used for decades at Airbus to verify safety of avionics systems.

----------------

Here is some jargon that you should know in order to be prepared for the next class.
If you don't know it, please look it up or ask the course staff a question about it.

Most important:
AST (and distinction from a parse tree): https://en.wikipedia.org/wiki/Abstract_syntax_tree
control flow graph: https://en.wikipedia.org/wiki/Control_flow_graph
basic block: https://en.wikipedia.org/wiki/Basic_block
3-address form: https://en.wikipedia.org/wiki/Three-address_code
lattice: https://en.wikipedia.org/wiki/Lattice_(order)

Less important:
SSA (static single assignment form)
SCC (strongly connected component)
forward edges
back edges
cross edges
spanning tree: useful in profiling
dominators: useful in optimization

Not important:
Scott domain
Galois connection

===========================================================================

Teaser for lecture on Cousot & Cousot paper:

Give an explanation of the terminology -- a glossary

===========================================================================

Teaser for third lecture:

Why must the lub function be monotonic?  That is,
what might go wrong if lub is not monotonic?
Give an example of a lub function that is not monotonic and that causes the problem.
Using the same lattice and a lub function that is monotonic, show that the problem does not occur.

----------------

Answer:  monotonicity guarantees that the analysis terminates (given a finite-height lattice).

Example:
Lattice elements = Top, Bottom (with Bottom < Top)
lub =
  lub(Top, Top) = Bottom
  lub(Top, Bottom) = Top
  lub(Bottom, Top) = Top
  lub(Bottom, Bottom) = Top
Transfer functions: none needed

code:

  x = input()
  loop:
  goto loop

or, equivalently,

  x = input()
  label:
  if (unanalyzable) goto label

The CFG looks like

   _   |
  / \  |
  |  v v 
  |  join
  |   |
  |   v
  |  nop
  \_/  |
       v

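The estimate for x starts out as Top at the loop entry.
Assuming the estimate on the back edge is initialized to Bottom,
the first join computes lub(Top, Bottom) = Top and the next computes lub(Top, Top) = Bottom.
On every iteration through the loop the estimate flip-flops between Top and Bottom,
so the analysis never reaches a fixed point and never terminates.
With the standard (monotonic) lub, where lub(Top, Top) = Top, the join yields Top
on every iteration, so the estimate stabilizes immediately.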
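
The flip-flop can be reproduced with a small simulation of the join node.
This is a sketch under assumptions that are not in the example itself:
the back-edge estimate starts at Bottom, the nop passes its input through unchanged,
and max_steps is a made-up cap so that the non-terminating case stops.
The function names are hypothetical.

```python
TOP, BOT = "Top", "Bottom"

def bad_lub(a, b):
    # The non-monotonic table from the example: only (Top, Top) gives Bottom.
    return BOT if (a, b) == (TOP, TOP) else TOP

def real_lub(a, b):
    # The usual least upper bound on the two-element lattice Bottom < Top.
    return TOP if TOP in (a, b) else BOT

def iterate(lub, max_steps=10):
    entry = TOP              # estimate for x after x = input()
    back = BOT               # assumed initial estimate on the back edge
    history = []
    for _ in range(max_steps):
        joined = lub(entry, back)
        history.append(joined)
        if joined == back:   # back-edge estimate unchanged: fixed point
            return history, True
        back = joined        # the nop passes the estimate through unchanged
    return history, False    # gave up: no fixed point within max_steps

print(iterate(bad_lub, max_steps=6))
print(iterate(real_lub))
```

With bad_lub the history alternates Top, Bottom, Top, ... until the cap is hit;
with real_lub the estimate is Top on the first join and is confirmed as a fixed point on the second.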

===========================================================================