Pointer analysis

Requirements

You will implement a pointer analysis for Whirlwind. Your analysis should keep track of the following: pointers to variables, pointers to dynamically allocated memory, pointers to structures, pointers to fields of structures and pointers to arrays. You analysis should also perform lhs indirection elimination using the whirlwind framework. In particular, you will transform statements of the form *x := e into y := e if it is know that x can only point to y.

The information that your analysis will compute at each program point is a map from locations to sets of locations. A location is an abstract piece of memory. A dataflow fact [l -> {l1, l2, l3}] says that the content of location l may contain a pointer to l1, l2 or l3. There are several kinds of locations:

An example

The following example shows a simple program and the information your analysis should compute. Before looking at the example, there are a few things that you need to keep in mind about WIL:

Skeleton code

You will start out with the following skeleton code. This code provides a lot of the framework for implementing pointer analysis. In particular, it provides a PtsToAnalysis object and the definition of the analysis domain. This includes a Location hierarchy, a class that represents sets of locations, and a class that defines the pointer information. Your task is to fill in the implementation of the methods that have a TODO in their body.

You will find it useful to look at the files cse.cecil and constant-prop-cfg.cecil in the whirlwind code base for examples of complete optimizations written over the CFG. You can also see how transformations are performed by looking at the *try_to_simplify methods in those files.

The nodes to look at

In order to implement pts_to_targets and pts_to_analyze in the skeleton code you will need to consider various WIL nodes. The file ir-node.cecil contains a definition of all the IR nodes in WIL. In particular, the following nodes should be useful for your pointer analysis:
If you are ever confused by what a node is, you can always use print_string to see it's string representation, and basic_print_string to see what kind of object it is. So, for example, if have an IDExprNode that represents a reference to variable y, then if you call print_string on it, you will get "y", and if you call basic_print_string, you will get the some string that tells you, among other things, that you're looking at an IDExprNode.