CSE584: Software Engineering

CSE503: Software Engineering
Lecture 20 (February 24, 1999)

David Notkin

Slicing

Program slicing is an approach to selecting semantically related statements from a program [Weiser]
In particular, a slice of a program with respect to a program point is a projection of the program that includes only the parts of the program that might affect the values of the variables used at that point
The slice consists of a set of statements that are usually not contiguous

Basic ideas

If you need to perform a software engineering task, selecting a slice will reduce the size of the code base that you need to consider
Debugging was the first task considered
Weiser even performed some basic user studies
Claims have been made about how slicing might aid program understanding, maintenance, testing, differencing, specialization, reuse and merging

Example

read(n)
i := 1;
sum := 0;
product := 1;
while i <= n do begin
sum := sum + i;
product := product * i;
i := i + 1;
end;
write(sum);
write(product);

read(n)
i := 1;
product := 1;
while i <= n do begin
product := product * i;
i := i + 1;
end;
write(product);

In this example, the second program is a slice of the first, slicing on the product variable at the write statement.

For Weiser, a slice was a reduced, executable program obtained by removing statements from a program

The new program had to share parts of the behavior of the original
Weiser computed slices using a dataflow algorithm, given a program point (criterion)

Using data flow and control dependences, iteratively add sets of relevant statements until a fixpoint is reached

Ottenstein & Ottenstein defined an alternative approach that clearly dominates

True PDGs are somewhat more complicated than this

Vertices in the graph represent (a) assignment states and (b) predicates in the program
Edges represent control and data flow dependences
Control dependences always start at a predicate (or the entry node, a special node representing the beginning of the computation)

They are labeled with a boolean
Intuitively, node w is control dependent on node v if the predicate of node v evaluates to the label on the edge from v to w

An assignment statement followed immediately by another assignment statement have no control dependence between them, since the second one always executes when the first one does

(Roughly) there is a data dependence (edge) from node v to node w if v includes an assignment to some variable x, and then w includes a use of (that specific) x.

These can be separated into (at least) loop-independent and loop-carried dependences, which roughly distinguish whether the relationship is across iterations of a loop or not

Def-order dependences can also be used; these aren't needed for all analyses, but ensure that only equivalent programs have isomorphic PDGs.

Procedures

What happens when you have procedures and still want to slice?

Weiser extended his dataflow algorithm to interprocedural slicing

The PDG approach also extends to procedures

But interprocedural PDGs are a bit hairy (Horwitz, Reps, Binkley used SDGs)

Essentially a linked collection of PDGs (one per procedure), with linking vertices for calling protocols

Value-result parameter passing is used; it's much harder to handle by-reference mechanisms
If you slice an SDG-like structure using standard reachability, you can get a very large slice

Specifically, if you slice from inside a procedure, with reachability you trace back to all possible calls sites of the procedure; since calls and returns are represented by separate edges, it's roughly like taking a return to a call site that didn't make the call
There are a number of solutions to this, include the summary edges imposed in SDGs

Technical issues

Dynamic slicing

These algorithms have all been static
They work for all possible inputs
There is also work in dynamic slicing, trying to find slices that satisfy some execution streams over sets of inputs

Korel and Laski characterize dynamic slices in terms of a trajectory that captures the execution history of a program in terms of a sequence of statements and control predicates

(Potential) applications of slicing

Slicing can be used to define more rigorous testing criterion than a conventional data flow testing criterion

Recap