It is difficult to write a program that works well. A significant part of the
problem is to state precisely what it means for a program to work correctly. What
assumptions do we make about the way in which it is invoked? What guarantees does it
make about its results? How much time and space does it require? Answers to
these questions are called specifications --- descriptions of the expected
behavior of a program. Checking that a particular program satisfies a given
specification is called verification. There are many forms of specification
and many techniques for verification of programs. One form of specification with
which you are by now very familiar is a type specification; verification of a
type specification is called type checking. We've seen that type
specification and type checking are useful tools for helping us to get programs right.
Another form of specification is an asymptotic time and space bound on a procedure,
expressed as a function of the argument to the procedure. For example, we may state
that the function sort : int list -> int list
takes time T(n) = O(n lg
n) and space S(n) = O(n) for an input of size n. Verification
of a complexity bound is often a tricky business. Typically we define a recurrence
relation governing the time or space complexity of the program, then solve the recurrence
using asymptotic methods to obtain the result.
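For instance, to verify the bound claimed for sort above (assuming a merge sort implementation), one might derive and solve a recurrence of roughly the following shape, with c a constant bounding the per-element cost of splitting and merging:

```
T(1) = c
T(n) = 2 T(n/2) + c n        (split, sort both halves, merge)
```

Unrolling, each of the lg n levels of recursion performs c n work in total, so T(n) = O(n lg n); the space bound S(n) = O(n) follows because merging requires at most one auxiliary list of length n.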
Type specifications and complexity specifications are useful tools, but it
is important to keep in mind that neither says very much about whether the code works
properly. We might define an incorrect sort routine (say, one that always returns
its input untouched), verify that its type is int list -> int list, and
check that it runs in time O(n lg n), yet the code doesn't sort its input,
despite its name! Clearly more refined forms of specification are needed to state
precisely the expected behavior of a program, and some methods are needed to verify that a
program satisfies a given specification. We've explored such forms of specification
and verification earlier in these notes, for example when we checked the correctness of
various forms of the integer exponential function. In this chapter we'll put these
ideas to work to help us to devise a correct version of the regular expression matcher
sketched in the Overview, correcting a subtle error that may
not be immediately apparent from inspecting or even testing the code. The goal of
this chapter is to demonstrate the use of specification and verification to discover and
correct an error in a program through a technique that we call proof-directed
debugging. We first devise a precise specification of the regular expression
matcher, a difficult problem in itself, then attempt to verify that the matching program
satisfies this specification. The attempt to carry out a proof breaks down, and
suggests a counterexample to the specification. We then consider various methods of
handling the problem, ultimately settling on a change of specification rather
than a change of implementation.
Let us begin by devising a specification for the regular expression
matcher. As a first cut we write down a type specification. We seek to define
a function match
of type regexp -> string -> bool
that
determines whether or not a given string matches a given regular expression. More
precisely, we wish to satisfy the following specification:
For every regular expression r and every string s,
match
r s terminates, and evaluates to true iff s in L(r).
Recall that the language L(r) of a regular expression r is the set of strings defined as follows:
L(0) = 0
L(1) = 1
L(a) = a
L(r1r2) = L(r1) L(r2)
L(r1+r2) = L(r1) + L(r2)
L(r*) = L(1) + L(r) + L(rr) + L(rrr) + ...
where 0 = {}, 1 = {""}, a = {"a"}, L1 L2 = { s1s2 | s1 in L1 and s2 in L2 }, and L1+L2 = { s | s in L1 or s in L2 }.
The language L(r*) can be characterized as the smallest language L such that L = 1 + L(r) L. For if s in L(r*) as defined above, then s in L(r^i) for some i >= 0, and we may show by induction on i that s in L. If i = 0, then s = "" in 1, and if i > 0, then s = tu with t in L(r) and u in L(r^(i-1)). By induction u in L, and hence s in L(r) L, and hence s in 1 + L(r) L = L. Conversely, if s in L, then either s in 1, in which case s in L(r*), or s = tu with t in L(r) and u in L. Inductively u in L(r*), and hence s in L(r) L(r*), and hence s in L(r*).
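In Standard ML the abstract syntax of regular expressions may be rendered as a datatype; one possible rendering, whose constructor names follow the matcher code used later in this chapter, is:

```sml
(* Abstract syntax of regular expressions.  The six constructors
   correspond to the six clauses of the definition of L(r) above. *)
datatype regexp =
    Zero                      (* 0 : matches nothing *)
  | One                       (* 1 : matches only the null string *)
  | Char of char              (* a : matches the one-character string "a" *)
  | Times of regexp * regexp  (* r1 r2 : concatenation *)
  | Plus of regexp * regexp   (* r1 + r2 : alternation *)
  | Star of regexp            (* r* : iteration *)
```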
We saw in the Overview that a natural way to
define the procedure match
is to use a technique called continuation
passing. We defined an auxiliary function match_is
with the type regexp
-> char list -> (char list -> bool) -> bool
that takes a regular
expression, a list of characters (essentially a string, but in a form suitable for
incremental processing), and a continuation, and yields a boolean. The idea is that match_is
takes a regular expression r, a character list cs, and a continuation k,
and determines whether or not some initial segment of cs matches r,
passing the remaining characters cs' to k in the case that there is such
an initial segment, and yields false otherwise. Put more precisely,
For every regular expression r, character list cs, and continuation k, if cs = cs'@cs'' with cs' in L(r) and k cs'' evaluating to true, then match_is r cs k evaluates to true; otherwise, match_is r cs k evaluates to false.
Unfortunately, this specification is too strong to ever be satisfied by
any implementation of match_is
! Can you see why? The difficulty
is that if k is not guaranteed to terminate for all inputs, then there is no way
that match_is
can behave as required. If there is no input on which k
terminates, the specification requires that match_is return false. It should be
intuitively clear that we can never implement such a function. Formally, we can
reduce the halting problem to the matching problem so defined, which suffices to
show that no such match_is
procedure exists. Instead, we must restrict
attention to total continuations, those that terminate for all inputs. This
leads to the following revised specification:
For every regular expression r, character list cs, and total continuation k, if cs = cs'@cs'' with cs' in L(r) and k cs'' evaluating to true, then match_is r cs k evaluates to true; otherwise, match_is r cs k evaluates to false.
Observe that the condition "If cs=cs'@cs'' with ..., then ..." contains an implicit existential quantification. Written out in full, we might say "If there exists cs' and cs'' such that cs = cs'@cs'' with ..., then ...". This is an important observation because it makes clear that we must search for a suitable splitting of cs into two parts such that the first part is in L(r) and the second is accepted by k. There may, in general, be many ways to partition the input so as to satisfy both of these requirements; we need only find one such way. Note, however, that if cs = cs' @ cs'' with cs' in L(r) but k cs'' yielding false, we must reject this partitioning and search for another. In other words, we cannot simply consider any partitioning whose initial segment matches r; we can consider only those that also induce k to accept the corresponding final segment. For example, if r = a + ab, cs = "ab", and k accepts only the null string, then the partitioning "a" / "b" has its initial segment in L(r) but is rejected by k, whereas the partitioning "ab" / "" satisfies both requirements.
Suppose for the moment that match_is
satisfies this
specification. Does it follow that match satisfies the original specification?
Recall that match is defined as follows:
fun match r s =
match_is r (String.explode s) (fn nil => true | _ => false)
Notice that the initial continuation is indeed total, and that it yields
true (accepts) iff it is applied to the null string. Therefore, if match_is
satisfies its specification, then match
satisfies the following property
obtained by plugging in the initial continuation:
For every regular expression r and string s, if s in L(r), then match r s evaluates to true, and otherwise match r s evaluates to false.
This is precisely the property that we desire for match.
Thus match is correct (satisfies its specification) if match_is is correct (satisfies its specification).
So far so good. But does match_is
satisfy its
specification? If so, we are done. How might we check this? Recall the
definition of match_is
given in the Overview:
fun match_is Zero _ k = false
  | match_is One cs k = k cs
  | match_is (Char c) nil k = false
  | match_is (Char c) (d::cs) k = if c=d then k cs else false
  | match_is (Times (r1, r2)) cs k =
      match_is r1 cs (fn cs' => match_is r2 cs' k)
  | match_is (Plus (r1, r2)) cs k =
      match_is r1 cs k orelse match_is r2 cs k
  | match_is (Star r) cs k =
      k cs orelse match_is r cs (fn cs' => match_is (Star r) cs' k)
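To fix intuitions, here are a few sample evaluations, assuming the declarations of match_is and match above, together with a regexp datatype providing the constructors used here:

```sml
(* Sample evaluations, assuming match_is and match as declared above. *)
val p1 = match (Times (Char #"a", Star (Char #"b"))) "abb"
         (* yields true:  "abb" is in L(ab*) *)
val p2 = match (Plus (One, Char #"a")) "b"
         (* yields false: "b" is in neither L(1) nor L(a) *)
```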
Since match_is
is defined by a recursive analysis of the
regular expression r, it is natural to proceed by induction on the structure of r.
That is, we treat the specification as a conjecture about match_is
,
and attempt to prove it by structural induction on r.
We first consider the three base cases. Suppose that r is 0.
Then no string is in L(r), so match_is
must return false,
which indeed it does. Suppose that r is 1. Since the null
string is an initial segment of every string, and the null string is in L(1), we
must yield true iff k cs yields true, and false otherwise.
Again, this is precisely how match_is
is defined. Suppose that r
is a. Then to succeed cs must have the form a cs'
with k cs' evaluating to true; otherwise we must fail. The code for match_is
checks that cs has the required form and, if so, passes cs' to k
to determine the outcome, and otherwise yields false. Thus match_is
behaves correctly for each of the three base cases.
We now consider the three inductive steps. For r=r1+r2,
we observe that some initial segment of cs matches r and causes k
to accept the corresponding final segment iff either some initial segment matches r1
and drives k to accept or some initial segment matches r2 and
drives k to accept. By induction match_is
works as specified
for r1 and r2, which is sufficient to justify the
correctness of match_is
for r=r1+r2. For
r=r1r2, the proof is slightly more complicated. By
induction match_is
behaves according to the specification if it is applied to
either r1 or r2, provided that the continuation
argument is total. Note that the continuation k' given by (fn
cs' => match_is r2 cs' k)
is total, since by induction the inner recursive call
to match_is
always terminates. Suppose that there exists a partitioning
cs = cs'@cs'' with cs' in L(r) and k cs'' evaluating to true.
Then cs' = cs'1 @ cs'2 with cs'1 in L(r1) and cs'2 in L(r2), by definition
of L(r1r2). Consequently, match_is r2 (cs'2 @ cs'') k evaluates to true,
and hence match_is r1 (cs'1 @ cs'2 @ cs'') k' evaluates to true,
as required. If, however, no such partitioning exists, then one of three
situations must obtain: (1) no initial segment of cs matches r1, in which
case the outer recursive call yields false, as required; (2) some initial
segment matches r1, but for every such initial segment no initial segment of
the corresponding final segment matches r2, in which case the inner recursive
call yields false on every call, and hence the outer call yields false, as
required; or (3) every pair of successive initial segments of cs matching r1
and r2 results in k evaluating to false, in which case the inner recursive
call always yields false, hence the continuation k' always yields false, and
hence the outer recursive call yields false, as required. Be sure you
understand the reasoning involved here; it is quite tricky to get right!
We seem to be on track, with one more case to consider, r=r1*.
This case would appear to be a combination of the preceding two cases for
alternation and concatenation, with a similar argument sufficing to establish correctness.
But there is a snag: the second recursive call to match_is
leaves the
regular expression unchanged! Consequently we cannot apply the inductive hypothesis
to establish that it behaves correctly in this case, and the obvious proof attempt breaks
down. (Write out the argument to see where you get into trouble.) What to do?
A moment's thought suggests that we proceed by an inner induction on the length of
the string, based on the theory that if some initial segment of cs is in L(r),
then either that initial segment is the null string (base case), or cs=cs'@cs''
with cs' in L(r1) and cs'' in L(r) (induction step). We
then handle the base case directly, and handle the inductive case by assuming that match_is
behaves correctly for cs'' and showing that it behaves correctly for cs.
But there is a flaw in this argument! The string cs'' need not be
shorter than cs in the case that cs' is the null string! In that
case the inductive hypothesis does not apply, and we are once again unable to complete the
proof. But this time we can use the failure of the proof to obtain a counterexample
to the specification! For if r = 1*, for example, then match_is
r cs k fails to terminate whenever k cs yields false. In general, if
r = r1* with "" in L(r1), then match_is r cs k
need not terminate. In other words, match_is does not satisfy
the specification we have derived for it! Our conjecture is false!
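The counterexample is easy to exhibit concretely. Assuming the declarations above, the following suspended application never terminates when forced, because the continuation rejects the input and the Star clause then re-examines Star One against an unchanged character list:

```sml
(* Diverges when applied: k ["a"] is false, match_is One passes its
   input to its continuation unchanged, and so the Star clause calls
   itself with both the regular expression and the input unchanged. *)
val diverges = fn () => match (Star One) "a"
```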
We have used the failure of an attempt to prove that match_is
satisfies a reasonable specification of its behavior to discover a bug in the code --- the
matcher does not always terminate. What to do? One approach is to explicitly
check for failure to make progress when matching against an iterated regular expression.
This will work, but at the expense of cluttering the code and imposing additional
run-time overhead. You should write out a version of the matcher that works this
way, and check that it indeed satisfies the specification we've given above. An
alternative is to observe that the proof of correctness sketched above goes through,
provided that the regular expression satisfies the condition that no iterated regular
expression matches the null string. That is, r* is admitted as a
valid regular expression only if "" is not in L(r). Call
a regular expression satisfying this condition standard. As an exercise
check that the proof sketched above goes through under the additional assumption that r
is a standard regular expression.
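To illustrate the first alternative, the progress check can be realized by having the continuation for the iterated case refuse to loop unless the recursive call actually consumed input. One possible rendering of the Star clause of match_is (a sketch; the other clauses are unchanged):

```sml
(* Star clause with an explicit progress check: iterate only if the
   match of r consumed at least one character, so the character list
   strictly shrinks and divergence is impossible. *)
  | match_is (Star r) cs k =
      k cs orelse
      match_is r cs (fn cs' =>
        length cs' < length cs andalso match_is (Star r) cs' k)
```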
Thus the matcher behaves correctly for all standard regular expressions. But what about the non-standard ones? A simple observation is that every regular expression is equivalent to one in standard form. That is, we never really need to consider non-standard regular expressions. Instead we can pre-process the regular expression to put it into standard form, then call the matcher on the standardized regular expression.
This pre-processing is based on the following definitions. First, we define null(r) to be the regular expression 1 if r accepts the null string, and the regular expression 0 if not. Then we define nonnull(r) to be a regular expression r' in standard form such that L(r') = L(r) \ {""} --- that is, r' accepts the same strings as r, except for the null string. Thus for every regular expression r, we have
L(r) = L(null(r)+nonnull(r)).
Moreover, the regular expression null(r)+nonnull(r) is in standard form.
Here is the definition of null:
null(0) = 0
null(1) = 1
null(a) = 0
null(r1+r2) = null(r1) ++ null(r2)
null(r1r2) = null(r1) ** null(r2)
null(r*) = 1
where we define 0++1=1++0=1++1=1 and 0++0=0 and 0**1=1**0=0**0=0 and 1**1=1.
Here is the definition of nonnull:
nonnull(0) = 0
nonnull(1) = 0
nonnull(a) = a
nonnull(r1+r2) = nonnull(r1)+nonnull(r2)
nonnull(r1r2) = null(r1)nonnull(r2) + nonnull(r1)null(r2) + nonnull(r1)nonnull(r2)
nonnull(r*) = nonnull(r)nonnull(r)*
Check that the stated properties of these regular expressions indeed hold true, and use these equations to define a pre-processor to put every regular expression into standard form.
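As a guide, such a pre-processor might be transcribed into Standard ML as follows (a sketch, assuming a regexp datatype with constructors Zero, One, Char, Times, Plus, and Star; the operations ++ and ** are realized by case analysis on the 0/1-valued results):

```sml
(* null r is One if "" is in L(r), and Zero otherwise. *)
fun null Zero             = Zero
  | null One              = One
  | null (Char _)         = Zero
  | null (Plus (r1, r2))  =
      (case (null r1, null r2) of (Zero, Zero) => Zero | _ => One)
  | null (Times (r1, r2)) =
      (case (null r1, null r2) of (One, One) => One | _ => Zero)
  | null (Star _)         = One

(* nonnull r is a standard regular expression accepting L(r) \ {""}:
   every starred subexpression it builds is a star of a nonnull part. *)
fun nonnull Zero             = Zero
  | nonnull One              = Zero
  | nonnull (Char c)         = Char c
  | nonnull (Plus (r1, r2))  = Plus (nonnull r1, nonnull r2)
  | nonnull (Times (r1, r2)) =
      Plus (Times (null r1, nonnull r2),
            Plus (Times (nonnull r1, null r2),
                  Times (nonnull r1, nonnull r2)))
  | nonnull (Star r)         =
      Times (nonnull r, Star (nonnull r))

(* standardize r is a standard regular expression equivalent to r. *)
fun standardize r = Plus (null r, nonnull r)
```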
This chapter is based on the paper entitled Proof-Directed Debugging, which is scheduled to appear as a Functional Pearl article in the Journal of Functional Programming.