v LCLint [Evans et al.]
ุ [Material taken in part from a talk by S.
Garland]
ุ Add some partial specification information
to C code to
ุ Detect potential bugs
ุ Enforce coding style
v Versatile and lightweight
ุ Incremental gain for incremental effort
ุ Fits in with other tools
ุ Detects potential bugs
ุ Specifications enable more accurate checks,
messages
ุ Memory management a particular problem in
the C language
ุ Enforces coding style
ุ Abstraction boundaries
ุ Use of mutable and immutable types
v LCLint Does Not
ุ Encourage programmer to write
ง
Contorted
code
ง
Inefficient
code
ุ Report only actual errors
ุ Report all errors
ุ Insist on reporting a fixed set of
potential errors
ง
Many options
and control flags
v Ex: Definition before Use
ุ Sample code
if
(setVal(n, &buffer)) ...
ุ Must buffer be defined before calling setVal?
ง
Yes: bool
setVal(int d, char *val);
ง
No: bool setVal(int d, out
char *val);
ุ Is buffer defined afterwards?
ง
Yes: bool setVal(...); {modifies *val;}
ง
Maybe: bool
setVal(...); {modifies nothing;}
ง
NO!: bool
setVal(...); {ensures trashed(val);}
v More Accurate Checks
ุ Conventional lint tools report
ง
Too many
spurious errors
ง
Too few
actual errors
ุ Because
ง
Code does not
reveal the programmers intent
ง
Fast checks
require simplifying assumptions
ง
Specifications
give good simplifying assumptions
v Abstraction Boundaries
ุ Client code should rely only on
specifications
ุ Types can be specified as abstract
ง
immutable
type date;
date nextDay(date d); { }
mutable type set;
void merge(set s, set t); {modifies s;}
v LCLint detects
ุ Inappropriate access to representation
ุ Including use of ==
v Checking Abstract Types
ุ Specification: set.lcl contains the single
line
mutable
type set;
ุ Client code
#include
"set.h"
bool f(set s, set t) {
if (s->size > 0) return (s ==
t);
...
ุ >
lclint set client.c
client.c:4,7: Arrow access field of abstract type (set): s->size
client.c:5,13: Operands of == are abstract type (set): s == t
v Checking Side Effects
ุ Specification:
void
set_insert (set s, int e)
{ modifies s;}
void set_union(set s, set t)
{ modifies s;}
ุ Code (in set.c) :
void
set_union (set s, set t) {
int i;
for (i = 0; i < s->size; i++)
set_insert(t, s->elements[i]);
}
ุ Message:
set.c:35,
27: Called procedure set_insert may modify t:
set_insert(t, s->elements[i])
v Checking Use of Memory
ุ Specifications
only char *gname;
. . .
void setName (temp char *pname) char *gname;
ุ Code
void
setName (char *pname) {
gname = pname;
}
ุ LCLint error messages
sample.c:2:3:
Only storage gname not released before assignment: gname = pname
sample.c:2:3: Temp storage assigned to only: gname = pname
v If C Were Better...
ุ Would LCLint still help?
ุ Yes, because specifications
ง
contain
information not in code
ง
contain
information that is hard to infer from code
ง
are usable
with legacy code, existing compilers
ง
can be
written faster than languages can be changed
ง
are important
even with better languages
v Experience with LCLint
ุ Reliable and efficient
ุ Runs at compiler speed
ุ Used on both new and legacy code
ุ 1,000-200,000 line programs
ุ Over 500 users have sent e-mail to MIT
ุ Tested with varying amounts of
specification
ุ Lots to almost none
ุ LCLint approximates missing specifications
ุ Results encouraging
v Understanding Legacy Code
ุ Analyzed interpreter (quake) built at DEC
SRC
ุ Discovered latent bugs (ordinary lint can
do this)
ุ Discovered programming conventions
ุ Documented use of built-in types (int,
char, bool)
ุ Identified (and repaired) (nearly) abstract
types
ุ Documented action of procedures
ุ Use of global information, side-effects
ุ Enhanced documentation a common thread
ุ Easier to read and write because formulaic
ุ More trustworthy because checked
v Fundamental benefit
ุ Partial specifications
ุ Low entry cost
ุ You get what you pay for (or maybe a bit
more)
v Purify
ุ Dynamic (commercial) tool for detecting
memory leaks and access errors; there are others, such as Bounds Checker
ง
http://www.rational.com/products/purify/
ุ In some sense, a dynamic analog to the
memory checking aspects of LCLint
v Trapping every memory access would be
overly expensive
ุ Purify inserts function calls before loads
and stores to maintain a table that holds a 2-bit state for each byte in the
heap and stack
ุ Requires working with malloc/free, too
v Memory State Transitions
ุ Writing to memory with any bytes that are
unwriteable prints a diagnostic
ุ Same with reading unreadable bytes
v Other violations
ุ Array Bound Violations
ง
Allocate a
red-zone around malloc blocks, recording them as unwriteable and unreadable
ุ Uninitialized variables
ง
Initialize
them to allocated-but-uninitialized state
ุ Accessing freed memory
ง
Do not
reallocate memory until it has aged
ง
Aging can be
specified by the user in terms of number of calls to free
v Overhead
ุ 2 bits per state; about 25% memory overhead
during development
ุ Run-time no worse than 5.5 times optimized
C code (and usually no worse than compiling with debugging on)