v LCLint [Evans et al.]
� [Material taken in part from a talk by S.
Garland]
� Add some partial specification information
to C code to
� Detect potential bugs
� Enforce coding style
v Versatile and lightweight
� Incremental gain for incremental effort
� Fits in with other tools
� Detects potential bugs
� Specifications enable more accurate checks,
messages
� Memory management a particular problem in
the C language
� Enforces coding style
� Abstraction boundaries
� Use of mutable and immutable types
v LCLint Does Not
� Encourage programmer to write
�
Contorted
code
�
Inefficient
code
� Report only actual errors
� Report all errors
� Insist on reporting a fixed set of
potential errors
�
Many options
and control flags
v Ex: Definition before Use
� Sample code
if
(setVal(n, &buffer)) ...
� Must buffer be defined before calling setVal?
�
Yes: bool
setVal(int d, char *val);
�
No:����������� bool setVal(int d, out
char *val);
� Is buffer defined afterwards?
�
Yes:���� bool setVal(...); {modifies *val;}
�
Maybe: bool
setVal(...); {modifies nothing;}
�
NO!:��� bool
setVal(...); {ensures trashed(val);}
v More Accurate Checks
� Conventional lint tools report
�
Too many
spurious errors
�
Too few
actual errors
� Because
�
Code does not
reveal the programmer�s intent
�
Fast checks
require simplifying assumptions
�
Specifications
give good simplifying assumptions
v Abstraction Boundaries
� Client code should rely only on
specifications
� Types can be specified as abstract
�
immutable
type date;
date nextDay(date d); { }
mutable type set;
void merge(set s, set t); {modifies s;}
v LCLint detects
� Inappropriate access to representation
� Including use of ==
v Checking Abstract Types
� Specification: set.lcl contains the single
line
mutable
type set;
� Client code
#include
"set.h"
bool f(set s, set t) {
� if (s->size > 0) return (s ==
t);
� ...
� >
lclint set client.c
client.c:4,7: Arrow access field of abstract type (set): s->size
client.c:5,13: Operands of == are abstract type (set):� s == t
v Checking Side Effects
� Specification:
void
set_insert (set s, int e)
���� { modifies s;}
void set_union(set s, set t)
���� { modifies s;}
� Code (in set.c) :
void
set_union (set s, set t) {
��� int i;
��� for (i = 0; i < s->size; i++)
������ set_insert(t, s->elements[i]);
��� }
� Message:
set.c:35,
27: Called procedure set_insert may modify t:
set_insert(t, s->elements[i])
v Checking Use of Memory
� Specifications
only char *gname;
. . .
void setName (temp char *pname) char *gname;
� Code
void
setName (char *pname) {
�� gname = pname;
}
� LCLint error messages
sample.c:2:3:
Only storage gname not released before assignment: gname = pname
sample.c:2:3: Temp storage assigned to only: gname = pname
v If C Were Better...
� Would LCLint still help?
� Yes, because specifications
�
contain
information not in code
�
contain
information that is hard to infer from code
�
are usable
with legacy code, existing compilers
�
can be
written faster than languages can be changed
�
are important
even with better languages
v Experience with LCLint
� Reliable and efficient
� Runs at compiler speed
� Used on both new and legacy code
� 1,000-200,000 line programs
� Over 500 users have sent e-mail to MIT
� Tested with varying amounts of
specification
� Lots to almost none
� LCLint approximates missing specifications
� Results encouraging
v Understanding Legacy Code
� Analyzed interpreter (quake) built at DEC
SRC
� Discovered latent bugs (ordinary lint can
do this)
� Discovered programming conventions
� Documented use of built-in types (int,
char, bool)
� Identified (and repaired) (nearly) abstract
types
� Documented action of procedures
� Use of global information, side-effects
� Enhanced documentation a common thread
� Easier to read and write because formulaic
� More trustworthy because checked
v Fundamental benefit
� Partial specifications
� Low entry cost
� You get what you pay for (or maybe a bit
more)
v Purify
� Dynamic (commercial) tool for detecting
memory leaks and access errors; there are others, such as Bounds Checker
�
http://www.rational.com/products/purify/
� In some sense, a dynamic analog to the
memory checking aspects of LCLint
v Trapping every memory access would be
overly expensive
� Purify inserts function calls before loads
and stores to maintain a table that holds a 2-bit state for each byte in the
heap and stack
� Requires working with malloc/free, too
v Memory State Transitions
� Writing to memory with any bytes that are
unwriteable prints a diagnostic
� Same with reading unreadable bytes
v Other violations
� Array Bound Violations
�
Allocate a
�red-zone� around malloc blocks, recording them as unwriteable and unreadable
� Uninitialized variables
�
Initialize
them to allocated-but-uninitialized state
� Accessing freed memory
�
Do not
reallocate memory until it has aged
�
Aging can be
specified by the user in terms of number of calls to free
v Overhead
� 2 bits per state; about 25% memory overhead
during development
� Run-time no worse than 5.5 times optimized
C code (and usually no worse than compiling with debugging on)