Valgrind is licensed under the GNU General Public License, version 2.
An open-source tool for supervising execution of Linux-x86 executables.
Although writing a tool is not easy, and requires learning quite a few things about Valgrind, it is much easier than instrumenting a program from scratch yourself.
The core provides the common low-level infrastructure to support program instrumentation, including the x86-to-x86 JIT compiler, low-level memory manager, signal handling and a scheduler (for pthreads). It also provides certain services that are useful to some but not all tools, such as support for error recording and suppression.
But the core leaves certain operations undefined, which must be filled in by tools. Most notably, tools define how program code should be instrumented. They can also define certain variables to indicate to the core that they would like to use certain services, or be notified when certain interesting events occur. But the core takes care of all the hard work.
Code executed in user space includes all the program code, almost all of the C library (including things like the dynamic linker), and almost all parts of all other libraries.
malloc() etc.)
A tool has no control over these operations; it never ``sees'' the code doing this work and thus cannot instrument it. However, the core provides hooks so a tool can be notified when certain interesting events happen, for example when dynamic memory is allocated or freed, the stack pointer is changed, or a pthread mutex is locked.
Note that these hooks only notify tools of events relevant to user space. For example, when the core allocates some memory for its own use, the tool is not notified of this, because it's not directly part of the supervised program's execution.
A tool only has direct control over code executed in user space. This is the vast majority of code executed, but not all of it, so any profiling information recorded by a tool won't be totally accurate.
_dl_runtime_resolve()); the number of basic blocks, x86 instructions and
UCode instructions executed; the number of branches executed and the
proportion of those which were taken.
cg_annotate script to annotate source code.
The biggest difficulty with this is the simulation; the chip-makers are very cagey about how their chips do branch prediction. But implementing one or more of the basic algorithms could still give good information.
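As a minimal sketch of one such basic algorithm, the following models a table of 2-bit saturating counters indexed by branch address. It is not taken from Valgrind or from any real chip; the table size, the type BranchPredictor and the functions bp_init() and bp_record() are all hypothetical names chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical table size; a real tool would choose and tune this. */
#define PRED_TABLE_SIZE 1024u

/* One 2-bit saturating counter per slot:
   0 and 1 predict "not taken", 2 and 3 predict "taken". */
typedef struct {
    unsigned char counter[PRED_TABLE_SIZE];
    unsigned long branches;      /* conditional branches seen */
    unsigned long mispredicts;   /* how many were predicted wrongly */
} BranchPredictor;

static void bp_init(BranchPredictor *bp)
{
    size_t i;
    for (i = 0; i < PRED_TABLE_SIZE; i++)
        bp->counter[i] = 1;      /* start at "weakly not taken" */
    bp->branches = 0;
    bp->mispredicts = 0;
}

/* Record one executed conditional branch at address addr.
   taken is non-zero if the branch was taken.
   Returns 1 if the simulated predictor guessed correctly, else 0. */
static int bp_record(BranchPredictor *bp, unsigned long addr, int taken)
{
    unsigned char *c = &bp->counter[addr % PRED_TABLE_SIZE];
    int predicted_taken = (*c >= 2);
    int correct = (predicted_taken == (taken != 0));

    bp->branches++;
    if (!correct)
        bp->mispredicts++;

    /* Move the counter towards the actual outcome, saturating at 0 and 3. */
    if (taken) {
        if (*c < 3) (*c)++;
    } else {
        if (*c > 0) (*c)--;
    }
    return correct;
}
```

A tool built along these lines would call something like bp_record() for each conditional branch executed, and report mispredicts against branches at the end of the run.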
It would be easy to write a coverage tool that records how many times
each basic block was executed. Again, the cg_annotate script could be used
for annotating source code with the gathered information. However,
cg_annotate is only designed for working with single program runs. It could
be extended relatively easily to deal with multiple runs of a program, so
that the coverage of a whole test suite could be determined.
In addition to the standard coverage information, such a tool could record extra information that would help a user generate test cases to exercise unexercised paths. For example, for each conditional branch, the tool could record all inputs to the conditional test, and print these out when annotating.
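The multi-run idea can be sketched as follows, assuming each run produces an array of execution counts indexed by a basic-block id. That representation and the names cov_merge() and cov_blocks_hit() are assumptions of this example, not part of Valgrind or cg_annotate.

```c
#include <stddef.h>

/* Add the per-basic-block execution counts of one run into a running
   total. Both arrays are indexed by a (hypothetical) basic-block id. */
static void cov_merge(unsigned long *total, const unsigned long *run,
                      size_t n_blocks)
{
    size_t i;
    for (i = 0; i < n_blocks; i++)
        total[i] += run[i];
}

/* Number of basic blocks executed at least once. */
static size_t cov_blocks_hit(const unsigned long *counts, size_t n_blocks)
{
    size_t i, hit = 0;
    for (i = 0; i < n_blocks; i++)
        if (counts[i] > 0)
            hit++;
    return hit;
}
```

Merging every run of a test suite into one total and then counting the blocks hit gives the suite-wide coverage figure; blocks whose total stays at zero are the unexercised ones.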
Debugging via Run-Time Type Checking
Alexey Loginov, Suan Hsi Yong, Susan Horwitz and Thomas Reps
Proceedings of Fundamental Approaches to Software Engineering
April 2001.
Similar is the tool described in this paper:
Run-Time Type Checking for Binary Programs
Michael Burrows, Stephen N. Freund, Janet L. Wiener
Proceedings of the 12th International Conference on Compiler Construction (CC 2003)
April 2003.
These approaches can find quite a range of bugs, particularly in C and C++ programs, and could be implemented quite nicely as a Valgrind tool.
Ways to speed up this run-time type checking are described in this paper:
Reducing the Overhead of Dynamic Analysis
Suan Hsi Yong and Susan Horwitz
Proceedings of Runtime Verification '02
July 2002.
Valgrind's client requests could be used to pass information to a tool about which elements need instrumentation and which don't.
This is achieved by packaging each tool into a separate shared object which is
then loaded ahead of the core shared object valgrind.so, using the dynamic
linker's LD_PRELOAD variable. Any functions defined in the tool that share
the name with a function defined in core (such as the instrumentation
function SK_(instrument)()) override the core's definition. Thus the core
can call the necessary tool functions.
This magic is all done for you; the shared object used is chosen with the
--tool option to the valgrind startup script. The default tool used is
memcheck, Valgrind's original memory checker.
To check out the code from the CVS repository, first log in:
cvs -d:pserver:anonymous@cvs.valgrind.sourceforge.net:/cvsroot/valgrind login
Then check out the code. To get a copy of the current development version
(recommended for the brave only):
cvs -z3 -d:pserver:anonymous@cvs.valgrind.sourceforge.net:/cvsroot/valgrind co valgrind
To get a copy of the stable released branch:
cvs -z3 -d:pserver:anonymous@cvs.valgrind.sourceforge.net:/cvsroot/valgrind co -r TAG valgrind
where TAG has the form VALGRIND_X_Y_Z for version X.Y.Z.
Valgrind uses automake and autoconf for the creation of Makefiles and
configuration. But don't worry, these instructions should be enough to get
you started even if you know nothing about those tools.
In what follows, all filenames are relative to Valgrind's top-level directory
valgrind/.
The steps below use the tool name foobar and the prefix fb as an example.
Create a directory foobar/ which will hold the tool.
Copy none/Makefile.am into foobar/.
Edit it by replacing all occurrences of the string ``none'' with ``foobar''
and the one occurrence of the string ``nl_'' with ``fb_''.
It might be worth trying to understand this file, at least a little; you
might have to do more complicated things with it later on. In particular,
the name of the vgskin_foobar_so_SOURCES variable determines the name of the
tool's shared object, which determines what name must be passed to the
--tool option to use the tool.
Copy none/nl_main.c into foobar/, renaming it as fb_main.c.
Edit it by changing the lines in SK_(pre_clo_init)() to something
appropriate for the tool. These fields are used in the startup message,
except for bug_reports_to, which is used if a tool assertion fails.
Edit Makefile.am, adding the new directory foobar to the SUBDIRS variable.
Edit configure.in, adding foobar/Makefile to the AC_OUTPUT list.
Run:
autogen.sh
./configure --prefix=`pwd`/inst
make install
It should automake, configure and compile without errors, putting copies of
the tool's shared object vgskin_foobar.so in foobar/ and inst/lib/valgrind/.
inst/bin/valgrind --tool=foobar date
(almost any program should work; date is just an example).
The output should be something like this:
==738== foobar-0.0.1, a foobarring tool for x86-linux.
==738== Copyright (C) 1066AD, and GNU GPL'd, by J. Random Hacker.
==738== Built with valgrind-1.1.0, a program execution monitor.
==738== Copyright (C) 2000-2003, and GNU GPL'd, by Julian Seward.
==738== Estimated CPU clock rate is 1400 MHz
==738== For more details, rerun with: -v
==738==
Wed Sep 25 10:31:54 BST 2002
==738==
The tool does nothing except run the program uninstrumented.
--prefix for ./configure.
Now that we've set up, built and tested the simplest possible tool, onto the interesting stuff...
A tool must define these four functions:
SK_(pre_clo_init)()
SK_(post_clo_init)()
SK_(instrument)()
SK_(fini)()
Also, it must use the macro VG_DETERMINE_INTERFACE_VERSION exactly once in
its source code. If it doesn't, you will get a link error involving
VG_(skin_interface_major_version). This macro is used to ensure that the
core/tool interfaces used by the core and a plugged-in tool are binary
compatible.
In addition, if a tool wants to use some of the optional services provided by
the core, it may have to define other functions.
Initialisation should normally be done in SK_(pre_clo_init)().
Only use SK_(post_clo_init)() if a tool provides command line options and
must do some initialisation after option processing takes place (``clo''
stands for ``command line options'').
First of all, various ``details'' need to be set for a tool, using the
functions VG_(details_*)(). Some are compulsory, some aren't. Some are used
when constructing the startup message; detail_bug_reports_to is used if
VG_(skin_panic)() is ever called, or a tool assertion fails. Others have
other uses.
Second, various ``needs'' can be set for a tool, using the functions
VG_(needs_*)(). They are mostly booleans, and can be left untouched (they
default to False). They determine whether a tool can do various things such
as: record, report and suppress errors; process command line options; wrap
system calls; record extra information about malloc'd blocks, etc.
For example, if a tool wants the core's help in recording and reporting errors,
it must set the skin_errors need to True, and then provide definitions of six
functions for comparing errors, printing out errors, reading suppressions
from a suppressions file, etc. While writing these functions requires some
work, it's much less than doing error handling from scratch because the core
is doing most of the work. See the type VgNeeds in include/vg_skin.h for
full details of all the needs.
Third, the tool can indicate which events in core it wants to be notified
about, using the functions VG_(track_*)(). These include things such as
blocks of memory being malloc'd, the stack pointer changing, a mutex being
locked, etc. If a tool wants to know about one of these, it should set the
relevant pointer in the structure to point to a function, which will be
called when that event happens.
For example, if the tool wants to be notified when a new block of memory is
malloc'd, it should call VG_(track_new_mem_heap)() with an appropriate
function pointer, and the assigned function will be called each time this
happens.
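The registration pattern can be sketched in isolation. The code below does not use Valgrind's headers: mock_track_new_mem_heap() and mock_core_heap_alloc() are hypothetical stand-ins for the core's side, and the callback signature is an assumption made for the example (the real one is in include/vg_skin.h).

```c
#include <stddef.h>

/* Assumed callback shape for this sketch only. */
typedef void (*NewMemHeapFn)(unsigned long addr, unsigned long len);

/* Stand-in for the core's record of who wants this event.
   NULL means the tool is not interested. */
static NewMemHeapFn tracked_new_mem_heap = NULL;

/* Stand-in for a VG_(track_*)() registration function. */
static void mock_track_new_mem_heap(NewMemHeapFn f)
{
    tracked_new_mem_heap = f;
}

/* What the core would do when the supervised program allocates a
   heap block: notify the tool, but only if it registered a function. */
static void mock_core_heap_alloc(unsigned long addr, unsigned long len)
{
    if (tracked_new_mem_heap != NULL)
        tracked_new_mem_heap(addr, len);
}

/* Tool side: keep a running total of heap bytes allocated. */
static unsigned long fb_heap_bytes = 0;

static void fb_new_mem_heap(unsigned long addr, unsigned long len)
{
    (void)addr;              /* the address is unused in this example */
    fb_heap_bytes += len;
}
```

In a real tool the registration call would most naturally sit alongside the other initialisation, and the tool never calls its own callback directly; the core does.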
More information about ``details'', ``needs'' and ``trackable events'' can be
found in include/vg_skin.h.
SK_(instrument)() is the interesting one. It allows you to instrument UCode,
which is Valgrind's RISC-like intermediate language. UCode is described in
the technical docs for Memcheck.
The easiest way to instrument UCode is to insert calls to C functions when
interesting things happen. See the tool ``Lackey'' (lackey/lk_main.c) for a
simple example of this, or Cachegrind (cachegrind/cg_main.c) for a more
complex example.
A much more complicated way to instrument UCode, albeit one that might result in faster instrumented programs, is to extend UCode with new UCode instructions. This is recommended for advanced Valgrind hackers only! See Memcheck for an example.
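To make the "insert calls to C functions" idea concrete, here is a toy model. It is deliberately not UCode and uses none of Valgrind's API: the instruction kinds, the Block type and the functions toy_instrument() and toy_run() are all invented for this sketch. Lackey remains the real example to read.

```c
#include <stddef.h>

/* A toy instruction set; real UCode is much richer. */
typedef enum { OP_LOAD, OP_STORE, OP_ADD, OP_CALL_HELPER } OpKind;

typedef struct { OpKind kind; } Instr;

#define MAX_INSTRS 64

typedef struct {
    Instr  instrs[MAX_INSTRS];
    size_t used;
} Block;

static unsigned long n_mem_accesses = 0;

/* The C helper that instrumented code calls. */
static void count_mem_access(void) { n_mem_accesses++; }

static void block_append(Block *b, OpKind kind)
{
    if (b->used < MAX_INSTRS)
        b->instrs[b->used++].kind = kind;
}

/* Instrumentation pass: copy in to out, inserting a helper call
   before every load or store. */
static void toy_instrument(const Block *in, Block *out)
{
    size_t i;
    out->used = 0;
    for (i = 0; i < in->used; i++) {
        OpKind k = in->instrs[i].kind;
        if (k == OP_LOAD || k == OP_STORE)
            block_append(out, OP_CALL_HELPER);
        block_append(out, k);
    }
}

/* Toy "execution": only helper calls have an observable effect here. */
static void toy_run(const Block *b)
{
    size_t i;
    for (i = 0; i < b->used; i++)
        if (b->instrs[i].kind == OP_CALL_HELPER)
            count_mem_access();
}
```

The point of the shape is that instrumentation happens once per block, while the helper runs every time the instrumented block executes.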
The file include/vg_skin.h contains all the types, macros, functions, etc.
that a tool should (hopefully) need, and is the only .h file a tool should
need to #include.
In particular, you probably shouldn't use anything from the C library (there
are deep reasons for this, trust us). Valgrind provides an implementation of a
reasonable subset of the C library, details of which are in vg_skin.h.
Similarly, when writing a tool, you shouldn't need to look at any of the code in Valgrind's core, although it might sometimes help you understand something.
vg_skin.h has a reasonable amount of documentation in it that should
hopefully be enough to get you going. But ultimately, the tools distributed
(Memcheck, Addrcheck, Cachegrind, Lackey, etc.) are probably the best
documentation of all, for the moment.
Note that the VG_ and SK_ macros are used heavily. These just prepend longer
strings in front of names to avoid potential namespace clashes. We strongly
recommend using the SK_ macro for any global functions and variables in your
tool, or writing a similar macro.
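A "similar macro" could look like the sketch below. The macro names FB_ and FB_APPEND and the prefix fbTool_ are made up for this example; the prefixes that VG_ and SK_ actually prepend are defined in include/vg_skin.h.

```c
/* Paste a prefix onto a name. */
#define FB_APPEND(prefix, name) prefix##name

/* FB_(foo) expands to fbTool_foo. */
#define FB_(name) FB_APPEND(fbTool_, name)

/* The linker sees this variable as fbTool_n_blocks. */
unsigned long FB_(n_blocks) = 0;

/* The linker sees this function as fbTool_count_block. */
unsigned long FB_(count_block)(void)
{
    return ++FB_(n_blocks);
}
```

Because every global carries the long prefix at link time, a tool global called n_blocks cannot collide with a symbol of the same short name elsewhere, while the source still reads as FB_(n_blocks).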
If you are getting segmentation faults in C functions used by your tool, the usual GDB command:
gdb prog core
usually gives the location of the segmentation fault.
If you want to debug C functions used by your tool, you can attach GDB to Valgrind with some effort:
Enable the following code in coregrind/vg_main.c by changing if (0) into
if (1):
/* Hook to delay things long enough so we can get the pid and attach
   GDB in another shell. */
if (0) {
   Int p, q;
   for (p = 0; p < 50000; p++)
      for (q = 0; q < 50000; q++) ;
}
and rebuild Valgrind.
valgrind prog
Valgrind starts the program, printing its process id, and then delays for
a few seconds (you may have to change the loop bounds to get a suitable
delay).
gdb prog pid
-fomit-frame-pointer, and you'll need to get rid of this to extract useful
tracebacks from GDB.
If you just want to know whether a program point has been reached, using the
OINK macro (in include/vg_skin.h) can be easier than using GDB.
If you are having problems with your UCode instrumentation, it's likely that
GDB won't be able to help at all. In this case, Valgrind's --trace-codegen
option is invaluable for observing the results of instrumentation.
The other debugging command line options can be useful too (run valgrind -h
for the list).
valgrind/*.supp; the final suppression file is aggregated from these files
by combining the relevant .supp files depending on the versions of Linux, X
and glibc on a system.
Suppression types have the form tool_name:suppression_name. The tool_name
here is the name you specify for the tool during initialisation with
VG_(details_name)().
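The tool_name:suppression_name form is easy to take apart. The sketch below shows one way to split such a string and check which tool it belongs to; split_supp_type() and supp_is_for_tool() are hypothetical helpers written with the standard C library for illustration, not functions from Valgrind's core (which does this parsing for you).

```c
#include <string.h>

/* Split "tool_name:suppression_name" at the first colon, copying the
   two parts into caller-supplied buffers of cap bytes each.
   Returns 1 on success; 0 if there is no colon, a part is empty,
   or a part does not fit. */
static int split_supp_type(const char *s, char *tool, char *supp,
                           size_t cap)
{
    const char *colon = strchr(s, ':');
    size_t tool_len, supp_len;

    if (colon == NULL)
        return 0;
    tool_len = (size_t)(colon - s);
    supp_len = strlen(colon + 1);
    if (tool_len == 0 || supp_len == 0 ||
        tool_len >= cap || supp_len >= cap)
        return 0;

    memcpy(tool, s, tool_len);
    tool[tool_len] = '\0';
    memcpy(supp, colon + 1, supp_len + 1);   /* copies the NUL too */
    return 1;
}

/* Does this suppression type belong to the tool with the given name? */
static int supp_is_for_tool(const char *s, const char *tool_name)
{
    char tool[64], supp[64];
    return split_supp_type(s, tool, supp, sizeof tool)
           && strcmp(tool, tool_name) == 0;
}
```

For the example tool, a type such as foobar:SomeError (the suppression name is invented here) would match only when the tool was initialised with the name foobar.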
foobar as the example tool name again):
Create a directory foobar/docs/.
Edit foobar/Makefile.am, adding docs to the SUBDIRS variable.
Edit configure.in, adding foobar/docs/Makefile to the AC_OUTPUT list.
Write foobar/docs/Makefile.am. Use memcheck/docs/Makefile.am as an example.
foobar/docs/.
Create a directory foobar/tests/.
Edit foobar/Makefile.am, adding tests to the SUBDIRS variable.
Edit configure.in, adding foobar/tests/Makefile to the AC_OUTPUT list.
Write foobar/tests/Makefile.am. Use memcheck/tests/Makefile.am as an example.
.vgtest test description files, .stdout.exp and .stderr.exp expected output
files. (Note that Valgrind's output goes to stderr.) Some details on writing
and running tests are given in the comments at the top of the testing script
tests/vg_regtest.
foobar/tests/filter_stderr. It can call the existing filters in tests/. See
memcheck/tests/filter_stderr for an example; in particular note the $dir
trick that ensures the filter works correctly from any directory.
#include "vg_profile.c"
in the tool somewhere, and rebuild (you may have to make clean first). Then
run Valgrind with the --profile=yes option.
The profiler is stack-based; you can register a profiling event with
VGP_(register_profile_event)() and then use the VGP_PUSHCC and VGP_POPCC
macros to record time spent doing certain things. New profiling event
numbers must not overlap with the core profiling event numbers. See
include/vg_skin.h for details and Memcheck for an example.
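The stack-based scheme can be modelled in a few lines. This is only a sketch of the idea, not the code in vg_profile.c: the Profiler type and the prof_*() functions are hypothetical, and charging time in discrete ticks to the event on top of the stack is an assumption of the model.

```c
#include <stddef.h>

#define MAX_EVENTS 16
#define MAX_DEPTH  32

typedef struct {
    const char   *name[MAX_EVENTS];   /* NULL means "not registered" */
    unsigned long ticks[MAX_EVENTS];  /* time charged to each event  */
    int           stack[MAX_DEPTH];   /* currently active events     */
    int           depth;
} Profiler;

static void prof_init(Profiler *p)
{
    int i;
    for (i = 0; i < MAX_EVENTS; i++) {
        p->name[i] = NULL;
        p->ticks[i] = 0;
    }
    p->depth = 0;
}

/* Register event number n. Returns 0 if n is out of range or already
   taken, which is how an overlapping number is caught in this model. */
static int prof_register(Profiler *p, int n, const char *name)
{
    if (n < 0 || n >= MAX_EVENTS || p->name[n] != NULL)
        return 0;
    p->name[n] = name;
    return 1;
}

/* Enter a region: event n becomes the one being charged. */
static void prof_push(Profiler *p, int n)
{
    if (n >= 0 && n < MAX_EVENTS && p->depth < MAX_DEPTH)
        p->stack[p->depth++] = n;
}

/* Leave the current region: charging returns to the enclosing one. */
static void prof_pop(Profiler *p)
{
    if (p->depth > 0)
        p->depth--;
}

/* One timer tick: charge it to the event on top of the stack, if any. */
static void prof_tick(Profiler *p)
{
    if (p->depth > 0)
        p->ticks[p->stack[p->depth - 1]]++;
}
```

In this model prof_push() and prof_pop() play the roles of the push and pop macros, so time spent in a nested region is charged to the inner event rather than the outer one.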
valgrind/foobar/, you will need to add an appropriate Makefile.am to it, and
add a corresponding entry to the AC_OUTPUT list in valgrind/configure.in.
If you add any scripts to your tool (see Cachegrind for an example) you need to
add them to the bin_SCRIPTS variable in valgrind/foobar/Makefile.am.
VG_DETERMINE_INTERFACE_VERSION macro exactly once in its code. If not, a
link error will occur when the tool is built.
The interface version number has the form X.Y. Changes in Y indicate binary compatible changes. Changes in X indicate binary incompatible changes. If the core and tool have the same major version number X they should work together. If X doesn't match, Valgrind will abort execution with an explanation of the problem.
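The compatibility rule reduces to a comparison of major numbers. The sketch below states it as code; the InterfaceVersion type and interface_compatible() are hypothetical names, and this is an illustration of the rule rather than the check the core actually performs.

```c
typedef struct {
    int major_ver;   /* X: a change here is binary incompatible */
    int minor_ver;   /* Y: a change here is binary compatible   */
} InterfaceVersion;

/* Core and tool can work together when their major numbers match;
   the minor number does not affect the decision. */
static int interface_compatible(InterfaceVersion core,
                                InterfaceVersion tool)
{
    return core.major_ver == tool.major_ver;
}
```

So a tool built against X.2 would be accepted by a core at X.0, while any difference in X would be rejected regardless of Y.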
This approach was chosen so that if the interface changes in the future, old tools won't work and the reason will be clearly explained, instead of possibly crashing mysteriously. We have attempted to minimise the potential for binary incompatible changes by means such as minimising the use of naked structs in the interface.
The first consequence of this is that the core/tool interface will continue to change in the future; we have no intention of freezing it and then regretting the inevitable stupidities. Hopefully most of the future changes will be to add new features, hooks, functions, etc, rather than to change old ones, which should cause a minimum of trouble for existing tools, and we've put some effort into future-proofing the interface to avoid binary incompatibility. But we can't guarantee anything. The versioning system should catch any incompatibilities. Just something to be aware of.
The second consequence of this is that we'd love to hear your feedback about it:
Happy programming.