% lec 16: valgrind # administrivia hw3 due next Thursday hw4 out / new lab (talk to staff) weeks 8 & 9: networking & sockets # today's plan memory errors valgrind other debugging techniques (weeks 9 & 10: helgrind, static checkers) # memory errors out-of-bounds read/write read before initialization read/write after free double free memory leak # consequences segfaults: hardware & OS traps worse, anything can happen control-flow hijack, code injection, ... hard to find: non-deterministic & unpredictable why this price: performance & control # what's correct safety property: must be true all the time liveness property: that will eventually be true bugs: violations of safety & liveness properties # bug reports false positives: incorrect bug reports false negatives: missing bug reports # how to detect memory errors high-level idea: state machine strawman: two-bit state for every bit ![](l16/fsm.svg) do we have/need per-bit alloc? # shadow memory & registers V bit: valid-value bit for every _bit_ A bit: valid-address bit for every _byte_ overhead: 1 memory byte - 9 shadow bits (8 V and 1 A) 1 register byte - 8 shadow bits (8 V) two-level table: 4GB address space primary map - 64KB entries, secondary map - 64KB V & 8KB A see further readings for 64-bit and optimizations # implementation how to monitor memory and registers? virtualization x86 → VEX IR → instrumentation → x86 system calls? model each system call multiple threads? serialized execution relate bugs to source code? `gcc -g` # false positives when to report bugs: need to reduce false positives every memory op? delay reporting (e.g., until externally visible) in glibc? whitelist: glibc-*.supp # false negatives every path others? [evil.c](l16/evil.c) # adjacent areas from two mallocs Geoffrey Liu: ```c +-------------+ | char *a | ... | | <-- a[99]: valgrind doesn't complain +-------------+ |.............| <-- a[100]: valgrind catches this |..some non-..| |..mallocated.| |....space....| |.............| +-------------+ | char *b | ... <-- a[500]: valgrind doesn't complain | | +-------------+ ``` q: how would you improve valgrind to address this limitation? # other limitations high-level bugs? porting effort can a buggy program overwrite valgrind's shadow memory? # bug example from hw2 ```c // HW2: wds->docIDs hash table // Each entry's key is a docID and the value // is the "positions" word positions list. HTIter iter = HashTableMakeIterator(wds->docIDs); ... HTKeyValue kv; HTIteratorGet(iter, &kv); ... = NumElementsInHashTable(kv.value); ... HTIteratorFree(iter); ``` . . . should be `NumElementsInLinkedList(...)`! valgrind's warnings are hard to read for this case... # bug example from last lecture ```c++ #include #include int main() { std::vector a {1, 2, 3}; for (auto i = a.begin(); i != a.end(); ++i) a.push_back(*i); for (auto i : a) std::cout << i << std::endl; } ``` no type information # other memory debugging tools Linux kernel's [kmemcheck](https://www.kernel.org/doc/Documentation/kmemcheck.txt) gdb's [watchpoints](https://sourceware.org/gdb/onlinedocs/gdb/Set-Watchpoints.html) [Google's Address Sanitizer](https://code.google.com/p/address-sanitizer/): "`gcc -fsanitizer=address`" static checkers from [lec 14](l14-smartptr.html) # more valgrind-based tools [valgrind's tool suite](http://valgrind.org/info/tools.html) week 9: data race detection (helgrind) [privacy leak measurement](http://people.csail.mit.edu/smcc/projects/secret-flow/)