Debugging Tips and Tools

This page provides tools and tips for debugging your assignments in CSE 333. Because this course involves a considerable amount of programming, we recommend getting comfortable with several different tools and strategies for finding and fixing bugs.

Please also take a look at the links in the navigation bar under "Debug", which give more thorough documentation for GDB and Valgrind!

Debugging Tips

Code Walkthrough

Walking through your program's code is a good way to compare your expectations against the reality of the code execution. Finding where the two diverge from each other can provide the scope in which a bug in your code or a conceptual misunderstanding may exist. Here are some things to keep in mind:

Take your time! Making sure you understand why your code leads to the erroneous behavior will often illuminate the bug or misunderstanding and is a much more efficient debugging strategy than just trying a bunch of different things randomly.
Try different inputs to your program. How does the input impact the execution of your program? Are there edge cases to take into consideration?
Double-check your usage of library functions or objects against their reference documentation.
Make use of the debugging tools at your disposal, as they are often the only way to truly see the reality of the code execution.
- As 333 deals heavily with your computer's memory, you will often need to examine pieces of memory (e.g., the stack, the heap) using GDB and not just variable values.
- A new wrinkle in 333 will be dealing with files, so you will use additional tools to examine the bytes of files stored on disk.

“Fail Fast”

Debugging is difficult when you do not know where the bug is in your code! “Fail fast” is an approach in which you check the expected behavior of your program early and often, so that a bug is reported as close as possible to the place where it occurs. Here are some ways to incorporate this into your programming:

If you are unsure about the current state of a variable or expression in a program, try inserting an assertion (i.e., assert() for exercises and Verify333() for homework) to check that a condition you expect to hold actually does. However, please make sure to remove these checks before submission!
You should test for correctness after attempting/completing each part of your assignment so you can isolate/restrict the source of bugs.
- For exercises, this typically means trying different command-line arguments or writing your own tests in main(). Make sure to test a variety of inputs, including different input types and ones that should fail. However, your submitted code should NOT fail or error.
- For homework, this usually means re-compiling and running the provided test suite to check if you pass the unit tests for the section of code you just wrote. However, note that the test suites do not always catch all errors.
Also run Valgrind when you test to ensure that you do not have any memory bugs/issues.

Collaboration

We encourage collaboration in this course! Bugs are easy to overlook when you work alone, and collaborating brings more perspectives, more ideas, and another pair of eyes to writing code or understanding a problem. Here are some ideas to consider:

Discuss with your partner: Review the course policies and tips for working with one another.
Go to office hours: Office hours are a wonderful place to work with other students who may be working on a similar problem, and they are facilitated by the staff, who have prior experience with the material.
Post on the discussion board: The discussion board is a place for student discussion of course material in an asynchronous (and possibly anonymized) manner.

Please review the course policies for what differentiates collaboration from cheating.

Debugging Tools

`printf` Debugging

Debugging with printf can be a good starting point for better understanding the state and execution of your program. Here are some places where it might be useful:

Printing out the current state of the program (e.g., local variables or parameters) at a particular point of interest or to assess the changes over time. Note that this is entirely possible with GDB as well, though.
Printing a message to mark the execution (i.e., was this code reached?) of different parts of the program to give you a better idea of how your program is operating.

Feel free to do what works well for you, but the software we build in this class can be quite complex, and we strongly encourage you to use a dedicated debugger like GDB to find the more complex bugs. printf debugging also cannot find most of the memory bugs that Valgrind can.

GDB

We assume some familiarity with GDB from CSE 351 (whose tutorials and resources are worth reviewing), but we won't be looking at registers and assembly code. GDB (the GNU Debugger) is an immensely useful tool to help you debug your C and C++ programs:

It lets you insert breakpoints into your programs so that you can stop execution and examine the values of variables, expressions, and memory.
It supports single-stepping your program one line of source code (C/C++) at a time.
It leads to much more productive debugging than just using printf statements.

GDB is basically a necessity to get through this course, so you will only be hurting yourself by refusing to learn and use this tool!

CSE 333 GDB Tips

More complete documentation can be found in the GDB resources linked from the navigation bar.

Now that we are dealing with multifile projects, it is important to specify exactly where we want to set breakpoints:

(gdb) break <function> # break at the beginning of a function (will locate proper file since name is unique)
(gdb) break <line number> # break on a program line for current/only file
(gdb) break <filename>:<function/line number> # break within a specific file

step and next allow you to walk through your program execution. Remember that step steps into function calls and next steps over function calls.
print and x allow you to evaluate variables or expressions. Remember that print will print the value of the expression while x will use the value of the expression as an address and dereference it. The GDB reference linked from the navigation bar lists the relevant format specifiers for both in its "Display" section.
backtrace displays the program stack. This is particularly helpful for getting more information on program failures (e.g., segmentation faults).

Valgrind

Valgrind is a memory error detection tool. You should use it on every assignment in this class, as it can catch many memory errors that are difficult or impossible to detect otherwise, including:

Reading from uninitialized memory.
Reading/writing free'd memory (i.e., dangling pointers).
Reading/writing past the end of an array.
Reading/writing in inappropriate areas on the stack.

Even if you encounter an output bug or segfault, which you would normally use GDB to debug, running Valgrind may give you additional useful information!

While Valgrind is an extremely powerful tool, it's NOT a silver bullet! In particular, Valgrind is a dynamic analysis tool, meaning that it can only catch memory issues that it encountered during that particular execution of your program. This means that you will want to run Valgrind multiple times on different inputs and scenarios to ensure you get good code coverage. By design, it also chooses to ignore certain kinds of errors in order to avoid generating lots of false positives.

Sample output for illegal read/write

...
==235179== Invalid read of size 4
==235179== at 0x401183: main (valgrind-tutorial.c:16)
==235179== Address 0x520b068 is 20 bytes after a block of size 20 alloc'd
==235179== at 0x4C360A5: malloc (vg_replace_malloc.c:380)
==235179== by 0x40115E: main (valgrind-tutorial.c:8)
...

To interpret this error, note that the first line states the type of invalid access and the following line(s) give the associated stack trace (just the second line in this example). After that comes a stack trace for the allocation of the block of memory that Valgrind thinks you may be referring to.

Sample output for memory leak

...
==235179== HEAP SUMMARY:
==235179== in use at exit: 20 bytes in 1 blocks
==235179== total heap usage: 3 allocs, 2 frees, 1,064 bytes allocated
==235179==
==235179== 20 bytes in 1 blocks are definitely lost in loss record 1 of 1
==235179== at 0x4C360A5: malloc (vg_replace_malloc.c:380)
==235179== by 0x40115E: main (valgrind-tutorial.c:8)
==235179==
==235179== LEAK SUMMARY:
==235179== definitely lost: 20 bytes in 1 blocks
==235179== indirectly lost: 0 bytes in 0 blocks
==235179== possibly lost: 0 bytes in 0 blocks
==235179== still reachable: 0 bytes in 0 blocks
==235179== suppressed: 0 bytes in 0 blocks
...

There will be information on lost blocks of memory, which means you have not free'd them before the end of the program execution.

Try reading the stack trace for each block in order to see which blocks of memory the program has not free'd during its execution.
Under LEAK SUMMARY, you want to ensure that "definitely lost" and "indirectly lost" are 0 to make sure there were no memory leaks in your code!

Using getaddrinfo() later in the course (ex10, ex11, hw4) will result in a leak that is reported as "still reachable". You can disregard this leak, as it is a bug in the <netdb.h> library rather than in your code.

Viewing Binary Files

In 333, we will deal with binary files, where the stored data uses the full range of possible character values. This is in contrast to text files, where the stored data are mostly restricted to the values that correspond to printable (e.g., ASCII or Unicode) characters. Different types of binary files serve different purposes, but the one thing they have in common is that they will look like garbage when opened in a text editor because non-printable characters will be represented in ways you wouldn't expect.

`xxd`

xxd will print out the bytes of a file in hex format. You don't need to know much about this utility and it is primarily useful for Exercise 11 and Homework 3.

Here is an example usage. The first command creates a file called test.bin whose contents are the ASCII characters hello world followed by the bytes CA, FE, F0, and 0D (the \x you see means that the byte is specified in hex). The second command invokes xxd on the binary file that we just created.

[attu]$ echo -ne "hello world\xca\xfe\xf0\x0d" > test.bin
[attu]$ xxd test.bin
00000000: 6865 6c6c 6f20 776f 726c 64ca fef0 0d    hello world....

Generally, larger files will produce more lines, but still in this format, so you can follow the same interpretation tips described here.

There are three blocks/columns of information in the single line of output shown above:

The left-most block (the 00000000:) represents the byte address/index of the left-most byte of the line (in hex). This can be helpful for finding your place in a longer file.
The middle block (68 through 0d) shows the values (in hex) of the bytes themselves. Recall that one byte = two hex digits. The bytes have increasing address/index from left-to-right.
The last block is the ASCII decoding of the bytes (as a basic text editor would interpret the bytes). For instance, the first byte of the file, 0x68, is 'h'. Each of the last four bytes shows up as a ., which is how xxd chooses to show non-printable ASCII characters. This can be a little confusing, as there is also a valid '.' character (0x2e), and these non-printable characters might display differently in a text editor.

For very long files, dumping the xxd output to the console can be rather unhelpful and difficult to scroll and search through. You can use a utility called less, which acts as a pager, to allow you to scroll up and down through the output like so:

$ xxd <file name> | less