CSE 374, Lecture 15: C Preprocessor

The preprocessor so far

So far, we've talked about the preprocessor in terms of two uses:

However, the preprocessor is much more sophisticated, and we'll be learning more about it today.

Compilation

We looked at a diagram of the C compilation process: http://faculty.cs.niu.edu/~mcmahon/CS241/Images/compile.png

(note that this diagram is for C++ but applies to C as well).

gcc normally does all of these steps automatically, but if you'd like to see them, you can stop gcc at different points in the process to see the output:

    # Stop after the preprocessor and store the preprocessed C file in file.pp
    $ gcc -E file.c > file.pp

    # Stop after the compiler and store the assembly code in file.s
    $ gcc -S file.c

    # Stop after the assembler and store the machine code in file.o
    $ gcc -c file.c

The preprocessor's role

The preprocessor can do several things:

1) #include contents of header files.
2) #define constants and parameterized macros.
3) Conditional compilation.

Headers and the preprocessor

Header files are a specific type of file in C that lets us define interfaces between files. They consist of function, struct, and constant declarations that might be useful in more than one program, and we use #include to pull them into each program that wants to use them.

Earlier in the quarter, we learned that there are two ways to include header files:

    #include <header.h>
    #include "header.h"

The first version tells the preprocessor that the header is a system header, such as one from the standard library, so it looks for a file by that name in the system include directories. The "" version tells the preprocessor that the header is local, so it looks first in the directory of the file doing the including (gcc falls back to the system directories if the file is not found there).

Some rules for header files that you include in your programs:

As an exercise we transformed our linkedlist.c program from last lecture into three files:

By separating out the different pieces of the linked list program, we allow the linked list to be reused in multiple client programs (i.e., we could add linkedlistclient2.c to do something else with a linked list). Notice also how linkedlist.h is included in both linkedlist.c and linkedlistclient.c - we can do this because it is a header file and has no definitions that would be duplicated!

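To compile the program, we pass only the two .c files to gcc; the header does not go on the command line because #include already pulls it into each .c file:

    $ gcc -o ll linkedlist.c linkedlistclient.c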

In addition, notice that we added a few more preprocessor directives in our linkedlist.h file:

    #ifndef LL_H
    #define LL_H
    ...
    #endif

Stay tuned for why we do this...

Macros

Preprocessor "macros" are basically just token-based textual replacement. They can be used exactly as if they were find-and-replaced by their values.

    #define PI 3.14
    #define MAX_LENGTH 100
    #define MY_STR "foo"

Macros can also be "parameterized". This means that we can essentially turn them into tiny functions! Take for instance this example that tries to take a value x and double it:

    #define TWICE_AWFUL(x) x*2

We call this "TWICE_AWFUL" - why? It turns out macros are extremely easy to misuse and abuse. This macro looks like it would double any number, but does it do exactly what we expect?

    int x = 1;
    int y = 2;
    int z = TWICE_AWFUL(x);  // z stores 2
    int w = TWICE_AWFUL(x+y);  // w stores 5, not 6

In this case, we forgot that macros do exact textual replacement - TWICE_AWFUL(x+y) is expanded directly to "x+y*2" and not "(x+y)*2" like you probably expected, and since multiplication binds tighter than addition, the result is 1+(2*2) = 5. A better form of the macro would be:

    #define TWICE_OK(x) ((x)*2)
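
To see why TWICE_OK uses both sets of parentheses, here is a minimal sketch that puts the macros side by side. TWICE_MEH and show_twice are hypothetical names added only for this illustration; TWICE_MEH wraps the argument but not the whole result.

```c
#include <stdio.h>

#define TWICE_AWFUL(x) x*2      /* no parentheses at all */
#define TWICE_MEH(x) (x)*2      /* hypothetical: argument wrapped, result not */
#define TWICE_OK(x) ((x)*2)     /* argument and result both wrapped */

/* Hypothetical helper that prints what each expansion evaluates to. */
void show_twice(void) {
    int x = 1;
    int y = 2;

    printf("%d\n", TWICE_AWFUL(x+y));    /* expands to x+y*2, prints 5 */
    printf("%d\n", TWICE_OK(x+y));       /* expands to ((x+y)*2), prints 6 */

    /* The outer parentheses matter when the macro sits inside a larger expression. */
    printf("%d\n", 100 / TWICE_MEH(5));  /* expands to 100 / (5)*2, prints 40 */
    printf("%d\n", 100 / TWICE_OK(5));   /* expands to 100 / ((5)*2), prints 10 */

    /* The same macro also accepts a double argument. */
    printf("%.1f\n", TWICE_OK(1.5));     /* prints 3.0 */
}
```

Calling show_twice() from main prints 5, 6, 40, 10, and 3.0, one per line.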

Even better might be an actual function:

    double twice(double x) { return x+x; } // best (editorial opinion)

Moral of the story: macros can be powerful, but they can also be easy to misuse. When should you use them, then? Sometimes the C type system requires complicated logic, and macros can get around that; the TWICE_OK macro works on both doubles and integers, whereas the actual twice() function only works on doubles. Similarly, a useful macro to simplify syntax and types might be this one:

    #define NEW_T(t, howmany) ((t*)malloc((howmany)*sizeof(t)))
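
As a rough sketch of how a macro like NEW_T might be used, the helper below allocates an array through it. The function name make_squares is hypothetical, and the macro is written here with balanced parentheses so that the block compiles on its own.

```c
#include <stdlib.h>

#define NEW_T(t, howmany) ((t*)malloc((howmany)*sizeof(t)))

/* Hypothetical helper: returns a heap array holding 0*0, 1*1, ..., (n-1)*(n-1),
 * or NULL if the allocation fails. The caller is responsible for free(). */
int *make_squares(size_t n) {
    /* Expands to ((int*)malloc((n)*sizeof(int))). */
    int *squares = NEW_T(int, n);
    if (squares == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        squares[i] = (int)(i * i);
    }
    return squares;
}
```

A function could not take the type name int as an argument, which is the part of this that needs a macro.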

Conditional compilation

Finally, the preprocessor allows us to include/exclude specific parts of a file. We might do this for the following purposes:

To mark a region for conditional compilation, you can use the #ifdef, #ifndef, or #if preprocessor directives:

    #ifdef FOO
    // This code is only compiled if FOO is defined
    #endif

    #ifndef FOO
    // This code is only compiled if FOO is NOT defined
    #endif

    #if FOO > 2
    // This code is only compiled if FOO is greater than 2
    #endif

Example: debug printing

    #ifdef DEBUG // use DBG_PRINT for debug-printing
    #define DBG_PRINT(x) printf("%s",x)
    #else
    #define DBG_PRINT(x) // replace with nothing
    #endif

    DBG_PRINT("hello world!\n");

You can define preprocessor flags either in the source code (with the #define directive) or on the command line when you compile (with the -D option):

    $ gcc -D DEBUG foo.c
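
To make the effect of the flag concrete, here is a small sketch that uses the DBG_PRINT macro inside an ordinary function. The function sum_to is a hypothetical example, not part of the lecture code.

```c
#include <stdio.h>

#ifdef DEBUG
#define DBG_PRINT(x) printf("%s", x)
#else
#define DBG_PRINT(x) /* expands to nothing */
#endif

/* Hypothetical example function: returns 1 + 2 + ... + n. */
int sum_to(int n) {
    int total = 0;
    for (int i = 1; i <= n; i++) {
        /* Without DEBUG this line is left as an empty statement. */
        DBG_PRINT("adding one term\n");
        total += i;
    }
    return total;
}
```

The return value is the same either way. If the file is compiled with -D DEBUG, each loop iteration also prints a line; without the flag, the printf call never appears in the preprocessed code at all.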

So now we know what we were doing with the #ifndef in our header file. This pattern is called an include guard. If the same header ends up being included more than once while compiling a single .c file (for example, directly and again through another header), the first #include finds LL_H undefined, defines it, and includes the rest of the file - but every later #include finds LL_H already defined and skips everything up to the #endif, so the contents appear only once! It is good practice to always do this: it keeps your preprocessed files smaller, and it is necessary if there is any risk of circular includes among your header files.
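
The sketch below imitates what the preprocessor sees when a guarded header is included twice in one .c file. The header name point.h, the guard POINT_H, struct point, and point_sum are all hypothetical; the header text is pasted inline twice so the example fits in a single file.

```c
/* Stand-in for the first #include "point.h": POINT_H is not yet defined,
 * so the guard defines it and the struct definition is kept. */
#ifndef POINT_H
#define POINT_H
struct point {
    int x;
    int y;
};
#endif

/* Stand-in for a second #include "point.h": POINT_H is already defined,
 * so everything up to the #endif is skipped. */
#ifndef POINT_H
#define POINT_H
struct point {
    int x;
    int y;
};
#endif

/* Hypothetical function using the struct, which was defined exactly once. */
int point_sum(struct point p) {
    return p.x + p.y;
}
```

If the guard lines were removed, the file would contain two definitions of struct point, which compilers targeting C standards before C23 generally reject as a redefinition.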