CSE 143 C++ Programming Style Guide


Hal Perkins
CSE 143

One of the major goals of this course is that you learn to write programs that not only work correctly, but that are readable and understandable. These guidelines should help you towards that goal. We hope they will give you a good basis for developing a style of your own as you become a more experienced programmer. In this course we ask that you follow the rules given here, and you must do so to receive full credit on assignments.

There are often many ways to perform a particular task in C++, many of which are inherited from C and retained for compatibility. In these notes, standard C++ constructs are used when there is a choice.

Contents

Why?
General Guidelines
File Headings
Names
Data Definitions
Function Specifications
Statement Comments
Coding Conventions
Acknowledgments

Why?

So why all the fuss about programming style, comments, and variable names? Is it just one of those odd things that professors require in courses or does it really matter?

A computer program serves two distinct purposes. The most obvious is to instruct the computer to perform some useful task -- format and print a letter, simulate a factory, play a cool game. Less obvious, but far more important, is that a computer program is something that will be read by others, and those readers must be able to understand it. New computers, operating systems, and software upgrades appear constantly; the needs of users change over time; new laws are passed; companies merge and split up. Existing software needs to be modified to adapt to these changes.

A working program is the result of many design decisions. These decisions -- why this variable was introduced, what that function does -- are not reflected in the final code, which consists of very low-level, detailed declarations and statements. But the higher-level design must be understood to successfully modify the program. If these decisions are not recorded as comments in the code, the new programmer must attempt to reconstruct them by tedious analysis. This technological archeology is painful, aggravating, annoying, and error-prone -- and all too common.

"But this is only a quick hack. I'll never use it again. I don't need to waste my time on comments." Wrong. Every useful program lives forever, often in spite of the author's original intentions. Even if your program is never seen by anyone else, you may be the "new" programmer when it needs to be changed. A few months from now you won't remember the details that went into writing it. If it was poorly written or lacks good comments you will have a miserable time with it.

"It's a waste of time to put the comments in now. I'm going to change the program when I debug it and I don't want to have to change the comments too. I'll add the comments once the program is working." Wrong. In the real world, even programmers with the best of intentions never go back and fill in the comments. There's always more code to write, other interesting projects to do. Comments must be included from the start. It takes very little extra effort to add them when writing the code and the resulting comments are better than anything added after the fact. The information that belongs in a comment is freshest when a variable or function is first invented. Trying to reconstruct it days or weeks later is hard and the results are not as good.

Writing good comments as you write a program will help you clarify your ideas and create a working program more quickly. If you can write down clearly how your program works, you are more likely to have a good understanding of the problem and your code is more likely to be correct. Time spent on careful thinking and writing is more than repaid in time saved during testing and debugging.

Good comments can also help you find or avoid errors (bugs). A very common source of errors is inconsistent use of variables, including function parameters. If the precise definition of each variable is written in a comment, you can double check that each reference to a variable matches its definition. If a precise definition isn't included in the code, it's all too easy to use the variable in one way today and in a slightly different way tomorrow, which causes subtle, hard-to-find errors.

General Guidelines

The most important rule is that every function, data structure, and significant variable must be given a complete definition in a comment that accompanies its declaration. This comment should contain everything needed to understand the item and how to use it and no more (don't include unnecessary implementation details that should be private). Comments should be complete. But they may refer to a published handout, article, book, manual, or web site if the full definition is available elsewhere (and will be available over the lifetime of the program).

It should never be necessary to look at code that uses or implements a function, variable, data structure, or object to understand how to use it (as opposed to understanding how it's implemented). If you find you have to read client code to understand a function or variable, the comments are inadequate.

The complete definition should appear in only one place. If the definition is used in several files, it belongs in a header file. Duplicating information in more than one place creates problems as the program changes. It is all too easy to fix one comment and forget to fix a copy elsewhere. Exception: the implementation of a function should include a complete specification comment even if that specification also appears in a header file.
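As a rough illustration of this rule and its exception, the sketch below shows a specification in a header and the same specification repeated with the implementation. The file names temps.h and temps.cpp and the function average are hypothetical; the two parts are shown in one block only so that the example compiles on its own.

```cpp
// ---- temps.h (hypothetical header): the one place clients look ----

// = average of the values in temps[0..nTemps-1], or 0.0 if nTemps is 0
double average(const double temps[], int nTemps);

// ---- temps.cpp (hypothetical implementation file) ----

// = average of the values in temps[0..nTemps-1], or 0.0 if nTemps is 0
double average(const double temps[], int nTemps) {
    if (nTemps == 0)
        return 0.0;

    double sum = 0.0;    // total of the values examined so far
    for (int k = 0; k < nTemps; k++)
        sum += temps[k];
    return sum / nTemps;
}
```

A client needs only the header comment and prototype to call average correctly; nothing in the body has to be read.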

A prototype for a function that is defined and used in only one file does not need a specification comment.

Additional comments are needed to break up long sequences of statements. Indenting, blank lines, and other whitespace should be used to clarify the structure of the code and reduce clutter.

If you are modifying code originally written by someone else, match their style. It will help the next person who reads the program if a consistent style is used throughout.

The comments must agree with the program; false comments are worse than none at all.

File Headings

Every file in your code should begin with a short comment giving the name of the file, the author(s), and a very high-level description of its contents.

// complex.h -- interface to Complex number type and operations. 
// Al Gaulle, 6/8/60 

In this course, every file in each programming assignment should include your name and other required information, including your id number, section identifier, and instructor.

// main.c -- CSE143 assignment 3, Summer 2001. 
// A. Hacker, id #9936524, section AB, A. Turing  

Names

Naming is one of the most important parts of programming. Good names make a program more readable and can reduce the amount of other documentation needed.

Functions and nontrivial variables must be given meaningful names. An appropriate name for a variable or for a function that returns a value is a noun or noun phrase describing the contents of the variable or value returned (e.g., length, total_sales, currentInventory). For variables and functions with logical (bool) values, good names are ones that suggest the meaning of a true result: isEmpty, can_proceed. A good name for a function that is only executed for its effect (has a return type of void) is a verb or verb phrase describing the action performed (print_report, ringBell).
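A short sketch of these three naming patterns follows. The functions themselves are invented for illustration; only the style of the names is the point.

```cpp
#include <iostream>

// = sum of the amounts in sales[0..nSales-1]
double totalSales(const double sales[], int nSales) {
    double total = 0.0;
    for (int k = 0; k < nSales; k++)
        total += sales[k];
    return total;
}

// = "the table holds no items"
bool isEmpty(int nItems) {
    return nItems == 0;
}

// Ring the terminal bell once.
void ringBell() {
    std::cout << '\a' << std::flush;
}
```

With names like these, a call such as if (!isEmpty(nItems)) reads almost like a sentence.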

Capitalization is significant in C++. ThisId, thisid, and thisId are three different identifiers. Usually, variable and function names should begin with lower-case letters. Type, struct, and class names should start with an upper-case letter. Used consistently, this makes code easier to read and reduces clashes between type and variable names.
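One way this convention might look in practice is sketched below. The type Point and the function translate are hypothetical; note that the parameter point and the type Point can coexist without confusion.

```cpp
struct Point {      // A position on the screen, in pixels:
    int x;          //   distance from the left edge
    int y;          //   distance from the top edge
};

// = point moved dx pixels to the right and dy pixels down
Point translate(Point point, int dx, int dy) {
    point.x += dx;
    point.y += dy;
    return point;
}
```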

Names should be neither too long nor too short. Items that are significant to the problem being solved should have descriptive names, as should items defined in libraries for use by others. Avoid cryptic abbreviations. Use sales_tax or salesTax, not stx.

For variables and parameters that are only used in a small function or region of the program, a short name is often better than a long one.

/* = larger value of x and y */ 
int max (int x, int y) { 
    return (x > y) ? x : y; 
}

/* print a line containing n *'s */ 
void printStars(int n) { 
    for (int k = 1; k <= n; k++) 
        cout << "*"; 
    cout << endl; 
} 

A name like theLoopCounter instead of k, or firstNumber and secondNumber for x and y would only create clutter. But avoid large collections of cryptic names. A function with parameters named xx, xx1, ff1, ff, and fff will be hard to understand.

A variable used as a "flag" should not be named flag but should be named for what the flag represents, like noMorePizza. Avoid generic names like count, counter, and value. Instead, describe the items being counted or the value stored in the variable.
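A small sketch of a flag and a counter named for what they represent is given here. The function slicesLeft and its parameters are made up for this example, and it assumes nSlices and nGuests are not negative.

```cpp
// = number of slices left after each of nGuests guests takes one slice,
//   or 0 if the pizza runs out first
int slicesLeft(int nSlices, int nGuests) {
    bool noMorePizza = (nSlices == 0);   // = "every slice has been eaten"
    int nServed = 0;                     // number of guests served so far

    while (!noMorePizza && nServed < nGuests) {
        nSlices--;
        nServed++;
        noMorePizza = (nSlices == 0);
    }
    return nSlices;
}
```

Compare the loop condition with while (!flag && count < n), which would force the reader to hunt for the meaning of both names.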

Use consistent naming conventions. The same concept should usually have the same name when it appears in more than one place.

Avoid vague, misleading, silly, or obscene names (it's already been done). The amount of pizza in the fridge should not be named stuff or cat or fred (even if it's Fred's pizza).

Data Definitions

Every significant variable and data structure needs a precise and complete definition. This should provide all of the additional information that is needed to understand the variable in addition to its name and type. The most useful information is often the invariant properties of the data: facts that are always true except, perhaps, momentarily when several related variables are being updated. For arithmetic variables this might be a formula showing how the variables are related ( /* 0 <= currentItem <= maxItems */ ). For a variable used as a logical value, it is often easiest to define it with a phrase that gives its meaning when true.

bool done; //="user has selected Quit from file menu" 

Definitions must be precise. Comments like "flag for loop" or "index into array b" say nothing. Instead, describe the condition the flag represents or, if i is used as a subscript for array b, explain what b[i] is. A definition like "error code" is not complete. If particular values of a variable have specific meanings, list those values and their significance.
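For instance, a result code could be documented by listing each value and its meaning, as in the following sketch. The constant names, the function addTemp, and the range check are assumptions made for this example only.

```cpp
const int maxTemps = 150;   // maximum # of temperature readings

                            // Result codes returned by addTemp:
const int addOk = 0;        //   reading was stored
const int tableFull = 1;    //   table already held maxTemps readings
const int outOfRange = 2;   //   reading was below absolute zero (-273.15 C)

// Append reading (in degrees C) to temps[0..nTemps-1] and add 1 to nTemps
// if possible. Return addOk, tableFull, or outOfRange as defined above;
// temps and nTemps are unchanged unless the result is addOk.
int addTemp(double temps[], int &nTemps, double reading) {
    if (reading < -273.15)
        return outOfRange;
    if (nTemps >= maxTemps)
        return tableFull;

    temps[nTemps] = reading;
    nTemps++;
    return addOk;
}
```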

Variables used only as loop indices or subscripts do not require a comment if none would be helpful.

Related variables should be declared and described together. For example, the definition of a table should describe not only the array that holds the data but also the integer variable containing the number of items currently in the table.

const int maxTemps=150;      // maximum # of temperature readings 

                             // Table of temperature readings: 
double temps[maxTemps];      // Temperature values are stored 
int nTemps;                  // in temps[0..nTemps-1]. 

These comments should appear to the right of the variable declarations. Use tabs to line up identifier names and comments. Don't run everything together; this is much harder to read.

double temps[maxTemps]; // Temperature values are 
int nTemps; // stored in temps[0..nTemps-1]. 

Related variables should usually be packaged in a single structure or object type. Comments describing the fields of the type belong with the type definition; comments beside variables that have that type should describe the contents of that particular variable, not the type fields.

struct Temps {              // Table of temperature readings: 
    double temps[maxTemps]; //   Temperature values are stored
    int nTemps;             //   in temps[0..nTemps-1]. 
}; 

Temps SEA;                  // temperatures in Seattle 
Temps HNL;                  // temperatures in Honolulu 

Function Specifications

Every function must be preceded by a comment giving its specification. This specification and the prototype of the function, which gives the number and types of the parameters and type of the result, should provide all of the information needed to use the function and no more. It should describe what the function does, not how it does it. One should never have to look at the body of a function to understand how to use it. The specification comment is the place where the parameters of the function are described. All of this can usually be worked into a sentence or two.

// Store in common the temperature found most frequently in table t 
// and store in num the number of occurrences of common in t. 
void find_common (Temps &t, double &common, int &num); 

It is, unfortunately, more typical to find a comment like this, if any comment is provided at all.

// Find most frequent temperature 
void find_common (Temps &t, double &ct, int &n);

What is the purpose of parameters t, ct, and n? Does the function change them? If so, how? The comment doesn't say.
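By contrast, the complete specification is enough to write or check an implementation. The sketch below is one possible version, not the only one; the const on t and the two sentences about ties and empty tables are additions made for this example.

```cpp
const int maxTemps = 150;       // maximum # of temperature readings

struct Temps {                  // Table of temperature readings:
    double temps[maxTemps];     //   Temperature values are stored
    int nTemps;                 //   in temps[0..nTemps-1].
};

// Store in common the temperature found most frequently in table t
// and store in num the number of occurrences of common in t.
// If several temperatures tie, common is the one that appears first.
// If t is empty, set num to 0 and leave common unchanged.
void find_common (const Temps &t, double &common, int &num) {
    num = 0;
    for (int i = 0; i < t.nTemps; i++) {
        // Set occurrences to the number of times t.temps[i] appears in t.
        int occurrences = 0;
        for (int j = 0; j < t.nTemps; j++) {
            if (t.temps[j] == t.temps[i])
                occurrences++;
        }

        if (occurrences > num) {
            common = t.temps[i];
            num = occurrences;
        }
    }
}
```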

The heading must be complete. But be concise. Don't write an essay if a short sentence will do. Use the active voice.* Omit needless words.** Don't write "Function to crash the car..." or "Function crumple crashes the car ..." or even "Crashes the car ...". Just say "Crash the car." Don't write "returns xyzzy". Use "return xyzzy" instead.

For a value-returning function it is often easiest to simply describe the value returned.

// = distance between points (x1,y1) and (x2,y2) 
double dist (int x1, int y1, int x2, int y2);

[*W. Strunk Jr. and E. B. White, The Elements of Style, 3rd ed., rule 14, p. 18. **Strunk and White, rule 17, p. 23. This book is required reading for all writers, including programmers.]

Statement Comments

Comments should be included in long sequences of statements to describe logical units of processing. A "statement comment" should be a higher-level description of the operation implemented by the group of statements that follows it.

// Ensure x >= y, exchanging the values of x and y if needed. 
   if (x < y ) { 
       tmp=x;
       x=y;
       y=tmp;
   } 

The comment should explain what the group of statements does, not how it does it. Comment groups of statements, not individual statements whose meaning is clear. Put a blank line before such comments to visually separate these paragraph-like chunks of code. These comments can be a great help to someone trying to understand the program since they document its high-level ("top-down") structure, which is not otherwise visible in the text. They also help a reader scan the program quickly to find the section of current interest, much like the section and paragraph headings in a book or article.
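A sketch of a function body divided into commented "paragraphs" is shown below. The function averageInRange is hypothetical and exists only to show the layout.

```cpp
// = average of the readings in temps[0..nTemps-1] that lie in [low, high],
//   or 0.0 if no reading lies in that range
double averageInRange(const double temps[], int nTemps,
                      double low, double high) {
    // Set sum to the total of the in-range readings and
    // nInRange to the number of such readings.
    double sum = 0.0;
    int nInRange = 0;
    for (int k = 0; k < nTemps; k++) {
        if (low <= temps[k] && temps[k] <= high) {
            sum += temps[k];
            nInRange++;
        }
    }

    // Return the average, or 0.0 if there was nothing to average.
    if (nInRange == 0)
        return 0.0;
    return sum / nInRange;
}
```

Reading only the two statement comments gives the outline of the function without looking at a single statement.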

Statement comments must be complete. The comment

// Test for valid input 

is not adequate. What happens if the input is valid? What if it isn't? The comment should include this information.

// Make a rude noise and terminate execution if the input is not valid. 

Obscure or unusual code should be avoided but when necessary a comment should be used to clarify.

// round cents to nearest dollar 
   cents= 100 * ((cents+50) / 100); 
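
The trick relies on integer division, which is why the comment is needed. Wrapped in a function, it might look like this sketch; the name roundToDollar and the restriction to non-negative amounts are assumptions of the example.

```cpp
// = cents rounded to the nearest whole dollar, expressed in cents
//   (amounts ending in 50 round up); requires cents >= 0
int roundToDollar(int cents) {
    // Integer division discards the remainder, so adding 50 first
    // pushes amounts of 50 cents or more past the next multiple of 100.
    return 100 * ((cents + 50) / 100);
}
```

For example, 149 becomes 100 * (199 / 100) = 100, while 150 becomes 100 * (200 / 100) = 200.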

If a complex algorithm or data structure is being implemented, a block of comments describing it or a reference to other sources of information (books, articles, etc.) should be included above the group of data structure and function definitions.
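As a hedged illustration, a block comment for a binary search might read as follows. The function name lowerBound and the wording of the invariant are choices made for this sketch.

```cpp
// Binary search.
// The table b[0..n-1] must be sorted in nondecreasing order. The loop
// maintains the invariant
//     every element of b[0..low-1] is < x, and
//     every element of b[high..n-1] is >= x,
// and halves the unexamined section b[low..high-1] on each iteration,
// so at most about log2(n) + 1 elements are examined.

// = smallest index i such that b[i] >= x, or n if every element of b is < x
int lowerBound(const int b[], int n, int x) {
    int low = 0;
    int high = n;
    while (low < high) {
        int mid = low + (high - low) / 2;
        if (b[mid] < x)
            low = mid + 1;
        else
            high = mid;
    }
    return low;
}
```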

Do not comment that which is already clear. Don't write

// print the gross sales amount 
   cout << gross_sales; 

or

// increment c 
   c++; 

Assume that the reader knows C++ at least as well as you do. Comments should not be used to explain how the programming language works.

Statement comments should be placed above the code they document, not out to the side. Such marginal comments usually wind up paraphrasing the code without adding useful information.

cin >> k;                         // get the next input number
if (k < 0)                        // check if it's negative
    cout << "try again" << endl;  // print error message if it is

Textbook authors sometimes do this to explain examples. In real programs it is useless clutter. Don't do it.

Exception: In long files, comments in the right margin can serve as useful "tab" markers to help the reader skim through the code. An example is a switch statement that extends over several pages.

switch (fruit) { 
    case banana: 
        ...                        // banana 
        break; 
    case apple: 
        ...                        // apple 
        break; 
    case kumquat: 
        ...                        // kumquat 
        break; 
    default: ...                   // unknown 
}

Avoid redundant comments. Say things once in the proper place rather than repeatedly throughout the program. It is very possible to obscure a program by over-commenting. More is not necessarily better. Your purpose in writing is to guide your readers and anticipate questions they might have. Include enough to do this and no more.

Coding Conventions

This section contains several low-level details that need to be attended to. These rules are not necessarily better than any others, but they will lead to readable code, so we ask you to use them in your programs.

Indenting

Programs should be indented to make them easier to understand. The bodies of functions, loops, and conditional statements should be indented to make the logical structure clear.

If one were to pick the single piece of syntactic trivia responsible for the most pointless debates among C, C++, and Java programmers, it would probably be where to put the left curly brace at the beginning of a compound statement. One possibility is at the end of the previous line.

if (x < y) { 
    x = y;
    y = 0; 
} else {
    x = 0; 
    y = y/2; 
}

Another is to put it on a line by itself.

if (x < y)
{ 
    x = y; 
    y = 0;
} 
else 
{
    x = 0;
    y = y/2;
} 

Which is best is mostly a matter of religious preference. The former has the advantage of conserving vertical space, which helps fit more code onto a single screen or page. The latter has a more pleasing symmetry. Some style guides suggest putting the curly brace that begins a function body on a line by itself and putting other left braces at the end of a line. Pick one style and use it consistently. If you're working on code written by someone else, match their style.
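The mixed convention mentioned above (function brace on a line by itself, other left braces at the end of a line) might look like the following sketch. The function name largerOf is made up for this example.

```cpp
// = larger value of x and y
int largerOf (int x, int y)
{
    if (x > y) {
        return x;
    } else {
        return y;
    }
}
```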

Compound Names

Another source of heated debate is how to spell a name made up of more than one word. Should it be compoundName or compound_name? Traditional typography would suggest using the underscore, because upper-case letters were not designed to appear immediately to the right of a lower-case letter. But embedded upper-case letters are widely used in computing, particularly in the Java community, and have rubbed off on the mass culture, resulting in the biZarRE CAPitAlIZaTiOn that became popular in advertising and display text a few years ago. Take your choice (of the underscore vs embedded caps; not the wEIrD stuff).

Symbolic Constants

Important constants should be given symbolic names and these names should be used throughout the code instead of the numeric values. This is particularly true for physical constants and parameters related to the problem being solved.

const double pi = 3.1415926535;       // Physical constants 
const double e = 2.718281828;

const int max_grades = 200;           // Maximum # grades in input

Using the symbolic name reduces the possibility of typographical errors. It also makes it much easier to change values when needed. If the constant max_grades is used throughout the program, it is easy to adjust the maximum by changing the number used to initialize it. But if 200 is used directly, it cannot be changed without scanning the code for every occurrence of 200, deciding if that occurrence refers to the maximum number of grades, and changing it if it does. It is very easy to miss a copy, change one that should be left alone, or miss a related number.
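A small sketch of code written against the symbolic name is given below. The function loadGrades is hypothetical; the point is that the number 200 appears exactly once.

```cpp
const int max_grades = 200;           // Maximum # grades in input

// Copy values from input[0..nInput-1] into grades[0..max_grades-1],
// stopping after max_grades values, and return the number copied.
int loadGrades(const int input[], int nInput, int grades[]) {
    int nGrades = 0;                  // grades[0..nGrades-1] are filled in
    while (nGrades < nInput && nGrades < max_grades) {
        grades[nGrades] = input[nGrades];
        nGrades++;
    }
    return nGrades;
}
```

Raising the limit to 500 would mean editing only the initializer of max_grades.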

Traditional C usage is to capitalize the names of symbolic constants: PI, E, MAX_GRADES. If you are working on a program or using libraries that follow this convention, do the same.

Block Comments

Programmers have strong preferences about how to write comment blocks, like the heading comment that should appear at the beginning of each file. Possibilities include:

// stuff.h interface to stuff 
// hp, 9/99 

//-------------------------------- 
// stuff.h interface to stuff 
// hp, 9/99 
//-------------------------------- 

/*********************************/ 
/* stuff.h interface to stuff    */ 
/* hp, 9/99                      */ 
/*********************************/ 

/* 
 * stuff.h interface to stuff 
 * hp, 9/99 
 */ 

Pick a style that looks good to you and use it.

Acknowledgments

The ideas in this missive originated in the structured-programming movement of the 1970's, and are every bit as applicable today. Many specific examples were adapted from introductory courses at Cornell University, based on ideas originating in the work of Richard Conway and David Gries (see their 1973 text An Introduction to Programming, using PL/C). Other good ideas and examples were taken from the Tcl/Tk Engineering Manual by John Ousterhout, which is an industrial-strength style guide for a large collection of C code.