Class and Method Specifications

Contents:

Introduction
Abstract Values and Abstract State
- Mathematical Notation
- Names
Method Specifications
- Preconditions
- Postconditions
Subclasses and overridden methods

Introduction

This handout describes how to document the specifications of classes and methods, focusing on practical issues.

This document uses a Line class as an example. We do not provide fields or method bodies in our example. This document covers specifying the behavior of classes and methods (what they should do), not their implementation (what they actually do and how they do it). Abstraction Functions and Representation Invariants covers how to document a class's implementation

/**
 * This class represents the mathematical concept of a line segment. Think of
 * this as a pair (start-point, end-point), containing the starting and ending
 * points of the line.
 *
 * Abstract Invariant:
 *  A line's start-point must be different from its end-point.
 */
public class Line {

  ... // Fields not shown.

 /**
  * @spec.requires p != null && ! p.equals(start-point)
  * @spec.modifies this
  * @spec.effects Sets end-point to p
  */
  public void setEndPoint(Point p) {
    ...
  }

  ...

}

Because the concepts discussed here are interrelated, this document starts with a short list of definitions before diving into the details.

"Abstract" — Used to refer to an idea or concept, as opposed to something that exists in the code itself.
"Concrete" — The opposite of "abstract," something that is found literally written in the code. Writing "String s = "Hello";" is creating a concrete String object, which represents the abstract idea of the word "Hello".
Abstract Value — What an instance of a class is supposed to represent. For example, each instance of Line represents some line segment.
Abstract State — The information that defines the abstract value. For example, each abstract line has a start point and an end point.
Abstract Invariant — A condition that must be true over the abstract state of every instance of a class. For example, Line requires that no instance has the same start and end point. Abstract invariants are expressed in terms of the abstract state. Note that this is not the same as the Representation Invariant (RI) that describes properties of the concrete representation. An abstract invariant, if one is present, specifies constraints on abstract values only.
Method Specifications — Describe a method's behaviors in terms of abstract state. For example, Line's setEndPoint method updates the end-point, part of the pair that makes up its abstract state.

The above concepts are included in a class's external specification (in Javadoc comments). They help document for clients how to use the class.

The rest of this document is organized as follows. First, it explains how to document what a class abstractly represents using abstract state. Then, it explains how to specify method behavior, in terms of abstract state.

Abstract Values and Abstract State

The abstract value of an object is the mathematical concept that clients use to understand and reason about the object. For example, a line segment represents the set of points in the plane that lie on a particular line and in between two particular end points. In the computer, every object is a series of zeros and ones, but these represent higher-level ideas, their abstract values.

The abstract state of an object is the specific mathematical shape we use to define the abstract value for the client. In the example above, we defined the abstract state as a pair of points (start-point, end-point). This is just four numbers; however, the "meaning" of that pair (the abstract value) is really all the points in the plane that lie in between those two points, not just the two end points.

This abstract state does not necessarily imply that the concrete representation of the object is a pair of points. That is one concrete representation, but there are others, such as a start point, an angle, and a length. Abstract states typically describe the object at a less detailed level than the implementation so that clients can reason about only the concepts that are necessary to check that their own code is correct.

As another example, clients of String may only need to know that a string is a sequence of characters, not whether that sequence is implemented with an array, a linked list, a combination of the two, or some completely different way. The notion of sequence is more abstract than particular ways to represent sequences. How the String "cat" is represented concretely (within the Java code) does not affect its abstract state. Whether String internally uses an array, a List, or some clever compressed numerical encoding for the the characters 'c', 'a', and 't', the abstract state of "cat" is still a just a sequence of those letters.

Mathematical Notation

For some ADTs, the abstract values are well-described by concepts and notation that are common in mathematics and well-understood by software developers. Examples include:

a set of integers
a sequence of characters (i.e., a string)
a pair of real numbers (or a triple, or in general a tuple)

If you are specifying such a class, then you're in luck. You can use conventional notation for specifying the class's abstract values and methods. Such notation includes:

set comprehension: { x | P(x) } denotes the set of all elements x that satisfy the property P. More generally, { f(x) | P(x) } denotes the set of values of the expression f(x) for all x that satisfy the property P. For example, { x * x | x > 10 } represents the set of all numbers whose square root is greater than 10.
set union: x ∪ y denotes the union of two sets x and y. (This can also be written x + y when there's no danger of confusion with addition.)
set membership: a ∈ x or a in x tests whether a is an element of the set x.
sequence construction: [a, b, c] denotes a sequence of three elements.
sequence concatenation: x : y denotes the concatenation of two sequences x and y. (This can also be written x + y when there's no danger of confusion with addition or union.)
sequence indexing: x[i] denotes the i^th element of a sequence x.
set or sequence size: |x| denotes the number of elements in a set or sequence x.
tuple construction: <a, b, c> is a tuple of three elements. This is also written (a, b, c). Unlike sequences, tuples are fixed-length, so we don't normally think about concatenating them.

You aren't obliged to use this syntax. Some of it is more standard than the rest: set-comprehension syntax is standard in just about all of mathematics, but sequence concatenation isn't particularly standardized. You may find it clearer to write sequence concatenation as a function like concat(x, y). What really matters is clarity and lack of ambiguity, so if you have any doubt whether your reader will understand you, just define it: “...where concat(x,y) is the concatenation of two sequences x and y.”

Names

In the example above, we also implicitly defined names for the two endpoints: by defining the abstract state as (start-point, end-point), we were saying that we would refer to the first point in the pair as "start-point" and the second point in the pair as "end-point".

You can define names implicitly like this if you want, or you can write out more explicit definitions of the terminology that you will use to talk about the abstract states of your objects. The only strict requirement is to make sure that you are communicating clearly to the client. Misunderstandings between yourself and the client can easily lead to bugs.

Method Specifications

Method specifications describe the behavior of a method in terms of its preconditions and postconditions. Note that method specifications may refer to abstract states, method arguments, and global variables (also known as public static fields), but never to the concrete fields of the implementation - the client should not need knowledge of these concrete fields.

Preconditions

Preconditions are properties that must be true when the method is called. It is the responsibility of the caller to guarantee that these properties hold. If the preconditions do not hold, the method is allowed to behave in absolutely any fashion, including crashing the program, continuing with incorrect results, informing the user of the problem, or gracefully recovering from the problem. Callers should always assume that preconditions are not checked by a method. However, it is good practice — though not required — for an implementation to check its preconditions (if the check can be performed efficiently) and throw an exception if they are not satisfied.

Preconditions are indicated by the "requires" clause in a method specification. If a "requires" clause is omitted from the method specification, it is assumed that the method does not have any preconditions.

@spec.requires (default: no constraints): The preconditions that must be met by the method's caller.

The CSE 331 specification style doesn't include @param, but it's good to write it in Javadoc anyway, because that is good Java style. Javadoc's @param cannot express preconditions about global state and is sometimes inconvenient for preconditions that relate multiple formal parameters, such as "elt is a member of list". Because @spec.requires is needed whether or not @param exists, the CSE 331 specification style puts all the preconditions into @spec.requires, to simplify explanations and so there is only one place to look for preconditions.

Postconditions

Postconditions are properties that a method guarantees will hold when the method exits. However, if the precondition did not hold when the method was called, then nothing else is relevant, and the method may behave in any fashion whatsoever. In particular, if the precondition does not hold upon method entry, then the postcondition need not hold on method exit.

A postcondition can be written as a single complex logical formula, but it is convenient to separate it into logically distinct parts. CSE 331 uses "return", "effects", "throws", and "modifies". (In the descriptions below, "default" indicates what is assumed if that clause is omitted from the method specification.)

@returns (default: no constraint on what is returned): The value returned by the method, if any.
@throws (default: none, which means that no exceptions are ever thrown): The exceptions that may be raised, and under which conditions. The specification should not make guarantees about the behavior when the preconditions are not satisfied.
@spec.modifies (default: nothing, which means that there are no side effects): Objects whose abstract value may be modified by the procedure. They are not guaranteed to be modified, unless otherwise indicated by the effects clause. Usually, programmers are more interested in objects that are not listed in the modifies clause, since those are guaranteed not to be changed by the method.
@spec.effects (default: true, which means "can have any effect" on the references listed under modifies): The side effects of the method, such as changes to the state of the current object, to parameters, or to objects held in global variables. If an object is listed in the modifies clause but not in the effects clause, then it may take on any value allowed by the abstract invariants of this class of objects. The difference between the modifies and effects causes is that modifies lists everything that may change and effects indicates what changes occur.

Subclasses and overridden methods

A subclass often has a different (stronger) specification than its superclass, and often has a larger abstract state than its superclass. When the specification and abstract state are identical to those of the parent (for instance, an implementation of an abstract class), then there is no need to repeat them in the subclass. However, it is helpful to include a brief note indicating that the superclass documentation should be used instead. That note helps readers to distinguish whether the specification is the same, or the author simply didn't document the class.

When the specifications differ, then you have two options. The first option is to repeat, in the subclass, the full superclass documentation. The advantage is that everything is in one place, which may improve understanding. The second option is to augment the existing specification — for example, to add new abstract invariants. Whichever you do, make sure that you clearly indicate your approach, and that you are consistent throughout the codebase.

Similar rules hold for a method that overrides another method. It is acceptable to omit the Javadoc comment if the specification is identical. (The generated HTML will use the overridden method's Javadoc documentation, but a normal Java comment is a good hint to someone who is reading the source code.) Otherwise, it is usually better to give the complete specification. If you merely augment the overridden method's specification, be sure to refer to it in the documentation.