CSE 331: Class Specifications

Contents:

Introduction
Abstract Values
Method Specifications
Subclasses and overridden methods

Introduction

This article describes how to document the specifications of classes.

As with method specifications, the purpose of a class specification is to document the intended behavior of the class (what its methods should do) rather than its implementation (how they are implemented). Documenting the implementation of the class is also important but is discussed in a separate (forthcoming) document.

Motivation

In writing a specification for a class, our goal is to describe its intended behavior without any reference to the actual implementation. This last part is important for several reasons.

First, just as with methods, we need to know what a class should do in order to understand whether our implementation is actually correct. If our specification just describes the implementation, then any attempt to prove that our implementation is correct would be circular reasoning that wouldn't really prove anything. As an extreme example, if our specification says "this class should do whatever the code below does", then we could write anything at all and it would be considered "correct".

Second, there may not actually be an implementation yet. If we are working on a project with many people, we usually want to agree on the specifications for the important classes before we go off separately to implement and test our classes. Doing so makes it more likely that our implementations will work together properly when they are completed.

Third, suppose that we want to change the implementation (say, to write a faster version). Just as when implementing the class initially, we need to have a specification in order to know whether our new version will be correct. A specification that refers to the old implementation may no longer make sense for our new version, leaving us without any way to judge correctness.

Overview

In order to talk about what the methods of a class should do without talking in terms of its actual implementation, we will instead talk about a hypothetical implementation. We will refer to instances of the hypothetical implementation as abstract values. The state of those objects will be described in terms of fields, but since they are hypothetical fields of the specification rather than the actual fields, we will refer to them as specification fields or spec fields.

As an example, consider a class that represents a line segment. There are multiple ways that we could represent a line segment in a class. One way would be to record the two end points as fields. However, in 2D, we could also represent it using one point, an angle, and a length, where the angle and length tell us how to get to the second point. Depending on the operations we want to support, either one of these representations could be preferable to the other.

When specifying the class, we would pick one of these representations based on which makes the specification simpler and clearer. When we actually implement the class, we may care more about other attributes, like efficiency, rather than simplicity, which could make another representation preferable.
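As a rough sketch of this idea, the two representations might look as follows. The names Point, Segment, EndpointSegment, and PolarSegment are hypothetical and introduced only for illustration; they are not part of the course material. Both classes present the same abstract state (a start point and an end point), so one specification could describe either.

```java
/** A hypothetical immutable 2D point, used only for this sketch. */
final class Point {
  final double x, y;
  Point(double x, double y) { this.x = x; this.y = y; }
}

/** The abstract view that both sketches below provide: two end points. */
interface Segment {
  Point start();
  Point end();
}

/** Representation 1: store both end points directly. */
final class EndpointSegment implements Segment {
  private final Point start, end;
  EndpointSegment(Point start, Point end) { this.start = start; this.end = end; }
  public Point start() { return start; }
  public Point end() { return end; }
}

/** Representation 2: store one point, an angle (in radians), and a length. */
final class PolarSegment implements Segment {
  private final Point start;
  private final double angle, length;
  PolarSegment(Point start, double angle, double length) {
    this.start = start; this.angle = angle; this.length = length;
  }
  public Point start() { return start; }
  // The second end point is computed from the stored angle and length.
  public Point end() {
    return new Point(start.x + length * Math.cos(angle),
                     start.y + length * Math.sin(angle));
  }
}
```

A client that only calls start() and end() cannot tell which representation is in use, apart from floating-point rounding in the second one.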

Abstract values are also typically defined in less detail than is necessary for an actual implementation. For example, a specification for a string might describe it as a sequence of characters without specifying whether that sequence is stored in an array, linked list, etc. The notion of "sequence" is more abstract than arrays and linked lists but is still sufficient to talk about what the methods of the string class are supposed to do. In fact, using less detailed (more abstract) values actually helps clients reason about the class since they can ignore details of the implementation that are irrelevant to them.

Example

The following shows how we would document a class that represents a line segment.


/**
 * This class represents the mathematical concept of a line segment.
 *
 * Specification fields:
 *  @specfield start-point : point // The starting point of the line
 *  @specfield end-point   : point // The ending point of the line
 *
 * Derived specification fields:
 *  @derivedfield length : real // length = sqrt((start-point.x - end-point.x)^2 + (start-point.y - end-point.y)^2)
 *                              // The length of the line
 *
 * Abstract Invariant:
 *  A line's start-point must be different from its end-point.
 */
public class LineSegment {

  ... // Fields not shown.

 /**
  * @requires p != null && ! p.equals(start-point)
  * @modifies this
  * @effects Sets end-point to p
  */
  public void setEndPoint(Point p) {
    ...
  }

  ...

}

The javadoc above the class uses "specfield" clauses to describe the specification fields. In this case, we represent a line segment by its two end points.

In addition, it has a derived field, defined in a "derivedfield" clause. A derived field is simply a shorthand for some information that can be computed from the spec fields. Defining a shorthand for it allows us to simplify the assertions that appear elsewhere, e.g., in our method preconditions and postconditions.

The final component that appears in the javadoc for the class is an abstract invariant. This consists of assertions about the abstract values (described in terms of their spec fields or derived fields) that must hold at all times, at least from the client's perspective. (The methods of the class may break the invariant briefly as long as it holds again by the time they exit.)
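To illustrate how a method may break the invariant briefly, here is a minimal sketch. It assumes that the concrete fields mirror the spec fields, and the Point class, the moveTo method, and the checkInvariant helper are all hypothetical names chosen for this example.

```java
/** A hypothetical immutable 2D point, used only for this sketch. */
final class Point {
  final double x, y;
  Point(double x, double y) { this.x = x; this.y = y; }
  boolean sameAs(Point o) { return x == o.x && y == o.y; }
}

final class LineSegment {
  private Point start, end;  // assumed concrete fields, mirroring the spec fields

  /** @requires start and end are non-null and different points */
  LineSegment(Point start, Point end) {
    this.start = start;
    this.end = end;
    checkInvariant();
  }

  /**
   * @requires newStart and newEnd are non-null and different points
   * @modifies this
   * @effects Sets start-point to newStart and end-point to newEnd
   */
  void moveTo(Point newStart, Point newEnd) {
    start = newStart;
    // If newStart equals the old end point, the invariant is broken here.
    // In single-threaded code, no client can observe the object in this state.
    end = newEnd;
    checkInvariant();  // the invariant holds again before the method exits
  }

  Point getStart() { return start; }
  Point getEnd() { return end; }

  /** Throws if the abstract invariant (start-point != end-point) fails. */
  private void checkInvariant() {
    if (start.sameAs(end)) {
      throw new IllegalStateException("start-point must differ from end-point");
    }
  }
}
```

Calling moveTo with a new start point equal to the old end point passes through a state that violates the invariant, yet the check at the end of the method succeeds.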

Abstract Values

Mathematical Values

For some ADTs, the abstract values are well-described by concepts and notation that are common in mathematics and well-understood by software developers. Examples include:

a set of integers
a sequence of characters (i.e., a string)
a pair of real numbers (or a triple, or in general a tuple)

If you are specifying such a class, then you're in luck. You can use conventional notation for specifying the class's abstract values and methods. Such notation includes:

set comprehension: { x | P(x) } denotes the set of all elements x that satisfy the property P. More generally, { f(x) | P(x) } denotes the set of values of the expression f(x) for all x that satisfy the property P. For example, { x * x | x > 10 } denotes the set of squares of all numbers greater than 10.
set union: x ∪ y denotes the union of two sets x and y. (This can also be written x + y when there's no danger of confusion with addition.)
set membership: a ∈ x or a in x tests whether a is an element of the set x.
sequence construction: [a, b, c] denotes a sequence of three elements.
sequence concatenation: x : y denotes the concatenation of two sequences x and y. (This can also be written x + y when there's no danger of confusion with addition or union.)
sequence indexing: x[i] denotes the i^th element of a sequence x.
set or sequence size: |x| denotes the number of elements in a set or sequence x.
tuple construction: <a, b, c> is a tuple of three elements. This is also written (a, b, c). Unlike sequences, tuples are fixed-length, so we don't normally think about concatenating them.

You aren't obliged to use this syntax. Some of it is more standard than the rest: set-comprehension syntax is standard in just about all of mathematics, but sequence concatenation isn't particularly standardized. You may find it clearer to write sequence concatenation as a function like concat(x, y). What really matters is clarity and lack of ambiguity, so if you have any doubt whether your reader will understand you, just define it: “...where concat(x,y) is the concatenation of two sequences x and y.”
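For readers who think more readily in code, the following sketch shows one way some of this notation could be mirrored with java.util collections. The class NotationSketch and its method names are hypothetical. Specifications should still use the mathematical notation; this only illustrates what the notation means for finite sets and sequences.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

final class NotationSketch {
  /** { x * x | x in xs and x > 10 }: a set comprehension over a finite set. */
  static Set<Integer> squaresOfLarge(Set<Integer> xs) {
    return xs.stream()
        .filter(x -> x > 10)   // the property P(x)
        .map(x -> x * x)       // the expression f(x)
        .collect(Collectors.toCollection(TreeSet::new));
  }

  /** Set union of x and y, returned as a new set. */
  static Set<Integer> union(Set<Integer> x, Set<Integer> y) {
    Set<Integer> result = new TreeSet<>(x);
    result.addAll(y);
    return result;
  }

  /** concat(x, y): sequence concatenation, returned as a new list. */
  static <T> List<T> concat(List<T> x, List<T> y) {
    List<T> result = new ArrayList<>(x);
    result.addAll(y);
    return result;
  }
}
```

In the same spirit, set membership corresponds to contains, sequence indexing to get(i) (which counts from zero in Java), and |x| to size().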

Specification Fields

We describe abstract values (instances of our hypothetical classes) as a set of named fields, called specification fields. Each field has a value that is either a mathematical value of one of the kinds described above or an instance of another class, in which case it is an abstract value of that class.

In CSE 331, we have a Javadoc convention for describing spec fields: @specfield name : type // description. Here is an example:

/**
 * Represents an appointment for a meeting.
 * @specfield date : Date         // The time
 * @specfield room : integer      // The room number of the meeting's location
 * @specfield with : Set<Person>  // Whom the appointment is with
 */
class Meeting {...}

By convention, in specification fields, lowercase types like sequence or set refer to mathematical entities. Capitalized types refer to other ADTs (classes or interfaces). Where you have a choice, prefer a mathematical entity as the type of a spec field; it is better to use sequence than List, for example. It's more elegant, and reduces the coupling between your specification and particular Java types.

The presence of a specification field does not imply anything about the interface or implementation of the class. Although spec fields often correspond to getter methods, that's not always true. The interface might not provide any getter methods that query the spec field's state, so clients of the class might not have access to the information stored in the spec field. (An example is that a stack implementation might have a spec field for the elements of the stack, but a client might only be able to push and pop rather than being able to obtain the full state of the stack.) Likewise, the implementation might not actually have a concrete field of the spec field's type: that information may be computed from multiple concrete fields, or it might not be available at all. The point is that specification fields are useful for giving method specifications in terms of the abstraction being provided.
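The stack example can be sketched as follows. The class IntStack and its use of ArrayDeque are assumptions made for illustration. Note that the spec field elements is a sequence, while the concrete field is a Deque, and that no method returns the whole sequence.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Represents a last-in, first-out stack of integers.
 *
 * @specfield elements : sequence // The items on the stack; the last one is the top
 */
final class IntStack {
  // Concrete representation: an implementation choice, not part of the spec.
  private final Deque<Integer> items = new ArrayDeque<>();

  /**
   * @modifies this
   * @effects Appends x to the end of elements
   */
  void push(int x) {
    items.addLast(x);
  }

  /**
   * @requires elements is not empty
   * @modifies this
   * @effects Removes the last item of elements
   * @return the item that was removed
   */
  int pop() {
    return items.removeLast();
  }
}
```

The method specifications talk only about elements, so the Deque could later be replaced by an array or a linked list without changing a word of the documentation.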

Derived Fields

A derived field is a named piece of information that can be computed from the specification fields. For example, consider this class:

/**
 * Represents a square.
 * @specfield length : int // The length of the square's sides
 * @derivedfield area : int // area = length^2. The area of the square
 *
 * Abstract Invariant:
 *  length > 0
 */
class Square {...}

The derived field area can be derived by squaring the length specification field. A derived field's documentation should state how it is derived from the specification fields.

A derived field's purpose is to help with writing the specification: it is easier to write and understand area than length^2. Derived fields are a shorthand that can make class and method specifications easier to understand. Because a derived field is defined entirely in terms of specification fields (or other derived fields), method specifications do not need to state a method's effects on derived fields. For example:

/**
 * Represents a square.
 * @specfield length : int // The length of the square's sides
 * @derivedfield area : int // area = length^2. The area of the square
 *
 * Abstract Invariant:
 *  length > 0
 */
class Square {

  /**
   * Creates a new Square with length = len.
   * @requires len > 0
   * @effects a new Square s with s.length = len
   */
  public Square(int len) {
    ...
  }

  /**
   * Returns the difference in area between this and s.
   * @return this.area - s.area
   */
  public int differenceInArea(Square s) {
    ...
  }

  /**
   * Sets this.length to len.
   * @requires len > 0
   * @effects sets this.length to len
   */
   public void setLength(int len) {
     ...
   }
}

Notice how the method specification of differenceInArea uses the derived field area to make it easier to explain what it returns.

It is never necessary for a method specification to indicate its effect on a derived field because the class documentation has defined the derived field in terms of specification fields; hence, we can already determine how the derived fields change from the information about how spec fields change. For example, since area is a derived field, the constructor does not need to say what the newly constructed Square's area is (we can determine that from what the length is). Similarly, the method specification for setLength does not need to document its effect on area.

Note that, on the other hand, we could have made area a specification field instead of a derived field. This would mean that we would not need to say how area is computed from the spec fields. However, in that case, the constructor and setLength would be required to specify their effects on area.

Suppose you have a derived field f. It is permissible for there to be a concrete field in the implementation that stores the value of f, or for there to be a method that computes the value of f, or for there to be no such field or method. That is an implementation detail that is of no interest to clients of the specification.
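As a sketch of the first two possibilities for the derived field area, consider the two hypothetical implementations below (the class names ComputedSquare and CachedSquare are invented for this example). The third possibility, having no area field or method at all, is equally valid.

```java
/** Sketch 1: area is computed on demand from the concrete length field. */
final class ComputedSquare {
  private int length;

  ComputedSquare(int len) { length = len; }

  void setLength(int len) { length = len; }

  int area() { return length * length; }
}

/** Sketch 2: area is cached in a concrete field that every mutator updates. */
final class CachedSquare {
  private int length;
  private int area;  // concrete copy of the derived field

  CachedSquare(int len) { setLength(len); }

  void setLength(int len) {
    length = len;
    area = len * len;  // keep the cache consistent with length
  }

  int area() { return area; }
}
```

Both classes satisfy the same specification, and in both cases the specification of setLength only needs to mention length.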

Method Specifications

As described in the document on method specifications, we frequently need to talk about the state of objects in order to specify the intended behavior of our methods.

This applies to all of the clauses in the javadoc for a method ("requires", "return", ...); however, the most common case is with the "effects" clause, which describes how the state of an object is changed by that method. The most common case of that is for the object "this". In that case, we are describing how the method changes the state of the object on which the method was called.

In all of these circumstances where we wish to describe the state of an object, we do so using its specification and derived fields. There should be no references to the actual implementation of that class (for all of the reasons described above).

It is worth noting that we can even refer to spec fields in the "modifies" clause, which is a list of variables that may be modified by the method. If object x has specification fields f, g, and h, then "modifies x" means that any combination of x.f, x.g, and x.h might be modified. However, if we write "modifies x.g, x.h", then we know that x.f is not changed.
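The difference between the two forms of the "modifies" clause can be sketched on a cut-down version of the Meeting class. The spec field topic and the methods moveTo and reschedule are hypothetical additions made only for this example.

```java
/**
 * Represents an appointment for a meeting (a cut-down sketch).
 * @specfield room  : integer // The room number of the meeting's location
 * @specfield topic : string  // What the meeting is about (hypothetical field)
 */
final class Meeting {
  private int room;
  private String topic;

  /** @effects a new Meeting m with m.room = room and m.topic = topic */
  Meeting(int room, String topic) {
    this.room = room;
    this.topic = topic;
  }

  /**
   * @modifies this.room
   * @effects Sets room to newRoom
   */
  void moveTo(int newRoom) {
    room = newRoom;  // topic must stay unchanged: it is not listed in modifies
  }

  /**
   * @modifies this
   * @effects Sets room to newRoom and topic to newTopic
   */
  void reschedule(int newRoom, String newTopic) {
    room = newRoom;
    topic = newTopic;
  }

  /** @return this.room */
  int getRoom() { return room; }

  /** @return this.topic */
  String getTopic() { return topic; }
}
```

A client who reads "modifies this.room" on moveTo can rely on topic being the same after the call, without reading the method body.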

Here is an example specification for a method on the LineSegment class:

/**
 * @requires l != null && l.start-point is equal to this.end-point
 * @return a line segment that is equal to this + l; that is, l appended to this
 */
public LineSegment add(LineSegment l) {...}

This method specification refers to the specification fields start-point and end-point in order to describe the precondition of the method (in the "requires" clause).

We can also refer to derived fields in method specifications as shown in this example:

/**
 * @return true iff this.length > l.length
 */
public boolean longer(LineSegment l) {...}

Subclasses and overridden methods

A subclass often has a different (stronger) specification than its superclass, and often has a larger abstract state than its superclass. When the specification and abstract state are identical to those of the parent (for instance, an implementation of an abstract class), then there is no need to repeat them in the subclass. However, it is helpful to include a brief note indicating that the superclass documentation should be used instead. That note helps readers to distinguish whether the specification is the same, or the author simply didn't document the class.

When the specifications differ, then you have two options. The first option is to repeat, in the subclass, the full superclass documentation. The advantage is that everything is in one place, which may improve understanding. The second option is to augment the existing specification -- for example, to add a few new specification fields and constraints on them. Whichever you do, make sure that you clearly indicate your approach.

Similar rules hold for a method that overrides another method. It is acceptable to leave the Javadoc blank if the specification is identical. (The generated HTML will use the overridden method's Javadoc documentation, but a normal Java comment is a good hint to someone who is reading the source code.) Otherwise, it is usually better to give the complete specification. If you merely augment the overridden method's specification, be sure to refer to it in the documentation.
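The following sketch shows the second option (augmenting the superclass specification) together with both kinds of overriding method. CountingSquare, its spec field resizes, and the getLength method are hypothetical and exist only to illustrate the conventions described above.

```java
/**
 * Represents a square.
 * @specfield length : int // The length of the square's sides
 *
 * Abstract Invariant:
 *  length > 0
 */
class Square {
  private int length;

  /**
   * @requires len > 0
   * @effects a new Square s with s.length = len
   */
  Square(int len) { length = len; }

  /**
   * @requires len > 0
   * @modifies this
   * @effects sets this.length to len
   */
  void setLength(int len) { length = len; }

  /** @return this.length */
  int getLength() { return length; }
}

/**
 * Represents a square that also counts how often it has been resized.
 * This augments the specification of Square; everything documented there still applies.
 * @specfield resizes : integer // The number of setLength calls so far
 */
class CountingSquare extends Square {
  private int resizes = 0;

  /**
   * @requires len > 0
   * @effects a new CountingSquare s with s.length = len and s.resizes = 0
   */
  CountingSquare(int len) { super(len); }

  /**
   * Augments Square.setLength: all clauses listed there apply, and in addition:
   * @effects increases this.resizes by 1
   */
  @Override
  void setLength(int len) {
    super.setLength(len);
    resizes++;
  }

  // Specification is identical to Square.getLength; see the superclass Javadoc.
  @Override
  int getLength() { return super.getLength(); }

  /** @return this.resizes */
  int getResizes() { return resizes; }
}
```

The plain comment on getLength tells a reader of the source that the missing Javadoc is deliberate, while the Javadoc on setLength names the specification it extends.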