Contents:
This handout describes how to document the specifications of classes and methods. This document focuses on practical issues.
This document uses a Line
class as an
example. We do not provide fields or method bodies in our
example. This document covers specifying the behavior of classes and
methods (what they should do), not their implementation (what they
actually do and how they do it).
Abstraction
Functions and Representation Invariants covers how to document a
class's implementation.
/** * This class represents the mathematical concept of a line segment. * * Specification fields: * @spec.specfield start-point : point // The starting point of the line. * @spec.specfield end-point : point // The ending point of the line. * * Derived specification fields: * @spec.derivedfield length : real // length = sqrt((start-point.x - end-point.x)^2 + (start-point.y - end-point.y)^2) * // The length of the line. * * Abstract Invariant: * A line's start-point must be different from its end-point. */ public class Line { ... // Fields not shown. /** * @spec.requires p != null && ! p.equals(start-point) * @spec.modifies this * @spec.effects Sets end-point to p */ public void setEndPoint(Point p) { ... } ... }
Because the concepts discussed here are interrelated, this document starts with a short list of definitions before diving into the details.
Line
represents some line segment.
Line
is made up
by the specification fields start-point
and end-point
.
Line
has the derived
field length
, which describes the length of the line segment.
Line
requires that no instance
has the same start and end point. Abstract invariants are expressed in terms of the abstract state.
Note that this is not the same as the Representation Invariant (RI) that describes properties of the concrete representation. An abstract invariant, if one is present, specifies constraints on abstract values only.
Line
's setEndPoint
method
updates the end-point
specification field.
The above concepts are included in a class's external specification (in Javadoc comments). They help document for clients how to use the class.
The rest of this document is organized as follows. First, it explains how to document what a class abstractly represents using abstract state, specification fields and derived fields. Then, it explains how to specify method behavior, in terms of abstract state.
The abstract value of an object is the information that clients use when reasoning about an object. For example, a line segment is defined in terms of a start point and an end point. This does not necessarily imply that the concrete representation of the object has two point fields. That is one representation, but there are others, such as a start point, an angle, and a length. Abstract values are typically at a less detailed level than an implementation because this helps clients reason about only what matters to them. For example, clients of some representation of strings just need to know that a string is a sequence of characters, not whether that sequence is implemented with an array, a linked list, a combination of the two, or some completely different way. The notion of sequence is more abstract than particular ways to represent sequences.
For some ADTs, the abstract values are well-described by concepts and notation that are common in mathematics and well-understood by software developers. Examples include:
If you are specifying such a class, then you're in luck. You can use conventional notation for specifying the class's abstract values and methods. Such notation includes:
You aren't obliged to use this syntax. Some of it is more standard than the rest: set-comprehension syntax is standard in just about all of mathematics, but sequence concatenation isn't particularly standardized. You may find it clearer to write sequence concatenation as a function like concat(x, y). What really matters is clarity and lack of ambiguity, so if you have any doubt whether your reader will understand you, just define it: “...where concat(x,y) is the concatenation of two sequences x and y.”
Usually, abstract values are not only simple mathematical objects like numbers or sets. In these cases, it's more useful to think about the abstract value as if it were an object with fields. For example, a line has a start and an end; a mailing address has a number, street, city, and zipcode; and a URL has a protocol, host name, and resource name.
Mathematically, this is the same as a tuple; it's just a tuple whose parts have useful names for humans, rather than simply parts in some order without names. So even though we could use tuples, it's convenient, and more readable, to break the abstraction state into named parts, where each part is a specification field. (Specification fields are more commonly called abstract fields, because they're fields of the abstract value, as opposed to rep fields which are fields of the representation value. Unfortunately, abstract has another meaning in Java, so we will avoid that potentially confusing terminology.)
Specification fields often (but not always) correspond to observers or
getters on
the abstract data type. Because the structure of abstract values
is a matter of interest to the clients of your
class, the specification fields should be listed in the class
overview. In CSE 331, we have a Javadoc convention for describing
them: @spec.specfield name : type // description
. Here's an
example:
/** * Represents an appointment for a meeting. * @spec.specfield date : Date // The time * @spec.specfield room : integer // The room number of the meeting's location * @spec.specfield with : Set<Person> // Whom the appointment is with */ class Meeting {
By convention, in specification fields, lowercase types like sequence or set refer to mathematical entities. Capitalized types refer to other ADTs (Java classes or interfaces). Where you have a choice, prefer a mathematical entity as the type of a spec field; it is better to use sequence than List, for example. It's more elegant, and reduces the coupling between your specification and particular Java types.
The presence of a specification field does not imply anything about the interface or implementation of the class. Although spec fields often correspond to observer methods, that's not always true. (An observer method is one that computes a value without performing any side effects. All getter methods are observers, but not all observers are getters.) The interface might not provide any observers that query the spec field's state, so clients of the class might not have access to the information stored in the spec field. (An example is that a stack implementation might have a spec field for the elements of the stack, but a client might only be able to push and pop rather than being able to obtain the full state of the stack.) Likewise, the implementation might not actually have a concrete field of the spec field's type: that information may be computed from multiple concrete fields, or it might not be available at all. The point is that specification fields are useful for giving method specifications in terms of the abstraction being provided.
Derived fields are information that can be derived from the specification fields, and that it is useful to give a name to. For example, consider this class:
/** * Represents a square. * @spec.specfield length : int // The length of the square's sides. * @spec.derivedfield area : int // area = length^2. The area of the square. * * Abstract Invariant: * length > 0 */ class Square {...}
The derived field area
can be derived by squaring the
length
specification field.
A derived field's documentation should state how it is derived from the
specification fields.
A derived field's purpose is to help with writing method
specifications, abstraction functions, and representation
invariants: It is easier to write and understand area
than
length^2
.
They are a shorthand that can make class and
method specifications easier to understand. Because a derived field
is defined entirely in terms of specification fields (or other derived
fields), method specifications do not
need to state a method's effects on a derived fields. For example:
/** * Represents a square. * @spec.specfield length : int // The length of the square's sides. * @spec.derivedfield area : int // area = length^2. The area of the square. * * Abstract Invariant: * length > 0 */ class Square { /** The length of the square's sides. */ int size; // Abstraction Function: // AF(r) = a square, s, with s.length = r.size. // // Representation Invariant: // size > 0 /** Creates a new Square with length = len. * @spec.requires len > 0 * @spec.effects constructs a new Square s with s.length = len */ public Square(int len) { if (len ≤ 0) { throw new IllegalArgumentException(len + " < 0"); } this.size = len; } /** Returns the difference in area between this and s. * @return this.area - s.area */ public int differenceInArea(Square s) { return (this.size*this.size) - (s.size*s.size); } /** Sets this.length to len. * @spec.requires len > 0 * @spec.effects sets this.length to len */ public void setLength(int len) { if (len ≤ 0) { throw new IllegalArgumentException(len + " ≤ 0"); } size = len; } }
Notice how the method specification of differenceInArea
uses the derived field area
to make it easier to explain what
it returns.
It is never necessary for a method specification to indicate its
effect on a derived specification field because the class
documentation has defined the derived specification field in terms of
specification fields. Since area
is a derived field,
the constructor does not need to say what the newly constructed
Square
's area
is. Similarly, the method
specification for setLength
does not need to document its
effect on area
.
When we relate concrete implementations to abstract values using Abstraction Functions, we will similarly only need to describe the abstraction in terms of specification fields and then the derived fields will follow from the specification fields.
Note that one could have made area
a specification field
instead of a derived field. This would relieve the
programmer from the responsibility of documenting how area
can be derived from the specification fields. However, in this case,
the constructor and setLength
would be required to specify
their effects on area
.
Suppose you have a derived specification field f
. It is
permissible for there to be a concrete field in the implementation that
stores the value of f
, or for there to be a method that computes
the value of f
, or for there to be no such field or method. That is an
implementation detail that is of no interest to clients of the
specification.
Another way to express the difference between a derived field and a specification field is the following. A derived field can be determined from one or more specification fields. A specification field might be computed from one or more concrete fields (or not stored in any concrete fields at all).
Method specifications describe the behavior of a method in terms of its preconditions and postconditions. Note that method specifications may only refer to specification fields, method arguments, and global variables (also known as public static fields), never to the concrete fields of the implementation.
Preconditions are properties that must be true when the method is called. It is the responsibility of the caller to guarantee that these properties hold. If the preconditions do not hold, the method is allowed to behave in absolutely any fashion, including crashing the program, continuing with incorrect results, informing the user of the problem, or gracefully recovering from the problem. Callers should always assume that preconditions are not checked by a method. However, it is good practice — though not required — for an implementation to check its preconditions (if the check can be performed efficiently) and throw an exception if they are not satisfied.
Preconditions are indicated by the "requires" clause in a method specification. If a "requires" clause is omitted from the method specification, it is assumed that the method does not have any preconditions.
The CSE 331 specification style doesn't include @param
, but
it's good to write it in Javadoc anyway, because that is good Java style.
Javadoc's @param
cannot express preconditions about global
state and is sometimes inconvenient for preconditions that relate multiple
formal parameters, such as "elt
is a member
of list
". Because @spec.requires
is needed
whether or not @param
exists, the CSE 331 specification
style puts all the preconditions into @spec.requires
, to
simplify explanations and so there is only one place to look for
preconditions.
Postconditions are properties that a method guarantees will hold when the method exits. However, if the precondition did not hold when the method was called, then nothing else is relevant, and the method may behave in any fashion whatsoever. In particular, if the precondition does not hold upon method entry, then the postcondition need not hold on method exit.
A postcondition can be written as a single complex logical formula, but it is convenient to separate it into logically distinct parts. CSE 331 uses "return", "effects", "throws", and "modifies". (In the descriptions below, "default" indicates what is assumed if that clause is omitted from the method specification.)
x
has specification fields f
,
g
, and h
, then "modifies x
" means
that any combination of x.f
, x.g
, and x.h
might be modified. "modifies x.g, x.h
" would be more
restrictive.
Often, programmers are more interested in quantities that are
not listed in the modifies clause, since those are
guaranteed not to be changed by the method.
Specification fields are useful for writing specifications of the
ADT's operations.
Here's a specification for a method on the
Line
class:
/** * @spec.requires l.start-point is equal to this.end-point && l != null * @return a line segment that is equal to this + l; that is, l appended to this */ public Line add(Line l) {...}
Specifications may refer to specification fields (such as
start-point
and end-point
), but never to representation fields. Rep fields depend on a particular
implementation, so we don't want to expose them in a specification,
which can be implemented in many different ways.
When you're writing specifications for operations, you may find it useful to define derived spec fields. A derived spec field is just a spec field that can be written in terms of other spec fields. In other words, it's a shorthand.
You can freely use derived spec fields in a method specification (and easing such specifications is the point of derived spec fields):
/** * @return true is this.length > l.length */ public boolean longer(Line l) {...}
A subclass often has a different (stronger) specification than its superclass, and often has a larger abstract state than its superclass. When the specification and abstract state are identical to those of the parent (for instance, an implementation of an abstract class), then there is no need to repeat them in the subclass. However, it is helpful to include a brief note indicating that the superclass documentation should be used instead. That note helps readers to distinguish whether the specification is the same, or the author simply didn't document the class.
When the specifications differ, then you have two options. The first option is to repeat, in the subclass, the full superclass documentation. The advantage is that everything is in one place, which may improve understanding. The second option is to augment the existing specification — for example, to add a few new specification fields and constraints on them. Whichever you do, make sure that you clearly indicate your approach, and that you are consistent throughout the codebase.
Similar rules hold for a method that overrides another method. It is acceptable to omit the Javadoc comment if the specification is identical. (The generated HTML will use the overridden method's Javadoc documentation, but a normal Java comment is a good hint to someone who is reading the source code.) Otherwise, it is usually better to give the complete specification. If you merely augment the overridden method's specification, be sure to refer to it in the documentation.
Back to the CSE 331 handouts home page.