An abstract data type is a programmer-defined type whose internal representation is hidden, and which can only be accessed via the operations that the programmer provides. For example, consider the following Stack structure in ML:
    structure Stack :> sig
        type 'a stack
        exception EmptyStack
        val create : 'a stack
        val isEmpty : 'a stack -> bool
        val push : 'a -> 'a stack -> 'a stack
        val pop : 'a stack -> 'a stack
        val peek : 'a stack -> 'a
    end = struct
        type 'a stack = 'a list
        exception EmptyStack
        val create = nil
        fun isEmpty nil = true
          | isEmpty (x::xs) = false
        fun push v (s:'a stack) = v::s
        fun pop nil = raise EmptyStack
          | pop (x::xs) = xs
        fun peek nil = raise EmptyStack
          | peek (x::xs) = x
    end
The use of an opaquely ascribed signature that omits the representation of 'a stack makes this type abstract. The programmer can only access the stack via the operations that are defined on it.
Pure abstract data types first appeared at the language level in CLU, a language designed by Barbara Liskov et al. in the mid-1970s.
In languages that do not provide any mechanisms for hiding data representation, we can still program in an ADT-like style, relying on programmer discipline to "hide" representation. For example, consider the following stack abstraction in Scheme (we assume the presence of an exception library like the one we defined in our notes on call/cc):
    (define empty-stack '())

    (define (push v a-stack) (cons v a-stack))

    (define (empty? a-stack) (null? a-stack))

    (define (pop a-stack)
      (if (null? a-stack)
          (raise '(empty-stack "Trying to pop empty stack"))
          (cdr a-stack)))

    (define (peek a-stack)
      (if (null? a-stack)
          (raise '(empty-stack "Trying to peek top of empty stack"))
          (car a-stack)))   ; the top element is the car of the list
Using this implementation strategy, we cannot prevent the programmer from accessing the private data representation of a stack. The abstraction is only a "friendly agreement" between the implementor and the client.
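To make the "friendly agreement" concrete, here is a minimal Python sketch of the same exposed-representation style. The function names mirror the Scheme code above. The `EmptyStack` exception class is an assumption standing in for the exception library the notes refer to, and a Python list stands in for the Scheme list. The last two lines show a client reaching past the stack operations, which the language does nothing to prevent.

```python
class EmptyStack(Exception):
    """Hypothetical exception type, standing in for the notes' exception library."""


empty_stack = []  # the representation is an ordinary Python list


def push(v, a_stack):
    return [v] + a_stack  # the top of the stack is the first element


def is_empty(a_stack):
    return not a_stack


def pop(a_stack):
    if not a_stack:
        raise EmptyStack("Trying to pop empty stack")
    return a_stack[1:]


def peek(a_stack):
    if not a_stack:
        raise EmptyStack("Trying to peek top of empty stack")
    return a_stack[0]


# A well-behaved client uses only the operations above.
s = push(2, push(1, empty_stack))

# An ill-behaved client treats the stack as the list it really is.
bottom = s[-1]        # reads the bottom element; no stack operation allows this
reversed_s = s[::-1]  # peek(reversed_s) now reports the oldest element as the top
```

Both the well-behaved and the ill-behaved client run without complaint; only discipline separates them.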
This "exposed representation" style of implementation of ADTs is often used in purely procedural languages like C and Pascal --- these languages have neither ADTs, nor objects, nor first-class lexically scoped functions (the last of these, as we shall see below, can be used to hide representation).
Exposed representations are also common in Lisp, because Lisp programmers frequently want to expose the representation: it is usually a list, and can therefore be manipulated using the full suite of list functions (filter, foldr, etc.). The power of "lists as the universal interface" is compelling enough that Lisp programmers are often willing to live with the risk that some client will accidentally access and break the representation.
Procedural data abstraction (PDA; a term due to William Cook) is a different way of simulating some of the features of ADTs in a language without ADT features. Actually, we've already seen an example of PDA: the point "object" from previous notes. Here's a Scheme implementation of procedural data abstraction of stacks:
    (define (empty-stack)
      (lambda (method . args)
        (case method
          ((empty?) #t)
          ((push) (non-empty-stack (car args) (empty-stack)))
          ((pop) (raise '(empty "Tried to pop empty stack")))
          ((peek) (raise '(empty "Tried to peek top of empty stack"))))))

    (define (non-empty-stack top rest)
      (lambda (method . args)
        (case method
          ((empty?) #f)
          ((push) (non-empty-stack (car args) (non-empty-stack top rest)))
          ((pop) rest)
          ((peek) top))))
In this formulation, we use functions (procedures) to perform the abstraction. We are relying on the fact that the only legal operation on a function is to call it with some arguments. Since the function controls what arguments it may accept, it can therefore control access to its representation.
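The same idea can be sketched in Python with closures. This is only an illustration of the technique, not part of the original notes: the method names are passed as strings rather than Scheme symbols, and `EmptyStack` is a hypothetical exception class standing in for the assumed exception library.

```python
class EmptyStack(Exception):
    """Hypothetical exception type; the notes assume a Scheme exception library."""


def empty_stack():
    def dispatch(method, *args):
        if method == "empty?":
            return True
        if method == "push":
            return non_empty_stack(args[0], empty_stack())
        if method == "pop":
            raise EmptyStack("Tried to pop empty stack")
        if method == "peek":
            raise EmptyStack("Tried to peek top of empty stack")
        raise ValueError(f"unknown method: {method}")

    return dispatch


def non_empty_stack(top, rest):
    # top and rest are captured by the closure; the client only ever
    # receives the dispatch function.
    def dispatch(method, *args):
        if method == "empty?":
            return False
        if method == "push":
            return non_empty_stack(args[0], non_empty_stack(top, rest))
        if method == "pop":
            return rest
        if method == "peek":
            return top
        raise ValueError(f"unknown method: {method}")

    return dispatch


# Every interaction is a call on the function that represents the stack.
s = empty_stack()("push", 1)("push", 2)
top = s("peek")
rest_top = s("pop")("peek")
```

One caveat specific to this sketch: Python lets a determined client inspect a closure through its `__closure__` attribute, so the hiding is weaker here than the argument above assumes.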
OOP can be viewed as a relative of PDA. The key feature that distinguishes ADT-style or PDA-style programming from OOP is inheritance. Inheritance adds greater opportunities for code reuse, and adds an important step to the design process.
When programming in abstract data types (or PDAs), the design process proceeds roughly as follows:
The OOP design process is as follows:
Superclass factoring can occur at many points in the process. Usually, one recognizes common functionality after doing some design, and one recognizes even more after doing some implementation.
Two common forms of superclass factoring are refactoring and framework definition:
There are two basic flavors of inheritance:
- Inheritance for a common interface. For example, one might define IconButton as a subclass of Button because an icon button is a special kind of button.
- Inheritance for code reuse. For example, one might have Stack inherit from Array because you can reuse the implementation of the array to hold the elements.

Generally, organizing for common interfaces is a better idea in the long run than inheritance purely for code reuse. (Although the Smalltalk libraries do have plenty of examples of the latter; even Smalltalk programmers aren't perfect.)
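A small Python sketch may help contrast the two uses of inheritance. All names other than Button, IconButton and Stack are hypothetical (`click`, `label`, `icon`, `press`), and Python's built-in `list` stands in for Array. The point of the second half is that inheriting purely for reuse also inherits operations that make no sense for a stack.

```python
class Button:
    def __init__(self, label):
        self.label = label

    def click(self):
        return f"clicked {self.label}"


class IconButton(Button):
    """Inheritance for a common interface: an icon button is a kind of button."""

    def __init__(self, label, icon):
        super().__init__(label)
        self.icon = icon


def press(button):
    # Written against Button; any subclass with the same interface works here.
    return button.click()


class Stack(list):
    """Inheritance purely for code reuse: the list holds the elements."""

    def push(self, v):
        self.append(v)

    def peek(self):
        return self[-1]


result = press(IconButton("save", "disk.png"))

stack = Stack()
stack.push(1)
stack.push(2)
stack.insert(0, 99)  # inherited from list; not a stack operation, yet allowed
```

Here `IconButton` can be used anywhere a `Button` is expected, while `Stack` exposes the whole list interface to its clients whether or not that was intended.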
A concrete class is one that is intended to be instantiated directly; an abstract class is one that is intended to be used purely for inheritance.
Generally, one should make only "leaf" classes concrete. In other words, one should not inherit from concrete classes if it is possible to avoid it. Instead, create an abstract class that is the parent of the concrete class, and inherit from that. (This is sometimes not possible: for example, most languages do not permit you to change the superclass of a class without altering its source code, so if you don't have access to the source, you can't do this.)
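As a rough illustration of the "only leaves are concrete" guideline, the following Python sketch uses the standard `abc` module. The class names `AbstractButton` and `TextButton` and the methods `click` and `render` are assumptions made for the example. Rather than having one concrete button inherit from another, both leaf classes inherit from an abstract parent that cannot be instantiated.

```python
from abc import ABC, abstractmethod


class AbstractButton(ABC):
    """Abstract parent: holds the shared behavior, never instantiated directly."""

    def __init__(self, action):
        self.action = action

    def click(self):
        return self.action()

    @abstractmethod
    def render(self):
        """Each concrete leaf class decides how it is drawn."""


class TextButton(AbstractButton):
    def __init__(self, action, text):
        super().__init__(action)
        self.text = text

    def render(self):
        return f"[{self.text}]"


class IconButton(AbstractButton):
    def __init__(self, action, icon):
        super().__init__(action)
        self.icon = icon

    def render(self):
        return f"<{self.icon}>"


ok = TextButton(lambda: "confirmed", "OK")
save = IconButton(lambda: "saved", "disk.png")
```

Attempting `AbstractButton(lambda: None)` raises `TypeError`, because `render` is still abstract; only the leaf classes can be instantiated.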
The Smalltalk collections are heavily factored in the OOP style. Consider the following concrete collections:
All collections support certain methods, including iteration via do:, which takes a single-argument block and applies it to each element in the collection.
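To see why a shared iteration method is useful for factoring, consider this Python analogy. It is a sketch and not Smalltalk's actual library code: the class names `ArrayCollection` and `IntervalCollection` and the methods `to_list`, `size` and `includes` are assumptions chosen for illustration. The abstract parent defines its shared operations once, in terms of `do`, and each concrete collection supplies only `do`.

```python
from abc import ABC, abstractmethod


class Collection(ABC):
    @abstractmethod
    def do(self, block):
        """Apply the single-argument callable block to each element."""

    # Shared operations, written once in terms of do.
    def to_list(self):
        items = []
        self.do(items.append)
        return items

    def size(self):
        return len(self.to_list())

    def includes(self, target):
        return target in self.to_list()


class ArrayCollection(Collection):
    """Stores its elements explicitly."""

    def __init__(self, *items):
        self._items = list(items)

    def do(self, block):
        for item in self._items:
            block(item)


class IntervalCollection(Collection):
    """Elements start, start + 1, ..., stop; computed on demand, never stored."""

    def __init__(self, start, stop):
        self._start = start
        self._stop = stop

    def do(self, block):
        for i in range(self._start, self._stop + 1):
            block(i)


letters = ArrayCollection("a", "b", "c")
numbers = IntervalCollection(1, 5)

squares = []
numbers.do(lambda n: squares.append(n * n))
```

Both concrete classes get `size` and `includes` without defining them, which is the kind of code reuse the inheritance hierarchy is meant to provide.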
Smalltalk factors these classes into an inheritance hierarchy. Here are some of the abstract collections Smalltalk defines:
Smalltalk organizes these in a particular way that trades off some "purity" of interface for code reuse. If we were factoring these collections ourselves, in the "cleanest" way possible, how would we do it?