class name | Pair |
---|---|
superclass | Object |
instance variable names | first second |
methods | first ^ first |
 | second ^ second |

Fig. 1: The Pair class.
Consider the Pair class shown in Fig. 1. Notice that, since instance variables are only accessible in their owning class, and this class provides only getters and no setters, there is no way to mutate the members of a Pair. This is Smalltalk's idiom for immutable data.
However, as written, this class has an obvious deficiency.
Recall that classes are objects, and that we instantiate objects by sending the new message to the class. We could produce a fresh instance of this class with the expression
Pair new
However, the default implementation of new only allocates space for the instance variables and initializes them to nil. Since we cannot mutate the fields, this Pair class is only useful for holding pairs of nils.
One solution to this problem is to provide a method to initialize the fields:
first: firstVal second: secondVal
    first := firstVal.
    second := secondVal.
    ^ self
Now a programmer can initialize the Pair:
(Pair new) first: 3 second: 4
However, it would be nice if we could customize instantiation directly. What we'd really like to do is send a customized instantiation message to the Pair class:
Pair first: 3 second: 4
and have it return a fresh instance that was already initialized.
It turns out that Smalltalk has so-called class methods, which play roles similar to those played by constructors and static methods in Java. In the Squeak class browser, there is a button below the class list which allows you to select and define "class" methods, as shown in Fig. 2.
When we select the "class" method button, instance methods and definitions are no longer visible in the browser. Instead, we view class methods. When you first select the class button, the metaclass description will appear in the code pane of the browser. We describe metaclasses further below, but for the moment you can just think of this as a description of your class's behavior.
A class method is a method that handles messages sent to the class, rather than instances of the class. This has the following consequences:
- self is bound to the class object itself. Notice that we use self new in the body of the class method to construct the raw object, rather than Pair new.

Class methods are useful not only for implementing specialized instantiators (roughly corresponding to "constructors" in some other object-oriented languages) but for any kind of behavior that you would like to share among classes.
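
For readers coming from Java, the closest everyday analogue of a class-side instantiation method is a static factory method. The following is only a rough sketch: the method name `of` and the `PairDemo` class are hypothetical and do not come from these notes.

```java
// Hypothetical Java analogue of the Smalltalk Pair class with a
// class-side instantiation method.
final class Pair {
    private final Object first;
    private final Object second;

    // Private constructor: clients must go through the static factory.
    private Pair(Object first, Object second) {
        this.first = first;
        this.second = second;
    }

    // Plays the role of the class message "Pair first: 3 second: 4".
    static Pair of(Object first, Object second) {
        return new Pair(first, second);
    }

    Object first()  { return first; }
    Object second() { return second; }
}

class PairDemo {
    public static void main(String[] args) {
        Pair p = Pair.of(3, 4);
        System.out.println(p.first() + ", " + p.second()); // prints "3, 4"
    }
}
```

One difference is worth noting: the Java factory names the class statically in new Pair(...), whereas the Smalltalk class method sends new to self, which is whatever class object received the message.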
Recall the following, from our Smalltalk introductory notes
Following these rules to their logical conclusion, we conclude that:
But where is this "class of a class"? There are several answers that might be "reasonable" in a language design. Smalltalk-80 (and therefore Squeak) takes one particular route, but it is instructive to examine other possibilities before we explain the full-fledged Smalltalk-80 solution.
Class
In Smalltalk-76, the class of a class is Class (note the capital letter). Class implements all the methods that any class would need; for example, new is implemented by the Class class.
The problem with this approach is that all classes share Class. If we wish to add a "class method" to any class (for example, our first:second: method for Pair), then we must add it as a method of Class, which means that every class, not just Pair, would respond to it.
This gets cumbersome, for obvious reasons. A full-fledged Smalltalk environment will have thousands of classes. Forcing all the classes to share Class will result in namespace management problems: there are only so many short, intuitive message names to go around.
Java is actually quite similar to Smalltalk-76; Java has a mechanism called reflection (which is no big deal in Smalltalk, but which people coming from the C++ world found rather exotic) whereby you can ask an object for its class:
// Reflective Java code; evaluates to the String.class value
"hi".getClass();
Once again, if you ask a class for its class, you get Class:
Class c = "hi".getClass().getClass();
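
A small runnable check of this claim, as a sketch: the class name `ClassOfClassDemo` and the helper `classOf` are made up for illustration. It shows that the chain stops growing after one step, because the class of Class is Class itself.

```java
class ClassOfClassDemo {
    // Thin wrapper around getClass(), so the chain is easy to follow.
    static Class<?> classOf(Object o) {
        return o.getClass();
    }

    public static void main(String[] args) {
        Class<?> c1 = classOf("hi"); // String.class
        Class<?> c2 = classOf(c1);   // Class.class
        Class<?> c3 = classOf(c2);   // Class.class again
        System.out.println(c1.getName()); // prints "java.lang.String"
        System.out.println(c2.getName()); // prints "java.lang.Class"
        System.out.println(c2 == c3);     // prints "true"
    }
}
```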
The Class class is shared by all classes. So why don't we have the same problem in Java that we'd have in Smalltalk-76? The answer is that the Java language designers elected to add extra kinds of declarations and expressions to the language, namely:
- Constructors:

    public class Foo {
        int x;
        public Foo(int x) { this.x = x; }
    }
    public class Bar extends Foo {}

  If constructors were "like any other method", then we would be able to call new Bar(3), because the single-argument integer constructor would be inherited from Foo. This is not the case. (In fact, language lawyers will note that the definition of Bar above will not compile: Bar's one constructor, the no-argument constructor that gets automatically generated if the user doesn't define any, implicitly invokes super(), and Foo has no no-argument constructor.)
- new, super, and this: The Java expression new Foo() is not an ordinary message send. It is a special kind of expression that invokes a constructor. Similarly, super() and this() constructor invocations are special kinds of expressions.
- static fields and methods: Java defines different kinds of fields and methods, with different rules about inheritance and invocation.

Even for all this complexity, there are still things that Java doesn't handle well. For example, a new expression requires a direct class name as its argument, not an expression that evaluates to a class. It is not possible* for a client to vary the class instantiated at runtime based on a computed expression, except by enumerating all the class names that might be instantiated in an if/then/else statement. Compare Smalltalk:
instantiate: aClass numTimes: anInt
    anInt timesRepeat: [ aClass new. ]
The naive equivalent code in Java is illegal:
void instantiate(Class aClass, int numTimes) {
    for (int i = 0; i < numTimes; i++) {
        // ILLEGAL: new takes a class name, not an expression!
        new aClass();
    }
}
* Or, with reflection, merely extraordinarily cumbersome.
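
For comparison, the reflective route mentioned in the footnote might look roughly like the sketch below. The class name `ReflectiveInstantiation` is hypothetical, and the sketch assumes the class being instantiated has an accessible no-argument constructor. Note how much ceremony replaces the one-line Smalltalk body.

```java
import java.util.ArrayList;
import java.util.List;

class ReflectiveInstantiation {
    // Instantiates aClass numTimes times through its no-argument
    // constructor, which must exist and be accessible.
    static <T> List<T> instantiate(Class<T> aClass, int numTimes) {
        List<T> results = new ArrayList<>();
        try {
            for (int i = 0; i < numTimes; i++) {
                results.add(aClass.getDeclaredConstructor().newInstance());
            }
        } catch (ReflectiveOperationException e) {
            // Covers a missing constructor, an inaccessible constructor,
            // an abstract class, or a constructor that itself throws.
            throw new IllegalStateException(
                "cannot instantiate " + aClass.getName(), e);
        }
        return results;
    }

    public static void main(String[] args) {
        // The class to instantiate is an ordinary runtime value here.
        Class<?> chosen = args.length > 0 ? StringBuilder.class : ArrayList.class;
        List<?> made = instantiate(chosen, 3);
        System.out.println(made.size() + " instances of " + chosen.getName());
    }
}
```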
A design pattern is what it sounds like: a commonly reused "pattern" or "template" for a group of classes (or functions) that cooperate to solve a particular design problem. The abstract factory design pattern was invented to solve the problem that instantiation is not generally first-class.
In the abstract factory pattern, the programmer defines an abstract factory that abstracts away instantiation:
public abstract class AbstractFactory {
    public abstract Foo makeFoo(int x);
}
Later, some programmer defines a concrete factory, which has the choice of providing an instance of any subclass of the declared result type:
// (assuming Bar and Baz extend Foo)
public class BarFactory extends AbstractFactory {
    public Foo makeFoo(int x) { return new Bar(x); }
}
public class BazFactory extends AbstractFactory {
    public Foo makeFoo(int x) { return new Baz(x); }
}
Now, depending on which concrete factory is selected, the client will get a different "collection" of classes that might be related. The client doesn't need to know which concrete factory (s)he is using --- the client only needs to know that there is some implementor of the abstract factory.
The classic example is a cross-platform GUI (graphical user interface) toolkit. You want your application core to be GUI-independent, so you can run it on a Windows, Linux, or Mac GUI, but you need to instantiate different widgets (e.g. buttons or menus) on each platform. The abstract factory solution is to define an abstract factory for the GUI toolkit, and then have different concrete factories for different platforms (Windows, Linux, Mac).
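
Putting the pieces together, a client of the pattern might look like the following self-contained sketch. The constructors of Foo, Bar, and Baz, the `describe` method, and the `Client` class are assumptions added for illustration; the notes only say that Bar and Baz extend Foo.

```java
// Product hierarchy (constructors and describe() are illustrative assumptions).
class Foo {
    final int x;
    Foo(int x) { this.x = x; }
    String describe() { return getClass().getSimpleName() + "(" + x + ")"; }
}
class Bar extends Foo { Bar(int x) { super(x); } }
class Baz extends Foo { Baz(int x) { super(x); } }

// The abstract factory hides which subclass of Foo gets instantiated.
abstract class AbstractFactory {
    abstract Foo makeFoo(int x);
}
class BarFactory extends AbstractFactory {
    Foo makeFoo(int x) { return new Bar(x); }
}
class BazFactory extends AbstractFactory {
    Foo makeFoo(int x) { return new Baz(x); }
}

class Client {
    // The client names only AbstractFactory and Foo, never Bar or Baz.
    static String build(AbstractFactory factory) {
        return factory.makeFoo(3).describe();
    }

    public static void main(String[] args) {
        System.out.println(build(new BarFactory())); // prints "Bar(3)"
        System.out.println(build(new BazFactory())); // prints "Baz(3)"
    }
}
```

The only place a concrete class name appears is where the factory itself is chosen; everything downstream of that choice is written against the abstract types.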
However, using abstract factories is cumbersome and requires considerable foresight: before you write all the new statements, you must decide to use the abstract factory pattern.
Smalltalk-80 uses metaclasses to provide class-specific behavior. The idea behind a metaclass is simple: for each class, define a metaclass (created automatically when the user creates the class) to hold its methods. Hence, for example, SmallInteger has a metaclass, denoted SmallInteger class, which holds SmallInteger-specific class methods.
The idea is simple. Taking this idea to its logical conclusion involves some complications, however.
First, for any class A and a subclass SubA, we want SubA class to inherit the methods of A class.
For example, consider a Stack class, which inherits from an Array class. Suppose Array class provides a class method size:initialValue: which takes an integer and returns an array with that many elements, each initialized to the initialValue: argument. We don't want to have to write this all over again in the Stack class; we just want to inherit it.
And, indeed, this is what Smalltalk does when it automatically creates the metaclass for any class:
- For any class SubA with superclass A, the superclass of SubA's metaclass is the metaclass of the superclass of SubA; i.e., SubA class subclasses A class.

If you take this to its logical conclusion, you may wonder what the superclass of Object class (the metaclass of Object) is. Well, the answer gets into rather hairy implementation details, but if you want Squeak's answer, try printIt on each of the following expressions:
Object class.
Object class superclass.
Object class superclass superclass.
Object class superclass superclass superclass.
Object class superclass superclass superclass superclass.
"These last few are pretty interesting."
Object class superclass superclass superclass superclass superclass.
Object class superclass superclass superclass superclass superclass superclass.
Object class superclass superclass superclass superclass superclass superclass superclass.
Object class superclass superclass superclass superclass superclass superclass superclass class.
Metaclass
And we have just learned:
Following these rules to their logical conclusion, we deduce:
It so happens that all metaclasses share the single class Metaclass. Of course, taking this a step further, we deduce:
- Metaclass is an object.
- Metaclass has a class.
- Metaclass is a class, so its class is a metaclass, and is denoted Metaclass class, like any other metaclass.
Now, here's the really interesting part: what's the class of Metaclass class? Well, Metaclass class is a metaclass, and all metaclasses are instances of Metaclass. So Metaclass's metaclass is an instance of Metaclass; there is a circular instance-of relationship.
If you don't believe me, do printIt on the following code:
3 class.
3 class class.
3 class class class.
3 class class class class.
3 class class class class class.
3 class class class class class class.
Metaclasses are one of the few really ugly parts of Smalltalk. They are a prime demonstration of the fact that simple mechanical rules, taken to their logical conclusion, sometimes lead to results that humans find confusing.
On the other hand, having first-class classes (i.e., classes that can understand messages, and that can define customized methods) ultimately does pay off for programmers. One consequence is that there's no need for programmers to learn a "factory pattern" in Smalltalk: classes themselves can serve as factories.
On the third hand, there is a way to get all the benefits of metaclasses without any of the costs. That is to abandon the following propositions in the core Smalltalk catechism:
Instead, get rid of the notion of classes completely, and replace it with the following:
Now, there is no need for metaclasses, because the parent object need not itself have a parent. Rather than the two kinds of relationships among objects --- "instance-of" and "subclass-of" --- there is only one: "parent-of". Rather than instantiating objects from classes, you simply clone objects.
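
To make the single "parent-of" relationship concrete, here is a loose Java sketch of objects that hold named slots and delegate lookups to a parent. It is not how Self is implemented; the class `Proto` and its methods `get`, `set`, `child`, and `copy` are all hypothetical names chosen for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

class Proto {
    private final Proto parent; // null for an object with no parent
    private final Map<String, Object> slots = new HashMap<>();

    Proto(Proto parent) {
        this.parent = parent;
    }

    void set(String name, Object value) {
        slots.put(name, value);
    }

    // Look in this object's own slots first, then walk up the parent chain.
    Object get(String name) {
        for (Proto o = this; o != null; o = o.parent) {
            if (o.slots.containsKey(name)) {
                return o.slots.get(name);
            }
        }
        throw new IllegalArgumentException("no slot named " + name);
    }

    // A fresh, empty object whose parent is this object.
    Proto child() {
        return new Proto(this);
    }

    // A shallow copy: same parent, same slot contents, independent afterwards.
    Proto copy() {
        Proto twin = new Proto(parent);
        twin.slots.putAll(slots);
        return twin;
    }
}
```

There is no class anywhere in this model: a "kind" of object is just some existing object that others copy or name as their parent.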
In the late 80's, Dave Ungar and Randall B. Smith designed Self, a purely prototype-based programming language. Self is an interesting language that we probably won't have time to discuss further in this class. For language aficionados, I highly recommend the paper Self: The Power of Simplicity, originally published in OOPSLA '87.
The advantage of prototype-based programming is its regularity, simplicity, and power. The disadvantage is that there are fewer "cues" by which a programmer who is inspecting a program can understand its structure. The Ungar and Smith paper has the following very interesting paragraph:
Reducing the number of basic concepts in a language can make the language easier to explain, understand, and use. However, there is a tension between making the language simpler and making the organization of a system manifest. As the variety of constructs decreases, so does the variety of linguistic cues to a system's structure.
You may already have noticed this effect in your brief encounter with Scheme programming: everything is a parenthesized list. When reading Scheme code, you often have to rely on your editor's syntax highlighting to help you distinguish special forms from function calls, and quoted data from evaluated data. And when you get a bug, often it manifests itself as a bunch of empty pairs appearing in your output --- who knows where those pairs came from?
There are a variety of ways to attack this problem. Type systems, smart programming environments and tools, and programmer discipline can all help. Ultimately, I believe that a simple, regular language is usually better than a more complex and irregular one. But the tradeoff does exist.
(BTW, yes, this is one of those "system design lessons that's more broadly applicable outside of languages" that I spoke of on the first day of class.)