341 : 20 Feb 2002 : Smalltalk intro

Smalltalk: the language

Smalltalk is small and simple. Like Scheme, it demonstrates the principle that power comes not from piling feature upon feature, but from pruning away restrictions.

Everything in Smalltalk is an object. Objects have a class, some instance variables, and methods, all of which are ultimately objects. All actions are performed by sending a message to an object; if the object has a method for handling that message, the method is executed.

Bindings

Objects are assigned/bound to names by using the assignment statement:

    x := 'hi'.

Here, we are binding x to the String object representing 'hi'. Differences relative to ML:

Message sending syntax

    x negated.       "Unary message syntax"
    x + 5.           "Infix message syntax"
    x gcd: 21.       "Keyword message syntax"

    "Keyword message with multiple arguments"
    'Hello, world' replaceFrom: 1 to: 6 with: 'byebye' startingAt: 1.

In keyword syntax, each argument follows its own keyword, and the keywords taken together name the message. This only seems weird because you are used to memorizing argument order in languages like C. Since each keyword describes the argument that follows, I find this nicer than "normal" function syntax: you no longer have to remember which argument goes where.
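
A short sketch of how the three message forms combine may help here. Unary messages bind tightest, then binary (infix) messages, then keyword messages, and binary messages are evaluated strictly left to right with no arithmetic precedence. The expressions below use only standard Integer messages; the comments give the value that printIt should show.

```smalltalk
"Unary binds tighter than binary: factorial is sent to 4 first."
3 + 4 factorial.            "27"

"Binary messages go left to right: (3 + 4) * 2, not 3 + (4 * 2)."
3 + 4 * 2.                  "14"

"Binary binds tighter than keyword: the argument is 3 + 1."
2 raisedTo: 3 + 1.          "16"

"Parentheses override the default grouping."
(2 raisedTo: 3) + 1.        "9"

"Both keywords together form one message name, between:and:."
5 between: 1 and: 10.       "true"
```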

Common syntax gotchas

Closures

Smalltalk has lexically scoped closures (lambdas), which are written in square brackets and evaluated by sending one of the value messages. Optionally, formal parameters, each prefixed with a colon, may be listed before a vertical bar at the start of the block:

    "Smalltalk"                "Rough ML equivalent..."
    [ 3 ].                     "fn () => 3;"
    [ 3 ] value.               "(fn () => 3)();"
    [ :x :y | x + y ].         "fn (x, y) => x + y;"
    a := [ :x :y | x + y ].    "val a = fn (x, y) => x + y;"
    a value: 1 value: 2.       "a(1, 2)"

Exercise: Write the approximate Scheme equivalents of each of the above code fragments. What are the differences? Based on what I've told you, does ML or Scheme have semantics that more closely resemble Smalltalk's in each case?
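
Because closures are ordinary objects, they can be stored in variables, passed as arguments, and asked about themselves. The following is a minimal sketch using the standard collection messages collect:, select:, and inject:into:, which are not otherwise covered in these notes; the printed forms in the comments are what Squeak's printIt should show.

```smalltalk
square := [ :n | n * n ].
square numArgs.                                       "1"

"Pass a closure stored in a variable as an argument."
#( 1 2 3 ) collect: square.                           "#(1 4 9)"

"Or write the closure inline."
#( 1 2 3 4 ) select: [ :n | n even ].                 "#(2 4)"
#( 1 2 3 4 ) inject: 0 into: [ :sum :n | sum + n ].   "10"
```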

Closures and environments

Closures are lexically scoped, but they may have arbitrary side effects, including the effect of changing bindings in enclosing environments:

    "Executing this code..."   "Yields this value for i"
    i := 5.                    "5"
    [ i := 7 ] value.          "7"
    [ :i | i := 9 ] value: 2.  "2, then 9 in local scope; 7 in outer scope"
    "Note: some dialects refuse to compile an assignment to a block parameter."
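
One common use of this property is a closure that keeps updating a variable in its enclosing environment. A minimal sketch, with count and increment as illustrative names that do not come from the notes above:

```smalltalk
count := 0.
increment := [ count := count + 1 ].   "captures the outer variable count"

increment value.                        "1"
increment value.                        "2"
count.                                  "2: the outer binding was changed"
```

Each evaluation of the block sees the binding left behind by the previous one, because the block refers to the variable itself rather than to a copy of its value.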

Closures for control structures

Unlike ML, which has both closures and "special forms" like if/then/else, Smalltalk uses closures to implement control structures:

    Transcript open.  "Open a Transcript window"
    5 timesRepeat: [
        Transcript show: 'hi'; cr.
    ].

    x = 0 ifTrue: [ Transcript show: 'Cannot divide by zero' ]
          ifFalse: [ Transcript show: (1.0 / x) asString. ].

    i := 0.
    [ i < 10 ] whileTrue: [ i := i + 1. ].

These structures work because closures are not evaluated until sent one of the value messages. Delayed evaluation is a crucial property of closures!
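
A small sketch of delayed evaluation at work, using only standard Boolean and block messages. In each case the bracketed code runs only if the receiver decides to send it value:

```smalltalk
x := 0.

"The block is only evaluated if the receiver is true,
 so the division by zero never happens."
(x ~= 0) and: [ (10 / x) > 1 ].                      "false"

"Conditionals are expressions: the chosen block's value is the result."
sign := x < 0 ifTrue: [ 'negative' ] ifFalse: [ 'non-negative' ].
sign.                                                 "'non-negative'"

"A block that is never sent value is never run."
neverRun := [ 1 / 0 ].
neverRun numArgs.                                     "0"
```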

value:value:value:value:?

Closures with many arguments are evaluated using up to 4 value: keywords:

    seal := [ :a :b :c :d | a + b * c + d ].
    seal value: 1 value: 2 value: 3 value: 4.

For argument lists longer than that, or if you just don't feel like typing value: too many times, you can use the valueWithArguments: message, which takes an array:

    walrus := [ :a :b :c :d :e | a + b * c + d * e ].
    walrus valueWithArguments: #( 10 20 30 40 50 ).  "Note #() syntax"
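
For reference, here is what the two closures above should evaluate to. Note that the arithmetic inside each block goes strictly left to right, since binary messages have no operator precedence. The numArgs message is standard; the last line is only an illustration that valueWithArguments: is not restricted to long argument lists.

```smalltalk
seal := [ :a :b :c :d | a + b * c + d ].
seal numArgs.                                     "4"
seal value: 1 value: 2 value: 3 value: 4.         "13, i.e. ((1 + 2) * 3) + 4"

walrus := [ :a :b :c :d :e | a + b * c + d * e ].
walrus valueWithArguments: #( 10 20 30 40 50 ).   "47000, i.e. (((10 + 20) * 30) + 40) * 50"

"valueWithArguments: also works for shorter argument lists."
seal valueWithArguments: #( 1 2 3 4 ).            "13"
```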

A word about access protection

Smalltalk classes have no access protection mechanisms for their methods.

However, only methods of an object have access to the object's instance variables. Since classes inherit their superclasses' instance variables, subclass instances may access variables defined in a superclass.

In C++ terms, all instance variables are protected, and all methods (member functions) are public.
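
To make the rule concrete, here is a small sketch using Point, a standard class whose instances hold two instance variables, x and y. From outside the object there is no syntax for reaching those variables directly; the only way in is to send a message that the class chose to provide. The printed form of the last result is what Squeak should show; other dialects may print points slightly differently.

```smalltalk
p := 3 @ 4.      "sending @ to 3 creates a Point"
p x.             "3, returned by the accessor method named x"
p y.             "4"
p + (1 @ 1).     "4@5, computed by a Point method that reads x and y internally"
```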

Hence, in Squeak, where everything is implemented in Smalltalk, you can freely change everything about the world, up to and including the implementation of message sending, subclassing, and closure evaluation. Doing such things will break literally everything in your environment, of course.

A chink in the armor

Smalltalk people like to talk about how "everything is an object", as if the entire description of the language were self-evident once you understand this principle.

However, what is a variable? When I say "x := 3", I am binding x to the object 3; but is x itself an object that exists before the assignment is performed? In other words, is assignment a message that you send to a variable object? The answer is no: assignment is not a message at the language level. Variables are not objects; they are names for objects.
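
The practical consequence is easiest to see with two names for one object. A minimal sketch, with a and b as illustrative names:

```smalltalk
a := OrderedCollection new.
b := a.                    "b now names the same object as a; nothing is copied"
b add: 42.
a size.                    "1: the change is visible through either name"
a == b.                    "true: == tests object identity"

b := 'something else'.     "rebinding b does not touch the collection"
a size.                    "1"
```

Messages such as add: go to the object; assignment only changes which object a name refers to.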

Aside: a bit of ugliness

I said earlier that everything in Smalltalk is an object, and that every object has a class.

This implies that classes have a class. This is, in fact, true. For every class in the system, there exists a "meta-class", which is the class of that class.

Now, these classes of classes are themselves objects, so they, too, must have a class, which is Metaclass. In other words, the class of a class is an instance of Metaclass. And Metaclass's own class is simply another instance of Metaclass; in other words, the class of Metaclass is one of its own instances...
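
The loop can be checked directly with identity comparisons. This is a small sketch assuming a Squeak-style image in which SmallInteger and Metaclass are globally visible names:

```smalltalk
SmallInteger class class == Metaclass.     "true: a metaclass is an instance of Metaclass"
3 class class class == Metaclass.          "true: the same fact, starting from an integer"
Metaclass class class == Metaclass.        "true: the chain loops back on itself"
SmallInteger isKindOf: Class.              "true: a class is itself an object"
```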

    "Smalltalk expression"                  "Result of printIt"
    x := 3.                                 3
    x class.                                SmallInteger
    x class class.                          SmallInteger class
    x class class class.                    Metaclass
    x class class class class.              Metaclass class
    x class class class class class.        Metaclass
    x class class class class class class.  Metaclass class
    " ... etc."

Are you confused by metaclasses?

If you are, then join the crowd. Metaclasses are one of the few ugly parts of Smalltalk. It's quite useful to be able to inspect and dynamically modify the classes of objects, but the conceptual weirdness of using classes and metaclasses is a high price to pay.

One solution is to remove the distinction between objects and classes, and create objects simply by copying other objects. In this scenario, you would create a new integer by cloning an old integer, and perhaps changing the copy (including adding new methods to that copy). The metaclass problem goes away, because objects have a "parent" instead of a class, and there is no rule that says every parent must have a parent. At the same time, you retain the power to treat parents as regular objects.

Languages that use this approach are called "prototype-based" object-oriented languages. Self (developed by Dave Ungar's group at Stanford, and then Sun) and Cecil (developed here at UW, by Chambers et al.) are two such languages.

Squeak key command reference

Squeak key commands are invoked using a modifier key plus a regular letter (possibly a capital letter). The modifier key, which I will refer to as "mod", is ALT under Windows; on Macintosh, it is COMMAND.

You can get the command key list by bringing up the Squeak menu and clicking "help..." -> "command key help", but I had some empty space on these handouts so here's a quick reference for the commands I use frequently (the ones at the top are the ones I do most often):

Evaluation/environment

    doIt                  mod-d
    printIt               mod-p
    save/accept           mod-s
    browse it             mod-b
    inspect it            mod-i

Text editing

    delete previous word  mod-w
    cut                   mod-x
    copy                  mod-c
    paste                 mod-v
    find a string         mod-f
    find next             mod-g
    undo                  mod-z
    set font              mod-k

cse341-webmaster@cs.washington.edu
Last modified: Wed Feb 20 17:43:31 PST 2002