CSE 341 -- Scheme Basics
These notes cover the following topics:
- Scheme Introduction
- Primitive data types and operations.
- Introduce an old friend, EVAL, a function which forms the
heart of any good Scheme interpreter.
- Basic control abstractions, including function definition, the
if and cond special forms, identity, equality, and
boolean logic
- Variables, Binding and Scope (Concept)
- Type Checking (Concept)
What is Scheme?
Scheme is sometimes called a "List Processing" language or a Symbol
Processing language. It's a cousin (nephew?) to Lisp, the Platypus
of languages.
Application areas:
- AI (expert systems, planning, etc)
- Simulation, Modeling
- Applications programming (CAD, Mathematica)
- Rapid prototyping
- Extension languages (for Webservers or Image processing)
Important attributes/features:
- Program/data equivalence (this makes it easy to write Scheme programs that process other
programs, e.g. compilers, structure editors, debuggers, etc.)
- Functional programming style.
- Interpreter based environment.
- Heavy use of recursion and higher order functions.
- High level data structures well-supported (eg. lists)
- Garbage collection.
- Strong/dynamic typing.
Lisp was developed in the late 50s by John McCarthy. Many
incompatible lisps emerged. In the 80s, a standardized version
emerged, called Common Lisp (CL). CL is a kitchen sink language: many
many features. Scheme was developed in the mid-70s as a reaction to
the shortcomings and inconsistencies of Lisp.
Primitive Scheme data types and operations
Scheme provides us with the following primitive (atomic) data types:
- Numbers:
- integers (examples: 1, 4, -3, 0)
- reals (examples: 0.0, 3.5, 1.23E+10)
- rationals (eg. 2/3, 5/2)
- Symbols: (eg. FRED, X, A12, SET!, +1, #foobar)
- Boolean: Scheme uses the special identifiers #T and #F to
represent true and false.
- Strings: (eg. "hello sailor")
- Characters (eg #\c)
Case is generally not significant (except in characters or strings).
Here are some of the basic operators that Scheme provides for the above
datatypes.
- Arithmetic operators (+, -, *, /, ABS, SQRT)
- Relational (=, <, >, <=, >=) (for numbers)
- Relational (EQ?, EQUAL?) for arbitrary data
- Logical (AND, OR, NOT): short circuit logical operators.
Some operators are predicates, that is, they are truth
tests. In Scheme, they return #T or #F. Here are some
useful examples:
- Type testing: number?, boolean?, string?, character?, symbol?,
vector?, procedure?
- odd?, even?
- Arbitrary identity test: eq?
- Arbitrary equality test: equal?
- Number relations: <, >, =, >=, etc
- Character relations: character>?, character=>?, etc
Applying functions, procedures, operators
Ok, so we know the names of a bunch of operators, how do we
use them. Scheme provides us with a uniform syntax for invoking functions:
(function-name arg1 arg2 ... argN)
Examples:
(+ 2 3)
(abs -4)
(- (- 2 3) 8)
(* 5 (+ 7 10))
EVAL notation; the EVAL function
Users typically interact with Scheme though a read-eval-print loop
(REPL). Scheme waits for the user to type an expression, reads it,
evaluates it, and prints the return value. EVAL is a function which
forms the heart of any Scheme interpreter. It takes a Scheme
expression as an argument, and returns the Scheme exression which
represents the value of the evaluated expression. Scheme expressions
(sometimes called S-Expressions, for Symbolic Expressions) are either
lists or atoms. Lists are composed of other S-Expressions (note the
recursive definition). Lists are often used to represent function
calls, where the list consists of a function name followed by its
arguments. However, lists can also used to represent arbitrary
collections of data. Atoms are any kind of atomic data (a number,
character, string, etc.)
In these notes, we'll generally write:
<S-expression> => <return-value>
when we want to show an S-expression and the evaluation of that
S-expression. For instance:
(+ 2 3) => 5
(cons 1 ()) => (1)
Evaluation rules:
- Numbers, characters, and strings are "self-evaluating atoms", that
is, they evaluate to themselves.
- Symbols are treated as variable names, and to evaluate them,
their bindings are looked up in the environment.
- For lists, the first element specifies the function. The remaining
elements of the list specify the arguments. Evaluate each of the
arguments in the current environment, and call the function on these values.
For instance:
(+ 2 3) => 5
(+ (* 3 3) 10) => 19
(= 10 (+ 4 6)) => #T
The List Data Type
Perhaps the single most important built in data type in Scheme is
the list. In Scheme, lists are unbounded, possibly heterogeneous
collections of data. Examples:
(X)
(BILL HILLARY)
(2 3 5 7 11)
(2 3 X Y "Zoo" 2.9)
()
Box-and-arrow representation of lists:
_______________ ________________
| | | | | |
| o | ----|----->| o | X |
|___|___|_______| |____|___|_______|
| |
| |
BILL HILLARY
Notes:
- (X) is not the same as X
- () is the empty list
- Lists of lists: ((A B) (C D)) or ((FRED) ((X)))
- Scheme lists can be heterogeneous collections of data:
(1 1.5 X (A) ((7)))
Here are some important functions that operate on lists:
- length -- length of a list
- equal? -- test if two lists are equal (recursively)
- car -- same as first
- cdr -- same as rest
- cons -- make a new list cell
- list -- make a list
Predicates for lists:
- null? -- is the list empty?
- list? -- is this thing a list?
Using Symbols (Atoms) and Lists as Data
If we try evaluating
(list bill clinton) we'll get an error. Why? Because
Scheme will treat the atom bill as a variable name and try to look
for it's binding, which it won't find. We therefore need to "quote"
the names bill and clinton, which means that we
want Scheme to treat them literally. Scheme provides syntax for doing this.
The evaluation for quoted objects is that a quoted object evalutes to itself.
'x => x
(list bill clinton) => error! bill is unbound symbol
(list 'bill 'clinton) => (bill clinton)
(bill clinton) => error! bill is unknown function
'(bill clinton) => (bill clinton)
(equal? (x) (x)) => error! x is unknown function
(equal? '(x) '(x)) => T
(cons 'x '(y z)) => (x y z)
(cons 'x ()) => (x)
(car '(1 2 3)) => 1
(cdr (cons 1 '(2 3))) => (2 3)
Note that there are (at least) 3 ways to make a list:
- '(x y z) => (x y z)
- (cons 'x (cons 'y (cons 'z nil))) => (x y z)
- (list 'x 'y 'z) => (x y z)
Syntactic Sugar
Internally, quoted symbols and lists are represented using the special
function QUOTE. When the reader reads '(A B) it translates
this into (QUOTE (A B)), which is then passed onto the evaluator.
When EVAL sees an expression of the form (QUOTE S-expr) it just
returns S-expr. QUOTE is sometimes called a "special form" because
unlike most other Scheme operations, it doesn't evaluate its argument.
The quote mark is an example of what is called "syntactic sugar":
this is a bit of syntax added by the language designer/implementor
to shorten a common expression. In this case, it allows
us to not write (quote ...) every time we want to use a literal
expression.
'X => X
(QUOTE X) => X
To quote or not to quote
Knowing when to use a quote is often confusing to beginning Schemers.
If you're thinking of using a quote, you need to ask yourself: Does
this expression evaluate to something? Literal numbers, characters,
strings, and boolean values all evaluate to themselves, and hence
don't need to be quoted. Lists and symbols are a different matter,
however. Remember that if the interpreter sees a list, it thinks that
it's a function call and will evaluate it as such. If you wish to
provide a list of literal data, then you must quote it. Similarly,
the interpreter treats symbols as if they are names (either for
functions or for data). Hence, symbols must be quoted if they are to
be used as simple symbols.
Quoting is allows us to represent lists of things literally. Can you
imagine a way to do this in C or Java?
(define names '(bob bill jane ian sarah))
(define president 'george)
'names => ???
names => ???
(cons president names) => ???
'(cons president names) => ???
What is the final form in the above example?
Variables/Names/Environments
Variables in Scheme are in some ways similar to variables in Java or
C. Variables are "names" for "things". In Scheme, we can define new
names as follows:
(define foo 17) ; this declares a variable called foo (if one doesn't exist)
; and makes it refer to 17
foo => 17
(* 2 foo) => 34
In Scheme, variables are said to by "typeless." You need never
declare them to be of some type, and can arbitrarily bind them to
objects of different types at any time. Names may be bound either to
data or functions (but not both). We'll see an example of function
definition later.
The Environment
Names in an environment refer to things. The rule for evaluating
names: a symbol evaluates to the value of the variable it names.
Scheme supports a style of programming known as functional
programming. In this view, programs do not operate by modifying
the contents of a variable, rather they compute functions on
data, without ever storing the results anywhere. Hence,
redefinition does not modify the old definition. We can redefine
names like so:
(define foo 17)
foo => 17
(define foo 23)
foo => 23
Draw a picture of the above environment (in the functional view and in
the C/C++ view). Note that in the functional view, the redefinition of
foo shadows the old definition.
Scheme has local and global variables. We'll see examples of local
variables later, but for now, all of our definitions take place in the
global environment.
Functions
In Scheme, lambda
is a special form that evaluates
to an anonymous function. The general pattern is:
(lambda <formal parameter list> <body>)
Examples
(lambda (x) (* x 2)) => #<procedure>
; applying a function
((lambda (x) (* x 2)) 5) => 10
; we can give the procedure a name
(define double (lambda (x) (* x 2)))
; applying our function
(double 10)
double => ? what do I get here?
The rule for applying functions is: evaluate the actual parameters
(left-to-right), bind the formal parameter names to the actual
parameter values, and then evaluate the expressions in the body (in
order). Return the value of the final expression evaluated.
We'll often use a little more syntactic sugar to clean up our
function definitions:
(define (double x)
(* x 2))
(define (plus1 x)
(+ x 1))
Let and Let*: creating temporary, local variables
We use the special form let to declare and bind local, temporary variables.
Example:
(let ((name1 value1) ;; general form of let
(name2 value2)
...
(nameN valueN))
expression1
expression2
...
expressionQ)
;; example
(let
((x 1)
(y 2))
(+ x y)) => 3
The one wrinkle with let is that while the bindings are being created,
expressions cannot refer to bindings that have been made previously,
within the same let block. (Why? Because Scheme doesn't guarantee an
order of evaluation for the forms.) For example, this doesn't work,
since x isn't known outside the body:
(let ((x 3)
(y (+ x 1)))
(+ x y))
To get around this problem, Scheme provides us with let*:
(let* ((x 3)
(y (+ x 1)))
(+ x y))
Equality and Identity: EQUAL? and EQ?
Scheme provides two primitives for equality and identity testing:
- EQ? is pointer comparison. It returns #T iff its arguments
literally refer to the same objects in memory. Symbols
are unique ('FRED always evaluates to the same object).
Two symbols that look the same are EQ?. Two variables
that refer to the same object are EQ?. Two strings
that look the same are not EQ?. Two characters
that look the same are EQ?. Two integers that look the
same are EQ. Two floats that look the same are not EQ.
- EQUAL?, roughly speaking, returns true if its arguments
"look" the same. EQUAL? returns #T iff its arguments are EQ?, or
its arguments are lists, whose corresponding elements are EQUAL?.
Two objects that are EQ? are also EQUAL?. Two objects that are EQUAL?
are not necessarily EQ?. EQ? is sometimes called an identity
comparison and EQUAL? is called an equality comparison. What does
this mean?
(define foo '(1 2 3))
(define bar foo) ; bar refers to the same list as foo
(eq? 'foo 'foo) => #T
(eq? foo foo) => #T
(eq? foo bar) => #T
(eq? foo '(1 2 3)) => #F
(eq? '(1 2 3) '(1 2 3) => #F
(eq? 10 10) => #T ; (generally, but imp. dependent)
(eq? 10.0 10.0) => #F ; (generally, but imp. dependent)
(equal? foo '(1 2 3)) => #T
(equal? '(1 2 3) '(1 2 3)) => #T
Scheme provides =
for comparing
two numbers, and will coerce one type to another.
For example, (equal? 0 0.0)
returns #F
, but
(= 0 0.0)
returns #T
.
Logical operators
Scheme provides us with at least three useful logical operators: AND,
OR, and NOT. The only value in Scheme that "means" false is #f. All
other values are interpreted as true (including the empty list).
(AND EXPR1 EXPR2 ... EXPR-N)
; return true if all the expr's are true
; ... or more precisely, return expr-n if all the expr's evaluate to
; something other than #f. Otherwise return #f.
(AND (= 2 3) (= 2 2) #T) => #F
(OR EXPR1 EXPR2 ... EXPR-N)
; return true if at least one of the expr's is true
; ... or more precisely, return expr-j if expr-j is the first expr that
; evaluates to something other than #f. Otherwise return nil.
(OR (= 2 3) (= 2 2) #T) => #T
(OR (= 2 3) 'FRED (= 3 (/ 1 0))) => 'FRED
(define (single-digit x)
(and (> x 0) (< x 10)))
(NOT EXPR)
; return #t if EXPR is false, returns #f otherwise
(NOT (= 10 20)) => #T
(Small) Concept: Short-circuit AND and OR vs evaluating AND and OR
The book calls the short circuit version of AND and OR a conditional;
the book calls the evaluating version a boolean function.
In Scheme the short-circuit version is the one that is provided,
although one can easily write an evaluating version.
Other languages may provide both. For example, in Ada
- AND and OR are the evaluating versions
- AND THEN and OR ELSE are the short-circuit versions
example:
1=1 OR 3=(1/0)
versus
1=1 OR ELSE 3=(1/0)
IF special form
(IF A B C) if A evaluates to true, then the evalutation
of B is returned, otherwise the evaluation of C is returned. IF is
a special form, like QUOTE, because it doesn't automatically evaluate
all of its arguments.
(IF (EQUAL 5 (+ 2 3)) 10 20) => 10
(IF (EQUAL 0 1) (/ 1 0) (+ 2 3)) => 5 ; note that the (/ 1 0) is not evaluated
(define (my-max x y))
(if (> x y) x y))
(my-max 10 20) => 20
(define (my-max3 x y z)
(if (and (> x y) (> x z))
x
(if (> y z)
y
z)))
COND: a more general conditional
The general form of the COND special form is:
(COND (test1 expr1)
(test2 expr2)
....
(testn exprn))
As soon as we find a test that is true (not #f), then we evaluate the
corresponding expr and return its value. The remaining tests are not
evaluated, and all the other expr's are not evaluated. If we fall off
the end of the COND without any test succeeding, then the value of the
COND is not defined.
(define (weather f)
(cond ((> f 80) 'too-hot)
((> f 60) 'nice)
((< f 35) 'too-cold)
(#t 'typical-seattle)))
;; notice use of #t as a final catch-all test
Concept: Lexical Scoping (aka static scoping)
Scoping refers to the visibility of names in a program. By default,
Scheme uses lexical scoping, which means that the names visible at any
point of the program (in a given scope) are only the local names and
any names visible in the lexically enclosing scope. The toplevel
scope is the global scope. Function definitions, Let, and Let* define
new static scopes.
(define z 100)
(define (squid foo)
(clam foo z)) ; foo and z are visible here
(define (test x)
(let ((y (* 2 x))) ; x, y, and z are visible inside the body of let
(+ x y z))) ; foo is not visible inside here
(define (clam a)
(+ a x)) ; x is not visible here, so this is an error
(define (snake bar) ; what about z here?
(let ((z 42)) ; we are making a new definition of z that shadows
(+ z bar))) ; the outer one...
Concept: Type Checking
There is no static type checking in Scheme; type checking is done at
run time.
There is unfortunately little agreement about many of the terms which
are associated with type checking: so I'll give you the definitions
we'll use in class, while pointing out some of the other
interpretations.
- Type safe
- Type safe means that the language guarantees that one type can't
be incorrectly used in place of another type, in other words, that all
expressions are guaranteed to be type consistent. This can be done
checked statically, dynamically, or a mixture. Scheme, Ada,
Smalltalk, and Prolog are examples of type safe languages. Fortran
and C are examples of languages that aren't type safe.
- Static/Dynamic type checking
- This refers to when type checking occurs. Static type
checking takes place during compilation, dynamic type checking takes
place at run-time. This distinction makes no guarantee about the
completeness of the type checking. Ada is a statically typed,
type-safe language, which is to say that the necessary type checking
takes place at compile time. (This was true at least of the old Ada,
Ada 95 appears to support dynamic binding, so I'm not certain if it
can do all checking statically.) C is also statically typed, but not
type safe. Scheme is dynamically typed and type safe. (Some texts use
statically typed to imply type safe, but I think this is confusing,
because we should be separating out the issues of the completeness of
the type checking (type safety) from the time when the type checking
occurs (dynamic vs. static).)
- Strongly typed
- There are two common definitions for strongly typed (sorry) --
one is strongly typed = statically typed and type safe; the other is
strongly typed = type safe. I will attempt to use the latter
definition. Scheme and Ada are strongly typed by this definition.
- Weakly typed
- Weakly typed means "not type safe". Fortran and C are examples
of weakly typed languages.
dugan@cs.washington.edu
(Last Update:
)