CSE 341 -- Scheme Basics

Characteristics of Scheme

Common Lisp is the commercial standard for Lisp (but Scheme is cleaner!)

Lisp application areas:

Lisp was developed in the late 50s by John McCarthy. The Scheme dialect was developed by Guy Steele and Gerry Sussman in the mid 70s. In the 80s, the Common Lisp standard was devised. Common Lisp is a kitchen sink language: many many features.


Using Scheme in CSE

Available on orcas and sanjuan (instructional DEC alphas). To run, execute the following:
/cse/courses/misc_lang/axp/mit-scheme-7.3/bin/scheme
or to avoid typing all that, add the following line into the file .cshrc in your home directory:
set path = ( $path /cse/courses/misc_lang/axp/mit-scheme-7.3/bin/)
so that scheme is on your search path.

Scheme uses a "read-eval-print loop" at the top level: read in an expression, evaluate it, then print the result. control-d exits.

If you get an error, you'll end up in the debugger. For now just type

(restart 1)

To read in the contents of a file named myprog.s, say

(load "myprog.s")
See the MIT Scheme User Manual for more about the environment.

Primitive Scheme data types and operations

Some primitive (atomic) data types: Case is generally not significant (except in characters or strings). Note that you can have funny characters such as + or - or ! in the middle of symbols. (You can't have parentheses, though.) Here are some of the basic operators that scheme provides for the above datatypes. Some operators are predicates, that is, they are truth tests. In Scheme, they return #f or #t (remember that the empty list is the same as #f).

Applying functions, procedures, operators

Ok, so we know the names of a bunch of operators, how do we use them. Scheme provides us with a uniform syntax for invoking functions:
  (function arg1 arg2 ... argN)
Examples:
  (+ 2 3)
  (abs -4)
  (+ (* 2 3) 8)
  (+ 3 4 5 1)
  ;; note that + and * can take an arbitrary number of arguments
  ;; actually so can - and / but you'll get a headache trying to remember
  ;; what it means
  ;;
  ;; semicolon means the rest of the line is a comment 


The List Data Type

Perhaps the single most important built in data type in Scheme is the list. In Scheme, lists are unbounded, possibly heterogeneous collections of data. Examples:
  (x)
  (bill hillary)
  (2 3 5 7 11)
  (2 3 x y "zoo" 2.9)
  ()
Box-and-arrow representation of lists:
                 _______________        ________________ 
                |       |       |      |        |       |
                |   o   |   ----|----->|    o   |   o   |
                |___|___|_______|      |____|___|___|___|
                    |                       | 	    | 	 
                    |                       |	    | 
                   bill                  hillary    ()
						    
Or
						    
                 _______________        _____________  	 
                |       |       |      |        |  / | 	 
                |   o   |   ----|----->|    o   | /  |   
                |___|___|_______|      |____|___|/___|   
                    |                       | 	      	 
                    |                       |	      
                   bill                  hillary      

Notes:

Here are some important functions that operate on lists: Predicates for lists:

Evaluating Expressions

Users typically interact with Scheme though a read-eval-print loop. Scheme waits for the user to type an expression, reads it, evaluates it, and prints the return value. Scheme expressions (often called S-Expressions, for Symbolic Expressions) are either lists or atoms. Lists are composed of other S-Expressions (note the recursive definition). Lists are often used to represent function calls, where the list consists of a function name followed by its arguments. However, lists can also used to represent arbitrary collections of data. In these notes, we'll generally write:
<S-expression> => <return-value>
when we want to show an S-expression and the evaluation of that S-expression. For instance:
  (+ 2 3)      => 5
  (cons 1 () ) => (1)
Evaluation rules:
  1. Numbers, strings, #f, and #t are literals, that is, they evaluate to themselves.
  2. Symbols are treated as variables, and to evaluate them, their bindings are looked up in the current environment.
  3. For lists, the first element specifies the function. The remaining elements of the list specify the arguments. Evaluate the first element in the current environment to find the function, and evaluate each of the arguments in the current environment, and call the function on these values. For instance:
      (+ 2 3)              => 5
      (+ (* 3 3) 10)       => 19
      (equal? 10 (+ 4 6))  =>  #t
    

Using Symbols (Atoms) and Lists as Data

If we try evaluating (list bill clinton) we'll get an error. Why? Because Scheme will treat the atom bill as a variable name and try to look for its binding, which it won't find. We therefore need to "quote" the names bill and clinton, which means that we want scheme to treat them literally. Scheme provides syntax for doing this. The evaluation for quoted objects is that a quoted object evalutes to itself.
'x                    => x
(list bill clinton)   => error! bill is unbound symbol
(list 'bill 'clinton) => (bill clinton)
(bill clinton)        => error! bill is unknown function
'(bill clinton)       => (bill clinton)
(equal? (x) (x))      => error! x is unknown function
(equal? '(x) '(x))    => #t
(cons 'x '(y z))      => (x y z)
(cons 'x () )         => (x)
(car '(1 2 3))        => 1
(cdr (cons 1 '(2 3))) => (2 3)
Note that there are 3 ways to make a list:
  1. '(x y z) => (x y z)
  2. (cons 'x (cons 'y (cons 'z () ))) => (x y z)
  3. (list 'x 'y 'z) => (x y z)
Internally, quoted symbols and lists are represented using the special function quote. When the reader reads '(a b) it translates this into (quote (a b)), which is then passed onto the evaluator. When the evaluator sees an expression of the form (quote s-expr) it just returns s-expr. quote is sometimes called a "special form" because unlike most other Scheme operations, it doesn't evaluate its argument. The quote mark is an example of what is called "syntactic sugar."
  'x          => x
  (quote x)   => x
(Alan Perlis: "syntactic sugar causes cancer of the semicolon".)

Variables

Scheme has both local and global variables. In Scheme, a variable is a name which is bound to some data object (using a pointer). There are no type declarations for variables. The rule for evaluating symbols: a symbol evaluates to the value of the variable it names. We can bind variables using the special form define:

This declares a variable called clam (if one doesn't exist) and makes it refer to 17:

 
(define clam 17) 

clam          => 17

(define clam 23)  ; this rebinds clam to 23

(+ clam 1) => 24
(define bert '(a b c))
(define ernie bert)  
Scheme uses pointers: bert and ernie now both point at the same list.

In 341 we'll only use define to bind global variables, and we won't rebind them once they are bound, except when debugging.

We can also use define to bind variables that are the names of functions:

(define (double x)   ; x is local to the function double
   (* 2 x))
This is actually a shorthand for:
(define double 
   (lambda (x) (* 2 x)))
where lambda is a way of defining an anonymous function.

let and let*: creating temporary, local variables

We use the special form let to declare and bind local, temporary variables. Example:
;; general form of let
(let ((name1 value1)
      (name2 value2)
	  ...
      (nameN valueN))
   expression1
   expression2
   ...
   expressionQ)

;; reverse a list and double it

;; less efficient version:
(define (r2 x) 
  (append (reverse x) (reverse x)))

;; more efficient version:
(define (r2 x) 
  (let ((r (reverse x)))
        (append r r))
The one problem with Let is that while the bindings are being created, expressions cannot refer to bindings that have been made previously. For example, this doesn't work, since x isn't known outside the body:
(let ((x 3)
      (y (+ x 1)))
    (+ x y))
To get around this problem, Scheme provides us with let*:
(let* ((x 3)          
      (y (+ x 1)))
    (+ x y))
define can be used to rebind a variable to a new value (but we won't do it, right?) Scheme also has an assignment statement:
(set! x 42)
... which we won't use either. Good scheme style is to avoid using set!, and to program without side effects. Consider carefully whether you really need non-local variables. They are reasonable for constants and of course functions. Use lots of small functions.

Bad style:

(define badbadbad () )
(define (r2 x)
  (set! badbadbab (reverse x))
  (append badbadbad badbadbad))


More about defining functions

(define (function-name param1 param2 ... paramk)
   expr1
   expr2
   ...
   exprN)
expr1, expr2, ..., exprN are evaluated in order, and Scheme returns the value of exprN. However, since the values of expr1, ... exprN-1 are thrown away, the only reason to do this is if they have side effects. So in 341 we'll write functions with just a single expression in the body.

Some places you might use multiple expressions, though, would be for a bunch of print statements, file operations, etc (which of course have side effects).

   
(define (double x)
  (* 2 x))

(double 4) => 8


(define (centigrade-to-fahrenheit c)
  (+ (* 1.8 c) 32.0))

(centigrade-to-fahrenheit 100.0) => 212.0
The x in the double function is the formal parameter. It has scope only within the function. Consider:
(define x 10)

(define (add1 x)
  (+ x 1))

(define (double-add x)
  (double (add1 x)))

(double-add x)   => 22

Three different x's here...

Functions can take 0 arguments:

(define (test) 3)
(test)  => 3


Equality and Identity: equal?, eqv?, eq?

Scheme provides three primitives for equality and identity testing:
  1. eq? is pointer comparison. It returns #t iff its arguments literally refer to the same objects in memory. Symbols are unique ('fred always evaluates to the same object). Two symbols that look the same are eq. Two variables that refer to the same object are eq.
  2. eqv? is like eq? but does the right thing when comparing numbers. eqv? returns #t iff its arguments are eq or if its arguments are numbers that have the same value. eqv? does not convert integers to floats when comparing integers and floats though.
  3. equal? returns true if its arguments have the same structure. Formally, we can define equal? recursively. equal? returns #t iff its arguments are eqv, or if its arguments are lists whose corresponding elements are equal (note the recursion). Two objects that are eq are both eqv and equal. Two objects that are eqv are equal, but not necessarily eq. Two objects that are equal are not necessarily eqv or eq. eq is sometimes called an identity comparison and equal is called an equality comparison. Examples:
    (define clam '(1 2 3))
    (define octopus clam)          ; clam and octopus refer to the same list
    
    (eq? 'clam 'clam)            => #t
    (eq? clam clam)              => #t  
    (eq? clam octopus)           => #t
    (eq? clam '(1 2 3))          => #f (or () )
    (eq? '(1 2 3) '(1 2 3))      => #f 
    (eq? 10 10)                  => #t ; (generally, but imp. dependent)
    (eq? 10.0 10.0)              => #f ; (generally, but imp. dependent)
    (eqv? 10 10)                 => #t ; always
    (eqv? 10.0 10.0)             => #t ; always
    (eqv? 10.0 10)               =>#f  ; no conversion btwn types
    (equal? clam '(1 2 3))       => #t
    (equal? '(1 2 3) '(1 2 3))   => #t
    
    Scheme provides = for comparing two numbers, and will coerce one type to another. For example, (equal? 0 0.0) returns #f, but (= 0 0.0) returns #t.


    Logical operators

    Scheme provides us with several useful logical operators, including and, or, and not.
    (and expr1 expr2 ... expr-n)   
    ; return true if all the expr's are true
    ; ... or more precisely, return expr-n if all the expr's evaluate to
    ; something other than #f.  Otherwise return #f
    
    (and (equal? 2 3) (equal? 2 2) #t)  => #f
    
    (or expr1 expr2 ... expr-n)   
    ; return true if at least one of the expr's is true
    ; ... or more precisely, return expr-j if expr-j is the first expr that
    ; evaluates to something other than #f.  Otherwise return #f.
    
    (or (equal? 2 3) (equal? 2 2) #t)  => #t
    
    (or (equal? 2 3) 'fred (equal? 3 (/ 1 0)))  => 'fred
    
    (define (single-digit x)
       (and (> x 0) (< x 10)))
    
    (not expr)   
    ; return true if expr is false
    
    (not (= 10 20))     => #t
    

    Short-circuit and and or

    In Scheme and and or just evaluate as far as needed to decide whether to return #t or #f (like and: and or: in Smalltalk, or the and and or operators in C++). However, one could easily write a version that evaluates all its arguments.

    Other languages may provide both as built-in functions. For example, in Ada

    Ada example:
      1=1 OR 3=(1/0)
    versus
      1=1 OR ELSE 3=(1/0)
    


    if special form

    (if a b c) if a evaluates to true, then the result of evaluating b is returned, otherwise the result of evaluating c is returned. if is a special form, like quote, because it doesn't automatically evaluate all of its arguments.
    (if (= 5 (+ 2 3)) 10 20)     => 10 
    (if (= 0 1) (/ 1 0) (+ 2 3)) => 5  
    ; note that the (/ 1 0) is not evaluated
    
    (define (my-max x y)     
       (if (> x y) x y))    
    
    (my-max 10 20)                   => 20 
    
    (define (my-max3 x y z)
       (if (and (> x y) (> x z))
           x
           (if (> y z) 
                y
                z)))  
    

    cond -- a more general conditional

    The general form of the cond special form is:
    (cond (test1 expr1)
          (test2 expr2)
          ....
          (else exprn))
    
    As soon as we find a test that evaluates to true, then we evaluate the corresponding expr and return its value. The remaining tests are not evaluated, and all the other expr's are not evaluated. If none of the tests evaluate to true then we evaluate exprn (the "else" part) and return its value. (You can leave off the else part but it's not good style.)

    (define (weather f)
       (cond ((> f 80) 'too-hot)
             ((> f 60) 'nice)
             ((< f 35) 'too-cold)
             (else 'typical-seattle)))
    

    cadr, cadadr, and friends

    The function names car and cdr have the virtue that they are easily composable. Thus cadr is the same as:
    (define (cadr s) (car (cdr s)))
    
    and gets the second element of a list.

    All combinations are defined up to 4 letters, e.g. caddr, cdadar, etc.


    Lexical Scoping (aka static scoping)

    Scoping refers to the visibility of names in a program. Scheme uses lexical scoping, which means that the names visible at any point of the program (in a given scope) are only the local names and any names visible in the lexically enclosing scope. The toplevel scope is the global scope. Function definitions, let, and let* define new scopes.
    (define z 100)
    
    (define (squid w)
      (clam w z))       ; w and z are visible here
    
    (define (test x)
      (let ((y (* 2 x)))  ; x, y, and z are visible inside the body of let
        (+ x y z)))       ; w is not visible inside here
    
    (define (clam a)
      (+ a x))            ; x is not visible here, so this will give an error
    

    Commenting Style

    If Scheme finds a line of text with a semicolon, the rest of the line (after the semicolon) is treated as whitespace. However, a frequently used convention is that one semicolon is used for a short comment on a line of code, two semicolons are used for a comment within a function on its own line, and three semicolons are used for an introductory or global comment (outside a function definition).