CSE341 Notes for Friday, 5/8/09

I began by pointing out that Scheme has a way to define a structured object with named fields. You use the "define-struct" form, as in the following definition of a point structure that has named fields called x and y:

        (define-struct point (x y))

The net effect of this definition is that we have a number of useful procedures defined for us, including:

make-point which can be used to construct a point
point? which can be used to test whether something is a point
point-x which can be used to get the x value of a point
point-y which can be used to get the y value of a point

For example:

        > (define-struct point (x y))
        > (define p (make-point 2 15))
        > (point? p)
        #t
        > (point? '(2 15))
        #f
        > (point-x p)
        2
        > (point-y p)
        15

We will be using a struct for the first Scheme homework assignment.
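As a small sketch of how the generated procedures fit together, the block below defines two helpers on points. It assumes a PLT Scheme / Racket-style define-struct like the one shown above; the names distance-from-origin and translate are hypothetical and do not come from the lecture or the homework.

```scheme
; Assumes a PLT Scheme / Racket-style define-struct, as used in these notes.
(define-struct point (x y))

; Hypothetical helper: distance from the origin to point p.
(define (distance-from-origin p)
  (sqrt (+ (* (point-x p) (point-x p))
           (* (point-y p) (point-y p)))))

; Hypothetical helper: returns a new point shifted by dx and dy.
; The original point is not modified.
(define (translate p dx dy)
  (make-point (+ (point-x p) dx)
              (+ (point-y p) dy)))
```

With these definitions, (distance-from-origin (make-point 3 4)) should evaluate to 5, and (translate (make-point 2 15) 1 -5) should produce a point whose x is 3 and whose y is 10.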

Then I spent a few minutes pointing out that to understand Scheme, it is helpful to understand the motivation of the people who designed it. Some people just want to be consumers of programming languages. Others are interested in understanding the details of how programming languages are implemented. The designers of Scheme are very interested in language implementation. On the MIT Scheme page the first sentence about the language talks about its simplicity:

It was designed to have an exceptionally clear and simple semantics and few different ways to form expressions.

but the second sentence emphasizes the ability to implement programming language constructs in Scheme:

A wide variety of programming paradigms, including imperative, functional, and message passing styles, find convenient expression in Scheme.

Structure and Interpretation of Computer Programs includes an extensive discussion of how to implement a metacircular interpreter in Scheme. This is a Scheme interpreter written in Scheme. That may sound like a useless thing to have ("I already have Scheme, so why build another version on top of it?"), but it can prove quite useful. Probably the most useful byproduct of writing a metacircular interpreter is that you learn a lot about how Scheme itself works. It also gives you the ability to create your own variations of the language. Programming languages like Java are not very flexible. You can't, for example, add new control structures to the language. That's not true in Scheme. The language gives you a lot of room to create your own language constructs. And with a metacircular interpreter, you can create slight variations of Scheme that behave just the way you want them to. This is one explanation for the fact that there are so many different versions (often called "flavors") of Scheme.

Then we discussed equality operators in Scheme. In ML the = operator was used to compare equality of many types of data, including structured data. In Scheme the = operator is used for numerical equality. So if you know that the two values you are comparing are numbers, then you should use the = operator.

If you aren't guaranteed to be comparing numbers, then you have a range of equality operators to choose from. The strictest of these is the eq? predicate. It does a pointer comparison (are these references to the exact same object?). This is similar to Java's == operator for objects. This can have surprising results for numbers. For example, we got the expected result when comparing integer values:

        > (define a 13)
        > (define b 13)
        > (eq? a b)
        #t

but not when comparing floating point numbers:

        > (define c 13.8)
        > (define d 13.8)
        > (eq? c d)
        #f

Similar issues come up in Java as well. If you execute this code:

        Integer a = 13;
        Integer b = 13;
        System.out.println(a == b);
        Double c = 12.4;
        Double d = 12.4;
        System.out.println(c == d);
        Integer e = 666;
        Integer f = 666;
        System.out.println(e == f);

The output is:

        true
        false
        false

In Java, the floating point numbers are stored as distinct objects, so when you ask whether they are "==" to each other (whether they are exactly the same object), the answer is no. In the case of integer values, Java sometimes uses separate objects and sometimes uses a cached version of the object. The language standard guarantees caching only for a certain range of integers (-128 to 127), which is why we get different results using 13 versus 666.

This issue comes up with lists as well. Scheme will, in general, try to avoid making copies when it doesn't have to. For example, given these definitions:

        > (define x '(1 2 3))
        > (define y x)
        > (define z '(1 2 3))

Scheme will create two list objects. The first list is pointed to by both x and y. The second is pointed to by z. So when we compare with the eq? predicate, we find that x and y are equal (are the same pointer) but not x and z:

        > (eq? x y)
        #t
        > (eq? x z)
        #f

The eqv? predicate is slightly less strict than the eq? predicate. It returns true in the same cases but also returns true for simple values like numbers that have the same value. So with eqv?, we see that the floating point comparisons now return true, but the list comparisons still do not:

        > (define a 38.4)
        > (define b 38.4)
        > (eqv? a b)
        #t
        > (define x '(1 2 3))
        > (define y x)
        > (define z '(1 2 3))
        > (eqv? x y)
        #t
        > (eqv? x z)
        #f

The third variation is the equal? predicate, which can be thought of as a deep equality operator. It recursively compares the values in each structure. So with the equal? predicate, even our list example works:

        > (equal? x y)
        #t
        > (equal? x z)
        #t
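The four equality tests can be compared side by side in one short sketch. It builds the lists with the list procedure rather than with quoted literals, because list always allocates a fresh object, which makes the eq? and eqv? results predictable. The names lst1 and lst2 are made up for this illustration.

```scheme
; Two structurally identical lists that are guaranteed to be distinct objects.
(define lst1 (list 1 2 3))
(define lst2 (list 1 2 3))

(eq? lst1 lst1)     ; #t: the very same object
(eq? lst1 lst2)     ; #f: different objects
(eqv? lst1 lst2)    ; #f: eqv? does not look inside lists
(equal? lst1 lst2)  ; #t: same structure and same contents

; = compares numeric value only; the other predicates also
; distinguish exact numbers from inexact ones.
(= 2 2.0)           ; #t
(eqv? 2 2.0)        ; #f
(equal? 2 2.0)      ; #f
```

A reasonable rule of thumb is to use = when both values are known to be numbers and equal? when comparing lists or other structured data.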

Then I spent a few minutes reviewing boolean operators. We wrote a predicate to determine if one number divides another:

        ; predicate to test whether m divides n
        (define (divides? m n) (= (modulo n m) 0))

But this failed if we asked whether 0 divides a number. By definition, 0 does not divide anything, so we fixed this up with an if:

        (define (divides? m n)
          (if (= m 0)
              #f
              (= (modulo n m) 0)))

I then discussed the fact that something interesting is going on here. Imagine, for example, that you tried to write your own version of if as a simple procedure:

        (define (if test val1 val2) ...)

Would it have the correct behavior? The answer is no. When you call a simple procedure, Scheme evaluates all of the argument expressions being passed to the procedure and then executes the procedure. This is known as eager evaluation.

Consider what would happen if if used eager evaluation. It would first evaluate the test and the two expressions. In the case of our divides? predicate, it would produce the error by trying to compute a value modulo zero. In the case of a recursive definition, it is even worse. We count on the if being able to stop the recursive call from taking place when we reach a base case. If the if fully evaluated both arguments, we couldn't use it to write recursive definitions because they would produce infinite recursion.

In Scheme, the if is a special form that delays the evaluation of the two expressions that come after the test. It evaluates only one of these expressions depending upon what the test evaluates to. This is sometimes referred to as a form of lazy evaluation.
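The difference can be sketched with two hypothetical procedures, my-if and delayed-if, neither of which is part of Scheme. The first is an ordinary procedure, so both branches are evaluated before it runs. The second recovers the delayed behavior by asking the caller to wrap each branch in a zero-argument lambda and then calling only the one that is needed.

```scheme
; Hypothetical procedure version of if. All three arguments are
; evaluated before the body runs, whichever branch is chosen.
(define (my-if test val1 val2)
  (if test val1 val2))

; Evaluating the expression below would signal a division-by-zero
; error, because the modulo is computed even though the test is true:
;   (my-if (= 0 0) #f (= (modulo 10 0) 0))

; Hypothetical variant that takes the branches as zero-argument
; procedures and calls only the selected one.
(define (delayed-if test then-thunk else-thunk)
  (if test (then-thunk) (else-thunk)))

; divides-delayed? is a made-up name for a version of divides?
; written with delayed-if.
(define (divides-delayed? m n)
  (delayed-if (= m 0)
              (lambda () #f)
              (lambda () (= (modulo n m) 0))))
```

Here (divides-delayed? 0 10) returns #f without ever computing a value modulo zero, because the second lambda is never called.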

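We decided that this definition of divides? was violating boolean zen for Scheme. Scheme has special forms called "and" and "or" and a procedure called "not" that perform boolean operations. Like if, the "and" form stops evaluating as soon as it finds a false value, so the modulo is never computed when m is 0. So we rewrote the divides? predicate as: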

        (define (divides? m n)
          (and (not (= m 0)) (= (modulo n m) 0)))
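The following sketch shows why this version is safe: and evaluates its arguments from left to right and stops at the first false one, and or stops at the first true one. The does-not-divide? predicate is a hypothetical companion added only to illustrate or.

```scheme
(define (divides? m n)
  (and (not (= m 0)) (= (modulo n m) 0)))

; Hypothetical opposite of divides?, written with or.
; When m is 0 the first test is true, so the modulo is skipped.
(define (does-not-divide? m n)
  (or (= m 0) (not (= (modulo n m) 0))))

(divides? 0 10)          ; #f, and no division by zero occurs
(divides? 3 12)          ; #t
(does-not-divide? 0 10)  ; #t
(does-not-divide? 3 12)  ; #f
```

The order of the arguments matters. Swapping the two tests inside the and would bring back the division-by-zero error for m equal to 0.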

I then mentioned that I wanted people to see the other major conditional execution construct. We've seen the if form that allows you to choose between two alternatives. The cond form allows you to pick between any number of alternatives. It is like a nested if/else construct in Java or ML (if, else if, else if, else if, else). It has this general form:

        (cond (<test> <exp>) (<test> <exp>) ... (<test> <exp>))

If you want a final "else" clause, the convention is to use the word "else" for the final test. As with a nested if/else, it sequentially performs each test and returns the value of the expression that corresponds to the first test that returns true.

For example, in section you wrote this procedure for flattening a list of lists:

        (define (flatten lst)
          (if (null? lst)
              '()
              (if (list? (car lst))         
                  (append (flatten (car lst)) (flatten (cdr lst)))
                  (cons (car lst) (flatten (cdr lst))))))

There is a base case for an empty list and two recursive cases: one for when the car of the list is a list and one for when the car of the list is a simple value. This is better written as a cond because it really has three different cases:

        (define (flatten2 lst)
          (cond ((null? lst) '())
                ((list? (car lst)) (append (flatten2 (car lst)) (flatten2 (cdr lst))))
                (else (cons (car lst) (flatten2 (cdr lst))))))
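Because cond tries its tests in order and stops at the first true one, later clauses can rely on the earlier tests having failed. The hypothetical grade procedure below is a made-up illustration of that point; the cutoffs are arbitrary.

```scheme
; Hypothetical example: map a numeric score to a letter grade.
; Each clause is reached only if every earlier test was false,
; so the second test does not need to check (< score 90).
(define (grade score)
  (cond ((>= score 90) "A")
        ((>= score 80) "B")
        ((>= score 70) "C")
        (else "F")))
```

For instance, (grade 85) returns "B" because the first test fails and the second succeeds, and (grade 12) falls through to the else clause and returns "F".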

I spent the last few minutes of lecture reminding people that Scheme has higher order functions like map, filter, foldl and foldr and that you can use lambda expressions to form anonymous functions to pass to each. For example, this expression forms a list of the first 20 even numbers:

        > (map (lambda (n) (* 2 n)) (range 1 20))
        (2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40)

In ML we formed anonymous functions by saying:

        fn <parameters> => <expression>

The syntax is similar in Scheme:

        (lambda <parameters> <expression>)

Below is an example of an expression that finds all the factors of 24:

        > (filter (lambda (n) (divides? n 24)) (range 0 100))
        (1 2 3 4 6 8 12 24)

And this one finds the sum of the factors of 24:

        > (foldl + 0 (filter (lambda (n) (divides? n 24)) (range 0 100)))
        60
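The pieces above can be combined into small reusable procedures. This sketch assumes filter, foldl and foldr behave as they do in PLT Scheme / Racket. The range used in these examples is not a standard procedure, so the sketch supplies a stand-in called range-inclusive; that name, along with factors and factorial, is hypothetical.

```scheme
; Hypothetical stand-in for the range helper used above:
; the list of integers from low to high, inclusive.
(define (range-inclusive low high)
  (if (> low high)
      '()
      (cons low (range-inclusive (+ low 1) high))))

(define (divides? m n)
  (and (not (= m 0)) (= (modulo n m) 0)))

; All of the factors of n, found by filtering 1 through n.
(define (factors n)
  (filter (lambda (m) (divides? m n)) (range-inclusive 1 n)))

; The product 1 * 2 * ... * n, computed with a fold.
(define (factorial n)
  (foldl * 1 (range-inclusive 1 n)))

; foldr and foldl combine the elements in opposite orders.
(foldr cons '() '(1 2 3))  ; (1 2 3)
(foldl cons '() '(1 2 3))  ; (3 2 1)
```

With these definitions, (factors 24) gives (1 2 3 4 6 8 12 24) and (factorial 5) gives 120. For an operator like + or * the direction of the fold does not change the answer, but for cons it does.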

Stuart Reges

Last modified: Fri May 8 15:03:46 PDT 2009