CSE341 Notes for Friday, 11/16/07

I said that I wanted to finish up our discussion of Scheme by exploring some of the areas of the language that we hadn't had time to explore in detail.

I briefly discussed some of the applications people have made of Scheme and its predecessor, Lisp. Richard Stallman, the founder of the GNU project and an early pioneer of the free software movement, wrote GNU emacs with a dialect of Lisp as its extension language, and much of emacs is still written in Lisp. For example, here are some entries from my emacs initialization file (called .emacs):

        (setq default-major-mode 'text-mode)
        (setq initial-major-mode 'text-mode)

        ;; inhibit startup message
        (setq inhibit-startup-message 1)

The setq form in Lisp gives a variable a value, much like a top-level define (or set!) in Scheme. I also mentioned that Paul Graham has built a successful software career out of Lisp programming. He wrote the initial version of Viaweb, the product that became Yahoo! Store, in Lisp and later sold the company to Yahoo.

Then I said that I wanted to look at various examples that involved delayed evaluation of code. To explore this, we began with the following procedure that displays a message every time it is called:

        (define (work x)
          (display "work called: ")
          (display x)
          (newline)
          (+ x 1))

We have discussed the fact that if does not evaluate its second and third arguments unless it has to. Notice how in this call we see a message only for the call on work with the parameter 3:

        > (if (< 2 3) (work 3) (work 4))
        work called: 3
        4

While in this case we see just a call on work with the parameter 4:

        > (if (> 2 3) (work 3) (work 4))
        work called: 4
        5

We considered what happens when you write a procedure with a similar set of parameters to if (a test, a first value and a second value):

        (define (test t e1 e2)
          (if t
              (+ e1 e1 e1)
              e2))

As we've discussed, when you write a simple procedure like this, Scheme fully evaluates every argument once before executing the procedure body, so it's not surprising that we see both messages when we call it:

        > (test (< 2 3) (work 3) (work 4))
        work called: 3
        work called: 4
        12
        > (test (> 2 3) (work 3) (work 4))
        work called: 3
        work called: 4
        5

The first variation I discussed was the idea of delaying evaluation by making these parameters thunks. A thunk is a procedure of zero arguments (a lambda) that is used to wrap up an expression to delay its evaluation. So for the test procedure, instead of taking parameters e1 and e2 that are already evaluated, we assume they are thunks (lambdas of zero arguments) that need to be called:

        (define (test2 t e1 e2)
          (if t
              (+ (e1) (e1) (e1))
              (e2)))

In calling the test2 procedure, we now have to wrap up each expression in a lambda:

        > (test2 (< 2 3) (lambda () (work 3)) (lambda () (work 4)))
        work called: 3
        work called: 3
        work called: 3
        12
        > (test2 (> 2 3) (lambda () (work 3)) (lambda () (work 4)))
        work called: 4
        5

Notice that we have duplicated one of the properties that if has in that we only evaluate the parameter that we end up being interested in. But notice that for e1, we end up evaluating it three different times.
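One way to avoid the repeated evaluation while still using thunks is to call the thunk once and store the result in a let. The following is a minimal sketch, not something from the lecture; the names test2-once, quiet-work and thunk-calls are hypothetical, and the counter stands in for the printed message so that the number of calls can be checked.

```scheme
;; test2-once is a hypothetical variant of test2 from the notes.
(define (test2-once t e1 e2)
  (if t
      (let ((v (e1)))        ; call the thunk a single time
        (+ v v v))           ; reuse the stored value
      (e2)))

;; thunk-calls is a hypothetical counter that records each real call.
(define thunk-calls 0)

(define (quiet-work x)
  (set! thunk-calls (+ thunk-calls 1))
  (+ x 1))

(define once-result
  (test2-once (< 2 3)
              (lambda () (quiet-work 3))
              (lambda () (quiet-work 4))))
```

Here once-result is 12 and thunk-calls is 1, because the first thunk runs once and the second thunk is never called. The drawback is that every procedure that accepts thunks has to remember to do this itself.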

Another way to approach this is to use the built-in delay form and force procedure. They operate on a data type known as a "promise". For example:

        > (define x (delay (+ 2 2)))
        > x
        #<struct:promise>

This says to delay the evaluation of the expression until later. You then request the evaluation by calling force:

        > (force x)
        4

You might think that after calling this that x has now been set to 4, but that's not true. x still refers to the promise:

        > x
        #<struct:promise>

But you can always get the value again by calling force and, as we'll see, Scheme uses memoization to ensure that the code is not evaluated multiple times:

        > (force x)
        4
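The memoization can be observed directly by delaying an expression that has a side effect. This is a small sketch under the assumption that a counter is an acceptable stand-in for the printed message; counted-work, call-count, counted-promise, first-result and second-result are hypothetical names that do not appear in the lecture code.

```scheme
;; call-count is a hypothetical counter used only for this illustration.
(define call-count 0)

(define (counted-work x)
  (set! call-count (+ call-count 1))   ; record each real evaluation
  (+ x 1))

;; Creating the promise does not evaluate the expression.
(define counted-promise (delay (counted-work 3)))

(define first-result (force counted-promise))    ; runs (counted-work 3)
(define second-result (force counted-promise))   ; reuses the stored value
```

Both results are 4, and call-count is 1 after the two calls on force, which shows that the delayed expression ran only once.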

We rewrote the test procedure to use calls on force:

        (define (test3 t e1 e2)
          (if t
              (+ (force e1) (force e1) (force e1))
              (force e2)))

When we called it, we now had to wrap up the expressions in a call on delay, as in:

        > (test3 (< 2 3) (delay (work 3)) (delay (work 4)))
        work called: 3
        12
        > (test3 (> 2 3) (delay (work 3)) (delay (work 4)))
        work called: 4
        5

Notice that with the combination of force and delay, we have the same properties that we had with if. We only evaluate arguments if we need them and we only evaluate them once, even if the result is referred to multiple times.
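To get a feel for what a promise does, it can be modeled as a thunk that remembers its result. The sketch below is only an illustration and is not how any particular Scheme implements delay and force; make-memo-thunk, run-memo, memo-p, evaluations and total are hypothetical names.

```scheme
;; make-memo-thunk and run-memo are hypothetical names for this sketch;
;; they are not the built-in delay and force.
(define (make-memo-thunk thunk)
  (let ((done? #f)       ; has the thunk been run yet?
        (value #f))      ; the cached result once it has
    (lambda ()
      (if (not done?)
          (begin
            (set! value (thunk))
            (set! done? #t)))
      value)))

(define (run-memo p) (p))

;; evaluations counts how many times the wrapped expression really runs.
(define evaluations 0)

(define memo-p
  (make-memo-thunk
    (lambda ()
      (set! evaluations (+ evaluations 1))
      (+ 3 1))))

(define total (+ (run-memo memo-p) (run-memo memo-p) (run-memo memo-p)))
```

Here total is 12 while evaluations is 1. Unlike delay, this version still requires the caller to write the lambda explicitly, because make-memo-thunk is an ordinary procedure and its argument is evaluated before the call.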

Then we looked at the most powerful way to approach this, which is to use what are called macros. Scheme was designed to have a small core set of operations from which everything else can be built.

Scheme has a special form called define-syntax that can be used to introduce new syntactic forms. Its general form is:

        (define-syntax <name>
          (syntax-rules (<keywords>)
            (<pattern> <value>)))

We began by writing the equivalent of an if that is implemented by calling cond:

        (define-syntax if2
          (syntax-rules ()
            ((if2 test e1 e2)
             (cond (test e1)
                   (else e2)))))

In this form we don't introduce any keywords. We indicate that the pattern is the word if2 followed by three parameters. The cond is the expression that a call on if2 is rewritten into. Because the call is rewritten before anything is evaluated, the arguments are not evaluated at the point of the call; they are evaluated only if the resulting cond evaluates them. So we found that this macro had similar properties to if:

        > (if2 (< 2 3) (work 3) (work 4))
        work called: 3
        4
        > (if2 (> 2 3) (work 3) (work 4))
        work called: 4
        5

Often you want to include special keywords for your construct. For example, suppose that we like the ML-style if/else construct with the keywords "then" and "else". We can write a macro to convert from that form to the built-in if. I called it "if3":

        (define-syntax if3
          (syntax-rules (then else)
            ((if3 e1 then e2 else e3)
              (if e1 e2 e3))))

This definition indicates that the new form will be called if3, meaning that we'll call it like this:

        (if3 <something>)

The syntax-rules clause begins by saying that it has special keywords "then" and "else". Then comes a list of two items. The first item indicates the pattern or general form that this will take. It begins with if3 and includes the keywords "then" and "else", but also includes three values identified as e1, e2 and e3. Scheme will recognize these as values to be provided when if3 is called. For example, a legal call would look like this:

        (if3 (< 3 4) then "foo" else "bar")

We saw that Scheme enforces this syntax, rejecting anything that doesn't fit the pattern that we provided in the call on define-syntax.

After the pattern in the define-syntax there is an expression indicating how to evaluate this. It translates a call like the one above into a call on the built-in if:

        (if e1 e2 e3)

So if you don't like Scheme's built-in syntax, you can define your own, including your own keywords.

Writing macros can be tricky. We looked at how to write a macro that would be equivalent in behavior to the built-in or. I asked people what or returns and some people seemed to think that it always returns either #t or #f. That's not quite right. The built-in or evaluates its arguments in order and stops at the first one that evaluates to something other than #f, returning that value. This is part of the general Scheme notion that #f represents false and everything else represents true. For example:

        > (or 2 3)
        2
        > (or (> 2 3) (+ 2 3))
        5

Only if all arguments return #f does or return #f:

        > (or (> 2 3) (= 2 3))
        #f

We can simulate this behavior with an if:

        (define-syntax or2
          (syntax-rules ()
            ((or2 a b)
             (if a a b))))

This provides a pretty good definition for or. It gets the same answer for all of the cases above:

        > (or2 2 3)
        2
        > (or2 (> 2 3) (+ 2 3))
        5
        > (or2 (> 2 3) (= 2 3))
        #f

There is a subtle problem with this, though. Someone said that it potentially evaluates its first argument twice. This can be a problem if the expression being evaluated has a side-effect like a call on set! or a call on display:

        > (or2 (begin (display "ha") (< 2 3)) (= 2 3))
        haha#t

We get two occurrences of "ha" because the expression is evaluated twice. The built-in or has no such problem:

        > (or (begin (display "ha") (< 2 3)) (= 2 3))
        ha#t

I didn't have time to show it, but we can fix this by using a let to store the result and evaluate just once:

        (define-syntax or3
          (syntax-rules ()
            ((or3 a b)
             (let ((aval a))
               (if aval aval b)))))

This version behaves correctly:

        > (or3 (begin (display "ha") (< 2 3)) (= 2 3))
        ha#t
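Two further points can be checked with a short sketch. First, because syntax-rules macros are hygienic, the aval introduced by the macro does not clash with a variable of the same name at the call site. Second, the same idea extends to any number of arguments by making the macro recursive. The name or-n and the result variables are hypothetical and were not part of the lecture.

```scheme
;; or3 as defined in the notes.
(define-syntax or3
  (syntax-rules ()
    ((or3 a b)
     (let ((aval a))
       (if aval aval b)))))

;; The caller's own aval is not captured by the let inside the macro.
(define hygiene-result
  (let ((aval 5))
    (or3 #f aval)))

;; or-n is a hypothetical n-ary version built from the same pattern.
(define-syntax or-n
  (syntax-rules ()
    ((or-n) #f)                        ; no arguments: false
    ((or-n a) a)                       ; one argument: its value
    ((or-n a b ...)                    ; otherwise test a, then recur on the rest
     (let ((aval a))
       (if aval aval (or-n b ...))))))

(define n-ary-result (or-n (> 2 3) (= 2 3) (+ 2 3)))
(define empty-result (or-n))
```

Here hygiene-result is 5, n-ary-result is 5 and empty-result is #f. A macro system without hygiene would have returned #f for the first of these, since the inner aval would have shadowed the caller's.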

I also briefly mentioned that Scheme has two mechanisms for specifying parameters. You can either indicate exactly how many parameters a procedure has and it will enforce that number or you can indicate that it has an indefinite number of parameters (zero or more). You choose one or the other by including or omitting the parentheses around the parameter name when you define a procedure using the lambda form, as in:

        (define f1 (lambda (n) (+ n 1)))
        (define f2 (lambda n (display n)))

We saw that f1 required exactly one parameter while f2 would take an indefinite number of parameters. In that case, the parameters are provided to f2 as a list.
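Since the arguments arrive as an ordinary list, the usual list procedures can be applied to them. The sketch below uses hypothetical names (sum-all and describe). The second definition uses the dotted parameter form, which we did not cover in lecture; it requires at least one argument and collects any remaining ones in a list.

```scheme
;; sum-all and describe are hypothetical names used only for illustration.
(define sum-all
  (lambda nums                  ; no parentheses: nums is a list of all arguments
    (apply + nums)))

(define describe
  (lambda (first . rest)        ; dotted form: one required argument, the rest as a list
    (list first (length rest))))

(define sum-none (sum-all))             ; 0
(define sum-three (sum-all 1 2 3))      ; 6
(define described (describe 'a 'b 'c))  ; (a 2)
```

Calling describe with no arguments is an error, just as calling f1 with no arguments is.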

Stuart Reges

Last modified: Sun Dec 2 11:30:52 PST 2007