CSE341 Notes for Wednesday, 2/21/07

I mentioned that assignment 7 would be the biggest programming project for the quarter, but that I hadn't finished it yet. I said that I'd be giving people extra time to work on it. The final assignment (assignment 8) will be in Ruby and will just provide some basic practice. It won't be a huge assignment.

I also said that we were reaching the point in the quarter when we start shifting from talking about language details to talking about language concepts. I've been disappointed to find that students who have taken 341 recently haven't always gotten a lot out of these discussions. Many people seem to miss the big picture among all the various concepts. So I decided to spend some time discussing a few "big picture" concepts as they relate to Scheme.

First we talked about the idea of static type checking versus dynamic type checking. We saw that ML is similar to Java in that before any line of code gets executed, the compiler checks all of our code for type errors. In each case, the compiler wants to know in advance exactly what type each variable and function has. It was convenient that ML used type inference to figure this out so that we didn't have to include as many type specifications as we do in Java, but everyone who programs in ML has to get used to "tycon mismatch" messages appearing frequently.

Several people expressed opinions about what they liked about each approach with two common themes:

Several people liked the fact that the static type checking of ML allows them to locate errors early. When you make a mistake in Scheme, it often manifests as a subtle bug that can be difficult to locate.
Other people liked the flexibility of Scheme. For example, in ML we couldn't write a single function whose argument might be an int on one call and a string on the next without first defining a datatype to wrap the two. This is easy to do in Scheme.

Each approach has its own advantages and disadvantages. I also briefly discussed the idea of type safety. To guarantee type safety, Scheme does a lot of checking at runtime. When you try to add two values together, for example, Scheme first makes sure that the two values are numbers. No such runtime check would be necessary in a language like Java or ML because the compiler makes sure that you never attempt to add two things that aren't numbers.
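
To make the contrast concrete, here is a small sketch of both points: a procedure whose single argument can be of any type, and the kind of runtime test that guards an addition. The names describe and checked-add are made up for illustration. In particular, checked-add only imitates in ordinary Scheme code the check the interpreter performs internally, and it returns a symbol where the real + would signal an error.

```scheme
; describe is a hypothetical procedure: its one argument can be of any type,
; and the type is inspected at runtime with predicates.
(define (describe x)
  (cond ((number? x) 'number)
        ((string? x) 'string)
        ((list? x) 'list)
        (else 'something-else)))

; checked-add is also hypothetical. It spells out the runtime check that
; the interpreter performs internally before adding two values.
(define (checked-add a b)
  (if (and (number? a) (number? b))
      (+ a b)
      'type-error))
```

With these definitions, (describe 3.5) returns the symbol number, (describe '(1 2)) returns list, and (checked-add 3 "four") returns type-error instead of adding.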

I then spent a few minutes discussing a question that had been raised on the message board: what is ML good for? I said that the people I most often see using ML are programming language experts. For example, a friend of mine named Kathleen Fisher is a researcher at AT&T who used a C compiler written in ML to develop a domain-specific programming language called Hancock. AT&T employees use Hancock to process vast amounts of data to solve problems like searching for credit card fraud. It is fairly easy to write compilers and interpreters in functional languages like ML and Scheme, and once you have a compiler or interpreter written, it is fairly easy to "tweak" it to process a slightly different language than the original. In our next programming assignment we'll see how to use Scheme to implement a language interpreter.

We also briefly discussed some of the applications people have made of Scheme and its predecessor, Lisp. Richard Stallman, the founder of the GNU project and an early pioneer of the free software movement, built GNU Emacs around a Lisp interpreter, and much of Emacs is still written in Lisp. Emacs is also configured in Lisp. For example, here are some entries from my Emacs initialization file (called .emacs):

        (setq default-major-mode 'text-mode)
        (setq initial-major-mode 'text-mode)

        ;; inhibit startup message
        (setq inhibit-startup-message 1)

The setq form in Lisp plays a role similar to define (or set!) in Scheme: it gives a value to a variable. I also mentioned that Paul Graham has built a successful software career out of Lisp programming. He wrote the initial version of Viaweb in Lisp and later sold the company to Yahoo!, where the product became Yahoo! Store.

I then turned to another "big picture" idea about Scheme that differentiates it from ML. In Scheme, the distinction between data and code is very loose. We know that we use list notation to tell the interpreter to call built-in procedures like +:

        > (+ 3 4)
        7

But that same list can be stored as data, as in:

        > (define a '(+ 3 4))
        > a
        (+ 3 4)

The symbol a stores a reference to a list that could be thought of as storing data, but could also be thought of as code that can be executed. In Scheme, you can execute such code by evaluating it with the procedure eval:

        > (eval a)
        7

Similarly, we can set a variable to refer to the symbol +:

        > (define b '+)
        > b
        +

And if we ever want to turn the symbol + into the procedure +, we eval it:

        > (eval b)
        #<primitive:+>
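
Because code is just a list, a program can also build an expression with ordinary list operations and then evaluate it. The following is a minimal sketch of that idea. The helper make-addition is a made-up name, and the second argument to eval is an assumption about the interpreter: R5RS specifies that eval takes an environment, while the one-argument calls shown above rely on the interpreter supplying a default.

```scheme
; make-addition is a hypothetical helper that builds code as a list.
(define (make-addition x y)
  (list '+ x y))

; expr is the list (+ 3 (+ 4 5)), assembled at runtime.
(define expr (make-addition 3 (make-addition 4 5)))

; Evaluate the constructed list as code. The environment argument is the
; R5RS form; drop it if your interpreter accepts (eval expr) as shown above.
(define result (eval expr (interaction-environment)))
```

Here expr is plain data until the call on eval, at which point it is treated as code and result is bound to 12.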

I then asked people to think about how the top-level read-eval-print loop is written. I first wrote it this way:

        (define (repl)
          (display "expression? ")
          (let ((expr (read)))
            (print expr)
            (display " --> ")
            (print expr)
            (newline)
            )
          (repl))

This sets up a continuous loop that prompts, then reads an expression, then echoes the expression. It behaved like this:

        > (repl)
        expression? 2.8
        2.8 --> 2.8
        expression? (+ 2 2)
        (+ 2 2) --> (+ 2 2)

Of course, the actual read-eval-print loop evaluates expressions like (+ 2 2). It was easy to change our code to do this by calling eval on the expression:

        (define (repl)
          (display "expression? ")
          (let ((expr (read)))
            (print expr)
            (display " --> ")
            (print (eval expr))
            (newline)
            )
          (repl))

This had the usual behavior of the read-eval-print loop:

        > (repl)
        expression? 3.4
        3.4 --> 3.4
        expression? (+ 2 2)
        (+ 2 2) --> 4
        expression? (* 3.4 (+ 7 9) (- 18 4))
        (* 3.4 (+ 7 9) (- 18 4)) --> 761.6

There is a similar procedure called apply that takes two arguments: a procedure and a list of arguments, as in:

        > (apply + '(3 8 14.5))
        25.5
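
One place apply is handy is when the arguments arrive as a list. As a small sketch (average and largest-in are made-up names, not standard procedures), a procedure that accepts any number of arguments receives them as a list and can pass that list on with apply:

```scheme
; average is a hypothetical procedure that accepts any number of arguments.
; The dot in the parameter list collects them into the list nums.
(define (average . nums)
  (/ (apply + nums) (length nums)))

; largest-in is also hypothetical: it uses apply to hand a whole list to max.
(define (largest-in lst)
  (apply max lst))
```

For example, (average 3 8 14.5) returns 8.5 and (largest-in '(3 8 14.5)) returns 14.5. Calling average with no arguments divides by zero, so a fuller version would check for that case.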

We don't have this kind of capability in Java or ML. For example, it might be nice to say something like this in Java:

        String s = "System.out.println(48);";
        execute(s);

Java has no built-in execute of this kind. It is much more difficult in a statically typed language like Java or ML to dynamically execute code like this at runtime. We'd have to somehow invoke the compiler to check the types of values mentioned in the expression. But in a language like Scheme that is designed to do its type checking at runtime, it is much easier to allow this.

I then spent the last part of class talking about a bigger example than we've seen before. I talked about the idea of writing a procedure that would take the derivative of a function with respect to some variable. This is a classic symbolic computation task where languages like Scheme are likely to outshine languages like Java. I got the idea from the Abelson and Sussman text (Structure and Interpretation of Computer Programs).

We said that we would support the following kinds of expressions:

        ; expression:
        ;    number
        ;    variable
        ;    (+ expression expression)
        ;    (* expression expression)

We didn't have much time left, so we were only able to implement the derivative of a simple number or a variable:

        (define (deriv exp var)
          (cond ((number? exp) 0)
                ((symbol? exp)
                 (if (eq? exp var) 1 0))))

I asked the TAs to cover the other cases in section. Here is my final solution that includes all four kinds of expressions:

        (define (deriv exp var)
          (cond ((number? exp) 0)
                ((symbol? exp)
                 (if (eq? exp var) 1 0))
                ((sum? exp)
                 (make-sum (deriv (arg1 exp) var)
                           (deriv (arg2 exp) var)))
                ((product? exp)
                 (make-sum
                  (make-product (arg1 exp)
                                (deriv (arg2 exp) var))
                  (make-product (deriv (arg1 exp) var)
                                (arg2 exp))))
                (else
                 (error "unknown expression type -- DERIV" exp))))
        
        (define (arg1 e) (cadr e))
        
        (define (arg2 e) (caddr e))
        
        (define (make-sum a1 a2) (list '+ a1 a2))
        
        (define (make-product m1 m2) (list '* m1 m2))
        
        (define (sum? x)
          (and (pair? x) (eq? (car x) '+)))
        
        (define (product? x)
          (and (pair? x) (eq? (car x) '*)))

This version produces correct answers, but it doesn't provide very simple answers. For example:

        > (deriv '(* x (* x x)) 'x)
        (+ (* x (+ (* x 1) (* 1 x))) (* 1 (* x x)))
        > (deriv '(+ x y) 'x)
        (+ 1 0)

We get better results if we redefine make-sum and make-product to do some simplification:

        ; eqv? is used for the numeric tests because the result of eq? on
        ; numbers is not specified by the Scheme standard
        (define (make-sum a1 a2)
          (cond ((eqv? a1 0) a2)
                ((eqv? a2 0) a1)
                ((and (number? a1) (number? a2)) (+ a1 a2))
                (else (list '+ a1 a2))))

        (define (make-product m1 m2)
          (cond ((or (eqv? m1 0) (eqv? m2 0)) 0)
                ((eqv? m1 1) m2)
                ((eqv? m2 1) m1)
                ((and (number? m1) (number? m2)) (* m1 m2))
                (else (list '* m1 m2))))

This version produces the following results:

        > (deriv '(* x (* x x)) 'x)
        (+ (* x (+ x x)) (* x x))
        > (deriv '(+ x y) 'x)
        1
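
One way to convince yourself that the simplified and unsimplified answers agree is to plug in a value for x. The sketch below is not part of the class solution. It is a hypothetical helper called evaluate that handles the same four kinds of expressions and looks up variables in an association list such as ((x . 2)).

```scheme
; evaluate is a hypothetical helper, not part of the class solution.
; env is an association list such as ((x . 2) (y . 5)).
(define (evaluate exp env)
  (cond ((number? exp) exp)
        ((symbol? exp)
         (let ((binding (assq exp env)))
           (if binding
               (cdr binding)
               (error "unbound variable -- EVALUATE" exp))))
        ((and (pair? exp) (eq? (car exp) '+))
         (+ (evaluate (cadr exp) env) (evaluate (caddr exp) env)))
        ((and (pair? exp) (eq? (car exp) '*))
         (* (evaluate (cadr exp) env) (evaluate (caddr exp) env)))
        (else (error "unknown expression type -- EVALUATE" exp))))
```

With x bound to 2, both (evaluate '(+ (* x (+ x x)) (* x x)) '((x . 2))) and the same call on the unsimplified (+ (* x (+ (* x 1) (* 1 x))) (* 1 (* x x))) return 12, which matches the expected derivative of x cubed at 2, namely 3 times 2 squared.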

Stuart Reges

Last modified: Sun Feb 25 14:21:38 PST 2007