CSE341 Notes for Friday, 2/16/07

Mikey talked about programming with side effects and provided the following lecture notes.

We started off with a simple definition statement of (define a 1). We then used (set! a (+ a 1)) and noticed that "a" was now bound to 2, not 1. It seemed no different than typing (define a 2). We then wrote a very simple function:

        (define a 1)
        
        (define (f n)
          (begin (set! a (+ a 1))
                 (* n a)))

We then tried simple calls like (f 5), which produced 10. The next call to (f 5) produced 15, then 20, and we were amazed. The set! was changing the value of the a that the closure of f refers to, and these changes persisted between function calls. They also affected the top-level environment: after three calls to f, entering a in the REPL printed 4.
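
The sequence of calls above can be replayed as a small self-contained sketch. It assumes an R5RS-style Scheme; the names r1, r2 and r3 are hypothetical and only hold the three results. A function body is an implicit begin, so the explicit begin is left out here.

```scheme
; Top-level variable that f mutates on every call.
(define a 1)

; Increment a, then return n times the new value of a.
(define (f n)
  (set! a (+ a 1))
  (* n a))

(define r1 (f 5))   ; a is now 2, so r1 is 10
(define r2 (f 5))   ; a is now 3, so r2 is 15
(define r3 (f 5))   ; a is now 4, so r3 is 20
a                   ; => 4
```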

This is because the set! is changing the binding of a at the top level, not inside of our function. We can especially see this if we introduce another function:

        (define (g n)
          (+ n a))

As we call (f n), (g n) will also return different results; the a in f, g, and the top level are all the same a. A simplification of how this works is to view the top level environment and its bindings as a stack. We start with an empty stack:

        ---+
          || <- bottom
        ---+

If we then call, from above, the define of a and define f, the stack looks like:

           _____   _____
          |     | |     |
        --v-----|-|-----v----+
        f(n):(* n a) || a:1 ||
        ---------------------+

Where the ASCII-arrows represent where the closure gets the value. If we then add the definition of g, we end up with this stack:

                  _____________________
          _____  |        _____   _____|
         |     | |       |     | |     |
       --v-----|-|-------v-----|-|-----v----+
       g(n):(+ n a) || f(n):(* n a) || a:1 ||
       -------------------------------------+

Also in glorious ASCII art. So both f and g end up pointing at the same binding of a in their closures. If we add another definition of a to the environment, we end up with something that looks the same:

                         _____________________
                 _____  |        _____   _____|
                |     | |       |     | |     |
       ---------v-----|-|-------v-----|-|-----v----+
       a:5 || g(n):(+ n a) || f(n):(* n a) || a:1 ||
       --------------------------------------------+

New definitions that reference a will grab the new binding, but f and g will continue to reference that old one. Also, the set! in the definition of f will continue to affect f and g, but if we (set! a (+ a 1)) at the top level, f and g will not be affected.

After we saw some set!, we moved on to set-car! and set-cdr!. We started with some definitions:

        (define a '(1 2 3))
        (define b '(4 5 6))

        a: (1 2 3)

        (set-car! a 42)

        a: (42 2 3)

        (set-cdr! a '(9 10))

        a: (42 9 10)

The two new sets did what we expected. We wrote a map! to avoid creating a whole new list when we map:

        (define (map! f lst)
          (if (null? lst)
              '()
              (begin (set-car! lst (f (car lst)))
                     (map! f (cdr lst)))))

We noticed some peculiarities. First, map! returns the empty list in all cases; as an exercise for the reader, you could modify it to return the original list. We also noticed that it was actually slower to map! over a long list than to map over it, although map! used less memory. I mentioned the relevance of using set! in these kinds of operations to save unnecessary work, as in the ngram tree, where every insert created a completely new tree by copying values.
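
One possible answer to that exercise, as a minimal sketch rather than the version from lecture: keep a reference to the head of the list and walk a second variable down it. The name map-in-place! and the variables nums and result are hypothetical, and the sketch assumes a Scheme with mutable pairs (it builds the list with list rather than a quoted literal, since mutating a literal constant is not portable).

```scheme
; Apply f to every element of lst in place, then return lst itself.
(define (map-in-place! f lst)
  (let loop ((rest lst))
    (if (null? rest)
        lst                                   ; hand back the original head
        (begin (set-car! rest (f (car rest)))
               (loop (cdr rest))))))

(define nums (list 1 2 3))
(define result (map-in-place! (lambda (x) (* x 10)) nums))

result            ; => (10 20 30)
(eq? result nums) ; => #t, no new pairs were allocated
```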

After map!, we did some oddities with these definitions:

        (define a '(1 2 3))
        (define b '(4 5 6))
        (define c (append a b))
        
        a: (1 2 3)
        b: (4 5 6)
        c: (1 2 3 4 5 6)
        
        (set-car! b 42)
        
        a: (1 2 3)
        b: (42 5 6) ;expected
        c: (1 2 3 42 5 6) ;shock and awe, shock and awe!
        
        (set-car! a 30)
        
        a: (30 2 3) ;expected
        b: (42 5 6) ;expected
        c: (1 2 3 42 5 6) ;jaws hit the floor at this point

To explain what was going on here, we made some pictures on the overhead, which I will try to recreate in ASCII:

           +---+---+   +---+---+   +---+---+
        A: | 1 | --+-->| 2 | --+-->| 3 | X |
           +---+---+   +---+---+   +---+---+
        
           +---+---+   +---+---+   +---+---+
        B: | 4 | --+-->| 5 | --+-->| 6 | X |
           +---+---+   +---+---+   +---+---+

We had 4 options for what c would be. I will use A and B to refer to the original lists, and A' and B' to refer to copies of them, and the arrow (-->) means to set the (currently null) tail pointer to the next list.

        1: A --> B

              +---+---+   +---+---+   +---+---+
        C, A: | 1 | --+-->| 2 | --+-->| 3 | - |
              +---+---+   +---+---+   +---+-/-+
                 __________________________/
                v
              +---+---+   +---+---+   +---+---+
           B: | 4 | --+-->| 5 | --+-->| 6 | X |
              +---+---+   +---+---+   +---+---+
        
        2: A'--> B

           +---+---+   +---+---+   +---+---+
        A: | 1 | --+-->| 2 | --+-->| 3 | X |
           +---+---+   +---+---+   +---+---+
        
           +---+---+   +---+---+   +---+---+
        C: | 1 | --+-->| 2 | --+-->| 3 | - |
           +---+---+   +---+---+   +---+-/-+
              __________________________/
             v
           +---+---+   +---+---+   +---+---+
        B: | 4 | --+-->| 5 | --+-->| 6 | X |
           +---+---+   +---+---+   +---+---+
        
        3: A --> B'
        
              +---+---+   +---+---+   +---+---+   +---+---+   +---+---+   +---+---+
        C, A: | 1 | --+-->| 2 | --+-->| 3 | --+-->| 4 | --+-->| 5 | --+-->| 6 | X |
              +---+---+   +---+---+   +---+---+   +---+---+   +---+---+   +---+---+
        
              +---+---+   +---+---+   +---+---+
           B: | 4 | --+-->| 5 | --+-->| 6 | X |
              +---+---+   +---+---+   +---+---+
        
        4: A'--> B'
        
           +---+---+   +---+---+   +---+---+
        A: | 1 | --+-->| 2 | --+-->| 3 | X |
           +---+---+   +---+---+   +---+---+
        
           +---+---+   +---+---+   +---+---+
        B: | 4 | --+-->| 5 | --+-->| 6 | X |
           +---+---+   +---+---+   +---+---+
        
           +---+---+   +---+---+   +---+---+   +---+---+   +---+---+   +---+---+
        C: | 1 | --+-->| 2 | --+-->| 3 | --+-->| 4 | --+-->| 5 | --+-->| 6 | X |
           +---+---+   +---+---+   +---+---+   +---+---+   +---+---+   +---+---+

In cases 1 and 3, we would run into a problem of modifying the definition of list A whenever we used it in append, such that we would see this sequence:

        (define a '(1 2 3))
        (define b '(4 5 6))
        (define c (append a b))
        
        a: (1 2 3 4 5 6) ;whoa, that's not good
        b: (4 5 6) ;ok
        c: (1 2 3 4 5 6) ;ok

Unintended side effects are not good. Cases 3 and 4 would result in creating a copy of B, something that isn't strictly necessary: B remains unchanged if we point to it, as we don't actually modify the list.

This left us with option 2, which it turns out Scheme is doing for us. We then wrote our own append function that does this:

        (define (append a b)
          (if (null? a) b
              (cons (car a) (append (cdr a) b))))

This preserves (and maybe illustrates) the semantics of the built-in append.
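
The sharing in option 2 can be checked directly with eq?, which tests whether two values are the very same pair. This is a small sketch assuming a Scheme with mutable pairs; the function is renamed my-append (a hypothetical name) so that it does not shadow the built-in.

```scheme
; Copy the pairs of a, and reuse b itself as the tail.
(define (my-append a b)
  (if (null? a)
      b
      (cons (car a) (my-append (cdr a) b))))

(define a (list 1 2 3))
(define b (list 4 5 6))
(define c (my-append a b))

(eq? (cdddr c) b)   ; => #t, the tail of c is b itself
(eq? c a)           ; => #f, the front of c is a fresh copy of a

(set-car! b 42)
c                   ; => (1 2 3 42 5 6), the change to b shows through
(set-car! a 30)
c                   ; => (1 2 3 42 5 6), the change to a does not
```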

Then I showed circular lists and blew some minds. They are more of an oddity than anything else, but you can easily create one by doing

        (append! a a)
        
        ;or
        
        (set-cdr! a a)
        
        ;or almost anything that you can think of
        
        (define a '(1 2 3))
        (define b '(4 5 6))
        (define c (append a b))
        (set-cdr! (cddr b) c)

Both b and c will be circular: they are the same list, entered at different starting points. It also becomes dangerous to pass these lists to code that does not expect them. Things like map and length will notice that the argument is not a proper list (circular lists fail the list? test), but other code, such as the recursive functions we have written, may loop forever or produce unexpected results.

For instance:

        (define a '(1 2 3))
        (append! a a) ;a = #0=(1 2 3 . #0#) (circular)
        (append '(1 2 3) a) ;works, odd list though
        (append a '(1 2 3)) ;infinite loop
        (append a a) ;infinite loop
        (set-cdr! (cddr a) a) ;no effect
        (set-cdr! (cddr a) (cons 4 a)) ;still circular with one more element
        (set-cdr! (cdddr a) '()) ;all better now
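
Code that has to cope with such lists can test for a cycle before recursing. The sketch below is not from lecture; it uses the standard tortoise-and-hare technique (one pointer advancing one pair at a time, another advancing two), and the names circular?, plain and ring are hypothetical. It assumes a Scheme with mutable pairs.

```scheme
; Return #t if following cdrs from lst never reaches the end.
(define (circular? lst)
  (let loop ((slow lst) (fast lst))
    (if (or (not (pair? fast)) (not (pair? (cdr fast))))
        #f                                  ; fast ran off the end: no cycle
        (let ((slow (cdr slow))
              (fast (cddr fast)))
          (if (eq? slow fast)
              #t                            ; fast caught up with slow: cycle
              (loop slow fast))))))

(define plain (list 1 2 3))
(define ring (list 1 2 3))
(set-cdr! (cddr ring) ring)   ; tie the last pair back to the first

(circular? plain)   ; => #f
(circular? ring)    ; => #t
```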

Nathan then talked about delayed evaluation and provided the following notes in the form of a Scheme file with comments.

; memoization
; maintain a table of results to avoid recomputing function calls
; especially important when problems depend on subproblems

; useful example - fibonacci
; maintain a table, as a list of (argument result) entries
; memo -> ((1 1) (2 1)) to start
(define fibonacci
  (letrec ((memo (list (list 1 1) (list 2 1)))
           (f (lambda (x)
                (let ((ans (assoc x memo)))
                  (if ans
                      (cadr ans)
                      (let ((new-ans (+ (f (- x 1))
                                        (f (- x 2)))))
                        (begin (set! memo (cons (list x new-ans)
                                                memo))
                               new-ans)))))))
    f))
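
; A rough way to see what the table buys us: count how many calls each
; version makes. This is a sketch, not part of the original file; the
; names naive-fib, memo-fib, naive-calls and memo-calls are hypothetical,
; and memo-fib repeats the definition above with a counter added.

```scheme
(define naive-calls 0)
(define (naive-fib x)
  (set! naive-calls (+ naive-calls 1))
  (if (<= x 2)
      1
      (+ (naive-fib (- x 1)) (naive-fib (- x 2)))))

(define memo-calls 0)
(define memo-fib
  (letrec ((memo (list (list 1 1) (list 2 1)))
           (f (lambda (x)
                (set! memo-calls (+ memo-calls 1))
                (let ((ans (assoc x memo)))
                  (if ans
                      (cadr ans)
                      (let ((new-ans (+ (f (- x 1)) (f (- x 2)))))
                        (set! memo (cons (list x new-ans) memo))
                        new-ans))))))
    f))

(naive-fib 20)   ; => 6765
(memo-fib 20)    ; => 6765
; memo-calls ends up far smaller than naive-calls, because each
; argument is computed at most once and then found in the table
```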



; thunks
; used to delay evaluation for performance (or termination)
; can be used to provide call-by-need semantics in scheme

(define (work x)
  (display "work called ")
  (display x)
  (display "\n")
  (+ x 1))

(define (test t e1 e2)
  (if t
      (+ e1 e1 e1)
      e2))

(test true (work 20) (work 10))

; same function, thunks
(define (test2 t e1 e2)
  (if t
      (+ (e1) (e1) (e1))
      (e2)))

(test2 true (lambda () (work 20)) (lambda () (work 10)))


; same function, using delay/force instead of explicit thunks
; (a promise caches its value, so work runs only once)
(define (test3 t e1 e2)
  (if t
      (+ (force e1) (force e1) (force e1))
      (force e2)))

(test3 true (delay (work 20)) (delay (work 10)))
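
; The difference between a plain thunk and a promise can be measured by
; counting evaluations. A sketch, assuming a Scheme where delay and force
; are available by default (as in R5RS); counted-work, sum3-thunk,
; sum3-promise and the result variables are hypothetical names.

```scheme
(define calls 0)
(define (counted-work x)
  (set! calls (+ calls 1))
  (+ x 1))

; Use the argument three times, as the true branch of the tests above does.
(define (sum3-thunk e) (+ (e) (e) (e)))
(define (sum3-promise e) (+ (force e) (force e) (force e)))

(set! calls 0)
(define thunk-result (sum3-thunk (lambda () (counted-work 20))))
(define thunk-calls calls)      ; 3: the thunk body runs on every call

(set! calls 0)
(define promise-result (sum3-promise (delay (counted-work 20))))
(define promise-calls calls)    ; 1: the promise remembers its value

thunk-result     ; => 63
promise-result   ; => 63
```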

Stuart Reges

Last modified: Wed Feb 21 06:49:04 PST 2007