** 4.  Lists **

4.1.

The main data structure in Scheme (and most other functional languages)
is the singly linked list.

Built from pairs.

3 main operations on pairs/lists:

cons: construct a new pair
car: extract the first component of the pair
cdr: extract the second component of the pair

(define p (cons 3 4))  -->  (3 . 4)   [and draw picture]
(car p)  -->  3
(cdr p)  -->  4

(define q (cons p (cons p p)))  -->  ((3 . 4) . ((3 . 4) . (3 . 4)))
[draw picture]

[p is shared 3 times in q, but don't worry about it yet.]

-------------------------------------------------------------------

4.2.

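Lists are made by making the cdr field of each pair be (refer to)
another list.
The end of the list is marked with a special nil constant, written ().
[Standard Scheme requires the empty list to be quoted when it appears
in an expression: '().  The bare () used in these notes is accepted
only by some implementations.]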

()  -->  ()
(define l1 (cons 3 ()))  -->  (3)    [same as (3 . nil)]
(define l2 (cons 2 l1))  -->  (2 3)   [same as (2 . (3 . nil))]
(define l3 (cons 1 l2))  -->  (1 2 3)  [same as (1 . (2 . (3 . nil)))]
[draw pictures]

When interpreting nested pairs as lists, car becomes first and cdr
becomes rest:

(car l3)  -->  1
(cdr l3)  -->  (2 3)
(car (cdr l3))  -->  2
(cdr (cdr l3))  -->  (3)
(car (cdr (cdr l3)))  -->  3
(cdr (cdr (cdr l3)))  -->  ()

A convenience:

(list 3 4 5)  -->  (3 4 5)
(list)  -->  ()
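
As a rough sketch of the point above, the following compares a list
built with list against one built from nested cons cells, and then walks
the cdr fields to count the pairs.  The names by-list, by-cons, and
count-pairs are made up for this illustration, and the empty list is
written in its quoted form '() so that the code runs in standard Scheme.

```scheme
;; Two ways of building the same three-element list.
(define by-list (list 1 2 3))
(define by-cons (cons 1 (cons 2 (cons 3 '()))))

(equal? by-list by-cons)   ; #t  (same structure)

;; count-pairs is a hypothetical helper (not from the notes): it follows
;; cdr fields until it reaches the empty list, counting one per pair.
(define (count-pairs lst)
  (if (null? lst)
      0
      (+ 1 (count-pairs (cdr lst)))))

(count-pairs by-cons)      ; 3
(count-pairs (list))       ; 0
```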

-------------------------------------------------------------------------

4.3.

** Properties of lists:

*) cons is functional: it creates a new pair without modifying its
arguments or any other value.  similarly, car and cdr extract parts of
their argument pair, without modifying the pair or the pieces.  pairs
no longer referenced have their space implicitly reclaimed and
recycled for new pairs.  [refer to above examples of cons, car, and
cdr, and pictures]  =>  ** garbage collection **
   ++) no dangling pointers (free too early)
   ++) no storage leaks (free too late) [not exactly]
   => huge productivity, safety boon

*) Dynamically (type) checked: car or cdr of a non-pair is a run-time
error, detected by a dynamic type check.  (similarly, + of a non-number
is a run-time error.)  Scheme includes predicates to test the type of a
value: null? pair? boolean? number? integer? string? ...

*) Heterogeneous: can mix values of different types in a list, e.g.

(list 3 4 "hi" () (list 5 6))  -->  (3 4 "hi" () (5 6))

*) Nestable: as evidenced above, lists can contain lists, to arbitrary
depth.  Good for simple databases:

(define Zips (list (cons "Seattle" 98195)
                   (cons "Boston"  02115)
                   (cons "Reston"  22091)))
  -->  (("Seattle" . 98195) ("Boston" . 2115) ("Reston" . 22091))
  [02115 is read as the number 2115; use a string to keep a leading zero]

++) simple, regular, flexible data structure
++) subsumes arrays (homogeneous, variable length) and records
(heterogeneous, fixed length) due to dynamic typing.  (but not as fast
to access interior elements as either arrays or records)
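
The properties above can be sketched together in a few lines: the type
predicates classify the elements of a heterogeneous list, and a nested
list of pairs acts as a small lookup table.  The names describe and
zip-table are made up for this illustration, and storing the zip codes
as strings is an assumption made here so that leading zeros survive.

```scheme
;; describe is a hypothetical helper: it names the type of a value using
;; the standard type predicates.
(define (describe v)
  (cond ((null? v)    'empty-list)
        ((pair? v)    'pair)
        ((number? v)  'number)
        ((string? v)  'string)
        ((boolean? v) 'boolean)
        (else         'other)))

(map describe (list 3 "hi" '() (list 5 6) #t))
;; (number string empty-list pair boolean)

;; A small table of (city . zip) pairs, with zips kept as strings.
(define zip-table (list (cons "Seattle" "98195")
                        (cons "Boston"  "02115")
                        (cons "Reston"  "22091")))

;; The built-in assoc compares keys with equal?, so string keys work.
(cdr (assoc "Boston" zip-table))   ; "02115"
(assoc "Denver" zip-table)         ; #f
```

Lookup walks the table from the front, which is the linear access cost
mentioned above.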

------------------------------------------------------------------------

4.4.

** Quoting

Special form & syntax included for literal data structures (more
powerful than in most other languages).

* quote special form: return argument expression *as a list data
structure w/o evaluating it*

(list 3 (list 4 5) (cons 6 7))  -->  (3 (4 5) (6 . 7))

(quote (3 (4 5) (6 . 7)))  -->  (3 (4 5) (6 . 7))

Special syntax for quote special form: 'expr instead of (quote expr)

'(3 (4 5) (6 . 7))  -->  (3 (4 5) (6 . 7))

Can quote non-list expressions, too:

(quote x)  -->  x
'x  -->  x
'X  -->  x

(a quoted variable name is called a *symbol*)

(capitalization of identifiers & symbols is insignificant in the Scheme
assumed here, which canonicalizes using all lower-case letters; the
later R6RS and R7RS standards are case-sensitive)

(define PhoneBook '((Bob 5241235)
                    (Sue 3459876)
                    (Don 1235678)))
  -->  ((bob 5241235) (sue 3459876) (don 1235678))

(Use strings if capitalization is important, or multiple words are
needed)

(define PhoneBook '(("Bob W." "524-1235")
                    ("Sue N." "345-9876")
                    ("Don A." "123-5678")))
  -->  (("Bob W." "524-1235") ("Sue N." "345-9876") ("Don A." "123-5678"))

** program expressions and data use the same syntax, and quoting
allows program expressions to be used easily as data
** data can be treated as a program expression using the eval function
** Lisps are good for writing program manipulators, transformers, and
evaluators, since parsing & pretty printing come for free, and
programs are naturally represented as trees.
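
To illustrate programs as data, here is a minimal sketch of an evaluator
for quoted arithmetic expressions.  The name calc is made up for this
illustration, and it handles only numbers and two-argument + and *; it
is a toy, not a substitute for eval.

```scheme
;; calc is a hypothetical helper: it evaluates a quoted expression built
;; from numbers, (+ a b), and (* a b) by walking it as a list.
(define (calc expr)
  (cond ((number? expr) expr)
        ((and (pair? expr) (eq? (car expr) '+))
         (+ (calc (cadr expr)) (calc (caddr expr))))
        ((and (pair? expr) (eq? (car expr) '*))
         (* (calc (cadr expr)) (calc (caddr expr))))
        (else (error "calc: unsupported expression" expr))))

(define e '(+ 1 (* 2 3)))   ; a program expression, held as data

(car e)      ; +          the operator is just a symbol
(caddr e)    ; (* 2 3)    a subexpression is just a sublist
(calc e)     ; 7
```

No parser is needed: the reader has already turned the text into a
tree of lists, symbols, and numbers.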

-------------------------------------------------------------------------

4.5.

** Binding, Sharing, and Equality

A variable binding (define, let) makes the variable refer to its
argument value (like a pointer); no copying takes place.

Parameter passing works the same way, binding a formal to the actual parameter
value [not call-by-value or call-by-reference, but call-by-sharing or
call-by-pointer-value].  Returns work the same way.

Cons allocates a fresh memory cell, different from any other.
The new memory cell refers to its argument values; no copying takes place.
The same is true for list and other allocators.

* A pure reference-oriented memory model *

[draw pictures]
(define l1 (cons 3 4))  -->  (3 . 4)
(define l2 l1)          -->  (3 . 4)
(define l3 (cons 3 4))  -->  (3 . 4)
(define l4 (cons l1 l3))-->  ((3 . 4) 3 . 4)  [! note weird printing!]
(define l5 (cdr l4))    -->  (3 . 4)

But how can you tell?

-------------------------------------------------------------------------

4.6.

** Equality

What does it mean for two values to be equal?

*) they are the same identical object in memory (e.g. two variables
that refer to the same value returned from a single call to cons):  eq?

*) they have the same "abstract" value, i.e., they print out the same
(structural equality):  equal?

*) they are numerically the same number:  =

(define l1 (cons 3 4))  -->  (3 . 4)
(define l2 l1)          -->  (3 . 4)
(define l3 (cons 3 4))  -->  (3 . 4)
(define l4 (cons l1 l3))-->  ((3 . 4) 3 . 4)
(define l5 (cdr l4))    -->  (3 . 4)

(eq? l1 l2)  -->  #t
(eq? l1 l3)  -->  #f   [printed as () by systems that do not distinguish #f from ()]
(equal? l1 l2)  -->  #t
(equal? l1 l3)  -->  #t
(equal? l4 (cons l2 l2))  -->  #t
(eq? l5 l3)  -->  #t
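
As a sketch of what structural equality means, the following compares
two values piece by piece.  The name same-structure? is made up for this
illustration, and it covers only pairs, strings, and simple atoms; the
built-in equal? handles more types.

```scheme
;; same-structure? is a hypothetical, simplified version of equal?:
;; pairs are equal if their cars and cdrs are, strings are compared by
;; contents, and everything else falls back to eqv?.
(define (same-structure? a b)
  (cond ((and (pair? a) (pair? b))
         (and (same-structure? (car a) (car b))
              (same-structure? (cdr a) (cdr b))))
        ((and (string? a) (string? b)) (string=? a b))
        (else (eqv? a b))))

(define x (cons 3 4))
(define y (cons 3 4))       ; a second, separately allocated pair

(eq? x y)                   ; #f  different objects in memory
(same-structure? x y)       ; #t  same contents
(same-structure? (list 1 (list 2 "a")) (list 1 (list 2 "a")))   ; #t
(same-structure? (list 1 2) (list 1 3))                         ; #f
```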

------------------------------------------------------------------------

4.7.

** Side-effects

Sharing matters in the face of side-effects.

(set! var expr) modifies the lexically enclosing binding of var to
refer to the result of evaluating expr.  Changes the
binding/reference, not the old or new value.

(set-car! expr_list expr_newfirst) modifies the cons cell's first
element to be the new value.  The old first element isn't affected in
any way, but the cons cell (and all other things that refer to it)
are.

(set-cdr! expr_list expr_newrest) similarly modifies the cons cell's
second (tail) element.

[! is convention for side-effecting operation]

(set-car! l1 5)  -->  <no value>
l1  -->  (5 . 4)
l2  -->  (5 . 4)
l3  -->  (3 . 4)
l4  -->  ((5 . 4) 3 . 4)

(set-cdr! l2 '(6 7))
l1  -->  (5 6 7)
l2  -->  (5 6 7)
l3  -->  (3 . 4)
l4  -->  ((5 6 7) 3 . 4)

(set! l1 (cons 8 9))
l1  -->  (8 . 9)
l2  -->  (5 6 7)
l3  -->  (3 . 4)
l4  -->  ((5 6 7) 3 . 4)

** side effects are used rarely in Lisp and other functional languages.
much more common is creating new temporary data structures from older
data structures.  avoiding side effects helps reasoning about program
behavior, because sharing (or not) is then not semantically important,
and things can be freely shared (e.g. for memory & time savings and
programmer convenience) without worrying about them being accidentally
modified.  only "global" data structures representing the state of
whatever's being modeled get changed, when that's the appropriate thing
to do from a modeling standpoint.
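
A minimal sketch of the functional alternative to set-car!: build a new
list that shares the unchanged tail of the old one.  The names
with-first, original, and updated are made up for this illustration.

```scheme
;; with-first is a hypothetical helper: it returns a new list whose
;; first element is new-first and whose rest is shared with lst.
;; lst itself is not modified.
(define (with-first new-first lst)
  (cons new-first (cdr lst)))

(define original (list 1 2 3))
(define updated  (with-first 99 original))

original                             ; (1 2 3)   unchanged
updated                              ; (99 2 3)
(eq? (cdr original) (cdr updated))   ; #t   the tail (2 3) is shared
```

Only one new pair is allocated, and the sharing is harmless as long as
nobody mutates the shared tail.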

------------------------------------------------------------------------

4.8.

** Recursion over lists **

Lists are recursive data types:

List ::= nil                 [base case]
       | cons(data, List)    [inductive case]

Recursive data types are well-suited for manipulation by recursive
functions whose structure matches the data type structure.

(define (f lst)
   (if (null? lst) <base expr>
       <inductive expr... (car lst) ... (f (cdr lst)) ...>))

E.g.:

*) first example:
(define (sum lst)
   (if (null? lst) 0
       (+ (car lst) (sum (cdr lst)))))


*) more interesting if/cond/or control structures
(define (member elem lst)   ;; a version is built-in
   (if (null? lst) #f
       (if (equal? (car lst) elem) #t   ;; a choice of equality test fn
           (member elem (cdr lst)))))
OR
(define (member elem lst)
   (cond ((null? lst) #f)
         ((equal? (car lst) elem) #t)
         (else (member elem (cdr lst)))))
OR
(define (member elem lst)
   (and (not (null? lst))
        (or (equal? (car lst) elem)
            (member elem (cdr lst)))))


*) more interesting list data structures
(define (assoc key alst)    ;; alst = list of key-value pairs/lists/...
   (if (null? alst) #f      ;; a version is built-in
      (let* ((entry (car alst))
             (k (car entry)))
         (if (equal? k key) entry
             (assoc key (cdr alst))))))


*) which list to recur on?
(define (append lst1 lst2)
   (cond ((null? lst1) lst2)
         ((null? lst2) lst1)	;; optional
         (else (cons (car lst1) (append (cdr lst1) lst2)))))


*) recur over two lists!
(define (merge lst1 lst2)  ;; assume lst1 & lst2 are sorted
   (cond ((null? lst1) lst2)   ;; base cases: one list is used up
         ((null? lst2) lst1)
         (else
          (let ((h1 (car lst1))
                (h2 (car lst2)))
             (if (<= h1 h2) (cons h1 (merge (cdr lst1) lst2))
                            (cons h2 (merge lst1 (cdr lst2))))))))


Other examples:

reverse

append!
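
One possible sketch of these two, under assumptions: the names
my-reverse and my-append! are made up here to avoid clashing with
built-in versions, my-reverse is the simple quadratic-time version, and
my-append! assumes an implementation with mutable pairs (set-cdr!).

```scheme
;; my-reverse: reverse the rest, then put the first element at the end.
(define (my-reverse lst)
  (if (null? lst)
      '()
      (append (my-reverse (cdr lst)) (list (car lst)))))

;; my-append!: destructive append.  Instead of copying lst1, find its
;; last pair and make that pair's cdr refer to lst2.
(define (my-append! lst1 lst2)
  (define (splice! p)            ;; p = a pair of lst1
    (if (null? (cdr p))
        (set-cdr! p lst2)
        (splice! (cdr p))))
  (if (null? lst1)
      lst2
      (begin (splice! lst1) lst1)))

(my-reverse (list 1 2 3))   ; (3 2 1)

(define a (list 1 2))       ; built with list, so safe to mutate
(define b (list 3 4))
(my-append! a b)            ; (1 2 3 4)
a                           ; (1 2 3 4)   a itself was changed
(eq? (cddr a) b)            ; #t          b is now the tail of a
```

Contrast with the functional append above, which leaves lst1 untouched
by copying its pairs.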

------------------------------------------------------------------------

4.9.

** Recursion vs. iteration:

*) recursion is more general than iteration (e.g. when a function makes
more than one recursive call per invocation: merge doesn't count, since
it takes only one of its two recursive calls each time, but tree
traversals do); iteration may require an explicit stack to simulate
recursion

*) recursion can be less efficient than iteration (unless iteration
requires a stack): stack allocation & deallocation & parameter-passing

BUT: tail recursion can be implemented just as efficiently as
iteration

tail recursion: recursive call is last thing caller does before
returning, and it returns the result of the recursive call as the
result of the caller.

In above examples:
+ assoc & member are tail-recursive fns
+ sum, append, merge are not

implement tail-recursive calls (or in general, tail-calls of any sort)
as a direct jump to callee entry point, without creating a new stack
frame; let callee reuse stack frame of caller, and return to caller's
caller directly.  constant-time and zero space for tail-calls!

tail-recursion required to be implemented this efficiently in Scheme!

can often rewrite a non-tail-recursive function as a tail recursive
one, typically by introducing a helper function that has an extra
accumulator argument that builds up the result.

E.g.

(define (sum lst)
   (define (sum-helper lst res)  ;; res = sum of all earlier elements
      (if (null? lst) res
         (sum-helper (cdr lst) (+ res (car lst)))))
   (sum-helper lst 0))

append?  merge?
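
One possible answer, sketched with the same accumulator idea.  The names
reverse-onto, append-tr, merge-tr, and go are made up for this
illustration.  The accumulator collects elements in reverse order, so a
final tail-recursive pass puts them back in order.

```scheme
;; reverse-onto: push the elements of lst, one by one, onto acc.
;; Tail-recursive; (reverse-onto lst '()) is the reverse of lst.
(define (reverse-onto lst acc)
  (if (null? lst)
      acc
      (reverse-onto (cdr lst) (cons (car lst) acc))))

;; append-tr: reverse lst1, then push it back in order onto lst2.
(define (append-tr lst1 lst2)
  (reverse-onto (reverse-onto lst1 '()) lst2))

;; merge-tr: assumes lst1 & lst2 are sorted.
(define (merge-tr lst1 lst2)
  (define (go l1 l2 acc)     ;; acc = merged prefix, in reverse order
    (cond ((null? l1) (reverse-onto acc l2))
          ((null? l2) (reverse-onto acc l1))
          ((<= (car l1) (car l2)) (go (cdr l1) l2 (cons (car l1) acc)))
          (else (go l1 (cdr l2) (cons (car l2) acc)))))
  (go lst1 lst2 '()))

(append-tr (list 1 2 3) (list 4 5))     ; (1 2 3 4 5)
(merge-tr (list 1 4 6) (list 2 3 7))    ; (1 2 3 4 6 7)
```

The trade-off is that each version walks its input twice and allocates
a temporary reversed list, in exchange for constant stack space.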