** 4. Lists **

4.1. The main data structure in Scheme (and most other functional
languages) is the singly linked list, built from pairs.

3 main operations on pairs/lists:
    cons: construct a new pair
    car:  extract the first component of the pair
    cdr:  extract the second component of the pair

    (define p (cons 3 4))           --> (3 . 4)    [and draw picture]
    (car p)                         --> 3
    (cdr p)                         --> 4
    (define q (cons p (cons p p)))  --> ((3 . 4) . ((3 . 4) . (3 . 4)))
                                        [draw picture]
    [p is shared 3 times in q, but don't worry about it yet.]

-------------------------------------------------------------------

4.2. Lists are made by making the cdr field be (refer to) another list.
The end of the list is marked with a special nil constant, written ().
[Standard Scheme requires it to be quoted in expressions, as '();
quoting is covered in 4.4. Some implementations also accept a bare ().]

    '()                       --> ()
    (define l1 (cons 3 '()))  --> (3)        [same as (3 . nil)]
    (define l2 (cons 2 l1))   --> (2 3)      [same as (2 . (3 . nil))]
    (define l3 (cons 1 l2))   --> (1 2 3)    [same as (1 . (2 . (3 . nil)))]
    [draw pictures]

When interpreting nested pairs as lists, car becomes first and cdr
becomes rest:

    (car l3)                   --> 1
    (cdr l3)                   --> (2 3)
    (car (cdr l3))             --> 2
    (cdr (cdr l3))             --> (3)
    (car (cdr (cdr l3)))       --> 3
    (cdr (cdr (cdr l3)))       --> ()

A convenience:
    (list 3 4 5)               --> (3 4 5)
    (list)                     --> ()

-------------------------------------------------------------------------

4.3. ** Properties of lists:

*) cons is functional: it creates a new pair without modifying its
   arguments or any other value. Similarly, car and cdr extract parts
   of their argument pair without modifying the pair or the pieces.
   Pairs no longer referenced have their space implicitly reclaimed and
   recycled for new pairs.
   [refer to above examples of cons, car, and cdr, and pictures]

   => ** garbage collection **
   ++) no dangling pointers (free too early)
   ++) no storage leaks (free too late) [not exactly]
   => huge productivity, safety boon

*) Dynamically (type) checked: car or cdr of a non-pair is a run-time
   error. (Similarly, + of a non-number is a run-time error.)
   Scheme includes predicates to test the type of a value:
       null? pair? boolean? number? integer? string? ...

*) Heterogeneous: can mix values of different types in a list, e.g.
       (list 3 4 "hi" '() (list 5 6))  --> (3 4 "hi" () (5 6))

*) Nestable: as evidenced above, lists can contain lists, to arbitrary
   depth. Good for simple databases:
       (define Zips (list (cons "Seattle" 98195)
                          (cons "Boston" 02115)
                          (cons "Reston" 22091)))
       --> (("Seattle" . 98195) ("Boston" . 2115) ("Reston" . 22091))
       [the leading zero of 02115 is lost because the ZIP codes are
        numbers; use strings such as "02115" to preserve it]

   ++) simple, regular, flexible data structure
   ++) subsumes arrays (homogeneous, variable length) and records
       (heterogeneous, fixed length) due to dynamic typing
       (but not as fast to access interior elements as either arrays
       or records)

------------------------------------------------------------------------

4.4. ** Quoting

Special form & syntax included for literal data structures (more
powerful than in most other languages).

* quote special form: return the argument expression *as a list data
  structure w/o evaluating it*

    (list 3 (list 4 5) (cons 6 7))  --> (3 (4 5) (6 . 7))
    (quote (3 (4 5) (6 . 7)))       --> (3 (4 5) (6 . 7))

Special syntax for quote special form: 'expr instead of (quote expr)
    '(3 (4 5) (6 . 7))              --> (3 (4 5) (6 . 7))

Can quote non-list expressions, too:
    (quote x)  --> x
    'x         --> x
    'X         --> x
(a quoted variable name is called a *symbol*)
(capitalization of identifiers & symbols is insignificant; Scheme
 canonicalizes using all lower-case letters)
 [true of R5RS and earlier; R6RS and R7RS are case-sensitive]

    (define PhoneBook '((Bob 5241235) (Sue 3459876) (Don 1235678)))
    --> ((bob 5241235) (sue 3459876) (don 1235678))

(Use strings if capitalization is important, or multiple words are needed)

    (define PhoneBook '(("Bob W." "524-1235")
                        ("Sue N." "345-9876")
                        ("Don A." "123-5678")))
    --> (("Bob W." "524-1235") ("Sue N." "345-9876") ("Don A." "123-5678"))

** program expressions and data use the same syntax, and quoting allows
   program expressions to be used easily as data
** data can be treated as a program expression using the eval function
** Lisps are good for writing program manipulators, transformers, and
   evaluators, since parsing & pretty printing come for free, and
   programs are naturally represented as trees.

-------------------------------------------------------------------------

4.5. ** Binding, Sharing, and Equality

A variable binding (define, let) makes the variable refer to its
argument value (like a pointer); no copying takes place.

Parameter passing works the same way, binding a formal to the actual
parameter value [not call-by-value or call-by-reference, but
call-by-sharing or call-by-pointer-value]. Returns work the same way.

Cons allocates a fresh memory cell, different from any other. The new
memory cell refers to its argument values; no copying takes place.
The same is true for list and other allocators.

* A pure reference-oriented memory model *
[draw pictures]

    (define l1 (cons 3 4))    --> (3 . 4)
    (define l2 l1)            --> (3 . 4)
    (define l3 (cons 3 4))    --> (3 . 4)
    (define l4 (cons l1 l3))  --> ((3 . 4) 3 . 4)   [! note weird printing!]
        [the cdr of l4 is itself a pair, so the printer drops the dot
         and the inner parentheses]
    (define l5 (cdr l4))      --> (3 . 4)

All of these print the same, but some are shared and some are not.
But how can you tell?

-------------------------------------------------------------------------

4.6. ** Equality

What does it mean for two values to be equal?

*) they are the same identical object in memory (e.g. two variables
   that refer to the same value returned from a single call to cons):
       eq?
*) they have the same "abstract" value, i.e., they print out the same
   (structural equality):
       equal?
*) they are numerically the same number:
       =

    (define l1 (cons 3 4))    --> (3 . 4)
    (define l2 l1)            --> (3 . 4)
    (define l3 (cons 3 4))    --> (3 . 4)
    (define l4 (cons l1 l3))  --> ((3 . 4) 3 . 4)
    (define l5 (cdr l4))      --> (3 . 4)

    (eq? l1 l2)               --> #t
    (eq? l1 l3)               --> #f
        [#f is the same object as () in some older implementations]
    (equal? l1 l2)            --> #t
    (equal? l1 l3)            --> #t
    (equal? l4 (cons l2 l2))  --> #t
    (eq? l5 l3)               --> #t

------------------------------------------------------------------------

4.7. ** Side-effects

Sharing matters in the face of side-effects.

(set! var expr)
    modifies the lexically enclosing binding of var to refer to the
    result of evaluating expr. Changes the binding/reference, not the
    old or new value.

(set-car! expr_list expr_newfirst)
    modifies the cons cell's first element to be the new value. The
    old first element isn't affected in any way, but the cons cell
    (and all other things that refer to it) are.

(set-cdr! expr_list expr_newrest)
    similarly modifies the cons cell's second (tail) element.

[! is the convention for a side-effecting operation]

Continuing with l1 ... l4 from 4.6:

    (set-car! l1 5)
    l1 --> (5 . 4)
    l2 --> (5 . 4)
    l3 --> (3 . 4)
    l4 --> ((5 . 4) 3 . 4)

    (set-cdr! l2 '(6 7))
    l1 --> (5 6 7)
    l2 --> (5 6 7)
    l3 --> (3 . 4)
    l4 --> ((5 6 7) 3 . 4)

    (set! l1 (cons 8 9))
    l1 --> (8 . 9)
    l2 --> (5 6 7)
    l3 --> (3 . 4)
    l4 --> ((5 6 7) 3 . 4)

** side effects are used rarely in Lisp and other functional languages.
   much more common is creating new temporary data structures from
   older data structures. This helps reasoning about program behavior,
   because sharing (or not) is not semantically important, and things
   can be freely shared (e.g. for memory & time savings and programmer
   convenience) without worrying about them being accidentally
   modified. only "global" data structures representing the state of
   whatever's being modeled get changed, when that's the appropriate
   thing to do from a modeling standpoint.

------------------------------------------------------------------------

4.8. ** Recursion over lists **

Lists are recursive data types:

    List ::= nil               [base case]
           | cons(data, List)  [inductive case]

Recursive data types are well-suited for manipulation by recursive
functions whose structure matches the data type structure.

    (define (f lst)
      (if (null? lst)
          ... base case ...
          ... inductive case, using (car lst) and (f (cdr lst)) ...))

E.g.:

*) first example:

    (define (sum lst)
      (if (null? lst)
          0
          (+ (car lst) (sum (cdr lst)))))

*) more interesting if/cond/or control structures

    (define (member elem lst)            ;; a version is built-in
      (if (null? lst)
          #f
          (if (equal? (car lst) elem)    ;; a choice of equality test fn
              #t
              (member elem (cdr lst)))))
  OR
    (define (member elem lst)
      (cond ((null? lst) #f)
            ((equal? (car lst) elem) #t)
            (else (member elem (cdr lst)))))
  OR
    (define (member elem lst)
      (and (not (null? lst))
           (or (equal? (car lst) elem)
               (member elem (cdr lst)))))

*) more interesting list data structures

    (define (assoc key alst)   ;; a version is built-in
      ;; alst = list of key-value pairs/lists/...
      (if (null? alst)
          '()                  ;; not found (the built-in returns #f)
          (let* ((entry (car alst))
                 (k (car entry)))
            (if (equal? k key)
                entry
                (assoc key (cdr alst))))))

*) which list to recur on?

    (define (append lst1 lst2)
      (cond ((null? lst1) lst2)
            ((null? lst2) lst1)    ;; optional
            (else (cons (car lst1)
                        (append (cdr lst1) lst2)))))

*) recur over two lists!

    (define (merge lst1 lst2)      ;; assume lst1 & lst2 are sorted
      (cond ((null? lst1) lst2)    ;; base cases: one list is used up
            ((null? lst2) lst1)
            (else
             (let ((h1 (car lst1)) (h2 (car lst2)))
               (if (<= h1 h2)
                   (cons h1 (merge (cdr lst1) lst2))
                   (cons h2 (merge lst1 (cdr lst2))))))))

Other examples:
    reverse
    append!

------------------------------------------------------------------------

4.9. ** Recursion vs. iteration:

*) recursion is more general than iteration (e.g. when more than one
   recursive call is made per invocation; merge doesn't count, since it
   takes only one of its recursive branches at each step, but tree
   traversals do); iteration may require an explicit stack to simulate
   recursion

*) recursion can be less efficient than iteration (unless the iteration
   requires a stack): stack allocation & deallocation & parameter-passing

BUT: tail recursion can be implemented just as efficiently as iteration

   tail recursion: the recursive call is the last thing the caller does
   before returning, and the caller returns the result of the recursive
   call as its own result.
In above examples:
    + assoc & member are tail-recursive fns
    + sum, append, merge are not

Implement tail-recursive calls (or in general, tail-calls of any sort)
as a direct jump to the callee entry point, without creating a new
stack frame; let the callee reuse the stack frame of the caller, and
return to the caller's caller directly.
    constant time and zero extra stack space for tail-calls!
    tail-recursion is required to be implemented this efficiently in Scheme!

Can often rewrite a non-tail-recursive function as a tail-recursive
one, typically by introducing a helper function that has an extra
accumulator argument that builds up the result. E.g.

    (define (sum lst)
      (define (sum-helper lst res)   ;; res = sum of all earlier elements
        (if (null? lst)
            res
            (sum-helper (cdr lst) (+ res (car lst)))))
      (sum-helper lst 0))

append? merge?
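
One possible answer to the append and merge questions, as a sketch
only. The names reverse-onto, append-tr, and merge-tr are hypothetical
(they are not built-ins and do not appear earlier in these notes), and
merge-tr assumes both arguments are sorted lists of numbers. The idea
is the same accumulator trick as sum-helper: build the result in
reverse, then reverse it back, so every recursive call is a tail call.

```scheme
;; Hypothetical helper: push the elements of lst onto acc one at a time,
;; which reverses them. The recursive call is in tail position.
(define (reverse-onto lst acc)
  (if (null? lst)
      acc
      (reverse-onto (cdr lst) (cons (car lst) acc))))

;; Tail-recursive append: reverse lst1, then push it back onto lst2.
;; lst2 is shared with the result, not copied.
(define (append-tr lst1 lst2)
  (reverse-onto (reverse-onto lst1 '()) lst2))

;; Tail-recursive merge of two sorted lists of numbers.
(define (merge-tr lst1 lst2)
  (define (helper l1 l2 res)   ;; res = elements merged so far, reversed
    (cond ((null? l1) (reverse-onto res l2))
          ((null? l2) (reverse-onto res l1))
          ((<= (car l1) (car l2))
           (helper (cdr l1) l2 (cons (car l1) res)))
          (else
           (helper l1 (cdr l2) (cons (car l2) res)))))
  (helper lst1 lst2 '()))
```

    (append-tr '(1 2) '(3 4))      --> (1 2 3 4)
    (merge-tr '(1 4 6) '(2 3 7))   --> (1 2 3 4 6 7)

The trade-off: these versions use constant stack space, but they
allocate an intermediate reversed list that the non-tail-recursive
versions avoid.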