CSE341 Notes for Friday, 5/24/24

I finished our discussion of mixins. We had been looking at the Point class. Consider the following assignments:

        p1 = Point.new(3, 5)
        p2 = p1
        p3 = Point.new(3, 5)

I asked what Ruby would say when we asked whether these different objects were equal to each other using the == operator. We found that p1 and p2 were equal but p1 and p3 were not. By default, Ruby uses the strictest definition of equality, which is object identity (the two variables refer to the same object).
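Here is a minimal sketch of that default behavior. The Point class below is a bare-bones stand-in (an assumption, since the course's actual class is not reproduced in these notes); it defines no == of its own, so it inherits the identity-based version from Object:

```ruby
# Stand-in Point class (assumed). With no == defined here, Ruby falls
# back on Object#==, which is true only for the very same object.
class Point
  attr_reader :x, :y

  def initialize(x = 0, y = 0)
    @x = x
    @y = y
  end
end

p1 = Point.new(3, 5)
p2 = p1                 # a second name for the same object
p3 = Point.new(3, 5)    # a distinct object with the same coordinates

puts p1 == p2        # true
puts p1 == p3        # false
puts p1.equal?(p2)   # true; equal? always tests identity
```

The equal? method is the one to use when identity really is what you want, since it is not meant to be overridden.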

In the previous lecture we looked at an implementation of the spaceship operator (<=>) for the Point class. Normally we would also say:

        include Comparable

I left that off of the class so that we could do it in the interpreter. This allowed us to see how the methods changed when we added this to the class.

        >> p = Point.new(3, 5)
        => #<Point ...>
        >> m1 = p.methods
        => 
        [:y,
        ...
        >> class Point
        >>   include Comparable
        >>   end
        => Point
        >> m2 = p.methods
        => 
        [:y,
        ...
        >> m2.length - m1.length
        => 6
        >> m2 - m1
        => [:clamp, :<=, :>=, :<, :>, :between?]

But we found that it also redefined equality:

        >> p1 == p2
        => true
        >> p1 == p3
        => true
        >> p1 == Point.new(5, 3)
        => true

Before, the definition was too strict. Now it is too loose. It says that the point (3, 5) is equal to the point (5, 3). That's because Comparable defines == in terms of the spaceship operator, and our spaceship operator uses distance from the origin to determine the order of values, so two points at the same distance compare as equal. So we introduced our own version of the equals operator:

        def == other
          return (x == other.x and y == other.y)
        end
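The whole progression can be sketched in one script. The <=> below compares squared distance from the origin; that is an assumption standing in for the version from the previous lecture, which is not shown in these notes:

```ruby
class Point
  include Comparable
  attr_reader :x, :y

  def initialize(x = 0, y = 0)
    @x = x
    @y = y
  end

  # Assumed ordering: by squared distance from the origin.
  def <=>(other)
    (x * x + y * y) <=> (other.x * other.x + other.y * other.y)
  end
end

a = Point.new(3, 5)
b = Point.new(5, 3)
puts a == b    # true: Comparable#== only asks whether <=> returns 0

# Reopen the class and add a coordinate-wise ==. A method defined in
# the class itself takes precedence over one mixed in by a module.
class Point
  def ==(other)
    x == other.x && y == other.y
  end
end

puts a == b                  # false: the coordinates differ
puts a == Point.new(3, 5)    # true
puts(a <=> b)                # 0: the ordering itself is unchanged
puts a.between?(Point.new(1, 1), Point.new(9, 9))   # true
```

Note that after the change a <=> b can be 0 while a == b is false, so the ordering and the equality test no longer agree with each other.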

I spent a little time discussing a subclass called Point3D that would include a z-coordinate. We started with this definition:

        class Point3D < Point
          def initialize(x = 0, y = 0, z = 0)
            super(x, y)
            @z = z
          end
        
          def to_s
            super[0..-2] + ", " + z.to_s + ")"
          end
        
        protected
          attr_reader :z
          attr_writer :z
        end

This is a well-known example of where you run into trouble with an equality operator and an inheritance relationship. This class inherits a version of the equals operator that looks at just the x and y coordinates. We can fix that by having it also check the z coordinate:

        def == other
          # super calls the inherited Point version, which compares x and y
          return (super and z == other.z)
        end

This gives the right answer when you compare two Point3D objects, but it gives the wrong answer when a regular Point compares itself to a Point3D. It can return true when it shouldn't because the inherited Point version never checks the z-coordinate. And if you turn the comparison around and ask the Point3D whether it equals the regular Point, you get an error when it tries to access the z component of the simple Point. There is not a great solution to this problem other than to avoid inheritance in this case.
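The asymmetry is easy to reproduce. This is a sketch under the assumption that Point has public x and y readers and a coordinate-wise ==; the variable names flat and solid are just for illustration:

```ruby
class Point
  attr_reader :x, :y

  def initialize(x = 0, y = 0)
    @x = x
    @y = y
  end

  def ==(other)
    x == other.x && y == other.y
  end
end

class Point3D < Point
  def initialize(x = 0, y = 0, z = 0)
    super(x, y)
    @z = z
  end

  # Inherited x/y check first, then the extra coordinate.
  def ==(other)
    super(other) && z == other.z
  end

  protected

  attr_reader :z
end

flat = Point.new(3, 5)
solid = Point3D.new(3, 5, 7)

puts flat == solid   # true, even though solid has z = 7

begin
  result = (solid == flat)
  puts result
rescue NoMethodError => e
  # A plain Point has no z method, so other.z fails.
  puts "NoMethodError: #{e.message}"
end
```

The error only appears when the x and y coordinates match. If they differ, the inherited check returns false and the && short-circuits before other.z is ever evaluated.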

The other common Ruby mixin is Enumerable. It defines a series of methods built on top of the each method. Remember that we defined a MyRange class with an each method:

        class MyRange
          def initialize(first, last)
            @first = first
            @last = last
          end
        
          def each
            i = @first
            while i <= @last
              yield i
              i += 1
            end
          end
        end

By including the Enumerable mixin, we get 58 new methods. I demonstrated it this way in the interpreter:

        >> x = MyRange.new(1, 10)
        => #<MyRange:0x00007f9593d1df00 @first=1, @last=10>
        >> m1 = x.methods
        => 
        [:first,
        ...
        >> class MyRange
        >>   include Enumerable
        >>   end
        => MyRange
        >> m2 = x.methods
        => 
        [:first,
        ...
        >> m1.length
        => 69
        >> m2.length
        => 127
        >> m2.length - m1.length
        => 58

And we can see the list of the actual methods by saying:

        >> m2 - m1
        => 
        [:slice_after, :slice_when, :chunk_while, :sum, :uniq, :chain, :lazy,
         :to_h, :include?, :max, :min, :to_set, :find, :to_a, :entries, :sort,
         :sort_by, :grep, :grep_v, :count, :detect, :find_index, :find_all,
         :select, :filter_map, :reject, :collect, :map, :flat_map,
         :collect_concat, :inject, :reduce, :partition, :group_by, :tally,
         :all?, :any?, :one?, :none?, :minmax, :min_by, :max_by, :minmax_by,
         :member?, :each_with_index, :reverse_each, :each_entry, :each_slice,
         :each_cons, :each_with_object, :zip, :take, :take_while, :drop,
         :drop_while, :cycle, :chunk, :slice_before]
        >>

So by defining a single method (each) and using the mixin, we can get 58 very useful methods. This is certainly easier from a programming point of view and it is easier to create consistency across different types of objects when they all refer to the same mixin.
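As a quick illustration, here are a few of those methods applied to a MyRange. This sketch repeats the class definition with the include written inside it so that the block is self-contained:

```ruby
class MyRange
  include Enumerable

  def initialize(first, last)
    @first = first
    @last = last
  end

  # The only iteration logic we write ourselves.
  def each
    i = @first
    while i <= @last
      yield i
      i += 1
    end
  end
end

r = MyRange.new(1, 10)
p r.map { |n| n * n }     # [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
p r.select(&:even?)       # [2, 4, 6, 8, 10]
p r.reduce(:+)            # 55
p r.include?(7)           # true
p r.each_slice(4).to_a    # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]
```

Every one of these calls ends up driving our each method; none of them needs to know how MyRange stores its data.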

Then I switched to a topic we didn't have time for earlier when we discussed Scheme. I wanted to explore how mutation works in Scheme. We've seen that you can call define to bind and rebind a value:

        > (define x 34)
        > x
        34
        > (define x 18.9)
        > x
        18.9

But this all takes place in the top-level environment. For example, if I have a function that tries to rebind x, the global x is unchanged:

        > (define (foo) (define x 13) x)
        > (foo)
        13
        > x
        18.9

In this case it defines a local x that has value 13, but that has no effect on the global x in the top-level environment.

There is an alternative. You need to use define to introduce new identifiers into the environment, but once introduced, you can call set! to change their values. The convention in Scheme is to have an exclamation mark at the end of the name of any function that is a mutating function. So if we write our function using set! instead, we end up changing the actual global variable:

        > (define (foo) (set! x 13) x)
        > (foo)
        13
        > x
        13

I then talked about how the mutating functions can be used to create a local variable that only certain functions have access to:

        (define incr null)
        (define get null)
        (define m 3)
        (let ((n 0))
          (set! incr (lambda (i) (set! n (+ n i m))))
          (set! get (lambda () n)))

We saw that the get function would report the value of n and the incr function was able to increment n, but we can't access n from the top-level environment:

        > (get)
        0
        > (incr 3)
        > (get)
        6
        > (incr 5)
        > (get)
        14
        > n
        * reference to undefined identifier: n

The call on let introduces a local scope in which n is declared. Each lambda that appears inside the let causes Scheme to set up a closure that has a reference to that inner environment that has the variable n.
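The same closure idea can be sketched in Ruby, whose lambdas also capture the local variables that are in scope when they are created. The method name make_counter and its step parameter (standing in for the global m) are assumptions made for this illustration, not part of the lecture code:

```ruby
# Returns two lambdas that share the private local variable n.
# step plays the role of m in the Scheme version.
def make_counter(step)
  n = 0
  incr = lambda { |i| n += i + step }
  get = lambda { n }
  [incr, get]
end

incr, get = make_counter(3)
puts get.call    # 0
incr.call(3)
puts get.call    # 6
incr.call(5)
puts get.call    # 14
# n is not visible out here; only the two lambdas can reach it.
```

One difference from the Scheme version is that step is fixed when make_counter is called, whereas the Scheme incr looks up the global m each time it runs.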

We then spent time discussing a technique known as memoization in which we remember what values were returned by various calls on a function. The idea is similar to the idea of caching. As a function is called, we keep track of what values were returned for each call, memoizing the result. This is a useful technique that can be used in any programming language. It is particularly helpful for speeding up recursive definitions that compute the same value multiple times.

In a previous lecture we wrote an inefficient version of the fib function for computing Fibonacci numbers:

        (define (fib n)
          (if (<= n 1)
              1
              (+ (fib (- n 1)) (fib (- n 2)))))

The complexity of this function is exponential. We were able to rewrite it using a more iterative approach, but we can use memoization instead. We set up a global variable called answers with the first two values:

        (define answers '((1 . 1) (0 . 1)))

Then in writing the function, we first did a lookup against our list of answers. If we've already computed that Fibonacci number, then we just return its value. Otherwise we compute an answer and store it in our list of answers so that we never compute it again:

        (define (fib n)
          (let ([match (assoc n answers)])
            (if match
                (cdr match)
                (let ([new-answer (+ (fib (- n 1)) (fib (- n 2)))])
                  (set! answers (cons (cons n new-answer) answers))
                  new-answer))))

We saw that as we asked for higher values of fib, our global variable ended up with more memoized results:

        > answers
        ((1 . 1) (0 . 1))
        > (fib 5)
        8
        > answers
        ((5 . 8) (4 . 5) (3 . 3) (2 . 2) (1 . 1) (0 . 1))
        > (fib 20)
        10946
        > answers
        ((20 . 10946)
         (19 . 6765)
         (18 . 4181)
         (17 . 2584)
         (16 . 1597)
         (15 . 987)
         (14 . 610)
         (13 . 377)
         (12 . 233)
         (11 . 144)
         (10 . 89)
         (9 . 55)
         (8 . 34)
         (7 . 21)
         (6 . 13)
         (5 . 8)
         (4 . 5)
         (3 . 3)
         (2 . 2)
         (1 . 1)
         (0 . 1))

We then looked at how to localize answers. Instead of using a global variable, we can use a let inside the function and define a helper function that has access to it:

        (define (fib n)
          (let ([answers '((1 . 1) (0 . 1))])
            (define (helper n)
              (let ((match (assoc n answers)))
                (if match
                    (cdr match)
                    (let ((new-answer (+ (helper (- n 1)) (helper (- n 2)))))
                      (set! answers (cons (cons n new-answer) answers))
                      new-answer))))
            (helper n)))

This is a pretty good answer in that every time you call fib, it sets up a variable called answers that memoizes the results. But we can do even better. Why reconstruct answers every time you call fib? If we instead want to construct the answers just once, then we don't want fib to be defined as a normal function as we have done above. Instead, we want to set it up just once and make it a function that has access to answers. But we've already done that. Our helper function is the function we want fib to be, so all we have to do is assign fib to be the helper:

        (define fib
          (let ([answers '((1 . 1) (0 . 1))])
            (define (helper n)
              (let ([match (assoc n answers)])
                (if match
                    (cdr match)
                    (let ([new-answer (+ (helper (- n 1)) (helper (- n 2)))])
                      (set! answers (cons (cons n new-answer) answers))
                      new-answer))))
            helper))

This assignment happens just once, so that we construct the answers just once. That means that we'll never end up computing the same value of fib more than once.
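To see how much work memoization saves, here is a sketch in Ruby that counts calls for both approaches. The names slow_fib and make_fib and the one-element calls array used as a counter are assumptions introduced for this illustration; the base case matches the notes (fib of 0 and of 1 are both 1):

```ruby
# Naive version. calls is a one-element array used only to count
# how many times the function is invoked.
def slow_fib(n, calls)
  calls[0] += 1
  n <= 1 ? 1 : slow_fib(n - 1, calls) + slow_fib(n - 2, calls)
end

# Builds the memoized function once. The answers hash lives in the
# closure and persists across calls, like the final Scheme version.
def make_fib(calls)
  answers = { 0 => 1, 1 => 1 }
  helper = lambda do |n|
    calls[0] += 1
    answers[n] ||= helper.call(n - 1) + helper.call(n - 2)
  end
  helper
end

slow_calls = [0]
puts slow_fib(20, slow_calls)   # 10946
puts slow_calls[0]              # 21891

memo_calls = [0]
fib = make_fib(memo_calls)
puts fib.call(20)               # 10946
puts memo_calls[0]              # 39
fib.call(20)
puts memo_calls[0]              # 40: the second request is a single lookup
```

A hash is used here in place of the association list, so each lookup takes roughly constant time instead of a scan through the stored pairs.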

Stuart Reges

Last modified: Fri May 24 13:59:51 PDT 2024