CSE341 Notes for Wednesday, 6/3/09

I finished our discussion of the tree iterator. We looked again at the tedious code that Sun had to write to implement a "successor" method that moves from one node of a tree to the next to produce an inorder traversal.

Ruby gives us a solution that is simple and efficient. We had developed this code in section for a binary search tree:

        class Tree
          def initialize()
            @overallRoot = nil
          end
        
          def insert(v)
            @overallRoot = insert_helper(v, @overallRoot)
          end
        
          def print()
            print_helper(@overallRoot)
          end
        
          private # beginning of private definitions
        
          class Node
            def initialize(data = nil, left = nil, right = nil)
              @data = data
              @left = left
              @right = right
            end
        
            attr_reader :data, :left, :right
            attr_writer :data, :left, :right
          end
        
          def insert_helper(v, root)
            if root == nil
              root = Node.new(v)
            elsif v < root.data then
              root.left = insert_helper(v, root.left)
            else
              root.right = insert_helper(v, root.right)
            end
            return root
          end
        
          def print_helper(root)
            if root != nil then
              print_helper root.left
              puts root.data
              print_helper root.right
            end
          end
        end
We can define an iterator called inorder that looks a lot like the current print and print_helper methods. The big difference is that instead of calling puts to print values, they will call yield to generate values:

        class Tree
          ...
        
          def inorder
            inorder_helper(@overallRoot) {|n| yield n}
          end
        
          private
            def inorder_helper(root)
              if root then
                inorder_helper(root.left) {|n| yield n}
                yield root.data
                inorder_helper(root.right) {|n| yield n}
              end
            end
          ...
        end
Given this method, we can call it with a block. In fact, print can now be redefined as a call on this iterator:

        def print()
          inorder {|n| puts n}
        end
We loaded this new version into irb and tested it out. First we created a tree and inserted 25 random values:

        >> t = Tree.new
        => #<Tree:0xb7eb0a5c @overallRoot=nil>
        >> 25.times{t.insert(rand(100))}
        => 25
We found that print still worked just fine:

        >> t.print
        2
        2
        15
        16
        19
        23
        23
        32
        38
        42
        43
        47
        51
        61
        64
        68
        70
        73
        77
        79
        80
        83
        88
        90
        96
        => nil
But now we could specify variants of print by using the inorder iterator, like printing each number doubled:

        >> t.inorder {|n| puts 2 * n}
        4
        4
        30
        32
        38
        46
        46
        64
        76
        84
        86
        94
        102
        122
        128
        136
        140
        146
        154
        158
        160
        166
        176
        180
        192
        => nil
We were also able to use the iterator to find the sum of the numbers:


        >> sum = 0
        => 0
        >> t.inorder {|n| sum += n}
        => nil
        >> sum
        => 1282
I pointed out that not only was this iterator fairly easy to define, it is also highly efficient. We would describe it as lazy in the sense that it doesn't compute a value until it needs it. For example, we reset the sum to be 0 and wrote this variant that breaks out of the computation as soon as the sum becomes greater than 100:

        t.inorder do |n|
          puts n
          sum += n
          break if sum > 100
        end
When we ran it, it produced this output:

        2
        2
        15
        16
        19
        23
        23
        32
We found that it had set sum to 132 and then stopped. As we noted earlier, the alternative way to write an iterator is to precompute the entire traversal before iteration begins. For a computation like the one above that breaks out early, that would waste the work of traversing the rest of the tree.
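
The difference between the two approaches can be made concrete by counting how many nodes each one touches. The sketch below restates a small version of the tree so that it runs on its own. The visited counter, the inorder_list method, and the Struct-based Node are hypothetical additions for this demonstration, and the helper passes the caller's block along with &block rather than re-yielding at every level, which is an assumed variation on the code above rather than what we wrote in lecture:

```ruby
class Tree
  Node = Struct.new(:data, :left, :right)

  attr_reader :visited   # hypothetical counter, used only to measure work done

  def initialize
    @overallRoot = nil
    @visited = 0
  end

  def insert(v)
    @overallRoot = insert_helper(v, @overallRoot)
  end

  # Lazy traversal: each value reaches the caller's block as soon as it is found.
  def inorder(&block)
    return to_enum(:inorder) unless block   # no block given: hand back an Enumerator
    inorder_helper(@overallRoot, &block)
  end

  # Eager alternative: build the whole traversal before the caller sees anything.
  def inorder_list
    result = []
    inorder_helper(@overallRoot) { |n| result << n }
    result
  end

  private

  def insert_helper(v, root)
    return Node.new(v) if root.nil?
    if v < root.data
      root.left = insert_helper(v, root.left)
    else
      root.right = insert_helper(v, root.right)
    end
    root
  end

  def inorder_helper(root, &block)
    return if root.nil?
    inorder_helper(root.left, &block)
    @visited += 1
    block.call(root.data)
    inorder_helper(root.right, &block)
  end
end

t = Tree.new
[50, 30, 70, 20, 40, 60, 80].each { |v| t.insert(v) }

lazy_sum = 0
t.inorder do |n|
  lazy_sum += n
  break if lazy_sum > 40
end
lazy_visits = t.visited                  # nodes touched before the break

eager_sum = 0
t.inorder_list.each do |n|
  eager_sum += n
  break if eager_sum > 40
end
eager_visits = t.visited - lazy_visits   # nodes touched by the eager version

puts "lazy: #{lazy_visits} nodes, eager: #{eager_visits} nodes"
```

With these seven values the lazy version stops after two nodes (20 and 30), while the eager version walks all seven before the loop even starts. Returning an Enumerator when no block is given also allows calls such as t.inorder.first(3), which stops after three values.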

I gave one other quick example of this kind of computation in Ruby. There is a library known as "mathn" that has some interesting math extensions. For example, it has a class called Prime that can be used to generate the prime numbers in sequence:

        >> require "mathn"
        => true
        >> p = Prime.new
        => #<Prime:0xb7cfa794 @counts=[], @primes=[], @seed=1>
        >> p.next
        => 2
        >> p.next
        => 3
        >> p.next
        => 5
        >> p.next
        => 7
        >> 10.times {puts p.next}
        11
        13
        17
        19
        23
        29
        31
        37
        41
        43
        => 10
It has an each method that can compute an arbitrary number of primes. Obviously it doesn't precompute them; it computes them only as it needs them. Because the sequence never ends, a call on each has to include a break or return, as in this code, which computes the sum of the primes up to 10000:

        require "mathn"
        p = Prime.new
        sum = 0
        p.each do |n|
          break if n > 10000
          sum += n
        end
        puts sum
This reports that the sum of the primes up to 10000 is equal to 5736396.
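
Newer Ruby releases no longer ship mathn, and Prime.new is obsolete in the separate prime library, so the session above may not run as written on a current installation. The underlying idea, an unbounded each that the caller must break out of, can be sketched without any library. The each_prime method below is a hypothetical helper that uses simple trial division; it is not the algorithm the Prime class uses:

```ruby
# Yields the primes in increasing order, forever. Each prime is computed
# only when the caller is ready for it; the caller must break to stop.
def each_prime
  primes = []   # primes found so far, used as trial divisors
  n = 2
  loop do
    # n is prime if no earlier prime q with q * q <= n divides it
    if primes.none? { |q| q * q <= n && n % q == 0 }
      primes << n
      yield n
    end
    n += 1
  end
end

sum = 0
each_prime do |n|
  break if n > 10000
  sum += n
end
puts sum   # prints 5736396, matching the result above
```

The break inside the block ends the call on each_prime itself, so no primes beyond the first one greater than 10000 are ever computed.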

I also briefly mentioned that Wikipedia has a nice entry on probabilistic primality testing using a technique known as Miller-Rabin. I had considered giving this as a Ruby assignment, but I was thwarted by the fact that the Wikipedia page includes sample Ruby code. I copied it and pasted it into irb and we found that we could compute the same sum of primes using the new prime? method:

        >> sum = 0
        => 0
        >> for n in 1..10000 do
        >>     sum += n if n.prime?
        >>   end
        => 1..10000
        >> sum
        => 5736396
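
The Wikipedia code itself is not reproduced in these notes. As a rough illustration of the technique, here is a sketch of Miller-Rabin that uses a fixed set of witnesses (2, 3, 5, and 7) instead of random ones, a choice known to give exact answers for n below 3,215,031,751. The name miller_rabin_prime? and the WITNESSES constant are hypothetical, and the three-argument form of modular exponentiation, Integer#pow(exponent, modulus), assumes Ruby 2.5 or later:

```ruby
WITNESSES = [2, 3, 5, 7]   # exact for n < 3_215_031_751

def miller_rabin_prime?(n)
  return false if n < 2
  # Handle the witnesses and their multiples directly.
  WITNESSES.each { |b| return n == b if n % b == 0 }

  # Write n - 1 as d * 2**r with d odd.
  d = n - 1
  r = 0
  while d.even?
    d /= 2
    r += 1
  end

  # n passes for witness a if a**d is 1 or -1 (mod n), or if repeated
  # squaring reaches -1 (mod n) within r - 1 steps.
  WITNESSES.all? do |a|
    x = a.pow(d, n)
    next true if x == 1 || x == n - 1
    (r - 1).times.any? do
      x = x.pow(2, n)
      x == n - 1
    end
  end
end

sum = 0
for n in 1..10000 do
  sum += n if miller_rabin_prime?(n)
end
puts sum   # prints 5736396, the same sum as before
```

A standalone method is used here rather than adding prime? to the integer class, so that the sketch does not collide with any library that defines a method of that name.
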
We spent the rest of the lecture exploring what are known as mixins. This is one of the most interesting features of Ruby.

Before looking at Ruby mixins, I spent a few minutes discussing Java's inheritance model. I asked people what you get when class B extends class A in Java. The answer is that you get two different things: code reuse (B inherits the state and behavior defined in A) and a subtype relationship (a B can be used wherever an A is expected).

These are really separate ideas. Java also has interfaces, which allow you to define subtype relationships without any code reuse. So in Java you can have one code reuse relationship and any number of subtype relationships, but you can't get the code reuse relationship without the subtype.

C++ is an interesting contrast. C++ supports multiple inheritance. With multiple inheritance, you can get multiple code reuse relationships. But it turns out that multiple inheritance is rather messy. For example, Arthur Riel in his book Object-Oriented Design Heuristics includes as item 54:

        54. If you have an example of multiple inheritance in your design, assume you have made a mistake and then prove otherwise.
C++ also has a notion of private inheritance where you have code reuse but no subtype relationship.

Ruby offers something in between. It has single inheritance, just as Java does. Subtyping doesn't matter in Ruby because it uses duck typing (Ruby doesn't care what kind of duck you are as long as you can quack in an appropriate manner when asked to do so). So the only issue in Ruby is code reuse. We've seen that inheritance of classes is similar in Ruby to what we saw in Java. Mixins offer an alternative. You can define a mixin by defining a module and including a set of methods. For example, I wrote the following mixin that defines two methods that allow sequences to be stuttered:

        module Stutterable
          def stutter
            result = []
            for n in self
              result.push n
              result.push n
            end
            result
          end
        
          def stutter_each
            for n in self
              yield n
              yield n
            end
          end
        end
You use the word "module" instead of "class". Once you have defined this module, you can include it in classes by saying:

        include Stutterable
It is almost as if the actual code from the module is included. For example, we went into the interpreter and added this code to the Array class:

        >> class Array
        >>   include Stutterable
        >>   end
        => Array
        >> x = [1, 2, 3]
        => [1, 2, 3]
        >> x.stutter
        => [1, 1, 2, 2, 3, 3]
        >> x.stutter_each {|n| puts n}
        1
        1
        2
        2
        3
        3
        => [1, 2, 3]
and we added it to the Range class:

        >> class Range
        >>   include Stutterable
        >>   end
        => Range
        >> x = (1..5)
        => 1..5
        >> x.stutter
        => [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
        >> x.stutter_each {|n| puts n}
        1
        1
        2
        2
        3
        3
        4
        4
        5
        5
        => 1..5
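
The mixin is not limited to built-in classes. Both methods use "for n in self", which calls each, so any class that defines each can include the module. The sketch below repeats the module so that it runs on its own and adds a Countdown class, which is a hypothetical example made up for illustration:

```ruby
module Stutterable
  def stutter
    result = []
    for n in self
      result.push n
      result.push n
    end
    result
  end

  def stutter_each
    for n in self
      yield n
      yield n
    end
  end
end

# Hypothetical class: its only obligation to the mixin is to define each.
class Countdown
  include Stutterable

  def initialize(start)
    @start = start
  end

  def each
    @start.downto(1) { |i| yield i }
  end
end

c = Countdown.new(3)
p c.stutter                  # prints [3, 3, 2, 2, 1, 1]
c.stutter_each { |n| puts n }
p c.is_a?(Stutterable)       # prints true
```

The last line shows that Ruby does record the relationship: an object of a class that includes a module answers true to is_a? for that module.
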
I mentioned that the two most common mixins are Comparable and Enumerable. For example, we modified the Point class to implement a method called <=>, which is the Ruby equivalent of the Java compareTo method. We had it find which point is closer to the origin. To make this more efficient, we introduced a class variable called @@origin. The double at-sign indicates that it's a class variable versus an instance variable (i.e., one shared value for the entire class, like a static field in Java):

        class Point
          include Comparable
          def initialize(x = 0, y = 0)
            @x = x
            @y = y
          end
        
          attr_reader :x, :y
          attr_writer :x, :y
        
          def to_s
            "(#{@x}, #{@y})"
          end
        
          def distance(other)
            return Math.sqrt((x - other.x) ** 2 + (y - other.y) ** 2)
          end
        
          @@origin = Point.new
        
          def <=>(other)
            return distance(@@origin) - other.distance(@@origin)
          end
        end
What the mixin gets us is a set of extra methods built on top of the <=> method: the five comparison operators <, <=, ==, >, and >=, plus between?. For example, now we can say:

        >> p1 = Point.new(3, 5)
        => #<Point:0xb8052298 @y=5, @x=3>
        >> p2 = Point.new(5, 3.1)
        => #<Point:0xb804e2b0 @y=3.1, @x=5>
        >> p1 < p2
        => true
        >> p1 <= p2
        => true
        >> p1 > p2
        => false
        >> p1 >= p2
        => false
So this is an example of code reuse without using the inheritance mechanism. Instead, we have defined five methods in terms of another method. This is a very convenient way to be able to build up new functionality.
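
Defining <=> pays off beyond the comparison operators, because methods such as sort and min also rely on it. The sketch below is a standalone variant of Point, not the class from lecture: the magnitude method is a hypothetical helper that replaces the @@origin comparison, and <=> delegates to the <=> of Float so that it returns -1, 0, or 1 rather than a difference of distances:

```ruby
class Point
  include Comparable

  attr_reader :x, :y

  def initialize(x = 0, y = 0)
    @x = x
    @y = y
  end

  def to_s
    "(#{@x}, #{@y})"
  end

  # Distance from the origin; Math.hypot(a, b) computes sqrt(a**2 + b**2).
  def magnitude
    Math.hypot(@x, @y)
  end

  # Order points by distance from the origin.
  def <=>(other)
    magnitude <=> other.magnitude
  end
end

p1 = Point.new(3, 5)
p2 = Point.new(5, 3.1)
p3 = Point.new(1, 1)

puts p1.between?(p3, p2)           # prints true
puts [p1, p2, p3].min              # prints (1, 1)
puts [p1, p2, p3].sort.join(" ")   # prints (1, 1) (3, 5) (5, 3.1)
puts p1 == Point.new(3, 5)         # prints true
```

The last line is worth noting: Comparable defines == in terms of <=>, so two distinct Point objects at the same distance from the origin compare as equal.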

The other common Ruby mixin is Enumerable. It defines a series of methods built on top of the each method. Remember that we defined a MyRange class with an each method:

        class MyRange
          def initialize(first, last)
            @first = first
            @last = last
          end
        
          def each
            i = @first
            while i <= @last
              yield i
              i += 1
            end
          end
        end
By including the Enumerable mixin, we get 21 new methods. I demonstrated it this way in the interpreter:

        >> x = MyRange.new(1, 10)
        => #<MyRange:0xb7f7ccec @last=10, @first=1>
        >> lst1 = x.methods
        => ["methods", "respond_to?", "dup", "instance_variables", "__id__", "eql?",
        "object_id", "id", "singleton_methods", "send", "taint", "frozen?",
        "instance_variable_get", "__send__", "instance_of?", "to_a", "type",
        "protected_methods", "instance_eval", "display", "instance_variable_set",
        "kind_of?", "extend", "to_s", "each", "class", "hash", "tainted?", "==",
        "private_methods", "===", "nil?", "untaint", "is_a?", "inspect", "method",
        "clone", "=~", "public_methods", "instance_variable_defined?", "equal?",
        "freeze"]
        >> class MyRange
        >>   include Enumerable
        >>   end
        => MyRange
        >> lst2 = x.methods
        => ["reject", "methods", "respond_to?", "dup", "instance_variables", "member?",
        "__id__", "eql?", "object_id", "find", "each_with_index", "id",
        "singleton_methods", "send", "collect", "all?", "entries", "taint", "include?",
        "frozen?", "instance_variable_get", "__send__", "instance_of?", "detect",
        "to_a", "zip", "type", "map", "protected_methods", "instance_eval", "any?",
        "display", "sort", "min", "instance_variable_set", "kind_of?", "extend",
        "find_all", "to_s", "each", "class", "hash", "tainted?", "==",
        "private_methods", "inject", "===", "sort_by", "nil?", "untaint", "max",
        "is_a?", "select", "inspect", "method", "clone", "=~", "partition",
        "public_methods", "instance_variable_defined?", "grep", "equal?", "freeze"]
        >> lst1.length
        => 42
        >> lst2.length
        => 63
        >> lst2.length - lst1.length
        => 21
And we can see the list of the actual methods by saying:

        >> lst2 - lst1
        => ["reject", "member?", "find", "each_with_index", "collect", "all?",
        "entries", "include?", "detect", "zip", "map", "any?", "sort", "min",
        "find_all", "inject", "sort_by", "max", "select", "partition", "grep"]
So by defining a single method (each) and using the mixin, we can get 21 very useful methods. This is certainly easier from a programming point of view and it is easier to create consistency across different types of objects when they all refer to the same mixin.
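
As a short illustration of what those methods buy us, here is a sketch that restates MyRange with the mixin included from the start and calls a few of the methods from the list above. The particular calls are chosen only as examples:

```ruby
class MyRange
  include Enumerable

  def initialize(first, last)
    @first = first
    @last = last
  end

  # The only method we write ourselves; Enumerable builds on it.
  def each
    i = @first
    while i <= @last
      yield i
      i += 1
    end
  end
end

x = MyRange.new(1, 10)
p x.map { |n| n * n }                # prints [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
p x.select { |n| n % 2 == 0 }        # prints [2, 4, 6, 8, 10]
p x.inject(0) { |sum, n| sum + n }   # prints 55
p x.include?(7)                      # prints true
p x.partition { |n| n < 4 }          # prints [[1, 2, 3], [4, 5, 6, 7, 8, 9, 10]]
p x.max                              # prints 10
```

None of these methods knows anything about MyRange. Each one simply calls each and works with the values that are yielded.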


Stuart Reges
Last modified: Sun Jun 7 12:44:29 PDT 2009