CSE413 Notes for Monday, 3/4/24

I began by pointing out several higher order functions that Ruby has. They are similar to what we have seen in OCaml and Scheme. There is a map function that expects a block specifying an operation to apply to each value in a structure:

        >> x = [1, 42, 7, 19, 8, 25, 12]
        => [1, 42, 7, 19, 8, 25, 12]
        >> x.map {|n| 2 * n}
        => [2, 84, 14, 38, 16, 50, 24]

There is a find function that expects a block that specifies a predicate:

        >> x.find {|n| n % 3 == 1}
        => 1

This version finds just the first occurrence. If you want to find them all, you can call find_all which is Ruby's version of filter:

        >> x.find_all {|n| n % 3 == 1}
        => [1, 7, 19, 25]

Ruby also has methods for determining whether any value satisfies a certain predicate and whether all values satisfy a certain predicate:

        >> x.any? {|n| n % 3 == 1}
        => true
        >> x.all? {|n| n % 3 == 1}
        => false

These are computational equivalents of the mathematical existential quantifier ("there exists") and universal quantifier ("for all").
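
As a small sketch that gathers these calls in one script: the array is the same x used above, while the variable names and the extra predicates (n > 100, n > 40, n > 0) are made up for illustration. Note that select is another name for find_all, and that find returns nil when nothing matches.

```ruby
x = [1, 42, 7, 19, 8, 25, 12]

doubled = x.map { |n| 2 * n }             # apply the block to every element
first   = x.find { |n| n % 3 == 1 }       # first match only
matches = x.select { |n| n % 3 == 1 }     # select is an alias for find_all
none    = x.find { |n| n > 100 }          # nil, because nothing matches

any_big = x.any? { |n| n > 40 }           # "there exists" an n > 40
all_pos = x.all? { |n| n > 0 }            # "for all" n, n > 0

puts doubled.inspect
puts matches.inspect
puts none.inspect
```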

Then I discussed the inject method. When you don't supply a parameter, it behaves like the reduce function in OCaml (collapsing a sequence of values into one value of the same type):

        >> [3, 5, 12].inject {|a, b| a + b}
        => 20

But you can also call it with a parameter, in which case it behaves like foldl:

        >> [3, 5, 12].inject("values:") {|a, b| a + " " + b.to_s}
        => "values: 3 5 12"

It's nice that Ruby has the inject function for other types as well, such as ranges:

        >> (1..20).inject("values:") {|a, b| a + " " + b.to_s}
        => "values: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20"
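
Here is a short sketch contrasting the two forms of inject side by side. The variable names are hypothetical, and the last call uses the symbol shorthand, which was not shown in lecture but is a standard way to name the binary method to apply.

```ruby
values = [3, 5, 12]

# Without an initial value, the first element seeds the accumulator,
# so the result has the same type as the elements.
sum = values.inject { |a, b| a + b }

# With an initial value, the accumulator can have a different type
# from the elements (here a String built up from Integers).
label = values.inject("values:") { |a, b| a + " " + b.to_s }

# Symbol shorthand: multiply all of the elements together.
product = values.inject(:*)

puts sum      # 20
puts label    # values: 3 5 12
puts product  # 180
```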

Then we reviewed console and file-reading operations. I mentioned that I particularly like the readlines method, as in:

        lst = File.open("hamlet.txt").readlines

This reads the entire contents of Hamlet into an array of strings, one string per line. We were then able to ask questions like how many lines there are in the file or what the 101st line is:

        irb(main):002:0> lst.length
        => 4463
        irb(main):003:0> lst[100]
        => "  Hor. Well, sit we down,\r\n"

I asked people how we could write code to count the number of occurrences of various words in the file. We'd want to split each line using whitespace, which you can get by calling the string split method, as in:

        irb(main):004:0> lst[100].split
        => ["Hor.", "Well,", "sit", "we", "down,"]

To store the counts for each word, we need some kind of data structure. In Java we'd use a Map to associate words with counts. We can do that in Ruby with a hashtable:

        irb(main):005:0> count = Hash.new
        => {}

As we saw in an earlier lecture, we can use the square bracket notation to refer to the elements of the table. For example, to increment the count for the word "hamlet", we're going to want to execute a statement like this:

        count["hamlet"] += 1

Unfortunately, when we tried this out, it generated an error:

        irb(main):006:0> count["hamlet"] += 1
        NoMethodError: undefined method `+' for nil:NilClass
                from (irb):6
                from :0

That's because there is no entry in the table for "hamlet". But Ruby allows us to specify a default value for table entries that gets around this:

        irb(main):007:0> count = Hash.new 0
        => {}
        irb(main):008:0> count["hamlet"] += 1
        => 1
        irb(main):009:0> count
        => {"hamlet"=>1}
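
One detail worth seeing in a sketch: reading a missing key returns the default without adding an entry to the table. The key "ophelia" and the variable names are made up for illustration.

```ruby
count = Hash.new(0)

missing = count["ophelia"]        # returns the default, 0
size_after_read = count.length    # still 0: reading did not create an entry

count["hamlet"] += 1              # same as count["hamlet"] = count["hamlet"] + 1
size_after_write = count.length   # now 1

puts count.inspect
```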

Using this approach, it was very easy to count the occurrences of the various words in the lst array:

        irb(main):007:0> count = Hash.new 0
        => {}
        irb(main):010:0> for line in lst do
        irb(main):011:1*     for word in line.split do
        irb(main):012:2*       count[word.downcase] += 1
        irb(main):013:2>     end
        irb(main):014:1>   end

After doing this, we could ask for the number of words in the file and the count for individual words like "hamlet":

        irb(main):022:0> count.length
        => 7234
        irb(main):023:0> count["hamlet"]
        => 28

The File object can be used with a foreach loop, so we could have written this same code without setting up the array called lst:

        irb(main):024:0> count = Hash.new 0
        => {}
        irb(main):025:0> for line in File.open("hamlet.txt") do
        irb(main):026:1*     for word in line.split do
        irb(main):027:2*       count[word.downcase] += 1
        irb(main):028:2>     end
        irb(main):029:1>   end
        => #<File:hamlet.txt>
        irb(main):030:0> count.length
        => 7234
        irb(main):031:0> count["hamlet"]
        => 28

The key point here is that it is possible to write just a few lines of Ruby code to express a fairly complex operation to be performed. We'd expect no less from a popular scripting language.
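
The same counting logic can be sketched as a standalone script using each with blocks instead of for loops. To keep the sketch self-contained, a small made-up array of lines stands in for the file; with the real file you could iterate over File.foreach("hamlet.txt") instead. The final step, which lists the most frequent words, is an addition that was not part of the lecture.

```ruby
# Stand-in for the lines of a file; these lines are made up for illustration.
lines = [
  "To be, or not to be\r\n",
  "that is the question\r\n",
  "To be\r\n"
]

count = Hash.new(0)
lines.each do |line|
  line.split.each do |word|
    count[word.downcase] += 1
  end
end

# Sort by descending count and keep the two most frequent words.
top = count.sort_by { |_word, n| -n }.first(2)

puts count.length
puts top.inspect
```

Because the split is on whitespace only, punctuation stays attached, so "be," and "be" are counted as different words here.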

I then spent some time discussing the object-oriented features of Ruby. I started with a simple Point class for storing x/y coordinates.

        class Point
          def initialize (x = 0, y = 0)
            @x = x
            @y = y
          end
        
          attr_reader :x, :y
          attr_writer :x, :y
        
          def to_s
            return "(#{@x}, #{@y})"
          end
        end

This class has several new features that I described. In the constructor (the initialize method) the parameters have default values of 0, which means there really are three constructors.

We discussed in a prior lecture how to define getter and setter methods. Ruby has a special form for this called "attr_reader" and "attr_writer". In the class above, we put a colon in front of x and y to turn them into symbols. These lines of code introduce a getter and a setter for each of x and y.
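
As a sketch of how these pieces fit together, the version below uses attr_accessor, which generates both the getter and the setter in one line (it was not used in lecture, but it is standard Ruby). It also shows the three ways of calling the constructor that the default parameter values allow. The variable names p1, p2, and p3 are made up for illustration.

```ruby
class Point
  def initialize(x = 0, y = 0)
    @x = x
    @y = y
  end

  # Equivalent to attr_reader :x, :y followed by attr_writer :x, :y
  attr_accessor :x, :y

  def to_s
    "(#{@x}, #{@y})"
  end
end

p1 = Point.new          # both defaults used
p2 = Point.new(3)       # only y defaults to 0
p3 = Point.new(3, 15)   # no defaults used

p3.x = 7                # calls the generated setter, x=
puts p3.x               # calls the generated getter, x
puts p1, p2, p3         # puts calls to_s on each point
```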

Finally, I am using another bit of syntactic sugar for the to_s method. Suppose you define a couple of variables called x and y:

        irb(main):001:0> x = 3
        => 3
        irb(main):002:0> y = 4.7
        => 4.7

In Java, if you wanted to print out the values of these variables, you would say something like: System.out.println("x = " + x + ", y = " + y); You can do something equivalent in Ruby, but we would have to call to_s for each of x and y to convert them to a string:

        irb(main):004:0> puts "x = " + x.to_s + ", y = " + y.to_s
        x = 3, y = 4.7
        => nil

Ruby gives an alternative where you embed an expression in #{...} inside a quoted string. Ruby will evaluate the expression and convert it to a string, as in:

        irb(main):006:0> puts "x = #{x}, y = #{y}"
        x = 3, y = 4.7
        => nil
        irb(main):008:0> puts "sum = #{x + y}"
        sum = 7.7
        => nil

Notice in the second case that it has to add x and y together before converting it to a string.

So back to our Point class. Fields like @x and @y are encapsulated, which basically makes them private. But Ruby has a different notion of "private" than Java does. In the case of fields, there isn't a convenient way to refer to the fields of another object, even an object of the same type. For example, we tried to write this distance method for the Point class:

        def distance(other)
          return Math.sqrt((@x - other.@x) ** 2 + (@y - other.@y) ** 2)
        end

Ruby rejected this as syntactically invalid. Although the fields are called @x and @y, we can't refer to other.@x and other.@y. We can only refer to fields of "self" in that manner. But we found that this version worked:

        def distance(other)
          return Math.sqrt((@x - other.x) ** 2 + (@y - other.y) ** 2)
        end

In this case, we are calling the "getter" function of the other Point object. So other.x and other.y are really function calls, as in other.x() and other.y(). Then I moved the attr_reader and attr_writer calls into a private section of the class:

        class Point
          ...
        
          private
          attr_reader :x, :y
          attr_writer :x, :y
        end

This broke the distance method. When we called it, we got an error message about trying to call a private method. In Java, private means private to the class. In Ruby, private means private to the object. In other words, the only way to call a private method is in the context of self calling it. Even other instances of the same class can't call a private method.

These are the two extremes. A public method can be called by anyone. A private method can only be called by self. There is a third option. We can declare a method to be protected, in which case it can be called by objects of the same class and objects whose type is a subclass of this type. The calls still have to appear in the class definition, so that clients of the class aren't able to call the method. So changing "private" to "protected" in the example above allowed us to have a functioning distance method without exposing the x and y getters and setters outside the class.
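
A minimal sketch of the protected version, assuming only the getters are needed: distance can call other.x because the call appears inside Point and other is also a Point, while the same call from client code is rejected. The variable names a and b are made up for illustration.

```ruby
class Point
  def initialize(x = 0, y = 0)
    @x = x
    @y = y
  end

  def distance(other)
    # Allowed: we are inside Point and other is a Point.
    Math.sqrt((@x - other.x) ** 2 + (@y - other.y) ** 2)
  end

  protected

  attr_reader :x, :y
end

a = Point.new(0, 0)
b = Point.new(3, 4)
puts a.distance(b)   # 5.0

# Client code is not allowed to call the protected getter.
begin
  b.x
rescue NoMethodError => e
  puts "client call rejected: #{e.message}"
end
```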

I then pointed out an interesting property of instance variables. I added this new method to the class that mentions an instance variable @z:

        def foo
          @z = 26
        end

When I then constructed a point, we saw that @z was not there:

        >> p = Point.new 3, 15
        => #<Point:0xb7b692e0 @y=15, @x=3>
        >> p.instance_variables
        => ["@y", "@x"]

But as soon as I called the foo method, the instance variable appears:

        >> p.foo
        => 26
        >> p
        => #<Point:0xb7b692e0 @z=26, @y=15, @x=3>
        >> p.instance_variables
        => ["@z", "@y", "@x"]

This is very different from Java. Java is statically typed, so before the program ever begins executing, we have to specify exactly what each object will look like. So when Java constructs an object, it makes all of the fields at the same time. But Ruby is far more dynamic. Instance variables are added to the object as they are encountered in executing methods of the class. The @x and @y instance variables are mentioned in the constructor, so they are allocated when the constructor is called. The @z instance variable is only mentioned in the foo method, so it is allocated only when we call foo.
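
The same experiment can be sketched as a script. Recent versions of Ruby (1.9 and later) report instance variable names as symbols rather than strings, so the output format may differ from the transcript above. The variable names here are made up for illustration, and instance_variable_defined? is a standard method that was not shown in lecture.

```ruby
class Point
  def initialize(x = 0, y = 0)
    @x = x
    @y = y
  end

  def foo
    @z = 26
  end
end

pt = Point.new(3, 15)
before = pt.instance_variables                      # @z is not listed yet
had_z  = pt.instance_variable_defined?(:@z)         # false

pt.foo                                              # first assignment creates @z
after  = pt.instance_variables                      # now @z is listed
has_z  = pt.instance_variable_defined?(:@z)         # true

puts before.inspect
puts after.inspect
```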

I then showed a quick example of inheritance:

        class Point2 < Point
          def translate(dx, dy)
            self.x += dx
            self.y += dy
          end
        end

The notation "Point2 < Point" in the header indicates that Point2 extends (inherits from) Point. This subclass adds a translate method. By referring to "self.x" and "self.y", this code is calling the superclass getters and setters for x and y. This is the "right" way to go to preserve encapsulation. But Ruby would allow us to directly access the instance variables if we wanted to:

        class Point2 < Point
          def translate(dx, dy)
            @x += dx
            @y += dy
          end
        end

This wouldn't be allowed in Java where we tend to make instance variables private, but in Ruby the philosophy is that subclasses should have access to the superclass instance variables.

I pointed out that the keyword "super" is used differently in Ruby than in Java. In Java we say things like "super.toString()" to refer to the superclass version of the toString method. In Ruby, "super" by itself calls the superclass version of whatever method you are overriding. So if you are overriding "to_s", then "super" calls the superclass version of to_s. For example, we added this method to the Point2 class to have the to_s method add an exclamation point:

          def to_s
            super + "!"
          end
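
Put together as a runnable sketch, assuming the override lives in a subclass such as Point2 so that super reaches the to_s defined in Point:

```ruby
class Point
  def initialize(x = 0, y = 0)
    @x = x
    @y = y
  end

  def to_s
    "(#{@x}, #{@y})"
  end
end

class Point2 < Point
  def to_s
    # super with no arguments calls Point's to_s
    super + "!"
  end
end

puts Point.new(3, 15)    # (3, 15)
puts Point2.new(3, 15)   # (3, 15)!
```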

I then discussed the idea of writing an iterator for a binary tree class. How would we implement a binary tree inorder iterator for a Java binary tree? There were several suggestions. One idea was to do the complete traversal in advance and store the result in some kind of data structure like an ArrayList and then we could iterate over the ArrayList. Another suggestion was to keep a stack and to simulate the call stack ourselves. That could work as well, although that is also rather tricky.
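
To make the stack suggestion concrete, here is a rough sketch written in Ruby rather than Java. The Node struct, the InorderIterator class, and the method names are all hypothetical; this is just one way the idea might look, not code from lecture.

```ruby
# Hypothetical node type for illustration.
Node = Struct.new(:data, :left, :right)

class InorderIterator
  def initialize(root)
    @stack = []
    push_left(root)
  end

  def has_next?
    !@stack.empty?
  end

  def next
    node = @stack.pop
    push_left(node.right)   # the right subtree comes after the node itself
    node.data
  end

  private

  # Push node and all of its left descendants, mimicking the
  # recursive calls that would be pending in an inorder traversal.
  def push_left(node)
    while node
      @stack.push(node)
      node = node.left
    end
  end
end

#       4
#      / \
#     2   6
#    / \
#   1   3
root = Node.new(4, Node.new(2, Node.new(1), Node.new(3)), Node.new(6))

iter = InorderIterator.new(root)
result = []
result << iter.next while iter.has_next?
puts result.inspect   # [1, 2, 3, 4, 6]
```

The stack never holds more nodes than the height of the tree, but the bookkeeping has to be written by hand.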

What Java's TreeMap does is keep track of parent links in the tree so that it can move from node to node. That works, but it requires keeping extra parent links, and the code to move from node to node is a bit tricky. Here is a bit of source code from TreeMap.java that moves from one node to the next in an inorder traversal:

        private Entry<K,V> successor(Entry<K,V> t) {
            if (t == null)
                return null;
            else if (t.right != null) {
                Entry<K,V> p = t.right;
                while (p.left != null)
                    p = p.left;
                return p;
            } else {
                Entry<K,V> p = t.parent;
                Entry<K,V> ch = t;
                while (p != null && ch == p.right) {
                    ch = p;
                    p = p.parent;
                }
                return p;
            }
        }

All of these solutions work, but none of them is simple and efficient. Ruby gives us a solution that is simple and efficient. At that point we ran out of time, so I said that we'd complete it in the next lecture.

Stuart Reges

Last modified: Mon Mar 4 16:43:46 PST 2024