CSE341 Notes for Wednesday, 2/28/07

We continued our exploration of Ruby. I reminded people that I had put up a set of lecture notes from William Mitchell. I've found these very helpful in picking up the language details step by step. I've also included an init file for irb that William made up that gives us a few nice features. It allows us to use "it" in the same way we did in the ML interpreter to get the most recently computed value. It does a better job of indenting when we type in control structures. And it allows us to hit the escape key to see the list of possible completions for something like "3.". I've included instructions on the class web page about how to get this file on your attu account if you're interested.

I spent some time discussing arrays and strings in more detail. I mentioned that one of the annoying things about Java is that you often have to memorize several ways to do the same kind of thing. You find the length of an array by asking for a.length. If it's a string you ask for s.length(). If it's an ArrayList you ask for lst.size(). And if you want an individual element? You ask for a[i] for an array, s.charAt(i) for a String and lst.get(i) for an ArrayList.

Ruby has a simpler and more consistent syntax for this kind of thing. For example, suppose you define:

        irb(main):003:0> a = [1, 2, 3, 4, 5, 6]
        => [1, 2, 3, 4, 5, 6]
        irb(main):004:0> s = "hello"
        => "hello"
You can request the length of each in the same way:
        irb(main):005:0> a.length
        => 6
        irb(main):006:0> s.length
        => 5
You can request individual elements with the square bracket notation:

        irb(main):007:0> a[2]
        => 3
        irb(main):008:0> s[3]
        => 108
It's a little weird that the string returns the ASCII value of the character, but that's one of Ruby's little quirks. You can ask for a subrange by specifying a starting position and a number of elements to include:

        irb(main):009:0> a[2, 2]
        => [3, 4]
        irb(main):010:0> s[3, 2]
        => "lo"
You can also specify a range of indexes by using a ".." instead of a comma:

        irb(main):012:0> a[1..3]
        => [2, 3, 4]
        irb(main):013:0> s[2..4]
        => "llo"
And you can append to either using the "<<" operator (think of this as an arrow sending the data into the given structure):

        irb(main):014:0> a << 5
        => [1, 2, 3, 4, 5, 6, 5]
        irb(main):015:0> s << " there"
        => "hello there"
We also saw that we can do some pretty unusual assignments where we reassign parts of a string or array to something new:

        irb(main):016:0> a[2..4] = [8, 9, 10, 12, 15, 26]
        => [8, 9, 10, 12, 15, 26]
        irb(main):017:0> a
        => [1, 2, 8, 9, 10, 12, 15, 26, 6, 5]
        irb(main):018:0> s[3..5] = "p! me please"
        => "p! me please"
        irb(main):019:0> s
        => "help! me pleasethere"
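
The operations above can be gathered into one small script. This is a minimal sketch, not something typed in lecture, and the method name slice_demo is made up for illustration. One caveat: Ruby 1.9 and later return a one-character string for s[3] rather than the integer 108 shown above, so the sketch uses the two-argument form s[3, 1], which returns a string in either version.

```ruby
# Hypothetical helper (not from lecture): applies the same operations to an
# array and a string and returns the results in a hash for inspection.
def slice_demo
  a = [1, 2, 3, 4, 5, 6]
  s = "hello".dup                    # dup so the string can be modified in place

  results = {}
  results[:lengths]  = [a.length, s.length]
  results[:one_char] = s[3, 1]             # start at index 3, take 1 character
  results[:by_count] = [a[2, 2], s[3, 2]]  # start position and count
  results[:by_range] = [a[1..3], s[2..4]]  # range of indexes

  a << 5
  s << " there"
  a[2..4] = [8, 9, 10, 12, 15, 26]   # replaces 3 elements with 6
  s[3..5] = "p! me please"           # replaces "lo " with a longer string
  results[:final] = [a, s]
  results
end

slice_demo.each { |name, value| puts "#{name}: #{value.inspect}" }
```

The slice assignments change the length of the array and the string, which is why the final values are longer than the originals.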
I mentioned that there is a class called Hash that can be used to store a hashtable of key/value pairs (like the Java Map classes). You can construct it by calling new or just using curly brace notation:

        irb(main):020:0> x = Hash.new
        => {}
        irb(main):021:0> x = {}
        => {}
Once constructed, you can use array-like square bracket notation to associate keys with values:

        irb(main):022:0> x["hello"] = 15
        => 15
        irb(main):023:0> x["foo"] = 92
        => 92
        irb(main):024:0> x[85] = "bar"
        => "bar"
        irb(main):025:0> x
        => {85=>"bar", "foo"=>92, "hello"=>15}
This is simpler than the Java approach of calling get and put methods, especially when you want to do complex manipulations like:

        irb(main):026:0> x["foo"] *= 2
        => 184
        irb(main):027:0> x
        => {85=>"bar", "foo"=>184, "hello"=>15}
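
As a further illustration of why the bracket notation is convenient, here is a minimal sketch that counts word occurrences. The method name word_counts and the sample sentence are invented for this example. It relies on Hash.new(0), which builds a hash whose default value for a missing key is 0, so that += works on a key that has not been seen yet.

```ruby
# Hypothetical helper (not from lecture): maps each word to its count.
def word_counts(text)
  counts = Hash.new(0)        # missing keys read as 0
  for word in text.split      # split breaks the text on whitespace
    counts[word] += 1         # same as counts[word] = counts[word] + 1
  end
  counts
end

counts = word_counts("to be or not to be")
puts counts["to"]      # 2
puts counts["not"]     # 1
puts counts["maybe"]   # 0, the default value
```

The equivalent Java would need a get, a null check and a put for each word.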
I then talked a bit about control structures. I said that I really like the "quick reference guide" that is linked under Ruby resources on the class web page. Ruby has many familiar control structures like if/else and while and the quick reference has templates for these:

        if bool-expr [then]
          body
        elsif bool-expr [then]
          body
        else
          body
        end
        
        while bool-expr [do]
         body
        end
Ruby also allows you to use these as modifiers that come after a statement. So you can either say something like this:

        irb(main):029:0> x = 3
        => 3
        irb(main):030:0> while x < 200 do
        irb(main):031:1*     x *= 2
        irb(main):032:1>   end
        => nil
        irb(main):033:0> x
        => 384
or you can say it this way:

        irb(main):034:0> x = 3
        => 3
        irb(main):035:0> x *= 2 while x < 200
        => nil
        irb(main):036:0> x
        => 384
There are also interesting variations like an "unless" construct that is like an inverse if/else and an "until" construct that is like an inverse while.
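
Here is a minimal sketch of those two variations. The method names double_until and describe are made up for illustration and did not appear in lecture.

```ruby
# Hypothetical helper: doubles x until it reaches at least limit.
def double_until(x, limit)
  x *= 2 until x >= limit      # same as: x *= 2 while x < limit
  x
end

# Hypothetical helper: the unless body runs when the test is false.
def describe(n)
  unless n < 0                 # same as: if !(n < 0)
    "non-negative"
  else
    "negative"
  end
end

puts double_until(3, 200)   # 384, matching the while versions above
puts describe(5)            # non-negative
puts describe(-2)           # negative
```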

Then I spent some time talking about classes. I started by pointing out that the Ruby philosophy is very different from the Java philosophy. In Java, a class definition contains the complete blueprint for the class, listing all instance variables and methods. In Ruby, you can define a class multiple times, each time adding more instance variables and methods. You can even do this for built-in classes.

For example, we've seen the built-in Array class:

        irb(main):037:0> x = [1, 2, 3, 4, 5]
        => [1, 2, 3, 4, 5]
        irb(main):038:0> x.class
        => Array
The Array class has methods called push and pop that allow you to treat an array like a stack:

        irb(main):039:0> x = []
        => []
        irb(main):040:0> x.push 8
        => [8]
        irb(main):041:0> x.push 19
        => [8, 19]
        irb(main):042:0> x.push 27
        => [8, 19, 27]
        irb(main):043:0> x.pop
        => 27
        irb(main):044:0> x.pop
        => 19
        irb(main):045:0> x
        => [8]
We saw that we could use a class definition to dynamically add a new definition to the Array class:

        irb(main):046:0> class Array
        irb(main):047:1>   def push2(n)
        irb(main):048:2>     push n
        irb(main):049:2>     push n
        irb(main):050:2>     end
        irb(main):051:1>   end
        => nil
        irb(main):052:0> x.push2 3
        => [8, 3, 3]
        irb(main):053:0> x.push2 8
        => [8, 3, 3, 8, 8]
This becomes a part of the Array class for all instances of Array. We created a new one just to double-check:

        irb(main):054:0> y = []
        => []
        irb(main):055:0> y.push2 19
        => [19, 19]
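
The same technique works for other built-in classes. The following is a minimal sketch with two invented methods, double and twice, that are not part of Ruby. The first is defined on Integer rather than Fixnum as a choice made for this example, so that it also runs on newer Ruby versions where Fixnum is no longer a separate class.

```ruby
class Integer
  # Hypothetical method: returns twice the receiver.
  def double
    self * 2
  end
end

class String
  # Hypothetical method: returns the string followed by itself.
  def twice
    self + self
  end
end

puts 21.double      # 42
puts "ab".twice     # abab
puts "ab".length    # 2; the existing methods are untouched
```

Reopening a class adds to it and does not replace what was already there.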
I then mentioned that I wanted to discuss one of the most important concepts in Ruby: the idea of a block. You can think of it as a "block of code," although it really is something we've seen before: a closure. You can specify blocks either with curly brace notation or with do...end notation. For example, the Fixnum class has a method called times that expects a block. You get an error if you don't provide one:

        irb(main):056:0> 3.times
        LocalJumpError: no block given
                from (irb):56:in `times'
                from (irb):56
                from :0
Using the curly brace notation we'd say:

        irb(main):057:0> 3.times { puts "hello" }
        hello
        hello
        hello
        => 3
The Fixnum object executes the block of code the given number of times (3 times in this case because we asked 3 to do this task). We could instead use do...end notation:

        irb(main):058:0> 3.times do
        irb(main):059:1*     puts "hello"
        irb(main):060:1>   end
        hello
        hello
        hello
        => 3
According to our textbook, the usual convention is to use curly braces for short, one-line blocks, and to use do...end for multiline blocks.

Blocks can include parameters. This is very similar to an anonymous function in ML when we said things like:

        fn x => 2 * x
We read this as, "a function of x that returns 2 * x." In Ruby you put any parameters inside pipe characters ("|") at the beginning of the block. After the parameter(s), you put the code, as in:

        {|n| puts n}
which we would read as, "a function of n that calls puts on n". We can pass this block to the times method:

        irb(main):061:0> 3.times {|n| puts n}
        0
        1
        2
        => 3
As you can see, the times method produces the values 0 through 2 as it executes the block three different times. Our earlier examples simply ignored this parameter value.
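
To make the comparison with ML concrete, here is a minimal sketch, not from lecture, that passes blocks with parameters to a few built-in methods. The helper names double_all and squares_up_to are invented for this example; map, times and each_with_index are standard Ruby methods.

```ruby
# Hypothetical helper: doubles every element of a list.
def double_all(list)
  list.map { |x| 2 * x }       # ML: map (fn x => 2 * x) list
end

# Hypothetical helper: squares of 0 through n - 1.
def squares_up_to(n)
  result = []
  n.times do |i|               # i takes the values 0 through n - 1
    result << i * i
  end
  result
end

puts double_all([1, 2, 3]).inspect   # [2, 4, 6]
puts squares_up_to(4).inspect        # [0, 1, 4, 9]

# A block can take more than one parameter.
["a", "b"].each_with_index { |item, i| puts "#{i}: #{item}" }
```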

Then I said that I wanted to spend a little time understanding how Range objects are implemented in Ruby:

        irb(main):066:0> x = 1..10
        => 1..10
        irb(main):068:0> x.class
        => Range
A common use for Range objects is to control the foreach loop in Ruby:

        irb(main):069:0> for i in x
        irb(main):070:1>   puts i
        irb(main):071:1>   end
        1
        2
        3
        4
        5
        6
        7
        8
        9
        10
        => 1..10
To see how a class like this might work, we wrote our own version that we called MyRange. We began by writing a constructor for it. In Ruby, you specify a constructor by overriding the initialize method:

        class MyRange
          def initialize(first, last)
            @first = first
            @last = last
          end
        end
In Ruby, you differentiate between instance variables and local variables by putting an at-sign ("@") in front of any instance variable.

You construct objects by calling the new method of the class. Ruby makes sure that you provide the right number of arguments:

        irb(main):079:0> x = MyRange.new
        ArgumentError: wrong number of arguments (0 for 2)
                from (irb):79:in `initialize'
                from (irb):79:in `new'
                from (irb):79
                from (null):0
        irb(main):080:0> x = MyRange.new(1, 10)
        => #<MyRange ...>
Then I asked people how to write a method that we'll call "eech" for now that simply prints every integer in the range from first to last. Someone suggested using a while loop:

        def eech
          i = @first
          while i <= @last
            puts i
            i += 1
          end
        end
We had to remember to put an @ in front of every instance variable name (a common error, especially for people used to Java). We forgot to include the increment of i in our first version, which gave us an infinite loop, but when we added it, we found that it printed the values, as expected:

        irb(main):018:0> x.eech
        1
        2
        3
        4
        5
        6
        7
        8
        9
        10
        => nil
Everyone thought this was very boring until I said that we were about to see something really interesting. I said that instead of calling "puts" to print the value, what if we instead call "yield"?

        def eech
          i = @first
          while i <= @last
            yield i
            i += 1
          end
        end
The yield statement is used in Ruby to invoke a block. In fact, now that the method calls yield, Ruby insists on getting a block when the method is called:

        irb(main):029:0> x.eech
        LocalJumpError: no block given
                from (irb):23:in `eech'
                from (irb):29
                from :0
Now we have to supply a block to execute, as in:

        irb(main):031:0> x.eech {|n| puts 2 * n}
        2
        4
        6
        8
        10
        12
        14
        16
        18
        20
        => nil
Here's what is going on. The block represents some code that isn't immediately executed. It's passed to the eech method. The eech method does whatever it wants to, but then it calls yield as a way to invoke the block. At that point, control shifts to the block. The method called yield with a parameter, so that value is passed to the block as its parameter n. Once the block finishes executing, control goes back to the eech method. The eech method then does more work and calls yield again, shifting control back to the block. This back and forth continues until the eech method finishes executing.
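
The back and forth can be made visible with a small trace. This is a minimal sketch rather than the code from lecture: the optional log parameter on eech and the helper method trace are additions made for illustration.

```ruby
class MyRange
  def initialize(first, last)
    @first = first
    @last = last
  end

  # Same loop as above, plus an optional log (an addition for this sketch)
  # that records each moment the method hands control to the block.
  def eech(log = nil)
    i = @first
    while i <= @last
      log << "method yields #{i}" if log
      yield i        # run the caller's block with i as its parameter
      i += 1         # control returns here once the block finishes
    end
  end
end

# Hypothetical helper: returns the order in which method and block ran.
def trace(first, last)
  log = []
  MyRange.new(first, last).eech(log) { |n| log << "block receives #{n}" }
  log
end

puts trace(1, 3)
```

The printed lines alternate between "method yields" and "block receives" for 1, 2 and 3, showing that the method and the block take turns.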

I briefly discussed the idea of a block as a closure. When we studied ML, we saw that a closure has two key elements: the code to execute and the environment in which that code was defined.

ML functions keep track of both the code and the environment. In the same way, Ruby blocks keep track of their context, remembering any local variables and keeping track of the value of "self" (which object it is defined inside of).
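
Here is a minimal sketch of both parts of that context. The method sum_range and the class Greeter are invented for this example and did not appear in lecture.

```ruby
# The block reads and updates total, a local variable of the enclosing method.
def sum_range(first, last)
  total = 0
  (first..last).each { |n| total += n }
  total
end

# Hypothetical class showing that a block keeps the self of the place
# where it was written.
class Greeter
  def initialize(name)
    @name = name
  end

  def greet_all(others)
    # @name is the Greeter's instance variable, even though the block is
    # executed by Array's map method.
    others.map { |other| "#{@name} greets #{other}" }
  end
end

puts sum_range(1, 10)                              # 55
puts Greeter.new("Ann").greet_all(["Bob", "Cy"])
```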

I said that we'd talk more about this interesting aspect of Ruby in Friday's lecture.


Stuart Reges
Last modified: Thu Mar 1 15:34:48 PST 2007