    >> x = gets
    hello there
    => "hello there\n"

You can define a variable that is tied to an external input file by calling File.open:

    >> infile = File.open("utility.sml")
    => #<File:utility.sml>

Here I'm reading our file of ML utility functions that we used earlier in the quarter. To get a line of text, you can call gets on this object, as in:

    >> infile.gets
    => "(* Stuart Reges *)\n"
    >> infile.gets
    => "(* 1/17/07 *)\n"
    >> 3.times {puts infile.gets}
    (* *)
    (* Collection of utility functions *)
    => 3

You can also use a for-each loop, as in:

    for str in infile
      puts str
      puts str
    end

This will echo each line of the input file twice. Or you can read the whole thing into an array by saying:

    lines = infile.readlines

Keep in mind, though, that the file object keeps track of where it is in the file, so you might need to open the file again to read it more than once.
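As a small sketch of that last point (not something we ran in lecture), the code below writes a three-line temporary file, reads it twice in a row, and then calls rewind before reading again. The temporary file and the variable names are made up for illustration; rewind is offered here as an alternative to reopening the file.

```ruby
require "tempfile"

# Create a throwaway file with three lines (illustrative data only).
tmp = Tempfile.new("demo")
tmp.write("first\nsecond\nthird\n")
tmp.close

infile = File.open(tmp.path)
first_pass = infile.readlines    # reads all three lines
second_pass = infile.readlines   # already at end of file, so this is empty
infile.rewind                    # move back to the start of the file
third_pass = infile.readlines    # reads all three lines again
infile.close
tmp.unlink

puts first_pass.length
puts second_pass.length
puts third_pass.length
```

The second call comes back empty because the file object is already positioned at the end of the file.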
Then we reviewed console and file-reading operations. I mentioned that I particularly like the readlines method, as in:

    lst = File.open("hamlet.txt").readlines

This read in the entire contents of Hamlet into an array of strings. We were then able to ask questions like how many lines there are in the file or what the 101st line is:

    irb(main):002:0> lst.length
    => 4463
    irb(main):003:0> lst[100]
    => " Hor. Well, sit we down,\r\n"

I asked people how we could write code to count the number of occurrences of various words in the file. We'd want to split each line using whitespace, which you can get by calling the string split method, as in:

    irb(main):004:0> lst[100].split
    => ["Hor.", "Well,", "sit", "we", "down,"]

To store the counts for each word, we need some kind of data structure. In Java we'd use a Map to associate words with counts. We can do that in Ruby with a hashtable:

    irb(main):005:0> count = Hash.new
    => {}

As we saw in an earlier lecture, we can use the square bracket notation to refer to the elements of the table. For example, to increment the count for the word "hamlet", we're going to want to execute a statement like this:

    count["hamlet"] += 1

Unfortunately, when we tried this out, it generated an error:

    irb(main):006:0> count["hamlet"] += 1
    NoMethodError: undefined method `+' for nil:NilClass
            from (irb):6
            from :0

That's because there is no entry in the table for "hamlet". But Ruby allows us to specify a default value for table entries that gets around this:

    irb(main):007:0> count = Hash.new 0
    => {}
    irb(main):008:0> count["hamlet"] += 1
    => 1
    irb(main):009:0> count
    => {"hamlet"=>1}

Using this approach, it was very easy to count the occurrences of the various words in the lst array:

    irb(main):007:0> count = Hash.new 0
    => {}
    irb(main):010:0> for line in lst do
    irb(main):011:1*   for word in line.split do
    irb(main):012:2*     count[word.downcase] += 1
    irb(main):013:2>   end
    irb(main):014:1> end

After doing this, we could ask for the number of words in the file and the count for individual words like "hamlet":

    irb(main):022:0> count.length
    => 7234
    irb(main):023:0> count["hamlet"]
    => 28

The File object can be used with a foreach loop, so we could have written this same code without setting up the array called lst:

    irb(main):024:0> count = Hash.new 0
    => {}
    irb(main):025:0> for line in File.open("hamlet.txt") do
    irb(main):026:1*   for word in line.split do
    irb(main):027:2*     count[word.downcase] += 1
    irb(main):028:2>   end
    irb(main):029:1> end
    => #<File:hamlet.txt>
    irb(main):030:0> count.length
    => 7234
    irb(main):031:0> count["hamlet"]
    => 28

The key point here is that it is possible to write just a few lines of Ruby code to express a fairly complex operation to be performed. We'd expect no less from a popular scripting language.
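For anyone who wants to try the same idea without a copy of hamlet.txt, here is a minimal sketch that packages the loop as a method. The method name count_words and the sample text are my own inventions for illustration. It reads from a StringIO, which responds to each_line the same way a File does.

```ruby
require "stringio"

# Count how often each whitespace-separated word occurs, ignoring case.
# (count_words is a hypothetical helper, not code from lecture.)
def count_words(io)
  counts = Hash.new(0)   # missing words start with a count of 0
  io.each_line do |line|
    line.split.each { |word| counts[word.downcase] += 1 }
  end
  counts
end

sample = StringIO.new("To be, or not to be\nthat is the question\n")
counts = count_words(sample)

puts counts["to"]            # "To" and "to" are counted together
puts counts["hamlet"]        # not in the text, so the default 0 comes back
puts counts.key?("hamlet")   # reading a missing key does not add an entry
puts counts.length
```

As in the lecture version, punctuation stays attached to the words, so "be," and "be" are counted separately.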
I then spent a few minutes showing people the Bagels and Jotto programs that are included in homework 9.
I then discussed the idea of writing an iterator for the binary tree class. How would we implement a binary tree inorder iterator for a Java binary tree? There were several suggestions. One idea was to do the complete traversal in advance and store the result in some kind of data structure like an ArrayList and then we could iterate over the ArrayList. Another suggestion was to keep a stack and to simulate the call stack ourselves. That could work as well, although that is also rather tricky.
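To give a feel for the stack suggestion, here is a rough sketch of it written in Ruby rather than Java. It is not code we wrote in lecture; the Node struct, the InorderIterator class, and its method names are all hypothetical. The stack holds the nodes whose left subtrees have been entered but whose own values have not been returned yet.

```ruby
# Hypothetical node type for illustration.
Node = Struct.new(:data, :left, :right)

class InorderIterator
  def initialize(root)
    @stack = []
    push_left_spine(root)
  end

  def has_next?
    !@stack.empty?
  end

  # Return the next value in sorted (inorder) order.
  def next
    raise StopIteration, "no more nodes" unless has_next?
    node = @stack.pop
    push_left_spine(node.right)   # the right subtree comes after this node
    node.data
  end

  private

  # Push a node and all of its left descendants onto the stack.
  def push_left_spine(node)
    while node
      @stack.push(node)
      node = node.left
    end
  end
end

root = Node.new(50, Node.new(20, Node.new(10), Node.new(30)), Node.new(70))
iter = InorderIterator.new(root)
values = []
values << iter.next while iter.has_next?
p values
```

Even in this small form you have to manage the stack by hand, which is the bookkeeping that makes this approach tricky.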
What Sun does is to keep track of parent links in the tree and then you move around in the tree from node to node. That works, but it requires keeping extra parent links and the code to move from node to node is a bit tricky. Here is a bit of source code from TreeMap.java that has the code that moves from one node to the next using an inorder traversal:

    private Entry<K,V> successor(Entry<K,V> t) {
        if (t == null)
            return null;
        else if (t.right != null) {
            Entry<K,V> p = t.right;
            while (p.left != null)
                p = p.left;
            return p;
        } else {
            Entry<K,V> p = t.parent;
            Entry<K,V> ch = t;
            while (p != null && ch == p.right) {
                ch = p;
                p = p.parent;
            }
            return p;
        }
    }

All of these solutions work, but none of them is simple and efficient. Ruby gives us a solution that is simple and efficient. At that point we ran out of time, so I said that we'd complete it in the next lecture.
Ruby gives us a solution that is simple and efficient. We had developed this code in section for a binary search tree:

    class Tree
      def initialize()
        @overallRoot = nil
      end

      def insert(v)
        @overallRoot = insert_helper(v, @overallRoot)
      end

      def print()
        print_helper(@overallRoot)
      end

      private # beginning of private definitions

      class Node
        def initialize(data = nil, left = nil, right = nil)
          @data = data
          @left = left
          @right = right
        end

        attr_reader :data, :left, :right
        attr_writer :data, :left, :right
      end

      def insert_helper(v, root)
        if root == nil
          root = Node.new(v)
        elsif v < root.data then
          root.left = insert_helper(v, root.left)
        else
          root.right = insert_helper(v, root.right)
        end
        return root
      end

      def print_helper(root)
        if root != nil then
          print_helper root.left
          puts root.data
          print_helper root.right
        end
      end
    end

We can define an iterator called each that looks a lot like the current print and print_helper methods. The big difference is that instead of calling puts to print values, they will call yield to generate values:

    class Tree
      ...
      def each
        inorder_helper(@overallRoot) {|n| yield n}
      end

      private

      def inorder_helper(root)
        if root then
          inorder_helper(root.left) {|n| yield n}
          yield root.data
          inorder_helper(root.right) {|n| yield n}
        end
      end
      ...
    end

Given this method, we can call it with a block. In fact, print can now be redefined as a call on this iterator:

    def print()
      each {|n| puts n}
    end

We loaded this new version into irb and tested it out. First we created a tree and inserted 25 random values:

    >> t = Tree.new
    => #<Tree:0xb7eb0a5c @overallRoot=nil>
    >> 25.times{t.insert(rand(100))}
    => 25

We found that print still worked just fine:

    >> t.print
    2
    2
    15
    16
    19
    23
    23
    32
    38
    42
    43
    47
    51
    61
    64
    68
    70
    73
    77
    79
    80
    83
    88
    90
    96
    => nil

But now we could specify variants of print by using the inorder iterator, like printing each number doubled:

    >> t.each {|n| puts 2 * n}
    4
    4
    30
    32
    38
    46
    46
    64
    76
    84
    86
    94
    102
    122
    128
    136
    140
    146
    154
    158
    160
    166
    176
    180
    192
    => nil

We were also able to use the iterator to find the sum of the numbers:

    >> sum = 0
    => 0
    >> t.each {|n| sum += n}
    => nil
    >> sum
    => 1282

I pointed out that not only was this iterator fairly easy to define, it is also highly efficient. We would describe it as lazy in the sense that it doesn't compute a value until it needs it. For example, we reset the sum to be 0 and wrote this variant that breaks out of the computation as soon as the sum becomes greater than 100:

    t.each do |n|
      puts n
      sum += n
      break if sum > 100
    end

When we ran it, it produced this output:

    2
    2
    15
    16
    19
    23
    23
    32

We found that it had set sum to 132 and then stopped. As we noted earlier, one approach is to precompute the entire traversal before it begins. For a computation like the one above that breaks out early, that would be very expensive.
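To make the cost difference concrete, here is a self-contained sketch that counts how many values the iterator actually produces when we break early, compared with collecting the whole traversal first. It uses a trimmed-down version of the tree class with a Struct for nodes and a small fixed set of values. Those simplifications and the variable names are mine, not what we typed in lecture.

```ruby
class Tree
  Node = Struct.new(:data, :left, :right)

  def initialize
    @overall_root = nil
  end

  def insert(v)
    @overall_root = insert_helper(v, @overall_root)
  end

  # Inorder iterator: yields each value only when the caller asks for it.
  def each
    inorder_helper(@overall_root) { |n| yield n }
  end

  private

  def insert_helper(v, root)
    if root.nil?
      Node.new(v)
    elsif v < root.data
      root.left = insert_helper(v, root.left)
      root
    else
      root.right = insert_helper(v, root.right)
      root
    end
  end

  def inorder_helper(root)
    return if root.nil?
    inorder_helper(root.left) { |n| yield n }
    yield root.data
    inorder_helper(root.right) { |n| yield n }
  end
end

t = Tree.new
[50, 20, 70, 10, 30, 60, 80].each { |v| t.insert(v) }

# Lazy use: stop as soon as the running sum passes 50.
lazy_visits = 0
sum = 0
t.each do |n|
  lazy_visits += 1
  sum += n
  break if sum > 50
end

# Eager use: collect the entire traversal before looking at any value.
eager = []
t.each { |n| eager << n }

puts "lazy visited #{lazy_visits} values, sum = #{sum}"
puts "eager visited #{eager.length} values"
```

With these seven values the early exit happens after the third value (10, 20, 30), while the precomputed version always walks all seven nodes.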
I gave one other quick example of this kind of computation in Ruby. There is a library known as "mathn" that has some interesting math extensions. For example, it has a class called Prime that can be used to generate the prime numbers in sequence:

    >> require "mathn"
    => true
    >> p = Prime.new
    => #<Prime:0xb7cfa794 @counts=[], @primes=[], @seed=1>
    >> p.next
    => 2
    >> p.next
    => 3
    >> p.next
    => 5
    >> p.next
    => 7
    >> 10.times {puts p.next}
    11
    13
    17
    19
    23
    29
    31
    37
    41
    43
    => 10

It has an each method that can compute an arbitrary number of primes. Obviously it doesn't precompute them. It computes them only as it needs them. We would have to include a call on break or return if we want to use it, as in this code, which computes the sum of the primes up to 10000:

    require "mathn"

    p = Prime.new
    sum = 0
    p.each do |n|
      break if n > 10000
      sum += n
    end
    puts sum

This reports that the sum of the primes up to 10000 is equal to 5736396.
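The same compute-on-demand behavior can be sketched without the library by building an Enumerator. This is not how mathn's Prime class is implemented. It is just a simple trial-division generator, and the names prime_given? and primes are hypothetical. The generator block contains an infinite loop, so the caller has to break out, just as in the code above.

```ruby
# True if n has no divisor among the primes found so far.
# Assumes smaller_primes holds every prime below n, in increasing order.
def prime_given?(n, smaller_primes)
  smaller_primes.each do |q|
    return true if q * q > n       # no divisor up to sqrt(n), so n is prime
    return false if (n % q).zero?
  end
  true
end

# An endless stream of primes, each one computed only when requested.
primes = Enumerator.new do |out|
  found = []
  n = 2
  loop do
    if prime_given?(n, found)
      found << n
      out << n
    end
    n += 1
  end
end

sum = 0
primes.each do |q|
  break if q > 10000
  sum += q
end
puts sum

p primes.take(10)
```

The sum printed here should agree with the 5736396 reported above, and take(10) stops the generator after ten primes rather than running forever.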
I also briefly mentioned that Wikipedia has a nice entry on probabilistic primality testing using a technique known as Miller-Rabin. I had considered giving this as a Ruby assignment, but I was thwarted by the fact that the Wikipedia page includes sample Ruby code. I copied it and pasted it into irb and we found that we could compute the same sum of primes using the new prime? method:

    >> sum = 0
    => 0
    >> for n in 1..10000 do
    >>   sum += n if n.prime?
    >> end
    => 1..10000
    >> sum
    => 5736396
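The Wikipedia code itself isn't reproduced in these notes, so here is an independent sketch of the Miller-Rabin idea for readers who want to see its shape. It is written as a standalone method with a hypothetical name, miller_rabin_prime?, rather than as a prime? method added to the integer class. Instead of random bases it uses the fixed bases 2, 3, 5 and 7, which are known to give a correct answer for every n below 3,215,031,751. For larger numbers it should be treated as a probabilistic test only.

```ruby
# Miller-Rabin test with fixed bases (a sketch, not the Wikipedia code).
def miller_rabin_prime?(n, bases = [2, 3, 5, 7])
  return false if n < 2
  bases.each do |b|
    return true if n == b
    return false if (n % b).zero?
  end

  # Write n - 1 as d * 2**r with d odd.
  d = n - 1
  r = 0
  while d.even?
    d /= 2
    r += 1
  end

  # n passes for base a if a**d is 1 or n - 1 (mod n),
  # or if repeated squaring reaches n - 1 within r - 1 steps.
  bases.all? do |a|
    x = a.pow(d, n)
    next true if x == 1 || x == n - 1
    (r - 1).times.any? do
      x = x.pow(2, n)
      x == n - 1
    end
  end
end

sum = 0
for n in 1..10000 do
  sum += n if miller_rabin_prime?(n)
end
puts sum
```

The two-argument form of pow does modular exponentiation, which keeps the intermediate values small even when d is large.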