CSE341 Notes for Friday, 4/10/09

I began by reviewing the qsort function we wrote in Monday's lecture to sort a list using the quicksort algorithm. I loaded a file that has some utility functions in it for producing random lists:

        (* a random number generator initialized using the current time *)
        val utilityRandom =
            let val time = IntInf.toInt(Time.toSeconds(Time.now()) mod 
                                        Int.toLarge(valOf(Int.maxInt)))
            in Random.rand(time, 42)
            end;
        
        (* returns a random int value *)
        fun randomInt() = Random.randInt(utilityRandom);
        
        (* returns a list of random ints of given length; assumes n >= 0 *)
        fun randList(n) =
            let val time = IntInf.toInt(Time.toSeconds(Time.now()) mod 100000)
                val r = Random.rand(time, 42)
                fun build(0) = []
                |   build(m) = Random.randInt(r)::build(m - 1)
            in build(n)
            end;
These functions call some built-in utilities for finding out the current time and using that to seed a random number generator. I pointed out that under the "ML resources" section of the class web page I have a link to the documentation for what is known as the "standard basis library" that contains many such useful utilities. This is similar to the way that Java has packages that you can access. As in Java, you can use the dot notation, as I have in the code above.

Using this function, we generated some lists of varying length, including a list of 500 thousand ints, and we were able to sort them using our qsort function.

We looked at some examples to review the fact that we can sort lists using different definitions of order and that we can sort lists that are composed of values other than ints, as in:

        - qsort(op <=, ["four", "score", "and", "seven", "years", "ago"]);
        val it = ["ago","and","four","score","seven","years"] : string list
In this case ML sees that the second argument is a list of string values, so instead of defaulting to integers for the "<=" operator, it uses the version for string values.

In other words, we have written a powerful, general-purpose sorting utility that we can use to sort lists of any kind. Java has similar utilities, but the syntactic overhead is high. You have to define an object that implements the Comparator interface. In ML it is much simpler. Functions can be passed easily as arguments to other functions.
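
The qsort function from Monday's lecture is not reproduced in these notes. As a rough sketch of what such a function could look like (a reconstruction under assumptions, not necessarily the code from lecture), the version below takes the comparison function as its first argument, uses the first element as the pivot, and builds new lists rather than sorting in place:

        (* hypothetical reconstruction; the lecture's qsort may differ *)
        fun qsort(f, []) = []
        |   qsort(f, pivot::rest) =
                let (* ys collects values for which f(x, pivot) holds, zs the rest *)
                    fun split([], ys, zs) = qsort(f, ys) @ (pivot::qsort(f, zs))
                    |   split(x::xs, ys, zs) =
                            if f(x, pivot) then split(xs, x::ys, zs)
                            else split(xs, ys, x::zs)
                in split(rest, [], [])
                end;

        (* passing a different comparison changes the order *)
        val descending = qsort(op >=, [3, 1, 4, 1, 5, 9, 2, 6]);
        val reversedWords = qsort(op >=, ["four", "score", "and", "seven"]);
Here descending is [9,6,5,4,3,2,1,1] and reversedWords is ["seven","score","four","and"]. The only thing that changed between the two calls is the list; ML picks the int or string version of ">=" from the type of the list.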

I then mentioned that we will be talking about three of the most important higher-order functions: map, filter, and reduce. We first looked at map:

        fun map(f, []) = []
        |   map(f, x::xs) = f(x)::map(f, xs);
This function applies a function to every element of a list. I used this with the infix -- operator that we wrote in Wednesday's lecture. It allows you to ask for a list of integers in a particular range. Here are a few examples from the interpreter:

        - 1--10;
        val it = [1,2,3,4,5,6,7,8,9,10] : int list
        - 1--100;
        val it =
          [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,
           29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,
           54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,
           79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100]
          : int list
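The definition of -- from Wednesday's lecture is not included in these notes. A minimal sketch that behaves like the examples above might look like the following; the precedence (the default for "infix") and the choice to return an empty list when the range is empty are assumptions:

        (* hypothetical definition; the version from lecture may differ *)
        infix --;
        fun x--y = if x > y then [] else x::((x + 1)--y);

        val threeToSeven = 3--7;    (* [3,4,5,6,7] *)
        val emptyRange = 5--1;      (* [] *)
With this default precedence, -- binds more loosely than arithmetic and comparison operators, so an expression that compares a range to something else needs parentheses around the range.
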
For example, we can compute the square roots of the numbers 1 to 100 using map:

        - map(Math.sqrt, map(real, 1--100));
        val it =
          [1.0,1.41421356237,1.73205080757,2.0,2.2360679775,2.44948974278,
           2.64575131106,2.82842712475,3.0,3.16227766017,3.31662479036,3.46410161514,
           3.60555127546,3.74165738677,3.87298334621,4.0,4.12310562562,4.24264068712,
           4.35889894354,4.472135955,4.58257569496,4.69041575982,4.79583152331,
           4.89897948557,5.0,5.09901951359,5.19615242271,5.29150262213,5.38516480713,
           5.47722557505,5.56776436283,5.65685424949,5.74456264654,5.83095189485,
           5.9160797831,6.0,6.0827625303,6.16441400297,6.2449979984,6.32455532034,
           6.40312423743,6.48074069841,6.5574385243,6.63324958071,6.7082039325,
           6.78232998313,6.8556546004,6.92820323028,7.0,7.07106781187,7.14142842854,
           7.21110255093,7.28010988928,7.34846922835,7.4161984871,7.48331477355,
           7.54983443527,7.61577310586,7.68114574787,7.74596669241,7.81024967591,
           7.87400787401,7.93725393319,8.0,8.0622577483,8.12403840464,8.18535277187,
           8.24621125124,8.30662386292,8.36660026534,8.42614977318,8.48528137424,
           8.54400374532,8.60232526704,8.66025403784,8.71779788708,8.77496438739,
           8.83176086633,8.88819441732,8.94427191,9.0,9.05538513814,9.11043357914,
           9.16515138991,9.21954445729,9.2736184955,9.32737905309,9.38083151965,
           9.43398113206,9.48683298051,9.53939201417,9.59166304663,9.64365076099,
           9.69535971483,9.74679434481,9.79795897113,9.8488578018,9.89949493661,
           9.94987437107,10.0] : real list
You can also map a function you define over a list. For example, the same list above can be computed by saying:
        - fun f(x) = Math.sqrt(real(x));
        val f = fn : int -> real
        - map(f, 1--100);
This is even easier to write with an anonymous function. Ullman describes how to define anonymous functions in section 5.1.3 of the book. The basic syntax involves the keyword "fn" and the two characters "=>" that are supposed to look like an arrow. The general form is:

        fn <parameter> => <expression>
For example, our function f above could be defined as:

        fn x => Math.sqrt(real(x))
Notice that for anonymous functions you use "fn", not "fun". I read this as, "A function that maps x into Math.sqrt(real(x))." Using this function, we can rewrite our call on map to be:

        map(fn x => Math.sqrt(real(x)), 1--100);
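A few more calls in the same style may help show that the result type of map depends only on the function passed in. This is an illustrative sketch; it repeats the definition of map from above so that it can be loaded on its own, and the lists are made up for the example:

        fun map(f, []) = []
        |   map(f, x::xs) = f(x)::map(f, xs);

        (* int list to int list *)
        val squares = map(fn x => x * x, [1, 2, 3, 4, 5]);
        (* string list to int list *)
        val lengths = map(fn s => size(s), ["four", "score", "and"]);
Here squares is [1,4,9,16,25] and lengths is [4,5,3]. In both cases the result has the same number of elements as the original list.
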
Then I showed the filter function, which takes a predicate function (a boolean test) and a list as arguments and returns the list of values that satisfy the given predicate. We write it this way:

        fun filter(f, []) = []
        |   filter(f, x::xs) =
                if f(x) then x::filter(f, xs)
                else filter(f, xs);
Given this function and the isPrime function we wrote in section, I wrote this expression to request a list of all primes up to 1000:

        - filter(isPrime, 1--1000);
        val it =
          [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97,101,
           103,107,109,113,127,131,137,139,149,151,157,163,167,173,179,181,191,193,
           197,199,211,223,227,229,233,239,241,251,257,263,269,271,277,281,283,293,
           307,311,313,317,331,337,347,349,353,359,367,373,379,383,389,397,401,409,
           419,421,431,433,439,443,449,457,461,463,467,479,487,491,499,503,509,521,
           523,541,547,557,563,569,571,577,587,593,599,601,607,613,617,619,631,641,
           643,647,653,659,661,673,677,683,691,701,709,719,727,733,739,743,751,757,
           761,769,773,787,797,809,811,821,823,827,829,839,853,857,859,863,877,881,
           883,887,907,911,919,929,937,941,947,953,967,971,977,983,991,997] : int list
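The isPrime function from section is not shown in these notes. The sketch below uses a simple trial-division test as a stand-in (an assumption, not necessarily the version written in section) and also shows filter with an anonymous predicate:

        fun filter(f, []) = []
        |   filter(f, x::xs) =
                if f(x) then x::filter(f, xs)
                else filter(f, xs);

        (* hypothetical stand-in for isPrime: try divisors d while d * d <= n *)
        fun isPrime(n) =
            let fun noDivisor(d) =
                    d * d > n orelse (n mod d <> 0 andalso noDivisor(d + 1))
            in n >= 2 andalso noDivisor(2)
            end;

        val smallPrimes = filter(isPrime, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);
        val evens = filter(fn x => x mod 2 = 0, [1, 2, 3, 4, 5, 6]);
Here smallPrimes is [2,3,5,7] and evens is [2,4,6]. Unlike map, filter never changes the values themselves; it only decides which ones to keep.
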
Then we looked at the reduce function, which collapses a list to a single value given a function that collapses two values from the list into a single value. At first this might sound unusual, but you'll find that we have many such collapsing operations. For example, the addition operator takes two numbers and turns them into one number. The reduce function applies such an operation repeatedly, combining values two at a time, until the entire list has been reduced to a single value:

        exception empty_list;
        fun reduce(f, []) = raise empty_list
        |   reduce(f, [x]) = x
        |   reduce(f, x::xs) = f(x, reduce(f, xs));
For example, this expression asks ML to reduce the list of integers 1 through 100 to a single value using addition:

        - reduce(op +, 1--100);
        val it = 5050 : int
We computed 5! by collapsing the list 1 through 5 with multiplication:

        - reduce(op *, 1--5);
        val it = 120 : int
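To see the order in which this definition of reduce combines values, it can help to try an operation where grouping matters. The calls below are illustrative only (they are not examples from lecture); the definition of reduce is repeated so the code can be loaded on its own:

        exception empty_list;
        fun reduce(f, []) = raise empty_list
        |   reduce(f, [x]) = x
        |   reduce(f, x::xs) = f(x, reduce(f, xs));

        (* expands to 10 - (3 - 2), not (10 - 3) - 2 *)
        val difference = reduce(op -, [10, 3, 2]);
        (* Int.max from the basis library collapses two ints into the larger one *)
        val largest = reduce(Int.max, [3, 9, 4, 7]);
        (* an empty list raises empty_list; here we catch it and substitute ~1 *)
        val emptyCase = reduce(op +, []) handle empty_list => ~1;
Here difference is 9, largest is 9, and emptyCase is ~1. For addition and multiplication the grouping makes no difference, which is why the examples above give the expected answers.
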
I asked people for other examples of collapsing operations and we came up with a rather extensive list:

The following table gives a summary of map, filter and reduce.

        function   1st argument                       2nd argument     returns
        map        function mapping 'a to 'b          list of n 'as    list of n 'bs
        filter     predicate converting 'a to bool    list of n 'as    list of m 'as (m <= n)
        reduce     function that collapses a tuple    list of n 'as    one 'a
                   'a * 'a into a single 'a
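
The three functions are often combined, with the output of one feeding the next. As a small sketch (the task and the variable names are made up for illustration), the code below sums the squares of the even numbers from 1 to 10, using the same definitions as above:

        fun map(f, []) = []
        |   map(f, x::xs) = f(x)::map(f, xs);

        fun filter(f, []) = []
        |   filter(f, x::xs) =
                if f(x) then x::filter(f, xs)
                else filter(f, xs);

        exception empty_list;
        fun reduce(f, []) = raise empty_list
        |   reduce(f, [x]) = x
        |   reduce(f, x::xs) = f(x, reduce(f, xs));

        val numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
        val evens = filter(fn x => x mod 2 = 0, numbers);    (* 10 ints down to 5 *)
        val squares = map(fn x => x * x, evens);             (* 5 ints to 5 ints *)
        val total = reduce(op +, squares);                   (* 5 ints to 1 int *)
Here total is 220, which is 4 + 16 + 36 + 64 + 100.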

I then spent a few minutes discussing the fact that the reduce function has disappeared from the standard ML libraries. We still talk about the idea of a reduce function, but there were too many problems with the standard reduce. First, it doesn't handle empty lists well. The newer versions of reduce include an extra parameter that indicates a "default value" to use for the computation when there is no data to process (the answer for an empty list). For example, when you are adding, the default value is 0. When you are multiplying, the default value is 1. Second, we sometimes want to make a distinction between processing the list from left-to-right versus processing the list from right-to-left. For many computations it doesn't matter, but when it does matter, it's nice to be able to control which direction it uses. Finally, reduce always reduces to a value of the same type as the original list. For example, lists of int values are reduced to a single int. Lists of strings are reduced to a single string. Sometimes you want to use a reducing function that reduces to some other kind of value.
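
To make the first and third points concrete, here is a minimal sketch of a reduce that takes a default value. The name reduceWithDefault is hypothetical and this is not a library function; it simply adds the extra parameter to the definition of reduce above:

        (* hypothetical variation on reduce; not part of any library *)
        fun reduceWithDefault(f, default, []) = default
        |   reduceWithDefault(f, default, x::xs) =
                f(x, reduceWithDefault(f, default, xs));

        (* empty lists now have a sensible answer *)
        val emptySum = reduceWithDefault(op +, 0, []);
        val emptyProduct = reduceWithDefault(op *, 1, []);
        (* the result type can differ from the element type: strings in, int out *)
        val totalLength =
            reduceWithDefault(fn (s, n) => size(s) + n, 0, ["four", "score", "and"]);
Here emptySum is 0, emptyProduct is 1, and totalLength is 12. Because the default value supplies the result type, the collapsing function no longer has to take two values of the same type.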

The new terminology involves thinking of this as a "folding" operation, so the two replacement functions are known as foldl and foldr (fold from the left or fold from the right). We've written reduce to fold from the right. For example, the following are equivalent:

        - reduce(op ^, ["four", "score", "and", "seven", "years", "ago"]);
        val it = "fourscoreandsevenyearsago" : string
        - List.foldr op^ "" ["four", "score", "and", "seven", "years", "ago"];
        val it = "fourscoreandsevenyearsago" : string
The call on List.foldr includes an extra parameter of an empty string (default value). To make this even more clear, I used the string "foo":

        - List.foldr op^ "foo" ["four", "score", "and", "seven", "years", "ago"];
        val it = "fourscoreandsevenyearsagofoo" : string
Notice that the values from the list are concatenated with "foo" starting with the rightmost value and working backwards towards the first value. We get the text in the opposite order when we call List.foldl:

        - List.foldl op^ "foo" ["four", "score", "and", "seven", "years", "ago"];
        val it = "agoyearssevenandscorefourfoo" : string
The functions List.foldl and List.foldr are defined as curried functions. That's why you don't use parentheses and commas when you call them. We'll discuss that more in Wednesday's lecture. I also mentioned that the versions of map and filter we wrote above aren't the standard definitions. The standard versions are curried.
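
As a short illustration of the library versions (the lists and variable names are made up for the example), the calls below use the curried List.map and List.filter and then compare the two fold directions with subtraction, where the direction changes the answer:

        (* curried calls: no parentheses or commas around the arguments *)
        val squares = List.map (fn x => x * x) [1, 2, 3, 4];
        val evens = List.filter (fn x => x mod 2 = 0) [1, 2, 3, 4];

        (* foldr computes 1 - (2 - (3 - (4 - 0))) *)
        val fromRight = List.foldr (op -) 0 [1, 2, 3, 4];
        (* foldl computes 4 - (3 - (2 - (1 - 0))) *)
        val fromLeft = List.foldl (op -) 0 [1, 2, 3, 4];

        (* folding a string list down to an int *)
        val totalLength = List.foldr (fn (s, n) => size(s) + n) 0 ["four", "score"];
Here squares is [1,4,9,16], evens is [2,4], fromRight is ~2, fromLeft is 2, and totalLength is 9.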

I also mentioned that Google has developed a tool that they call MapReduce to help them solve the problem of performing large scale computations in a distributed computing environment (hundreds of computers each handling a small part of the overall computation). The lack of side effects that you get from functional constructs like map and reduce makes it easier to parallelize the computation. Google cares so much about this technology that they have developed a course here at the UW that we have numbered 490H. It teaches the same concepts using an open-source implementation of MapReduce known as Hadoop. The department expects to continue to offer 490H in the future.


Stuart Reges
Last modified: Sun Apr 12 19:15:43 PDT 2009