CSE341 Notes for Friday, 1/26/07

I began by asking whether anyone had heard about the concept of nullable types. Nobody raised their hand. I mentioned that I had heard an interesting talk by Anders Hejlsberg, the designer of the C# programming language, in which he described Microsoft's decision to include nullable types in C# 2.0.

I said that in C# 2.0, you can declare a variant of int called int? that is a nullable version of int. I asked what that might look like and someone said it sounds like an int that can be set to null. That's exactly right. In version 2.0 of C# you can say things like:

        int? x;
        x = null;

Why would you want to do that? Anders said that Microsoft was doing this to allow C# code to more easily interoperate with SQL applications. Anyone who uses Excel should be familiar with this concept. There is a difference between leaving a data cell blank versus filling in a value like 0. I often have to yell at the TAs not to enter a 0 unless a student earned an actual score of 0. Otherwise our computations for values like minimum, median, and average will give misleading results.
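
To make the blank-versus-zero point concrete, here is a small Java sketch that was not part of the lecture. It assumes a blank cell is represented by null in a list of Integer scores; the class name BlankVersusZero and the method averageIgnoringBlanks are made up for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class BlankVersusZero {
    // Average of the scores that were actually recorded.
    // Assumption for this sketch: null marks a blank cell.
    public static double averageIgnoringBlanks(List<Integer> scores) {
        return scores.stream()
                     .filter(Objects::nonNull)      // skip blank cells
                     .mapToInt(Integer::intValue)
                     .average()
                     .orElse(Double.NaN);           // no recorded scores at all
    }

    public static void main(String[] args) {
        // One score has not been entered yet.
        List<Integer> withBlank = Arrays.asList(90, null, 80);
        // The same data with the blank wrongly entered as 0.
        List<Integer> withZero = Arrays.asList(90, 0, 80);

        System.out.println(averageIgnoringBlanks(withBlank)); // 85.0
        System.out.println(averageIgnoringBlanks(withZero));  // about 56.67
    }
}
```

The same two recorded scores give an average of 85 when the missing entry is left blank, and roughly 56.67 when it is entered as 0, which is the kind of misleading result described above.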

So then I asked people if this reminded them of anything. Does Microsoft's "int?" correspond to anything in ML? Someone said it's an int option, which is right. ML's option type allows us to create nullable types. Of course, option types can be a bit of a pain, as we saw in the last lecture where I struggled to write a good version of the depthAt function. I've included a clean solution in the Wednesday lecture notes. Microsoft has decided to have the compiler do a lot of the work for you to implicitly call the equivalent of valOf and the equivalent of the SOME constructor to "wrap" a value into an option.

One of the things I like about ML is that it is usually possible to write clean, simple code without having to redesign the language or to build such support into the compiler. The option type is easy to define in terms of standard ML:

        datatype 'a option = NONE | SOME of 'a;

Of course, we also have to define some supporting functions, as in:

        exception Option;
        fun valOf(NONE) = raise Option
        |   valOf(SOME x) = x;

        fun isSome(NONE) = false
        |   isSome(SOME x) = true;

but again, this is fairly easy to do with the standard tools of ML.
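
For comparison, Java later gained a library class, java.util.Optional (added in Java 8, well after this lecture), that plays a similar role. The sketch below is only meant to line up the pieces: Optional.empty() corresponds roughly to NONE, Optional.of(x) to SOME x, isPresent() to isSome, and get() to valOf. The class OptionDemo and the helper scoreFor are hypothetical names invented for the example.

```java
import java.util.NoSuchElementException;
import java.util.Optional;

class OptionDemo {
    // Hypothetical lookup that may or may not have a score to report.
    public static Optional<Integer> scoreFor(String student) {
        if (student.equals("alice")) {
            return Optional.of(17);   // like SOME 17
        }
        return Optional.empty();      // like NONE
    }

    public static void main(String[] args) {
        Optional<Integer> found = scoreFor("alice");
        Optional<Integer> missing = scoreFor("bob");

        System.out.println(found.isPresent());   // true, like isSome(SOME 17)
        System.out.println(found.get());         // 17, like valOf(SOME 17)
        System.out.println(missing.isPresent()); // false, like isSome(NONE)

        try {
            missing.get();                       // like valOf(NONE), which raises Option
        } catch (NoSuchElementException e) {
            System.out.println("empty Optional has no value");
        }
    }
}
```

Unlike C#'s int?, neither ML's option nor Java's Optional gets implicit wrapping and unwrapping from the compiler; the programmer calls these functions explicitly.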

Then I started a new topic: lexical scope versus dynamic scope. This is just one example of a number of related topics that have to do with the static properties of a program versus the dynamic properties of a program. The terms compile time and run time capture the same distinction: static properties can be deduced ahead of time by a program like a compiler, whereas dynamic properties are apparent only when the program actually executes.

Lexical scope is a static property, which is why it is sometimes referred to as static scoping (e.g., in the Wikipedia entry about scope). Lexical scope will be familiar because Java uses it. Consider, for example, the following program:

        public class Test {
            public static int x = 3;

            public static void one() {
                x *= 2;
                System.out.println(x);
            }

            public static void two() {
                int x = 5;
                one();
                System.out.println(x);
            }

            public static void main(String[] args) {
                one();
                two();
                int x = 2;
                one();
                System.out.println(x);
            }
        }

In Java, every set of curly braces introduces a new lexical scope (a new region of the program known as a block). In the program above, there is an outer scope for the overall class and inner scopes for each of the three methods:

         class Test
        +------------------+
        |                  |
        |  method one      |
        | +--------------+ |
        | |              | |
        | +--------------+ |
        |                  |
        |  method two      |
        | +--------------+ |
        | |              | |
        | +--------------+ |
        |                  |
        |  method main     |
        | +--------------+ |
        | |              | |
        | +--------------+ |
        +------------------+

We want to pay attention to the identifier "x" as it is used in each of these scopes. There is a global variable x declared in the outer scope. All three methods refer to x and two of the three declare a local x:

         class Test
        +------------------+
        | global int x     |
        |                  |
        |  method one      |
        | +--------------+ |
        | | refers to x  | |
        | +--------------+ |
        |                  |
        |  method two      |
        | +--------------+ |
        | | local int x  | |
        | | refers to x  | |
        | +--------------+ |
        |                  |
        |  method main     |
        | +--------------+ |
        | | local int x  | |
        | | refers to x  | |
        | +--------------+ |
        +------------------+

In both method two and method main, the local definition of x is the one that is used inside that method. This is actually the same in both lexical scope and dynamic scope. The big question has to do with the x in method one. It is not defined in method one, so which x is used? The answer is familiar to all of us. The reference to x in method one is a reference to the global variable x.

Because all of the manipulations in method one are on the global variable x, we know that it doubles from 3 to 6 to 12 to 24 and we know that the other methods refer to their local variables. As a result, people were able to easily tell me that the output produced by the program would be 6, 12, 5, 24, 2.

I pointed out that in this example the scopes aren't very deeply nested, but the scopes can actually be fairly deeply nested because we can have a situation like a for loop inside a while loop inside an if/else inside a method inside a class.

I said that I wanted to think about a hypothetical language that we might call "Dynamic Java" which uses dynamic instead of lexical scope. Let's consider how the same program would execute in Dynamic Java. There would be a dynamic scope opened up when we first invoked the Test class. Most people don't realize that this kind of thing happens in Java, but you'd see it very clearly if you add a static initializer to the class. In Java, you can add code like the following that is executed when the class is first accessed:

        public class Test {
            static {
                System.out.println("in static initializer");
            }

            ...
        }

So when we access this class, we get a dynamic scope for the class itself that has the global variable x inside it:

         class Test
        +-------------------------+
        | global int x            |
        |                         |
        | ...                     |
        +-------------------------+

So far this is the same as in the lexical scope case. But now we have to consider the sequence of methods that are called to determine which dynamic scopes are created. We start by calling method main, which means we'll have a scope for that call:

         class Test
        +-------------------------+
        | global int x            |
        |                         |
        |  method main            |
        | +---------------------+ |
        | |                     | |
        | +---------------------+ |
        +-------------------------+

Method main makes three calls: first to one, then to two, then to one again. Each call introduces a new scope, so we end up with three inner scopes:

         class Test
        +-------------------------+
        | global int x            |
        |                         |
        |  method main            |
        | +---------------------+ |
        | |  method one         | |
        | | +-------------+     | |
        | | |             |     | |
        | | +-------------+     | |
        | |                     | |
        | |  method two         | |
        | | +-------------+     | |
        | | |             |     | |
        | | +-------------+     | |
        | |                     | |
        | |  method one         | |
        | | +-------------+     | |
        | | |             |     | |
        | | +-------------+     | |
        | +---------------------+ |
        +-------------------------+

This is very different from the lexical scope case. With lexical scope these were all at the same outer level and there was only one scope for method one. But we aren't done even with this picture. Remember that method two calls method one, which means that there is an inner scope produced by that call:

         class Test
        +-------------------------+
        | global int x            |
        |                         |
        |  method main            |
        | +---------------------+ |
        | |  method one         | |
        | | +-------------+     | |
        | | |             |     | |
        | | +-------------+     | |
        | |                     | |
        | |  method two         | |
        | | +-----------------+ | |
        | | |                 | | |
        | | |  method one     | | |
        | | | +-------------+ | | |
        | | | |             | | | |
        | | | +-------------+ | | |
        | | +-----------------+ | |
        | |                     | |
        | |  method one         | |
        | | +-------------+     | |
        | | |             |     | |
        | | +-------------+     | |
        | +---------------------+ |
        +-------------------------+

Now consider what happens when we include information about variable declarations and references:

         class Test
        +-------------------------+
        | global int x            |
        |                         |
        |  method main            |
        | +---------------------+ |
        | |  method one         | |
        | | +-------------+     | |
        | | | refers to x |     | |
        | | +-------------+     | |
        | |                     | |
        | |  method two         | |
        | | +-----------------+ | |
        | | | local int x     | | |
        | | |                 | | |
        | | |  method one     | | |
        | | | +-------------+ | | |
        | | | | refers to x | | | |
        | | | +-------------+ | | |
        | | | refers to x     | | |
        | | +-----------------+ | |
        | |                     | |
        | | local int x         | |
        | |                     | |
        | |  method one         | |
        | | +-------------+     | |
        | | | refers to x |     | |
        | | +-------------+     | |
        | |                     | |
        | | refers to x         | |
        | +---------------------+ |
        +-------------------------+

The key thing to pay attention to is the interpretation of the variable x in method one. On the first call to method one, the only x we will have seen is the global one, so this call doubles the global variable to 6 and prints it out. But on the second call to method one, we see the x that is local to method two. So we double it from 5 to 10 and print it out both in method one and in method two. Then we return to main and a local variable x is introduced. This local variable is the one that method one finds on the third call to the method, so it doubles it from 2 to 4. This value is then reported by main. So in Dynamic Java this program would produce the output 6, 10, 10, 4, 4.

Someone then asked about the relationship between dynamic scope and the call stack. I said that was an excellent way to think about this issue. It's easy to get the impression that dynamic scope would be difficult to implement. In fact, it's very easy to implement if you are writing an interpreter because with dynamic scope you search for the most recently allocated version of a variable on the call stack. If the current method has allocated such a variable, you use it. If not, you see if the method that called this one has allocated such a variable and so on.
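
The call-stack search described above can be sketched in ordinary Java by simulating what an interpreter for the hypothetical Dynamic Java might do with the Test program. This is only an illustration under stated assumptions, not how any real implementation works: each call pushes a frame (a map from names to values), the bottom frame stands in for the class-level scope, and every reference to x searches from the newest frame to the oldest. The class DynamicScopeSketch and its helper methods are invented for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DynamicScopeSketch {
    // One frame per active call; the bottom frame is the class-level scope.
    private final Deque<Map<String, Integer>> stack = new ArrayDeque<>();
    private final List<Integer> output = new ArrayList<>();

    // Walk from the newest frame to the oldest and return the first one
    // that has allocated the variable (push adds at the front, and the
    // deque iterates front to back).
    private Map<String, Integer> frameHolding(String name) {
        for (Map<String, Integer> frame : stack) {
            if (frame.containsKey(name)) {
                return frame;
            }
        }
        throw new IllegalStateException("unbound variable: " + name);
    }

    private void declare(String name, int value) { stack.peek().put(name, value); }
    private int get(String name) { return frameHolding(name).get(name); }
    private void set(String name, int value) { frameHolding(name).put(name, value); }
    private void print(int value) { output.add(value); System.out.println(value); }

    private void one() {
        stack.push(new HashMap<>());   // new scope for this call
        set("x", get("x") * 2);        // x *= 2, using whichever x is found first
        print(get("x"));
        stack.pop();
    }

    private void two() {
        stack.push(new HashMap<>());
        declare("x", 5);               // int x = 5;
        one();
        print(get("x"));
        stack.pop();
    }

    public List<Integer> run() {
        stack.push(new HashMap<>());   // class-level scope
        declare("x", 3);               // the global x
        stack.push(new HashMap<>());   // scope for the call to main
        one();
        two();
        declare("x", 2);               // int x = 2; in main
        one();
        print(get("x"));
        stack.pop();
        stack.pop();
        return output;
    }

    public static void main(String[] args) {
        new DynamicScopeSketch().run(); // prints 6, 10, 10, 4, 4 on separate lines
    }
}
```

Running the sketch reproduces the Dynamic Java output worked out above. The only scoping logic is the loop in frameHolding, which is why dynamic scope is so easy to implement in an interpreter.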

I said that dynamic scope isn't used very often because programmers generally find it confusing. The early languages that used it tended to be languages that were interpreted rather than compiled. There is, however, a notable exception still in wide use, and I asked whether anyone could name it. Nobody seemed to know, so I mentioned that it's something we teach in a 300-level class. Still nobody seemed to know, so I mentioned that shell scripts use dynamic scope. We considered the following short example:

        #!/bin/sh
        
        x="hello"
        
        foo()
        {
            echo $x
        }
        
        bar()
        {
            local x="foo"
            foo
        }
        
        foo
        bar
        echo $x

This script produces the following output:

        hello
        foo
        hello

On the first call to the function foo we echo the global variable x. On the second call, made from inside bar, we echo the variable x that is local to function bar. Just to prove that nothing funny is going on, I printed x after the call to function bar at the end to show that the global variable is still set to "hello".

We were running short of time, so I showed two quick examples in ML. I asked people what this code would produce as a result:

        val x = 3;
        fun f(n) = x * n;
        f(8);
        val x = 5;
        f(8);

Several people correctly predicted that both calls on f(8) produced the result 24. So the function is using the initial binding of x to 3 even after we rebind x to something else. I then asked about a fairly complex example using let:

        val y = 2;
        fun f(n) =
            let val x =
                    let val n = 3
                    in 10 * (n + y)
                    end
                val y = 100 * n
            in x + y + n
            end;
        
        f(6);

The result produced here is 656. I said that we'd discuss this example in detail in Monday's lecture.

Stuart Reges

Last modified: Sun Jan 28 21:41:50 PST 2007