341 : 14 Feb 2002 : ML wrapup + project

Principles of module systems

In all of history, human beings have devised only one way of managing the complexity of the systems they construct: divide and conquer. The construction of software is no different. As the amount of code in a system grows, so does the need to break the system into small and manageable pieces that can be considered independently of the others.

To some extent, functions fill this decompositional role. When writing a function, the implementor does not need to think about the client; and when calling a function, the client does not need to worry about its implementation. Fundamentally, functions divide labor between the implementor and client. However, functions are too fine-grained to serve as the only structural unit of large-scale software.

Most reasonable languages therefore have a module system. Module systems, ideally, play several roles at the language level. Roughly in order of increasing complexity, these include the following:

Namespace management
The simplest kind of module system merely provides a disciplined way of separating declared names (data types, functions, global values) into logical groups.
Information hiding
Once you can split up names into distinct spaces, it's not a huge step to hide some of the names in a namespace from "outside view" by other modules.
Interface checking
Statically typed languages usually provide mechanisms for ensuring that modules interact with each other through published "contracts". The most common form of contract is an interface type: a module's published interface can include not only names but type information about those names. Some languages have more complex notions of "contract" as well, e.g. you could declare that something takes "only a non-null pointer to X" as a parameter.
Code reuse
If a module system is flexible enough, then the code in a a module can be reused in many different contexts. Some languages have higher-order modules: such modules construct other modules based on user-specified parameters, facilitating the reuse of code in many different type contexts. These "generic" modules have various names, and varying degrees of flexibility and power: ML has functors, Ada has generics, C++ has templates. Java and C# will both have support for some form of generics in the near future.

(Note: dynamically typed languages usually are flexible enough for the programmer to write higher-order modules manually. For example, in Scheme it's quite trivial to write code that manipulates other code in much the same way that a functor does; the difference is that this code cannot be statically type checked.)

Modules also have several implications at the implementation level. For example, they could be used to facilitate separate compilation, or dynamic loading of code (e.g., plugins/applets).

Modules: ML vs. the world

ML's modules provide all of the above benefits:

Many languages, by contrast, have module systems that are much weaker:

One feature that the Standard ML module system does not have is inheritance. This form of code reuse is quite different from functors: whereas a functor allows a module to be parameterized by other modules, inheritance allows a module to inherit code from a parent module. They may seem totally different, but both involve the notion of an "incomplete" module that relies upon another module for part of its implementation.

For example, a functor can be viewed as an "incomplete" module that is "completed" by its application to its module parameters. Similarly, a class can be viewed as an "incomplete" module that is "completed" by the addition of its subclass/superclass code. Both mechanisms facilitate code reuse---the functors and classes can be "completed" by many other pieces of code, and hence reused---but in quite different ways.

A few final features of ML

Mutual recursion

(* Two mutually recursive datatypes.  We use the 'and' keyword instead
   of redeclaring 'datatype'.  These are the lists with an even number of
   elements and odd number of elements, respectively. *)
datatype 'a EvenList = Empty
                     | Even of 'a * 'a OddList
     and 'a OddList = Odd of 'a * 'a EvenList;

(* Here's the result of entering some data of the above types into the
   SML interpreter... notice that we can't construct an EvenList of
   length 1! *)

- Empty;
val it = Empty : 'a EvenList

- Even(1, Odd(0, Empty));
val it = Even (1,Odd (0,Empty)) : int EvenList

- Even(2, Empty);
stdIn:22.1-22.15 Error: operator and operand don't agree [tycon mismatch]
  operator domain: int * int OddList
  operand:         int * 'Z EvenList
  in expression:
    Even (2,Empty)


(* To write well-behaved functions over these types, we need mutually
   recursive functions.  Again, we use the 'and' keyword instead of
   'fun'. *)
fun evenLength(Empty) = 0
  | evenLength(Even(_, rest)) = oddLength(rest) + 1
and oddLength(Odd(_, rest)) = evenLength(rest) + 1;

(* A use... *)
- evenLength(Even(1,Odd(2,Even(3,Odd(4,Empty)))));
val it = 4 : int

Currying

Named after the mathematician Haskell Curry, "currying" is a way of defining functions of multiple arguments in terms of higher-order functions, each of which takes a single argument and returns a higher order function.

The key insight of currying: Any function of type

a * b -> c

can also be phrased as a function of type

a -> b -> c

Consider normal addition:

(* Ordinary add *)
val add = fn (x, y) => x + y;
val foo = add(3, 4);

(* Manually curried add, and its application *)
val handCurriedAdd = fn (x) => fn(y) => x + y;
val addFive   = handCurriedAdd 5;       (* adds 5 to any int *)
val eleven    = addFive 6;              (* applying addFive *)
val seventeen = (handCurriedAdd 8) 9;   (* applying both at once *)
val twenty    = handCurriedAdd 10 10;   (* application is left-associative *)

(* ML provides a nice syntax for currying: instead of having a tuple
   of arguments, have a space-delimited list.  This is automatically
   curried addition: *)
fun curriedAdd x y = x + y;

Why curry? Well, it's nice to give your clients the freedom to partially apply your function and "save the result for later". The SML standard library uses currying heavily.


cse341-webmaster@cs.washington.edu
Last modified: Wed Feb 13 20:13:51 PST 2002