CSE 341 -- Data Abstractions in ML

Introducing New Data Types

ML, like most languages, allows us to create new datatypes, which may or may not be polymorphic. A datatype declaration looks like this:

datatype (type parameters) typename = con1 (of type1)
                                    | con2 (of type2)
                                    ...
                                    | condN (of typeN);

(* here is a simple example *)

datatype CarMake = Ford | VW | Toyota | Lada ;
datatype  CarMake
  con Ford : CarMake
  con Lada : CarMake
  con Toyota : CarMake
  con VW : CarMake

- val mycar = Lada;
val mycar = Lada : CarMake
In the above example, we declare a new type, CarMake, and four type constructors (Ford, VW, Toyota, Lada). Type constructors can take parameters; if they don't they act as "constants" of the type. Below is a simple example of an abstract stack of integers:
- datatype IntStack = IntStack of int list;
datatype  IntStack
  con IntStack : int list -> IntStack
We introduce a new type called IntStack with the datatype keyword. On the right of the =, we introduce a data constructor, which is also called IntStack (the data constructor does not have to have the same name as the type, but it can). This data constructor takes as an argument a list of integers and wraps those integers in the type Stack. Data constructors shouldn't be confused with functions, although they act like functions which create values with a type tag. Below we introduce some operations on integer stacks:
exception EmptyStack;

- val newIntStack = IntStack(nil);
val newIntStack = IntStack [] : IntStack

- fun push (IntStack(s),item) = IntStack(item::s);
val push = fn : IntStack * int -> IntStack

- fun top (IntStack(nil)) = raise EmptyStack
    | top (IntStack(x::xs)) = x;
val top = fn : IntStack -> int

- fun pop (IntStack(nil)) = raise EmptyStack
    | pop (IntStack(x::xs)) = IntStack(xs);
val pop = fn : IntStack -> IntStack

- val S = push(1, push(2, newIntStack));
val S = IntStack [1,2] : IntStack

- val foo = IntStack [1,2,3,4];
val foo = IntStack [1,2,3,4] : IntStack
Notice the type profiles: Even though a stack is actually represented as a list of integers, this is not evident from the type profiles.


Polymorphic Data Types

We'd like to be able to define data types which are polymorphic. The idea is that we'd like a stack of any type of object, not just of integers. ML allows us to do this conveniently by "parameterizing" the type definition. Below is the implementation of a "generic" stack.
datatype 'a Stack = Stack of 'a list;
The important difference is that the type definition is augmented with a type parameter, indicating a polymorphic type: a stack of "anythings." We also introduce a data constructor (called Stack) which is applied to lists of any type and wraps the list up as an object of type Stack. Below we define some operations on Stacks:
val newStack = Stack(nil);

fun push (item, Stack(s)) = Stack(item::s);

fun top (Stack(nil)) = raise EmptyStack
  | top (Stack(x::xs)) = x;

fun pop (Stack(nil)) = raise EmptyStack
  | pop (Stack(x::xs)) = Stack(xs);

- val S = push(1, push(2, newStack));
val S= Stack [1,2] : int Stack

- val foo = Stack [1,2,3];
val foo = Stack [1,2,3] : int Stack

Recursive Data Types

ML also allows us to introduce recursively defined types, such as binary trees. Here is the definition of a binary tree of integers:
datatype Btree = Empty | Node of int * Btree * Btree;
We introduce a new type called Btree, and two constructors, Empty and Node. Empty is a constructor which takes no arguments and we'll use it to represent an empty tree. Node is a constructor which takes a 3-tuple consisting of an integer, and two more objects which are of type Btree. Below we define a couple of operations which allow us to use the tree as a Binary Search Tree.
fun insert(x, Empty) = Node(x, Empty, Empty)
  | insert(x, Node(y, left, right)) =
    if x=y then Node(y, left, right)
    else 
	if x < y then Node(y, insert(x, left), right)
    else
	Node(y, left, insert (x, right));
	       
fun lookup(x, Empty) = false
  | lookup(x, Node(y, left, right)) =
    if x=y then true
    else
      if x < y then lookup(x, left)
    else
      lookup(x, right);

- val t = insert(3, insert(1, insert(2, Empty)));
val t = Node (2,Node (1,Empty,Empty), Node (3,Empty,Empty)) : Btree

Tables

Below, we introduce a generic table:
datatype (''key, 'val) Table = Table of (''key * 'val) list;
We introduce a new type called Table, which is parameterized by a tuple of some equality type and some type. In doing so, we introduce a new data constructor called Table, which takes a list of tuples. Our representation, therefore, is that a table consists of a list of tuples, the first element of which is the key, and the second element of which is the value. The key must be of an equality type, because we'll want to perform lookup on the table. Below are some operations we'll want to define for tables.
datatype (''key, 'val) table = table of (''key * 'val) list;

exception TableError;

(* create a new, empty table *)
val create = table(nil);

(* lookup returns the value bound to key, or raises a table error *)
fun get(key, table(nil)) = raise TableError
  | get(key, table((k,v)::xs)) = 
       if key=k then v else get(key, table(xs));

(* assoc updates entries if they exist, otherwise just adds them *)
fun put(key, value, table(bindings)) =
   let
     fun update(k,v,nil) = [(k,v)]
       | update(k,v,(k1,v1)::bindings) =
	   if k=k1 then 
             (k,v)::bindings
	   else 
             (k1,v1)::update(k,v,bindings);
   in
      table(update(key, value, bindings))
   end;

- val t = table [("bob", 2.3), ("jane", 4.0)];
val t = table [("bob",2.3),("jane",4.0)] : (string,real) table

- val t = assoc ("bob", 2.7, t);
val t = table [("bob",2.7),("jane",4.0)] : (string,real) table

S-Expressions in ML

We can represent S-expressions in ML as well! In this way, we support bounded, planned, heterogeneity in ML.
datatype Sexpr = 
  Nil |
  Integer of int |
  Symbol of string |
  Cons of (Sexpr * Sexpr);

fun first(Nil) = Nil
  | first(Cons(hd,x)) = hd;

fun rest (Nil) = Nil
  | rest (Cons(x,tl)) = tl;

Structures & Signatures

The problem with our table implementation (above) is that it doesn't do a very good job of hiding its representation. Also, we define functions (globally) that operate on the contents of the table, which may cause name collisions in a large system. First let's look at clustering the table operations into a module. We'll use a structure.
structure Table = struct

(* insert all of the code for tables here *)

end
This wraps all of our definitions (datatypes, exceptions, and functions into a module). When we file in this definition, the interpreter responds:
structure Table :
  sig
    datatype ('a,'b) table = table of ('a * 'b) list
    exception TableError
    val create : ('a,'b) table
    val get : ''a * (''a,'b) table -> 'b
    val put : ''a * 'b * (''a,'b) table -> (''a,'b) table
  end
Notice that it has deduced a signature for the structure. Now we can use the structure:
- val t = Table.create;
val t = Table [] : (''a,'b) Table
- val t = Table.put("bob", 1, t);
val t = Table [("bob",1)] : (string,int) Table
- Table.get("bob", t);
val it = 1 : int
Notice the use of the dot-notation to access names defined in the structure. While this solves a namespace issue, we still have the problem of visibility: too much of the internal representation is visible. Not only are all functions and values visible, the actual representation of the datatype (table as a list) is visible. Essentially everything inside of our table is public to the outside world. If we wanted to restrict access to elements of the structure we can define a signature, which defines the interface to the structure.
signature TABLE = sig
    type  (''key, 'val) table;
    exception TableError;
    val create : (''key, 'val) table;
    val get : ''key * (''key, 'val) table -> 'val;
    val put : ''key * 'val * (''key, 'val) table -> (''key, 'val) table;
end
Notice that the above is exactly the same as the signature deduced by ML, except that we no longer advertise a datatype definition. Instead, we say simply that there is a parameterized type called table, but don't say anything about its representation. Now we can modify our structure definition as follows:
structure Table : TABLE = struct

(* insert all of the code for tables here *)

end
We can also "inline" the signature definition as part of the structure. Below is an example for a Point structure.
structure Point :
sig
    type point;

    datatype Point = Point of int * int;
    val add : Point * Point -> Point
    val create : int * int -> Point
end
= 
struct
  datatype Point = Point of int * int;
  fun create(x,y) = Point(x,y);
  fun add(Point(x1, y1), Point(x2, y2)) = Point(x1+x2, y1+y2);
  (* maybe a bunch of private stuff here too! *)
end;
In many cases, it may be preferrable to keep them separate, because then people could build different implementations of a "point" signature, such as cartesian representations or polar representations.

Summary

Structures are similar in some ways to classes in C++, although they are arguably more general. Whereas a class defines one type and operations on that type, structures can define collections of types and operations. ML decouples the notion of interface from representation via the use of signatures. Signatures coupled with structures allow us to perform information hiding. In the next section, we'll se a powerful language feature for importing types/functions into structures, called functors, which is similar to the C++ template mechanism.
dugan@cs.washington.edu (Last Update: )