CSE 341 -- Miranda

CSE 341 - Programming Languages - Autumn 2006
Miranda

Key Concepts

The principal Big Ideas that we'll study in Miranda are polymorphic type systems and type inference. These are exhibited in a really crisp, clean form in Miranda. Later in the course we'll look at Java generics, and see how related ideas occur in object-oriented languages (complicated by the addition of inheritance).

Other important ideas in Miranda are:

how to program in a pure functional programming language -- there are no side effects in Miranda
syntax in which layout is significant (like Python)
a language with normal order evaluation (i.e. parameters are passed using lazy evaluation, not passed by value). This lets us use quite interesting programming idioms, including infinite data structures
combinator-based implementation

Referential Transparency and Functional Languages

Imperative languages have variables and assignment; they are in effect abstractions of the von Neumann machine. There is no referential transparency.

In mathematics, we can substitute equals for equals:
if x=y+z then you can substitute y+z for x

Informally, in a referentially transparent language you can substitute equals for equals, just as in mathematics. (You can't do that in an imperative language -- for example, in C or Java, just because you have a statement x=y+1 someplace, you can't safely substitute y+1 for x everywhere in your program.)

Haskell is a standard pure functional language, but it is much more complex than Miranda. ML has a functional subset.

Barriers to widespread use of functional languages:

programmer acceptance
unnatural for some things, such as interactive I/O
efficiency (arguably a declining problem; also functional languages good candidates for parallel execution)

However, even if you don't end up programming in a pure functional language in your other work, it is still very worth knowing about the functional style -- you can often use it effectively in some parts of your application. (For example, Python includes support for lambda, map, list comprehensions, and other concepts we'll see in Miranda, and these can be very clear and concise. Further, it often makes your program clearer if you don't use side effects for a given data structure -- then you know that you can count on some data not being altered as you pass it around.)

Running Miranda

The system is available on the instructional linux cluster (attu.cs.washington.edu). To run it, log in to attu and type

mira

To load in an existing file named myfile.m, type

mira myfile.m

The standard extension for a Miranda source file is .m

Miranda will also generate a compiled version of the file with a .x extension -- if there is no .x version it creates one, and if the .m version is more recent it recompiles.

For a list of useful commands within Miranda type /help. These include the ability to edit the current program (or script, to use Miranda terminology). To do this, type /edit. The newly edited script is then loaded in. Other commands:

/man  - online manual
/quit (or  control d)

There is a standard environment that comes pre-loaded with Miranda, with various useful functions. You can see what is there by looking at the "Standard Environment" option in the online manual. (This is a good thing to do not only to see what is available, but also to see some nice examples of Miranda programmming style.)

The default editor is vi, but you can change this to emacs or pico or whatever you'd like by running this at the Miranda prompt:

/editor emacs

(Note that you only have to do this once. Miranda will remember your editor choice across sessions.)

If you need to avoid having emacs show up in a separate x window for some reason, do this instead:

/editor emacs -nw

If you are using emacs on a .m file, the current version of emacs will think this is an Objective C file and open it in that mode. You can convince emacs to edit .m files in text mode by either creating a file named .emacs in your top-level directory on attu with the following line of code, or else adding this line to an existing .emacs file:

(setq auto-mode-alist (cons '("\\.m$" . text-mode) auto-mode-alist))

There is also an emacs package that defines a genuine Miranda mode -- if someone wants to try getting it working on attu let me know. (Incidentally, notice that emacs is using a Scheme dialect as its extension language!)

Miranda Examples

All the examples in these lecture notes are also available in a Miranda file that you can copy to your own directory and then use or modify. To do this give the following Unix command on attu.cs.washington.edu:

cp ~borning/miranda/lecture.m lecture.m

Also see ~borning/miranda/* on attu for other example programs (or scripts, to use the Miranda term).

Miranda Environment

Miranda is an interactive text-oriented environment. Examples:

Miranda  3+4
  7

Miranda  8 : [1..10]
  [8,1,2,3,4,5,6,7,8,9,10]

Miranda  [1,2,3] ++ [8,9]
  [1,2,3,8,9]

Miranda  member [1,2,3] 8
  False

To define a function:

double x = x+x

Function and other definitions must go in a file (script); Miranda distinguishes between definition mode (when reading from a file) and evaluation mode.

Variable names (including function names) must begin with a lower-case letter.

Data types

numbers (both int and float): 3 4.5 3.9e10
booleans: True False
characters: 'a'
lists: [1,2,3,8]

type system requires that all elements in a list must be of the same type:

  [2,4,7]
  [[1,2],[4,9]]

This would give a type error:

[2, 3, 's']

(Note the contrast with Scheme!)

strings -- shorthand for list of chars: "hi there"

tuples: (False,[1,2,3],"str")

Tuples aren't the same as lists! Lists have indefinite length and all elements must be of the same type; tuples have fixed size but may have different types of elements.

functions are first class citizens

function application denoted by juxtaposition

neg 3
-3
member [1,4,6] 4

infix -- 
3+4
(+) 3 4

arithmetic 
+ - * /
sin cos etc
sum, product

product [1,2,8]

Recursion Examples

Miranda is quite terse.

|| recursive factorial
|| whitespace is significant - notice the use of layout in the example

rec_factorial n = 1, if n=0
                = n * rec_factorial (n-1), n>0

|| alternate version using pattern matching:
pattern_factorial 0     = 1
pattern_factorial (k+1) = (k+1) * pattern_factorial k


|| factorial using the built-in prod function
factorial n = product [1..n]

|| mapping function, like map in Scheme
|| (Miranda also has a builtin function named map, so I gave it a new name here)
my_map f [] = []
my_map f (a:x) = f a : my_map f x


my_map factorial [1,5]
my_map factorial [1..10]
my_map factorial [1..]

Currying

Named after the logician Haskell Curry (not the spices).

First let's define a new function plus so that we don't get confused right off with infix notation:

plus x y = x+y

Now we can apply plus to one argument to get the function that adds one to numbers:

addone = plus 1

Here's the same function, without trying to avoid infix notation. (To use + as an ordinary function put parens around it.)

another_add1 = (+) 1

Currying and higher-order functions:

twice f x = f (f x)
g = twice add1
g 3

map factorial

Block structure - lexical scoping

hyp x y = sqrt sum
          where sum = x*x + y*y



hyp2 x y = sqrt sum
           where sum = x2 + y2
                       where x2 = x*x
                             y2= y*y

Lazy Evaluation and Infinite Lists

my_if True x y = x
my_if False x y = y

my_if (x>y) 3 4

my_if (x>y) 3 (1/0)

ones = 1 : ones

Compare with circular list in Scheme:

  (define ones '(1))
  (set-cdr! ones ones)

ints n = n : ints (n+1)
[1..]

lets = "abc" ++ lets

Two prime number programs:

factors n = [k | k <- [1..n]; n mod k = 0]

dullprimes = filter isprime [2..]
             where
             isprime p = (factors p = [1,p])


interestingprimes = sieve [2..]
                    where
                    sieve (p:x) = p : sieve [n | n <- x; n mod p > 0]

Hamming numbers:

These are numbers whose only factors are powers of 2, 3, and 5. (See the Wikipedia article on Hamming numbers.)

my_merge (x:a) (y:b) = x : my_merge a (y:b) , if x<y
                     = y : my_merge a b , if x=y
                     = y : my_merge (x:a) b , otherwise

ham = 1: my_merge ham2 (my_merge ham3 ham5)

ham2 = map (*2) ham
ham3 = map (*3) ham
ham5 = map (*5) ham

Higher-order functions

twice f x = f (f x)
double = (*) 2

twice double 4


map f [] = []
map f (a:x) = f a: map f x

map ( (+) 1) [1..5]

argument and result types of the function supplied to 'map' can be different:

map code "0123"
[48,49,50,51]

interesting uses:

member x a = or (map (=a) x)

length x = sum (map (const 1) x)

sum, product, and, or, concat, etc ...

could define recursively, e.g.

sum [] = 0
sum (a:x) = a+sum x


foldr op r = f
             where
             f [] = r
             f (a:x) = op a(f x) 

sum = foldr (+) 0
product = foldr (*) 1
or = foldr (\/) False
and = foldr (&) True

This does not hurt efficiency. Miranda's implementation uses combinator graphs -- the first time you run the function it rewrites that portion of the graph.

This is also the case for things like (1+3) and code '0' -- compare with the concept of 'constant folding' that you'll encounter in compilers.

Other examples:

my_filter f [] = []
my_filter f (a:x) = a : my_filter f x, f a
                  = my_filter f x, otherwise

reverse_args f x y = f y x

const k x = k

fix f = e
        where e = f e

double x = 2*x
const x y = x

Type system

Miranda uses Milner-style polymorphism (which was originally developed for type checking ML). A polymorphic function is a function with a type variable in its type. In Milner-style polymorphism there are no restrictions on types that a type variable can take on (universal polymorphism).

3::
num

[1,2,3] ::
[num]

[[1,2,3]] ::
[[num]]

[]::
[*]

neg ::
num -> num

+ ::
num -> num -> num
(note that -> is right associative)

member::
[*] -> * -> bool

= ::
* -> * -> bool

run time error if = applied to functions

The type system is not always as descriptive as one would like.

For append and member, universal polymorphism is just right. It isn't what we want for functions -- but Miranda fudges this with a runtime type check. Miranda avoids the complexities of overloading by having a single type num rather than integer and float types. (Haskell's type system includes type classes, which address both of these issues at the cost of added complexity.)

Universal polymorphism doesn't work well at all for object-oriented languages: there we want a type system that specifies that a variable x can hold any object that understands the + and * messages. (F-bounded polymorphism is a way to provide this.)

How does type checking work?

Variables are like the logic variables we'll encounter in CLP(R). The type checker uses unification (i.e. solving type equations -- you can think of it as a kind of two-way pattern matching).

Examples:

double x = x+x
double::num->num

map f [] = []
map f (a:x) = f a: map f x

map::(*->**)->[*]->[**]

map code "0123"
[48,49,50,51]


twice:: 
(*->*)->*->*


map f [] = []
map f (a:x) = f a: map f x


foldr::(*->**->**)->**->[*]->**
foldr op r = f
             where
             f [] = r
             f (a:x) = op a(f x) 


sum = foldr (+) 0

Here * and ** are type variables. (Aside: the * convention for naming type variables is not ideal when the types become complex - it would be better to have names formed using ordinary letters, perhaps with a special marker to distinguish type variables from ordinary variables. This is done in Haskell and ML.)

User-defined types

Type names must begin with a lower-case letter; constructors with an upper-case letter.

Example:

numtree ::= EmptyNumTree | NumNode num numtree numtree

t1 = NumNode 7 EmptyNumTree (NumNode 4 EmptyNumTree EmptyNumTree)


|| polymorphic types:

tree * ::= EmptyTree | Node * (tree *) (tree *) 

|| define some sample trees
t2 = Node 7 EmptyTree (Node 4 EmptyTree EmptyTree)
t3 = Node 7 (Node 3 EmptyTree EmptyTree) (Node 4 EmptyTree EmptyTree)

|| flattening a tree
flatten :: tree * -> [*]
flatten (Node n left right) = flatten left ++ [n] ++ flatten right
flatten EmptyTree = []

|| another way to flatten a tree (preorder traversal)
pre (Node n left right) = [n] ++ pre left ++ pre right
pre EmptyTree = []

|| test for membership in a tree
tmember :: * -> tree * -> bool
tmember x EmptyTree = False
tmember x (Node y left right) = True, x=y
                              = tmember x left \/ tmember x right, otherwise

|| sum the elements in a tree of numbers
tree_sum EmptyTree = 0
tree_sum (Node n left right) = n + tree_sum left + tree_sum right

other_sum t = sum (flatten t)

|| note the types of tree_sum and other_sum:
|| tree num -> num
|| (thus these only work on one kind of tree - this is inferred automatically)

The user-defined type mechanism encompasses enumeration types, union types, variant records:

weekday ::= Mon | Tues | Wed | Thurs | Fri
intbool ::= MyInt num | MyBool bool

|| negate a number or apply not to a boolean
flip (MyInt n) = MyInt (-n)
flip (MyBool b) = MyBool (~b)
intbool_list = [MyInt 5, MyInt (-10), MyBool True, MyBool False]
flipped_list = map flip intbool_list

|| aliases
string == [char]

seacreatures :: [string]
|| this type declaration is equivalent to
||    seacreatures :: [[char]]
seacreatures = ["clam", "squid"]

ZF notation (list comprehensions)

a list of cubes of numbers betwen 1 and 10

	[ n*n*n | n <- [1..100] ]

all cubes:

	[ n*n*n | n <- [1..] ]

	perms [] = [[]]
	perms x  = [ a:y | a <- x; y <- perms (x--[a]) ]

|| quicksort

qsort [] = []
qsort (a:x) = qsort [ b | b <- x; b<=a ] ++ [a] ++ qsort [ b | b <- x; b>a ]

Square root
`limit' applied to a list of values, returns the first value which is the same as its successor. Useful in testing for convergence. For example the following Miranda expression computes the square root of r by the Newton-Raphson method

sqt r = limit [x | x<-r, 0.5*(x + r/x).. ]

limit::[*]->*
limit (a:b:x) = a, if a=b
              = limit (b:x), otherwise
limit other = error "incorrect use of limit"


cp x y = [ (a,b) | a <- x; b <- y]

Miranda cp [1..5] [1..4]
[(1,1),(1,2),(1,3),(1,4),(2,1),(2,2),(2,3),(2,4),
(3,1),(3,2),(3,3),(3,4),(4,1),(4,2),(4,3),(4,4),(5,1),(5,2),(5,3),(5,4)]


cpdiag x y = [ (a,b) // a <- x; b <- y]

rats = cpdiag [1..] [1..]

[(1,1),(1,2),(2,1),(1,3),(2,2),(3,1),(1,4),(2,3),(3,2),(4,1),
(1,5),(2,4),(3,3),(4,2),(5,1),(1,6),(2,5),(3,4),(4,3),(5,2),
(6,1),(1,7),(2,6),(3,5),(4,4),(5,3),(6,2),(7,1),(1,8),(2,7),
(3,6),(4,5),(5,4),(6,3),(7,2),(8,1),(1,9),(2,8),(3,7),(4,6),(5,5),
(6,4),(7,3),(8,2),(9,1),(1,10),(2,9),(3,8),(4,7),(5,6),(6,5), ...

force -- forces all parts of its argument to be evaluated

other functional languages that don't use combinators -- more conventional approach to lazy evaluation would be to pass a thunk

there, strictness analysis can win big in efficiency

Abstract data types in Miranda

abstype queue *
    with emptyq :: queue *
         isempty :: queue * -> bool
	 head :: queue * -> *
	 add :: * -> queue * -> queue *
	 remove :: queue * -> queue *
	 
queue * == [*]
emptyq = []
isempty [] = True
isempty (a:x) = False           
head (a:x) = a
add a q = q ++ [a]
remove (a:x) = x

Outside of the definition, the only way you can get at the queue is via its protocol. Example: building a queue

q1 = add 5 (add 3 emptyq)
q2 = remove q1

this won't work: q1 = [3,5]

CSE 341 - Programming Languages - Autumn 2006 Miranda