In Scheme, anonymous functions can be used to implement objects. Conversely, in Smalltalk, we found that a BlockContext object was used to implement anonymous functions.
Likewise, Java has no anonymous functions, but it does have objects. It turns out that you can use anonymous inner classes to do some of the same things that anonymous functions are traditionally used for --- which should not be surprising, since functions and objects both bundle state (local variables/parameters) and behavior (code).
The syntax for Java anonymous inner classes is as follows:
new ClassOrInterfaceName() { classBody }
This is an expression that constructs an instance of an anonymous class that subclasses/subtypes ClassOrInterfaceName.
For example, Java defines the built-in Iterator interface, which abstracts sequential iteration over a series of values:
public interface Iterator { public boolean hasNext(); public Object next(); public void remove(); }
To define an anonymous inner class that meets the Iterator type, and provides iteration over the integers from 0 to 9, you could write the following:
new Iterator() {
    private int i = 0;
    public boolean hasNext() { return i < 10; }
    public Object next() {
        Integer retval = new Integer(i);
        ++i;
        return retval;   // return the value from before the increment
    }
    public void remove() { throw new UnsupportedOperationException(); }
}
Since anonymous inner classes are expressions, you can do whatever you do with any other expressions, e.g. assign one to local variables:
Iterator anIterator = new Iterator() {
    private int i;
    public boolean hasNext() { return i < 10; }
    ... /* as above */
};
while (anIterator.hasNext()) {
    System.out.println(anIterator.next());
}
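For readers who want to run the iterator example, here is a minimal self-contained sketch. It assumes the generic java.util.Iterator of current Java rather than the untyped interface shown above, and the names CountingIteratorDemo, zeroToNine, and drain are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class CountingIteratorDemo {
    /** Returns an iterator over 0..9 implemented by an anonymous inner class. */
    static Iterator<Integer> zeroToNine() {
        return new Iterator<Integer>() {
            private int i = 0;   // state bundled with the behavior below

            @Override
            public boolean hasNext() {
                return i < 10;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return i++;      // yield the current value, then advance
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    /** Collects everything the iterator produces, so the result is easy to inspect. */
    static List<Integer> drain(Iterator<Integer> it) {
        List<Integer> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(zeroToNine()));   // prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```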
Like anonymous functions in "good" languages, Java anonymous classes are lexically scoped --- their bodies have access to names in the enclosing scope, including final local variables and parameters. The restriction on this last set of names --- the fact that you can only access final locals --- is due to quirks in the Java language design which we won't discuss further here.
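A small sketch of this scoping rule, under the assumption that a toy example is enough: the anonymous class below reads a field of the enclosing object, a final parameter, and a final local variable. ScopeDemo, Labeler, and makeLabeler are hypothetical names invented for this illustration.

```java
class ScopeDemo {
    /** Hypothetical one-method interface, used only for this illustration. */
    interface Labeler {
        String label(int n);
    }

    // A field of the enclosing instance; visible inside the anonymous class.
    private final String prefix = "item-";

    Labeler makeLabeler(final String suffix) {   // final parameter
        final int offset = 100;                  // final local variable
        return new Labeler() {
            public String label(int n) {
                // All three enclosing names are in scope here.
                return prefix + (n + offset) + suffix;
            }
        };
    }

    public static void main(String[] args) {
        Labeler labeler = new ScopeDemo().makeLabeler("!");
        System.out.println(labeler.label(1));    // prints item-101!
    }
}
```

The returned Labeler keeps working after makeLabeler has returned, which is the closure-like behavior the text describes.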
The Iterator example here is contrived, but there are more useful examples...
Callbacks are a fundamental programming idiom. A callback interface is one in which the client of a library passes a pointer to some code, which the library will "call back" when some event occurs.
For example, with the atexit callback, clients can register code to be executed right before the current process terminates (whenever that is).
Callbacks are inherently higher-order: the interface to register callbacks has to accept a function, i.e., the code to be called back. Anonymous functions and callbacks go especially well together; in Java, you commonly use anonymous inner classes instead.
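To make the registration pattern concrete, here is a toy sketch of a callback interface in Java. It is not the real atexit (for instance, it runs callbacks in registration order, and only when fireExit is called explicitly); ExitHooksDemo, ExitCallback, and ExitRegistry are all hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class ExitHooksDemo {
    /** Hypothetical callback interface: the code to be called back. */
    interface ExitCallback {
        void onExit();
    }

    /** Hypothetical library side: remembers callbacks and invokes them later. */
    static class ExitRegistry {
        private final List<ExitCallback> callbacks = new ArrayList<>();

        void register(ExitCallback cb) {
            callbacks.add(cb);
        }

        void fireExit() {
            for (ExitCallback cb : callbacks) {
                cb.onExit();   // the library "calls back" into client code
            }
        }
    }

    /** Client side: registers two anonymous inner classes as callbacks. */
    static List<String> run() {
        final List<String> log = new ArrayList<>();
        ExitRegistry registry = new ExitRegistry();
        registry.register(new ExitCallback() {
            public void onExit() { log.add("flushing logs"); }
        });
        registry.register(new ExitCallback() {
            public void onExit() { log.add("closing files"); }
        });
        log.add("working");
        registry.fireExit();
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints [working, flushing logs, closing files]
    }
}
```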
In the Java Swing GUI library, the java.awt.event.MouseListener interface describes "any object that can listen to mouse click events":
public interface MouseListener { void mouseClicked(MouseEvent e); void mouseEntered(MouseEvent e); void mouseExited(MouseEvent e); void mousePressed(MouseEvent e); void mouseReleased(MouseEvent e); }
The java.awt.event.MouseAdapter class defines a default implementation of this interface that does nothing:
public abstract class MouseAdapter implements MouseListener {
    public void mouseClicked(MouseEvent e) {}
    public void mouseEntered(MouseEvent e) {}
    public void mouseExited(MouseEvent e) {}
    public void mousePressed(MouseEvent e) {}
    public void mouseReleased(MouseEvent e) {}
}
Now, in order to register interest in a component's mouse events, you can use an anonymous inner class:
final String buttonLabel = "A button";
JComponent c = new JButton(buttonLabel);
c.addMouseListener(new MouseAdapter() {
    public void mouseEntered(MouseEvent e) {
        System.out.println("Mouse entry event: " + e + " on button: " + buttonLabel);
    }
});
This produces an instance of an anonymous class that inherits from MouseAdapter, overriding the mouseEntered(MouseEvent) method. This instance is then registered using the addMouseListener callback interface. The system will call this object back with a MouseEvent object whenever the mouse enters the button's screen area.
Notice that the body of the anonymous inner class uses the name buttonLabel from the surrounding lexical context.
The Java libraries could have been (but were not) designed to encourage functional programming, but it would be easy enough to add it. Java has a built-in interface for collections, java.util.Collection:
public interface Collection {
    void add(Object o);    // Adds an object to this collection
    void clear();          // Empties the collection
    Iterator iterator();   // Returns iterator for the collection
    int size();            // Returns number of objects in collection
    ...                    // other methods
}
This interface is OK, but doesn't encourage higher-order programming; there are no equivalents of do or filter in the style of functional languages (by contrast, Smalltalk collections have do: and collect:, which take blocks that operate on the elements). So, how could we design a collections library to support a "higher-order" style of programming?
First, define an interface for "function objects":
public interface Function { Object apply(Object argument); }
You can implement Function using named classes...
public class AddPeriod implements Function { public Object apply(Object argument) { return ((String)argument) + "."; } }
...or anonymous classes:
String hi = "hello";
String hiWorld = (String) (new Function() {
    public Object apply(Object o) { return ((String)o) + "."; }
}).apply(hi);
(Aside: notice that we have to cast the argument, because we only know that it's of type Object. For now, we won't worry about accurate static typing for arguments or results --- all functions take and return Object. We'll need bounded parametric polymorphism, in the style of Pizza, to fix this.)
Now, define the HigherOrderCollection interface:
public interface HigherOrderCollection {
    /** Add an object to this collection */
    void add(Object o);
    /** Apply f to each element of this collection. */
    void doEach(Function f);
    /** Apply f to each element in this collection, and add the result to target. */
    void map(HigherOrderCollection target, Function f);
    /** Add all elements satisfying pred into the target collection. */
    void filter(HigherOrderCollection target, Function pred);
    ... // other operations
}
Aside: There are some minor differences between this and the higher-order collection functions we've seen before --- most importantly, the use of a target parameter to map and filter. Our reasoning for this design decision is as follows:
In the languages we've seen before, map and filter return a fresh list because they only operate over lists. However, HigherOrderCollection is an interface for many kinds of collections --- e.g., lists, trees, and sets.
Smalltalk's collect: solves this problem by returning a collection that is a clone of the original collection. This isn't appropriate for map, which may return elements of different kinds, necessitating different properties for the target collection than the source collection. For example, mapping a binary search tree of integers to a binary search tree of strings will generally require a different comparison criterion.
The following defines the HOList class, which meets the above interface:
public class HOList implements HigherOrderCollection { /** Helper class for list nodes */ private class Link { Object value; Link next; Link(Object value, Link next) { this.value = value; this.next = next; } } private Link head; public HOList() { this.head = null; } public void add(Object o) { this.head = new Link(o, this.head); } public void doEach(final Function f) { for (Link current = head; current != null; current = current.next) { f.apply(current.value); } } public void map(final HigherOrderCollection target, final Function f) { this.doEach(new Function() { public Object apply(Object o) { target.add(f.apply(o)); return null; } }); } public void filter(final HigherOrderCollection target, final Function pred) { this.doEach(new Function() { public Object apply(Object o) { Boolean predVal = (Boolean)pred.apply(o); if (predVal.booleanValue()) { target.add(o); } return null; } }); } // ... other operations }
Here's an example of a client of HOList:
HOList greetings = new HOList(); greetings.add("hi"); greetings.add("bonjour"); greetings.add("hola"); HOList helloWorlds = new HOList(); greetings.map(helloWorlds, new Function() { public Object apply(Object o) { return ((String)o) + ", world!"; } }); helloWorlds.doEach(new Function() { public Object apply(Object o) { System.out.println(o); return null; } });
The above is cool in a way, but I bet it doesn't strike you as very clean. Why is it that something that seems so clean in ML (and other languages we've studied) becomes ugly in Java? I claim that the answer has two parts:
Anonymous inner classes are much more verbose than anonymous functions in ML, Scheme, or Smalltalk. As a result, we end up writing lots more curly braces and other text that we don't really want to write.
Now, this verbosity does buy us something: the flexibility to specify many different methods is useful for things like MouseListener implementations, which must specify different functions to call for different events. But for simple things like Function it's overkill.
Java has only subtype polymorphism; it doesn't have ML-style parametric polymorphism. As a result, the argument to apply has type Object, and we must nearly always use a cast in apply's body. What we really want is to have Function represent a family of interfaces, whose instances take and return different types --- in much the same way that the ML function type 'a -> 'b gets instantiated to different specific types depending on the function definition.
As it happens, Pizza provides solutions to these problems.
Pizza is an extension of Java developed by Martin Odersky and Philip Wadler in 1997. Pizza is backwards-compatible with Java: every legal Java program is also a legal Pizza program that has the same meaning, and Pizza compiles to standard Java bytecodes. Pizza augments Java with three ideas from the functional language community: parametric polymorphism, anonymous (first-class) functions, and algebraic datatypes.
We won't discuss Pizza's algebraic datatypes in these notes. They're cool, but the motivation for adding them to Java is arguably weaker than for the other two features.
Consider the type of the ML map function:
('a -> 'b) -> 'a list -> 'b list
This type uses parametric polymorphism: there are two type parameters (type variables), 'a and 'b, which are automatically instantiated to specific types when map is applied:
map (fn x => x ^ ", world!") ["hi", "bonjour", "hola"];
In the above expression, (fn x => x ^ ", world!") is of type string -> string, and map's type is automatically adapted to this type.
It would be useful to add the analogous power to Java. This isn't an artifact of our weird HigherOrderCollection type; it's useful in general, especially for the built-in standard Java collections. Consider the following code:
Collection c = new LinkedList();
c.add("hi");
c.add("bonjour");
c.add("hola");
Collection c2 = new LinkedList();
for (Iterator i = c.iterator(); i.hasNext(); ) {
    String s = (String)i.next(); // XXX
    c2.add(s + ", world!");
}
The cast on line XXX is not very satisfying. For one thing, it may fail at runtime. For another, it's a really poor way to document that c is a collection of String rather than merely a collection of Object. What we'd really like to do is say that Collection is (to borrow ML's type syntax) a 'a Collection; that 'a Collection's iterator() method returns a 'a Iterator; and that 'a Iterator's next() method returns a 'a, not just an Object.
But Java gives us no way of saying this.
In Pizza, you can declare that a type (interface or class type) has type parameters by writing the type parameters in angle brackets. Pizza's Collection interface looks like this:
public interface Collection<T> {
    void add(T value);
    Iterator<T> iterator();
    ... // other methods
}
The T here is a type variable: it plays a role similar to 'a in an ML datatype declaration.
Unlike ML, Java has no real type inference. To maintain harmony with Java, and for some other good reasons, Pizza doesn't infer instantiations of parameterized types. Therefore, when you declare a reference to an instantiation of Collection<T>, you have to provide the type parameter explicitly:
Collection<String> c = new LinkedList<String>();
c.add("hi");
c.add("bonjour");
c.add("hola");
Collection<String> c2 = new LinkedList<String>();
for (Iterator<String> i = c.iterator(); i.hasNext(); ) {
    String s = i.next(); // XXX
    c2.add(s + ", world!");
}
At first glance, this seems like a greater burden than before --- where before we only had to write the cast to String on line XXX, we must now fill in type parameters on many of our type declarations. However, this is still superior: the declared types document what the collections hold, and the compiler statically checks that c and c2 only contain strings; this checking extends to all method calls on the collection. For example, if we tried to add an Integer to one of the collections, we'd get a static error:
c.add(new Integer(3)); // compile-time error
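Standard Java has offered generics with very similar angle-bracket syntax since version 5, so the typed-collection example can be tried without a Pizza compiler. The sketch below assumes java.util's generic collections rather than Pizza's library; TypedCollectionDemo and greetWorld are names invented for this illustration.

```java
import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class TypedCollectionDemo {
    /** Builds a typed collection of greetings and appends ", world!" to each. */
    static List<String> greetWorld() {
        Collection<String> c = new LinkedList<String>();
        c.add("hi");
        c.add("bonjour");
        c.add("hola");

        List<String> c2 = new LinkedList<String>();
        for (Iterator<String> i = c.iterator(); i.hasNext(); ) {
            String s = i.next();              // no cast: next() returns String
            c2.add(s + ", world!");
        }

        // c.add(Integer.valueOf(3));         // rejected by the compiler if uncommented
        return c2;
    }

    public static void main(String[] args) {
        System.out.println(greetWorld());     // prints [hi, world!, bonjour, world!, hola, world!]
    }
}
```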
To implement a Collection<T>, we can define a linked list class that also has a type parameter:
public class LinkedList<T> implements Collection<T> {
    /** Helper class for list nodes */
    private class Link<T> {
        T value;
        Link<T> next;
        Link(T value, Link<T> next) { this.value = value; this.next = next; }
    }
    private Link<T> head;
    public LinkedList() { this.head = null; }
    // Prepend the new value, as HOList.add did earlier.
    public void add(T value) { this.head = new Link<T>(value, this.head); }
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            Link<T> current = head;
            public boolean hasNext() { return current != null; }
            public T next() {
                T retval = current.value;
                current = current.next;
                return retval;
            }
            public void remove() { throw new UnsupportedOperationException(); }
        };
    }
}
Notice that wherever we have a type name, we consistently insert a parameter to indicate the element type.
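Type parameters also give us the "family of interfaces" we wanted for Function earlier. The following is a sketch in standard Java generics (Java 5 and later), not Pizza itself; TypedFunctionDemo and Fn are hypothetical names, with Fn standing in for a parameterized version of the notes' Function interface.

```java
class TypedFunctionDemo {
    /** Hypothetical typed counterpart of Function: takes an A, returns a B. */
    interface Fn<A, B> {
        B apply(A argument);
    }

    /** Instantiates Fn at String -> String; the body needs no cast. */
    static String addPeriod(String s) {
        Fn<String, String> f = new Fn<String, String>() {
            public String apply(String argument) {
                return argument + ".";
            }
        };
        return f.apply(s);
    }

    /** Instantiates the same interface at String -> Integer. */
    static int length(String s) {
        Fn<String, Integer> len = new Fn<String, Integer>() {
            public Integer apply(String argument) {
                return argument.length();
            }
        };
        return len.apply(s);
    }

    public static void main(String[] args) {
        System.out.println(addPeriod("hello"));   // prints hello.
        System.out.println(length("hello"));      // prints 5
    }
}
```

Each use of Fn picks its own argument and result types, much as 'a -> 'b is instantiated differently for each ML function.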
ML has parametric polymorphism, which Java does not. But Java has subtype polymorphism, which ML does not. It so happens that when you have both subtyping and parametric polymorphism in your language, it becomes natural to extend parametric polymorphism with bounds on the type. Pizza supports this feature, which is called bounded parametric polymorphism.
For example, we might want to define a class of "printable lists", whose elements must define a print method.
First, we define a Printable interface:
public interface Printable { String print(); }
Now, we can define an interface for printable collections:
public interface PrintableCollection<T implements Printable> extends Collection<T> { String printAll(); }
Notice the implements clause on the type parameter. This says that the type variable can only be instantiated with types that implement the Printable interface.
Therefore, for example, suppose we have classes Foo and Bar:
class Foo implements Printable {
    public String print() { return "foo"; }
}
class Bar {} // does not implement Printable
PrintableCollection<Foo> foos = ...; // OK
PrintableCollection<Bar> bars = ...; // Static error: Bar not Printable
In the implementation of a PrintableCollection, the class body is allowed to assume that print is defined on the element type:
public class PrintableLinkedList<T implements Printable> extends LinkedList<T> implements PrintableCollection<T> {
    public String printAll() {
        String retval = "";
        for (Iterator<T> i = this.iterator(); i.hasNext(); ) {
            retval = retval + i.next().print(); // XXX
        }
        return retval;
    }
}
Notice that on line XXX we call print() on an element of type T, which is allowed only because T is bounded by Printable.
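For comparison, here is a runnable sketch of the same bounded-type idea in standard Java generics, where a bound is always written with extends, even when the bound is an interface. This is an analogue, not Pizza code: BoundedDemo and PrintableList are hypothetical names, and the list is backed by java.util.ArrayList for brevity.

```java
import java.util.ArrayList;
import java.util.List;

class BoundedDemo {
    interface Printable {
        String print();
    }

    static class Foo implements Printable {
        public String print() { return "foo"; }
    }

    /** A container whose element type must implement Printable. */
    static class PrintableList<T extends Printable> {
        private final List<T> items = new ArrayList<>();

        void add(T item) {
            items.add(item);
        }

        String printAll() {
            StringBuilder sb = new StringBuilder();
            for (T item : items) {
                sb.append(item.print());   // legal only because of the bound on T
            }
            return sb.toString();
        }
    }

    static String demo() {
        PrintableList<Foo> foos = new PrintableList<>();
        foos.add(new Foo());
        foos.add(new Foo());
        // PrintableList<String> bad;      // compile-time error: String is not Printable
        return foos.printAll();
    }

    public static void main(String[] args) {
        System.out.println(demo());        // prints foofoo
    }
}
```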
Bounded polymorphism can get pretty fancy --- it turns out that you really want recursive bounds to express certain typing patterns. You can read the Pizza paper for details.
The important lessons to take away from Pizza's bounded polymorphism:
Note on terminology: sometimes you will hear bounded parametric polymorphism called generics or generic types. Variations on generics have appeared in Ada, Modula-3, and C++ (templates), among other languages.
Java's anonymous inner classes are too verbose for simple uses of anonymous functions. Pizza adds a simpler syntax for anonymous functions:
fun (argType1 argName1, ..., argTypeN argNameN) -> returnType stmt
Here's the identity function:
fun (Object x) -> Object { return x; }
Functions can be applied using normal Java function call syntax, so here's the identity function applied to a string:
(fun (Object x) -> Object { return x; })("hello")
In this world, it is simple for our LinkedList class to support a map function:
class LinkedList<T> implements Collection<T> { ... <T2> LinkedList<T2> map((T)->T2 f) { LinkedList<T2> retval = new LinkedList<T2>(); for (Iterator<T> i = this.iterator(); i.hasNext(); ) { retval.add(f(i.next())); } return retval; } }
The user of this function must write much less than with anonymous inner classes:
LinkedList<String> greetings = new LinkedList<String>();
greetings.add("hello");
greetings.add("bonjour");
greetings.add("hola");
LinkedList<String> helloWorlds = greetings.map(fun (String s) -> String { return s + ", world!"; });
Notice that parametric polymorphism and anonymous functions have synergistic effects. Each makes the other more powerful and elegant.
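The same synergy can be tried in standard Java from version 8 onward, which has lambda expressions and the java.util.function.Function interface. The sketch below is an analogue rather than Pizza syntax; LambdaMapDemo and its map helper are hypothetical names, and map is written as a static method over java.util.List instead of a method on our own LinkedList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class LambdaMapDemo {
    /** Applies f to every element of source and returns the results in a fresh list. */
    static <T, R> List<R> map(List<T> source, Function<? super T, ? extends R> f) {
        List<R> result = new ArrayList<>();
        for (T item : source) {
            result.add(f.apply(item));
        }
        return result;
    }

    static List<String> demo() {
        List<String> greetings = Arrays.asList("hello", "bonjour", "hola");
        // The lambda plays the role of Pizza's fun (String s) -> String { ... }.
        return map(greetings, s -> s + ", world!");
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints [hello, world!, bonjour, world!, hola, world!]
    }
}
```

The type parameters T and R let one map definition serve every element type, while the lambda keeps the call site short.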
MultiJava is an extension of Java originally developed by Craig Chambers and Todd Millstein (of UW) and Curtis Clifton and Gary Leavens (of Iowa State University). MultiJava is in semi-active development.
Like Pizza, MultiJava is backwards compatible --- every legal Java program is also a legal MultiJava program that has the same meaning, and MultiJava programs compile to standard Java bytecodes. MultiJava augments Java with two key ideas from the (object-oriented) research language Cecil: open classes and multiple dispatch.
The MultiJava compiler is freely available at multijava.org. I have used it daily for several months as the implementation language for some of my own projects; it is a relatively stable, high-quality tool, and I strongly recommend you try it out.
Recall the Shape and Rectangle classes from our lecture on OO static typing:
class Shape extends Object {
    boolean overlaps(Shape other) { ... } // AAA
}
class Rectangle extends Shape {
    boolean overlaps(Rectangle other) { ... } // BBB
}
Rectangle r = new Rectangle(...);
Shape s = new Rectangle(...);
boolean b = r.overlaps(s); // XXX
The methods at lines AAA and BBB are statically overloaded, not overridden. At line XXX, the method is chosen by static overload resolution, not dynamic dispatch; since the static type of the argument is Shape, the method at line AAA is invoked, even though s refers to a Rectangle at runtime.
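The overload resolution behavior is easy to observe directly. In this sketch the overlaps methods return a String naming the method that ran (a change from the boolean result in the notes, made only so the outcome is visible); OverloadDemo and its helper methods are hypothetical names.

```java
class OverloadDemo {
    static class Shape {
        String overlaps(Shape other) {            // corresponds to line AAA
            return "Shape.overlaps(Shape)";
        }
    }

    static class Rectangle extends Shape {
        String overlaps(Rectangle other) {        // corresponds to line BBB; an overload
            return "Rectangle.overlaps(Rectangle)";
        }
    }

    /** The argument's static type is Shape, so the compiler picks AAA. */
    static String argumentTypedAsShape() {
        Rectangle r = new Rectangle();
        Shape s = new Rectangle();
        return r.overlaps(s);
    }

    /** The argument's static type is Rectangle, so the compiler picks BBB. */
    static String argumentTypedAsRectangle() {
        Rectangle r = new Rectangle();
        Rectangle r2 = new Rectangle();
        return r.overlaps(r2);
    }

    public static void main(String[] args) {
        System.out.println(argumentTypedAsShape());       // prints Shape.overlaps(Shape)
        System.out.println(argumentTypedAsRectangle());   // prints Rectangle.overlaps(Rectangle)
    }
}
```

Both calls pass a Rectangle object at runtime; only the declared type of the argument differs.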
Clearly, we don't want static overloading. This is known as the binary method problem. In Java, you can only implement the "right" behavior for binary methods as follows:
class Rectangle extends Shape { boolean overlaps(Shape other) { if (other instanceof Rectangle) { Rectangle otherRect = (Rectangle)other; ... // code to compare with otherRect } else { return super.overlaps(other); } } }
This is not satisfying. We must manually test for Rectangle and use a cast; it's easy to make an error, and it's tedious to write and maintain this code, which amounts to a manual implementation of dynamic dispatch.
Languages with multiple dispatch enable the programmer to specify directly that a method should dynamically dispatch based on multiple arguments --- i.e., based on the runtime type of arguments in addition to the receiver.
In MultiJava, you can make overlaps dispatch dynamically when the argument is a Rectangle as follows:
class Rectangle extends Shape { boolean overlaps(Shape@Rectangle other) { ... } }
Notice that the declared argument type is now Shape@Rectangle. This means that this method overrides overlaps(Shape) inherited from Shape, and that it is selected only when the runtime class of the argument is Rectangle (or a subclass).
As a result, we simply get the "right" thing. The MultiJava compiler will automatically generate dispatch code.
Multiple dispatch has uses besides binary methods: event handling, extensible data structure traversals, and many more.
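In plain Java, a common alternative to instanceof tests is the double-dispatch idiom: the receiver calls back into the argument, so that a second ordinary dynamic dispatch selects on the argument's runtime class. The sketch below illustrates that idiom; it is not what the MultiJava compiler generates, and DoubleDispatchDemo, Circle, and overlapsRectangle are hypothetical names. The methods return descriptive strings so the chosen case is visible.

```java
class DoubleDispatchDemo {
    static abstract class Shape {
        /** First dispatch: selected by the receiver's runtime class. */
        abstract String overlaps(Shape other);

        /** Second-dispatch hook: "a Rectangle asks whether it overlaps me". */
        String overlapsRectangle(Rectangle r) {
            return "generic shape test";
        }
    }

    static class Circle extends Shape {
        String overlaps(Shape other) {
            return "generic shape test";
        }
    }

    static class Rectangle extends Shape {
        String overlaps(Shape other) {
            // Hand control to the argument; its runtime class picks the case.
            return other.overlapsRectangle(this);
        }

        @Override
        String overlapsRectangle(Rectangle r) {
            return "rectangle-rectangle test";
        }
    }

    public static void main(String[] args) {
        Shape r = new Rectangle();
        Shape s = new Rectangle();   // static type Shape, runtime class Rectangle
        Shape c = new Circle();
        System.out.println(r.overlaps(s));   // prints rectangle-rectangle test
        System.out.println(r.overlaps(c));   // prints generic shape test
    }
}
```

The cost is that Shape must declare one hook per argument class of interest, which is the boilerplate that language-level multiple dispatch removes.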
EML (Extensible ML) is a language developed by Todd Millstein and Craig Chambers (two of the designers of MultiJava) in order to explore certain connections between functional and object-oriented languages.
In the functional universe, we have pattern matching and datatypes:
In the object-oriented universe, we have classes and dynamic dispatch:
You should have recognized this similarity when you implemented the Smalltalk binary tree after implementing the ML binary tree. Smalltalk binary tree nodes might be represented with the classes:
Object subclass: #TreeNode ... TreeNode subclass: #EmptyNode ... TreeNode subclass: #ValueNode instanceVariableNames: 'v left right' ...
ML binary tree nodes might be represented with the datatype:
datatype 'a TreeNode = EmptyNode | ValueNode of {v:'a, left:'a TreeNode, right:'a TreeNode}
(The above examples have been modified slightly from the homeworks, in order to make the parallel clearer.)
EML develops this observation further by unifying pattern matching with multiple dispatch, and unifying ML-style data types with object-oriented classes.
Consider the following ML-style datatype declaration of points:
datatype Point = CartPoint of {x:real, y:real}
               | PolarPoint of {rho:real, theta:real}
fun getX (CartPoint {x, y}) = x
  | getX (PolarPoint {rho, theta}) = rho * Math.cos(theta)
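In ML, it would not be possible to add a third case, CartPoint3D, with a z field, without altering this original source code. On the other hand, it is easy to add a new function, such as getY, without touching the existing code. In an object-oriented language the situation is reversed: adding a new subclass is easy, but adding a new operation means editing every existing class.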
The duality between these two forms of extensibility is sometimes called the horizontal-vertical extensibility problem, because if you arrange data types and functions in a table, then object-oriented programming gives you "horizontal" extension and functional programming gives you "vertical" extension:
| | CartPoint | PolarPoint | ... |
|---|---|---|---|
| getX | getX(CartPoint) | getX(PolarPoint) | ... |
| getY | getY(CartPoint) | getY(PolarPoint) | ... |
| ... | ... | ... | ... |
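The object-oriented half of this table can be sketched in Java. In the hypothetical code below (ExtensibilityDemo and its nested classes are invented for illustration), adding the new column CartPoint3D requires no change to existing code, whereas adding a new row such as getY would mean editing the Point interface and every class that implements it.

```java
class ExtensibilityDemo {
    /** One row of the table: the getX operation. */
    interface Point {
        double getX();
    }

    static class CartPoint implements Point {
        final double x, y;

        CartPoint(double x, double y) {
            this.x = x;
            this.y = y;
        }

        public double getX() { return x; }
    }

    static class PolarPoint implements Point {
        final double rho, theta;

        PolarPoint(double rho, double theta) {
            this.rho = rho;
            this.theta = theta;
        }

        public double getX() { return rho * Math.cos(theta); }
    }

    /** A new column, added without modifying any of the code above. */
    static class CartPoint3D extends CartPoint {
        final double z;

        CartPoint3D(double x, double y, double z) {
            super(x, y);
            this.z = z;
        }
    }

    public static void main(String[] args) {
        Point p = new CartPoint3D(1.0, 2.0, 3.0);
        System.out.println(p.getX());                        // prints 1.0
        System.out.println(new PolarPoint(2.0, 0.0).getX()); // prints 2.0
    }
}
```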
In EML, both data types and functions are extensible (with some restrictions to ensure that typechecking can be performed separately on each module), thereby solving the horizontal-vertical extensibility problem.
In EML, the above datatype declaration for Point is actually syntactic sugar for the following class declarations:
abstract class Point() of {} class CartPoint(x:real, y:real) extends Point() of {x:real = x, y:real = y} class PolarPoint(rho:real, theta:real) extends Point() of {rho:real = rho, theta:real = theta}
Class names serve as constructors, just as with ML datatype constructor names:
val aCartPoint = CartPoint {1.0, 2.0};
You can extend this data type straightforwardly, in the OO style:
class CartPoint3D(x:real, y:real, z:real) extends CartPoint(x, y) of {z:real = z}
Notice that classes only declare data members. Functions are still specified separately from classes, and use pattern-matching syntax:
fun getX (CartPoint {x, y}) = x
  | getX (PolarPoint {rho, theta}) = rho * Math.cos(theta)
fun plus (CartPoint {x=x1, y=y1}, CartPoint {x=x2, y=y2}) = CartPoint {x=x1+x2, y=y1+y2}
  | plus (PolarPoint {rho=r1, theta=t1}, PolarPoint {rho=r2, theta=t2}) = ...
With the ability to extend datatypes, one must have the ability to extend functions for datatypes; and, indeed, one can, using the extend fun construct:
extend fun getX (CartPoint3D{x, y, z}) = x; extend fun plus(CartPoint3D {x=x1, y=y1, z=z1}, CartPoint3D {x=x2, y=y2, z=z2}) = CartPoint3D {x=x1+x2, y=y1+y2, z=z1+z2}; extend fun plus(CartPoint{x=x1, y=y1}, CartPoint3D{x=x2, y=y2, z=z2}) = CartPoint3D {x=x1+x2, y=y1+y2, z=z2}; ... (* extend funs for other cases *)
It turns out that, if you dig down under the syntax, EML and MultiJava are actually based on the same underlying ideas: