Previously, we have studied a subset of ML that is purely functional --- i.e. not having assignments, updatable data structures, or other side effects (a "side effect" is anything that is not just evaluation). This functional subset is computationally complete --- any computation can be expressed in it. So why would we even want to add mutable (updatable) data?
At first, one might respond: Well, to model certain processes accurately, we need side effects. The world, for example, changes over time. However, this is something of a fallacy --- we can always model changes over time using a function that takes the old state, and produces the new state; i.e., that has type:
World -> World
where World is some data type that represents the entire state of the world.
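As a tiny sketch of this style (the World type and the tick function below are invented for illustration, not part of these notes), a change to the world is just a function that consumes the old world and builds a new one:

type World = {time : int, inventory : string list};

(* tick : World -> World.  Nothing is updated in place; a whole new record is built. *)
fun tick ({time, inventory} : World) : World =
  {time = time + 1, inventory = inventory};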
So why would we want to introduce side effects into our language? Here's a short list, with elaboration below:
The upshot of the explanations that follow is that side effects (including mutable data) are currently a necessary evil in a practical language.
Logically, most functional programs make many copies of data. You're always constructing "the new world", e.g. the new list value that's returned from reverse.
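For concreteness, here is the standard accumulator-based way to write reverse; every recursive step allocates a fresh cons cell for the result, and none of the input list's cells are reused:

fun reverse xs =
  let
    (* Each x :: acc allocates a new cell; the cells of xs become garbage
       once no one else points to them. *)
    fun loop ([], acc) = acc
      | loop (x :: rest, acc) = loop (rest, x :: acc)
  in
    loop (xs, [])
  end;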
Therefore, a purely functional program typically does far more allocation and copying than a comparable imperative program, which simply updates data in place.

Sophisticated compilers and type systems can sometimes eliminate this copying. For example, a linear type system can guarantee that there is only one pointer to a given value, which lets the compiler reuse that value's storage. Consider the reverse function: if the incoming argument is known to be the only pointer to the list that is to be reversed, then we can reuse the cells of the input list for the output list --- because nobody else will ever use the old list again. (Purely functional data is immutable anyway --- compare a Java class with only final fields, or a C++ class with only const fields.) The caveats are that such compilers are hard to build, and such type systems place an extra burden on the programmer.

Note that the caveats on sophisticated compilers and type systems apply to most high-level language features, such as the uniform reference model (all objects referred to by pointer) and, to a lesser extent, garbage collection. I am a student/researcher in languages and compilers, and I am therefore a big fan of high-level languages, smart compilers, and sophisticated type systems. Ultimately, I believe the gains in programmer productivity from using high-level languages outweigh the performance costs, as well as the costs of implementing smart compilers and teaching programmers to use clever type systems. However, these caveats and costs must still be kept in mind when evaluating the importance or value of high-level languages.
Certain data structures, including (but not limited to) cyclic data structures, are inherently hard to express in purely functional languages.
Consider a doubly linked list:
datatype 'a DList = DEmpty | DNode of {elem:'a, prev:'a DList, next:'a DList};
It is obvious how to construct an empty linked list, or a linked list with one node:
val empty_dlist = DEmpty;
val single_dlist = DNode {elem=25, prev=DEmpty, next=DEmpty};
But how do we prepend a node onto this list? OK, the empty case is easy, but what about the node case?
fun prepend x DEmpty = DNode {elem=x, prev=DEmpty, next=DEmpty}
  | prepend x (DNode {elem, prev, next}) =
      DNode {elem=x, prev=DEmpty,
             next=(DNode {elem=elem, prev=(XXX?), next=(YYY?)})};
What will we fill in for the (XXX?)? The prev pointer of the second node must point to the first node, which we are currently constructing (i.e., the node whose elem is x). But we have no way of referring to this node until it's constructed.

And what will we fill in for (YYY?)? We must recursively reconstruct the next node --- after all, we must update its prev pointer to point to the second node --- but what will we pass to it to use as the prev value?
Pure functional languages have solutions to this problem, but they're complicated and arguably less natural than simply allowing the pointer to be updated after the node is constructed.
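As a rough sketch of that alternative (the RDList datatype and rprepend below are invented names, and this is only one possible design), ML's ref cells let us build the new node first and patch the old node's prev pointer afterwards:

datatype 'a RDList =
    REmpty
  | RNode of {elem : 'a, prev : 'a RDList ref, next : 'a RDList ref};

fun rprepend x REmpty =
      RNode {elem = x, prev = ref REmpty, next = ref REmpty}
  | rprepend x (old as RNode {prev, ...}) =
      let
        val new = RNode {elem = x, prev = ref REmpty, next = ref old}
      in
        prev := new;   (* backpatch: the old head's prev now points at new *)
        new
      end;

The assignment prev := new is precisely the step that the purely functional version had no way to express.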
A similar issue arises with arrays whose contents are produced by some computation: again, it is arguably more natural to allow the programmer to allocate the array first, and then update its elements as the computation runs.
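For instance, using the standard Basis operations Array.array and Array.update, the allocate-then-fill style might look like this (squaresTable is an invented example):

fun squaresTable n =
  let
    val arr = Array.array (n, 0)           (* allocate n slots, all 0 *)
    fun fill i =
      if i >= n then ()
      else (Array.update (arr, i, i * i);  (* imperative update *)
            fill (i + 1))
  in
    fill 0;
    arr
  end;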
Permissiveness is arguably just the flip side of expressiveness --- if a language is too permissive, then it is because it does not allow the programmer to express some restriction that (s)he wishes to express.
In this case, we would like to express the following simple constraint: Don't copy the world. If we're representing the world as a data structure, then arguably we should not permit more than one copy of this world to be "alive" in the program at once. But in an ordinary functional language, it is easy to copy the world:
fun copy (w:World) = (w, w);
This makes a pair of worlds. In a language where the world is represented as mutable state (i.e., the current contents of memory, which can be updated), the "world" is the implicitly unique thing that can't be copied.
When the world can be duplicated, it can be surprisingly hard to make sure this never happens --- or that, if it does happen, you're always using the same world.
There are fancy type systems --- e.g., the linear type systems we alluded to above, in the section on efficiency --- that prevent the user from keeping more than one pointer to a given value. These systems can prevent "copying the world", but the usual caveats apply.
The notion of state update as a function that constructs a new world is all fine and well, but the fact is that most of the rest of the world is not defined this way. To interact with the outside world, programs need to be aware of side effects, somehow.
Input and output (I/O) inherently depends on a changing world. For example, consider a network card buffer: this is a specific chunk of memory in the computer where your hardware sticks incoming or outgoing network data. Imagine processing network buffer data by calling some function getBuffer() that returns the buffer's current contents. If getBuffer() were a truly pure function, then every call to it would have to return the same value, and the compiler could legitimately evaluate it once and reuse that result forever --- even as new packets arrive. (Notice that this is a perfectly fine compiler optimization for a truly pure function.)

One solution to this problem is to program those lower layers in C, or some other impure language. The runtime system of your programming language might then present you with a purely functional interface to this buffer. For example, perhaps the only way to process network buffers is to implement a function of type Buffer -> Buffer and register it with the system.
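To make the shape of that interface concrete, here is an invented stand-in (Buffer, registerHandler, and deliver are not a real API --- they merely sketch the idea):

type Buffer = Word8Vector.vector

(* Stand-in for the runtime system: it remembers the registered handler
   and applies it to each incoming buffer. *)
val handlerRef : (Buffer -> Buffer) ref = ref (fn b => b)
fun registerHandler (h : Buffer -> Buffer) = handlerRef := h
fun deliver (incoming : Buffer) : Buffer = (!handlerRef) incoming

(* The user-level handler itself is a pure function from buffer to buffer. *)
val () = registerHandler (fn buf => buf);

Notice that even this stand-in for the runtime uses a ref: the impurity has not disappeared, it has just been pushed down a layer.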
But this is unsatisfactory --- you've simply admitted that your language is not expressive enough to capture this kind of I/O operation.
Input and output has always been a rather vexing problem for functional languages. Languages like ML simply punt and accept impurity (side effects). Other languages have dared to be pure, but until Haskell none of them had a really satisfactory solution to I/O. Haskell, a pure functional language, copes with I/O using a special sequencing construct called a monad.
We won't cover monads in this class, but Haskell fans claim that monads have many nice properties, and indeed large Haskell programs using I/O have been written. However, performing I/O (and simulating other side effects with monads) nevertheless suffers from a problem similar to the "threading the world" problem described below.
When you have to model side effects using an explicit world argument and return value, then every function that may have side effects must take the world as an argument, and return the new world as a result.
This doesn't sound so bad, until you realize that this also means that any time a function has a side effect, every function that calls it must take and return the world. And every function that calls those functions, and so on, up the chain.
This results in a vexing software engineering problem:

1. Suppose your program contains a deeply nested function f, which is a pure function.
2. Later, you decide that f should have a side effect. (This is more common than you think: for example, while debugging you might want to insert a statement that changes "the world" to display a debugging message on the standard output.)
3. Now you must thread the world down through every function on every call chain leading to f, and back up. This could involve modifying dozens or hundreds of functions.

This is the software evolution argument for side effects. Another argument is the abstraction argument: when side effects require "threading the world" in this manner, then all clients have to know about a function's side effects in order to use it. This prevents the function from hiding that aspect of its implementation from clients --- the "side-effect-ness" of a function cannot be hidden.
Some people claim that side effects should not be hidable, but I disagree. A function may present an abstraction that is purely functional, but its implementation may use side effects in some "harmless" and correct way.
For example, a factorial function may internally use an updatable data structure to store previous answers for later use, i.e. a cache (this optimization is called memoizing). This abstraction is not "clean" if the caller must pass the function a representation of its "world" (the cache of previously computed factorials), and remember to pass the resulting world again next time --- which is the only way that the programmer can make memoization happen in a purely functional language.
(Purists would reply that compilers can perform memoization automatically in a functional language, and indeed can often do so more correctly and consistently than human programmers can. But there are clearly cases when a programmer's hand-coded memoization would be superior, because programmers can know more about the function in question than the compiler can.)
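A minimal sketch of such a memoized factorial (the cache representation and the name memoFact are ours; a real implementation might use a growable array or hash table instead of an association list):

local
  val cache : (int * int) list ref = ref []     (* (argument, answer) pairs *)
in
  fun memoFact n =
    case List.find (fn (k, _) => k = n) (!cache) of
        SOME (_, v) => v                         (* answer already computed *)
      | NONE =>
          let
            val v = if n <= 1 then 1 else n * memoFact (n - 1)
          in
            cache := (n, v) :: !cache;           (* remember it for next time *)
            v
          end
end;

Callers see an ordinary function of type int -> int; the cache, and the fact that it is updated, stay hidden inside the implementation.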
Exercise: Define a doubly linked list datatype whose prev and next fields are updatable (use ML's ref type), and write a prepend function for such a data type. Write a copy function.

Exercise: Implement the Array.fromList function.