CSE 341: Mutable data, supplementary notes

Why side effects?

Previously, we have studied a subset of ML that is purely functional --- i.e., one with no assignments, updatable data structures, or other side effects (a "side effect" is anything a computation does besides evaluating to a value). This functional subset is computationally complete --- any computation can be expressed in it. So why would we even want to add mutable (updatable) data?

At first, one might respond: Well, to model certain processes accurately, we need side effects. The world, for example, changes over time. However, this is something of a fallacy --- we can always model changes over time using a function that takes the old state, and produces the new state; i.e., that has type:

World -> World

where World is some data type that represents the entire state of the world.
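
As a minimal sketch of this style in ML (the world type and its fields are made up for illustration):

(* A hypothetical "world": just a record of whatever state we choose to track. *)
type world = {time : int, population : int}

(* One step of change: consume the old world, produce a new one.
   Nothing is updated in place; the old world value still exists. *)
fun tick (w : world) : world =
    {time = #time w + 1, population = #population w * 2}

Each call to tick produces a brand-new world value; nothing is updated in place.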

So why would we want to introduce side effects into our language? Here's a short list, with elaboration below:

- Efficiency
- Expressiveness
- Permissiveness
- Interaction with the outside world
- Abstraction and ease of evolution

The upshot of the explanations that follow is that side effects (including mutable data) are currently a necessary evil in a practical language.

Efficiency

Logically, most functional programs make many copies of data. You're always constructing "the new world", e.g. the new list value that's returned from reverse. Therefore:

- A naive implementation spends a great deal of time and space allocating, copying, and garbage-collecting all these values, where an imperative program would simply update in place.
- A sufficiently sophisticated compiler can sometimes prove that the old value is never used again and reuse its storage, recovering the efficiency of in-place update.
- Sophisticated type systems (e.g., linear type systems, which guarantee that a value has at most one reference) let the programmer state outright that in-place update is safe; but they make the language more complicated to use.

Note that the caveats on sophisticated compilers and type systems apply to most high-level language features, such as the uniform reference model (all objects referred to by pointer) and, to a lesser extent, garbage collection. I am a student/researcher in languages and compilers, and I am therefore a big fan of high-level languages, smart compilers, and sophisticated type systems. Ultimately, I believe gains in programmer productivity from using high-level languages outweigh performance costs, as well as the costs of implementing smart compilers and teaching programmers to use clever type systems. However, these caveats and costs must still be kept in mind when evaluating the importance or value of high-level languages.

Expressiveness

Certain data structures, including (but not limited to) cyclic data structures, are inherently hard to express in purely functional languages.
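
For example, once ML's mutable ref cells are available, a cyclic structure can be built by allocating a node and then "tying the knot" with an assignment (the datatype and function names here are made up for illustration):

(* Each node carries a mutable "next" field, initially empty. *)
datatype 'a node = Node of 'a * 'a node option ref

(* Build a one-node cycle: allocate the node, then point it at itself. *)
fun selfLoop (x : 'a) : 'a node =
    let
        val next = ref NONE
        val n = Node (x, next)
    in
        next := SOME n;   (* tie the knot *)
        n
    end

Without the assignable next field, the node would have to contain itself at the moment it is constructed, which a strict immutable datatype cannot express directly.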

Permissiveness

Permissiveness is arguably just a different kind of expressiveness --- if a language is too permissive, then it is because it does not allow the programmer to express some restriction that (s)he wishes to express.

In this case, we would like to express the following simple constraint: Don't copy the world. If we're representing the world as a data structure, then arguably we should not permit more than one copy of this world to be "alive" in the program at once. But in an ordinary functional language, it is easy to copy the world:

fun copy (w:World) = (w, w);

This makes a pair of worlds. In a language where the world is represented as mutable state (i.e., the current contents of memory, which can be updated), the "world" is the implicitly unique thing that can't be copied.

When the world can be duplicated, it can be surprisingly hard to make sure that duplication never happens, or that, if it does happen, you always keep using the same world.

There are fancy type systems --- e.g., the linear type systems we alluded to above, in the section on efficiency --- that prevent the user from keeping more than one pointer to a given value. These systems can prevent "copying the world", but the usual caveats apply.

Interaction with the outside world

The notion of state update as a function that constructs a new world is all well and good, but the fact is that the rest of the world outside your program is not defined this way. To interact with the outside world, programs somehow need to deal with side effects.

Input and output (I/O) inherently depend on a changing world. For example, consider a network card buffer: this is a specific chunk of memory in the computer where your hardware sticks incoming or outgoing network data. Imagine processing network buffer data: the hardware updates that chunk of memory asynchronously, whenever packets happen to arrive, entirely outside your program's control. At this lowest level the buffer simply is mutable state; there is no immutable "old buffer" value that persists for you to keep around.

One solution to this problem is to program those lower layers in C, or some other impure language. The runtime system of your programming language might then present you with a purely functional interface to this buffer. For example, perhaps the only way to process network buffers is to implement a function of type Buffer -> Buffer and register it with the system.
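
In ML, such an interface might look roughly like the following signature (a hypothetical sketch, not an actual library):

(* A hypothetical purely functional view of the network buffer: the runtime
   applies the registered function to each buffer and uses the result. *)
signature NETWORK =
sig
    type buffer
    val register : (buffer -> buffer) -> unit
end

The runtime would apply the registered function to each incoming buffer and use the result as the outgoing data; the program itself never touches mutable memory.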

But this is unsatisfactory --- you've simply admitted that your language is not expressive enough to capture this kind of I/O operation.

Input and output has always been a rather vexing problem for functional languages. Languages like ML simply punt and accept impurity (side effects). Other languages have dared to be pure, but until Haskell none of them had a really satisfactory solution to I/O. Haskell, a pure functional language, copes with I/O using a special sequencing construct called a monad.

We won't cover monads in this class, but Haskell fans claim that monads have many nice properties, and indeed large Haskell programs using I/O have been written. However, performing I/O (and simulating other side effects with monads) nevertheless suffers from a problem similar to the "threading the world" problem described below.

Abstraction and ease of evolution

When you have to model side effects using an explicit world argument and return value, then every function that may have side effects must take the world as an argument, and return the new world as a result.

This doesn't sound so bad, until you realize that this also means that any time a function has a side effect, every function that calls it must take and return the world. And every function that calls those functions, and so on, up the chain.
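
A small ML sketch makes the threading concrete; the world type and the functions are invented purely for illustration:

(* Pretend the "world" is just a log of messages. *)
type world = {log : string list}

fun emit (msg : string, w : world) : world = {log = msg :: #log w}

(* Every function that may have an effect takes and returns the world... *)
fun stepA (w : world) : world = emit ("did A", w)
fun stepB (w : world) : world = emit ("did B", w)

(* ...and so must every function that calls them, all the way up the chain. *)
fun run (w : world) : world = stepB (stepA w)

Note that run has no interest in worlds itself; it must mention them only because stepA and stepB do.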

This results in a vexing software engineering problem: suppose you change the implementation of a function buried deep in the call hierarchy so that it now has a side effect. You must then change that function's signature to take and return the world, and likewise the signature of every function on every call path leading to it, even though nothing else about those callers has changed.

This is the software evolution argument for side effects. Another argument is the abstraction argument: when side effects require "threading the world" in this manner, then all clients have to know about a function's side effects in order to use it. This prevents the function from hiding that aspect of its implementation from clients --- the "side-effect-ness" of a function cannot be hidden.

Some people claim that side effects should not be hidable, but I disagree. A function may present an abstraction that is purely functional, but its implementation may use side effects in some "harmless" and correct way.

For example, a factorial function may internally use an updatable data structure to store previous answers for later use, i.e. a cache (this optimization is called memoizing). This abstraction is not "clean" if the caller must pass the function a representation of its "world" (the cache of previously computed factorials), and remember to pass the resulting world again next time --- which is the only way that the programmer can make memoization happen in a purely functional language.
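
Here is a sketch of a memoized factorial in ML, using a mutable cache hidden behind an ordinary int -> int interface (the cache representation, a simple association list in a ref cell, is chosen purely for illustration):

local
    (* The hidden cache: pairs of (argument, previously computed result). *)
    val cache : (int * int) list ref = ref []
in
    fun fact (n : int) : int =
        case List.find (fn (k, _) => k = n) (!cache) of
            SOME (_, v) => v   (* cache hit: reuse the earlier answer *)
          | NONE =>
                let
                    val v = if n <= 1 then 1 else n * fact (n - 1)
                in
                    cache := (n, v) :: !cache;   (* remember it for next time *)
                    v
                end
end

From the outside, fact is just an ordinary int -> int function; the cache never appears in its interface, and as far as callers can tell the function is pure.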

(Purists would reply that compilers can perform memoization automatically in a functional language, and indeed can often do so more correctly and consistently than human programmers can. But there are clearly cases when a programmer's hand-coded memoization would be superior, because programmers can know more about the function in question than the compiler can.)

Suggested exercises