Today: Open implementation, or "clients know best"
One tenet of information hiding is that clients should not know how a module is implemented, and that implementations should not know how clients will use them
One implication is that all clients should be satisfied with the same (i.e., any) implementation
Clearly, this is not so
In particular, as we have discussed, performance does in fact matter
Even if we make performance part of the interface, there is a serious downside: interfaces proliferate (several interfaces might be identical except for their performance guarantees, and this has problems of its own)
Example (due to Kiczales, as is much of this)
A window system provides a mkwindow function that creates a window of a specified size at a specified location on the screen
It also provides mouse-tracking, manages the shared screen, etc., while hiding internal data structures, implementation details, etc.
Somebody who wants to build a spreadsheet might write the following code:
for i := 1 to 100
    for j := 1 to 100
        mkwindow(100,100,i*100,j*100)
    end
end
"Yet few experienced programmers would be surprised to learn that…its performance may be so bad as to render it useless. Why? Because the window-system implementation may not be tuned for this kind of use." (The implementor, for example, may have assumed O(10) windows, not the O(10,000) windows this loop creates.)
This specific, highly regular use of windows also admits a far more optimized implementation than may be feasible for a general-purpose window system
What are possible solutions to this shortcoming of classic information hiding (sometimes called black-box abstraction)?
Break up into small groups for 5 minutes to develop potential solutions
Multiple interfaces, each with its own performance guarantees
This expands the name space, at the very least
Multiple implementations of a single interface, allowing the client to select which implementation is desired
How does the client specify which implementation is to be selected? (One approach is pragmas.)
This also expands the name space, although perhaps indirectly
A smart implementation that knows how to provide effective performance for any possible client
Not generally feasible
Clients can work around being given a single implementation in two different ways
"Hematomas of duplication"
Essentially, use the basic abstraction that is provided and build another one on top of this
In this case, allocate one big window and build another window manager that breaks that big window up into the little spreadsheet-cell-sized windows
This happens a lot with memory managers
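A minimal sketch of this workaround, assuming a hypothetical WindowSystem class that stands in for the real window system (the names WindowSystem, CellGrid, cell_bounds and cell_at are illustrative, not from any actual API). The client creates one real window and then re-implements layout and hit-testing itself, which is exactly the duplicated functionality the term describes:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Stand-in for an expensive window owned by the window system."""
    width: int
    height: int
    x: int
    y: int


class WindowSystem:
    """Hypothetical window system; counts how many real windows it creates."""

    def __init__(self):
        self.windows_created = 0

    def mkwindow(self, width, height, x, y):
        self.windows_created += 1
        return Window(width, height, x, y)


class CellGrid:
    """Client-side mini window manager layered on one big window."""

    def __init__(self, ws, rows, cols, cell_w, cell_h):
        self.rows, self.cols = rows, cols
        self.cell_w, self.cell_h = cell_w, cell_h
        # One real window covers the whole spreadsheet.
        self.backing = ws.mkwindow(cols * cell_w, rows * cell_h, 0, 0)

    def cell_bounds(self, row, col):
        """Return (x, y, width, height) of a cell inside the backing window."""
        if not (0 <= row < self.rows and 0 <= col < self.cols):
            raise IndexError("cell out of range")
        return (col * self.cell_w, row * self.cell_h, self.cell_w, self.cell_h)

    def cell_at(self, x, y):
        """Map a point to (row, col): hit-testing the window system already does."""
        row, col = y // self.cell_h, x // self.cell_w
        if not (0 <= row < self.rows and 0 <= col < self.cols):
            return None
        return (row, col)
```

The client gets its performance (one real window instead of 10,000), but only by rebuilding a second, smaller window manager on top of the first.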
"Coding between the lines"
The client writes code in a contorted way to get the desired performance
The window example has no good instance of this, but a common case is a programmer who allocates objects in whatever pattern happens to perform well through the interface, even if this is not the "natural" way to allocate objects
Kiczales and others have proposed an alternative approach, called open implementation (OI)
Modules provide two interfaces: the base interface, which provides the standard functionality of the module, and the meta interface, which allows clients to control the module's implementation (in varying degrees)
OI is a limited form of reflective computing; in full reflective computing, the underlying operation of a system is essentially fully programmable
The semantics of interfaces can be changed in a programmatic way
An aggressive example (and there are many) of reflective computing would be changing the meta-object protocol in Smalltalk-80
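As a rough illustration of the base/meta split, here is a sketch of a toy window system in the OI style. Everything in it is an assumption made for illustration: the class OpenWindowSystem, the meta-interface call assume_regular_grid, and the simplification that windows never overlap. The base interface (mkwindow, window_at) behaves the same either way; the meta interface only changes how hit-testing is implemented.

```python
class OpenWindowSystem:
    """Toy OI module: a base interface plus a small meta interface."""

    def __init__(self):
        self._cell = None     # (cell_w, cell_h) once the client declares a grid
        self._grid = {}       # grid-aligned windows, keyed by (column, row)
        self._windows = []    # all other windows, scanned linearly

    # ---- meta interface: affects the implementation, not the results ----
    def assume_regular_grid(self, cell_w, cell_h):
        """Client declares that most windows will be aligned grid cells."""
        if self._windows or self._grid:
            raise RuntimeError("choose a strategy before creating windows")
        self._cell = (cell_w, cell_h)

    # ---- base interface ----
    def mkwindow(self, w, h, x, y):
        """Create a window of size (w, h) at (x, y); returns (x, y, w, h)."""
        win = (x, y, w, h)
        if self._cell is not None:
            cw, ch = self._cell
            if (w, h) == (cw, ch) and x % cw == 0 and y % ch == 0:
                self._grid[(x // cw, y // ch)] = win
                return win
        # Windows that do not fit the declared layout still work, just slower.
        self._windows.append(win)
        return win

    def window_at(self, px, py):
        """Return the window containing point (px, py), or None."""
        if self._cell is not None:
            cw, ch = self._cell
            hit = self._grid.get((px // cw, py // ch))   # constant-time lookup
            if hit is not None:
                return hit
        for (x, y, w, h) in self._windows:               # linear scan
            if x <= px < x + w and y <= py < y + h:
                return (x, y, w, h)
        return None
```

A spreadsheet client would call assume_regular_grid(100, 100) once and then use mkwindow exactly as before; a client that never touches the meta interface gets the default implementation.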
OI modules already exist in many domains; Kiczales et al. have identified it as a design principle
Some examples include
Virtual memory systems have a simple basic function: a bunch of memory addresses that can be read or written. The mapping dilemmas have primarily to do with how to map virtual addresses to pages, and how to map those pages to physical memory at any given time. A classic mapping conflict happens when a client, such as a database system, does a "scan" of one portion of its memory while doing random access to another portion. A virtual memory implementation based on LRU replacement will perform poorly in this case.
One simple approach to this is captured by the Unix madvise subroutine, which "permits a process to advise the system about its expected future behavior in referencing a mapped file region or anonymous memory region."
MADV_NORMAL: The system provides no further special treatment for the memory region
MADV_RANDOM: The system expects random page references to that memory region
MADV_SEQUENTIAL: The system expects sequential page references to that memory region
MADV_WILLNEED: The system expects the process will need these pages
MADV_DONTNEED: The system expects the process does not need these pages
MADV_SPACEAVAIL: The system will ensure that memory resources are reserved
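A small sketch of how a client might use this kind of hint, written with Python's mmap module, which exposes madvise on platforms that support it (Python 3.8 and later). The function name checksum_sequential and the 4096-byte chunk size are choices made for this example. The hint is advisory only: the result is the same with or without it, which is what makes it a meta-interface call and not a base-interface call.

```python
import mmap


def checksum_sequential(path):
    """Sum all bytes of a non-empty file via a mapping, hinting sequential access."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Meta interface: advise the VM system about the access pattern.
            # Guarded because madvise and its constants are platform-dependent.
            if hasattr(m, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                m.madvise(mmap.MADV_SEQUENTIAL)
            # Base interface: ordinary reads of mapped memory, front to back.
            total = 0
            for offset in range(0, len(m), 4096):
                total += sum(m[offset:offset + 4096])
            return total
```

A database doing random lookups in another region would pass MADV_RANDOM for that mapping instead, giving the VM system the per-region information that a single LRU policy lacks.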
Parallel programming languages provide a natural way of expressing a computation to be executed in parallel. There are three principal mapping dilemmas: how to distribute the computation across the physical processors, how to distribute the data, and how to perform synchronization. A common source of mapping conflicts is in the layout of matrices. As a simple example, consider that if the compiler distributes the ith rows of two matrices A and B to the same processor, then matrix addition will require no communication overhead, but matrix multiplication will be communication intensive
HPF (High-Performance Fortran) provides some support for this using pragmas that allow the programmer to direct the compiler to use specific layouts and distributions to processors
This is, of course, not very different from register and inline in C/C++
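The matrix layout conflict described above can be made concrete by counting remote element reads under one particular layout. The sketch below assumes a row-cyclic distribution (row i of every matrix lives on processor i mod P), assumes each element of the result is computed on the processor that owns its row, and counts every remote read with no caching. These are simplifying assumptions for illustration and do not model what any real HPF compiler does.

```python
def owner(row, n_procs):
    """Row-cyclic layout: row i of every matrix lives on processor i % n_procs."""
    return row % n_procs


def remote_reads_add(n, n_procs):
    """Remote reads for C = A + B, with C[i][j] computed on the owner of row i."""
    remote = 0
    for i in range(n):
        here = owner(i, n_procs)
        for j in range(n):
            # Operands A[i][j] and B[i][j] both live in row i.
            remote += sum(owner(r, n_procs) != here for r in (i, i))
    return remote


def remote_reads_multiply(n, n_procs):
    """Remote reads for C = A * B, with C[i][j] computed on the owner of row i."""
    remote = 0
    for i in range(n):
        here = owner(i, n_procs)
        for j in range(n):
            for k in range(n):
                # A[i][k] is in row i (local); B[k][j] is in row k.
                if owner(k, n_procs) != here:
                    remote += 1
    return remote
```

With 8x8 matrices on 4 processors, addition needs no remote reads while this naive multiplication needs 384, so a layout that is ideal for one operation is poor for the other. Only the client knows which operation dominates.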
A UW dissertation from about 8 years ago, by Gail Alverson, was similar in many ways to this. A key point about Gail's work was that, for some parallel language constructs, the only way to get efficient implementations on some classes of machines was to have multiple, independent base interfaces but to have the implementations of those modules interact with each other closely and directly
There are numerous other examples from communications, from databases, more from OS, etc.
Key questions include
What are appropriate modules for OI?
I think one absolute requirement is that the function of the module is extremely well-understood; it's hard enough to build effective base interfaces at first, so trying to build a meta-interface from the beginning, without experience in terms of the functions provided, is likely to fail
Are there design techniques for producing OI modules?