v Today: Open implementation, or "clients know best"
v One particular tenet of information hiding
is the principle that clients should not know how implementations work, and
that implementations should not know how clients will use them
v One implication is that clients should all
be satisfied with the same (e.g., any) implementation
v Clearly, this is not so
Ø In particular, (as we have discussed)
performance does in fact matter
Ø Even if we make performance a part of the
interface, as we have discussed, there is a serious downside: the broad
proliferation of interfaces (multiple interfaces might be identical except for performance guarantees, and
this has problems of its own)
v Example (due to Kiczales, as is much of
this)
Ø A window system provides a mkwindow function that creates a window of a specified size at a
specified location on the screen
§
It also
provides mouse-tracking, manages the shared screen, etc., while hiding internal
data structures, implementation details, etc.
Ø Somebody who wants to build a spreadsheet
might write the following code:
§
for i := 1 to 100
  for j := 1 to 100
    mkwindow(100, 100, i*100, j*100)
  end
end
Ø
"Yet few
experienced programmers would be surprised to learn that…its performance may be
so bad as to render it useless.
Why? Because the window-system
implementation may not be tuned for this kind of use." (The implementor, for example, may have
assumed O(10) windows, not O(1000) windows.)
Ø This specific, very regularized use of windows also admits the potential for a far
more optimized implementation than may be feasible in a generalized window
system
v What are possible solutions to this
shortcoming of classic information hiding (sometimes called black-box
abstraction)?
Ø Break
up into small groups for 5 minutes to develop potential solutions
Ø Multiple interfaces, each with their own
performance guarantees
§
This expands
the name space, at the very least
Ø Multiple implementations of a single
interface, allowing the client to select which implementation is desired
§
How does the
client specify which implementation is to be selected? (One approach is pragmas.)
§
This also
expands the name space, although perhaps indirectly
Ø A smart implementation that knows how to
provide effective performance for any possible client
§
Not generally
feasible
Ø Clients can overcome the provision of a
single implementation in two different ways
§
"Hematomas
of duplication"
· Essentially, use the basic abstraction that
is provided and build another one on top of this
¨ In this case, allocate one big window and
build another window manager that breaks that big window up into the little
spreadsheet-cell-sized windows
¨ This happens a lot with memory managers
§
"Coding
between the lines"
· The client writes code in a contorted way
to get the desired performance
The window example doesn't admit a good way of
doing this, but a common case is a programmer who allocates objects in whatever
order or grouping happens to perform well through the interface, even if this is not
the "natural" way to allocate them
v Kiczales and others have proposed an
alternative approach, called open
implementation (OI)
Ø Modules provide two interfaces: the base
interface, which provides the standard functionality of the module, and the meta
interface, which allows clients to control the module's implementation (in
varying degrees)
Ø This is a limited form of reflective
computing, in which the underlying operation of a system is essentially
fully programmable
§
The semantics
of interfaces can be changed in a programmatic way
§
An aggressive
example (and there are many) of reflective computing would be changing the
meta-object protocol in Smalltalk-80
v OI modules already exist in many domains;
Kiczales et al. have identified open implementation as a design principle
v Some examples include
Ø Virtual memory systems have a simple basic
function: a bunch of memory addresses that can be read or written. The mapping
dilemmas have primarily to do with how to map virtual addresses to pages, and
how to map those pages to physical memory at any given time. A classic mapping conflict happens when a
client, such as a database system, does a "scan" of one portion of
its memory, while doing random access to another portion. A virtual memory
implementation based on LRU page replacement will perform poorly in this case.
§
One simple
approach to this is captured by the Unix madvise system call:
int madvise(caddr_t addr, size_t len, int behav);
§
"The
madvise subroutine permits a process to advise the
system about its expected future behavior in referencing a mapped file region
or anonymous memory region."
§
MADV_NORMAL: The system provides no further special treatment for
the memory region
§
MADV_RANDOM: The system expects random page references to that
memory region.
§
MADV_SEQUENTIAL: The system expects sequential page references to that
memory region.
§
MADV_WILLNEED: The system expects the process will need these
pages.
§
MADV_DONTNEED: The system expects the process does not need these
pages
§ MADV_SPACEAVAIL: The system will ensure that memory resources are reserved
Ø Parallel programming languages provide a
natural way of expressing a computation to be executed in parallel. There are
three principal mapping dilemmas: how to distribute the computation across the
physical processors, how to distribute the data, and how to perform
synchronization. A common source of mapping conflicts is in the layout of
matrices. As a simple example consider that if the compiler distributes the ith
rows of two matrices A and B to the same processor, then matrix addition will
require no communications overhead, but matrix multiplication will be
communication intensive
§
HPF
(High-Performance Fortran) provides some support for this using pragmas that
allow the programmer to direct the compiler to use specific layouts and
distributions to processors
· This is, of course, not very different from
register and inline in C/C++
§
A
dissertation from UW about 8 years ago, by Gail Alverson, was similar in many
ways to this; a key point of her work was that for some parallel language
constructs, the only way to get efficient implementations on some classes of
machines was to have multiple, independent base interfaces whose
implementations interact with each other closely and directly
Ø There are numerous other examples from
communications, from databases, more from OS, etc.
v Key questions include
Ø What are appropriate modules for OI?
§
I think one
absolute requirement is that the function of the module be extremely
well understood; it's hard enough to build an effective base interface in the
first place, so trying to build a meta-interface from the beginning, without
experience with the functions provided, is likely to fail
Ø Are there design techniques for producing
OI modules?
v Today
Ø Layering
Ø Possible projects
v Layering
Ø Layering is a standard software structuring
technique
Ø If you’ve taken OS, you’ve seen it in its
classic form in Dijkstra’s T.H.E. system
§
Level 5: User
Programs
§
Level 4:
Buffering for input and output devices
§
Level 3:
Operator Console Device Driver
§
Level 2:
Memory Management (Software Paging)
§
Level 1: CPU
Scheduling (priority) P and V operations
§
Level 0:
Hardware
Ø Parnas’ paper discusses layering in terms
of the potential ease of extension and contraction of systems
Ø Strictly, one layer can only use operations
at the next lower level
§
This prohibits, in the strictest form, invocations between components at the same level
§
This
constrains recursion
§
It prohibits
calling down multiple levels
Ø But layering does allow for a nice style of
building: a lower layer can be debugged and tested without concern for the
higher layers, since there are by definition no dependences
v Invokes vs. uses
Ø Parnas discusses how the uses relation is
distinct from the invokes relation
§
A program A
uses a program B if the correctness of A depends on the presence of a correct
version of B
§
Requires
specification and implementation of A and the specification of B
§
Again, what
is the “specification”? The interface? Implied or informal semantics?
Ø Here's an example where invokes does not
imply uses
§
Invocation
without use: name service with cached hints
· ipAddr := cache(hostName);
  if not(ping(ipAddr))
    ipAddr := lookup(hostName)
  endif
v What is the relationship, if any, between
layering and information hiding modules?
Ø HW2 has one question in this regard
Ø Strict application of layering in operating
systems is hard; for example, if you have lots of processes, you may not be
able to store all your process tables in memory, so you might have a dependency
from process table management to virtual memory
Ø Here's a small example of this, taken from
the FAMOS operating system (family of operating systems) research project from
the early 1980s
§
An OS might
have both a "process ADT" and a "segment ADT", each of
which hides secrets such as the data structures to support these functions
§
But in terms
of layering, you might have the structure:
· Level 3: Segment management
Level 2: Process management
Level 1: Segment creation
Level 0: Process creation
§
So, the
information hiding modules (the ADTs) span levels in this situation
v Language support for layering
Ø Why do we have tons of language support for
information hiding modules, but essentially none for layering?