Today: Design, managing complexity, information hiding, abstraction, encapsulation
Dijkstra:
"The technique of mastering complexity has been known since ancient times: Divide et impera (Divide and Rule). The analogy between proof construction and program construction is, again, striking. In both cases the available starting points are given (axioms and existing theory versus primitives and available library programs); in both cases the goal is given (the theorem being proved versus the desired performance); in both bases the complexity is tackled by division into parts (lemmas versus subprograms and procedures).
I assume the programmer’s genius matches the difficulty of his problem and assume that he has arrived at a suitable subdivision of the task. …"
Comments include
Mastering complexity requires decomposition
Assumes specification is defined a priori and properly
But, most important, assumes that the decomposition is in some sense straightforward and/or obvious
Composition
It may be that composition of systems from parts is becoming, or will become, even more critical than decomposition; I expect to spend more time on this later in the quarter
There is an implicit assumption that dividing alone causes conquering, independent of how the composition takes place; this is false
Michael Jackson is eloquent about how composition is key and that the notion of hierarchical decomposition alone is insufficient at best; in particular, he views composition more like CMYK color mixing, where the colors are overlaid on one another
As Parnas’s 1972 paper says:
"Usually nothing is said about the criteria to be used in dividing systems into modules."
His suggestion, of course, is to design for anticipated change
Question: does anticipating change help, hurt, or have no effect on handling unanticipated change?
This led to his work on information hiding
Now there are oodles of work on criteria for design
Object-oriented modeling
Methodologies for functional top-down decomposition, like Hatley-Pirbhai, present another
A new, alternative use of the term "information hiding":
"Many researchers are interested in hiding information or in stopping other people doing this. Current research themes include copyright, marking of digital objects, covert channels in computer systems, subliminal channels in cryptographic protocols, low-probability-of-intercept communications, broadcast encryption schemes, and various kinds of anonymity services ranging from steganography through location security to digital elections."
From the web, a classic misguided definition of information hiding.
"The process of hiding details of an object or function. Information hiding is a powerful programming technique because it reduces complexity. One of the chief mechanisms for hiding information is encapsulation -- combining elements to create a larger entity. The programmer can then focus on the new object without worrying about the hidden details. In a sense, the entire hierarchy of programming languages – from machine languages to high-level languages -- can be seen as a form of information hiding.
"Information hiding is also used to prevent programmers from changing --- intentionally or unintentionally -- parts of a program."
I found a nice missive on "Abstraction, Encapsulation, and Information Hiding" on the web (by Edward Berard); it’s linked into the lecture notes
He argues that these are related, but not identical, terms (he includes a large number of definitions of these terms from the literature)
Abstraction
The central notion here is to focus on some "more important" pieces of information while suppressing other "less important" pieces of information
Another useful point is that abstraction is used both as a noun (entity) and as a verb (process); these are both useful definitions, but they are necessarily distinct
A final critical point is that abstraction (in software at least) implies not eliminating the "details" (i.e., the less important information) but rather handling them separately; but the basic notion of abstraction does not specify mechanisms for doing this.
Information hiding
Designing modules based on (a) explicit decisions about what design decisions are likely to change and which are likely to be stable and (b) separating clients from implementations (that represent the potentially unstable decisions) using interfaces (that represent the likely stable decisions)
Clients are to be unaware of how an interface is realized
Implementers of interfaces are to be unaware of how the module will be used
This raises the issue from the second lecture, about performance, once again---what is in an interface?
Question: Is it OK for a client to even see the hidden parts?
Given smart enough tools, an information hiding decomposition needn’t be less efficient than a classic functional decomposition (in theory, at least)
Modules are also considered to be work units
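A minimal sketch of the information hiding idea above, in Python. Everything named here (SymbolTable, ListSymbolTable, DictSymbolTable, client) is hypothetical and chosen only for illustration. The interface stands for the decisions assumed to be stable; the two implementations stand for a data representation decision assumed likely to change. The client is written against the interface only, so either implementation can be swapped in without touching it.

```python
from abc import ABC, abstractmethod


class SymbolTable(ABC):
    """The interface: the decisions we are betting will stay stable."""

    @abstractmethod
    def define(self, name, value):
        """Bind name to value, replacing any earlier binding."""

    @abstractmethod
    def lookup(self, name):
        """Return the value bound to name, or raise KeyError."""


class ListSymbolTable(SymbolTable):
    """One hidden representation: an unsorted list of (name, value) pairs."""

    def __init__(self):
        self._pairs = []

    def define(self, name, value):
        # Drop any earlier binding for this name, then append the new one.
        self._pairs = [(n, v) for n, v in self._pairs if n != name]
        self._pairs.append((name, value))

    def lookup(self, name):
        for n, v in self._pairs:
            if n == name:
                return v
        raise KeyError(name)


class DictSymbolTable(SymbolTable):
    """A different hidden representation: a hash table."""

    def __init__(self):
        self._values = {}

    def define(self, name, value):
        self._values[name] = value

    def lookup(self, name):
        return self._values[name]


def client(table):
    """Client code: uses only define and lookup, never the representation."""
    table.define("x", 1)
    table.define("y", 2)
    table.define("x", 10)  # rebinding x must replace the old value
    return table.lookup("x") + table.lookup("y")
```

Note what the interface leaves unsaid: lookup is linear in one implementation and roughly constant time in the other. Whether that belongs in the interface is exactly the performance question raised above.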
Encapsulation
Like abstraction, can be a noun or a verb
As a verb, it refers to "the act of enclosing one or more items within a (physical or logical) container"
As a noun, it refers "to a package or enclosure that holds one or more items."
"It is extremely important to note that nothing is said about `the walls of the enclosure.’ Specifically, they may be `transparent,’ `translucent’, or even `opaque.’"
Encapsulation may be a mechanism to support information hiding and to capture abstractions; the degree to which it is essential to hide entities has an effect on this
Programming language constructs indeed tend to support various kinds of encapsulation; in some cases, extra information is exposed due to the needs of the compiler (for instance, Ada requires size information about hidden data elements to be exported)
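To make the distinction between encapsulation and information hiding concrete, here is a small Python sketch under stated assumptions: the class names TransparentPoint and OpaquePoint are invented for this illustration. Both classes encapsulate (each bundles two numbers into one entity), but only the second uses the enclosure to hide a design decision, namely that the point is stored in polar form.

```python
import math
from dataclasses import dataclass


@dataclass
class TransparentPoint:
    """Encapsulation with transparent walls: x and y are bundled into one
    entity, but the representation is fully visible to every client."""
    x: float
    y: float


class OpaquePoint:
    """Encapsulation in the service of information hiding: clients see
    only x() and y(); the polar representation is a hidden decision."""

    def __init__(self, x, y):
        self._r = math.hypot(x, y)       # distance from the origin
        self._theta = math.atan2(y, x)   # angle from the positive x axis

    def x(self):
        return self._r * math.cos(self._theta)

    def y(self):
        return self._r * math.sin(self._theta)
```

In Python the leading underscore is only a convention, so the walls of OpaquePoint are arguably translucent rather than opaque: a client can still read p._r, it just has been told not to. That is another instance of a language construct supporting encapsulation without fully enforcing hiding.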
I guess my main plea is to not use terms like these completely interchangeably, although the distinctions among them are certainly fine at times
One important piece of repetition from last lecture: saying we should anticipate design decisions that change is far different from giving any guidance in effectively anticipating changes
My personal suspicion is that anticipation needs to be handled in several ways
Some well-known issues, like data representation (note that this isn’t always done or useful --- Unix has benefited vastly from exposing its central data representation)
Some domain-specific issues, like business rules (tax structures, etc. --- wouldn’t it be fun to know how Intuit structures TurboTax to handle updates to the tax code in an effective way?)
Perhaps some empirical studies to give us at least a historical view of what changed, with the hope that it will give us some future guidance
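As a sketch of the domain-specific case above (business rules such as tax structures), here is one way an anticipated change could be isolated in Python. This is purely an illustration: the names TaxRule, FlatTax, BracketTax, and net_income are hypothetical, the rates used with them would be invented, and nothing here reflects how Intuit or anyone else actually structures tax software. The bet being made is that the rule for computing tax will change while the question "how much tax is owed on this income?" will not.

```python
from typing import Protocol


class TaxRule(Protocol):
    """The assumed-stable interface: given an income, report tax owed."""

    def tax_owed(self, income: float) -> float: ...


class FlatTax:
    """One volatile decision: a single rate on all income."""

    def __init__(self, rate):
        self._rate = rate

    def tax_owed(self, income):
        return income * self._rate


class BracketTax:
    """Another volatile decision: marginal rates by bracket.

    brackets is a list of (lower_bound, marginal_rate) pairs; each rate
    applies to income between its lower bound and the next lower bound.
    """

    def __init__(self, brackets):
        self._brackets = sorted(brackets)

    def tax_owed(self, income):
        owed = 0.0
        for i, (lower, rate) in enumerate(self._brackets):
            if income <= lower:
                break
            # The top bracket has no upper bound, so cap it at the income.
            if i + 1 < len(self._brackets):
                upper = self._brackets[i + 1][0]
            else:
                upper = income
            owed += (min(income, upper) - lower) * rate
        return owed


def net_income(income, rule):
    """Client code: depends on the TaxRule interface, not on any rule."""
    return income - rule.tax_owed(income)
```

A change from a flat rate to brackets, or a change in the bracket table itself, leaves net_income untouched. A change the design did not anticipate, say tax that depends on something other than income, would break the interface itself, which is the limitation the question about unanticipated change is pointing at.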