Derrick Tucker
CSE584 Fall 1998

Software Evolution


The Papers

Parnas.  1994.  "Software Aging."

Key concept:  software aging -- the metaphor of human geriatric medicine is carried throughout the discussion.

Summary:  Software aging is inevitable, either because environmental changes render an unchanging system obsolescent or because product modifications degrade the organization and design of the system.  Aging and degradation are inevitable because only perfect knowledge of future changes (and perfect execution of a design) could ensure an optimal implementation into the indefinite future.  The paper discusses: (1) the costs of aging -- lost customer base, reduced performance, reduced reliability, higher maintenance costs; (2) preventive measures -- designing for expected future changes, keeping documentation current, design reviews in the development stage; (3) geriatric ameliorative measures -- stopping deterioration, retroactive documentation, retroactive incremental modularization, amputation of lost-cause subsystems, major surgery (restructuring).

Assessment:  Great paper.  I am a big believer in the value of big-picture perspectives in a field dominated by nose-to-the-code immediacy.

Muller et al.  1993.  "A Reverse Engineering Approach to Subsystem Structure Identification."

Key concepts:  abstracting subsystem structure as directed weighted graphs, (k,2)-partite graphs, in particular; RFGs -- resource-flow graphs; CDGs -- composition dependency graphs.

Summary:  The approach concentrates on structural as opposed to functional abstraction.  RFGs represent the structure in the underlying code base.  CDGs are used to recover the structure of the system through a semiautomatic procedure.  Various measures of subsystem cohesion and inter-subsystem coupling are used, and can be tweaked, to make composition decisions.  The code base itself is not modified.
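To make the cohesion/coupling idea concrete, here is a toy sketch (my own illustration, not the authors' tool): a resource-flow graph as a weighted directed graph, with the kind of simple cohesion and coupling measures the paper uses to guide subsystem composition.  The module names and weights are invented.

```python
# Toy resource-flow graph: edge (u, v) with weight w means module u
# draws on w resources supplied by module v.
rfg = {
    ("parser", "symtab"): 5,
    ("symtab", "parser"): 2,
    ("parser", "io"): 1,
    ("codegen", "symtab"): 4,
    ("codegen", "io"): 3,
}

def cohesion(subsystem, graph):
    """Total edge weight internal to a candidate subsystem."""
    return sum(w for (u, v), w in graph.items()
               if u in subsystem and v in subsystem)

def coupling(subsystem, graph):
    """Total edge weight crossing the subsystem boundary."""
    return sum(w for (u, v), w in graph.items()
               if (u in subsystem) != (v in subsystem))

candidate = {"parser", "symtab"}
print(cohesion(candidate, rfg))  # 7: the two parser/symtab edges
print(coupling(candidate, rfg))  # 5: edges to/from io and codegen
```

A composition procedure of the paper's kind would propose groupings like `candidate`, score them with measures such as these (with tweakable thresholds), and leave the final call to the engineer.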

Assessment: Notkin's comment about the possible correlation between distance from the code and declining utility is duly noted.  That is, a system that works with a representation of the code base rather than the code base itself may be less useful than one that works directly with the code.

My main problem with the system, though, is philosophical.  The authors seem confused about exactly what it is their tool user is doing or accomplishing.  Their statement of the problem situation is the usual one in which software aging, poor documentation, etc., has led to the need for reverse engineering to understand the system structure -- such as it is.  They seem to believe, however, that (1) they can safely assume "that the same principles [of software engineering that they use to abstract structure] are adhered to during software development" and, implicitly, in previous evolutionary projects, and that (2) their tool allows the user to "recover architectural design information", even though (3) "[w]hen composing subsystem structures, software engineers make intuitive or subjective decisions based on experience, skill and insight."  Clearly, this is nonsense.  Tool users are not recovering system structure, but imputing it.  The imputed structure may, but probably will not, conform to the original architectural design.  Conformance between the original and the imputed structure is only likely when the original system was well designed and never modified.

Bowdidge and Griswold.  1994.  "Automated Support for Encapsulating Abstract Data Types."

Key concept: Star diagrams.

Summary:  The paper describes a graphical tool useful for the actual modification of the code base.  The modification it addresses is the creation / encapsulation of useful abstract data types.  The tool is useful when the user is working at a fairly low level -- with individual data structures.  The star diagram is implemented on top of a program dependency graph (PDG) constructed from an abstract syntax tree (AST).
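A miniature of the star-diagram idea can be sketched as follows (my own illustration; the real tool works over a full program dependency graph, not the bare AST walk shown here): collect every reference to a target data structure, recording the enclosing function -- the "arms" that would radiate from the star's center.  The `push`/`pop` example code is invented.

```python
import ast

SOURCE = """
def push(x):
    stack.append(x)

def pop():
    return stack.pop()
"""

def star_arms(source, target):
    """Return the functions that reference the target data structure."""
    tree = ast.parse(source)
    arms = []
    for func in ast.walk(tree):
        if isinstance(func, ast.FunctionDef):
            for node in ast.walk(func):
                if isinstance(node, ast.Name) and node.id == target:
                    arms.append(func.name)
                    break  # one arm per referencing function
    return arms

print(star_arms(SOURCE, "stack"))  # ['push', 'pop']
```

Gathering every use site of `stack` in one view is precisely what makes it practical to decide what the encapsulating abstract data type's interface (here, push and pop operations) should be.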

Assessment: Seems like a useful tool for the stated task.  Again, my biggest problem with the system is philosophical.  As with the abstraction in [Muller et al], the encapsulation here is art-ful -- intuitive, subjective, and experience-based.  There is no concern with the original design, architecture, or ontology of the system.  A collection of related functionalities is encapsulated -- related in the engineer's judgement and serving present purposes.

Murphy and Notkin.  1997.  "Reengineering with Reflexion Models: A Case Study."

Key concept: Reflexion models.

Summary:  The reflexion model system works closer to the code than [Muller et al.] and at larger granularity than [Bowdidge and Griswold].  The basic technique is for the user to define a high-level model of the system, define a map from source-code entities to the modules of the high-level model, and then lift the source code into the high-level model according to that map.  Conformances and discrepancies between the posited model and the mapped source structure are presented to the tool user.  Both the model and the map can be refined iteratively.  The user can focus refinement efforts on those parts of the model (and code base) relevant to the task at hand.  The code base is not modified.
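The core computation can be sketched in a few lines (a minimal illustration with invented module and file names, not the authors' implementation): given the high-level model's edges, a map from source files to model entities, and the dependencies actually found in the source, classify each relation as convergent, divergent, or absent.

```python
# Posited high-level model: which module-to-module dependencies
# the engineer believes exist.
model_edges = {("UI", "Core"), ("Core", "Storage")}

# Map from source files to high-level model entities.
source_map = {"menu.c": "UI", "engine.c": "Core",
              "db.c": "Storage", "cache.c": "Storage"}

# Dependencies extracted from the source code.
source_deps = {("menu.c", "engine.c"),   # UI -> Core
               ("db.c", "engine.c")}     # Storage -> Core

# Lift source dependencies into the model via the map.
lifted = {(source_map[a], source_map[b]) for a, b in source_deps}

convergences = model_edges & lifted   # predicted and found
divergences  = lifted - model_edges   # found but not predicted
absences     = model_edges - lifted   # predicted but not found

print(sorted(convergences))  # [('UI', 'Core')]
print(sorted(divergences))   # [('Storage', 'Core')]
print(sorted(absences))      # [('Core', 'Storage')]
```

The iteration loop is then: inspect the divergences and absences, revise either the model or the map, and recompute -- until the model is faithful enough for the task at hand.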

Assessment:  Sounds like a great tool for the type of task presented in the case study: thin knowledge of the code base and its design; need to understand the code base in context of task-specific goals -- that is, not particularly concerned with the original or overall structure of the code base, but with the structure relevant to a particular problem.  Very useful to develop program understanding.  Yet again, my main problem with the tool is philosophical.  If this tool were widely used and used repeatedly on the same code base, I suspect the net result would be to encourage faster and more severe aging and deterioration of systems.  Why?  Because it is a powerful tool, it would permit engineers to execute heroic ad-hoc system modifications -- and still survive.  Though this is great for fighting fires, I do not think it is the model of software evolution that we would want to encourage, or should be particularly interested in encouraging.

General Comments

Software Maintenance and Software Evolution

"Software maintenance" and "software evolution" are closely related concepts.  The distinction I find most useful is:

Reverse Engineering and Re-engineering

[Muller et al.] distinguishes between reverse engineering and re-engineering:

The following generalizations about software evolution projects seem reasonable:

Description and Prescription

The curious thing about this collection of four papers -- which I take to be meant as a short but representative survey of the software evolution literature and research -- is that we have one excellent overview paper that describes why aging / evolution is inevitable and prescribes what should be done to ameliorate the problem, and three papers describing pretty good tools to address the problems that arise if the prescriptive measures are ignored.  No tools are surveyed that help the developer follow the prescriptions and do software evolution right.  Question: Do such tools exist?

Soapbox: An Immodest Proposal

My immodest proposal is to suggest -- that is, prescribe -- how software evolution should be done and what a useful tool might look like.

If I have understood correctly everything I have heard in class and read in various places -- SEI, SEL, The Standish Group, Steve McConnell's Rapid Development, all the assigned readings -- there probably isn't a software development crisis, but there is a software development problem.  The problem is not the inherent technical complexity of software systems but, much more mundanely, the poor management of software projects.  In the context of software evolution, the tools described by [Muller et al.] and [Murphy and Notkin] are most useful when the prescriptions of [Parnas] are ignored.  That is, when software development is poorly managed.  A far more useful tool -- over the lifecycle of a software system -- would be one that operationalized Parnas's prescriptions.  What might such a tool look like?

The tool would work closely with or, ideally, be fully integrated into, a development environment like Microsoft's Developer Studio.  When a new project is created, the user must specify a high-level modular design: basic architectural style -- pipes and filters, repository, etc. -- module hierarchies, module interfaces, resource-flow graphs, composition-dependency graphs, and so forth -- a combination of the style and substance presented in [Murphy and Notkin] and [Muller et al.].  All the required info should be available when implementation begins, or you really don't have a complete design -- the first sign of poor project management.  The project design could also specify cohesion and coupling parameters (with default values) and useful naming conventions.  A sufficiently rich and flexible tool could enforce a selection of design criteria from any number of reasonable possibilities.

The key consideration is that the project-management tool will not let the individual developer add code to the code base unless it conforms to the project design.  This enforces a useful discipline on development efforts and the code base.  The design itself can, of course, be modified during initial development and during later maintenance and evolutionary projects.  But changing the project design criteria should be like changing a database schema -- something done only reluctantly and after considerable thought.  With the design environment enforcing adherence to the project design, design modification always precedes (or precludes) ad hoc modifications of the code, is under centralized control of the project manager or other powers that be, and, with version control, leaves a track that can be followed back through time and successive versions of the system to the original design.  Note that not only the individual developer but also the team as a whole and the project manager must follow the discipline of the design.  Even more importantly, the original design can only be degraded when an explicit decision is made to do so, and the active step is taken of changing the design.  Degradation cannot occur implicitly and invisibly through ignorance, indifference, inattention, or neglect.
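The check-in gate at the heart of the proposal might look something like this (a hypothetical sketch under my own assumptions -- no such tool exists that I know of; the design edges, module names, and dependency extraction are all invented inputs): the environment refuses to add code whose dependencies are not sanctioned by the project design.

```python
# Design: the only inter-layer dependencies the project design allows.
allowed = {("UI", "Core"), ("Core", "Storage")}

def check_in(module, its_layer, dependencies, layer_of):
    """Reject the change unless every dependency is in the design."""
    violations = [(its_layer, layer_of[d]) for d in dependencies
                  if layer_of[d] != its_layer
                  and (its_layer, layer_of[d]) not in allowed]
    if violations:
        raise ValueError(f"{module}: undesigned dependencies {violations}")
    return True

layers = {"db.c": "Storage", "engine.c": "Core"}
print(check_in("menu.c", "UI", ["engine.c"], layers))  # True
# check_in("menu.c", "UI", ["db.c"], layers) would raise,
# because the design has no UI -> Storage edge.
```

The developer who genuinely needs the forbidden dependency must first get the design changed -- the explicit, centralized, version-controlled step the proposal calls for -- and only then does the gate admit the code.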

If such a tool were used over the lifecycle of a software system, it would never be necessary to creatively impute a structural or functional organization or to reverse engineer an existing system.  All the needed info would be carried in the metadata of the project: what the original design was, how it evolved through time, and how the last person to touch the code explicitly understood it.  Also, given an existing system and an evolutionary project, the engineer could execute what-if scenarios on the project metadata to assess the damage to the structure of the code base that would follow from different approaches.