Introduction

===========================================================================

This quarter, we will think deeply about the software development process.
This will be useful no matter what you eventually do.

We will think especially about quality and how to achieve
it (because it's most enjoyable and useful to build
software that works).

The main activity of the class is a group project, in which you will implement or evaluate a software engineering tool.  You will learn software engineering both by doing it and by reflecting on the process.
You get to scratch your own itch by solving some development challenge that has bothered you.  Furthermore, the results of your project might help other programmers!

We will learn about the two main competencies that you must have in order to build successful software projects:
 * technical
 * managerial

===========================================================================

What is software?
It's a recipe for achieving some goal.
It's a set of instructions that, when performed, transform information in one form into information in another form.

===========================================================================

What comprises a software system?
What are the parts of a software system, or the artifacts that a developer works with day-to-day?
(I'm not asking about tools such as the IDE.)
 * code
 * tests
 * documentation
    * requirements
    * architecture
    * specifications
 * version control history
 * issue tracker
 * build system files
 * ... much more!

===========================================================================

What tasks does a developer perform?

Initial development:

   design: decompose into modules
    * determine quality of the design

   write specifications

   write code
    * code completion
    * what part of a library to use, how to use it

   write documentation
    * method-level
    * tutorials

   write tests
    * inputs
    * oracles
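
The two parts of a test listed above (inputs and oracles) can be sketched
separately.  This is only an illustration: sort_under_test, oracle,
generate_inputs, and run_tests are hypothetical names, and sorting is an
assumed stand-in for whatever code is being tested.

```python
import random
from collections import Counter


def sort_under_test(xs):
    """Hypothetical code under test; a stand-in for real project code."""
    return sorted(xs)


def oracle(inp, out):
    """Oracle: decides whether `out` is an acceptable result for `inp`."""
    is_ordered = all(a <= b for a, b in zip(out, out[1:]))
    same_elements = Counter(inp) == Counter(out)
    return is_ordered and same_elements


def generate_inputs(n, seed=0):
    """Inputs: a few hand-picked edge cases plus n random lists."""
    rng = random.Random(seed)  # fixed seed so that failures are reproducible
    edge_cases = [[], [1], [2, 1], [3, 3, 3]]
    randoms = [
        [rng.randint(-10, 10) for _ in range(rng.randint(0, 8))]
        for _ in range(n)
    ]
    return edge_cases + randoms


def run_tests():
    """Return every input on which the oracle rejects the output."""
    return [
        inp
        for inp in generate_inputs(50)
        if not oracle(inp, sort_under_test(list(inp)))
    ]


if __name__ == "__main__":
    print("failing inputs:", run_tests())
```

Note that the oracle checks a property of the output (ordered, same
elements) rather than comparing against a stored expected value, so it
works for generated inputs as well as hand-written ones.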

   integration
   deployment

   where to devote extra QA time
    * which modules are most error-prone or have latent errors?

Deployment:

  ...

Maintenance:

   security review (injection attacks, more)

   optimization/performance tuning

   refactoring

   code review

   code understanding or reverse engineering (often necessary, not an end in itself)

   debugging/bug fix or feature request
    * for bugs:
       * is it actually a bug?
       * reproduce it
       * minimize the test case
    * who should fix it?
    * what is the relevant code?
    * what are the impacts of this change?
       * on functionality
       * on code structure (modules, architecture)
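
The "minimize the test case" step above can be partly automated.  The
following is a minimal sketch of a greedy reducer, a simplified relative of
delta debugging rather than the full algorithm.  It assumes the failing
input is a list and that a hypothetical predicate still_fails reports
whether a candidate input still reproduces the bug.

```python
def minimize(failing_input, still_fails):
    """Shrink a failing list input while `still_fails` keeps returning True.

    Tries to delete chunks of elements, starting with large chunks and
    halving the chunk size.  Stops when no single element can be removed.
    """
    assert still_fails(failing_input), "start from an input that reproduces the bug"
    current = list(failing_input)
    chunk = max(len(current) // 2, 1)
    while True:
        i = 0
        removed_any = False
        while i < len(current):
            candidate = current[:i] + current[i + chunk:]
            if still_fails(candidate):
                current = candidate  # keep the smaller input; retry at the same position
                removed_any = True
            else:
                i += chunk  # this chunk is needed; move on
        if chunk == 1 and not removed_any:
            return current
        chunk = max(chunk // 2, 1)


if __name__ == "__main__":
    # Assumed bug for illustration: the failure needs both 3 and 7 in the input.
    def still_fails(xs):
        return 3 in xs and 7 in xs

    print(minimize([1, 3, 5, 7, 9, 2, 4, 6], still_fails))  # prints [3, 7]
```

The result is minimal only in the sense that removing any one remaining
element makes the failure disappear; a smaller failing input that is not a
sub-list of the original would not be found.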

===========================================================================

Why quality?

Motivation: bad problems

$312 billion per year global cost of software bugs (2013)
$300 billion dealing with the Y2K problem
$440 million loss by Knight Capital Group Inc. in 30 minutes in August 2012
$650 million loss by NASA Mars missions in 1999; unit conversion bug
$500 million Ariane 5 maiden flight in 1996; 64-bit to 16-bit conversion bug

That amount of money could end world hunger.
$300 billion = 3 meals a day for a year for every person in the world
= 300 B-2 bombers
= 1.5% of US economy

Those are the cheap bugs!

Software bugs can cost lives
1997: 225 deaths: jet crash caused by radar software
1991: 28 deaths: Patriot missile guidance system
2003: 11 deaths: blackout
1985-2000: >8 deaths: radiation therapy
2011: Software is the cause of 25% of all medical device recalls

===========================================================================

What is quality?

Software quality:  "the program does what it is supposed to do"
 (our expectations match the behavior of the software)
Break this down into:
 * what is the program supposed to do?
 * does the program do that?

External vs. internal quality
 * external quality: visible to users of the software
    * correct (satisfies requirements)
    * robust/stable
    * not our focus:
       * usable
       * user delight
    * efficient, scalable
    * fault-tolerant
    * secure
 * internal quality: visible to implementers/programmers of the software
    * correct (satisfies specifications)
    * readable/understandable
    * extensible
    * modifiable/maintainable
    * debuggable
    * reusable
We will focus somewhat more on external quality.

Digression about internal quality:
What percentage of effort should be spent on each of these?
 * initial development
 * maintenance
The real breakdown is about 10% initial development and 90% maintenance.
Maintenance includes:
 * bug fixing
 * refactoring
 * optimization
 * specification/requirements changes
    * new features
    * regulations
    * port to new environment (OS, framework, programming language, data center)
 * security updates
These are not predictable.  To try to accommodate all these in the original
design would be a bad use of resources, and the project would fail.
We don't care about internal quality for its own sake, but because it will
make the 90% of the development life cycle that is devoted to maintenance
cheaper and easier.  But accept that you cannot predict all possible changes
and requests.

Back to external quality:

What it is supposed to do:
 * requirements
    * validation vs. verification
       * this class will focus on verification
 * specification

Does it do it?  Two ways to tell: dynamic analysis and static analysis

===========================================================================

Approaches to achieve quality

Dynamic analysis: runs the program
 * examples:
    * testing
    * log analysis
    * profiling
    * debugging
 * Testing:
    * precise
    * no guarantee for future executions
 * Debugging
    * reproduce failure
    * locate defect
    * fix defect
 * Given the question "Is my program correct?"
   Testing produces an answer of either
    * "incorrect", or
    * "I don't know"
   [exception: exhaustive testing]
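
The possible answers from testing can be made concrete with a small sketch.
Everything here is hypothetical: buggy_abs is an implementation with a
defect seeded at one input, abs_oracle is the oracle, and the "whole
domain" is assumed to be the 8-bit signed integers.

```python
def buggy_abs(x):
    """Hypothetical implementation with a seeded defect at x == -3."""
    return x if x >= 0 or x == -3 else -x


def abs_oracle(x, y):
    """Oracle for absolute value: non-negative and equal to x or -x."""
    return y >= 0 and y in (x, -x)


def verdict(impl, inputs, oracle, exhaustive=False):
    """Answer "Is my program correct?" from test executions alone.

    Pass exhaustive=True only if `inputs` covers the entire input domain.
    """
    for x in inputs:
        if not oracle(x, impl(x)):
            return ("incorrect", x)  # a failing run is definite evidence
    # All runs passed: that proves nothing about inputs that were not run.
    return ("correct", None) if exhaustive else ("I don't know", None)


if __name__ == "__main__":
    print(verdict(buggy_abs, [-1, 0, 1, 5], abs_oracle))
    print(verdict(buggy_abs, range(-128, 128), abs_oracle, exhaustive=True))
    print(verdict(abs, range(-128, 128), abs_oracle, exhaustive=True))
```

The sampled run misses the defect and can only say "I don't know".  The
exhaustive run finds it, and for the built-in abs it can say "correct", but
only because the domain was small enough to enumerate.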

Static analysis: does not run the program, but reasons about what the
program would do if it were run
 * examples:
    * compiler
    * type systems
    * linters
    * dataflow analysis, abstract interpretation, symbolic execution
       * treated as synonyms for the purposes of this class.
    * code reading
       * Introspect; watch what you do and be inspired to improve or automate it
    * model checking (can be done statically or dynamically)
 * Can be sound:  gives a proof or guarantee
    * imprecise -- uses abstractions
 * Given the question "Is my program correct?"
   Verification (proof) produces an answer of either
    * "correct", or
    * "I don't know"
   [exception: sometimes also "incorrect", but "I don't know" is always possible]

It is theoretically impossible to have a precise, complete analysis that
always terminates.  Such an analysis could be used to solve the Halting
Problem, which is undecidable.

Key idea of static analysis:
Abstraction, or throwing away information.
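
As an illustration of abstraction, here is a minimal sketch of a sign
analysis.  It throws away every value and keeps only "neg", "zero", "pos",
or "unknown".  It assumes integer-valued variables and handles only + and *
over names and numeric literals; anything else is treated as "unknown".
The names sign_of and always_positive are hypothetical.  The expression is
parsed with Python's ast module but never evaluated.

```python
import ast

NEG, ZERO, POS, TOP = "neg", "zero", "pos", "unknown"


def abs_add(a, b):
    """Sign of a sum, given only the signs of the operands."""
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    if a == b and a != TOP:
        return a  # pos + pos is pos; neg + neg is neg
    return TOP  # e.g. pos + neg could be anything


def abs_mul(a, b):
    """Sign of a product, given only the signs of the operands."""
    if ZERO in (a, b):
        return ZERO
    if TOP in (a, b):
        return TOP
    return POS if a == b else NEG


def sign_of(node, env):
    """Abstractly evaluate an AST node; `env` maps variable names to signs."""
    if isinstance(node, ast.Expression):
        return sign_of(node.body, env)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        v = node.value
        return POS if v > 0 else NEG if v < 0 else ZERO
    if isinstance(node, ast.Name):
        return env.get(node.id, TOP)
    if isinstance(node, ast.BinOp):
        left, right = sign_of(node.left, env), sign_of(node.right, env)
        if isinstance(node.op, ast.Add):
            return abs_add(left, right)
        if isinstance(node.op, ast.Mult):
            return abs_mul(left, right)
    return TOP  # unsupported construct: give up soundly


def always_positive(source, env):
    """Answer "is this expression always > 0?" without running it."""
    sign = sign_of(ast.parse(source, mode="eval"), env)
    return "correct" if sign == POS else "I don't know"


if __name__ == "__main__":
    print(always_positive("x*x + 1", {"x": NEG}))          # prints correct
    print(always_positive("x*x + 1", {}))                  # prints I don't know
    print(always_positive("x + y", {"x": POS, "y": NEG}))  # prints I don't know
```

The second call shows the cost of abstraction: x*x + 1 is positive for
every integer x, but once the analysis has forgotten that both factors are
the same variable, it can only answer "I don't know".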

===========================================================================