How to write quality code

This class has introduced you to exciting, deep ideas in program analysis. It has also given you practical experience in constructing code analysis tools, and you have learned a lot about how to evaluate the efficacy of a tool (which is harder than you expected!). Now, we will put this in the perspective of a working programmer. In some cases, it makes sense to use the techniques and tools that we studied. In other cases, the ideas will help you to be a better designer and programmer, even though no tools are practical today. In yet other cases, the ideas are exciting and interesting, but their primary value is to inspire research that will make them practical in the future.

These are my opinions, based on decades of experience as a working programmer and on decades of research in software engineering, programming languages, and related topics. You don't have to do what I do. It is better to develop your own style that works for you. However, before you do, you should understand why these practices work for me, and then pick and choose the best ones and improve the others. That is what I have done with the advice I have received.

I like the books "Pragmatic Programmer" and "Effective Java" (both are required texts for CSE 331, and if you didn't read them then you should do so now). I read "Code Complete" many years ago and don't know how well it has aged, but it was good then.

Here are some general approaches to achieving code quality:
* process (discipline, human work)
* testing
* automated analysis

This class has focused on automated analysis because
* it can give a proof, and it offers guarantees that no other approach can
* it expresses powerful, beautiful technical ideas that you won't learn anywhere else

Should you use an automated analysis?
* sometimes!
* before you try, you never think it's going to be worthwhile
  * it's a pain to set up
  * surely *your* code doesn't contain any of that sort of error
* every time I run an automated analysis, I find something I want to fix

Lightweight tools:
* code formatting
  * worth it! (get over the fact that it isn't quite as nice as your manual formatting)
  * otherwise, far too many code review comments are about formatting
* linters (FindBugs, PMD, etc.)
  * easy to run
  * every 3 or 4 years I try again
  * I have never found them worthwhile: too many of the rules are trivialities or overly strict
  * I do use Error Prone, but it's extremely limited and I still have to disable some rules
  * linters may work better for brand-new code that can be lint-clean from the beginning
  * many people find linters useful
  * they do enforce coding guidelines and reduce uninteresting code review comments

List of tools we have looked at:
* Randoop test generation: I don't use it
  * except occasionally to find bugs in, say, equals methods
  * I don't use the generated tests as regression tests
  * oracles are still too weak: too many illegal tests, too many errors not caught
* Checker Framework pluggable type-checking: I do use it (a small sketch appears after this list)
  * I enjoy it -- it's like a puzzle
  * finds bugs
  * documentation and code clarity/structure are big benefits, perhaps bigger than finding bugs
* Model checking
  * exciting idea, worthwhile in some domains; I haven't found it useful
* Code synthesis
  * too niche right now
* Others...
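To make the pluggable type-checking bullet concrete, here is a minimal sketch that uses the Checker Framework's Nullness Checker. The Directory class and lookupEmail method are hypothetical, invented for illustration; the @Nullable annotation and the checker invocation are the real Checker Framework API.

    import org.checkerframework.checker.nullness.qual.Nullable;

    public class Directory {

      /** Returns the email address for the given user, or null if the user is unknown. */
      public @Nullable String lookupEmail(String username) {
        return username.equals("admin") ? "admin@example.com" : null;
      }

      public int emailLength(String username) {
        @Nullable String email = lookupEmail(username);
        // Without this null test, the Nullness Checker reports a compile-time
        // error (a "dereference of possibly-null reference") on email.length().
        if (email == null) {
          return 0;
        }
        return email.length();
      }
    }

Run the checker at compile time (assuming the Checker Framework distribution is on the classpath):

    javac -processor org.checkerframework.checker.nullness.NullnessChecker Directory.java

The annotation documents a design decision (lookupEmail may return null), and the checker turns that documentation into a machine-checked guarantee; this is why the documentation and code-clarity benefits can be even bigger than the bug-finding benefit.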
Abstractions are the key to managing complexity
* design yours carefully

Documentation
* Most important part of your software, even more so than the code
* I always determine the spec (and document it) before writing the code
  * if you find the documentation difficult to write, then either your abstractions are bad or you don't understand them. It is much easier to fix bad abstractions before you have written the code than after, and it is easier (and much less error-prone) to program if you understand the abstractions. So writing the documentation first will save you a lot of time. Examining the documentation is a quick and easy way to discover problems.
* I usually write the user documentation before writing the code (I should always do so!)
  * Think about the problem from the user's point of view. What user problem does the software solve? Why does the user care? Your documentation should discuss only these issues, regardless of how cool the other parts are or how much time you spent on them.
  * Once you have to explain how to use your tool, you usually see ways to improve the design
  * you may also improve the design because you become embarrassed by the current one
  * Manuals are a bit out of style today. This is a shame. You should write one even if you don't expect most users to read it. It will also save time when answering user questions.
* Undocumented code has no commercial value
  * I refuse to review code that is not documented
  * at a bare minimum, every class, field, and method (both public and private) must have at least one sentence of documentation; more is usually better
  * the Javadoc tool will complain if you are missing @param or @return tags
    * sometimes pedantic, but usually worth doing anyway. Keep this turned on.

Testing
* I write extensive unit tests for anything that feels like a library
  * cleanly separated from the rest of the project
  * could be reused separately, because it has no dependencies on the rest
  * errors in these components can be hard, and demoralizing, to track down when you are trying to focus on the main functionality
* I write system tests to test the overall operation
* Some programming methodologies, such as extreme programming and other agile methodologies, place a very heavy emphasis on testing every component and on designing for testability.
  * having all those tests is valuable
  * writing and maintaining all those tests does not feel like a productive use of time to me, compared to other development and quality activities
  * agile methodologies arose in the context of dynamically typed languages, where you don't even have a type system to help find errors, and where other analysis tools (even refactoring!) don't exist
* I sometimes write the tests before writing the code
  * I should do this more often.
  * Tests written by someone who didn't (yet) write the code provide a valuable external perspective on the code and its documentation. It's not crucially important whether that person's title is "tester" or "developer".

Automate everything
* manual work distracts you at the most inopportune times, such as deadlines
* manual work leads to mistakes
* manual work may be hard to reproduce
* For automated testing, my current favorite tool is Travis CI (https://travis-ci.com/)

Code review
* Single most effective code quality practice
* Feedback from other people
  * a different perspective: reviewers don't know or assume the same things you do
  * clearer code
  * they notice bugs
* When I have cut corners on reviewing code submitted by other people, I have suffered terribly later when I have had to maintain it.
* Multiple rounds of code review feedback are the norm
  * you are not done after the first round!
* When your code is being reviewed, it's a bit irritating because you thought you were done. However, if there are comments, then you weren't really done, and you should be grateful that your software is now better.
* A great way to learn how to write great code is by reading great (and not-so-great) code
* Code review communicates team norms to new coders

When there is a bug:
* admit that this means I screwed up
* reproduce it first
  * for about 50% of bugs, this is the biggest task
  * (those 50% are not the hardest bugs)
* create a test case (see the regression-test sketch below)
* probably write more tests
* ensure that the bug is actually fixed
* never fix only the one bug that was discovered
  * if I made a mistake in one place, I probably made it elsewhere too
  * look everywhere else that you might have made the same mistake

A bug report should include:
* exact inputs, such as files
  * the developers will appreciate it if you minimize them
* the exact command that reproduces the output
* the exact output
* expectations about what the program should have done
* environment: OS and tool version numbers

Debugging
* minimize
  * different things to minimize:
    * input (also minimizes run time)
    * commands
    * code (or the span of version control history between working and non-working versions)
  * sometimes useful for localization, sometimes not
  * always useful for making a test case that runs fast and can be included in the regression test suite
* what tool to use?
  * debugger
    * the debugger may be complex; is a debugger even available?
    * you can examine any information; great for exploratory work
    * you examine a snapshot at one moment in time
    * heisenbugs may disappear under the debugger
  * logging/tracing
    * easy to use
    * you must predict what information you will need; this can make the turnaround slow when you want to collect just a little bit more information
    * logging output clutters the code
    * you can go forward and backward in time by traversing the log
    * you can compare two inputs by diffing the logs
    * you can search the log for regular expressions in your editor

Bug fixing consists of three activities:
* reproduce
* locate (e.g., delta debugging)
* fix the code
But you should also:
* find similar bugs
* figure out how to prevent them in the future

How to understand a new codebase
* write documentation and add tests
  * this ensures understanding and prevents errors
  * be afraid to make changes without tests
* don't read the code for its own sake; instead, try to perform some task

Stack Overflow is great
* it's right 90% of the time
* but don't trust it 100% of the time
  * especially for conceptual material (like immutability)
  * information also gets out of date
* don't just cut and paste
* books are great because someone has bothered to organize the material
  * reward authors with your purchases
* Don't treat Stack Overflow as the user manual for your software.

Listen to users, and take their comments seriously
* if you have no users, you have a serious problem
* improve the software and the manual based on their problems, and then future users will have fewer problems
* write your software as if it had users, or it will never have any

Don't do anything twice.
* When I receive a question from a user or another developer, I:
  * look in the documentation
  * write a new section if the documentation does not answer the question
  * copy-and-paste the answer from the documentation into my reply

Use libraries
* It can be fun to implement new code.
* Avoid that when possible.
* Use a library even if you have to fix bugs in someone else's code.
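Here is the regression-test sketch promised in the bug-handling list above: a minimal example in JUnit 5 of turning a minimized, reproducing input into a permanent, fast regression test. The parseVersion method and its former bug are hypothetical, invented for illustration.

    import static org.junit.jupiter.api.Assertions.assertArrayEquals;

    import org.junit.jupiter.api.Test;

    public class VersionParserTest {

      /** Hypothetical code under test: parses "major.minor[.patch]" into three ints. */
      static int[] parseVersion(String s) {
        String[] parts = s.split("\\.");
        // The (now-fixed) bug: this used to read parts[2] unconditionally,
        // throwing ArrayIndexOutOfBoundsException on inputs such as "2.1".
        int patch = parts.length > 2 ? Integer.parseInt(parts[2]) : 0;
        return new int[] {Integer.parseInt(parts[0]), Integer.parseInt(parts[1]), patch};
      }

      /** Regression test: the minimized input that reproduced the bug. */
      @Test
      public void handlesVersionWithoutPatchComponent() {
        assertArrayEquals(new int[] {2, 1, 0}, parseVersion("2.1"));
      }

      /** The ordinary case, to ensure the fix didn't break existing behavior. */
      @Test
      public void handlesFullVersion() {
        assertArrayEquals(new int[] {1, 2, 3}, parseVersion("1.2.3"));
      }
    }

Because the reproducing input was minimized first, the test runs fast enough to live in the regression suite forever, and its comment records which bug it guards against.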
Version control:
* always run "git diff" before you run "git commit". This will prevent you from including stray or temporary changes in your commit.
For more advice on version control, see: https://homes.cs.washington.edu/~mernst/advice/version-control.html

Tools:
* You cannot achieve great results without using great tools and becoming expert at them. If you will spend a significant amount of time in a tool (such as an editor, IDE, or debugger), then learn it well. I have found it worthwhile to read the entire manual, in order to understand the concepts and what functionality is valuable.
* Seek tools that will accelerate your work by a lot, not just save a few keystrokes here and there. However, avoiding breaking your flow can be a reason to automate small tasks (and it is a better reason than merely saving time).
* Every tool is good at some things and poor at others. Know which they are, and know how this affects your work. Don't get into pointless religious arguments about small points of a tool.
* I use regular expressions dozens of times per day. If you don't know how to use them, consider learning. (A small example appears at the very end of these notes.)

Feedback from the class:
* What practices have you found most effective for producing quality code?
* What practices have you learned at an internship?
* What practices did the company use that you consider useless or even counterproductive?

----------------

Koans of CSE 403:
* the importance of abstraction, which is the hardest decision when designing a program analysis
* tradeoffs between precision and efficiency
* ... lots more for the class to fill in!
* "What's the specification?"
* Abstraction
  * for a program analysis
  * it trades off precision and cost
  * the most important decision you make about the program analysis
* soundness: no false positives
  * a sound tool is right if it does not answer "maybe"
* completeness: the tool never says "maybe"
* usefulness
* tool output: yes, maybe, no
  * a tool usually gives just two possible answers
  * some tools output "yes" or "maybe"
  * other tools output "maybe" or "no"
* Goals in helping a programmer
  * find a bug
  * prove the program correct
* Testing can be complete and sound, if it is exhaustive
* If the goal is "find a bug", here is a tool:
  * it outputs either "bug found" (i.e., "yes") or "no bug found" (i.e., "maybe")
* If the goal is "prove correct", here is a tool:
  * it outputs either "bug found" (i.e., "no") or "no bug found" (i.e., "maybe")
* Exercise: what is the relationship among these? For each, is it sound, and is it complete?
* Analysis efficiency
  * symmetry reduction
  * test suite minimization
* Testing
  * goal: find bugs
  * evaluation: coverage
* Dynamic and static analyses are duals
* Model checking
  * explicit-state:
    * symmetry reduction
    * state hashing
    * bounds
  * symbolic

===========================================================================
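As promised in the Tools section above, here is a small, self-contained example of everyday regular-expression use in Java; the log line and the pattern are made up for illustration.

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RegexDemo {
      public static void main(String[] args) {
        // Hypothetical line from a tool's output. A bug report should include
        // exact version numbers, and a regex can extract them mechanically.
        String line = "mytool version 3.42.0 (built 2024-01-15)";
        Pattern p = Pattern.compile("version (\\d+)\\.(\\d+)\\.(\\d+)");
        Matcher m = p.matcher(line);
        if (m.find()) {
          // Prints: major=3 minor=42 patch=0
          System.out.println(
              "major=" + m.group(1) + " minor=" + m.group(2) + " patch=" + m.group(3));
        }
      }
    }

The same pattern works unchanged in grep, in an editor's search box, and in most other languages, which is part of why regular expressions repay the effort of learning them.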