GREATEST HITS

These are koans of CSE 599E1.


EXPERIMENTAL DESIGN

learnability != usability != usefulness
  Use training tasks to ensure that you are not measuring learnability

Your experiment should include training on any task or tool that is new to the subjects.
The training should have the subjects do the same kind of task that the experiment requires.
It is not adequate for the training to consist of subjects reading a webpage or
thinking about the tool or the task. Don't trust subjects who believe that they
understand how to use the tool; have them actually use it.

Measure what matters (not what is easy to measure).
Whenever possible, measure the end-to-end task.
If you measure a part of an overall task, 
then justify your choice by indicating what percentage of the overall task it is.
If your task is an infrequent one rather than the subjects' whole job,
then justify your choice by indicating how frequently that task comes up.
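
The justification above can be made concrete with Amdahl's-law-style arithmetic.
The following is a minimal sketch, not a prescribed method: the function name
overall_speedup and the numbers (a subtask that is 20% of the job, made 2x
faster) are illustrative assumptions, not data from any study.

```python
def overall_speedup(fraction_of_task: float, part_speedup: float) -> float:
    """Estimate end-to-end speedup when only part of a task gets faster.

    fraction_of_task: share of total task time spent in the measured part (0..1).
    part_speedup: how many times faster the measured part becomes.
    """
    if not 0.0 <= fraction_of_task <= 1.0:
        raise ValueError("fraction_of_task must be between 0 and 1")
    if part_speedup <= 0:
        raise ValueError("part_speedup must be positive")
    # Time for the untouched remainder of the task is unchanged.
    remaining = 1.0 - fraction_of_task
    return 1.0 / (remaining + fraction_of_task / part_speedup)


# Hypothetical numbers: the tool doubles speed on a subtask that is 20% of the job.
print(round(overall_speedup(0.20, 2.0), 2))  # 1.11
```

Under these assumed numbers, a 2x improvement on the measured part is only
about an 11% improvement end to end, which is why the percentage matters.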

Don't trust the user.  People are unreliable witnesses about their own minds.
Whenever it is possible to measure an effect rather than asking your
subjects their opinion, do the measurement.

If you cannot explain your results, they are likely wrong
  Have a causal chain that justifies why the change in inputs leads to the change in outputs
  If you cannot do this, then your results may be a statistical fluke,
    and others are less likely to believe your results even if they are correct.


WHEN TO DO A CONTROLLED EXPERIMENT

Consider alternatives to a controlled experiment
  Oftentimes, a controlled experiment is not the best choice

A controlled experiment is usually necessary only when there is a
controversy or a legitimate doubt in the scientific community.  If there is
no controversy, then it may not be worth spending a lot of time to test a
hypothesis that everyone agrees on (whether to accept it or reject it).
  
Use the right method.  Recall the chart of techniques:
          Qualitative    Quantitative
Ask       interview      survey
Observe   case study     controlled experiment

Consider the opportunity cost
  Even if some task is worth doing, it may not be the best use of your time

It's OK for you to have an expectation about the outcome,
  but don't let that affect your behavior as a scientist.


WHAT CONTROLLED EXPERIMENT TO DO

Have a research question
It should focus on what is important
  The bottleneck in some process
  The weakest part of a previous result or of your plans
  The thing that keeps you up at night worrying about whether your technique will work overall
  This means that you might put off a controlled experiment until 

Ensure that your results will be actionable.
A positive result should change the way that people think and act.
If it doesn't, then your experiment is probably a waste of time.
Avoid doing experiments just because the results are "interesting".
  Show a connection to industrial or scientific practice.

Depending on your null hypothesis,
a positive result might show that the treatment is better or no better than the control.
Independently, it might support or refute your own expectation.


STATISTICS

Correlation != causation
Statistical significance != practical significance
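
To illustrate the second point, here is a minimal sketch using a standard
large-sample two-sided z-test. Everything in it is an assumption made for
illustration: the helper name two_sample_z_test, the synthetic task times
(they are NOT real data), and the choice to treat the two lists as independent
samples. With enough subjects, a half-second saving on a five-minute task can
be statistically significant while hardly mattering in practice.

```python
import math
import statistics


def two_sample_z_test(a, b):
    """Two-sided large-sample z-test for a difference in means.

    Returns (difference in means, two-sided p-value). It relies on the
    normal approximation, so it is only appropriate for large samples.
    """
    diff = statistics.fmean(a) - statistics.fmean(b)
    std_err = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    z = diff / std_err
    # Two-sided tail probability of a standard normal beyond |z|.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return diff, p_value


# Synthetic task times in seconds (made up for this example, NOT real data):
# the control group averages 300 s; the treatment is 0.5 s faster on every trial.
n = 10_500
control = [300 + (i % 21) - 10 for i in range(n)]
treatment = [t - 0.5 for t in control]

diff, p = two_sample_z_test(control, treatment)
relative = diff / statistics.fmean(control)
print(f"mean saving: {diff:.2f} s ({relative:.2%} of the task), p = {p:.2g}")
```

The p-value here falls far below the conventional 0.05 threshold, yet the
saving is well under 1% of the task time. Report the effect size, not only
the p-value.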

You can reject a hypothesis, but you cannot prove a hypothesis (don't claim you have!)

No statistical innovation -- use well-understood techniques
  Leave creation of novel techniques to statistics experts

Your statistical intuition is terrible
  Don't make guesses by eyeballing data.
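
One well-understood alternative to eyeballing is an exact permutation test,
sketched below under stated assumptions: the function name
exact_permutation_test and the two lists of task times are hypothetical,
made up for illustration. The test asks how often a random relabeling of the
pooled data yields a difference in means at least as large as the one observed.

```python
from itertools import combinations
from statistics import fmean


def exact_permutation_test(a, b):
    """Exact two-sided permutation test for a difference in means.

    Enumerates every way to split the pooled data into groups of the
    original sizes, so it is only feasible for small samples.
    """
    pooled = list(a) + list(b)
    observed = abs(fmean(a) - fmean(b))
    total = sum(pooled)
    extreme = 0
    count = 0
    for idx in combinations(range(len(pooled)), len(a)):
        sum_a = sum(pooled[i] for i in idx)
        mean_a = sum_a / len(a)
        mean_b = (total - sum_a) / len(b)
        # Small tolerance guards against floating-point rounding.
        if abs(mean_a - mean_b) >= observed - 1e-12:
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical task times in minutes for two small groups (NOT real data).
tool_a = [12, 15, 14, 10, 13, 18]
tool_b = [16, 11, 17, 19, 14, 15]
p_value = exact_permutation_test(tool_a, tool_b)
print(f"difference in means: {fmean(tool_b) - fmean(tool_a):.2f} min, p = {p_value:.3f}")
```

Write down your eyeballed guess before running it, then compare that guess
with the computed p-value.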


MISCELLANEOUS

Run a pilot study!  Then run more pilots.
You are certain to learn a lot about your tool/treatment
and about your experimental design and methodology.

Think about threats to validity.  A few examples are:
 * subject bias
 * survivor bias
 * researcher bias (you are prone to it no matter how hard you try to avoid it;
   and of course you must try hard to avoid it nonetheless)

Ethics -- don't harm people or society


MORE ADVICE TO OTHERS CONSIDERING A USER STUDY

Experimentation is hard, time-consuming work
  Start early

Get feedback
  from pilot studies
  from experienced people even before you perform pilot studies