GREATEST HITS

These are koans of CSE 599E1.

EXPERIMENTAL DESIGN

* learnability != usability != usefulness
* Use training tasks to ensure that you are not measuring learnability.
  * Your experiment should include training on any task or tool that is new to the subjects.
  * The training should have the subjects do the same kind of task that the experiment requires.
  * It is not adequate for the training to consist of people reading a webpage, or thinking about the tool or about doing the task. Don't trust users who think that they understand how to use the tool.
* Measure what matters (not what is easy to measure).
  * Whenever possible, measure the end-to-end task.
  * If you measure a part of an overall task, then justify your choice by indicating what percentage of the overall task it is.
  * If your task is an infrequent one rather than the subjects' whole job, then justify your choice by indicating how frequently that task comes up.
* Don't trust the user.
  * People are unreliable witnesses about their own minds.
  * Whenever it is possible to measure an effect rather than asking your subjects their opinion, do the measurement.
* If you cannot explain your results, they are likely wrong.
  * Have a causal chain that justifies why the change in inputs leads to the change in outputs.
  * If you cannot do this, then your results may be a statistical fluke, and others are less likely to believe your results even if they are not a fluke.

WHEN TO DO A CONTROLLED EXPERIMENT

* Consider alternatives to a controlled experiment.
  * Oftentimes, a controlled experiment is not the best choice.
  * A controlled experiment is usually necessary only when there is a controversy or a legitimate doubt in the scientific community.
  * If there is no controversy, then it may not be worth spending a lot of time to test a hypothesis that everyone agrees on (whether to accept it or reject it).
* Use the right method.
  Recall the chart of techniques:

               Qualitative   Quantitative
    Ask        interview     survey
    Observe    case study    controlled experiment

* Consider the opportunity cost.
  * Even if some task is worth doing, it may not be the best use of your time.
* It's OK for you to have an expectation about the outcome, but don't let that affect your behavior as a scientist.

WHAT CONTROLLED EXPERIMENT TO DO

* Have a research question.
  * It should focus on what is important:
    * The bottleneck in some process
    * The weakest part of a previous result or of your plans
    * The thing that keeps you up at night worrying about whether your technique will work overall
  * This means that you might put off a controlled experiment until
* Ensure that your results will be actionable.
  * A positive result should change the way that people think and act. If it doesn't, then your experiment is probably a waste of time.
  * Avoid doing experiments just because the results are "interesting".
  * Show a connection to industrial or scientific practice.
  * Depending on your null hypothesis, a positive result might show that the treatment is better or no better than the control. Independently, it might support or refute your own expectation.

STATISTICS

* Correlation != causation
* Statistical significance != practical significance
* You can reject a hypothesis, but you cannot prove a hypothesis (don't claim you have!)
* No statistical innovation: use well-understood techniques.
  * Leave creation of novel techniques to statistics experts.
* Your statistical intuition is terrible.
  * Don't make guesses by eyeballing data.

MISCELLANEOUS

* Run a pilot study! Then run more pilots.
  * You are certain to learn a lot about your tool/treatment and about your experimental design and methodology.
* Think about threats to validity.
  A few examples are:
  * subject bias
  * survivor bias
  * researcher bias (you are prone to it no matter how hard you try to avoid it; and of course you must try hard to avoid it nonetheless)
* Ethics: don't harm people or society.

MORE ADVICE TO OTHERS CONSIDERING A USER STUDY

* Experimentation is hard, time-consuming work.
* Start early.
* Get feedback from pilot studies, and from experienced people even before you perform pilot studies.
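
To illustrate the STATISTICS koan "statistical significance != practical significance", here is a minimal sketch. Everything in it is an assumption made for illustration: the task times, the shared standard deviation, the group size, and the helper `two_sample_z` are hypothetical and do not come from the course. It uses a large-sample two-sided z-test, which is only a reasonable approximation when groups are large.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean_a, mean_b, sd, n):
    """Two-sided z-test for two equal-size groups that share one standard deviation.

    Returns (z, p_value, cohens_d). Hypothetical helper for illustration only.
    """
    standard_error = sd * sqrt(2.0 / n)           # SE of the difference in means
    z = (mean_a - mean_b) / standard_error
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    cohens_d = (mean_a - mean_b) / sd             # standardized effect size
    return z, p_value, cohens_d


# Made-up numbers: mean task time in seconds with the control tool (600 s)
# versus the new tool (598 s), standard deviation 60 s, 20000 subjects per group.
z, p, d = two_sample_z(600.0, 598.0, 60.0, 20000)
relative_speedup = (600.0 - 598.0) / 600.0

print(f"z = {z:.2f}, p = {p:.5f}")
print(f"Cohen's d = {d:.3f}, relative speedup = {relative_speedup:.2%}")
```

With these made-up inputs the p-value falls well below 0.05, yet the effect is about a third of one percent of the task time. Whether such a difference matters is a question about practice, and the test alone cannot answer it.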