[Having a teaser for how Randoop works, before students read the paper, worked well.]

===========================================================================

Suppose that we give up entirely on exhaustive testing, and use random
testing instead.
We saw pros and cons of this last time.
Example: comparison to exhaustive testing over a small region of the code.
Which one wins depends on which of our intuitions about how programs fail
best fits the current case.

Last time, we saw what a test case is for an OO program.

How Randoop works.

(Purposes of this part of lecture:
 * to see a way of automating a problem.  Programmers hate to do things by hand!
 * to enable you to better complete HW3 (which you should have already started)
)

[Start out with this on the board.]

Randoop's main data structure is a pool of values.
Each one is accompanied by a code snippet:  a sequence of method calls
that evaluates to the value.

0. Initialize pool with -1, 0, 1, 2, 10, null, "hello", constants in the program, etc.
Loop:
  1. Choose method
  2. Choose arguments from pool [of appropriate types]
  3. Make a new test: concatenate code snippets for the arguments,
     plus a new method call at the end.
  4. Classify the test, heuristically
     * normal execution: put it in the pool (other tests will be built from it)
       and add assertions
     * failure: output as a failing test
     * invalid: discard it

What is poor about this?  Essentially every step!
 * making the best choices for method?
 * making the best choices for arguments?
 * heuristic classification

Enhancements to the basic algorithm:
[Need a concrete, worked example for each to put on the board.]

Void methods return no value.  Are they worth calling?
Solution: when inserting in the pool, the outputs of a sequence include
all the arguments, not just the return value.
This is needed to capture side effects.

Now, a test may contain useless calls that don't actually modify the values.
Solution: indicate pure (observer) methods.
 * avoid useless calls
 * improve assertion strength

[There wasn't time for these, probably because too much of the beginning was spent on review.]

Contrast with QuickCheck, which requires the user to describe the possible
input values, and sometimes to write generators for the data structures too.

Contrast with EvoSuite, which has a fitness function.
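
[Possible board/handout example.]
The basic loop (steps 0 to 4), plus the "outputs include all the arguments"
enhancement, can be sketched as a toy model.  This is not Randoop's code.
Everything named here is an assumption made for illustration: the Stack class
under test, the OPS table, the MAX_LEN cap, and the classification rule
(ValueError/TypeError = invalid, any other exception = failure).

```python
import random

class Stack:
    """Hypothetical class under test (not from the lecture)."""
    def __init__(self):
        self.items = []
    def push(self, x):
        if x is None:
            raise ValueError("null argument")  # treated as an illegal input
        self.items.append(x)
    def pop(self):
        return self.items.pop()  # raises IndexError on an empty stack
    def size(self):
        return len(self.items)

# Operation name -> (parameter types, callable).
OPS = {
    "Stack": ((), Stack),
    "push": ((Stack, object), Stack.push),
    "pop": ((Stack,), Stack.pop),
    "size": ((Stack,), Stack.size),
}
MAX_LEN = 50  # assumed cap on sequence length, to keep the toy fast

# A statement is ("lit", value) or (op_name, indices_of_earlier_statements).
# A sequence (the "code snippet") is a tuple of statements.

def run(seq):
    """Execute a sequence from scratch; return the value of every statement."""
    vals = []
    for stmt in seq:
        if stmt[0] == "lit":
            vals.append(stmt[1])
        else:
            name, arg_idx = stmt
            vals.append(OPS[name][1](*(vals[i] for i in arg_idx)))
    return vals

def classify(seq):
    """Step 4, using the assumed heuristic described above."""
    try:
        return "normal", run(seq)
    except (ValueError, TypeError):
        return "invalid", None
    except Exception:
        return "failure", None

def shift(stmt, offset):
    """Renumber a statement's argument indices when its snippet is appended."""
    if stmt[0] == "lit":
        return stmt
    return (stmt[0], tuple(j + offset for j in stmt[1]))

def extend(pool, rng):
    """Steps 1-3: choose a method, choose arguments, concatenate snippets."""
    name = rng.choice(sorted(OPS))
    seq, arg_idx = (), []
    for t in OPS[name][0]:
        candidates = [e for e in pool if isinstance(e[2], t)]
        if not candidates:
            return None  # no value of an appropriate type yet
        snippet, i, _ = rng.choice(candidates)
        offset = len(seq)
        seq += tuple(shift(st, offset) for st in snippet)
        arg_idx.append(offset + i)
    return seq + ((name, tuple(arg_idx)),)

def generate(steps, seed=0):
    """Run the loop; pool entries are (sequence, statement index, value)."""
    rng = random.Random(seed)
    pool = [((("lit", v),), 0, v) for v in (-1, 0, 1, 2, 10, None, "hello")]
    failing = []
    for _ in range(steps):
        seq = extend(pool, rng)
        if seq is None or len(seq) > MAX_LEN:
            continue
        kind, vals = classify(seq)
        if kind == "failure":
            failing.append(seq)
        elif kind == "normal":
            last = len(seq) - 1
            # Enhancement: the arguments are outputs too, so side effects
            # of void methods such as push are not lost.
            for i in {last, *seq[last][1]}:
                if vals[i] is not None:
                    pool.append((seq, i, vals[i]))
    return pool, failing

def render(seq):
    """Show a sequence as text (display only, not executable Python)."""
    lines = []
    for n, st in enumerate(seq):
        if st[0] == "lit":
            rhs = repr(st[1])
        else:
            rhs = f"{st[0]}({', '.join(f'v{i}' for i in st[1])})"
        lines.append(f"v{n} = {rhs}")
    return "\n".join(lines)
```

To try it: call generate(300), then print render(seq) for each sequence in the
returned failing list.  In this toy the only failures possible are pop calls on
an empty stack.  The sketch makes no attempt at the other enhancements (purity
annotations, assertion generation, discarding redundant sequences).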