CSE 403 Assignment 3: Test generation

The goal of this assignment is to appreciate the difficulty of writing tests, and the strengths and weaknesses of automatic systems for doing so.

You will submit this assignment as a zip file (.zip) containing two files, README.txt and commands.sh.

A TA will run the commands.sh file, then will read the README file. The TA will not perform any tasks manually, and will not run any commands other than the commands.sh script. The commands.sh script must create any files that you mention later. It should contain comments indicating which question each part of the script addresses, such as "# 1" for the part relevant to question 1; that part of the file is probably just a git clone ... command to obtain your repository.
Answer each question below in your README file. Clearly indicate each question number above the corresponding answer. Ensure that each answer gives enough information for someone else to reproduce your results and conclusions.
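The layout described above might look like the following minimal sketch. The repository URL, the directory name, and the run helper are placeholders for illustration, not part of the assignment. The sketch prints commands by default so that it can be tried without network access; a real commands.sh would execute them directly.

```shell
#!/bin/sh
# Sketch of a commands.sh layout. REPO_URL and PROJECT_DIR are placeholders.
set -eu

REPO_URL="https://example.com/your-username/your-fork.git"  # placeholder, not a real repository
PROJECT_DIR="project"                                        # relative path, created in the current directory

# DRY_RUN=1 (the default in this sketch) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# 1
run git clone "$REPO_URL" "$PROJECT_DIR"

# 2
# Commands that generate the coverage report for the existing test suite go here.

# 3
# Commands that check out the commit with your new tests and regenerate coverage go here.

# 4
# Commands that download, install, and run Randoop go here.
```

Each "# N" comment marks the part of the script that the answer to question N in the README relies on.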

1. Re-use the fork of the open-source Java project that you used in assignment 2. If you have trouble with that project, you may choose another one that satisfies the same criteria. Give the URL for your fork.
2. Find some aspect of the project's behavior that is not tested. Briefly explain how you know it is not tested. Be specific. For example, don't just say "the code wasn't covered". Instead, give a file to examine that was created by the commands in commands.sh, say where to look in the output, and explain how to interpret the information there.
3. Manually write one or more tests that improve the quality of the test suite. These could be unit tests, end-to-end tests, or something else. Give a URL to a commit that adds the tests. Point at files (created by your commands.sh file) that contain the coverage before and after the change. Summarize the difference in coverage before and after adding your tests.
   If the coverage report gives a truncated result (say, an integer coverage percentage), then your new tests do not need to improve the truncated result (you don't need to improve the whole test suite's coverage by a full percentage point). Instead, change the coverage tool's output to include more digits of precision, or point to a report that shows that your addition really did improve coverage.
4. Run the Randoop test generation tool on all the classes of the project. This will create two test suites: one that reveals errors that are already in the program, and one that contains regression tests. Commit any needed changes to the project's build files.
   The commands.sh file should include commands that download and install Randoop before running it.

   If Randoop crashes due to a bug (this is usually indicated by "Randoop failed in an unexpected way." and a stack trace), submit a bug report and skip the two questions about the tests that it produces.

   If you are having trouble running Randoop, first make sure that you can run this command from its user manual:
   ```
   java -Xmx3000m -classpath ${RANDOOP_JAR} randoop.main.Main gentests --testclass=java.util.TreeSet --time-limit=10
   ```
5. How many error-revealing tests were generated? If there are any, choose the first one, and determine whether it really reveals an error or is a false positive. If it is a false positive, explain why, then examine the next one. If the first three are all false positives, you can stop.
6. For the first test that is not a false positive, fix the underlying bug and add a regression test to the project's test suite. (Before you try to understand the error-revealing test, minimize it, or run Randoop in the first place with the --minimize-error-test command-line option.) Show the (minimized) Randoop test and give a URL to the commit that fixes the bug and adds a test.
7. How many regression tests were generated? (The regression tests are very complicated; don't spend too much time trying to read them.) Point out some code that the original test suite covers but the Randoop test suite does not cover. Why wasn't Randoop able to cover it? Point out some code that the Randoop test suite covers but the original test suite does not. Why didn't the developers write tests to cover that code?
8. In 1-2 paragraphs, comment on the utility of the Randoop tool. What worked well? What worked poorly? When would you want to use it (if ever!)? How do you think it should be improved?
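
The Randoop question asks for a run over all the classes of the project. One way to hand Randoop every class is a file of fully-qualified class names, one per line, passed with its --classlist option. The sketch below builds such a list from a directory of compiled classes. The helper name list_classes, the throwaway demo directory, and the choice to skip nested classes (file names containing $) are assumptions made for illustration.

```shell
# Hypothetical helper: print the fully-qualified names of the top-level classes
# found under a directory of compiled .class files, one name per line.
list_classes() {
  ( cd -- "$1" && find . -type f -name '*.class' ! -name '*$*' ) \
    | sed -e 's|^\./||' -e 's|\.class$||' -e 's|/|.|g' \
    | sort
}

# Demo on a throwaway directory so the sketch runs without a real project.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/org/example/util"
touch "$demo_dir/org/example/App.class" \
      "$demo_dir/org/example/App\$Inner.class" \
      "$demo_dir/org/example/util/Strings.class"

class_list="$(list_classes "$demo_dir")"
echo "$class_list"
rm -rf "$demo_dir"
```

In a real commands.sh, the helper would be pointed at the project's compiled-classes directory (its location depends on the build system), its output redirected to a file, and that file passed to gentests with --classlist, with the project's classes on the classpath.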

At the end of the document:

List people or resources who helped you with this assignment.
How many hours did you spend on this assignment?

Upload your zip file to Canvas.

Peer Reviews

This assignment will be peer reviewed. There is a rubric on Canvas with a set of Yes/No questions. If your answer is "No" for any criterion, briefly describe what failed for you. Also comment on whether the given documentation was sufficient and easy to follow.

Peer review rubric

(1pt) I was able to clone the given Java repository.
(1pt) I executed the command to produce the coverage report and understood the untested behavior by looking at the output and the documentation provided.
(1pt) I pulled the changes from the commit that added the tests, and the coverage results (both before and after adding the tests) match those reported by the authors.
(1pt) I was able to execute the commands given to run Randoop.
(1pt) I understand the bug and the minimized Randoop test based on the given URL and the documentation (OR) I understand the reasoning for all 3 false positives.
(1pt) I was able to locate code covered by the original test suite but not by Randoop, and I understand why based on the documentation provided.
(1pt) I was able to locate code covered by Randoop but not by the original test suite, and I agree with the reason given for why the developers didn't write tests covering that code.
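
The before-and-after coverage comparison can hinge on a change smaller than one percentage point, which an integer percentage hides. As a minimal sketch, the arithmetic below recomputes the percentage with two decimal places from raw line counts. The counts are made up for illustration; real values would come from the coverage report.

```shell
# Hypothetical line counts; real values come from your coverage report.
total_lines=9000
covered_before=4512
covered_after=4530

# Print covered/total as a percentage with two decimal places.
coverage_pct() {
  awk -v c="$1" -v t="$2" 'BEGIN { printf "%.2f\n", 100 * c / t }'
}

before="$(coverage_pct "$covered_before" "$total_lines")"
after="$(coverage_pct "$covered_after" "$total_lines")"
echo "before: ${before}%  after: ${after}%"
```

With these counts, both values truncate to 50%, yet the two-decimal figures differ, which is the kind of evidence the assignment asks for.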

Common mistakes to avoid

Give the actual commands to clone your repo and generate coverage reports. Don't just give the URL of the repo.
Make sure that either your zip file includes the Randoop jar file or your scripts contain commands that download it.
Ensure that the scripts you provide don't contain any absolute paths.
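
One way to avoid hard-coded absolute paths is to derive every location from the directory that contains the script, resolved when the script runs. The following is a minimal sketch; the randoop and coverage directory names and the jar file name are hypothetical.

```shell
# Directory containing this script, resolved at run time instead of hard-coded.
SCRIPT_DIR="$(cd -- "$(dirname -- "$0")" && pwd)"

# Hypothetical locations, all derived from SCRIPT_DIR.
RANDOOP_JAR="$SCRIPT_DIR/randoop/randoop-all.jar"
COVERAGE_DIR="$SCRIPT_DIR/coverage"

echo "Randoop jar expected at: $RANDOOP_JAR"
echo "Coverage reports expected under: $COVERAGE_DIR"
```

Because nothing in the script names a particular machine's directory layout, it behaves the same wherever the zip file is unpacked.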

Changelog

Wed April 11, 11:00am: Clarified point #3: You must manually write the tests, and you don't have to increase overall coverage by a whole 1%.
Fri April 13, 8:00am: Changed point #4: Put commands in a separate file commands.sh rather than in your main README.txt file.
Clarified point #4: If Randoop crashes, submit a bug report.
Clarified point #5: If Randoop generates no error-revealing tests, just report that fact.
Mon April 16, 3:00pm: Clarified that submission is a .zip file, not a .txt file.
Added text explaining that your script should clone your repo and it should not be included in your .zip file.
Mon May 7, 10:00am: Copied peer review rubric over from Canvas.