These potential projects complement the proposals that you will present in
class on Friday.  You will be able to select one of the projects presented
in class, or one of these, as your project.  Any one of these would make a
fine class project, but so would other possibilities.

===========================================================================
Concurrency: name protection vs. value protection

Multiple threads run simultaneously in the same address space.  If two
threads access the same data structure simultaneously, the data structure
can become corrupted; for example, its invariants may be violated.
Typically, programmers prevent data races (concurrent access or
modification) by following a _locking discipline_.  A locking discipline
specifies which locks must be held when a data structure is accessed or
modified.  Since only one thread can hold a lock at a time, this prevents
concurrent operations.

There is a semi-standard way to express locking disciplines in Java: the
@GuardedBy annotation.

  Object myLock;
  Object otherLock;
  @GuardedBy("myLock") List<String> words;

  // legal
  synchronized (myLock) {
    words.add("hello");
  }

  // illegal
  words.add("hello");

  // illegal
  synchronized (otherLock) {
    words.add("hello");
  }

The problem with this annotation is that it is unsound:  it is defined in
terms of protecting _variable names_, but the run-time system performs
locking on Java objects.  For example, it permits this code:

  List<String> otherList;
  synchronized (myLock) {
    otherList = words;      // OK, because myLock is currently held
  }
  otherList.add("hello");   // PROBLEM: may occur in parallel with other operations

A better definition of a locking discipline requires that, after the
assignment `otherList = words;`, myLock must be held whenever otherList is
accessed.  Conversely, myLock does not have to be held during the
assignment itself:

  List<String> otherList;
  otherList = words;          // safe, but prohibited by name-protection semantics
  synchronized (myLock) {
    otherList.add("hello");   // safe, because myLock is held
  }

An implementation of the correct value-protection semantics for Java
exists:
http://homes.cs.washington.edu/~mernst/pubs/locking-inference-checking-icse2016.pdf

The project would be to perform case studies of this semantics:
 * Case studies of current practice:
    * How often are programmers' annotations incorrect with respect to
      the name-protection semantics?
    * How often are the annotations incorrect with respect to the
      value-protection semantics?
    * These studies can reveal whether programmers understand, and
      correctly use, the current @GuardedBy semantics.
 * Case studies of the sound value-protection semantics:
    * Annotate sizable programs with the value-protection semantics.
    * This will indicate how usable the semantics is, whether
      improvements need to be made, and how to support the refactoring
      effort.

===========================================================================
Stack Overflow parsing

Stack Overflow helps programmers to perform software development tasks.
The information in Stack Overflow can also be used by automated tools.

One task is summarizing source code.  Stack Overflow answers often
provide source code and a summary of that source code.  Given this
example data, Iyer and Zettlemoyer's paper "A neural encoder-decoder
approach for summarizing source code" shows how to produce a system that
summarizes arbitrary source code.

Another task is code snippet retrieval: given an English description of a
programming task, suggest relevant code.  Again, Stack Overflow answers
can be treated as source code plus English descriptions of it, and used
to train a natural language technique.  Hearst's paper "Multi-paragraph
segmentation of expository text" takes this approach.  Wang, Lin, and
Loncaric's paper "Constructing Shell Commands from Natural Language" is
even more ambitious: given an English description of a task, it generates
a command to perform that task.

All of these tools use Stack Overflow answers in a naive way: they
associate the title of the question with the first code snippet in the
answer.  They ignore all the text in the answer, and they ignore all code
snippets except the first one.  Here are some problems with this
technique:
 * Question titles are often short or non-descriptive.
 * Text in the answer often serves an important explanatory purpose.
 * Answers often have multiple code snippets.
    * It may be necessary to concatenate two snippets in order to achieve
      a particular goal.
    * An answer may give two different ways to solve a problem, in which
      case the two snippets should *not* be merged.

The goal of this project would be to find a better way to parse Stack
Overflow answers, or at least to segment them into distinct parts, and
then to re-run previous experiments to see whether the better Stack
Overflow data leads to better results for the tools that use it.
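As a starting point, here is a minimal sketch of such a segmenter.  It
uses the jsoup HTML parser and assumes that the answer body is HTML in
which code snippets appear as <pre><code> blocks, as in the Stack
Overflow data dump; the class and method names are illustrative.

  import java.util.ArrayList;
  import java.util.List;
  import org.jsoup.Jsoup;
  import org.jsoup.nodes.Document;
  import org.jsoup.nodes.Element;

  /** Splits a Stack Overflow answer into prose and code segments. */
  public class AnswerSegmenter {

    /** One segment of an answer: either explanatory text or a code snippet. */
    public static class Segment {
      public final boolean isCode;
      public final String content;
      public Segment(boolean isCode, String content) {
        this.isCode = isCode;
        this.content = content;
      }
    }

    /** Parses the HTML body of an answer and returns its segments in order. */
    public static List<Segment> segment(String answerHtml) {
      Document doc = Jsoup.parse(answerHtml);
      List<Segment> segments = new ArrayList<>();
      for (Element child : doc.body().children()) {
        if (child.tagName().equals("pre")) {
          segments.add(new Segment(true, child.text()));   // a code snippet
        } else {
          segments.add(new Segment(false, child.text()));  // explanatory prose
        }
      }
      return segments;
    }
  }

Even this simple decomposition supports a better pairing than
title-plus-first-snippet: for example, each snippet can be paired with the
prose that immediately precedes it.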
===========================================================================
Minimizing bug fixes

Ideally, every commit in a version control repository has a single
purpose:
 * add a feature
 * fix a bug
 * refactor

However, in practice programmers often make a commit that mixes together
multiple distinct changes.  There might be multiple new features, or a
feature plus some refactorings that were needed in order to implement it,
plus fixes for bugs that the programmer noticed along the way.  A commit
that serves multiple purposes is harder for programmers and tools to
interpret.

The goal of this project is to create a tool that minimizes a patch with
respect to some particular goal.  For example, minimizing a patch with
respect to a bug fix means finding the smallest part of the patch that
fixes the bug.  The minimized patch would not include documentation
changes, variable renamings, or other refactorings.

===========================================================================
Prevent index-out-of-bounds errors

A common programmer error is passing an illegal index to an array
dereference or a list getter method:

  int i = -1;
  ... a[i] ...             // run-time error

  int j = myList.size();
  ... myList.get(j) ...    // run-time error

As with any error, it would be better to prevent the error by learning
about it at compile time, rather than to discover it at run time (or have
a user discover it via a crash!).  The Java compiler already reports
certain types of errors at compile time:

  String s = "hello";
  ... a[s] ...             // compile-time error

and the goal is to have the compiler report index-out-of-bounds errors as
well.  You know, from CSE 331, ways to prove that all array/list accesses
are within bounds.

The goal of this project is to extend Java's type system so that it is
cognizant of array bounds.  You will replace Java's type system with a
more expressive one that expresses whether each integer is within the
bounds of each array.  Extending Java's type system is easy to do, thanks
to a tool called the Checker Framework (http://CheckerFramework.org/).
Dozens of people, including undergraduates, have used it to create custom
type-checkers for Java.
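To make the idea concrete, here is a sketch of how annotated code might
look.  The @IndexFor qualifier and its meaning are illustrative, not an
existing checker; designing the actual type qualifiers and their
subtyping and refinement rules would be part of the project.

  /** Stub declaration so the sketch compiles; a real checker would
      provide this annotation.  @IndexFor("a") int i  is intended to mean
      0 <= i && i < a.length. */
  @interface IndexFor {
    String value();
  }

  public class IndexDemo {

    /** The annotation on i makes the array access provably safe. */
    static int get(int[] a, @IndexFor("a") int i) {
      return a[i];
    }

    static void client(int[] a) {
      get(a, -1);          // run-time error today; the checker would
                           // reject this call at compile time, since -1 < 0

      if (a.length > 0) {
        get(a, 0);         // OK: the test proves 0 < a.length
      }

      for (int k = 0; k < a.length; k++) {
        get(a, k);         // OK: 0 <= k and k < a.length both hold here
      }
    }
  }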
===========================================================================
Purity or side effect analysis

A "pure" procedure is one that:
 * performs no visible side effects, and
 * returns the same value when it is called twice on the same arguments.

Purity is a very important notion when analyzing a program.  For example,
suppose that a program contains the following code:

  if (this.myField != null) {
    int x = this.computeValue();
    ... this.myField.toString() ...
  }

Can the last statement, myField.toString(), throw a null pointer
exception due to this.myField being null?  It can!  The reason is that
the computeValue method might set myField to null.

The programmer should write each procedure's possible side effects in its
specification, but a programmer might fail to do this or might write an
incorrect specification.  It would be good to infer or check side effect
specifications.

There are many other uses for purity.  As an example, consider automated
test generation.  Is it desirable to generate two tests such as these?

  MyClass x = new MyClass();
  assert x.someMethod() == 5;

  MyClass x = new MyClass();
  x.otherMethod();
  assert x.someMethod() == 5;

If otherMethod() might side-effect its receiver "x", then the second test
case is a valuable addition to the test suite.  On the other hand, if
otherMethod() is known to have no side effects, then the second test case
is undesirable, and it would be better for the test generation tool to
create different test cases.

As another example related to test generation, suppose that during
execution a particular method yetAnotherMethod() returns 22.  Is it good
to write an assertion about that?

  MyClass x = new MyClass();
  int y = x.someMethod();
  assert x.yetAnotherMethod() == 22;   // is this a good assertion?
  int z = compute(x, y);

If method yetAnotherMethod has a side effect, then the value of z could
be different depending on whether assertions are enabled or not.  Only
pure methods should be called in assertions.

For an implementation, a reasonable approach would be to implement the
analysis in Salcianu and Rinard's paper "Purity and Side Effect Analysis
for Java Programs".  Their jppa tool was widely used, but it has not been
maintained and does not work with current versions of Java.  Given that
Java tools are much better now than they used to be, this
re-implementation should be relatively straightforward.  You would
evaluate the new implementation against other tools for purity analysis,
and you would plug it into downstream tools (such as test generation) to
see how much it improves their results.  You might find ways to improve
the purity analysis as well.

===========================================================================
Generating tests from documentation

Tests are important for producing quality software, but programmers don't
enjoy writing them and may forget to write them.  On the other hand,
programmers generally do write brief English descriptions of the behavior
of their procedures.  For example, they write Javadoc documentation for
Java methods; this is supported by IDEs and is considered standard
practice.

Those English descriptions are enough to enable a programmer to write
tests; can we build a tool that automatically generates tests from
English documentation?  The idea is to parse descriptions such as "throws
NullPointerException if any element of the array is null", and to
generate assertions stating the expected behavior of different calls.

For this project, we assume that the programmer already has test inputs;
the only question is whether the code's behavior is correct.  There are
interesting challenges in using natural language processing, and also
pattern matching, to recognize the sorts of descriptions programmers
write.  Another challenge is evaluating the tool: given English
documentation and the tool's output, are its assertions correct and
sufficient?  One possible way to address this would be to pay programmers
to produce goal files, via a crowdsourcing platform.  Designing such an
experiment is challenging -- for example, how do you know whether to
trust the programmers on the crowdsourcing platform?
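As a starting point, here is a minimal sketch of the pattern-matching
half of the idea.  It recognizes one common phrasing of @throws clauses
and emits a corresponding test oracle; the class name and the oracle
format are illustrative, and a real tool would need natural language
processing to handle freer phrasings.

  import java.util.Optional;
  import java.util.regex.Matcher;
  import java.util.regex.Pattern;

  /** Turns one common @throws phrasing into a checkable test oracle. */
  public class ThrowsClauseParser {

    // Matches, e.g.: "@throws NullPointerException if the argument is null"
    private static final Pattern THROWS_IF_NULL =
        Pattern.compile("@throws\\s+(\\w+)\\s+if\\s+(.+?)\\s+is\\s+null");

    /**
     * If the Javadoc comment documents an exception thrown on a null
     * value, returns an oracle consisting of the exception class and the
     * condition text.  A later stage would map the condition onto
     * concrete parameters of the method under test.
     */
    public static Optional<String> extractOracle(String javadoc) {
      Matcher m = THROWS_IF_NULL.matcher(javadoc);
      if (m.find()) {
        String exception = m.group(1);   // e.g., "NullPointerException"
        String condition = m.group(2);   // e.g., "any element of the array"
        return Optional.of("expect " + exception + " when " + condition + " is null");
      }
      return Optional.empty();
    }

    public static void main(String[] args) {
      String doc = "@throws NullPointerException if any element of the array is null";
      // Prints: expect NullPointerException when any element of the array is null
      System.out.println(extractOracle(doc).orElse("no oracle recognized"));
    }
  }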
===========================================================================
Here are 4 more lists of potential projects:
http://homes.cs.washington.edu/~mernst/uw-only/research/potential-research-projects.html
https://rawgit.com/randoop/randoop/master/doc/projectideas.html
https://raw.githubusercontent.com/codespecs/daikon/master/doc/todo.txt
https://github.com/typetools/checker-framework/blob/wiki/Ideas.md
===========================================================================