574 Project

Saleema Amershi & Kayur Patel

 

Proposal

 

Moderators in collaborative content creation systems, such as Wikipedia, are constantly fighting a battle to organize a growing, changing body of information. Information boxes are one example of how information is organized within Wikipedia, but creating them manually is time consuming and error prone. Furthermore, information boxes are often inconsistent across similar pages. Fully automated systems that use machine learning have the potential to reduce human effort; however, these systems are also prone to errors. By having humans validate and correct the results generated by these automated systems, we can reduce human effort and improve the system by providing it with additional validated data.

 

The goal of this project is to investigate different ways of interacting with users to validate results from KYLIN, an automated system for generating information boxes for Wikipedia articles. We are looking to answer questions such as:

 

- How do you present classification results to a user?

- How do you allow users to correct errors?

- How do you allow users to augment classification results (e.g., add additional data to information boxes)?

- How do you leverage user feedback to improve the accuracy of the system?
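One way to approach the last question is to fold user-validated extractions back into the system's training data. The sketch below is only illustrative; `record_feedback` and the tuple format are hypothetical stand-ins, and KYLIN's actual training pipeline may look quite different.

```python
# Hypothetical feedback loop: validated extractions become new training
# examples for the next retraining pass. The data format is an assumption,
# not KYLIN's real format.

# Each example pairs a source sentence with an (attribute, value) label.
training_data = [("He was born in 1970.", ("birth_year", "1970"))]

def record_feedback(sentence, attribute, value, user_confirmed):
    """Append an extraction to the training set only if the user confirmed it."""
    if user_confirmed:
        training_data.append((sentence, (attribute, value)))

# A confirmed extraction grows the training set; a rejected one does not.
record_feedback("She was born in 1982.", "birth_year", "1982", True)
record_feedback("Population figures are unknown.", "population", "???", False)
```

Rejected extractions could also be kept as negative examples, which some learners can exploit; the sketch omits that for simplicity.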

 

One potential interface for correcting information boxes can be seen in Dan's NSF proposal. In this interface, the user is presented with a visual mapping between each extracted attribute in the information box and the text from which it was extracted, and can validate the extraction simply by clicking a checkbox. An alternative interface might present all candidate sentences from which an attribute could have been extracted; if the algorithm's first choice is incorrect, users may be able to find the correct value in one of the other sentences.
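Both interface styles above can be backed by the same underlying record: an extracted value, the sentence it came from, and a ranked list of alternative candidate sentences. The following is a minimal sketch of that data model under our own naming assumptions (the `Extraction` class, `confirm`, and `correct` are hypothetical, not part of KYLIN).

```python
# Hypothetical data model for one extracted info-box attribute, supporting
# both checkbox validation and selecting an alternative candidate sentence.
from dataclasses import dataclass, field

@dataclass
class Extraction:
    attribute: str                 # e.g., "spouse"
    value: str                     # value the extractor chose
    source_sentence: str           # sentence the value was extracted from
    candidates: list = field(default_factory=list)  # alternative sentences, ranked
    validated: bool = False        # set once the user has checked the result

    def confirm(self):
        """Checkbox interface: the user marks the extraction as correct."""
        self.validated = True

    def correct(self, candidate_index, new_value):
        """Candidate-list interface: the user picks a different sentence
        and supplies the correct value."""
        self.source_sentence = self.candidates[candidate_index]
        self.value = new_value
        self.validated = True

# The extractor's first choice is wrong; the user finds the right value
# in the top-ranked alternative sentence.
ex = Extraction(
    attribute="spouse",
    value="Alice",
    source_sentence="His sister Alice lived nearby.",
    candidates=["He married Carol in 1990.", "They moved to Seattle."],
)
ex.correct(0, "Carol")
```

Corrections captured this way double as the validated training data discussed above.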

 

 

Milestones

 

-       Feb 13th

o   Set up KYLIN: Fei is going to help us understand the KYLIN code and set up a version of KYLIN on our own machine so that we can call it as part of our system.

o   IRB Forms: Ideally we would like to run a study comparing different interfaces for publication in UIST. To do this we need to specify our target population and submit IRB forms.

o   Literature Review: Read related papers on mixed-initiative interfaces, information extraction, and interaction techniques for error-correction.

o   Prototype Interfaces: Brainstorm and sketch out potential interfaces for correcting data.

-        Feb 27th

o   Implement Interfaces: Pick 3-4 interfaces from the brainstorming session and implement them.

o   Recruit Subjects: Try to find Wikipedians and people familiar with Wikipedia.

o   Generate Study Task.

-        March 12th

o   Run Studies: Compare interfaces using metrics such as task completion time, accuracy of information boxes, and usability.

o   Analyze our results.

o   Create presentation and demo.

-        March 21st

o   Write paper.