Default Project 1: Build an event extractor that takes news stories as input and outputs a set of triples of the form SUBJECT EVENT-TYPE OBJECT, where the event types are drawn from a provided list of events. A baseline system can be built from off-the-shelf components (open information extraction, classification), and there are many ways to extend and improve it. More details here.
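To make the expected output concrete, here is a minimal sketch of the triple format with a toy keyword-based extractor. Everything in it is an assumption for illustration: the `EventTriple` class, the `extract_triples` function, and the two event types (`ACQUIRE`, `HIRE`) are hypothetical, and the real event types come from the provided list. An actual baseline would replace the keyword lookup with open information extraction plus a classifier.

```python
from dataclasses import dataclass
from typing import Set

# Hypothetical event types and trigger words; the real list is the one
# provided with the project.
EVENT_KEYWORDS = {
    "ACQUIRE": ("acquired", "bought"),
    "HIRE": ("hired", "appointed"),
}


@dataclass(frozen=True)
class EventTriple:
    subject: str
    event_type: str
    obj: str


def extract_triples(sentence: str) -> Set[EventTriple]:
    """Toy extractor: split the sentence around the first trigger word per event type."""
    triples = set()
    words = sentence.rstrip(".").split()
    for event_type, keywords in EVENT_KEYWORDS.items():
        for i, word in enumerate(words):
            # Require at least one word on each side of the trigger.
            if word.lower() in keywords and 0 < i < len(words) - 1:
                subject = " ".join(words[:i])
                obj = " ".join(words[i + 1:])
                triples.add(EventTriple(subject, event_type, obj))
                break
    return triples


if __name__ == "__main__":
    print(extract_triples("Google acquired YouTube."))
```

A sketch like this breaks on anything beyond simple subject-verb-object sentences, which is where the off-the-shelf components mentioned above would come in.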
Default Project 2: Build a citation extractor to extract semantic information about scholarly papers by leveraging textual descriptions that accompany citations. More details here.
Other ideas:
Eight projects (complete with plan, final presentation, and report) from when Dan taught 454 two years ago; scroll down to near the bottom. (These are information-extraction focused.)
Jeff Bigham's course at CMU has a bunch of project ideas that are crowdsourcing focused.
Chris Callison-Burch's course at UPenn also has a bunch of project ideas that are crowdsourcing focused.
Fact-Checking App: Create an app that can do crowdsourced fact-checking. Such an app might allow the user to input statements, or could also be integrated into other applications so that a user can quickly highlight statements and request they be fact-checked. To do the actual fact-checking, devise a crowdsourcing workflow that can facilitate the fact-checking of a statement by a crowd of non-experts.
Bathroom Locator: Create an app that allows users to input information about bathrooms on campus. Such information could be static, like the total number of stalls, or could be real-time, like whether it's currently flooded with liquid of unknown origin, or whether there is any soap or paper towels left. Draw interesting conclusions from the data.
Extracting the size of objects: See the slides for 1/13 for how to do this from webtables and why it's important. Xiao Ling has offered to mentor.
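For the Fact-Checking App idea above, one simple starting point for the crowdsourcing workflow is to collect several independent verdicts per statement and keep the majority label only when enough workers agree. The sketch below assumes this majority-vote design. The function name `aggregate_verdicts`, the label strings, and the thresholds are all hypothetical choices, not a prescribed workflow.

```python
from collections import Counter
from typing import Sequence


def aggregate_verdicts(
    verdicts: Sequence[str],
    min_votes: int = 3,
    min_agreement: float = 0.6,
) -> str:
    """Return the majority label, or "UNRESOLVED" if votes are too few or too split."""
    if len(verdicts) < min_votes:
        return "UNRESOLVED"
    # most_common(1) gives the label with the highest count and that count.
    label, count = Counter(verdicts).most_common(1)[0]
    if count / len(verdicts) >= min_agreement:
        return label
    return "UNRESOLVED"


if __name__ == "__main__":
    print(aggregate_verdicts(["true", "true", "false"]))
    print(aggregate_verdicts(["true", "false"]))
```

Statements that come back unresolved could be routed to more workers or to a second stage where workers supply evidence; designing those stages is the interesting part of the project.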