This seminar provides exposure to a range of topics in AI research. Every week this quarter, CSE professors and outside speakers from industry and academia will present their research projects. 590a is an exciting opportunity to learn about AI research happening at UW and at other leading AI institutes.
Talk announcements will be made on both the cse590a and uw-ai mailing lists. Videos of the talks will also be made available online here.
|4/6||Barbara J. Grosz (Harvard)||
From the Turing Test to Smart Partners: or, “Is Your System Smart Enough To Be A Teammate?”
For much of its history, most AI research has centered on building intelligent machines, independently of their interactions with people. As the world of computing has evolved, and systems, smart or otherwise, pervade ever more facets of life and are used by groups, not just individuals, tackling the challenge of building computer systems smart enough to work effectively with people has become increasingly important. In this talk, I will argue for considering “people-in-the-loop” as central to AI for both pragmatic and cognitive-science reasons, present some fundamental scientific questions this teamwork stance raises, and describe recent research by my group on using computational models of collaboration to support health-care coordination.
|4/13||Mike Lewis (UW)||
Wide-Coverage Semantic Parsing (video)
Natural language parsers are widely used in applications such as question answering, information extraction, and machine translation. In this talk, we build on recent parsing advances but introduce new methods to (1) recover complex long-distance semantic relationships, (2) parse with both state-of-the-art speed and accuracy, and (3) substantially improve out-of-domain performance.
|4/20||Ece Kamar (MSR)||
Directions in Hybrid Intelligence: Complementing AI Systems with Human Intelligence
Historically, a common goal for the development of AI systems has been exhibiting intelligent behaviors that humans excel at. Consequently, most AI systems are designed to replace humans by completely automating well-defined tasks. Despite advances in AI, machines still have limitations in accomplishing tasks that come naturally to humans. In this talk, I will argue that our focus should not be on designing isolated AI systems; instead, we should focus on developing hybrid systems that combine the strengths of machine and human intelligence. I will present human computation platforms as an enabling technology for the development of hybrid systems and address some scientific challenges that arise from having humans in the loop. First, I will provide an overview of our research efforts on reasoning capabilities for hybrid systems. I will show how planning techniques can be used to decide when to ask for human help. Then, I will discuss how agents learning to act in the world can benefit from the input of more experienced agents or humans. Next, I will present an overview of studies on understanding humans as helpers to AI systems and on improving the interaction between humans and AI systems. I will conclude by discussing opportunities and challenges in the development of hybrid systems.
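The idea of using planning to decide when to ask for human help can be illustrated with a simple expected-cost comparison. This is only a hypothetical sketch, not the method from the talk: the function name, confidence model, and cost values below are all illustrative assumptions.

```python
# Hypothetical sketch of a hybrid-system decision rule: consult a human
# only when the expected cost of an automated mistake exceeds the cost
# of asking. All names and numbers are illustrative.

def should_ask_human(machine_confidence: float,
                     error_cost: float,
                     human_cost: float) -> bool:
    """Ask for human help when the expected cost of an automated
    error outweighs the fixed cost of a human query."""
    expected_error_cost = (1.0 - machine_confidence) * error_cost
    return expected_error_cost > human_cost

# A low-confidence prediction on a high-stakes task triggers a human query.
print(should_ask_human(0.6, error_cost=10.0, human_cost=1.0))   # True
# A high-confidence prediction is handled automatically.
print(should_ask_human(0.99, error_cost=10.0, human_cost=1.0))  # False
```

In a richer system, this one-shot rule would be replaced by sequential planning over repeated queries, but the core trade-off is the same.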
|4/27||Eric Horvitz (MSR)||
The One Hundred Year Study on Artificial Intelligence: An Enduring Study on AI and its Influence on People and Society (video*)
I will present an update on the One Hundred Year Study on AI. I will describe the background and status of the project, including the roots of the effort in earlier experiences with the 2008-09 AAAI Panel on Long-Term AI Futures that culminated in the AAAI Asilomar meeting. I will reflect on several directions for investigation, highlighting opportunities for reflection and investment in proactive research, monitoring, and guidance. I look forward to comments and feedback from seminar attendees.
|5/4||Hannaneh Hajishirzi (UW)||
Learning to Read, Ground, and Reason in Multimodal Text (video)
Web data, news, and textbooks offer informative but unstructured multimodal text. The ability to translate multimodal text into a semantic representation that is amenable to further reasoning is a key step toward taming information overload, one of the fundamental problems in modern AI. A core challenge is to do robust, scalable, context-aware semantic analysis and reasoning on multimodal text. Designing systems that can understand and use multimodal text requires multiple interconnected components: semantic interpretation, multimodal alignment, knowledge acquisition, and reasoning.
In this talk, I will go over a few recent projects in our group that aim to address the problems of semantic interpretation and knowledge acquisition in several question answering tasks. Example tasks include designing automatic systems that can answer diagrammatic questions in the geometry and science domains.
|5/11||Jeff Siskind (Purdue)||
Decoding the Brain to Help Build Machines
Humans can describe observations and act upon requests. This requires that language be grounded in perception and motor control. I will present several components of my long-term research program to understand the vision-language-motor interface in the human brain and to emulate it on computers.
In the first half of the talk, I will present an fMRI investigation of the vision-language interface in the human brain. Subjects were presented with stimuli in different modalities---spoken sentences, textual presentations of sentences, and video clips depicting activity that can be described by sentences---while undergoing fMRI. The scan data is analyzed to allow readout of individual constituent concepts and words---people/names, objects/nouns, actions/verbs, and spatial-relations/prepositions---as well as phrases and entire sentences. This can be done across subjects and across modalities: we use classifiers trained on scan data from one subject to read out from another subject, and classifiers trained on scan data from one modality, say text, to read out from scans of another modality, say video or speech. This analysis indicates that the brain regions involved in processing the different kinds of constituents are largely disjoint, but also largely shared across subjects and modalities. Further, we can determine the predication relations: when the stimuli depict multiple people, objects, and actions, we can read out which people are performing which actions with which objects. This points to a compositional mental semantic representation common across subjects and modalities.
In the second half of the talk, I will use this work to motivate the development of three computational systems. First, I will present a system that can use sentential description of human interaction with previously unseen objects in video to automatically find and track those objects. This is done without any annotation of the objects and without any pretrained object detectors. Second, I will present a system that learns the meanings of nouns and prepositions from video and tracks of a mobile robot navigating through its environment paired with sentential descriptions of such activity. Such a learned language model then supports both generation of sentential description of new paths driven in new environments as well as automatic driving of paths to satisfy navigational instructions specified with new sentences in new environments. Third, I will present a system that can play a physically grounded game of checkers using vision to determine game state and robotic arms to change the game state by reading the game rules from natural-language instructions.
|5/18||Juho Kim (Stanford/KAIST)||
Learnersourcing: Improving Learning with Collective Learner Activity
Millions of learners today are watching videos on online platforms, such as Khan Academy, YouTube, Coursera, and edX, to take courses and master new skills. But existing video interfaces are not designed to support learning: they offer limited interactivity and little information about learners' engagement with the content. Making these improvements requires deep semantic information about video that even state-of-the-art AI techniques cannot fully extract. I take a data-driven approach to address this challenge, using large-scale learning interaction data to dynamically improve video content and interfaces. Specifically, my research introduces learnersourcing, a form of crowdsourcing in which learners collectively contribute novel content for future learners while engaging in a meaningful learning experience themselves. In this talk, I will present learnersourcing applications designed for massive open online course videos and how-to tutorial videos, where learners' collective activities (1) highlight points of confusion or importance in a video, (2) extract a solution structure from a tutorial, and (3) improve explanations shown to future learners.
My research demonstrates how learnersourcing can enable more interactive, collaborative, and data-driven learning.
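The idea of highlighting points of confusion from collective activity can be sketched very simply: timestamps where many learners pause or rewind tend to cluster around difficult segments. This is a minimal illustration, not the system from the talk; the event data, bin size, and threshold are invented assumptions.

```python
from collections import Counter

# Hypothetical sketch: infer "points of confusion" in a video from
# collective learner interaction logs. Bin size and threshold are
# illustrative assumptions, not values from the research.

def confusion_peaks(event_times, bin_seconds=10, min_count=3):
    """Bucket pause/rewind timestamps (in seconds) into fixed-width
    bins and return bin start times where activity clusters."""
    bins = Counter(int(t // bin_seconds) * bin_seconds for t in event_times)
    return sorted(start for start, n in bins.items() if n >= min_count)

# Many learners rewinding near t=115s and t=455s suggests two
# confusing segments; the isolated event at t=300s is ignored.
events = [118, 119, 119.5, 300, 455, 456, 459]
print(confusion_peaks(events))  # [110, 450]
```

A deployed system would aggregate far noisier logs and combine multiple signal types, but peak-finding over binned interaction counts captures the basic idea.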
|5/25||Mark Hopkins (AI2)||
Automatic question answering has made great strides over the past decade, but mostly for information retrieval questions requiring little or no reasoning. In this talk, I will describe the vision and current progress of an initiative to create a question answering system for a much more varied and reasoning-heavy QA task: the Math SAT.
|6/1||Sameer Singh (UW)||
Interactive Machine Learning for Information Extraction
Most of the world's knowledge, be it factual news, scholarly research, social communication, subjective opinions, or even fictional content, is now easily accessible as digitized text. Unfortunately, due to the unstructured nature of text, much of the useful content in these documents is hidden. The goal of “information extraction” is to address this problem: extracting meaningful, structured knowledge (such as graphs and databases) from text collections. The biggest challenges in using machine learning for information extraction are the high cost of obtaining annotated data and the lack of guidance on how to understand and fix mistakes.
In this talk, I propose interpretable representations that allow users and machine learning models to interact with each other: enabling users to inject domain knowledge into machine learning models, and enabling models to provide explanations of why a specific prediction was made. I study these techniques using relation extraction as the application, an important subtask of information extraction where the goal is to identify the types of relations between entities that are expressed in text. I first describe how symbolic domain knowledge, if provided by the user as first-order logic statements, can be injected into relational embeddings to improve the predictions. In the second part of the talk, I present an approach to “explain” machine learning predictions using symbolic representations, which the user may annotate directly for more effective supervision. I present experiments demonstrating that an interactive interface between a user and machine learning is effective in reducing annotation effort and in quickly training accurate extraction systems.
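One simple form a symbolic explanation can take is the set of features that contributed most to a prediction, which a user could then inspect or correct. The sketch below is a hypothetical illustration of that idea for a linear relation extractor; the feature strings, weights, and function name are all invented for illustration, not the representation from the talk.

```python
# Hypothetical sketch: "explain" a relation-extraction prediction by
# exposing the highest-weight lexical features of a linear model.
# Feature names and weights are invented for illustration.

def explain(features, weights, top_k=2):
    """Return the active features that contributed most to a
    prediction, as a symbolic explanation a user could inspect."""
    contributions = {f: weights.get(f, 0.0) for f in features}
    return sorted(contributions, key=contributions.get, reverse=True)[:top_k]

weights = {"X was born in Y": 2.1, "X lives in Y": 0.4, "X visited Y": -0.8}
sentence_features = ["X was born in Y", "X visited Y"]
print(explain(sentence_features, weights))
# ['X was born in Y', 'X visited Y']
```

If the user marks a highly weighted feature as spurious, that feedback can be fed back as supervision, which is the interactive loop the abstract describes.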