Assignment 6: Bayes' Rule and Markov Decision Processes
CSE 415: Introduction to Artificial Intelligence
The University of Washington, Seattle, Autumn 2017
The reading on Bayes' rule is Section 7.2.3 of Probabilistic Reasoning. The reading for the MDP part of this assignment is Chapter 3 of Sutton and Barto (see the Readings webpage).
Due Monday, November 20 via Catalyst CollectIt at 11:59 PM.
 
Problems
  1. Should Anyone Panic? (40 points)

    Lucy goes to Hall Health about a sprained ankle. While she is waiting, the nurse there chooses Lucy at random to take part in a test for HPAI (high pathogenicity avian influenza), and this test involves a blood draw. Let's assume that in this season, one out of 1000 people in Seattle is affected by HPAI. The HPAI test is 95% effective, meaning that there's only a 5% chance of a false positive. Let's assume the probability of a false negative is 0. Lucy's HPAI test result is positive. (a) What's the updated probability that she has HPAI?

    James attends a friend's marriage ceremony in Belize, and then he comes back to the U.S. Let's assume that 1 out of 80 people coming back from Belize comes back with HPAI. James takes the same test that Lucy had, and his result is also positive. (b) What's the updated probability that he has HPAI?

    Should anyone panic? Should either Lucy or James seek further assistance?

    Show your work, including whatever formulas you are using.

    Make a table showing the possible conditions, the possible outcomes, and their corresponding probabilities.
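
    As a way to check hand calculations of this kind, here is a minimal Python sketch of Bayes' rule for a binary condition and a binary test. The function name posterior and its parameter names are illustrative assumptions, and the numbers in the example are hypothetical; they are deliberately not the numbers given for Lucy or James.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(condition | positive test) using Bayes' rule.

    prior               -- P(condition)
    sensitivity         -- P(positive | condition), i.e. 1 - false negative rate
    false_positive_rate -- P(positive | no condition)
    """
    # Total probability of a positive result, summed over both conditions.
    p_positive = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / p_positive


# Hypothetical example values (not the assignment's numbers).
example = posterior(prior=0.1, sensitivity=0.9, false_positive_rate=0.2)
print(round(example, 4))
```

    The denominator is the total probability of a positive result, which is also the quantity a table of conditions and outcomes makes explicit.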

  2. "The Mecha-Mouse at the Hostel for Travelling Droids" (60 points)

    The Hostel for Travelling Droids has four rooms: Dormitory (D), Lavatory (L), Pantry (P), and Mess Hall (M). There is a mechanical mouse ("Mecha-mouse") that inhabits the hostel, typically looking for a meal. The mouse has three actions: X (exit current room), Y (alternative action), and Z (remain as is). There is some danger that the "Compu-Cat" will ambush the mouse at any time, putting it in the Ambushed state, from which it can only go to the dead-end Kaput state. The activities in this hostel are governed by a Markov Decision Process with the following dynamics.
    s, a           Dormitory  Lavatory  Pantry  Mess Hall  Ambushed  Kaput
    Dormitory, X   0          0.4       0       0.6        0         0
    Dormitory, Y   0          0.6       0       0.4        0         0
    Dormitory, Z   0.75       0         0       0          0.25      0
    Lavatory, X    0.4        0         0.6     0          0         0
    Lavatory, Y    0.6        0         0.4     0          0         0
    Lavatory, Z    0          0.75      0       0          0.25      0
    Pantry, X      0          0.6       0       0.4        0         0
    Pantry, Y      0          0.4       0       0.6        0         0
    Pantry, Z      0          0         0.75    0          0.25      0
    Mess Hall, X   0.4        0         0.6     0          0         0
    Mess Hall, Y   0.6        0         0.4     0          0         0
    Mess Hall, Z   0          0         0       0.75       0.25      0
    Ambushed, *    0          0         0       0          0         1.0
    Kaput, *       0          0         0       0          0         1.0
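
    Before doing any calculation by hand, it can help to transcribe the dynamics into a data structure and confirm that every row is a probability distribution. The following is a minimal Python sketch; the names STATES, T, and max_error are illustrative assumptions, and "*" is kept as a single key on the assumption that it stands for any action.

```python
# Next-state order used for every row below.
STATES = ["Dormitory", "Lavatory", "Pantry", "Mess Hall", "Ambushed", "Kaput"]

# T[(s, a)] lists P(s' | s, a) for each s' in the order of STATES.
T = {
    ("Dormitory", "X"): [0, 0.4, 0, 0.6, 0, 0],
    ("Dormitory", "Y"): [0, 0.6, 0, 0.4, 0, 0],
    ("Dormitory", "Z"): [0.75, 0, 0, 0, 0.25, 0],
    ("Lavatory", "X"): [0.4, 0, 0.6, 0, 0, 0],
    ("Lavatory", "Y"): [0.6, 0, 0.4, 0, 0, 0],
    ("Lavatory", "Z"): [0, 0.75, 0, 0, 0.25, 0],
    ("Pantry", "X"): [0, 0.6, 0, 0.4, 0, 0],
    ("Pantry", "Y"): [0, 0.4, 0, 0.6, 0, 0],
    ("Pantry", "Z"): [0, 0, 0.75, 0, 0.25, 0],
    ("Mess Hall", "X"): [0.4, 0, 0.6, 0, 0, 0],
    ("Mess Hall", "Y"): [0.6, 0, 0.4, 0, 0, 0],
    ("Mess Hall", "Z"): [0, 0, 0, 0.75, 0.25, 0],
    ("Ambushed", "*"): [0, 0, 0, 0, 0, 1.0],
    ("Kaput", "*"): [0, 0, 0, 0, 0, 1.0],
}

# Largest deviation of any row sum from 1; it should be (numerically) zero.
max_error = max(abs(sum(row) - 1.0) for row in T.values())
print(len(T), "rows; largest deviation from 1:", max_error)
```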

    The reward here depends only on the current state s.
    s          R(s)
    Dormitory  0
    Lavatory   4
    Pantry     10
    Mess Hall  2
    Ambushed   -50
    Kaput      0

    1. Give the number of different policies that are possible for Mecha-mouse in the hostel.
    2. Manually apply the value iteration method to this problem for six iterations. Show the value at each state in each iteration. Assume that the discount factor is 0.5.
    3. Based on your analysis, give the optimal policy as an action for each state.
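
    For reference, here is a minimal Python sketch of value iteration and greedy policy extraction on a toy two-state MDP (hypothetical, not the hostel). It assumes the convention V_{k+1}(s) = R(s) + gamma * max_a sum_{s'} T(s, a, s') V_k(s') with V_0(s) = 0; other texts place the reward differently, so check this against the reading before comparing with hand calculations. All names here (value_iteration, greedy_policy, and the toy states and actions) are illustrative assumptions.

```python
def value_iteration(states, actions, T, R, gamma, iterations):
    """Return the value tables [V_0, V_1, ..., V_iterations].

    Assumed convention (state-based reward, V_0 = 0):
    V_{k+1}(s) = R(s) + gamma * max_a sum_{s'} T[(s, a)][s'] * V_k(s')
    """
    V = {s: 0.0 for s in states}
    history = [V]
    for _ in range(iterations):
        # The comprehension reads the previous table V before rebinding it.
        V = {
            s: R[s] + gamma * max(
                sum(p * V[s2] for s2, p in T[(s, a)].items())
                for a in actions
            )
            for s in states
        }
        history.append(V)
    return history


def greedy_policy(states, actions, T, V):
    """Pick, for each state, an action maximizing expected next-state value."""
    return {
        s: max(actions,
               key=lambda a: sum(p * V[s2] for s2, p in T[(s, a)].items()))
        for s in states
    }


# Toy MDP with hypothetical, deterministic dynamics.
toy_states = ["A", "B"]
toy_actions = ["stay", "go"]
toy_T = {
    ("A", "stay"): {"A": 1.0},
    ("A", "go"): {"B": 1.0},
    ("B", "stay"): {"B": 1.0},
    ("B", "go"): {"A": 1.0},
}
toy_R = {"A": 0.0, "B": 1.0}

history = value_iteration(toy_states, toy_actions, toy_T, toy_R,
                          gamma=0.5, iterations=2)
policy = greedy_policy(toy_states, toy_actions, toy_T, history[-1])
print(history[-1], policy)
```

    Keeping every intermediate table in history mirrors the request to show the value at each state in each iteration.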

Updates and Corrections

If necessary, updates and corrections will be posted here and mentioned in class or on GoPost.