HWK 3 CSE473: Uncertainty (with solutions) (due Nov. 13)
1. R&N, Problem 14.3
SOLUTION:
It's good news that the disease is rare because it makes your overall chance of
having the disease extremely low even though the test was positive.
T = Test is positive
D = You have the disease
P(D) = 0.0001
P(T|D) = 0.99
P(T) = 1, since we tested positive.
P(D|T) = P(T|D)*P(D) / P(T) using Bayes' rule.
= (0.99 * 0.0001) / 1
= 0.000099
So the chances of having the disease are still less than 1%, even with a positive
test.
(Some of you came up with a different answer by using normalization):
P(~D|T) = P(T|~D)*P(~D) / P(T)
P(T) = P(T|D)*P(D) + P(T|~D)*P(~D)
and:
P(D|T) = P(T|D)*P(D) / ( P(T|D)*P(D) + P(T|~D)*P(~D) )
P(~T|~D) = 0.99
P(T|~D) = 1 - P(~T|~D) = 0.01
P(~D) = 1 - P(D) = 0.9999
P(D|T) = (0.99*0.0001) / (0.99*0.0001 + 0.01*0.9999) = 0.0098
I accepted both answers.
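As a sanity check, the normalized version can be computed directly (a minimal sketch; the variable names are mine):

```python
# Posterior P(D|T) computed with the normalized form of Bayes' rule,
# using the numbers from the problem.
p_d = 0.0001           # P(D): prior probability of the disease
p_t_given_d = 0.99     # P(T|D): probability the test is positive if diseased
p_t_given_nd = 0.01    # P(T|~D) = 1 - P(~T|~D) = 1 - 0.99

# Total probability of a positive test, then Bayes' rule.
p_t = p_t_given_d * p_d + p_t_given_nd * (1 - p_d)
p_d_given_t = p_t_given_d * p_d / p_t
print(round(p_d_given_t, 4))  # → 0.0098
```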
2. You've been given the task to construct each of the models
detailed below. In each case, would you use fuzzy logic
or probability in your solution? Why does one make more
sense than the other?
(a) An expert system that will assist a human worker in a
steel factory. The worker makes decisions in the refining
process based on temperature constraints, the state of the
factory schedule, and the amount of impurity detected in
the metal. The problem is to translate the human expert's
notions about what is "too hot" or "time consuming" into
a model that can help schedule the factory.
(b) A program that will help schedule traffic lights based on
time of day, day of the week, anticipated events, etc. You
can assume that traffic data has been collected ahead of time.
SOLUTION:
a) Fuzzy logic makes more sense since we're dealing with human notions
such as "too hot," etc. It may not be possible to break these down
into absolutes, so fuzzy logic can help. Since we're not really
dealing with uncertainty here, probability would not be a good fit.
b) This is a clear-cut application of probability. We can build a
probabilistic model based on the traffic data (maybe a belief
net...), and then use it to predict the uncertain outcomes of
future traffic scenarios. Fuzzy logic makes little sense here.
3. R&N, Problem 14.6
SOLUTION:
Show that P(A,B|C) = P(A|C)P(B|C) is equivalent to P(A|B,C) = P(A|C)
1. Start with P(A,B|C) = P(A|C)P(B|C)
2. Rewrite the left-hand side using the conditionalized version of the
generalized product rule: P(A,B|C) = P(A|B,C)P(B|C)
3. We now have P(A|B,C)P(B|C) = P(A|C)P(B|C).
4. Cancel the P(B|C) terms.
5. P(A|B,C) = P(A|C).
Show that P(A,B|C) = P(A|C)P(B|C) is equivalent to P(B|A,C) = P(B|C)
1. Start with P(A|B,C) = P(A|C)
2. Rewrite the left-hand side using the conditionalized version of Bayes'
rule: P(A|B,C) = P(B|A,C)P(A|C) / P(B|C)
3. We now have P(B|A,C)P(A|C) / P(B|C) = P(A|C).
4. Multiply by P(B|C) on both sides.
5. P(B|A,C)P(A|C) = P(B|C)P(A|C)
6. Cancel the P(A|C) terms.
7. P(B|A,C) = P(B|C).
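The first equivalence can also be sanity-checked numerically (a sketch; the CPT values below are arbitrary choices of mine): build a joint that satisfies P(A,B|C) = P(A|C)P(B|C), then confirm P(A|B,C) = P(A|C) at every value of B and C.

```python
# Numeric check: construct a joint P(A,B,C) = P(C) P(A|C) P(B|C), which is
# exactly the condition P(A,B|C) = P(A|C) P(B|C), then verify P(A|B,C) = P(A|C).
# The CPT numbers are arbitrary choices for the check.
p_c   = {True: 0.3, False: 0.7}   # P(C)
p_a_c = {True: 0.6, False: 0.2}   # P(A=true | C)
p_b_c = {True: 0.9, False: 0.4}   # P(B=true | C)

def joint(a, b, c):
    pa = p_a_c[c] if a else 1 - p_a_c[c]
    pb = p_b_c[c] if b else 1 - p_b_c[c]
    return p_c[c] * pa * pb       # P(C) P(A|C) P(B|C)

for c in (True, False):
    for b in (True, False):
        # P(A=true | B=b, C=c) = P(A, b, c) / P(b, c)
        p_bc = joint(True, b, c) + joint(False, b, c)
        lhs = joint(True, b, c) / p_bc
        rhs = p_a_c[c]            # P(A=true | C=c)
        assert abs(lhs - rhs) < 1e-12
print("P(A|B,C) = P(A|C) holds for every b, c")
```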
4. R&N, Problem 15.1 parts a,b,c,d only. For part b, give only the
table values for the "Starts," "StarterMotorWorking," and "Moves"
nodes in the belief net.
SOLUTION:
Parts a and b: There was no "right answer" for how you added the new nodes
"StarterMotorWorking" and "IcyWeather." There was also no right way
to choose the probabilities. Those were freebies.
Part c: If there's no conditional independence, then the joint distribution over
8 boolean nodes has 2^8 entries. I also accepted 2^8 - 1 (the entries must sum
to 1, so one value is redundant) because it is almost Thanksgiving.
Part d: The answer to part d depends on how you drew your net. It's the total
number of values in the belief net tables (1 value for a node with no parents,
2 values for a node with 1 parent, 4 values for a node with two parents...).
Sum them up.
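The counting rule for part d is easy to automate (a sketch; the parent counts below are one hypothetical way of drawing the net, not "the" answer):

```python
# Number of CPT entries in a belief net of boolean nodes:
# a node with k parents needs 2^k values (one per parent configuration).
# The parent counts here are a hypothetical layout, for illustration only.
parents = {
    "IcyWeather": 0, "Battery": 1, "StarterMotorWorking": 1,
    "Radio": 1, "Ignition": 1, "Gas": 0, "Starts": 3, "Moves": 1,
}
total = sum(2 ** k for k in parents.values())
print(total)  # → 20
```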
5. Consider the following belief net
A   B
 \ /
  v
  C
 / \
v   v
E   F
P(A) = 0.8
P(B) = 0.2
P(C|A,B) = 0.6     P(E|C) = 0.9
P(C|A,~B) = 0.5    P(F|C) = 0.1
P(C|~A,B) = 0.4
P(C|~A,~B) = 0.1
(a) What is P(C)?
(b) What is P(C|A)?
(c) What is P(A|C)?
(d) What is P(C|~A)?
(e) What is P(~C|A)?
(f) Is the set of probabilities given above complete? (Can any
compound probability be calculated?) If not, what probabilities are
missing?
SOLUTION:
a) P(C) = P(C|A,B)P(A)P(B) + P(C|A,~B)P(A)P(~B) + P(C|~A,B)P(~A)P(B) +
P(C|~A,~B)P(~A)P(~B)
= (0.6*0.8*0.2) + (0.5*0.8*0.8) + (0.4*0.2*0.2) + (0.1*0.2*0.8)
= 0.448
b) P(C|A) = P(C|A,B)P(B) + P(C|A,~B)P(~B)
= (0.6*0.2) + (0.5*0.8)
= 0.52
c) P(A|C) = P(C|A)P(A) / P(C)
= 0.52*0.8 / 0.448
= 0.929
d) P(C|~A) = P(C|~A,B)P(B) + P(C|~A,~B)P(~B)
= (0.4*0.2) + (0.1*0.8)
= 0.16
e) P(~C|A) = 1 - P(C|A)
= 1 - 0.52
= 0.48
f) You still need P(E|~C) and P(F|~C) in order to be able to derive them all.
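Parts a through e can be checked mechanically from the given CPTs (a minimal sketch):

```python
# Parts (a)-(e): exact computation from the CPTs of the A,B -> C net.
p_a, p_b = 0.8, 0.2
p_c_ab = {(True, True): 0.6, (True, False): 0.5,
          (False, True): 0.4, (False, False): 0.1}   # P(C=true | A, B)

def pr(x, p):
    """P(X = x) for a boolean variable with P(X=true) = p."""
    return p if x else 1 - p

# (a) P(C): sum out A and B
p_c = sum(p_c_ab[(a, b)] * pr(a, p_a) * pr(b, p_b)
          for a in (True, False) for b in (True, False))

# (b) P(C|A) and (d) P(C|~A): sum out B only
p_c_given_a  = sum(p_c_ab[(True, b)]  * pr(b, p_b) for b in (True, False))
p_c_given_na = sum(p_c_ab[(False, b)] * pr(b, p_b) for b in (True, False))

# (c) P(A|C) by Bayes' rule; (e) P(~C|A) by the complement
p_a_given_c  = p_c_given_a * p_a / p_c
p_nc_given_a = 1 - p_c_given_a

print(round(p_c, 3), round(p_c_given_a, 2), round(p_a_given_c, 3),
      round(p_c_given_na, 2), round(p_nc_given_a, 2))
# → 0.448 0.52 0.929 0.16 0.48
```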
6. Consider the following belief network
  E   B
 / \ /
v   v
R   A
    |
    v
    C
The function dsep() will return "yes" if two nodes are d-separated given the
evidence (if there is any); dsep() will return "no" otherwise.
Determine the return values of dsep() for each case:
(a) dsep( R, B ) {no information given}
(b) dsep( R, B | A )
(c) dsep( R, B | A, E )
SOLUTION:
The information that you needed to solve this problem is on page 445 of R&N.
The graph gives you the three cases.
a) Yes, by case 3. A is a collider on the path between R and B; if a node such
as A were given as evidence, they would be dependent. Since no evidence is
given, R and B are d-separated and dsep() returns yes.
b) No, by case 3: observing the collider A unblocks the path.
c) Yes, by case 2. E is an observed parent of A on the path between R and B, so
it blocks the path, and we say yes even though dsep(R, B | A) is no.
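One way to sanity-check these answers is numerically (a sketch; the CPT values below are arbitrary assumptions of mine): fill in the net with made-up probabilities, enumerate the joint, and test whether P(R,B | evidence) factors into P(R | evidence)P(B | evidence).

```python
from itertools import product

# Arbitrary CPTs for the boolean net E->R, E->A, B->A, A->C.
p_e, p_b = 0.3, 0.6
p_r_e = {True: 0.8, False: 0.1}                       # P(R=true | E)
p_a_eb = {(True, True): 0.9, (True, False): 0.5,
          (False, True): 0.7, (False, False): 0.2}    # P(A=true | E, B)
p_c_a = {True: 0.75, False: 0.25}                     # P(C=true | A)

def pr(x, p):
    return p if x else 1 - p

def joint(e, b, r, a, c):
    return (pr(e, p_e) * pr(b, p_b) * pr(r, p_r_e[e])
            * pr(a, p_a_eb[(e, b)]) * pr(c, p_c_a[a]))

def independent_rb(evidence):
    """True iff R and B are conditionally independent given `evidence`
    (a dict like {'a': True}) in this particular distribution."""
    def mass(rv=None, bv=None):
        total = 0.0
        for e, b, r, a, c in product((True, False), repeat=5):
            vals = {'e': e, 'b': b, 'r': r, 'a': a, 'c': c}
            if any(vals[k] != v for k, v in evidence.items()):
                continue
            if rv is not None and r != rv:
                continue
            if bv is not None and b != bv:
                continue
            total += joint(e, b, r, a, c)
        return total
    z = mass()
    for r, b in product((True, False), repeat=2):
        if abs(mass(r, b) / z - (mass(rv=r) / z) * (mass(bv=b) / z)) > 1e-9:
            return False
    return True

print(independent_rb({}))                       # (a) → True
print(independent_rb({'a': True}))              # (b) collider observed → False
print(independent_rb({'a': True, 'e': True}))   # (c) E blocks the path → True
```

D-separation guarantees independence in *every* distribution that fits the graph; the converse check here works because generic CPT values do not create accidental independences.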
7. For the following multiply connected belief net,
  A
 / \
v   v
B   C
 \ /
  v
  D
With the conditional probabilities:
P(A) = 0.6
P(B|A) = 0.2
P(B|~A) = 0.6
P(C|A) = 0.7
P(C|~A) = 0.1
P(D|B,C) = 0.85
P(D|~B,C) = 0.8
P(D|B,~C) = 0.9
P(D|~B,~C) = 0.1
Suppose you wanted to compute P(B|A) using stochastic simulation. Compare the
number of samples you would need to reach the same level of approximation with
rejection sampling vs. likelihood weighting. What if your task was to compute
P(D|B,C)?
SOLUTION:
I accepted all solutions that displayed an understanding of how likelihood
weighting can reach the same approximation as rejection sampling while using
fewer samples overall. As was pointed out to me, it makes little sense to
stochastically simulate the particular values in the question. Stochastic
simulation is meant for belief nets that are vastly more complicated, which
unfortunately makes them inappropriate for CSE473 homework questions. Simple
examples may be contrived, but they can still display the differences between
the two algorithms (in this case rejection sampling would still need many more
samples to compute the same approximation).
A good solution to this question:
P(B|A):
With rejection sampling, you will have to drop the samples that do not comply
with the evidence {A}. For P(B|A), you will have to reject around 40 of every
100 samples, since those end up with ~A. With likelihood weighting, you can
keep all 100 samples after weighting each one by the likelihood of the
evidence given its sampled parents.
P(D|B,C):
The gains from using likelihood weighting are much higher here. The marginals are
P(B) = P(B|A)P(A) + P(B|~A)P(~A)
= (0.2 * 0.6) + (0.6 * 0.4)
= 0.36
and P(C) = P(C|A)P(A) + P(C|~A)P(~A)
= (0.7*0.6) + (0.1*0.4)
= 0.46
but B and C are not independent (both depend on A), so we sum over A:
P(B,C) = P(B|A)P(C|A)P(A) + P(B|~A)P(C|~A)P(~A)
= (0.2*0.7*0.6) + (0.6*0.1*0.4)
= 0.108
...so for every 100 samples you'd expect to throw away about 89 of them with
rejection sampling.
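The rejection-rate argument for P(B|A) can be demonstrated directly (a minimal sketch; the exact counts vary with the random seed):

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# CPTs from the problem (D is not needed for estimating P(B|A)).
P_A = 0.6
P_B = {True: 0.2, False: 0.6}   # P(B=true | A)

def sample_ab():
    """Sample (A, B) from the prior."""
    a = random.random() < P_A
    b = random.random() < P_B[a]
    return a, b

N = 100_000

# Rejection sampling for P(B|A): throw away every sample with ~A.
kept = [b for a, b in (sample_ab() for _ in range(N)) if a]
est_rs = sum(kept) / len(kept)
print(f"rejection sampling: P(B|A) ~ {est_rs:.3f}, kept {len(kept)} of {N}")

# Likelihood weighting for P(B|A): clamp A=true and weight each sample by
# P(A=true). All N samples are kept (the weights happen to be equal here
# because the evidence node A has no parents).
num = den = 0.0
for _ in range(N):
    w = P_A                      # weight = P(evidence node's value | parents)
    b = random.random() < P_B[True]
    num += w * b
    den += w
print(f"likelihood weighting: P(B|A) ~ {num / den:.3f}, kept {N} of {N}")
```

Both estimates converge to P(B|A) = 0.2, but rejection sampling discards roughly 40% of its work.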
8. R&N, Problem 16.2
SOLUTION:
Expected monetary value of a lottery ticket:
Most of you came up with an actual value by making an assumption about the
utility of $1,000,000. They tell you not to do that, but I'm pretty sure that's
for the second part of the question (when you're asked to hypothesize about when
it's rational to buy a ticket at all). Otherwise, I don't know how you're
supposed to derive a "monetary value" for the ticket. I accepted several answers,
but the most common one was:
EMV = (1/50 * $10) + (1/2,000,000 * $1,000,000)
= $0.20 + $0.50
= $0.70
...so we're assuming utility is linear in money: U($x) = $x.
It was all right to account for the dollar lost to buy the ticket...
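Under the linear-utility assumption, the arithmetic is (a minimal sketch; the scale factor k below is a hypothetical of mine, not from the problem):

```python
# EMV of the ticket under linear utility U($x) = x.
emv = (1/50) * 10 + (1/2_000_000) * 1_000_000
print(round(emv, 2))  # → 0.7

# With a hypothetical scale factor k on the jackpot's utility, buying the
# $1 ticket becomes rational once the expected utility exceeds the cost.
k = 2.0  # assume the $1,000,000 is valued at twice its dollar worth
eu = (1/50) * 10 + (1/2_000_000) * k * 1_000_000
print(eu > 1)  # → True
```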
When is it rational to buy a lottery ticket?
Let's take a look at the expected utility again:
EU = 1/50 * U($10) + 1/2,000,000 * U($1,000,000)
The answer mostly depends on the utility of winning the $1,000,000, since the
$1 cost of the ticket is higher than its $0.70 expected monetary value. So if
we value the $1,000,000 at more than its actual dollar worth (perhaps we
borrowed money from an *unforgiving* source), it could be rational to buy the
ticket.
And there's R&N's answer for the last part. Poor people just have a weighted
utility function that assigns more value to U($1,000,000) than we might expect
otherwise...
Some of you used some of the more specific utility equations from chapter 16,
which was great. This minimal logic (above) was enough to get full credit,
though.