Page 60 - Data Science Algorithms in a Week


Naive Bayes

b) Here, the patient has symptoms of virus V, but the test result is
negative. Thus, we have the following:

R=P(test_negative|virus)*P(symptoms|virus)*P(virus)

=(1-test_accuracy)*symptoms_accuracy*P(virus)

=(1-0.98)*0.85*0.04=0.00068

~R=P(test_negative|~virus)*P(symptoms|~virus)*P(~virus)

=test_accuracy*(1-symptoms_accuracy)*(1-P(virus))

=0.98*(1-0.85)*(1-0.04)=0.14112

Thus P(virus|test_negative,symptoms)=R/[R+~R]
=0.00068/[0.00068+0.14112]=0.0047954866≈0.48%

Thus, a patient who tested negative but has symptoms of virus V has
a probability of about 0.48% of having the virus.
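
The calculation in part b) can be checked with a few lines of Python. This is a minimal sketch, not code from the book: the function name posterior_virus and its parameter names are assumptions chosen to mirror the quantities test_accuracy, symptoms_accuracy, and P(virus) used above.

```python
def posterior_virus(test_accuracy, symptoms_accuracy, p_virus):
    """Return P(virus | test_negative, symptoms).

    Hypothetical helper: treats the test result and the symptoms as
    conditionally independent given the virus status (the naive assumption).
    """
    # R: negative test and symptoms, with the virus present
    r = (1 - test_accuracy) * symptoms_accuracy * p_virus
    # ~R: negative test and symptoms, with the virus absent
    not_r = test_accuracy * (1 - symptoms_accuracy) * (1 - p_virus)
    # Normalize so that the two cases sum to 1
    return r / (r + not_r)


p = posterior_virus(test_accuracy=0.98, symptoms_accuracy=0.85, p_virus=0.04)
print(f"{p:.4%}")  # 0.4795%
```

Normalizing by R+~R means the shared factor P(test_negative,symptoms) never has to be computed directly.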

3. We apply the basic form of Bayes' theorem:

P(tsunami|earthquake)=P(earthquake|tsunami)*P(tsunami)/P(earthquake)

≈0.5*(4/(365*100))/(6/(365*100))

=0.5*4/6=1/3≈33%

There is a 33% chance that a tsunami will follow the recorded
earthquake.
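
The same arithmetic, written as a short Python sketch. The function name bayes and the variable names are assumptions made for illustration; the inputs (P(earthquake|tsunami) of about 0.5, 4 tsunamis and 6 earthquakes over 100 years) are the values used in the calculation above.

```python
def bayes(p_b_given_a, p_a, p_b):
    """Basic form of Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


days = 365 * 100           # number of days in the 100-year record
p_tsunami = 4 / days       # chance of a tsunami on a given day
p_earthquake = 6 / days    # chance of an earthquake on a given day

p_tsunami_given_earthquake = bayes(0.5, p_tsunami, p_earthquake)
print(f"{p_tsunami_given_earthquake:.0%}")  # 33%
```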

Note that here we set P(tsunami) to be the probability of a tsunami
happening on some particular day out of the days in the past 100
years. We used a day as a unit to calculate the probability
P(earthquake) as well. If we changed the unit to an hour, week, month,
and so on for both P(tsunami) and P(earthquake), the result would still
be the same. What matters in the calculation is the ratio
P(tsunami):P(earthquake)=4:6=2/3:1, that is, a tsunami is 2/3 as likely
to happen as an earthquake.
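
The unit independence described in the note can be checked directly. The sketch below is illustrative only: the function name, the dictionary of units, and the rough unit counts per century (for example, 52 weeks per year) are assumptions, and P(earthquake|tsunami) is taken as exactly 1/2 so that exact fractions can be used.

```python
from fractions import Fraction


def tsunami_given_earthquake(units_per_century):
    """P(tsunami|earthquake) when probabilities are measured per time unit."""
    p_tsunami = Fraction(4, units_per_century)     # 4 tsunamis in 100 years
    p_earthquake = Fraction(6, units_per_century)  # 6 earthquakes in 100 years
    return Fraction(1, 2) * p_tsunami / p_earthquake


# Approximate number of each time unit in 100 years (assumed values)
units = {
    "hours": 24 * 365 * 100,
    "days": 365 * 100,
    "weeks": 52 * 100,
    "months": 12 * 100,
}

results = {name: tsunami_given_earthquake(n) for name, n in units.items()}
for name, value in results.items():
    print(name, value)  # each line ends in 1/3
```

The number of units cancels in the division, so only the 4:6 ratio affects the result.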
