Page 59 - Data Science Algorithms in a Week
Naive Bayes
2. Here, we can assume that the symptoms and a positive test result are conditionally
independent events given that a patient suffers from virus V. The variables we
have are the following:
P(virus) = 0.04
test_accuracy = 0.98
symptoms_accuracy = 0.85
Since we have two conditionally independent random variables, we will use an
extended Bayes' theorem:
a) Let R = P(test_positive|virus) * P(symptoms|virus) * P(virus)
= test_accuracy * symptoms_accuracy * P(virus)
= 0.98 * 0.85 * 0.04 = 0.03332
and ~R = P(test_positive|~virus) * P(symptoms|~virus) * P(~virus)
= (1 - test_accuracy) * (1 - symptoms_accuracy) * (1 - P(virus))
= (1 - 0.98) * (1 - 0.85) * (1 - 0.04) = 0.00288
Then P(virus|test_positive, symptoms) = R / (R + ~R)
= 0.03332 / (0.03332 + 0.00288) = 0.92044198895 ≈ 92%.
So, a patient who shows the symptoms of virus V and has a positive test result for
virus V suffers from the virus with a probability of approximately 92%.
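The calculation above can be sketched in Python; the variable names are my own, not the book's:

```python
# Prior and accuracies from the problem statement.
p_virus = 0.04
test_accuracy = 0.98
symptoms_accuracy = 0.85

# Numerators of the extended Bayes' theorem, assuming the test result and
# the symptoms are conditionally independent given the virus status.
r = test_accuracy * symptoms_accuracy * p_virus            # 0.03332
not_r = (1 - test_accuracy) * (1 - symptoms_accuracy) * (1 - p_virus)  # 0.00288

posterior = r / (r + not_r)
print(round(posterior, 4))  # 0.9204
```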
Note that in the previous question, we learnt that a patient suffers from
the disease with a probability of only about 67% given that the result of
the test was positive. But after adding another independent random
variable, the confidence increased to 92%, even though the symptom
assessment was only 85% reliable. This implies that it is usually a very
good idea to include as many conditionally independent random variables as
possible when calculating the posterior probability, in order to achieve
higher accuracy and confidence.
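This effect of each extra independent variable can be illustrated with a small helper (a sketch of my own, not the book's code) that accepts any number of observations, each given as a pair of likelihoods under "virus" and "no virus":

```python
from math import prod

def posterior(prior, likelihood_pairs):
    """Naive Bayes posterior for a binary hypothesis.

    likelihood_pairs: list of (P(obs|virus), P(obs|~virus)) tuples for
    observations assumed conditionally independent given the virus status.
    """
    r = prior * prod(p for p, _ in likelihood_pairs)
    not_r = (1 - prior) * prod(q for _, q in likelihood_pairs)
    return r / (r + not_r)

# The positive test alone gives roughly 67%:
p_test_only = posterior(0.04, [(0.98, 0.02)])
# Adding the symptoms raises the confidence to roughly 92%:
p_test_and_symptoms = posterior(0.04, [(0.98, 0.02), (0.85, 0.15)])
print(round(p_test_only, 2), round(p_test_and_symptoms, 2))  # 0.67 0.92
```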