Page 59 - Data Science Algorithms in a Week
Naive Bayes
2. Here, we can assume that the symptoms and a positive test result are conditionally
independent events given that a patient suffers from virus V. The variables we
have are the following:
P(virus) = 0.04
test_accuracy = 0.98
symptoms_accuracy = 0.85
Since we have two conditionally independent random variables, we will use an
extended Bayes' theorem:
a) Let R = P(test_positive|virus) * P(symptoms|virus) * P(virus)
= test_accuracy * symptoms_accuracy * P(virus)
= 0.98 * 0.85 * 0.04 = 0.03332
and ~R = P(test_positive|~virus) * P(symptoms|~virus) * P(~virus)
= (1 - test_accuracy) * (1 - symptoms_accuracy) * (1 - P(virus))
= (1 - 0.98) * (1 - 0.85) * (1 - 0.04) = 0.00288
Then P(virus|test_positive, symptoms) = R / (R + ~R)
= 0.03332 / (0.03332 + 0.00288) = 0.92044198895 ≈ 92%.
So, a patient who shows the symptoms of virus V and has a positive test result for
virus V suffers from the virus with a probability of approximately 92%.
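The calculation above can be sketched in Python; the variable names are my own, not the book's:

```python
# Prior and accuracies from the problem statement.
p_virus = 0.04
test_accuracy = 0.98
symptoms_accuracy = 0.85

# Numerators of the extended Bayes' theorem, assuming the test result and
# the symptoms are conditionally independent given the virus status.
r = test_accuracy * symptoms_accuracy * p_virus            # 0.03332
not_r = (1 - test_accuracy) * (1 - symptoms_accuracy) * (1 - p_virus)  # 0.00288

posterior = r / (r + not_r)
print(round(posterior, 4))  # 0.9204
```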
Note that in the previous question, we learnt that a patient suffers from
the disease with a probability of only about 67% given that the result of
the test was positive. But after adding another independent random
variable, the confidence increased to 92%, even though the symptom
assessment was only 85% reliable. This implies that it is usually a very
good idea to include as many conditionally independent random variables as
possible when calculating the posterior probability, in order to achieve
higher accuracy and confidence.
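This effect of each extra independent variable can be illustrated with a small helper (a sketch of my own, not the book's code) that accepts any number of observations, each given as a pair of likelihoods under "virus" and "no virus":

```python
from math import prod

def posterior(prior, likelihood_pairs):
    """Naive Bayes posterior for a binary hypothesis.

    likelihood_pairs: list of (P(obs|virus), P(obs|~virus)) tuples for
    observations assumed conditionally independent given the virus status.
    """
    r = prior * prod(p for p, _ in likelihood_pairs)
    not_r = (1 - prior) * prod(q for _, q in likelihood_pairs)
    return r / (r + not_r)

# The positive test alone gives roughly 67%:
p_test_only = posterior(0.04, [(0.98, 0.02)])
# Adding the symptoms raises the confidence to roughly 92%:
p_test_and_symptoms = posterior(0.04, [(0.98, 0.02), (0.85, 0.15)])
print(round(p_test_only, 2), round(p_test_and_symptoms, 2))  # 0.67 0.92
```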