Page 58 - Data Science Algorithms in a Week
Naive Bayes
What is the probability that the 11th person with a height of 172cm, weight of 60kg, and
long hair is a man?
Analysis:
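1. Before the patient is given the test, the probability that he suffers from the virus is
4%, P(virus)=4%=0.04. The accuracy of the test is test_accuracy=98%=0.98, which is
taken to hold for both infected and healthy patients, so
P(test_positive|virus)=test_accuracy and P(test_positive|no_virus)=1-test_accuracy. We
apply the formula from the medical test example: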
P(test_positive)=P(test_positive|virus)*P(virus)+P(test_positive|no_virus)*P(no_virus)
= test_accuracy*P(virus)+(1-test_accuracy)*(1-P(virus))
= 2*test_accuracy*P(virus)+1-test_accuracy-P(virus)
Therefore, we have the following:
a) P(virus|test_positive)=P(test_positive|virus)*P(virus)/P(test_positive)
=test_accuracy*P(virus)/P(test_positive)
=test_accuracy*P(virus)/[2*test_accuracy*P(virus)+1-test_accuracy-P(virus)]
=0.98*0.04/[2*0.98*0.04+1-0.98-0.04]=0.67123287671~67%
Therefore, there is a probability of about 67% that a patient suffers from the
virus V if the result of the test is positive.
b) P(virus|test_negative)=P(test_negative|virus)*P(virus)/P(test_negative)
=(1-test_accuracy)*P(virus)/[1-P(test_positive)]
=(1-test_accuracy)*P(virus)/[1-2*test_accuracy*P(virus)-1+test_accuracy+P(virus)]
=(1-test_accuracy)*P(virus)/[test_accuracy+P(virus)-2*test_accuracy*P(virus)]
=(1-0.98)*0.04/[0.98+0.04-2*0.98*0.04]=0.000849617672~0.08%
If the test is negative, a patient can still suffer from the virus V with a
probability of about 0.08%.
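The two results above can be checked numerically. The following is a minimal Python sketch, not code from the book: the function name posterior_given_test and its parameter names are illustrative, and it assumes, as the derivation does, that the test accuracy applies equally to infected and healthy patients.

```python
def posterior_given_test(prior, accuracy):
    """Return (P(virus | test_positive), P(virus | test_negative)).

    Assumption (as in the derivation above): `accuracy` is both
    P(test_positive | virus) and P(test_negative | no_virus).
    """
    # Total probability of a positive test result.
    p_positive = accuracy * prior + (1 - accuracy) * (1 - prior)
    p_negative = 1 - p_positive

    # Bayes' theorem for each possible test outcome.
    p_virus_given_positive = accuracy * prior / p_positive
    p_virus_given_negative = (1 - accuracy) * prior / p_negative
    return p_virus_given_positive, p_virus_given_negative


post_pos, post_neg = posterior_given_test(prior=0.04, accuracy=0.98)
print(f"P(virus | test_positive) = {post_pos:.4f}")  # about 0.6712
print(f"P(virus | test_negative) = {post_neg:.6f}")  # about 0.000850
```

Calling the function with other priors shows how strongly the result depends on P(virus): the rarer the virus, the less a single positive test tells us.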