Page 54 - Data Science Algorithms in a Week

Naive Bayes


             Gender   Mean of height (cm)   Variance of height (cm^2)

             Male     176.8                 37.2
             Female   163.4                 30.8
            Thus we could calculate the following:

            P(height=172|male)=exp[-(172-176.8)^2/(2*37.2)]/[sqrt(2*37.2*π)]=0.04798962999

            P(height=172|female)=exp[-(172-163.4)^2/(2*30.8)]/[sqrt(2*30.8*π)]=0.02163711333

            Note that these are not probabilities, just values of the probability density function.
            However, from these values we can already observe that a person with a measured height
            of 172 cm is more likely to be male than female, because
            P(height=172|male)>P(height=172|female) and both genders are assumed to be equally
            likely beforehand, that is, P(male)=P(female)=0.5. To be more precise:

            P(male|height=172)
            =P(height=172|male)*P(male)/[P(height=172|male)*P(male)+P(height=172|female)*P(female)]
            =0.04798962999*0.5/[0.04798962999*0.5+0.02163711333*0.5]=0.68924134178~68.9%

            Therefore, the person with the measured height 172 cm is a male with a probability of
            68.9%.
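
The calculation above can be reproduced with a short script. This is only a minimal sketch: the function name gaussian_pdf and the dictionary names are illustrative choices, not taken from the book, and the equal priors of 0.5 are the same assumption used in the worked example.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Value of the normal probability density function at x."""
    return math.exp(-(x - mean) ** 2 / (2 * variance)) / math.sqrt(2 * variance * math.pi)


# Mean and variance of height (cm, cm^2) for each gender, from the table above.
stats = {"male": (176.8, 37.2), "female": (163.4, 30.8)}

# Assumed prior probabilities: both genders equally likely.
priors = {"male": 0.5, "female": 0.5}

height = 172

# Density of the measured height under each gender's distribution.
densities = {g: gaussian_pdf(height, mean, var) for g, (mean, var) in stats.items()}

# Bayes' theorem: normalize density * prior over both genders.
evidence = sum(densities[g] * priors[g] for g in stats)
posteriors = {g: densities[g] * priors[g] / evidence for g in stats}

print(densities)
print(posteriors)
```

Running it should give a posterior probability for male of roughly 0.689, in line with the 68.9% obtained by hand.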



            Summary

            Bayes' theorem states the following:

            P(A|B)=(P(B|A) * P(A))/P(B)
            Here, P(A|B) is the conditional probability of A being true given that B is true. It is used to
            update the probability that A is true given new observations about other probabilistic
            events. This theorem can be extended to a statement with multiple random variables
            B_1,...,B_n, provided that they are conditionally independent of one another given A and
            given ~A:

            P(A|B_1,...,B_n)=[P(B_1|A) * ... * P(B_n|A) * P(A)] /
            [P(B_1|A) * ... * P(B_n|A) * P(A) + P(B_1|~A) * ... * P(B_n|~A) * P(~A)]
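
The extended formula can be sketched as a small function. The name posterior and its parameters are hypothetical, chosen only for illustration, and the sketch assumes that the observations are conditionally independent given A and given ~A. The two-observation likelihoods in the second call are made-up numbers, not data from the book.

```python
from math import prod


def posterior(prior_a, likelihoods_given_a, likelihoods_given_not_a):
    """Return P(A | B_1, ..., B_n) from P(A), the P(B_i | A) and the P(B_i | ~A).

    Assumes the B_i are conditionally independent given A and given ~A.
    """
    numerator = prod(likelihoods_given_a) * prior_a
    denominator = numerator + prod(likelihoods_given_not_a) * (1 - prior_a)
    return numerator / denominator


# One observation: the height example, with the density values computed earlier.
print(posterior(0.5, [0.04798962999], [0.02163711333]))

# Two observations with hypothetical likelihoods.
print(posterior(0.5, [0.8, 0.6], [0.2, 0.3]))
```

With a single observation the function reduces to the ordinary Bayes' theorem, so the first call should reproduce the value of about 0.689 from the height example.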


