Page 52 - Data Science Algorithms in a Week

Naive Bayes


            Output:

            We input the saved CSV file into the program naive_bayes.py. We get the following
            result:

                python naive_bayes.py chess_reduced.csv
                [['Warm', 'Strong', {'Yes': 0.49999999999999994, 'No': 0.5}]]

            The first class, Yes, is true with a probability of 50%. (The tiny numerical
            deviation from 0.5 is an artifact of Python's inexact floating-point arithmetic.)
            The second class, No, also has a probability of 50%. Thus, with the data we have,
            we cannot draw a reasonable conclusion about the class of the vector (Warm,
            Strong). However, we may have noticed that this exact vector already occurs in the
            table with the class No, so our guess would be that it belongs to the class No.
            To have greater statistical confidence, though, we would need more data or more
            independent variables.
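The rounding seen in the output is easy to reproduce. The following short illustration (not taken from the book's program) shows Python's inexact binary floating-point arithmetic, and that the value printed for the class Yes is simply the representable float immediately below 0.5:

```python
import math

# 0.1 is not exactly representable in binary floating point, so
# arithmetic on such values drifts slightly from the exact decimals.
print(0.1 + 0.2)         # 0.30000000000000004, not exactly 0.3
print(0.1 + 0.2 == 0.3)  # False

# The neighbouring float just below 0.5 is exactly the value that
# naive_bayes.py printed for the class Yes (requires Python 3.9+):
print(math.nextafter(0.5, 0))  # 0.49999999999999994
```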



            Gender classification - Bayes for continuous random variables

            So far, every event we were given belonged to one of a finite number of classes;
            for example, a temperature was classified as cold, warm, or hot. But how would we
            calculate the posterior probability if we were given the temperature in degrees
            Celsius instead?

            For this example, we are given five men and five women with their heights as in the
            following table:


             Height in cm   Gender
             180            Male
             174            Male
             184            Male
             168            Male
             178            Male
             170            Female
             164            Female

