Output:
We input the saved CSV file into the naive_bayes.py program and get the following result:
python naive_bayes.py chess_reduced.csv
[['Warm', 'Strong', {'Yes': 0.49999999999999994, 'No': 0.5}]]
The first class, Yes, is true with a probability of 50%. (The tiny numerical deviation from 0.5 comes from Python's inexact floating-point arithmetic on the float data type.) The second class, No, is true with the same probability of 50%. Thus, with the data we have, we cannot draw a reasonable conclusion about the class of the vector (Warm, Strong). However, you have probably noticed that this vector already occurs in the table with the resulting class No. Hence, our guess would be that this vector just happens to belong to the class No. But to have greater statistical confidence, we would need more data or more independent variables.
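To make the 50/50 result concrete, here is a minimal sketch (not the book's naive_bayes.py program) of how the posterior probabilities are formed from the prior and the conditional probabilities of each feature. The probabilities used below are illustrative assumptions, not values computed from chess_reduced.csv; they are chosen so that the two products balance out, mirroring the output above.

# A minimal sketch of the naive Bayes posterior for the vector (Warm, Strong).
# The priors and conditional probabilities are hypothetical, for illustration only.

def naive_bayes_posterior(priors, likelihoods, evidence):
    # priors: {class: P(class)}
    # likelihoods: {class: {feature_value: P(feature_value | class)}}
    # evidence: list of observed feature values
    unnormalized = {}
    for cls, prior in priors.items():
        p = prior
        for value in evidence:
            p *= likelihoods[cls][value]
        unnormalized[cls] = p
    total = sum(unnormalized.values())
    return {cls: p / total for cls, p in unnormalized.items()}

priors = {'Yes': 0.5, 'No': 0.5}
likelihoods = {
    'Yes': {'Warm': 0.5, 'Strong': 0.25},
    'No': {'Warm': 0.25, 'Strong': 0.5},
}
print(naive_bayes_posterior(priors, likelihoods, ['Warm', 'Strong']))
# {'Yes': 0.5, 'No': 0.5} (up to floating-point rounding)

Because the product of the conditional probabilities is the same for both classes, the normalized posteriors come out equal, which is exactly why no decision can be made for (Warm, Strong).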
Gender classification - Bayes for continuous random variables
So far, every event we were given belonged to one of a finite number of classes; for example, a temperature was classified as cold, warm, or hot. But how would we calculate the posterior probability if we were given the temperature in degrees Celsius instead?
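One common way to handle a continuous variable, sketched below, is to assume that within each class the variable follows a normal distribution: we estimate a mean and variance per class from the training samples and use the Gaussian density in place of the discrete conditional probability in Bayes' formula. The temperature samples in the sketch are illustrative assumptions, not data from this book.

# A minimal sketch, assuming the temperature within each class is normally
# distributed. The training temperatures below are hypothetical.
import math

def gaussian_pdf(x, mean, variance):
    # Density of the normal distribution with the given mean and variance.
    return math.exp(-(x - mean) ** 2 / (2 * variance)) / math.sqrt(2 * math.pi * variance)

def posterior(x, samples_by_class):
    # samples_by_class: {class: [observed values]}; classes get equal priors here.
    unnormalized = {}
    for cls, samples in samples_by_class.items():
        mean = sum(samples) / len(samples)
        variance = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)
        unnormalized[cls] = gaussian_pdf(x, mean, variance) / len(samples_by_class)
    total = sum(unnormalized.values())
    return {cls: p / total for cls, p in unnormalized.items()}

temperatures = {
    'Cold': [2.0, 5.0, 7.0, 4.0],      # hypothetical training temperatures in degrees Celsius
    'Warm': [18.0, 21.0, 19.0, 23.0],
}
print(posterior(20.0, temperatures))   # strongly favours 'Warm'

The same idea carries over to the gender classification example that follows: the heights within each gender are modelled by a normal distribution, and the class whose density is higher at the observed height receives the higher posterior probability.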
For this example, we are given five men and five women with their heights as in the
following table:
Height in cm Gender
180 Male
174 Male
184 Male
168 Male
178 Male
170 Female
164 Female