Page 55 - Data Science Algorithms in a Week
Naive Bayes
The random variables B_1, ..., B_n have to be conditionally independent given A. The
random variables can be discrete or continuous and follow some probability distribution,
for example, the normal (Gaussian) distribution.
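As a rough illustration of why conditional independence matters, the following minimal
Python sketch computes the posterior P(A | B_1, ..., B_n) for discrete variables by
multiplying the prior P(A) with the individual likelihoods P(B_i | A). The function name
naive_bayes_posterior, the dictionary layout, and all numbers in the example are
assumptions made here for illustration; they do not come from the text.

```python
from math import prod


def naive_bayes_posterior(priors, likelihoods, observed):
    """Return P(A=a | B_1=b_1, ..., B_n=b_n) for every value a of A.

    priors:      {a: P(A=a)}
    likelihoods: {a: [ {b: P(B_i=b | A=a)} for each variable B_i ]}
    observed:    [b_1, ..., b_n], the observed values of B_1, ..., B_n

    Multiplying the likelihoods is only valid if B_1, ..., B_n are
    conditionally independent given A.
    """
    unnormalized = {
        a: priors[a] * prod(likelihoods[a][i][b] for i, b in enumerate(observed))
        for a in priors
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observed values have zero probability under every value of A")
    return {a: score / total for a, score in unnormalized.items()}


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
priors = {"yes": 0.5, "no": 0.5}
likelihoods = {
    "yes": [{"x": 0.8, "y": 0.2}, {"x": 0.6, "y": 0.4}],
    "no": [{"x": 0.3, "y": 0.7}, {"x": 0.5, "y": 0.5}],
}
posterior = naive_bayes_posterior(priors, likelihoods, ["x", "y"])
print(posterior)
```

In this example the unnormalized scores are 0.5 * 0.8 * 0.4 = 0.16 for "yes" and
0.5 * 0.3 * 0.5 = 0.075 for "no", so the posterior for "yes" is 0.16 / 0.235.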
For a discrete random variable, it is best to collect enough data to have at least one
data item for each value of the variable under each condition (each value of A).
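One way to check this requirement is to list the combinations of condition and value that
never occur in the data. A combination that is missing would give an estimated
probability of zero, which in turn makes the whole product of probabilities zero. The
sketch below is a minimal example under that reading; the function name
missing_combinations and the sample data are hypothetical.

```python
from collections import Counter
from itertools import product


def missing_combinations(samples, a_values, b_values):
    """Return the (a, b) pairs that never occur in samples.

    samples:  iterable of (a, b) pairs, one per data item
    a_values: all possible values of the condition A
    b_values: all possible values of the discrete variable B
    """
    seen = Counter(samples)
    # A Counter returns 0 for keys it has never seen.
    return [(a, b) for a, b in product(a_values, b_values) if seen[(a, b)] == 0]


# Hypothetical data set: no data item combines "healthy" with "fever".
samples = [("sick", "fever"), ("sick", "no fever"), ("healthy", "no fever")]
missing = missing_combinations(
    samples, ["sick", "healthy"], ["fever", "no fever"]
)
print(missing)
```

If the returned list is not empty, more data should be collected for those combinations
before the conditional probabilities are estimated from counts.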
The more independent random variables we have, the more accurately we can determine
the posterior probability. However, with more variables there is also a greater danger
that some of them are dependent, which makes the final results imprecise. When variables
are dependent, we may eliminate some of them and keep only mutually independent
variables, or consider another algorithm for solving the data science problem.
Problems
1. A patient is tested for a virus V. The accuracy of the test is 98%. The virus
V is currently present in 4 out of 100 people in the patient's region:
a) What is the probability that the patient suffers from the virus V if they tested
positive?
b) What is the probability that the patient still suffers from the virus V if the
result of the test was negative?
2. Apart from using the test to assess whether a patient suffers from the virus V (as in
problem 1), a doctor usually also checks for other symptoms. According to a
doctor, about 85% of patients with symptoms such as fever, nausea, abdominal
discomfort, and malaise suffer from the virus V:
a) What is the probability that a patient is suffering from the virus V if they have
the symptoms mentioned above and their test result for the virus V is positive?
b) How likely is it that the patient is suffering from the virus V if they have the
symptoms mentioned above, but the result of the test is negative?
3. On a certain island, 1 in 2 tsunamis are preceded by an earthquake. There have
been 4 tsunamis and 6 earthquakes in the past 100 years. A seismological station
recorded an earthquake in the ocean near the island. What is the probability that
it will result in a tsunami?
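Problems 1 and 3 can both be approached with Bayes' theorem. The Python sketch below shows
one possible setup, not an official solution. It assumes that an accuracy of 98% in
problem 1 means the test is correct 98% of the time both for infected and for healthy
patients, and that in problem 3 the counts per 100 years can be read as the probabilities
4/100 and 6/100. The helper name posterior is hypothetical.

```python
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Bayes' theorem for a hypothesis H and a single piece of evidence E.

    P(H | E) = P(E | H) * P(H) / (P(E | H) * P(H) + P(E | not H) * P(not H))
    """
    evidence = p_evidence_given_h * prior + p_evidence_given_not_h * (1 - prior)
    return p_evidence_given_h * prior / evidence


# Problem 1. Assumption: P(positive | virus) = P(negative | no virus) = 0.98.
p_virus = 0.04
p_virus_given_positive = posterior(p_virus, 0.98, 0.02)  # 1a, about 0.67
p_virus_given_negative = posterior(p_virus, 0.02, 0.98)  # 1b, about 0.0008

# Problem 3. Assumption: counts per 100 years are used directly as probabilities.
p_tsunami = 4 / 100
p_earthquake = 6 / 100
p_earthquake_given_tsunami = 1 / 2
# P(earthquake) is given directly, so Bayes' theorem is applied in its basic form.
p_tsunami_given_earthquake = (
    p_earthquake_given_tsunami * p_tsunami / p_earthquake
)  # 1/3

print(p_virus_given_positive, p_virus_given_negative, p_tsunami_given_earthquake)
```

Under these assumptions, a positive test raises the probability of the virus from 4% to
about 67%, and an earthquake is followed by a tsunami in about one case out of three.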