Page 183 - Data Science Algorithms in a Week
Statistics
Random variable: A function from the set of possible outcomes to a set of
values (for example, real numbers).
Expectation: The expectation of a random variable is the limit of the average
of the values it produces as the number of observed values grows.
Variance: Measures the dispersion of the population from its mean.
Mathematically, the variance of a random variable X is the expected value of the
square of the difference between the random variable and the mean μ of X, i.e.
Var(X) = E[(X - μ)^2].
Standard deviation: The standard deviation of the random variable X is the square
root of the variance of X, i.e. SD(X) = sqrt(Var(X)).
Correlation: A measure of the linear dependence between two random variables.
Mathematically, for the random variables X and Y with means μ_X and μ_Y, the
correlation is defined as
corr(X,Y) = E[(X - μ_X) * (Y - μ_Y)]/(SD(X) * SD(Y)).
Causation: A dependence relation explaining the occurrence of one phenomenon
through the occurrence of another phenomenon. Causation implies correlation, but
not vice versa!
Slope: The coefficient a in the linear equation y = a*x + b.
Intercept: The constant term b in the linear equation y = a*x + b.
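The definitions above can be checked numerically. The following is a minimal sketch in plain Python, assuming a small made-up data set that lies exactly on the line y = 2*x + 1. The function names (mean, variance, std_dev, correlation, fit_line) are illustrative and not taken from the book, and the slope and intercept are estimated with the ordinary least-squares formulas, which is an assumption since the glossary does not say how a and b are obtained.

```python
import math

def mean(values):
    # Average of a finite sample; stands in for the expectation E[X].
    return sum(values) / len(values)

def variance(values):
    # Var(X) = E[(X - mu)^2], using the population (divide by n) convention.
    mu = mean(values)
    return mean([(v - mu) ** 2 for v in values])

def std_dev(values):
    # SD(X) = sqrt(Var(X)).
    return math.sqrt(variance(values))

def covariance(xs, ys):
    # E[(X - mu_X) * (Y - mu_Y)], the numerator of the correlation.
    mu_x, mu_y = mean(xs), mean(ys)
    return mean([(x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)])

def correlation(xs, ys):
    # corr(X,Y) = E[(X - mu_X) * (Y - mu_Y)] / (SD(X) * SD(Y)).
    return covariance(xs, ys) / (std_dev(xs) * std_dev(ys))

def fit_line(xs, ys):
    # Least-squares slope a and intercept b for y = a*x + b (assumed method).
    a = covariance(xs, ys) / variance(xs)
    b = mean(ys) - a * mean(xs)
    return a, b

# Made-up sample data lying exactly on y = 2*x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(xs, ys)
print("Var(X) =", variance(xs))
print("SD(X) =", std_dev(xs))
print("corr(X,Y) =", correlation(xs, ys))
print("slope =", slope, "intercept =", intercept)
```

Because the sample points are perfectly linear, the correlation comes out as 1 up to floating-point rounding, and the fitted slope and intercept recover 2 and 1.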
Bayesian Inference
Let P(A) and P(B) be the probabilities of A and B respectively, with P(B) > 0. Let P(A|B)
be the conditional probability of A given B and P(B|A) be the conditional probability of B
given A. Then Bayes' theorem states:
P(A|B)=(P(B|A) * P(A))/P(B).
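As a worked illustration of the theorem, the sketch below computes P(A|B) for a hypothetical diagnostic test. All numbers are invented for the example: A is "has the condition" with P(A) = 0.01, B is "test is positive" with P(B|A) = 0.9 and P(B|not A) = 0.05. The function name bayes_posterior is illustrative, and P(B) is obtained from the law of total probability, which the text above does not state explicitly.

```python
def bayes_posterior(prior_a, prob_b_given_a, prob_b_given_not_a):
    # P(B) = P(B|A) * P(A) + P(B|not A) * P(not A)  (law of total probability)
    prob_b = prob_b_given_a * prior_a + prob_b_given_not_a * (1.0 - prior_a)
    # Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
    return prob_b_given_a * prior_a / prob_b

# Hypothetical inputs, chosen only for illustration.
posterior = bayes_posterior(prior_a=0.01,
                            prob_b_given_a=0.9,
                            prob_b_given_not_a=0.05)
print("P(A|B) =", round(posterior, 4))
```

With these assumed inputs, P(B) = 0.0585 and the posterior is roughly 0.15: even after a positive result, the condition remains fairly unlikely because the prior P(A) is small.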
Distributions
A probability distribution is a function from the set of possible outcomes to the
probabilities of those outcomes.
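For a finite set of outcomes, such a function can be written down directly as a table. The sketch below, assuming a fair six-sided die as the example, stores the distribution as a Python dictionary from outcomes to probabilities and uses it to compute the expectation and variance defined in the Statistics section. The variable names are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Distribution of a fair six-sided die: each outcome maps to probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# The probabilities of all outcomes must sum to 1.
total_probability = sum(die.values())

# E[X] = sum of outcome * probability.
expectation = sum(face * p for face, p in die.items())

# Var(X) = E[(X - mu)^2], weighted by the probability of each outcome.
variance_die = sum((face - expectation) ** 2 * p for face, p in die.items())

print("total probability =", total_probability)
print("E[X] =", expectation)
print("Var(X) =", variance_die)
```

For the fair die this gives an expectation of 7/2 and a variance of 35/12.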