Appendix B
Measure Concentration
Let $Z_1, \ldots, Z_m$ be an i.i.d. sequence of random variables and let $\mu$ be their mean. The strong law of large numbers states that as $m$ tends to infinity, the empirical average $\frac{1}{m}\sum_{i=1}^{m} Z_i$ converges to the expected value $\mu$ with probability 1. Measure concentration inequalities quantify the deviation of the empirical average from the expectation when $m$ is finite.
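To see the phenomenon numerically before quantifying it, here is a minimal sketch (assuming NumPy is available; the Bernoulli distribution, sample sizes, and repetition count are illustrative choices, not from the text) showing how the empirical average tightens around $\mu$ as $m$ grows:

```python
import numpy as np

rng = np.random.default_rng(0)
mu = 0.3  # true mean of the Bernoulli draws (illustrative choice)

# For each sample size m, measure how far the empirical average
# typically strays from mu across many independent repetitions.
for m in [10, 100, 1000, 10000]:
    samples = rng.binomial(1, mu, size=(1000, m))  # 1000 repetitions of m draws
    deviations = np.abs(samples.mean(axis=1) - mu)
    print(f"m={m:6d}  mean |average - mu| = {deviations.mean():.4f}")
```

The printed deviations shrink roughly like $1/\sqrt{m}$, which is the behavior the inequalities below make precise.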
B.1 MARKOV’S INEQUALITY
We start with an inequality called Markov's inequality. Let $Z$ be a nonnegative random variable. The expectation of $Z$ can be written as follows:
\[
\mathbb{E}[Z] \;=\; \int_{x=0}^{\infty} \mathbb{P}[Z \geq x] \, dx. \tag{B.1}
\]
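Equation (B.1) is easy to check numerically; the sketch below (an illustration, not from the text, assuming NumPy; the exponential distribution is an arbitrary nonnegative example with $\mathbb{E}[Z] = 1$) compares a Monte Carlo estimate of $\mathbb{E}[Z]$ with a numerical integral of the tail probability:

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.exponential(scale=1.0, size=50_000)  # nonnegative Z with E[Z] = 1

# Left side of (B.1): direct Monte Carlo estimate of E[Z].
lhs = z.mean()

# Right side of (B.1): trapezoidal integral of the empirical tail
# P[Z >= x], truncated at x = 20 where the tail is negligible.
xs = np.linspace(0.0, 20.0, 401)
tail = np.array([(z >= x).mean() for x in xs])
dx = xs[1] - xs[0]
rhs = ((tail[:-1] + tail[1:]) / 2).sum() * dx

print(f"E[Z] estimate: {lhs:.4f}   tail integral: {rhs:.4f}")
```

Both quantities come out close to 1, matching the identity.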
Since the integrand is nonnegative, we may restrict the integral in (B.1) to $[0,a]$; using in addition that $\mathbb{P}[Z \geq x]$ is monotonically nonincreasing in $x$, we obtain
\[
\forall a \geq 0, \quad \mathbb{E}[Z] \;\geq\; \int_{x=0}^{a} \mathbb{P}[Z \geq x] \, dx \;\geq\; \int_{x=0}^{a} \mathbb{P}[Z \geq a] \, dx \;=\; a \, \mathbb{P}[Z \geq a]. \tag{B.2}
\]
Rearranging the inequality yields Markov’s inequality:
\[
\forall a > 0, \quad \mathbb{P}[Z \geq a] \;\leq\; \frac{\mathbb{E}[Z]}{a}. \tag{B.3}
\]
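As a quick sanity check (illustrative, not from the text; assuming NumPy, with an arbitrary choice of distribution and thresholds), the sketch below compares the empirical tail probability with the Markov bound $\mathbb{E}[Z]/a$:

```python
import numpy as np

rng = np.random.default_rng(2)
z = rng.exponential(scale=1.0, size=100_000)  # nonnegative Z with E[Z] = 1

for a in [0.5, 1.0, 2.0, 4.0]:
    tail = (z >= a).mean()   # empirical P[Z >= a]
    bound = z.mean() / a     # Markov bound E[Z]/a
    print(f"a={a:.1f}  P[Z>=a]={tail:.4f}  <=  E[Z]/a={bound:.4f}")
```

The bound always holds but is loose, which is typical: Markov's inequality uses only the mean of $Z$ and no other information about its distribution.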
For random variables that take values in $[0,1]$, we can derive from Markov's inequality the following.
Lemma B.1. Let $Z$ be a random variable that takes values in $[0,1]$. Assume that $\mathbb{E}[Z] = \mu$. Then, for any $a \in (0,1)$,
\[
\mathbb{P}[Z > 1 - a] \;\geq\; \frac{\mu - (1 - a)}{a}.
\]
This also implies that for every $a \in (0,1)$,
\[
\mathbb{P}[Z > a] \;\geq\; \frac{\mu - a}{1 - a} \;\geq\; \mu - a.
\]
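A brief sketch of the derivation the text alludes to: since $Z \leq 1$, the variable $1 - Z$ is nonnegative with $\mathbb{E}[1 - Z] = 1 - \mu$, so Markov's inequality (B.3) gives
\[
\mathbb{P}[Z \leq 1 - a] \;=\; \mathbb{P}[1 - Z \geq a] \;\leq\; \frac{1 - \mu}{a},
\]
and therefore $\mathbb{P}[Z > 1 - a] \geq 1 - \frac{1 - \mu}{a} = \frac{\mu - (1 - a)}{a}$. Substituting $1 - a$ for $a$ yields $\mathbb{P}[Z > a] \geq \frac{\mu - a}{1 - a}$. The final bound $\mu - a$ follows because when $\mu \geq a$ dividing the nonnegative quantity $\mu - a$ by $1 - a \in (0,1)$ can only increase it, and when $\mu < a$ the bound $\mu - a$ is negative, so it holds trivially.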