B.7 Concentration of $\chi^2$ Variables
Proof. Let us write $Z = \sum_{i=1}^{k} X_i^2$, where $X_i \sim N(0,1)$. To prove both bounds we use Chernoff's bounding method. For the first inequality, we first bound $E[e^{-\lambda X_1^2}]$, where $\lambda > 0$ will be specified later. Since $e^{-a} \le 1 - a + \frac{a^2}{2}$ for all $a \ge 0$, we have that
\[
E\big[e^{-\lambda X_1^2}\big] \;\le\; 1 - \lambda\, E[X_1^2] + \frac{\lambda^2}{2}\, E[X_1^4].
\]
Using the well-known equalities $E[X_1^2] = 1$ and $E[X_1^4] = 3$, and the fact that $1 - a \le e^{-a}$, we obtain that
\[
E\big[e^{-\lambda X_1^2}\big] \;\le\; 1 - \lambda + \tfrac{3}{2}\lambda^2 \;\le\; e^{-\lambda + \frac{3}{2}\lambda^2}.
\]
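As a cross-check, equation (B.7) below, applied with $k = 1$ and argument $-\lambda$, gives the exact value of this moment generating function, and the exact value is consistent with the quadratic bound just derived:
\[
E\big[e^{-\lambda X_1^2}\big] \;=\; (1 + 2\lambda)^{-1/2} \;\le\; e^{-\lambda + \frac{3}{2}\lambda^2} \qquad (\lambda \ge 0).
\]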
Now, applying Chernoff’s bounding method we get that
\[
\begin{aligned}
P[-Z \ge -(1-\epsilon)k] &= P\big[e^{-\lambda Z} \ge e^{-(1-\epsilon)k\lambda}\big] \\
&\le e^{(1-\epsilon)k\lambda}\, E\big[e^{-\lambda Z}\big] \\
&= e^{(1-\epsilon)k\lambda}\, \big(E\big[e^{-\lambda X_1^2}\big]\big)^{k} \\
&\le e^{(1-\epsilon)k\lambda}\, e^{-\lambda k + \frac{3}{2}\lambda^2 k} \\
&= e^{-\epsilon k\lambda + \frac{3}{2}k\lambda^2}.
\end{aligned}
\]
Choosing $\lambda = \epsilon/3$, we obtain the first inequality stated in the lemma.
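For concreteness, substituting $\lambda = \epsilon/3$ into the exponent above gives
\[
-\epsilon k\lambda + \tfrac{3}{2}k\lambda^2
= -\frac{\epsilon^2 k}{3} + \frac{\epsilon^2 k}{6}
= -\frac{\epsilon^2 k}{6},
\]
so that $P[Z \le (1-\epsilon)k] = P[-Z \ge -(1-\epsilon)k] \le e^{-\epsilon^2 k/6}$.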
For the second inequality, we use a known closed-form expression for the moment generating function of a $\chi^2_k$ distributed random variable:
\[
\forall\, \lambda < \tfrac{1}{2}, \qquad E\big[e^{\lambda Z}\big] = (1 - 2\lambda)^{-k/2}. \tag{B.7}
\]
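For completeness, (B.7) can be verified by a direct Gaussian integral together with the independence of the $X_i$; briefly, for $\lambda < \tfrac{1}{2}$,
\[
E\big[e^{\lambda X_1^2}\big]
= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{\lambda x^2}\, e^{-x^2/2}\, dx
= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{(1-2\lambda)x^2}{2}}\, dx
= (1 - 2\lambda)^{-1/2},
\]
and therefore $E[e^{\lambda Z}] = \big(E[e^{\lambda X_1^2}]\big)^{k} = (1 - 2\lambda)^{-k/2}$.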
On the basis of this equation and using Chernoff's bounding method, we have
\[
\begin{aligned}
P[Z \ge (1+\epsilon)k] &= P\big[e^{\lambda Z} \ge e^{(1+\epsilon)k\lambda}\big] \\
&\le e^{-(1+\epsilon)k\lambda}\, E\big[e^{\lambda Z}\big] \\
&= e^{-(1+\epsilon)k\lambda}\, (1 - 2\lambda)^{-k/2} \\
&\le e^{-(1+\epsilon)k\lambda}\, e^{k\lambda} \;=\; e^{-\epsilon k\lambda},
\end{aligned}
\]
where the last inequality follows from $1 - a \le e^{-a}$. Setting $\lambda = \epsilon/6$ (which is in $(0,1/2)$ by our assumption), we obtain the second inequality stated in the lemma.
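Explicitly, $\lambda = \epsilon/6$ lies in $(0,1/2)$ exactly when $\epsilon \in (0,3)$, and substituting it into the exponent gives
\[
-\epsilon k\lambda = -\frac{\epsilon^2 k}{6}.
\]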
Finally, the last inequality follows from the first two inequalities and the union
bound.
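Spelled out, the union bound combines the two one-sided bounds derived above (each equal to $e^{-\epsilon^2 k/6}$ for the respective choices of $\lambda$):
\[
P\big[\, Z < (1-\epsilon)k \ \text{ or } \ Z > (1+\epsilon)k \,\big]
\;\le\; P[Z \le (1-\epsilon)k] + P[Z \ge (1+\epsilon)k]
\;\le\; 2\, e^{-\epsilon^2 k/6}.
\]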