Page 397 - Understanding Machine Learning

P. 397

2
B.7 Concentration of χ Variables 379
k 2

Proof. Let us write Z = i=1 X where X i ∼ N(0,1). To prove both bounds we
i
−λX 2
use Chernoff’s bounding method. For the ﬁrst inequality, we ﬁrst bound E[e 1 ],
where λ> 0 will be speciﬁed later. Since e −a ≤ 1 − a + a 2 for all a ≥ 0 we have that
2
2
4
E[e −λX 2 1] ≤ 1 − λE[X ] + λ 2 E[X ].
1 1
2
2 4
Using the well known equalities, E[X ] = 1and E[X ] = 3, and the fact that 1−a ≤
1 1
e −a we obtain that 2 3 2
3 2
2 .
E[e −λX 1] ≤ 1 − λ + λ ≤ e −λ+ λ
2
Now, applying Chernoff’s bounding method we get that

P[ − Z ≥−(1 −
)k] = P e −λZ ≥ e −(1−
)kλ

≤ e (1−
)kλ E e −λZ

2 k
= e (1−
)kλ E e −λX 1
3 2
≤ e (1−
)kλ −λk+ λ k
e
2
3 2
= e −
kλ+ kλ .
2
Choose λ =
/3 we obtain the ﬁrst inequality stated in the lemma.
For the second inequality, we use a known closed form expression for the
2
moment generating function of a χ distributed random variable:
k

2
1
∀λ< , E e λZ = (1 − 2λ) −k/2 . (B.7)
2
On the basis of the equation and using Chernoff’s bounding method we have

P[Z ≥ (1 +
)k)] = P e λZ ≥ e (1+
)kλ
−(1+
)kλ λZ
≤ e E e
= e −(1+
)kλ (1 − 2λ) −k/2
e
≤ e −(1+
)kλ kλ = e −
kλ ,
where the last inequality occurs because (1 − a) ≤ e −a . Setting λ =
/6(which is in
(0,1/2) by our assumption) we obtain the second inequality stated in the lemma.
Finally, the last inequality follows from the ﬁrst two inequalities and the union
bound.

392 393 394 395 396 397 398 399 400 401 402