Page 397 - Understanding Machine Learning
P. 397

2
                                                         B.7 Concentration of χ Variables  379
                                      k   2

              Proof. Let us write Z =  i=1  X where X i ∼ N(0,1). To prove both bounds we
                                          i
                                                                                −λX 2
              use Chernoff’s bounding method. For the first inequality, we first bound E[e  1 ],
              where λ> 0 will be specified later. Since e −a  ≤ 1 − a +  a  2  for all a ≥ 0 we have that
                                                             2
                                                     2
                                                               4
                                  E[e −λX 2 1] ≤ 1 − λE[X ] +  λ 2  E[X ].
                                                     1         1
                                                         2
                                              2           4
              Using the well known equalities, E[X ] = 1and E[X ] = 3, and the fact that 1−a ≤
                                              1           1
              e −a  we obtain that       2                    3 2
                                                    3 2
                                                              2 .
                                  E[e −λX 1] ≤ 1 − λ + λ ≤ e −λ+ λ
                                                    2
              Now, applying Chernoff’s bounding method we get that

                               P[ − Z ≥−(1 − 
)k] = P e −λZ  ≥ e −(1−
)kλ

                                                ≤ e (1−
)kλ  E e −λZ
                                                           
    2     k
                                                = e (1−
)kλ  E e −λX  1
                                                             3 2
                                                ≤ e (1−
)kλ −λk+ λ k
                                                        e
                                                             2
                                                        3  2
                                                = e −
kλ+ kλ  .
                                                        2
              Choose λ = 
/3 we obtain the first inequality stated in the lemma.
                 For the second inequality, we use a known closed form expression for the
                                            2
              moment generating function of a χ distributed random variable:
                                            k
                                            
   2
                                        1
                                   ∀λ< , E e  λZ   = (1 − 2λ) −k/2 .             (B.7)
                                        2
              On the basis of the equation and using Chernoff’s bounding method we have

                                P[Z ≥ (1 + 
)k)] = P e λZ  ≥ e (1+
)kλ
                                                 −(1+
)kλ     λZ
                                               ≤ e      E e
                                               = e −(1+
)kλ  (1 − 2λ) −k/2
                                                        e
                                               ≤ e −(1+
)kλ kλ  = e −
kλ ,
              where the last inequality occurs because (1 − a) ≤ e −a . Setting λ = 
/6(which is in
              (0,1/2) by our assumption) we obtain the second inequality stated in the lemma.
                 Finally, the last inequality follows from the first two inequalities and the union
              bound.
   392   393   394   395   396   397   398   399   400   401   402