
Dimensionality Reduction

   Show that the solution of the problem is to set $w$ to be the first principal vector of $x_1, \ldots, x_m$.
2. Let $w_1$ be the first principal component as in the previous question. Now, suppose we would like to find a second unit vector, $w_2 \in \mathbb{R}^d$, that maximizes the variance of $\langle w_2, x \rangle$, but is also uncorrelated to $\langle w_1, x \rangle$. That is, we would like to solve
   $$\operatorname*{argmax}_{w :\ \|w\|=1,\ \mathbb{E}[\langle w_1,x\rangle \langle w,x\rangle]=0} \operatorname{Var}[\langle w,x\rangle].$$
   Show that the solution to this problem is to set $w$ to be the second principal component of $x_1, \ldots, x_m$.
   Hint: Note that
   $$\mathbb{E}[\langle w_1,x\rangle \langle w,x\rangle] \;=\; w_1^\top\, \mathbb{E}[x x^\top]\, w \;=\; \frac{1}{m}\, w_1^\top A\, w,$$
   where $A = \sum_i x_i x_i^\top$. Since $w_1$ is an eigenvector of $A$, we have that the constraint $\mathbb{E}[\langle w_1,x\rangle \langle w,x\rangle] = 0$ is equivalent to the constraint $\langle w_1, w\rangle = 0$.
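As a complement to the exercise, the following Python sketch (an illustration on assumed toy data, not part of the text) numerically checks the characterization above: the leading eigenvector of $A = \sum_i x_i x_i^\top$ maximizes the empirical variance of $\langle w, x\rangle$ over unit vectors, and the second eigenvector maximizes it over unit vectors orthogonal to $w_1$.

```python
# Illustrative sketch (hypothetical example, not part of the text): numerically
# check that the top two eigenvectors of A = sum_i x_i x_i^T maximize the
# empirical variance of <w, x>, the second one subject to <w_1, w> = 0.
import numpy as np

rng = np.random.default_rng(0)
m, d = 500, 5
# Anisotropic sample so that the principal directions are well separated.
X = rng.normal(size=(m, d)) * np.array([3.0, 2.0, 1.0, 0.5, 0.1])

A = X.T @ X                               # A = sum_i x_i x_i^T
eigvals, eigvecs = np.linalg.eigh(A)      # eigenvalues in ascending order
w1, w2 = eigvecs[:, -1], eigvecs[:, -2]   # first and second principal vectors

def emp_var(w):
    """Empirical second moment of <w, x> over the sample (as in the hint, E[xx^T] = A/m)."""
    return np.mean((X @ w) ** 2)

# No unit vector beats w1, and no unit vector orthogonal to w1 beats w2.
for _ in range(1000):
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)
    assert emp_var(u) <= emp_var(w1) + 1e-9
    v = u - (u @ w1) * w1                 # project out the w1 component
    v /= np.linalg.norm(v)
    assert emp_var(v) <= emp_var(w2) + 1e-9

print(emp_var(w1), emp_var(w2))
```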
23.5 The Relation between SVD and PCA: Use the SVD theorem (Corollary C.6) to provide an alternative proof of Theorem 23.2.
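The correspondence such a proof can build on is easy to observe numerically. The sketch below (an illustration with synthetic data, not part of the text) checks that the right singular vectors of the sample matrix agree, up to sign, with the eigenvectors of $A = X^\top X$, and that the squared singular values are the corresponding eigenvalues.

```python
# Illustrative sketch (hypothetical example, not part of the text): the right
# singular vectors of the m x d matrix X whose rows are x_1,...,x_m coincide,
# up to sign, with the eigenvectors of A = X^T X used by PCA, and the squared
# singular values are the corresponding eigenvalues.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))             # 200 sample vectors in R^6

# PCA route: eigendecomposition of A = X^T X, directions in descending order.
eigvals, eigvecs = np.linalg.eigh(X.T @ X)
pca_dirs = eigvecs[:, ::-1]

# SVD route: X = U diag(S) V^T, with singular values already in descending order.
U, S, Vt = np.linalg.svd(X, full_matrices=False)
svd_dirs = Vt.T

assert np.allclose(np.sort(S ** 2), np.sort(eigvals))
for k in range(X.shape[1]):
    # Each PCA direction matches the corresponding right singular vector up to sign.
    assert np.isclose(abs(pca_dirs[:, k] @ svd_dirs[:, k]), 1.0)
```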
23.6 Random Projections Preserve Inner Products: The Johnson-Lindenstrauss lemma tells us that a random projection preserves distances between a finite set of vectors. In this exercise you need to prove that if the set of vectors is within the unit ball, then not only are the distances between any two vectors preserved, but their inner products are preserved as well.
     Let $Q$ be a finite set of vectors in $\mathbb{R}^d$ and assume that for every $x \in Q$ we have $\|x\| \le 1$.
1. Let $\delta \in (0,1)$ and $n$ be an integer such that
   $$\epsilon \;=\; \sqrt{\frac{6\log(|Q|^2/\delta)}{n}} \;\le\; 3.$$
   Prove that with probability of at least $1-\delta$ over a choice of a random matrix $W \in \mathbb{R}^{n,d}$, where each element of $W$ is independently distributed according to $N(0,1/n)$, we have
   $$\bigl|\langle Wu, Wv\rangle - \langle u, v\rangle\bigr| \;\le\; \epsilon$$
   for every $u,v \in Q$. (A numerical sketch of this claim appears after the exercise.)
   Hint: Use JL to bound both $\frac{\|W(u+v)\|}{\|u+v\|}$ and $\frac{\|W(u-v)\|}{\|u-v\|}$.
2. (*) Let $x_1,\ldots,x_m$ be a set of vectors in $\mathbb{R}^d$ of norm at most 1, and assume that these vectors are linearly separable with margin of $\gamma$. Assume that $d \gg 1/\gamma^2$. Show that there exists a constant $c > 0$ such that if we randomly project these vectors into $\mathbb{R}^n$, for $n = c/\gamma^2$, then with probability of at least 99% it holds that the projected vectors are linearly separable with margin $\gamma/2$.
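Below is a small numerical sketch of the claim in part 1 (synthetic data and constants chosen purely for illustration, not part of the text): with $W$ drawn with i.i.d. $N(0,1/n)$ entries, the projected inner products of vectors in the unit ball stay within $\epsilon$ of the originals.

```python
# Illustrative sketch (hypothetical example, not part of the text): empirical
# check of part 1. W has i.i.d. N(0, 1/n) entries; for vectors in the unit
# ball, projected inner products stay within epsilon of the originals.
import numpy as np

rng = np.random.default_rng(2)
d, n = 1000, 400
Q = rng.normal(size=(30, d))
Q /= np.linalg.norm(Q, axis=1, keepdims=True)     # |Q| = 30 unit vectors (norm <= 1)

delta = 0.01
eps = np.sqrt(6 * np.log(len(Q) ** 2 / delta) / n)

W = rng.normal(scale=np.sqrt(1.0 / n), size=(n, d))
proj = Q @ W.T                                    # row i holds W x_i

worst = np.max(np.abs(proj @ proj.T - Q @ Q.T))   # worst |<Wu,Wv> - <u,v>| over u, v in Q
print(f"epsilon = {eps:.3f}, worst deviation = {worst:.3f}")
# The exercise asserts worst <= eps with probability at least 1 - delta over the draw of W.
```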