Page 401 - Understanding Machine Learning
P. 401

C.4 Singular Value Decomposition (SVD)  383


              Lemma C.5. Let A ∈ R m,n  with rank r. Define the following vectors:
                                        v 1 = argmax  Av
                                               n
                                            v∈R : v =1
                                        v 2 = argmax  Av
                                               n
                                            v∈R : v =1
                                              v,v 1  =0
                                        .
                                        .
                                        .
                                        v r =  argmax   Av
                                                n
                                             v∈R : v =1
                                            ∀i<r,  v,v i  =0

              Then, v 1 ,...,v r is an orthonormal set of right singular vectors of A.
              Proof. First note that since the rank of A is r, the range of A is a subspace of
              dimension r, and therefore it is easy to verify that for all i = 1,...,r,  Av i   > 0.
              Let W ∈ R n,n  be an orthonormal matrix obtained by the eigenvalue decompo-
              sition of A A, namely, A A = WDW ,with D being a diagonal matrix with




              D 1,1 ≥ D 2,2 ≥ ··· ≥ 0. We will show that v 1 ,...,v r are eigenvectors of A A that
              correspond to nonzero eigenvalues, and, hence, using Lemma C.4 it follows that
              these are also right singular vectors of A. The proof is by induction. For the basis of

              the induction, note that any unit vector v can be written as v = Wx,for x = W v,
              and note that  x = 1. Therefore,
                                                                       n

                        2         2         
    2         2       2       2  2
                     Av  = AWx  = WDW Wx  = WDx  = Dx  =                 D x i .
                                                                          i,i
                                                                      i=1
              Therefore,
                                                       n
                                                           2  2
                                             2
                                    max  Av  = max       D x i .
                                                           i,i
                                    v: v =1      x: x =1
                                                      i=1
              The solution of the right-hand side is to set x = (1,0,...,0), which implies that v 1 is

              the first eigenvector of A A.Since  Av 1   > 0 it follows that D 1,1 > 0 as required.
              For the induction step, assume that the claim holds for some 1 ≤ t ≤ r − 1. Then,
              any v which is orthogonal to v 1 ,...,v t canbe writtenas v = Wx with all the first t
              elements of x being zero. It follows that
                                                            n
                                                                2  2
                                                 2
                                    max       Av  = max       D x i .
                                                                i,i
                              v: v =1,∀i≤t,v v i =0  x: x =1

                                                          i=t+1
              The solution of the right-hand side is the all zeros vector except x t+1 = 1. This
              implies that v t+1 is the (t + 1)th column of W. Finally, since  Av t+1   > 0 it follows
              that D t+1,t+1 > 0 as required. This concludes our proof.
              Corollary C.6 (The SVD Theorem). Let A ∈ R m,n  with rank r.Then A = UDV
              where D is an r ×r matrix with nonzero singular values of A and the columns of U,V
              are orthonormal left and right singular vectors of A. Furthermore, for all i, D 2  is
                                                                                  i,i


              an eigenvalue of A A,the ith column of V is the corresponding eigenvector of A A

              and the ith column of U is the corresponding eigenvector of AA .
   396   397   398   399   400   401   402   403   404   405   406