Page 401 - Understanding Machine Learning
P. 401
C.4 Singular Value Decomposition (SVD) 383
Lemma C.5. Let A ∈ R m,n with rank r. Define the following vectors:
v 1 = argmax Av
n
v∈R : v =1
v 2 = argmax Av
n
v∈R : v =1
v,v 1 =0
.
.
.
v r = argmax Av
n
v∈R : v =1
∀i<r, v,v i =0
Then, v 1 ,...,v r is an orthonormal set of right singular vectors of A.
Proof. First note that since the rank of A is r, the range of A is a subspace of
dimension r, and therefore it is easy to verify that for all i = 1,...,r, Av i > 0.
Let W ∈ R n,n be an orthonormal matrix obtained by the eigenvalue decompo-
sition of A A, namely, A A = WDW ,with D being a diagonal matrix with
D 1,1 ≥ D 2,2 ≥ ··· ≥ 0. We will show that v 1 ,...,v r are eigenvectors of A A that
correspond to nonzero eigenvalues, and, hence, using Lemma C.4 it follows that
these are also right singular vectors of A. The proof is by induction. For the basis of
the induction, note that any unit vector v can be written as v = Wx,for x = W v,
and note that x = 1. Therefore,
n
2 2
2 2 2 2 2
Av = AWx = WDW Wx = WDx = Dx = D x i .
i,i
i=1
Therefore,
n
2 2
2
max Av = max D x i .
i,i
v: v =1 x: x =1
i=1
The solution of the right-hand side is to set x = (1,0,...,0), which implies that v 1 is
the first eigenvector of A A.Since Av 1 > 0 it follows that D 1,1 > 0 as required.
For the induction step, assume that the claim holds for some 1 ≤ t ≤ r − 1. Then,
any v which is orthogonal to v 1 ,...,v t canbe writtenas v = Wx with all the first t
elements of x being zero. It follows that
n
2 2
2
max Av = max D x i .
i,i
v: v =1,∀i≤t,v v i =0 x: x =1
i=t+1
The solution of the right-hand side is the all zeros vector except x t+1 = 1. This
implies that v t+1 is the (t + 1)th column of W. Finally, since Av t+1 > 0 it follows
that D t+1,t+1 > 0 as required. This concludes our proof.
Corollary C.6 (The SVD Theorem). Let A ∈ R m,n with rank r.Then A = UDV
where D is an r ×r matrix with nonzero singular values of A and the columns of U,V
are orthonormal left and right singular vectors of A. Furthermore, for all i, D 2 is
i,i
an eigenvalue of A A,the ith column of V is the corresponding eigenvector of A A
and the ith column of U is the corresponding eigenvector of AA .