Dimensionality Reduction
Show that the solution of the problem is to set $w$ to be the first principal vector of $x_1,\ldots,x_m$.
2. Let $w_1$ be the first principal component as in the previous question. Now, suppose we would like to find a second unit vector, $w_2 \in \mathbb{R}^d$, that maximizes the variance of $\langle w_2, x\rangle$, but is also uncorrelated to $\langle w_1, x\rangle$. That is, we would like to solve
$$\operatorname*{argmax}_{w:\ \|w\|=1,\ \mathbb{E}[\langle w_1,x\rangle\langle w,x\rangle]=0} \operatorname{Var}[\langle w,x\rangle].$$
Show that the solution to this problem is to set $w$ to be the second principal component of $x_1,\ldots,x_m$.
Hint: Note that
$$\mathbb{E}[\langle w_1,x\rangle\langle w,x\rangle] \;=\; w_1^\top\, \mathbb{E}[x x^\top]\, w \;=\; \frac{1}{m}\, w_1^\top A w,$$
where $A = \sum_i x_i x_i^\top$. Since $w_1$ is an eigenvector of $A$, the constraint $\mathbb{E}[\langle w_1,x\rangle\langle w,x\rangle] = 0$ is equivalent to the constraint $\langle w_1, w\rangle = 0$.
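To make the hint concrete, here is a short numerical check (a sketch, not part of the exercise; the data, sizes, and seed are arbitrary choices): for $A$ built from centered data, the second eigenvector of $A$ satisfies both forms of the constraint, and no unit vector orthogonal to $w_1$ attains a larger variance.

```python
import numpy as np

# Sketch verifying the hint numerically; names and sizes are illustrative.
rng = np.random.default_rng(0)
m, d = 500, 5
X = rng.normal(size=(m, d))
X -= X.mean(axis=0)                      # center, so the mean of <w,x> is zero

A = X.T @ X                              # A = sum_i x_i x_i^T
eigvals, eigvecs = np.linalg.eigh(A)     # eigenvalues in ascending order
w1, w2 = eigvecs[:, -1], eigvecs[:, -2]  # first and second principal components

# The constraint E[<w1,x><w,x>] = 0 holds for w = w2 ...
print(np.mean((X @ w1) * (X @ w2)))      # ~ 0
# ... and is equivalent to <w1, w> = 0:
print(w1 @ w2)                           # ~ 0

# No unit vector orthogonal to w1 beats w2's variance:
u = rng.normal(size=d)
u -= (u @ w1) * w1
u /= np.linalg.norm(u)
print(np.mean((X @ u) ** 2) <= np.mean((X @ w2) ** 2))  # True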
23.5 The Relation between SVD and PCA: Use the SVD theorem (Corollary C.6) to give an alternative proof of Theorem 23.2.
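For intuition (not a proof), the identity behind this exercise can be checked numerically: the right singular vectors of the matrix whose rows are $x_1,\ldots,x_m$ coincide, up to sign, with the eigenvectors of $A = \sum_i x_i x_i^\top$, which is why the SVD yields the PCA solution. A minimal sketch, with arbitrary sizes:

```python
import numpy as np

# Sketch of the SVD/PCA connection; sizes and seed are arbitrary.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 6))                     # rows are x_1, ..., x_m

_, _, Vt = np.linalg.svd(X, full_matrices=False)  # singular values descending
eigvals, eigvecs = np.linalg.eigh(X.T @ X)        # eigenvalues ascending

# Top right singular vector of X = leading eigenvector of A, up to sign:
print(np.allclose(np.abs(Vt[0]), np.abs(eigvecs[:, -1])))  # True
```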
23.6 Random Projections Preserve Inner Products: The Johnson-Lindenstrauss lemma tells us that a random projection preserves the distances between a finite set of vectors. In this exercise you need to prove that if the vectors are within the unit ball, then not only are the distances between any two vectors preserved, but the inner products are preserved as well.
Let $Q$ be a finite set of vectors in $\mathbb{R}^d$ and assume that for every $x \in Q$ we have $\|x\| \le 1$.
1. Let $\delta \in (0,1)$ and let $n$ be an integer such that
$$\epsilon \;=\; \sqrt{\frac{6\log(|Q|^2/\delta)}{n}} \;\le\; 3.$$
Prove that with probability of at least $1-\delta$ over a choice of a random matrix $W \in \mathbb{R}^{n,d}$, where each element of $W$ is independently distributed according to $N(0,1/n)$, we have
$$\left|\langle Wu, Wv\rangle - \langle u, v\rangle\right| \;\le\; \epsilon$$
for every $u, v \in Q$.
Hint: Use the JL lemma to bound both $\frac{\|W(u+v)\|}{\|u+v\|}$ and $\frac{\|W(u-v)\|}{\|u-v\|}$.
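An empirical sanity check of part 1 (again a sketch, not a proof; sizes, $\delta$, and the seed are arbitrary choices): draw $W$ with $N(0,1/n)$ entries and compare the worst inner-product distortion over $Q$ with $\epsilon$.

```python
import numpy as np

# Empirical check of part 1; sizes, delta, and seed are arbitrary choices.
rng = np.random.default_rng(2)
d, n, delta = 1000, 400, 0.01
Q = rng.normal(size=(20, d))
Q /= np.maximum(np.linalg.norm(Q, axis=1, keepdims=True), 1.0)  # enforce ||x|| <= 1

eps = np.sqrt(6 * np.log(len(Q) ** 2 / delta) / n)
W = rng.normal(scale=np.sqrt(1 / n), size=(n, d))
P = Q @ W.T                                    # row i is W x_i

distortion = np.abs(P @ P.T - Q @ Q.T).max()   # max |<Wu,Wv> - <u,v>| over Q
print(f"{distortion:.3f} <= {eps:.3f}")        # holds with high probability
```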
2. (*) Let $x_1,\ldots,x_m$ be a set of vectors in $\mathbb{R}^d$ of norm at most 1, and assume that these vectors are linearly separable with margin $\gamma$. Assume that $d \gg 1/\gamma^2$. Show that there exists a constant $c > 0$ such that if we randomly project these vectors into $\mathbb{R}^n$, for $n = c/\gamma^2$, then with probability of at least 99% it holds that the projected vectors are linearly separable with margin $\gamma/2$.
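The statement can be explored experimentally. In the sketch below (my own construction; $\gamma$, $c$, the sizes, and the seed are arbitrary, and $c$ here is not the constant whose existence the exercise asks for), points are built with margin exactly $\gamma$, projected into $\mathbb{R}^n$ with $n = c/\gamma^2$, and the margin surviving for the projected separator $Ww^\ast$ is measured; it typically stays above $\gamma/2$.

```python
import numpy as np

# Illustrative experiment for part 2; gamma, c, sizes, and seed are my own
# arbitrary choices, not the constant the exercise asks for.
rng = np.random.default_rng(3)
d, m, gamma, c = 2000, 200, 0.5, 100.0

w_star = rng.normal(size=d)
w_star /= np.linalg.norm(w_star)               # unit separating direction
y = rng.choice([-1.0, 1.0], size=m)
Z = rng.normal(size=(m, d))
Z -= np.outer(Z @ w_star, w_star)              # component orthogonal to w_star
Z *= np.sqrt(1 - gamma**2) / np.linalg.norm(Z, axis=1, keepdims=True)
X = Z + gamma * y[:, None] * w_star            # unit norm, margin exactly gamma

n = int(c / gamma**2)                          # n = c / gamma^2
W = rng.normal(scale=np.sqrt(1 / n), size=(n, d))
Xp, wp = X @ W.T, W @ w_star
margin = (y * (Xp @ wp)).min() / np.linalg.norm(wp)
print(f"projected margin {margin:.3f} vs gamma/2 = {gamma / 2:.3f}")
```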