Page 300 - Understanding Machine Learning

Dimensionality Reduction

                                    PCA

                  input
                    A matrix of m examples X ∈ R^{m,d}
                    number of components n
                  if (m > d)
                    A = X^T X
                    Let u_1,...,u_n be the eigenvectors of A with largest eigenvalues
                  else
                    B = X X^T
                    Let v_1,...,v_n be the eigenvectors of B with largest eigenvalues
                    for i = 1,...,n set u_i = (1/||X^T v_i||) X^T v_i
                  output: u_1,...,u_n
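The procedure above can be sketched in NumPy as follows. This is our own illustrative implementation (the function name `pca` is not from the book); it diagonalizes the d×d matrix X^T X when m > d, and otherwise the smaller m×m matrix X X^T, recovering each u_i as X^T v_i normalized to unit length.

```python
import numpy as np

def pca(X, n):
    """Return the top-n principal directions of X (m x d) as columns of a d x n matrix.

    Illustrative sketch of the PCA pseudocode; not the book's reference code.
    """
    m, d = X.shape
    if m > d:
        A = X.T @ X                            # d x d
        _, eigvecs = np.linalg.eigh(A)         # eigh: eigenvalues in ascending order
        U = eigvecs[:, ::-1][:, :n]            # take eigenvectors of the n largest
    else:
        B = X @ X.T                            # m x m (smaller than d x d here)
        _, eigvecs = np.linalg.eigh(B)
        V = eigvecs[:, ::-1][:, :n]
        U = X.T @ V                            # map back to R^d: u_i = X^T v_i
        U /= np.linalg.norm(U, axis=0)         # normalize: u_i = X^T v_i / ||X^T v_i||
    return U
```

The `else` branch is the point of the case split: when there are fewer examples than features (m ≤ d), diagonalizing the m×m matrix X X^T is cheaper than diagonalizing X^T X.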
                                                                       2
                  To illustrate how PCA works, let us generate vectors in R^2 that approximately
                  reside on a line, namely, on a one dimensional subspace of R^2. For example, suppose
                  that each example is of the form (x, x + y) where x is chosen uniformly at random
                  from [−1,1] and y is sampled from a Gaussian distribution with mean 0 and stan-
                  dard deviation of 0.1. Suppose we apply PCA on this data. Then, the eigenvector
                  corresponding to the largest eigenvalue will be close to the vector (1/√2, 1/√2).
                  When projecting a point (x, x + y) on this principal component we will obtain the
                  scalar (2x + y)/√2. The reconstruction of the original vector will be
                  ((x + y/2), (x + y/2)). In Figure 23.1 we depict the original versus reconstructed
                  data.

                  Figure 23.1. A set of vectors in R^2 (x's) and their reconstruction after dimensionality
                  reduction to R^1 using PCA (circles). [Plot omitted; both axes range from −1.5 to 1.5.]
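The example above can be reproduced with a short NumPy script. This is our own sketch (variable names are ours): we draw the points (x, x + y), take the top eigenvector of X^T X, and form the rank-one reconstruction by projecting onto it and mapping back.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 1000
x = rng.uniform(-1.0, 1.0, size=m)           # x uniform on [-1, 1]
y = rng.normal(0.0, 0.1, size=m)             # y Gaussian, mean 0, std 0.1
X = np.column_stack([x, x + y])              # examples (x, x + y) as rows of an m x 2 matrix

# Top eigenvector of X^T X (the m > d branch of the PCA procedure)
_, eigvecs = np.linalg.eigh(X.T @ X)
u = eigvecs[:, -1]                           # eigenvector of the largest eigenvalue
u = u * np.sign(u[0])                        # fix the sign; should be close to (1/sqrt(2), 1/sqrt(2))

# Rank-one reconstruction: scalar projection (X @ u), then map back along u
X_rec = (X @ u)[:, None] * u[None, :]
```

Since the noise y has standard deviation only 0.1, the reconstructed points lie close to the originals, which is what Figure 23.1 shows.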