Page 415 - Understanding Machine Learning

Index  397


              Nearest Neighbor, 219                ridge regression, 138
               k-NN, 220                            kernel ridge regression, 188
              neural networks, 228                 RIP, 286
               feedforward networks, 229           risk, 14, 24, 26
               layered networks, 229               RLM, 137, 164
               SGD, 236
              No-Free-Lunch, 37                    sample complexity, 22
              nonuniform learning, 59              Sauer’s lemma, 49
              Normalized Discounted Cumulative Gain, see  self-boundedness, 130
                 NDCG                              sensitivity, 206
                                                   SGD, 156
              Occam’s razor, 65                    shattering, 45, 352
              OMP, 312                             single linkage, 267
              one-vs.-all, 191, 353                Singular Value Decomposition, see SVD
              one-vs.-rest, see one-vs.-all        Slud’s inequality, 378
              online convex optimization, 257      smoothness, 129, 143, 163
              online gradient descent, 257         SOA, 250
              online learning, 245                 sparsity-inducing norms, 315
              optimization error, 135              specificity, 206
              oracle inequality, 145               spectral clustering, 271
              orthogonal matching pursuit, see OMP  SRM, 60, 115
              overfitting, 15, 41, 121              stability, 139
                                                   Stochastic Gradient Descent, see SGD
              PAC, 22                              strong learning, 102
               agnostic PAC, 23, 25                Structural Risk Minimization, see SRM
               agnostic PAC for general loss, 27   structured output prediction, 198
              PAC-Bayes, 364                       subgradient, 154
              parametric density estimation, 295   Support Vector Machines, see SVM
              PCA, 279                             SVD, 381
              Pearson’s correlation coefficient, 311  SVM, 167, 333
              Perceptron, 92                        duality, 175
               kernelized Perceptron, 188           generalization bounds, 172, 333
               multiclass, 211                      hard-SVM, 168, 169
               online, 258                          homogeneous, 170
              permutation matrix, 205               kernel trick, 181
              polynomial regression, 96             soft-SVM, 171
              precision, 206                        support vectors, 175
              predictor, 14
              prefix free language, 64              target set, 26
              Principal Component Analysis, see PCA  term frequency, 194
              prior knowledge, 39                  TF-IDF, 194
              Probably Approximately Correct, see PAC  training error, 15
              projection, 159                      training set, 13
               projection lemma, 159               true error, 14, 24
              proper, 28
              pruning, 216                         underfitting, 41, 121
                                                   uniform convergence, 31, 32
              Rademacher complexity, 325           union bound, 19
              random forests, 217                  unsupervised learning, 265
              random projections, 283
              ranking, 201                         validation, 114, 116
               bipartite, 206                       cross validation, 119
              realizability, 17                     train-validation-test split, 120
              recall, 206                          Vapnik-Chervonenkis dimension, see VC
              regression, 26, 94, 138                 dimension
              regularization, 137                  VC dimension, 43, 46
               Tikhonov, 138, 140                  version space, 247
              regularized loss minimization, see RLM  Viola-Jones, 110
              representation independent, 28, 80
              representative sample, 31, 325       weak learning, 101, 102
              representer theorem, 182             Weighted-Majority, 252