Page 414 - Understanding Machine Learning

Index

                 estimation error, 37, 40              label, 13
                 Expectation-Maximization, see EM      Lasso, 316, 335
                                                        generalization bounds, 335
                 face recognition, see Viola-Jones     latent variables, 301
                 feasible, 73                          LDA, 300
                 feature, 13                           Ldim, 248, 249
                 feature learning, 319                 learning curves, 122
                 feature normalization, 316            least squares, 95
                 feature selection, 309, 310           likelihood ratio, 301
                 feature space, 179                    linear discriminant analysis, see LDA
                 feature transformations, 318          linear predictor, 89
                 filters, 310                            homogenous, 90
                 forward greedy selection, 312         linear programming, 91
                 frequentist, 305                      linear regression, 94
                                                       linkage, 266
                 gain, 215                             Lipschitzness, 128, 142, 157
                 GD, see gradient descent               subgradient, 155
                 generalization error, 14              Littlestone dimension, see Ldim
                 generative models, 295                local minimum, 126
                 Gini index, 215                       logistic regression, 97
                 Glivenko-Cantelli, 35                 loss, 15
                 gradient, 126                         loss function, 26
                 gradient descent, 151                  0-1 loss, 27, 134
                 Gram matrix, 183                       absolute value loss, 95, 99, 133
                 growth function, 49                    convex loss, 131
                                                        generalized hinge loss, 195
                 halfspace, 90                          hinge loss, 134
                   homogenous, 90, 170                  Lipschitz loss, 133
                   nonseparable, 90                     log-loss, 298
                   separable, 90                        logistic loss, 98
                 Halving, 247                           ramp loss, 174
                 hidden layers, 230                     smooth loss, 133
                 Hilbert space, 181                     square loss, 27
                 Hoeffding’s inequality, 33, 375        surrogate loss, 134, 259
                 holdout, 116
                 hypothesis, 14                        margin, 168
                 hypothesis class, 16                  Markov’s inequality, 372
                                                       Massart lemma, 330
                 i.i.d., 18                            max linkage, 267
                 ID3, 214                              maximum a posteriori, 307
                 improper, see representation independent  maximum likelihood, 295
                 inductive bias, see bias              McDiarmid’s inequality, 328
                 information bottleneck, 273           MDL, 63, 65, 213
                 information gain, 215                 measure concentration, 32, 372
                 instance, 13                          Minimum Description Length, see MDL
                   instance space, 13                  mistake bound, 246
                 integral image, 113                   mixture of Gaussians, 301
                                                       model selection, 114, 117
                 Johnson-Lindenstrauss lemma, 284      multiclass, 25, 190, 351
                                                        cost-sensitive, 194
                 k-means, 268, 270                      linear predictors, 193, 354
                   soft k-means, 304                    multivector, 193, 355
                 k-median, 269                          Perceptron, 211
                 k-medoids, 269                         reductions, 190, 354
                 Kendall tau, 201                       SGD, 198
                 kernel PCA, 281                        SVM, 197
                 kernels, 179                          multivariate performance measures, 206
                   Gaussian kernel, 184
                   kernel trick, 181                   Naive Bayes, 299
                   polynomial kernel, 183              Natarajan dimension, 351
                   RBF kernel, 184                     NDCG, 202