Page 208 - Understanding Machine Learning
P. 208

17





                 Multiclass, Ranking, and Complex
                 Prediction Problems













                 Multiclass categorization is the problem of classifying instances into one of several
                 possible target classes. That is, we are aiming at learning a predictor h : X → Y,
                 where Y is a finite set of categories. Applications include, for example, categorizing
                 documents according to topic (X is the set of documents and Y is the set of possible
                 topics) or determining which object appears in a given image (X is the set of images
                 and Y is the set of possible objects).
                    The centrality of the multiclass learning problem has spurred the development of
                 various approaches for tackling the task. Perhaps the most straightforward approach
                 is a reduction from multiclass classification to binary classification. In Section 17.1
                 we discuss the most common two reductions as well as the main drawback of the
                 reduction approach.
                    We then turn to describe a family of linear predictors for multiclass problems.
                 Relying on the RLM and SGD frameworks from previous chapters, we describe
                 several practical algorithms for multiclass prediction.
                    In Section 17.3 we show how to use the multiclass machinery for complex pre-
                 diction problems in which Y can be extremely large but has some structure on it.
                 This task is often called structured output learning. In particular, we demonstrate
                 this approach for the task of recognizing handwritten words, in which Y is the set of
                 all possible strings of some bounded length (hence, the size of Y is exponential in
                 the maximal length of a word).
                    Finally, in Section 17.4 and Section 17.5 we discuss ranking problems in which
                 the learner should order a set of instances according to their “relevance.” A typical
                 application is ordering results of a search engine according to their relevance to the
                 query. We describe several performance measures that are adequate for assessing
                 the performance of ranking predictors and describe how to learn linear predictors
                 for ranking problems efficiently.



                 17.1 ONE-VERSUS-ALL AND ALL-PAIRS
                 The simplest approach to tackle multiclass prediction problems is by reduc-
                 tion to binary classification. Recall that in multiclass prediction we would like


           190
   203   204   205   206   207   208   209   210   211   212   213