Page 136 - Understanding Machine Learning

Model Selection and Validation

To illustrate how validation is useful for model selection, consider again the
example of fitting a one-dimensional polynomial, as described at the beginning of
this chapter. The following figure depicts the same training set, with ERM polynomials
of degree 2, 3, and 10, but this time it also depicts an additional validation set
(marked as red, unfilled circles). The polynomial of degree 10 has the minimal training
error, yet the polynomial of degree 3 has the minimal validation error, and hence it
is chosen as the best model.
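
The selection rule just described can be sketched in a few lines of Python. This is a minimal illustration, not the book's code: it assumes the squared loss, uses NumPy's `Polynomial.fit` as the least-squares ERM solver, and draws hypothetical data from a noisy cubic. The names `select_degree`, `squared_error`, and `sample` are made up for this example.

```python
import numpy as np
from numpy.polynomial import Polynomial


def squared_error(model, x, y):
    """Mean squared error of a fitted polynomial on the sample (x, y)."""
    return float(np.mean((model(x) - y) ** 2))


def select_degree(x_train, y_train, x_val, y_val, degrees):
    """Fit one least-squares polynomial per degree on the training set and
    return the degree with the smallest validation error, together with a
    dict mapping degree -> (training error, validation error)."""
    results = {}
    for d in degrees:
        model = Polynomial.fit(x_train, y_train, deg=d)  # ERM under squared loss
        results[d] = (squared_error(model, x_train, y_train),
                      squared_error(model, x_val, y_val))
    best = min(results, key=lambda d: results[d][1])
    return best, results


# Hypothetical data (an assumption for illustration): a noisy cubic.
rng = np.random.default_rng(0)


def sample(n):
    x = rng.uniform(-1.0, 1.0, size=n)
    y = x ** 3 - 0.5 * x + rng.normal(scale=0.1, size=n)
    return x, y


x_train, y_train = sample(15)
x_val, y_val = sample(15)

best_degree, results = select_degree(x_train, y_train, x_val, y_val, [2, 3, 10])
for d, (train_err, val_err) in results.items():
    print(f"degree {d:2d}: train error {train_err:.4f}, validation error {val_err:.4f}")
print("selected degree:", best_degree)
```

Because the polynomial classes are nested, the training error can only shrink as the degree grows, so it cannot be used to choose the degree. The validation error is computed on points the fit never saw, which is what makes it a usable selection criterion.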

11.2.3 The Model-Selection Curve

The model-selection curve shows the training error and the validation error as a
function of the complexity of the model considered. For example, for the polynomial
fitting problem mentioned previously, the curve looks as follows:

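[Figure: model-selection curve. Training error ("Train") and validation error ("Validation") plotted against the polynomial degree d (horizontal axis, 2 to 10); the vertical axis shows the error, from 0 to 0.4.]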

As the curve shows, the training error decreases monotonically as we increase the
polynomial degree (which is the complexity of the model in our case). The validation
error, on the other hand, first decreases but then starts to increase, which indicates
that we are starting to suffer from overfitting.
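
The numbers behind such a curve can be computed as in the following sketch. It rests on the same assumptions as before (squared loss, NumPy's `Polynomial.fit` as the least-squares solver, hypothetical noisy-cubic data), and `model_selection_curve` is an illustrative name rather than a function from the book or from any library. The output is printed as a table; it could be passed to any plotting tool instead.

```python
import numpy as np
from numpy.polynomial import Polynomial


def model_selection_curve(x_train, y_train, x_val, y_val, max_degree):
    """Return (degrees, train_err, val_err), where entry i holds the mean
    squared errors of the least-squares polynomial of degree degrees[i]."""
    degrees = np.arange(1, max_degree + 1)
    train_err = np.empty(len(degrees))
    val_err = np.empty(len(degrees))
    for i, d in enumerate(degrees):
        model = Polynomial.fit(x_train, y_train, deg=int(d))
        train_err[i] = np.mean((model(x_train) - y_train) ** 2)
        val_err[i] = np.mean((model(x_val) - y_val) ** 2)
    return degrees, train_err, val_err


# Hypothetical data (an assumption for illustration): a noisy cubic.
rng = np.random.default_rng(1)
x_train = rng.uniform(-1.0, 1.0, size=20)
y_train = x_train ** 3 - 0.5 * x_train + rng.normal(scale=0.1, size=20)
x_val = rng.uniform(-1.0, 1.0, size=20)
y_val = x_val ** 3 - 0.5 * x_val + rng.normal(scale=0.1, size=20)

degrees, train_err, val_err = model_selection_curve(
    x_train, y_train, x_val, y_val, max_degree=10)

print(" d   train error   validation error")
for d, tr, va in zip(degrees, train_err, val_err):
    print(f"{d:2d}   {tr:11.4f}   {va:16.4f}")

# The degree at the minimum of the validation curve is the one validation selects.
best_d = int(degrees[np.argmin(val_err)])
print("degree with minimal validation error:", best_d)
```

The training-error column is non-increasing in d because each polynomial class contains the previous one. The validation-error column carries no such guarantee, and the degree at which it stops improving marks the onset of overfitting on this sample.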
Plotting such curves can help us understand whether we are searching the correct
regime of our parameter space. Often, there may be more than a single parameter
to tune, and the number of possible values each parameter can take might be quite
large. For example, in Chapter 13 we describe the concept of regularization, in which
the parameter of the learning algorithm is a real number. In such cases, we start
