Understanding Machine Learning
Model Selection and Validation
To illustrate how validation is useful for model selection, consider again the
example of fitting a one-dimensional polynomial as described in the beginning of
this chapter. In the following we depict the same training set, with ERM polynomi-
als of degree 2, 3, and 10, but this time we also depict an additional validation set
(marked as red, unfilled circles). The polynomial of degree 10 has minimal training
error, yet the polynomial of degree 3 has the minimal validation error, and hence it
will be chosen as the best model.
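This selection rule can be sketched in code. The snippet below is a minimal illustration, not the book's implementation: it uses hypothetical synthetic data (noisy samples of a cubic) and NumPy's least-squares polynomial fit as the ERM learner for each degree, then picks the degree whose validation error is smallest.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D data: noisy samples of an underlying cubic,
# split into a training set and a held-out validation set.
x_train = rng.uniform(-1, 1, 30)
y_train = x_train**3 - x_train + rng.normal(0, 0.1, 30)
x_val = rng.uniform(-1, 1, 30)
y_val = x_val**3 - x_val + rng.normal(0, 0.1, 30)

def mse(coeffs, x, y):
    """Mean squared error of the polynomial with the given coefficients."""
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

# ERM over each candidate degree class (least-squares fit on the
# training set), then select the hypothesis with minimal validation error.
best_degree, best_err = None, np.inf
for d in (2, 3, 10):
    coeffs = np.polyfit(x_train, y_train, d)
    err = mse(coeffs, x_val, y_val)
    if err < best_err:
        best_degree, best_err = d, err
```

Note that the training error plays no role in the final choice; it is used only inside each class to pick the ERM hypothesis, while the validation error arbitrates between the classes.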
11.2.3 The Model-Selection Curve
The model selection curve shows the training error and validation error as a function
of the complexity of the model considered. For example, for the polynomial fitting
problem mentioned previously, the curve will look like:
[Figure: the model-selection curve for the polynomial fitting problem. Training error and validation error are plotted against the polynomial degree d, for d = 2 to 10; the y-axis ("Error") ranges from 0 to about 0.4.]
As can be seen, the training error is monotonically decreasing as we increase the
polynomial degree (which is the complexity of the model in our case). On the other
hand, the validation error first decreases but then starts to increase, which indicates
that we are starting to suffer from overfitting.
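The whole curve can be computed the same way. The sketch below, again on hypothetical synthetic data, tabulates training and validation error for every degree; the training error can only decrease as the degree grows, since each degree-d polynomial class contains the degree-(d-1) class, while the validation error eventually turns upward as overfitting sets in.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data as before: noisy samples of a cubic.
x_tr = rng.uniform(-1, 1, 40)
y_tr = x_tr**3 - x_tr + rng.normal(0, 0.1, 40)
x_va = rng.uniform(-1, 1, 40)
y_va = x_va**3 - x_va + rng.normal(0, 0.1, 40)

degrees = range(1, 11)
train_err, val_err = [], []
for d in degrees:
    # Least-squares fit of a degree-d polynomial on the training set.
    c = np.polyfit(x_tr, y_tr, d)
    train_err.append(np.mean((np.polyval(c, x_tr) - y_tr) ** 2))
    val_err.append(np.mean((np.polyval(c, x_va) - y_va) ** 2))

# Plotting train_err and val_err against degrees reproduces the
# model-selection curve: train_err is non-increasing in d, while
# val_err typically dips near the true complexity and then rises.
```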
Plotting such curves can help us understand whether we are searching the correct
regime of our parameter space. Often, there may be more than a single parameter
to tune, and the possible number of values each parameter can take might be quite
large. For example, in Chapter 13 we describe the concept of regularization, in which
the parameter of the learning algorithm is a real number. In such cases, we start