the different classifiers that are obtained for the different values of the parameters. As indicated previously, several methods try to estimate the generalization error of a classifier. In contrast with other applications of GAs, the objective function in this problem is a random variable with an associated variance, and it is computationally expensive because it involves training a learning algorithm. To decide which method to use, we ran several experiments to find the estimator with the lowest variance. The results are summarized in Table 2.
The hold-out technique had the highest standard deviation. Stratifying the method, i.e., keeping the same ratio between classes in the training and testing sets, slightly reduced the standard deviation. All crossvalidation estimates had a significantly lower standard deviation than the hold-out technique.

Table 2. Mean and standard deviation of different types of generalization error estimates

Technique                                     Mean (%)   Standard Deviation (%)
10-fold Stratified Modified Crossvalidation   86.830     0.461
Modified Crossvalidation                      86.791     0.463
Stratified Crossvalidation                    86.681     0.486
Crossvalidation                               86.617     0.496
5-fold Stratified Modified Crossvalidation    86.847     0.540
5-fold Stratified Crossvalidation             86.567     0.609
5-fold Crossvalidation                        86.540     0.629
Stratified hold out                           86.215     1.809
Hold out                                      86.241     1.977

Since there is no statistically significant difference in the standard deviation between the different crossvalidation techniques, we use one of the most common: 10-fold crossvalidation.
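
To make this comparison concrete, the following sketch shows one way the spread of these estimators can be measured empirically: an SVM is evaluated repeatedly with a stratified hold-out split and with stratified 10-fold crossvalidation, and the standard deviation of the two estimates is compared. The dataset, the SVM parameters, and the number of repetitions are illustrative assumptions (using scikit-learn), not the setup used to produce Table 2.

    # Sketch: comparing the spread of a stratified hold-out estimate and a
    # stratified 10-fold crossvalidation estimate of an SVM's accuracy.
    # Dataset and SVM parameters are illustrative assumptions.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import (StratifiedKFold, cross_val_score,
                                          train_test_split)
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=600, n_features=20, random_state=0)
    clf = SVC(C=1.0, kernel="rbf", gamma="scale")

    holdout_acc, cv_acc = [], []
    for seed in range(30):                      # repeat to measure estimator spread
        # Stratified hold-out: a single 70/30 split preserving class ratios.
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.3, stratify=y, random_state=seed)
        holdout_acc.append(clf.fit(X_tr, y_tr).score(X_te, y_te))

        # Stratified 10-fold crossvalidation: average accuracy over the folds.
        skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
        cv_acc.append(cross_val_score(clf, X, y, cv=skf).mean())

    print("hold-out  : mean=%.3f  std=%.3f" % (np.mean(holdout_acc), np.std(holdout_acc)))
    print("10-fold CV: mean=%.3f  std=%.3f" % (np.mean(cv_acc), np.std(cv_acc)))

As in Table 2, the crossvalidation estimate typically shows a noticeably smaller spread across repetitions than a single hold-out split.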
We also considered an approximation of the leave-one-out estimator proposed in Joachims (1999), but we found that the estimated error diverged from the crossvalidation estimates for large values of the parameter C. This behaviour was also observed in the work of Duan et al. (2003).


Crossover, Selection, and Mutation

Several crossover operators are tested: one-point, two-point, uniform, and multiparent diagonal.
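
As an illustration of these operators, the sketch below implements one-point, two-point, uniform, and a multiparent diagonal crossover for fixed-length chromosomes stored as Python lists; the chromosome encoding and the exact operator variants shown here are assumptions, not necessarily the ones used in the experiments.

    # Sketch of the crossover operators named above, for fixed-length chromosomes.
    import random

    def one_point(p1, p2):
        # Swap the tails of the two parents after a single random cut point.
        cut = random.randint(1, len(p1) - 1)
        return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

    def two_point(p1, p2):
        # Exchange the segment between two random cut points.
        a, b = sorted(random.sample(range(1, len(p1)), 2))
        return p1[:a] + p2[a:b] + p1[b:], p2[:a] + p1[a:b] + p2[b:]

    def uniform(p1, p2):
        # Each gene is inherited from either parent with equal probability.
        c1, c2 = [], []
        for g1, g2 in zip(p1, p2):
            if random.random() < 0.5:
                c1.append(g1); c2.append(g2)
            else:
                c1.append(g2); c2.append(g1)
        return c1, c2

    def diagonal(parents):
        # Multiparent diagonal crossover: with n parents and n - 1 cut points,
        # child i takes segment j from parent (i + j) mod n.
        n, length = len(parents), len(parents[0])
        cuts = [0] + sorted(random.sample(range(1, length), n - 1)) + [length]
        return [
            [g for j in range(n)
               for g in parents[(i + j) % n][cuts[j]:cuts[j + 1]]]
            for i in range(n)
        ]

    # Example: fixed-length binary chromosomes.
    p1, p2 = [0] * 8, [1] * 8
    print(one_point(p1, p2))
    print(diagonal([p1, p2, [0, 1] * 4]))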