
Evolutionary Optimization of Support Vector Machines …          99

                                                       CONCLUSION


In this chapter, we explored the use of genetic algorithms to optimize the parameters of an SVM and proposed a specific variation that we found to perform better. The proposed algorithm uses 10-fold cross-validation as its fitness function. Several types of crossover and mutation for the genetic algorithm were implemented and compared, and it was found that a diagonal crossover with 4 parents and a fixed mutation rate provided the best performance.
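The diagonal crossover described above can be sketched as follows. This is a minimal Python illustration, not the chapter's C++ implementation; the function names and the Gaussian perturbation used for the fixed-rate mutation are our own assumptions.

```python
import random

def diagonal_crossover(parents):
    """Diagonal crossover: n parents and n-1 random cut points yield n
    offspring; offspring i takes segment s from parent (i + s) mod n,
    so each child inherits genes 'diagonally' across all parents."""
    n = len(parents)
    length = len(parents[0])
    cuts = sorted(random.sample(range(1, length), n - 1))
    bounds = [0] + cuts + [length]
    offspring = []
    for i in range(n):
        child = []
        for seg in range(n):
            lo, hi = bounds[seg], bounds[seg + 1]
            child.extend(parents[(i + seg) % n][lo:hi])
        offspring.append(child)
    return offspring

def mutate(chromosome, rate=0.05, scale=0.1):
    """Fixed mutation rate: each gene is perturbed with probability `rate`
    (here by a small Gaussian step, an illustrative choice)."""
    return [g + random.gauss(0.0, scale) if random.random() < rate else g
            for g in chromosome]
```

With 4 parents, each call produces 4 offspring whose genes at every position come from one of the parents at that same position.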
The SVM engine is based on a C++ version of LIBSVM (Chang and Lin, 2001). This implementation was modified to include a kernel that is a mixture of Gaussian and polynomial kernels. Thus, the genetic algorithm has the flexibility to decide how much weight to assign to each kernel, or to remove one altogether.
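A mixture kernel of this form can be sketched as a convex combination of the two base kernels. The sketch below is a minimal Python illustration with assumed parameter names (`weight`, `gamma`, `coef0`, `degree`); the chapter's actual LIBSVM modification may parameterize the mixture differently.

```python
import numpy as np

def mixed_kernel(x, y, weight=0.5, gamma=0.1, coef0=1.0, degree=3):
    """Convex mixture of a Gaussian RBF and a complete polynomial kernel.
    weight=1 recovers a pure RBF; weight=0 recovers a pure polynomial,
    letting an optimizer effectively remove either component."""
    rbf = np.exp(-gamma * np.sum((x - y) ** 2))          # Gaussian part
    poly = (np.dot(x, y) + coef0) ** degree              # polynomial part
    return weight * rbf + (1.0 - weight) * poly
```

Treating `weight` as just another gene in the chromosome lets the genetic algorithm tune the blend alongside the usual kernel parameters.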
The results from experiments using a data set representing individual models for electronic commerce (Ryan, 1999) show that GAs are able to find a good set of parameters that in many cases lead to improved performance over an SVM with fixed parameters.
While the value of using GAs to find optimal parameters might not seem so obvious for SVMs with simple kernels, such as a Gaussian RBF with only one parameter to set, this need will become apparent as applications continue to appear and new, more complicated kernels (likely with more parameters) are designed for specific problems. As an illustration of this, we created a new kernel which is a mixture of an RBF and a complete polynomial kernel. This kernel had previously been tested on regression problems by other researchers; here we found that it also gives good results for classification problems.
It was also shown that 10-fold cross-validation is a good estimator of the generalization performance of support vector machines, and it allowed us to guide the genetic algorithm to good values for the parameters of the SVM. In addition, we explored the possibility of using an efficient bound on the leave-one-out error, but we found it to be biased for large values of the parameter C.
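The cross-validation estimate used as the fitness function can be sketched generically. This is a minimal Python sketch with an assumed interface: `train_and_score` stands for any routine that trains a classifier (e.g. an SVM with a candidate parameter set) and returns its held-out accuracy.

```python
import numpy as np

def cross_val_fitness(train_and_score, X, y, k=10, seed=0):
    """k-fold cross-validation fitness: shuffle the indices, split them
    into k folds, hold out each fold in turn, and return the mean
    held-out score across all k train/test splits."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    folds = np.array_split(idx, k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        scores.append(train_and_score(X[train], y[train], X[test], y[test]))
    return float(np.mean(scores))
```

The genetic algorithm would call this once per candidate chromosome, which is the main source of the extra processing time noted below.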
Finally, we should note that this improvement in performance comes at the price of increased processing time. This downside can be mitigated by finding more efficient and unbiased estimates of the performance of SVMs.


                                                       REFERENCES

Ali, S. & Smith, K. (2003, October). Automatic parameter selection for polynomial kernel. In Proceedings of the IEEE International Conference on Information Reuse and Integration (IRI 2003) (pp. 243-249). IEEE.