convergence, control of classifier complexity, as well as a better understanding of the underlying mathematical foundations based on optimization and statistical learning theory.
Nevertheless, as with most learning algorithms, the practical performance depends on the selection of tuning parameters that control the behaviour of the algorithm and that, ultimately, determine how good the resulting classifier is. The simplest way to find good parameter values is an exhaustive search, i.e., trying all possible combinations, but this method becomes impractical as the number of parameters increases. The problem of finding good values for these parameters in order to improve performance is called the model selection problem.
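As a point of reference for the rest of the chapter, the exhaustive-search baseline can be sketched in a few lines of Python: every combination of the penalty C and the kernel parameter is trained and scored by cross-validation. The parameter grids, the synthetic dataset, and the five-fold cross-validation below are illustrative assumptions, not settings taken from this chapter.

```python
# A minimal sketch of exhaustive (grid) search for SVM parameters.
# The grids, dataset, and fold count are illustrative assumptions.
from itertools import product

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

C_grid = [0.1, 1, 10, 100]          # candidate penalty values
gamma_grid = [0.001, 0.01, 0.1, 1]  # candidate RBF kernel parameters

best_score, best_params = -1.0, None
for C, gamma in product(C_grid, gamma_grid):      # every combination
    score = cross_val_score(SVC(C=C, gamma=gamma, kernel="rbf"),
                            X, y, cv=5).mean()    # generalization estimate
    if score > best_score:
        best_score, best_params = score, (C, gamma)

print("best (C, gamma):", best_params, "cv accuracy:", round(best_score, 3))
```

Even in this small example the number of trained models is the product of the grid sizes, which is why the cost grows quickly once kernels with several parameters are considered.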
In this chapter we investigate the model selection problem in support vector machines using genetic algorithms (GAs). The main contribution is to show that GAs provide an effective approach to finding good parameters for support vector machines (SVMs). We describe a possible implementation of a GA and compare several variations of the basic GA in terms of convergence speed. In addition, it is shown that using a convex sum of two kernels provides an effective modification of SVMs for classification problems, and not only for regression as previously shown in Smits and Jordaan (2002). The proposed algorithm is tested on a dataset of 125 subjects from a study conducted by Ryan (1999), previously used for comparing several learning algorithms in Rabelo (2001), which represents individual models for electronic commerce.
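To make the convex sum of two kernels concrete, a minimal sketch follows. The particular pair of kernels (Gaussian RBF plus polynomial), the mixing weight, and all parameter values are assumptions chosen only for illustration, not the configuration used in this chapter.

```python
# A minimal sketch of a convex combination of two kernels:
#   k(x, z) = lam * k_rbf(x, z) + (1 - lam) * k_poly(x, z)
# passed to an SVM classifier as a custom kernel callable.
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import polynomial_kernel, rbf_kernel
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def mixed_kernel(X1, X2, lam=0.5, gamma=0.1, degree=2):
    """Convex combination of an RBF and a polynomial kernel (illustrative values)."""
    return (lam * rbf_kernel(X1, X2, gamma=gamma)
            + (1 - lam) * polynomial_kernel(X1, X2, degree=degree))

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

clf = SVC(kernel=mixed_kernel, C=1.0)  # SVC accepts a callable that returns a Gram matrix
print("cv accuracy:", cross_val_score(clf, X, y, cv=5).mean().round(3))
```

Since a convex combination of two positive semidefinite kernels is itself positive semidefinite, the mixed kernel remains a valid SVM kernel, and the mixing weight simply becomes one more parameter for the model selection procedure to tune.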


                                                  LITERATURE SURVEY

Support vector machines, like most other learning algorithms, have several parameters that affect their performance and that need to be selected in advance. For SVMs, these parameters include the penalty value C, the kernel type, and the kernel-specific parameters. While some kernels, like the Gaussian radial basis function kernel, have only one parameter to set (the width σ), more complicated kernels require an increasing number of parameters. The usual way to find good values for these parameters is to train different SVMs, each with a different combination of parameter values, and compare their performance on a test set or by means of other generalization estimates such as leave-one-out or cross-validation. Nevertheless, an exhaustive search of the parameter space is time consuming and ineffective, especially for more complicated kernels. For this reason, several researchers have proposed methods to find a good set of parameters more efficiently (see, for example, Cristianini et al. (1999), Chapelle et al. (2002), Shao and Cherkassky (1999), and Ali and Smith (2003) for various approaches).
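The alternative pursued in this chapter, searching the parameter space with a genetic algorithm instead of exhaustively, can be sketched as follows. The encoding as (log10 C, log10 gamma) pairs, the population size, the number of generations, the mutation scale, the search ranges, and the dataset are all illustrative assumptions rather than the settings used later in the chapter.

```python
# A minimal sketch of a genetic algorithm for SVM parameter selection.
# Individuals encode (log10 C, log10 gamma); fitness is cross-validated accuracy.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

def fitness(ind):
    """Cross-validated accuracy of an RBF SVM decoded from the individual."""
    C, gamma = 10.0 ** ind[0], 10.0 ** ind[1]
    return cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=3).mean()

# initial population: 12 individuals, log10 C in [-1, 3], log10 gamma in [-3, 1]
pop = rng.uniform([-1.0, -3.0], [3.0, 1.0], size=(12, 2))
for generation in range(15):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[::-1][:6]]        # keep the best half
    children = []
    while len(children) < len(pop) - len(parents):
        a, b = parents[rng.choice(len(parents), 2, replace=False)]
        child = np.where(rng.random(2) < 0.5, a, b)    # uniform crossover
        child = child + rng.normal(scale=0.2, size=2)  # Gaussian mutation
        children.append(child)
    pop = np.vstack([parents, children])

best = max(pop, key=fitness)
print("best C:", 10.0 ** best[0], "best gamma:", 10.0 ** best[1])
```

The key practical difference from the exhaustive search sketched earlier is that the number of fitness evaluations is fixed by the population size and the number of generations rather than growing exponentially with the number of parameters.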
For many years now, genetic algorithms have been used together with neural networks. Several approaches for integrating genetic algorithms and neural networks