
transport users. The main goal of SVM is to construct a hyperplane that distinctly separates the data points while maximizing the distance from the hyperplane to the nearest points of each class (known as margin maximization) (Figure 2).

[Figure 2. SVM algorithm for classification problem]

There are two main features in the SVM algorithm: the hypothesis space and the loss function to be minimized [53]. The space in which the hyperplane lies is induced by a kernel K that defines a dot product in that space [54]. The loss function in SVM classification is defined as:

$$c\left(x, y, f(x)\right) = \begin{cases} 0 & \text{if } y f(x) \geq 1 \\ 1 - y f(x) & \text{otherwise} \end{cases} \qquad (2)$$

where x is a data point, y is its class label, and f is the function of the hyperplane.
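
Equation (2) is the well-known hinge loss, which can be evaluated directly. The following minimal sketch (not part of the original study) uses NumPy, with hypothetical arrays y and fx standing for the class labels and the values of f(x):

    import numpy as np

    # Hinge loss of Eq. (2): 0 if y*f(x) >= 1, else 1 - y*f(x)
    def hinge_loss(y, fx):
        """y: array of +1/-1 class labels; fx: corresponding values of f(x)."""
        return np.maximum(0.0, 1.0 - y * fx)

    # Hypothetical example: two points beyond the margin, one inside it
    y = np.array([1, -1, 1])
    fx = np.array([1.5, -2.0, 0.3])
    print(hinge_loss(y, fx))  # -> [0.  0.  0.7]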
In general, SVM classification problems are solved through the following minimization problem (i.e., minimizing a trade-off between hypothesis-space complexity and empirical error):

$$\arg\min_{f} \; \|f\|_{K}^{2} + C \sum_{i=1}^{n} \xi_{i}, \qquad \text{subject to: } y_{i} f(x_{i}) \geq 1 - \xi_{i} \qquad (3)$$

where $\xi_{i}$ is a slack variable that estimates the error committed by the algorithm at the considered data point, and C is a regularization parameter that controls the trade-off between the training error (the proportion of misclassified training samples) and the complexity of the hypothesis space. The parameter C is important because overfitting and a high penalty for non-separable points may occur if its value is too large [55]. Using a Lagrange functional, the solution of the SVM classification can be written as follows [56]:

$$\max_{\alpha} \; \sum_{i=1}^{N} \alpha_{i} - \frac{1}{2} \sum_{i,j=1}^{N} \alpha_{i} \alpha_{j} y_{i} y_{j} K(x_{i}, x_{j}), \qquad \text{with } \sum_{i=1}^{N} \alpha_{i} y_{i} = 0 \qquad (4)$$

where $\alpha_{i}$ are the Lagrange multipliers and K is the kernel function, which can be chosen, for instance, as the Gaussian function:

$$K(x_{i}, x) = \exp\left(-\gamma \left\| x - x_{i} \right\|^{2}\right) \qquad (5)$$

where $\gamma$ is a parameter that is inversely proportional to the width of the kernel.
To construct the SVM for predicting the travel decisions of transport users, a k-fold cross-validation was applied in this study to assess the performance of the model, with the number of folds set to 15.
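
Such a 15-fold cross-validation could be arranged with scikit-learn along the following lines; this is a sketch only, and the synthetic features and labels X and y are stand-ins for the survey data:

    import numpy as np
    from sklearn.model_selection import KFold, cross_val_score
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.normal(size=(150, 2))               # synthetic stand-in features
    y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)  # synthetic stand-in labels

    # 15-fold cross-validation, as chosen in this study
    cv = KFold(n_splits=15, shuffle=True, random_state=42)
    scores = cross_val_score(SVC(kernel="rbf"), X, y, cv=cv, scoring="accuracy")
    print(scores.mean(), scores.std())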
2.3.3 Validating methods

In this study, several criteria, namely the confusion matrix, the Root Mean Square Error (RMSE), the Mean Absolute Error (MAE), and accuracy, were used to validate the performance of the models on both the training and testing datasets. To further assess the robustness of the two proposed AI algorithms, 1000 Monte Carlo simulations were then performed.
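
A possible implementation of these validation criteria and the Monte Carlo procedure is sketched below; the 70/30 split ratio, the synthetic data, and the use of scikit-learn are assumptions made for illustration, not details given in the paper:

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC
    from sklearn.metrics import (accuracy_score, confusion_matrix,
                                 mean_absolute_error, mean_squared_error)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))               # synthetic stand-in features
    y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)  # synthetic stand-in labels

    accuracies = []
    for run in range(1000):  # 1000 Monte Carlo simulations
        # Hypothetical 70/30 split, re-randomized on every run
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                                  random_state=run)
        y_pred = SVC(kernel="rbf").fit(X_tr, y_tr).predict(X_te)
        accuracies.append(accuracy_score(y_te, y_pred))

    print(confusion_matrix(y_te, y_pred))             # confusion matrix (last run)
    print(np.sqrt(mean_squared_error(y_te, y_pred)))  # RMSE
    print(mean_absolute_error(y_te, y_pred))          # MAE
    print(np.mean(accuracies), np.std(accuracies))    # spread over the 1000 runs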

