Proceedings of the ATRANS Young Researcher's Forum 2019
“Transportation for A Better Life:
Smart Mobility for Now and Then”
23 August 2019, Bangkok, Thailand
transport users. The main goal of SVM is to construct a hyperplane that distinctly separates the data points, in order to maximize the distance from the hyperplane to the outline of each dataset (which is called maximization of the margin) (Figure 2).

Figure 2. SVM algorithm for solution of the classification problem

There are two main features in the SVM algorithm: the hypothesis space and the loss function, which needs to be minimized [53]. The space in which the hyperplane lies is induced by a kernel K that defines a dot product in that space [54]. The loss function in the SVM classification is defined as:

c(x, y, f(x)) = \begin{cases} 0 & \text{if } y f(x) \ge 1 \\ 1 - y f(x) & \text{else} \end{cases}  (2)

where x denotes the coordinates of a data point, y its class label, and f is the function of the hyperplane.

In general, SVM classification problems are solved through the following minimization problem (i.e., minimizing a trade-off between hypothesis space complexity and empirical error):

\arg\min_f \| f \|_K^2 + C \sum_{i=1}^{n} \xi_i  (3)

subject to: y_i f(x_i) \ge 1 - \xi_i

where \xi_i is a slack variable that estimates the error committed by the algorithm at the considered data point, and C is a regularization parameter which influences the trade-off between the proportion of training samples and the hypothesis space complexity. The parameter C is important because overfitting and a high penalty for non-separable points may occur if its value is too large [55]. Using a Lagrange functional, the solution of the SVM classification can be written as follows [56]:

\max_{\alpha} \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j)  (4)

with \sum_{i=1}^{N} \alpha_i y_i = 0

where \alpha_i are the Lagrange multipliers and K is the kernel function, which can be chosen, for instance, as the Gaussian function:

K(x_i, x) = \exp\left( -\gamma \| x_i - x \|^2 \right)  (5)

where \gamma is a parameter which is inversely proportional to the width of the kernel.

To construct the SVM for predicting the travel decisions of transport users, k-fold cross-validation was applied in this study to assess the performance of the model, with the number of folds set to 15.

2.3.3 Validating methods

In this study, several criteria, namely the confusion matrix, Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and accuracy, were used to validate the performance of the models on both the training and testing datasets. To estimate the robustness of the two proposed AI algorithms more precisely, 1000 Monte Carlo simulations were then performed.
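The SVM formulation described above, with the Gaussian (RBF) kernel of Eq. (5) and 15-fold cross-validation, can be sketched with scikit-learn as follows. This is only a minimal illustration: the data are synthetic stand-ins for the transport-user features, and the values of C and gamma are assumptions, not the parameters used in the study.

```python
# Minimal sketch of an RBF-kernel SVM with 15-fold cross-validation.
# Synthetic data; C and gamma are illustrative values only.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)
# Two synthetic classes of 2-D points standing in for transport-user features.
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)),
               rng.normal(2.5, 1.0, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

# RBF kernel K(x_i, x) = exp(-gamma * ||x_i - x||^2), as in Eq. (5);
# C is the regularization parameter of Eq. (3).
clf = SVC(kernel="rbf", C=1.0, gamma=0.5)

# 15-fold cross-validation, as used in the study.
cv = StratifiedKFold(n_splits=15, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv)
print(f"mean CV accuracy over 15 folds: {scores.mean():.3f}")
```

Stratified folds keep the class proportions of the full dataset in each fold, which matters when folds become small (here, roughly 13 samples each at k = 15).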
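The validation procedure of Section 2.3.3 (confusion matrix, RMSE, MAE, and accuracy, repeated over random splits in a Monte Carlo fashion) can be sketched as below. The data, the split ratio, and the number of repetitions (50 here rather than the study's 1000, to keep the sketch fast) are all assumptions for illustration.

```python
# Hedged sketch of the validation step: confusion matrix, RMSE, MAE and
# accuracy over repeated random train/test splits (Monte Carlo style).
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             mean_absolute_error, mean_squared_error)

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)),
               rng.normal(2.5, 1.0, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

accs = []
for seed in range(50):  # the study uses 1000 Monte Carlo runs
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed)
    pred = SVC(kernel="rbf", C=1.0, gamma=0.5).fit(X_tr, y_tr).predict(X_te)
    accs.append(accuracy_score(y_te, pred))

# Error criteria computed on the last run, for illustration.
cm = confusion_matrix(y_te, pred)
rmse = mean_squared_error(y_te, pred) ** 0.5
mae = mean_absolute_error(y_te, pred)
print(f"accuracy: {np.mean(accs):.3f} +/- {np.std(accs):.3f}")
print(f"RMSE: {rmse:.3f}  MAE: {mae:.3f}")
print(cm)
```

Reporting the mean and standard deviation of accuracy across runs is what makes the repeated-split scheme a robustness estimate rather than a single point measurement.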