$P_1 = \dfrac{(TP + FN)\,(TP + FP)}{(TP + TN + FP + FN)^2}$ (3)

$P_2 = \dfrac{(FP + TN)\,(FN + TN)}{(TP + TN + FP + FN)^2}$ (4)

$P_e = P_1 + P_2$ (5)
The Kappa index is obtained through the following relation:
$\kappa = \dfrac{OA - P_e}{1 - P_e}$ (6)
where TP is the true positive, TN is the true negative, FN is the false negative, FP is the false positive, OA is the overall accuracy, P_e is the expected agreement by chance, and κ is the Kappa index.
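As a minimal illustration, equations (3)-(6) can be evaluated directly from these four counts; the sketch below assumes a binary confusion matrix, and the function name and example counts are chosen for illustration only.

```python
# A minimal sketch of equations (3)-(6), assuming a binary confusion
# matrix; the function name and the example counts are illustrative.

def kappa_index(TP: int, TN: int, FP: int, FN: int) -> float:
    """Cohen's Kappa computed from the four confusion matrix counts."""
    n = TP + TN + FP + FN                    # total number of samples
    p1 = (TP + FN) * (TP + FP) / n ** 2      # equation (3)
    p2 = (FP + TN) * (FN + TN) / n ** 2      # equation (4)
    pe = p1 + p2                             # equation (5): chance agreement
    oa = (TP + TN) / n                       # overall accuracy
    return (oa - pe) / (1 - pe)              # equation (6)

print(kappa_index(TP=80, TN=90, FP=10, FN=20))   # 0.70
```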
Finally, in this paper, classification maps will be generated for the methods and the identified effective parameters.
3.1. Decision tree models
A decision tree is a tree-like collection of nodes intended to decide on the affiliation of values to a class or to produce an estimate of a numerical target value. Each node represents a splitting rule for one specific attribute. For classification, this rule separates values belonging to different classes; for regression, it separates them so as to reduce the error in an optimal way for the selected parameter criterion. The building of new nodes is repeated until the stopping criteria are met. A prediction for the class label attribute is determined by the majority of examples that reached a leaf during generation, while an estimate of a numerical value is obtained by averaging the values in the leaf, as sketched below.
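The following sketch illustrates this leaf prediction rule; it assumes, for illustration only, that the target values of the examples reaching a leaf are available as a list.

```python
# A minimal sketch of leaf prediction, assuming the training examples
# that reached a leaf are stored with it; names are illustrative.

from collections import Counter
from statistics import mean

def leaf_prediction(values, task="classification"):
    """Predict from the target values collected in one leaf."""
    if task == "classification":
        # Majority vote over the class labels that reached this leaf.
        return Counter(values).most_common(1)[0][0]
    # Regression: average of the numerical target values in the leaf.
    return mean(values)

print(leaf_prediction(["water", "urban", "water"]))         # 'water'
print(leaf_prediction([2.0, 3.0, 4.0], task="regression"))  # 3.0
```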
The decision tree algorithms used in this study are given below:
3.1.1. CART
The Classification And Regression Trees (CART) algorithm [19] builds a decision tree using Gini's impurity index as the splitting criterion. CART produces a binary tree, built by repeatedly splitting a node into two child nodes. The algorithm iterates over three steps (a sketch of the split search follows the list):
1. Find each feature's best split. For a feature with K distinct values there exist K−1 possible splits. Find the split that maximizes the splitting criterion. The resulting set of splits contains the best split for each feature.
2. Find the node's best split. Among the best splits from Step 1, find the one that maximizes the splitting criterion.
3. Split the node using the best split from Step 2 and repeat from Step 1 until the stopping criterion is satisfied.
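The following sketch illustrates the search of Steps 1 and 2 for numeric features; it is not the authors' implementation, and split_score stands in for the splitting criterion, which for CART is the Gini decrease defined in equation (8) below.

```python
# A sketch of the split search in Steps 1-2 for numeric features.
# split_score is a hypothetical placeholder for the splitting
# criterion; CART plugs in the Gini decrease of equation (8) below.

def best_split(X, y, split_score):
    """Return (feature index, threshold) of the best split of X, y."""
    best = (None, None, float("-inf"))
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})      # K distinct values
        # Step 1: K - 1 candidate splits, midway between neighbours.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[f] <= threshold]
            right = [yi for row, yi in zip(X, y) if row[f] > threshold]
            score = split_score(y, left, right)
            # Step 2: keep the best split over all features.
            if score > best[2]:
                best = (f, threshold, score)
    return best[:2]
```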
As the splitting criterion, we used Gini's impurity index, defined for node t as [19]:
$i(t) = \sum_{i,j} C(i|j)\, p(i|t)\, p(j|t)$ (7)
where C(i|j) is the cost of misclassifying a class j case as a class i case (in our case, C(i|j) = 1 if i ≠ j and C(i|j) = 0 if i = j), and p(i|t) (respectively p(j|t)) is the probability of a case being in class i (j) given that it falls into node t.
The Gini impurity criterion is a type of decrease of impurity, which is defined as [19]:
$\Delta i(s, t) = i(t) - p_L\, i(t_L) - p_R\, i(t_R)$ (8)
where s is a candidate split of node t, t_L and t_R are the resulting left and right child nodes, and p_L and p_R are the proportions of cases in t that are sent to t_L and t_R, respectively.
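The sketch below illustrates equations (7) and (8) under the unit misclassification cost stated above, for which i(t) reduces to 1 − Σᵢ p(i|t)²; delta_i matches the split_score placeholder of the earlier sketch.

```python
# A sketch of equations (7) and (8) under the unit cost C(i|j) stated
# above, for which i(t) reduces to 1 - sum_i p(i|t)^2; delta_i can be
# passed as the split_score placeholder of the earlier sketch.

from collections import Counter

def gini(labels):
    """Gini impurity i(t) of the cases in node t, equation (7)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def delta_i(parent, left, right):
    """Decrease of impurity for a split of node t, equation (8)."""
    p_left = len(left) / len(parent)     # p_L: share of cases sent left
    p_right = len(right) / len(parent)   # p_R: share of cases sent right
    return gini(parent) - p_left * gini(left) - p_right * gini(right)

# A pure split of a balanced two-class node removes all impurity:
print(delta_i(["a", "a", "b", "b"], ["a", "a"], ["b", "b"]))  # 0.5
```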