Page 92 - Data Science Algorithms in a Week
Random Forest
Construction of random decision tree number 1
We are given six data samples as the input data. From these, we randomly choose six samples
with replacement for the construction of this random decision tree:
[['Good', 'Warm', 'Yes'], ['None', 'Warm', 'No'], ['Good', 'Cold', 'No'],
['None', 'Cold', 'No'], ['None', 'Warm', 'No'], ['Small', 'Warm', 'No']]
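Choosing samples with replacement, as described above, can be sketched in a few lines of Python. This is only an illustration: the function name bootstrap_sample, the seed, and the rows in initial_data are assumptions, since the initial data is not listed in this section. Because the draw is random, the output will generally differ from the sample shown above.

```python
import random

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows uniformly at random, with replacement."""
    return rng.choices(rows, k=len(rows))

# Assumed stand-in for the initial data, which is not listed in this section.
initial_data = [
    ['None', 'Cold', 'No'], ['None', 'Warm', 'No'],
    ['Small', 'Cold', 'No'], ['Small', 'Warm', 'No'],
    ['Good', 'Cold', 'No'], ['Good', 'Warm', 'Yes'],
]

rng = random.Random(1)  # arbitrary seed, only to make the run repeatable
sample = bootstrap_sample(initial_data, rng)
print(sample)
```

Since rows are drawn with replacement, the same row can appear more than once and some rows may not appear at all, which is why each tree in the forest sees slightly different data.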
The rest of the construction of random decision tree number 1 is similar to the construction
of the previous random decision tree number 0. The only difference is that the tree is built
from a different randomly generated sample (shown above) of the initial data.
We start the construction with the root node to create the first node of the tree. We would
like to add children to the node [root].
We have the following variables available ['swimming_suit', 'water_temperature'].
As there are fewer of them than the parameter m=3, we consider all of them. Of these, the
variable with the highest information gain is the variable swimming_suit.
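The comparison of information gains at the root can be checked with a short Python sketch. It assumes the usual entropy-based definition of information gain with base-2 logarithms; the helper names entropy and information_gain are illustrative and not taken from the book's code.

```python
import math
from collections import Counter

def entropy(rows):
    """Shannon entropy (in bits) of the class label in the last column."""
    counts = Counter(row[-1] for row in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(rows, col):
    """Entropy of rows minus the weighted entropy after splitting on col."""
    groups = {}
    for row in rows:
        groups.setdefault(row[col], []).append(row)
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(rows) - remainder

# The sample drawn for random decision tree number 1.
data = [['Good', 'Warm', 'Yes'], ['None', 'Warm', 'No'], ['Good', 'Cold', 'No'],
        ['None', 'Cold', 'No'], ['None', 'Warm', 'No'], ['Small', 'Warm', 'No']]

variables = {'swimming_suit': 0, 'water_temperature': 1}
gains = {name: information_gain(data, col) for name, col in variables.items()}
best = max(gains, key=gains.get)
print(gains, best)
```

Under these assumptions, the gain is roughly 0.317 for swimming_suit and roughly 0.109 for water_temperature, so swimming_suit is selected.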
Therefore, we will branch the node further on this variable. We also remove this variable
from the list of the available variables for the children of the current node. Using the
variable swimming_suit, we partition the data in the current node as follows:
Partition for swimming_suit=Small: [['Small', 'Warm', 'No']]
Partition for swimming_suit=None: [['None', 'Warm', 'No'], ['None',
'Cold', 'No'], ['None', 'Warm', 'No']]
Partition for swimming_suit=Good: [['Good', 'Warm', 'Yes'],
['Good', 'Cold', 'No']]
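The partitioning step above amounts to grouping the rows by their value in the chosen column. A minimal Python sketch follows; the function name partition_by is an assumption made for illustration.

```python
def partition_by(rows, col):
    """Group rows by the value found in column col."""
    partitions = {}
    for row in rows:
        partitions.setdefault(row[col], []).append(row)
    return partitions

# The sample drawn for random decision tree number 1.
data = [['Good', 'Warm', 'Yes'], ['None', 'Warm', 'No'], ['Good', 'Cold', 'No'],
        ['None', 'Cold', 'No'], ['None', 'Warm', 'No'], ['Small', 'Warm', 'No']]

partitions = partition_by(data, 0)  # column 0 holds swimming_suit
for value, rows in partitions.items():
    print('Partition for swimming_suit=' + value + ':', rows)
```

Each key of the resulting dictionary becomes one branch from the current node, and the rows stored under that key are the data passed down to the corresponding child node.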
Now, given the partitions, let us create the branches and the child nodes. We add a child
node [swimming_suit=Small] to the node [root]. This branch classifies one data sample:
[['Small', 'Warm', 'No']].
We would like to add children to the node [swimming_suit=Small].
We have the following variable available ['water_temperature']. As there are fewer of
them than the parameter m=3, we consider all of them. Of these, the variable with the
highest information gain is the variable water_temperature. Therefore, we will branch
the node further on this variable. We also remove this variable from the list of the available
variables for the children of the current node. For the chosen variable
water_temperature, all the remaining data samples have the same value: Warm. So, we end
the branch with a leaf node. We add the leaf node [swim=No].
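The stopping rule used for this branch can be sketched in Python as follows. The function name leaf_label_if_unsplittable is hypothetical, and labelling the leaf with the most common class is an assumption; in this branch there is only one row, so the label is simply that row's class.

```python
from collections import Counter

def leaf_label_if_unsplittable(rows, col):
    """Return a leaf label if every row shares one value in column col.

    Returns None when the column still separates the rows, meaning the
    node could be branched further.
    """
    values = {row[col] for row in rows}
    if len(values) == 1:
        # Assumption: label the leaf with the most common class in the rows.
        return Counter(row[-1] for row in rows).most_common(1)[0][0]
    return None

# Data in the node [swimming_suit=Small]; column 1 holds water_temperature.
node_rows = [['Small', 'Warm', 'No']]
label = leaf_label_if_unsplittable(node_rows, 1)
print('swim=' + str(label))
```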