│ └──[Wind=Strong]
│ └── [Play=No]
├── [Season=Spring]
│ ├── [Temperature=Cold]
│ │ └── [Play=Yes]
│ └── [Temperature=Warm]
│ └── [Play=Yes]
├── [Season=Winter]
│ └── [Play=No]
└── [Season=Summer]
└── [Play=Yes]
The total number of trees in the random forest=4.
The maximum number of variables considered at a node is m=4.
Classification
Feature: ['Warm', 'Strong', 'Spring', '?']
Tree 0 votes for the class: No
Tree 1 votes for the class: Yes
Tree 2 votes for the class: Yes
Tree 3 votes for the class: Yes
The class with the maximum number of votes is 'Yes'. Thus the constructed
random forest classifies the feature ['Warm', 'Strong', 'Spring', '?'] into
the class 'Yes'.
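The majority-vote step itself is simple to express in code. The following is a minimal sketch in Python, assuming the votes of the individual trees have already been collected into a list; the function name majority_vote is only illustrative and is not taken from the book's implementation:

from collections import Counter

def majority_vote(votes):
    # Return the class label that receives the most votes.
    # Counter.most_common(1) gives the single most frequent (label, count)
    # pair; ties are resolved in favour of the label encountered first.
    return Counter(votes).most_common(1)[0][0]

tree_votes = ['No', 'Yes', 'Yes', 'Yes']  # votes of trees 0 to 3 above
print(majority_vote(tree_votes))          # prints: Yes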
2. When we construct a tree in a random forest, we use only a random subset of the
data, sampled with replacement. This reduces the bias of the classifier towards
certain features. However, a single tree may happen to be built on a biased sample
and may miss features that are important for an accurate classification, so a
random forest classifier with only one decision tree would likely perform poorly.
Therefore, we should construct more decision trees in a random forest to benefit
from the reduction of both bias and variance in the classification.
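Sampling with replacement (bootstrapping) can be sketched as follows; the function name bootstrap_sample and the example rows are illustrative only and are not the book's exact data or implementation:

import random

def bootstrap_sample(data, rng=random):
    # Draw len(data) rows uniformly at random WITH replacement, so some
    # rows may appear several times and others may be left out entirely.
    return [rng.choice(data) for _ in range(len(data))]

# Hypothetical rows in a (Temperature, Wind, Season, Play) format.
data = [('Warm', 'Strong', 'Spring', 'No'),
        ('Cold', 'Weak', 'Winter', 'No'),
        ('Warm', 'Weak', 'Summer', 'Yes'),
        ('Cold', 'Strong', 'Summer', 'Yes')]
print(bootstrap_sample(data))

Each tree in the forest is then trained on its own bootstrap sample, which is what makes the trees differ from one another.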
3. During cross-validation, we divide the data into training and testing data. The
training data is used to train the classifier, and the test data is used to evaluate
which parameters or methods would best improve the classification. Another
advantage of cross-validation is a reduction of bias, because each model is trained
on only part of the data, which decreases the chance of overfitting to the specific
dataset.
However, a random forest addresses the problems that cross-validation
addresses in an alternative way. Each random decision tree is constructed on
only a subset of the data, which reduces the chance of overfitting. In the end,
the classification is the combination of the results from all of these trees. The
final decision is not made by tuning parameters on a test dataset, but by
taking the majority vote of all the trees, which itself reduces the bias.
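For comparison, the train/test splitting used in k-fold cross-validation can be sketched as follows; the function name k_fold_splits and the placeholder data are illustrative and not part of the book's code:

import random

def k_fold_splits(data, k=5, seed=0):
    # Shuffle a copy of the data, cut it into k folds, and yield
    # (train, test) pairs, using each fold exactly once as the test set.
    items = list(data)
    random.Random(seed).shuffle(items)
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        yield train, test

rows = list(range(10))                  # placeholder data
for train, test in k_fold_splits(rows, k=5):
    print(len(train), len(test))        # prints: 8 2 on each of the 5 iterations

In a random forest, by contrast, there is no held-out test fold for tuning; each tree simply receives its own bootstrap sample, and the trees' votes are combined at the end.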