Page 103 - Data Science Algorithms in a Week

Random Forest


            We have the following variable available ['Sunshine']. As there are fewer of them than
            the parameter m=4, we consider all of them. Of these, the variable with the highest
            information gain is the variable Sunshine. Therefore, we will branch the node further on
            this variable. We also remove this variable from the list of the available variables for the
            children of the current node. For the chosen variable Sunshine, all the remaining features
            have the same value: Sunny. So, we end the branch with a leaf node. We add the leaf node
            [Play=Yes].
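
The variable-selection rule described above (consider every available variable when there are no more than m of them) can be sketched as follows. This is a minimal illustration, not the program that produced this output: the function name `candidate_variables` is hypothetical, and drawing a random sample of size m when more than m variables remain is the usual random forest convention, assumed here.

```python
import random

def candidate_variables(available, m, rng=random):
    """Return the variables to consider for a split at one node.

    Hypothetical helper: if at most m variables remain, all of them are
    considered; otherwise m of them are sampled without replacement
    (assumed behaviour, following the usual random forest convention).
    """
    if len(available) <= m:
        return list(available)
    return rng.sample(list(available), m)

# The situation at the node above: one variable left, m=4.
print(candidate_variables(['Sunshine'], 4))  # ['Sunshine']
```

With only Sunshine left, no sampling takes place, which is why the output reports that all available variables are considered.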

            We add a child node [Wind=Strong] to the node [Temperature=Warm]. This branch
            classifies two feature(s): [['Warm', 'Strong', 'Cloudy', 'No'], ['Warm',
            'Strong', 'Cloudy', 'No']]

            We would like to add children to the node [Wind=Strong].

            We have the following variable available ['Sunshine']. As there are fewer of them than
            the parameter m=4, we consider all of them. Of these, the variable with the highest
            information gain is the variable Sunshine. Therefore, we will branch the node further on
            this variable. We also remove this variable from the list of the available variables for the
            children of the current node. For the chosen variable Sunshine, all the remaining features
            have the same value: Cloudy. So, we end the branch with a leaf node. We add the leaf node
            [Play=No].
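
The stopping rule used here (every remaining data item has the same value for the chosen variable, so the branch ends in a leaf) might look like the sketch below. The helper `leaf_if_uniform` is hypothetical, the column layout [Temperature, Wind, Sunshine, Play] is inferred from the rows printed above, and labelling the leaf with the most common class is an assumption; in this branch both rows share the class No, so the choice makes no difference.

```python
from collections import Counter

SUNSHINE = 2  # assumed column index, inferred from the printed rows

def leaf_if_uniform(rows, col, target=-1):
    """Return a leaf label if all rows share one value in column `col`.

    Hypothetical helper: returns None when the rows still differ on the
    chosen variable, so the node would be branched further instead.
    """
    values = {row[col] for row in rows}
    if len(values) != 1:
        return None
    # Assumption: label the leaf with the most common class among the rows.
    label, _ = Counter(row[target] for row in rows).most_common(1)[0]
    return f"[Play={label}]"

# The two data items classified by the branch [Wind=Strong].
rows = [['Warm', 'Strong', 'Cloudy', 'No'],
        ['Warm', 'Strong', 'Cloudy', 'No']]
print(leaf_if_uniform(rows, SUNSHINE))  # [Play=No]
```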

            Now, we have added all the children nodes for the node [Temperature=Warm].

            We add a child node [Temperature=Hot] to the node [root]. This branch classifies three
            feature(s): [['Hot', 'Breeze', 'Cloudy', 'Yes'], ['Hot', 'Breeze',
            'Cloudy', 'Yes'], ['Hot', 'Breeze', 'Cloudy', 'Yes']]

            We would like to add children to the node [Temperature=Hot].

            We have the following variables available ['Wind', 'Sunshine']. As there are fewer of
            them than the parameter m=4, we consider all of them. Of these, the variable with the
            highest information gain is the variable Wind. Therefore, we will branch the node further
            on this variable. We also remove this variable from the list of the available variables for the
            children of the current node. For the chosen variable Wind, all the remaining features have
            the same value: Breeze. So, we end the branch with a leaf node. We add the leaf node
            [Play=Yes].
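
To see why Wind is reported as the variable with the highest information gain, note that the three data items in this branch are identical, so no split can reduce the entropy and both Wind and Sunshine have a gain of zero. The sketch below computes the gains under the standard entropy-based definition. The function names and the column mapping are hypothetical, and breaking the tie in favour of the first listed variable is an assumption that happens to agree with the output above.

```python
from collections import Counter
from math import log2

COLUMNS = {'Temperature': 0, 'Wind': 1, 'Sunshine': 2}  # assumed layout

def entropy(rows, target=-1):
    """Shannon entropy (in bits) of the class column."""
    total = len(rows)
    counts = Counter(row[target] for row in rows)
    return sum(-(c / total) * log2(c / total) for c in counts.values())

def information_gain(rows, col, target=-1):
    """Entropy of the rows minus the weighted entropy after splitting on col."""
    total = len(rows)
    groups = {}
    for row in rows:
        groups.setdefault(row[col], []).append(row)
    remainder = sum(len(g) / total * entropy(g, target) for g in groups.values())
    return entropy(rows, target) - remainder

# The three data items classified by the branch [Temperature=Hot].
rows = [['Hot', 'Breeze', 'Cloudy', 'Yes'] for _ in range(3)]
available = ['Wind', 'Sunshine']

gains = {v: information_gain(rows, COLUMNS[v]) for v in available}
# Both gains are zero; max() returns the first maximal item, here Wind.
best = max(available, key=gains.get)
print(best)  # Wind
```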

            Now, we have added all the children nodes for the node [root].







