Page 91 - Data Science Algorithms in a Week

Random Forest


            We have the following variable available ['water_temperature']. As there are fewer of
            them than the parameter m=3, we consider all of them. Of these, the variable with the
            highest information gain is the variable water_temperature. Therefore, we will branch
            the node further on this variable. We also remove this variable from the list of the available
            variables for the children of the current node. For the chosen variable
            water_temperature, all the remaining features have the same value: Cold. So, we end the
            branch with a leaf node. We add the leaf node [swim=No].

            We now add a child node [swimming_suit=None] to the node [root]. This branch
            classifies two feature(s): [['None', 'Warm', 'No'], ['None', 'Warm', 'No']].

            We would like to add children to the node [swimming_suit=None].

            We have the following variable available ['water_temperature']. As there are fewer of
            them than the parameter m=3, we consider all of them. Of these, the variable with the
            highest information gain is the variable water_temperature. Therefore, we will branch
            the node further on this variable. We also remove this variable from the list of the available
            variables for the children of the current node. For the chosen variable
            water_temperature, all the remaining features have the same value: Warm. So, we end the
            branch with a leaf node. We add the leaf node [swim=No].
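The selection step described above can be sketched in Python as follows. This is a minimal illustration, not the book's implementation: the helper names `entropy`, `information_gain`, and `choose_variable` are hypothetical, the class label is assumed to be the last element of each row, and sampling m candidates at random when more than m variables are available is an assumption about the general case, which this section does not show.

```python
import math
import random
from collections import Counter


def entropy(rows):
    """Shannon entropy (in bits) of the class label, stored last in each row."""
    counts = Counter(row[-1] for row in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, col):
    """Entropy reduction obtained by partitioning rows on column col."""
    groups = {}
    for row in rows:
        groups.setdefault(row[col], []).append(row)
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(rows) - remainder


def choose_variable(rows, available, m, rng=random):
    """Pick the candidate variable with the highest information gain.

    available maps a variable name to its column index. If no more than m
    variables are available, all are considered; otherwise m are sampled
    at random (assumed behaviour for the general case).
    """
    names = sorted(available)
    candidates = names if len(names) <= m else rng.sample(names, m)
    return max(candidates, key=lambda name: information_gain(rows, available[name]))


# Rows classified by the branch [swimming_suit=None], as listed above.
rows = [['None', 'Warm', 'No'], ['None', 'Warm', 'No']]
chosen = choose_variable(rows, {'water_temperature': 1}, m=3)
print(chosen)  # water_temperature
```

With a single variable available, the choice is trivial and the information gain is zero, because every row in the branch already has the same class.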

            We now add a child node [swimming_suit=Good] to the node [root]. This branch
            classifies three feature(s): [['Good', 'Cold', 'No'], ['Good', 'Cold', 'No'],
            ['Good', 'Cold', 'No']].

            We would like to add children to the node [swimming_suit=Good].

            We have the following variable available ['water_temperature']. As there are fewer of
            them than the parameter m=3, we consider all of them. Of these, the variable with the
            highest information gain is the variable water_temperature. Therefore, we will branch
            the node further on this variable. We also remove this variable from the list of the available
            variables for the children of the current node. For the chosen variable
            water_temperature, all the remaining features have the same value: Cold. So, we end the
            branch with a leaf node. We add the leaf node [swim=No].
            Now, we have added all the children nodes for the node [root].
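The branching on swimming_suit and then on water_temperature can be reproduced with a short sketch. It is an illustration under stated assumptions rather than the algorithm's actual code: `partition`, `leaf_label`, and `branches` are hypothetical helpers, the order of the two variables is fixed by hand instead of being chosen by information gain, and only the rows printed in this section are used, so other children of [root] are not represented.

```python
from collections import Counter, defaultdict


def partition(rows, col):
    """Group rows by the value found in column col, keeping first-seen order."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[col]].append(row)
    return dict(groups)


def leaf_label(rows):
    """Most common class label among rows (the label is the last element)."""
    return Counter(row[-1] for row in rows).most_common(1)[0][0]


def branches(rows):
    """List (swimming_suit, water_temperature, swim) for every path to a leaf."""
    paths = []
    for suit, by_suit in partition(rows, 0).items():
        for temp, by_temp in partition(by_suit, 1).items():
            paths.append((suit, temp, leaf_label(by_temp)))
    return paths


# Only the rows shown in this section for the None and Good branches.
rows = [
    ['None', 'Warm', 'No'], ['None', 'Warm', 'No'],
    ['Good', 'Cold', 'No'], ['Good', 'Cold', 'No'], ['Good', 'Cold', 'No'],
]
paths = branches(rows)
for suit, temp, swim in paths:
    print(f"[swimming_suit={suit}] -> [water_temperature={temp}] -> [swim={swim}]")
```

Each branch ends in a single leaf because all of its rows share the same water_temperature value and the same class.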











