Page 102 - Data Science Algorithms in a Week

Random Forest


            We would like to add children to the node [Wind=Breeze].

            The only variable still available is ['Sunshine']. As this is fewer than the
            parameter m=4, we consider all of the available variables. Of these, the variable
            with the highest information gain is Sunshine. Therefore, we will branch the node
            further on this variable. We also remove this variable from the list of the available
            variables for the children of the current node. For the chosen variable Sunshine, all
            the remaining features have the same value: Cloudy. So, we end the branch with a
            leaf node. We add the leaf node [Play=No].
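
The two decisions in the step above (which variables to consider, and when to stop with a leaf) can be sketched in Python. This is a minimal illustration, not the book's implementation: the function names `candidate_variables` and `leaf_label_if_constant` are hypothetical, the sample row is illustrative rather than taken from the text, and labeling the leaf with the majority class is an assumption.

```python
import random
from collections import Counter


def candidate_variables(available, m, rng=random):
    """Return every variable if at most m remain, else a random sample of m."""
    if len(available) <= m:
        return list(available)
    return rng.sample(list(available), m)


def leaf_label_if_constant(rows, col, class_col=-1):
    """Return a leaf label if all rows share one value in column `col`.

    Returns None when the column still has several values, meaning the
    node can be split further. The majority class is used as the label
    (an assumption made for this sketch).
    """
    values = {row[col] for row in rows}
    if len(values) > 1:
        return None
    return Counter(row[class_col] for row in rows).most_common(1)[0][0]


# Illustrative row only; columns are Temperature, Wind, Sunshine, Play.
rows = [['Cold', 'Breeze', 'Cloudy', 'No']]

print(candidate_variables(['Sunshine'], m=4))
print(leaf_label_if_constant(rows, col=2))
```

With one variable left and m=4, no sampling takes place, which matches the text's "we consider all of them".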

            Now, we have added all the children nodes for the node [Temperature=Cold].

            We add a child node [Temperature=Warm] to the node [root]. This branch classifies
            three feature(s): [['Warm', 'Strong', 'Cloudy', 'No'], ['Warm', 'Strong',
            'Cloudy', 'No'], ['Warm', 'Breeze', 'Sunny', 'Yes']]

            We would like to add children to the node [Temperature=Warm].

            The variables still available are ['Wind', 'Sunshine']. As there are fewer of them
            than the parameter m=4, we consider all of them. Of these variables, the one with
            the highest information gain is Wind. Thus, we will branch the node further on this
            variable. We also remove this variable from the list of the available variables for
            the children of the current node. Using the variable Wind, we partition the data in
            the current node, where each partition of the data will be for one of the new
            branches from the current node [Temperature=Warm]. We have the following partitions:

                      Partition for Wind=Breeze: [['Warm', 'Breeze', 'Sunny', 'Yes']]
                      Partition for Wind=Strong: [['Warm', 'Strong', 'Cloudy', 'No'],
                      ['Warm', 'Strong', 'Cloudy', 'No']]
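
The choice of Wind and the resulting partitions can be reproduced with a short Python sketch over the three rows of the node [Temperature=Warm]. The helper names (`entropy`, `partition`, `information_gain`) and the column mapping are assumptions made for illustration. On these rows Wind and Sunshine have the same information gain (about 0.918 bits), so the sketch breaks the tie by list order; how the original program breaks ties is not stated here.

```python
from collections import Counter, defaultdict
from math import log2

# Assumed column layout; the class label 'Play' is the last item of each row.
COLUMNS = {'Temperature': 0, 'Wind': 1, 'Sunshine': 2}


def entropy(rows):
    """Shannon entropy (in bits) of the class labels in `rows`."""
    counts = Counter(row[-1] for row in rows)
    total = len(rows)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def partition(rows, col):
    """Group rows by their value in column `col`."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[col]].append(row)
    return dict(parts)


def information_gain(rows, col):
    """Parent entropy minus the size-weighted entropy of the partitions."""
    parts = partition(rows, col)
    remainder = sum(len(p) / len(rows) * entropy(p) for p in parts.values())
    return entropy(rows) - remainder


warm = [
    ['Warm', 'Strong', 'Cloudy', 'No'],
    ['Warm', 'Strong', 'Cloudy', 'No'],
    ['Warm', 'Breeze', 'Sunny', 'Yes'],
]
available = ['Wind', 'Sunshine']

gains = {v: information_gain(warm, COLUMNS[v]) for v in available}
best = max(available, key=gains.get)  # first variable wins a tie

print(gains)
print(best)
print(partition(warm, COLUMNS[best]))
```

Both partitions are pure (one contains only Yes, the other only No), so each will end in a leaf node without further splitting.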
            Now, given the partitions, let us form the branches and the child nodes.

            We add a child node [Wind=Breeze] to the node [Temperature=Warm]. This branch
            classifies one feature(s): [['Warm', 'Breeze', 'Sunny', 'Yes']]

            We would like to add children to the node [Wind=Breeze].











