Page 101 - Data Science Algorithms in a Week

Random Forest


            Now, given the partitions, let us create the branches and the child nodes.

            We add a child node [Temperature=Cold] to the node [root]. This branch classifies four
            feature(s): [['Cold', 'Breeze', 'Cloudy', 'No'], ['Cold', 'None', 'Sunny',
            'Yes'], ['Cold', 'Breeze', 'Cloudy', 'No'], ['Cold', 'Breeze', 'Cloudy',
            'No']].

            We would like to add children to the node [Temperature=Cold].

            We have the following variables available ['Wind', 'Sunshine']. As there are fewer of
            them than the parameter m=4, we consider all of them. Of these, the variable with the
            highest information gain is the variable Wind. Therefore, we will branch the node further
            on this variable. We also remove this variable from the list of the available variables for the
            children of the current node. Using the variable Wind, we partition the data in the
            current node as follows:

                      Partition for Wind=None: [['Cold', 'None', 'Sunny', 'Yes']]
                      Partition for Wind=Breeze: [['Cold', 'Breeze', 'Cloudy', 'No'],
                      ['Cold', 'Breeze', 'Cloudy', 'No'], ['Cold', 'Breeze',
                      'Cloudy', 'No']]
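
The selection and partitioning step above can be sketched in Python. This is a minimal illustration, not the book's program: the column order in COLUMNS is inferred from the rows shown, and the helper names (entropy, partition, information_gain, best_variable) are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Assumed column order, inferred from the rows printed in the text.
COLUMNS = ['Temperature', 'Wind', 'Sunshine', 'Play']

def entropy(rows):
    """Shannon entropy in bits of the class label (the last column)."""
    counts = Counter(row[-1] for row in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def partition(rows, variable):
    """Group the rows by their value for the given variable."""
    col = COLUMNS.index(variable)
    groups = defaultdict(list)
    for row in rows:
        groups[row[col]].append(row)
    return dict(groups)

def information_gain(rows, variable):
    """Entropy of the node minus the weighted entropy of its partitions."""
    groups = partition(rows, variable)
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(rows) - remainder

def best_variable(rows, candidates):
    """Candidate with the highest gain; max() keeps the first one on a tie."""
    return max(candidates, key=lambda v: information_gain(rows, v))

# The four rows classified by the branch [Temperature=Cold].
cold_rows = [
    ['Cold', 'Breeze', 'Cloudy', 'No'],
    ['Cold', 'None', 'Sunny', 'Yes'],
    ['Cold', 'Breeze', 'Cloudy', 'No'],
    ['Cold', 'Breeze', 'Cloudy', 'No'],
]

chosen = best_variable(cold_rows, ['Wind', 'Sunshine'])
parts = partition(cold_rows, chosen)
for value, group in parts.items():
    print('Partition for {}={}: {}'.format(chosen, value, group))
```

On these four rows, Wind and Sunshine separate the classes equally well, so their gains are equal (about 0.811 bits). The sketch returns Wind only because it is listed first among the candidates.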

            Now, given the partitions, let us create the branches and the child nodes.
            We add a child node [Wind=None] to the node [Temperature=Cold]. This branch
            classifies one feature(s): [['Cold', 'None', 'Sunny', 'Yes']]

            We would like to add children to the node [Wind=None].

            We have the following variable available: ['Sunshine']. As there are fewer of them than
            the parameter m=4, we consider all of them. Of these, the variable with the highest
            information gain is the variable Sunshine. Therefore, we will branch the node further on
            this variable. We also remove this variable from the list of the available variables for the
            children of the current node. For the chosen variable Sunshine, all the remaining features
            have the same value: Sunny. So, we end the branch with a leaf node. We add the leaf node
            [Play=Yes].
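
The two decisions in this step, which variables to consider and when to stop with a leaf, can be sketched as follows. This is an illustration under stated assumptions, not the book's program. The helper names are hypothetical, and two behaviours are assumed rather than taken from the text: that m variables are sampled at random when more than m are available, and that the leaf takes the most common class label.

```python
import random
from collections import Counter

# Assumed column order, inferred from the rows printed in the text.
COLUMNS = ['Temperature', 'Wind', 'Sunshine', 'Play']

def candidate_variables(available, m, rng=random):
    """Consider every variable if there are at most m of them.

    Otherwise draw m of them at random (assumed behaviour for the
    case with more than m variables, which this trace does not show).
    """
    if len(available) <= m:
        return list(available)
    return rng.sample(available, m)

def is_single_valued(rows, variable):
    """True if every row has the same value for the variable."""
    col = COLUMNS.index(variable)
    return len({row[col] for row in rows}) == 1

def leaf_label(rows):
    """Most common class label among the rows (assumed leaf rule)."""
    return Counter(row[-1] for row in rows).most_common(1)[0][0]

# The single row classified by the branch [Wind=None].
rows = [['Cold', 'None', 'Sunny', 'Yes']]

candidates = candidate_variables(['Sunshine'], m=4)
chosen = candidates[0]  # only one candidate, so no gain comparison is needed
if is_single_valued(rows, chosen):
    leaf = '{}={}'.format(COLUMNS[-1], leaf_label(rows))
    print('We add the leaf node [{}].'.format(leaf))
```

With a single row, the single-value check always succeeds, so the branch [Wind=None] ends in a leaf straight away.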

            We add a child node [Wind=Breeze] to the node [Temperature=Cold]. This branch
            classifies three feature(s): [['Cold', 'Breeze', 'Cloudy', 'No'], ['Cold',
            'Breeze', 'Cloudy', 'No'], ['Cold', 'Breeze', 'Cloudy', 'No']]






