S_small = {(small,cold,no),(small,warm,no)}

S_good = {(good,cold,no),(good,warm,yes)}
The information entropy of S is E(S) = -(1/6)*log₂(1/6) - (5/6)*log₂(5/6) ~ 0.65002242164.
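This number is easy to verify in Python. The following is a minimal sketch; the entropy helper and the use of collections.Counter are our own illustration, not code from this book:

import math
from collections import Counter

def entropy(labels):
    # Information entropy, in bits, of a list of class labels.
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

# The class column of the six instances in S: five no, one yes.
print(entropy(['no', 'no', 'no', 'no', 'no', 'yes']))  # ~0.65002242164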
The information entropies of the partitions are:

E(S_none) = -(2/2)*log₂(2/2) = -log₂(1) = 0 since all instances have the class no.

E(S_small) = 0 for a similar reason.

E(S_good) = -(1/2)*log₂(1/2) - (1/2)*log₂(1/2) = 1
            Therefore, the information gain is:

IG(S,swimming suit) = E(S) - [(2/6)*E(S_none) + (2/6)*E(S_small) + (2/6)*E(S_good)]

= 0.65002242164 - (1/3) = 0.3166890883
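To confirm the arithmetic, here is a small sketch of the full information-gain computation; the function information_gain and the tuple encoding of the instances are our own illustration, and it reuses the entropy helper sketched above:

def information_gain(data, attr_index):
    # Gain from partitioning the rows of data by the attribute at attr_index;
    # the class label is assumed to be the last element of each row.
    labels = [row[-1] for row in data]
    partitions = {}
    for row in data:
        partitions.setdefault(row[attr_index], []).append(row[-1])
    remainder = sum(len(part) / len(data) * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder

S = [('none', 'cold', 'no'), ('none', 'warm', 'no'),
     ('small', 'cold', 'no'), ('small', 'warm', 'no'),
     ('good', 'cold', 'no'), ('good', 'warm', 'yes')]

print(information_gain(S, 0))  # swimming suit: ~0.3166890883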

            If we chose the attribute water temperature to partition the set S, what would be the
            information gain IG(S,water temperature)? The water temperature partitions the set S into
            the following sets:
S_cold = {(none,cold,no),(small,cold,no),(good,cold,no)}

S_warm = {(none,warm,no),(small,warm,no),(good,warm,yes)}

            Their entropies are:

E(S_cold) = 0 as all instances are classified as no.

E(S_warm) = -(2/3)*log₂(2/3) - (1/3)*log₂(1/3) ~ 0.91829583405
            Therefore, the information gain from partitioning the set S by the attribute water
            temperature is:

IG(S,water temperature) = E(S) - [(1/2)*E(S_cold) + (1/2)*E(S_warm)]

= 0.65002242164 - 0.5*0.91829583405 = 0.19087450461
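Running the same sketch on the second attribute reproduces this value:

print(information_gain(S, 1))  # water temperature: ~0.1908745046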
This is less than IG(S,swimming suit). Therefore, we gain more information about the set
S (that is, about the classification of its instances) by partitioning it by the attribute
swimming suit than by the attribute water temperature. This finding is the basis of the ID3
algorithm for constructing a decision tree, covered in the next section.
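Since ID3 branches on the attribute with the largest information gain at each node, the choice made in this example reduces to a single comparison. A sketch, reusing the helpers above; the attribute-to-column mapping is our own:

attributes = {'swimming suit': 0, 'water temperature': 1}
best = max(attributes, key=lambda name: information_gain(S, attributes[name]))
print(best)  # swimming suit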


