Decision Trees
Note that the information entropy of the multisets with more than two classes is greater than 1, so we need more than one bit of information to represent the result. But is this true of every multiset that has more than two classes of elements?
2. E({10% of heads, 90% of tails}) = -0.1*log₂(0.1) - 0.9*log₂(0.9) = 0.46899559358
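To make this calculation concrete, the following short Python sketch (our own illustration, not code from the chapter; the function name entropy is assumed) computes the information entropy of a distribution given as class probabilities:

    import math

    def entropy(probabilities):
        # Information entropy in bits of a distribution given as probabilities.
        return -sum(p * math.log(p, 2) for p in probabilities if p > 0)

    # The biased coin from step 2: 10% heads, 90% tails.
    print(entropy([0.1, 0.9]))        # ~0.46899559358
    # A uniform three-class multiset needs more than one bit, as noted above.
    print(entropy([1/3, 1/3, 1/3]))   # ~1.58496250072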
3. a) The information gains for the three attributes are as follows (a sketch of how such a gain can be computed appears at the end of this answer):
IG(S,temperature)=0.0954618442383
IG(S,wind)=0.0954618442383
IG(S,season)=0.419973094022
b) Therefore, we would choose the attribute season to branch on at the root
node, as it has the highest information gain. Alternatively, we can put all the
input data into the program to construct a decision tree:
Root
├── [Season=Autumn]
│   ├── [Wind=Breeze]
│   │   └── [Play=Yes]
│   ├── [Wind=Strong]
│   │   └── [Play=No]
│   └── [Wind=None]
│       └── [Play=Yes]
├── [Season=Summer]
│   ├── [Temperature=Hot]
│   │   └── [Play=Yes]
│   └── [Temperature=Warm]
│       └── [Play=Yes]
├── [Season=Winter]
│   └── [Play=No]
└── [Season=Spring]
    ├── [Temperature=Hot]
    │   └── [Play=No]
    ├── [Temperature=Warm]
    │   └── [Play=Yes]
    └── [Temperature=Cold]
        └── [Play=Yes]
c) According to the constructed decision tree, we would classify the data
sample (warm, strong, spring, ?) into the class Play=Yes by taking the
bottommost branch (Season=Spring) from the root node and then arriving at the
leaf node via the middle branch (Temperature=Warm). A sketch of this traversal
in code also follows at the end of this answer.
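As referenced in a), here is a minimal Python sketch of how an information gain can be computed for one attribute. The helper names (entropy_of, information_gain), the default target name play, and the dictionary-based row format are our own assumptions for illustration, not the chapter's program:

    import math
    from collections import Counter

    def entropy_of(labels):
        # Information entropy in bits of a list of class labels.
        counts = Counter(labels)
        total = len(labels)
        return -sum((c / total) * math.log(c / total, 2) for c in counts.values())

    def information_gain(records, attribute, target="play"):
        # IG(S, attribute) = E(S) - sum over values v of |S_v|/|S| * E(S_v).
        base = entropy_of([r[target] for r in records])
        remainder = 0.0
        for value in set(r[attribute] for r in records):
            subset = [r[target] for r in records if r[attribute] == value]
            remainder += len(subset) / len(records) * entropy_of(subset)
        return base - remainder

    # With the chapter's table loaded as a list of dicts, for example
    # records = [{"temperature": "warm", "wind": "strong",
    #             "season": "spring", "play": "yes"}, ...],
    # information_gain(records, "season") should come out highest, as in a).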
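For c), a nested-dictionary encoding of the tree above makes the traversal explicit. The structure and the classify helper are a sketch of ours under that assumed representation, not the chapter's code:

    # Each internal node maps one attribute's values to subtrees; leaves are the Play label.
    tree = {
        "season": {
            "autumn": {"wind": {"breeze": "yes", "strong": "no", "none": "yes"}},
            "summer": {"temperature": {"hot": "yes", "warm": "yes"}},
            "winter": "no",
            "spring": {"temperature": {"hot": "no", "warm": "yes", "cold": "yes"}},
        }
    }

    def classify(node, sample):
        # Descend through the tree until we reach a leaf (a plain label).
        while isinstance(node, dict):
            attribute = next(iter(node))
            node = node[attribute][sample[attribute]]
        return node

    print(classify(tree, {"temperature": "warm", "wind": "strong", "season": "spring"}))  # yes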