Page 89 - Data Science Algorithms in a Week

Random Forest


            Swim preference - analysis with random forest

            We will use the swim preference example from the previous chapter. We have the same
            data table:

             Swimming suit   Water temperature   Swim preference
             None            Cold                No
             None            Warm                No
             Small           Cold                No
             Small           Warm                No
             Good            Cold                No
             Good            Warm                Yes

            We would like to construct a random forest from this data and use it to classify an item
            (Good,Cold,?).

            Analysis:

            We are given M=3 variables according to which a feature can be classified. In a random
            forest algorithm, we usually do not use all M variables to form tree branches at each
            node; we use only m of them, where m is less than or equal to M. The greater m is, the
            stronger the classifier in each constructed tree. However, as mentioned earlier, more
            data leads to more bias. Because we use multiple trees (with smaller m), even if each
            constructed tree is a weak classifier, their combined classification accuracy is strong.
            To reduce bias in a random forest, we may want to choose m slightly less than M.
            We choose the maximum number of variables considered at a node to be
            m=min(M,math.ceil(2*math.sqrt(M)))=min(3,math.ceil(2*math.sqrt(3)))=min(3,4)=3.
            For M=3 the formula is capped at M, so every variable may be considered at each node.
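
The choice of m can be checked with a few lines of Python. This is a minimal sketch, not the book's own program: the helper names choose_m and sample_variables are hypothetical, and drawing the m variables uniformly without replacement is an assumption about how the subset at a node might be picked.

```python
import math
import random


def choose_m(M):
    """Maximum number of variables considered at a node: min(M, ceil(2*sqrt(M)))."""
    return min(M, math.ceil(2 * math.sqrt(M)))


def sample_variables(M, m, rng):
    """Pick m distinct variable indices out of range(M) (assumed: uniform, no replacement)."""
    return sorted(rng.sample(range(M), m))


M = 3
m = choose_m(M)
print(m)  # 3, since ceil(2 * sqrt(3)) = 4 and min(3, 4) = 3

rng = random.Random(0)  # fixed seed only so that the run is repeatable
print(sample_variables(M, m, rng))  # [0, 1, 2], because m equals M here
```

With larger M the cap no longer applies; for example, choose_m(16) gives 8, so only half of the variables would be candidates at each node.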

            We are given the following features:
                [['None', 'Cold', 'No'], ['None', 'Warm', 'No'], ['Small', 'Cold', 'No'],
                ['Small', 'Warm', 'No'], ['Good', 'Cold', 'No'], ['Good', 'Warm', 'Yes']]







