Page 120 - Data Science Algorithms in a Week

Clustering into K Clusters


            The red cluster, containing the points (155,46), (164,53), (162,52), and (166,55), will have
            the centroid ((155+164+162+166)/4, (46+53+52+55)/4) = (161.75, 51.5).
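
The centroid arithmetic above can be checked with a few lines of Python. This is only a minimal sketch; the function name `centroid` is an illustrative choice and is not taken from the book's code.

```python
def centroid(points):
    """Return the coordinate-wise mean of a list of (height, weight) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

# Points of the red cluster as listed in the text.
red_cluster = [(155, 46), (164, 53), (162, 52), (166, 55)]
red_centroid = centroid(red_cluster)
print(red_centroid)  # (161.75, 51.5)
```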

            When we reclassify the points using the new centroids, their classes do not change. The
            blue cluster will have the points (180,75), (174,71), (184,83), (168,63), (178,70), (170,59),
            (172,60). The red cluster will have the points (155,46), (164,53), (162,52), (166,55). Since no
            point changes its cluster, the algorithm terminates with the clusters displayed in image 5.2:

                                         Image 5.2: Clustering of people by their height and weight
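
The termination check described above can be sketched in Python: compute the centroid of each cluster, reassign every point to its nearest centroid, and confirm that no assignment changes. This sketch assumes Euclidean distance on the unscaled heights and weights. The blue centroid is not stated on this page, so it is computed here from the listed blue points, and the helper names are illustrative.

```python
import math

def centroid(points):
    """Return the coordinate-wise mean of a list of (height, weight) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def nearest(point, centroids):
    """Return the label of the centroid closest to point (Euclidean distance)."""
    return min(
        centroids,
        key=lambda label: math.hypot(point[0] - centroids[label][0],
                                     point[1] - centroids[label][1]),
    )

# Cluster memberships as listed in the text.
clusters = {
    "blue": [(180, 75), (174, 71), (184, 83), (168, 63),
             (178, 70), (170, 59), (172, 60)],
    "red": [(155, 46), (164, 53), (162, 52), (166, 55)],
}

# Recompute the centroids, then reassign each point to its nearest centroid.
centroids = {label: centroid(pts) for label, pts in clusters.items()}
reassigned = {label: [] for label in clusters}
for pts in clusters.values():
    for p in pts:
        reassigned[nearest(p, centroids)].append(p)

# The algorithm stops when the reassignment leaves every cluster unchanged.
converged = all(sorted(reassigned[l]) == sorted(clusters[l]) for l in clusters)
print(converged)  # True
```
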
            Now we would like to classify the instance (172,60) as male or female. The instance
            (172,60) is in the blue cluster, so it is similar to the other points in the blue cluster.
            Are the remaining points in the blue cluster more likely to be males or females? Of the 6
            remaining points, 5 are males and only 1 is a female. Since the majority of the points in the
            blue cluster are males and the person (172,60) is in the blue cluster as well, we classify the
            person with a height of 172 cm and a weight of 60 kg as a male.
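
The majority vote in the preceding paragraph can be expressed as a short Python sketch. This page gives only the counts (5 males and 1 female among the other six blue points), not which point has which gender, so the labels below are a plain list built from those counts rather than a per-point mapping. The variable names are illustrative.

```python
from collections import Counter

# Genders of the six other people in the blue cluster, known here only as counts.
blue_neighbor_genders = ["male"] * 5 + ["female"] * 1

# The unclassified person (172, 60) takes the most common gender in its cluster.
votes = Counter(blue_neighbor_genders)
predicted_gender, count = votes.most_common(1)[0]
print(predicted_gender, count)  # male 5
```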









