Page 119 - Data Science Algorithms in a Week

Clustering into K Clusters


            Let us apply the k-means clustering algorithm to our data. First, we pick the initial
            centroids. Let the first centroid be, for example, a person with a height of 180 cm and a
            weight of 75 kg, denoted as the vector (180,75). The point furthest away from
            (180,75) is (155,46), so that will be the second centroid.
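The choice of the second centroid can be checked with a minimal sketch. It assumes the data set consists of exactly the eleven points listed in this section; the names `points`, `euclidean_distance`, `first_centroid`, and `second_centroid` are illustrative and not taken from the book's code.

```python
import math

# Data points (height in cm, weight in kg) as listed in this section.
# Assumption: these eleven points are the complete data set.
points = [
    (180, 75), (174, 71), (184, 83), (168, 63), (178, 70), (170, 59),
    (172, 60), (155, 46), (164, 53), (162, 52), (166, 55),
]


def euclidean_distance(p, q):
    """Return the Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


first_centroid = (180, 75)

# The second centroid is the point furthest away from the first one.
second_centroid = max(
    points, key=lambda p: euclidean_distance(p, first_centroid)
)

print(second_centroid)  # (155, 46)
```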
            Using the Euclidean distance, the points closer to the first centroid (180,75) are
            (180,75), (174,71), (184,83), (168,63), (178,70), (170,59), and (172,60), so these points form the
            first cluster. The points closer to the second centroid (155,46) are (155,46), (164,53),
            (162,52), and (166,55), so these points form the second cluster. The current
            state of the two clusters is displayed in Image 5.1 below.
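The assignment step described above can be sketched as follows. This is an illustration only, again assuming the eleven points listed in this section are the whole data set; `closest_centroid_index` and `clusters` are hypothetical names.

```python
import math

# Data points (height in cm, weight in kg) as listed in this section.
points = [
    (180, 75), (174, 71), (184, 83), (168, 63), (178, 70), (170, 59),
    (172, 60), (155, 46), (164, 53), (162, 52), (166, 55),
]

# Initial centroids chosen in the text.
centroids = [(180, 75), (155, 46)]


def closest_centroid_index(point, centroids):
    """Return the index of the centroid nearest to point (Euclidean)."""
    distances = [
        math.hypot(point[0] - c[0], point[1] - c[1]) for c in centroids
    ]
    return distances.index(min(distances))


# Assign every point to the cluster of its nearest centroid.
clusters = [[] for _ in centroids]
for p in points:
    clusters[closest_centroid_index(p, centroids)].append(p)

for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

The closest call is (170,59): its distance to (180,75) is about 18.87, against about 19.85 to (155,46), so it still lands in the first cluster.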







































                                         Image 5.1: Clustering of people by their height and weight
            Let us recompute the centroids of the clusters. The blue cluster, with the points (180,75),
            (174,71), (184,83), (168,63), (178,70), (170,59), and (172,60), has the centroid
            ((180+174+184+168+178+170+172)/7, (75+71+83+63+70+59+60)/7) ≈ (175.14, 68.71).
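The centroid arithmetic can be verified with a short sketch that averages each coordinate over the cluster. The function name `centroid` and the variable `blue_cluster` are illustrative choices, not names from the book.

```python
# Points of the blue cluster as listed in the text.
blue_cluster = [
    (180, 75), (174, 71), (184, 83), (168, 63),
    (178, 70), (170, 59), (172, 60),
]


def centroid(cluster):
    """Return the coordinate-wise mean of a list of points."""
    n = len(cluster)
    # zip(*cluster) groups all heights together and all weights together.
    return tuple(sum(coords) / n for coords in zip(*cluster))


cx, cy = centroid(blue_cluster)
print(round(cx, 2), round(cy, 2))  # 175.14 68.71
```

The sums are 1226 for the heights and 481 for the weights, so the centroid is (1226/7, 481/7).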





