Page 119 - Data Science Algorithms in a Week
Clustering into K Clusters
Let us apply the k-means clustering algorithm to our data. First, we pick the initial
centroids. Let the first centroid be, for example, the person with height 180 cm and
weight 75 kg, written as the vector (180,75). The point furthest from (180,75) is
(155,46), so that will be the second centroid.
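The choice of the second centroid can be checked with a minimal sketch. It assumes the data set consists of exactly the eleven points listed in this section; the variable names are illustrative and not taken from the book's own code.

```python
import math

# The eleven (height in cm, weight in kg) points listed in this section.
points = [
    (180, 75), (174, 71), (184, 83), (168, 63), (178, 70), (170, 59),
    (172, 60), (155, 46), (164, 53), (162, 52), (166, 55),
]

first_centroid = (180, 75)

# Pick the point with the largest Euclidean distance from the first centroid.
second_centroid = max(points, key=lambda p: math.dist(p, first_centroid))

print(second_centroid)  # (155, 46)
```

Choosing the furthest point is only one way to seed the centroids; it spreads them apart, but it is sensitive to outliers.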
The points closer to the first centroid (180,75) than to the second, measured by Euclidean
distance, are (180,75), (174,71), (184,83), (168,63), (178,70), (170,59), (172,60). These points
form the first cluster. The points closer to the second centroid (155,46) are (155,46), (164,53),
(162,52), (166,55). These points form the second cluster. The current state of the two
clusters is shown in Image 5.1 below.
Image 5.1: Clustering of people by their height and weight
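The assignment step described above can be sketched as follows. This is an illustration under the assumption that the eleven points listed in the text are the whole data set; the helper `nearest_centroid` is a hypothetical name, not a function from the book.

```python
import math

# The eleven (height in cm, weight in kg) points listed in this section.
points = [
    (180, 75), (174, 71), (184, 83), (168, 63), (178, 70), (170, 59),
    (172, 60), (155, 46), (164, 53), (162, 52), (166, 55),
]
centroids = [(180, 75), (155, 46)]


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance).

    On a tie, min() keeps the lowest index; no ties occur in this data.
    """
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))


# Assign every point to the cluster of its nearest centroid.
clusters = [[] for _ in centroids]
for p in points:
    clusters[nearest_centroid(p, centroids)].append(p)

for centroid, members in zip(centroids, clusters):
    print(centroid, "->", members)
```

With these two centroids, the first cluster receives seven points and the second receives four, matching the lists in the text.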
Let us recompute the centroids of the clusters. The blue cluster, with the points (180,75),
(174,71), (184,83), (168,63), (178,70), (170,59), (172,60), has the centroid
((180+174+184+168+178+170+172)/7, (75+71+83+63+70+59+60)/7) ≈ (175.14,68.71).
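The centroid arithmetic can be reproduced with a short sketch: the new centroid is the coordinate-wise mean of the cluster's points. The function name `centroid` is an illustrative choice, not taken from the book.

```python
# The seven points of the first (blue) cluster, as listed in the text.
first_cluster = [
    (180, 75), (174, 71), (184, 83), (168, 63),
    (178, 70), (170, 59), (172, 60),
]


def centroid(cluster):
    """Return the coordinate-wise mean of a non-empty list of points."""
    n = len(cluster)
    # zip(*cluster) groups all heights together and all weights together.
    return tuple(sum(coords) / n for coords in zip(*cluster))


cx, cy = centroid(first_cluster)
print(round(cx, 2), round(cy, 2))  # 175.14 68.71
```

The heights sum to 1226 and the weights to 481, so dividing each by 7 gives the rounded values shown.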