Page 142 - Data Science Algorithms in a Week

Clustering into K Clusters


                             Step number 1: point_groups = [((2.0, 2.0), 0), ((2.0, 5.0),
                             0), ((10.0, 4.0), 1), ((3.0, 5.0), 0), ((7.0, 3.0), 1), ((5.0,
                             9.0), 0), ((2.0, 8.0), 0), ((4.0, 10.0), 0), ((7.0, 4.0), 1),
                             ((4.0, 4.0), 0), ((5.0, 8.0), 0), ((9.0, 3.0), 1)]
                             centroids = [(2.8333333333333335, 5.666666666666667),
                             (7.166666666666667, 5.166666666666667)]
                             Step number 2: point_groups = [((2.0, 2.0), 0), ((2.0, 5.0),
                             0), ((10.0, 4.0), 1), ((3.0, 5.0), 0), ((7.0, 3.0), 1), ((5.0,
                             9.0), 0), ((2.0, 8.0), 0), ((4.0, 10.0), 0), ((7.0, 4.0), 1),
                             ((4.0, 4.0), 0), ((5.0, 8.0), 0), ((9.0, 3.0), 1)]
                             centroids = [(3.375, 6.375), (8.25, 3.5)]
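
The centroids reported in the last step can be checked by hand: each centroid is the mean of the points assigned to its group. The following is a minimal sketch of that update step, not the code of k-means_clustering.py itself; the helper name recompute_centroids is an illustrative assumption, and the point_groups data is copied from the output above.

```python
from collections import defaultdict

# Final point_groups from the 2-cluster run above: ((x, y), cluster_index).
point_groups = [
    ((2.0, 2.0), 0), ((2.0, 5.0), 0), ((10.0, 4.0), 1), ((3.0, 5.0), 0),
    ((7.0, 3.0), 1), ((5.0, 9.0), 0), ((2.0, 8.0), 0), ((4.0, 10.0), 0),
    ((7.0, 4.0), 1), ((4.0, 4.0), 0), ((5.0, 8.0), 0), ((9.0, 3.0), 1),
]


def recompute_centroids(point_groups):
    """Return the mean point of each cluster, ordered by cluster index.

    Hypothetical helper for illustration; not taken from the book's script.
    """
    members = defaultdict(list)
    for point, group in point_groups:
        members[group].append(point)
    centroids = []
    for group in sorted(members):
        xs, ys = zip(*members[group])
        centroids.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return centroids


centroids = recompute_centroids(point_groups)
print(centroids)  # [(3.375, 6.375), (8.25, 3.5)]
```

Group 0 holds eight points and group 1 holds four, so the means are 27/8 and 51/8 for the first centroid and 33/4 and 14/4 for the second, which agrees with the centroids printed in step 2.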

                          Output for 3 clusters:

                             $ python k-means_clustering.py problem5_2b.csv 3 last
                             The total number of steps: 2
                             The history of the algorithm:
                             Step number 0: point_groups = [((2.0, 2.0), 0), ((2.0, 5.0),
                             0), ((10.0, 4.0), 1), ((3.0, 5.0), 0), ((7.0, 3.0), 1), ((5.0,
                             9.0), 2), ((2.0, 8.0), 2), ((4.0, 10.0), 2), ((7.0, 4.0), 1),
                             ((4.0, 4.0), 0), ((5.0, 8.0), 2), ((9.0, 3.0), 1)]
                             centroids = [(2.0, 2.0), (10.0, 4.0), (4.0, 10.0)]
                             Step number 1: point_groups = [((2.0, 2.0), 0), ((2.0, 5.0),
                             0), ((10.0, 4.0), 1), ((3.0, 5.0), 0), ((7.0, 3.0), 1), ((5.0,
                             9.0), 2), ((2.0, 8.0), 2), ((4.0, 10.0), 2), ((7.0, 4.0), 1),
                             ((4.0, 4.0), 0), ((5.0, 8.0), 2), ((9.0, 3.0), 1)]
                             centroids = [(2.75, 4.0), (8.25, 3.5), (4.0, 8.75)]

                          Output for 4 clusters:

                             $ python k-means_clustering.py problem5_2b.csv 4 last
                             The total number of steps: 2
                             The history of the algorithm:
                             Step number 0: point_groups = [((2.0, 2.0), 0), ((2.0, 5.0),
                             3), ((10.0, 4.0), 1), ((3.0, 5.0), 3), ((7.0, 3.0), 1), ((5.0,
                             9.0), 2), ((2.0, 8.0), 2), ((4.0, 10.0), 2), ((7.0, 4.0), 1),
                             ((4.0, 4.0), 3), ((5.0, 8.0), 2), ((9.0, 3.0), 1)]
                             centroids = [(2.0, 2.0), (10.0, 4.0), (4.0, 10.0), (3.0, 5.0)]
                             Step number 1: point_groups = [((2.0, 2.0), 0), ((2.0, 5.0),
                             3), ((10.0, 4.0), 1), ((3.0, 5.0), 3), ((7.0, 3.0), 1), ((5.0,
                             9.0), 2), ((2.0, 8.0), 2), ((4.0, 10.0), 2), ((7.0, 4.0), 1),
                             ((4.0, 4.0), 3), ((5.0, 8.0), 2), ((9.0, 3.0), 1)]
                             centroids = [(2.0, 2.0), (8.25, 3.5), (4.0, 8.75), (3.0,
                             4.666666666666667)]
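
The other half of each step, assigning every point to its nearest centroid, can be sketched in the same way. The code below assumes squared Euclidean distance and uses the illustrative helper names squared_distance and assign, which are not taken from k-means_clustering.py. It reproduces the grouping of the 4-cluster run from the initial centroids and then checks that the updated centroids leave the grouping unchanged.

```python
# The twelve points of the 4-cluster run, in the order they appear above.
points = [
    (2.0, 2.0), (2.0, 5.0), (10.0, 4.0), (3.0, 5.0), (7.0, 3.0), (5.0, 9.0),
    (2.0, 8.0), (4.0, 10.0), (7.0, 4.0), (4.0, 4.0), (5.0, 8.0), (9.0, 3.0),
]

# Centroids copied from step 0 and step 1 of the output above.
initial_centroids = [(2.0, 2.0), (10.0, 4.0), (4.0, 10.0), (3.0, 5.0)]
updated_centroids = [(2.0, 2.0), (8.25, 3.5), (4.0, 8.75),
                     (3.0, 4.666666666666667)]


def squared_distance(p, q):
    """Squared Euclidean distance (assumed metric for this sketch)."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def assign(points, centroids):
    """Pair each point with the index of its nearest centroid."""
    return [
        (p, min(range(len(centroids)),
                key=lambda i: squared_distance(p, centroids[i])))
        for p in points
    ]


groups_step0 = assign(points, initial_centroids)
groups_step1 = assign(points, updated_centroids)

print([group for _, group in groups_step0])
# [0, 3, 1, 3, 1, 2, 2, 2, 1, 3, 2, 1]
print(groups_step0 == groups_step1)  # True
```

The group indices match point_groups in step 0, and reassigning against the updated centroids moves no point, which is consistent with the run reporting only two steps.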






