Page 27 - Data Science Algorithms in a Week
P. 27

12                               Ramazan Ünlü

                              they  showed  both  samples  and  clusters  of  the  ensemble  simultaneously  as
                              vertices in the bipartite graph. In this graph, edges are only between clusters and
                              samples and there is no edge if the weight is zero meaning the sample does not
                              belong to the cluster. The final partition is obtained by using a graph similarity-
                              based algorithm.
                            Based  on  information  theory:  Information  theory  based  algorithms  define  the
                              ensembling problem as the finding median partition by a heuristic solution. In
                              these methods, the category utility  function is used to determine the similarity
                              measures between clusters. Within the context of clustering, the category utility
                              function (Gluck, 1989) can be defined as the partition quality scoring function. It
                              is  proved  that  this  function  is  same  as  within  cluster  variance  minimization
                              problem and it can be maximized by using k-means algorithm (Mirkin, 2001).
                              Using  k-means  algorithms,  on  the  other  hand,  bring  a  deficiency  which  is  the
                              necessity of determining the number of cluster as an initial parameter. Besides,
                              the  method  should  be  run  multiple  times  to  avoid  bad  local  minima.  For  the
                              methodological details and implementation of the method, readers can refer to
                              (Gluck, 1989; Topchy et al., 2005).
                            Based on local adaptation: Local adoption based algorithm combines multiple
                              partitions using locally adaptive clustering algorithm (LAC) which is proposed
                              by  (Domeniconi  et  al.,  2007)  with  different  parameters  initialization.  Weighty
                              similarity  partition  algorithm  (WSPA),  weighty  bipartite  partition  algorithm
                              (WBPA)  (Domeniconi  &  Al-Razgan,  2009),  and  weighted  subspace  bipartite
                              partitioning  algorithm  (WSPA).  To  obtain  final  partition,  each  method  uses  a
                              graph partitioning algorithm such as METIS. The strong restriction of these kinds
                              of methods is that LAC algorithms can be applied to only numerical data.
                            Based on kernel method: Weighted partition consensus via Kernels (WPCK) is
                              proposed by (Vega-Pons, Correa-Morris, & Ruiz-Shulcloper, 2010). This method
                              uses an intermediate step called Partition Relevance Analysis to assign weights to
                              represent the significance of the partition in the ensemble. Also, this approach
                              defines  the  consensus  clustering  via  the  median  partition  problem  by  using  a
                              kernel  function  as  the  similarity  measure  between  the  clusters  (Vega-Pons  &
                              Ruiz-Shulcloper,  2011).  Other  proposed  methods  using  the  same  idea  can  be
                              found  in  (Vega-Pons,  Correa-Morris,  &  Ruiz-Shulcloper,  2008;  Vega-Pons  &
                              Ruiz-Shulcloper, 2009).
                            Based on fuzzy theory: So far, it has been explained ensemble clustering methods
                              whose methodology is developed based on hard partitioning. However, the soft
                              partitioning might also work in various cases. There are clustering methods like
                              EM and fuzzy-c-means that produce soft partition or fuzzy partition of the data.
                              Thus, to combine fuzzy partition instead of hard ones as an internal step of the
   22   23   24   25   26   27   28   29   30   31   32