Page 122 - Data Science Algorithms in a Week

106             Loris Nanni, Sheryl Brahnam and Alessandra Lumini

tested for generalizability across several well-known benchmark datasets that reflect a diversity of classification problems. Our experiments show that when different approaches for transforming a vector into a matrix are combined with several texture descriptors, the resulting system works well on many different problems without requiring any ad hoc optimization. Moreover, because texture-based and standard vector-based descriptors preserve different aspects of the information available in patterns, our experiments demonstrate that the combination of the two improves overall classification performance. The MATLAB code for our proposed system will be publicly available to other researchers for future comparisons.


                       Keywords: two-dimensional representation


                                                     INTRODUCTION

   Most machine pattern recognition problems require the transformation of raw sensor data so that relevant features can be extracted for input into one or more classifiers. A common first step in machine vision, for instance, is to reshape the sensor matrix by concatenating its elements into a one-dimensional vector so that various feature transforms, such as principal component analysis (PCA) (Beymer & Poggio, 1996), can be applied that sidestep the curse of dimensionality by reducing the number of features without eliminating too much vital information. Reshaping the data matrix into a vector, however, is neither the only nor necessarily the best approach for representing raw input values [16]. One problem with vectorizing a data matrix is that it destroys some of the original structural knowledge (D. Li, Zhu, Wang, Chong, & Gao, 2016; H. Wang & Ahuja, 2005).
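The vectorize-then-PCA pipeline described above can be sketched in a few lines. This is a minimal illustration, not the authors' system: the data are random stand-ins for small sensor matrices, and PCA is computed directly via SVD. Note how, after `reshape`, vertically adjacent matrix elements end up far apart in the vector, which is the loss of structural knowledge the paragraph refers to.

```python
import numpy as np

# Hypothetical data: three 4x4 "sensor matrices" (stand-ins for images).
rng = np.random.default_rng(0)
images = rng.normal(size=(3, 4, 4))

# Step 1: vectorize -- concatenate each matrix's elements row by row into a
# one-dimensional vector. A 4x4 matrix becomes a 16-dimensional vector, and
# element (1, 0) is no longer adjacent to (0, 0): spatial structure is lost.
vectors = images.reshape(3, -1)              # shape (3, 16)

# Step 2: PCA on the vectorized patterns to reduce dimensionality.
centered = vectors - vectors.mean(axis=0)    # center each feature
# SVD of the centered data matrix; rows of Vt are the principal axes.
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
k = 2                                        # keep the top-2 components
reduced = centered @ Vt[:k].T                # shape (3, 2)

print(vectors.shape, reduced.shape)          # (3, 16) (3, 2)
```

Each 16-dimensional vectorized pattern is thus compressed to 2 features, which is the dimensionality reduction that makes the curse of dimensionality manageable for downstream classifiers.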
   In contrast to vectorization, direct manipulation of matrices offers a number of advantages, including an improvement in the performance of canonical transforms when applied to matrices, a significant reduction in computational complexity (Loris Nanni, Brahnam, & Lumini, 2012; Z. Wang, Chen, Liu, & Zhang, 2008), and enhanced discrimination using classifiers developed specifically to handle two-dimensional data (see, for example, (Z. Wang & Chen, 2008) and (Z. Wang et al., 2008)). Moreover, some of the most powerful state-of-the-art two-dimensional feature extraction methods, such as Gabor filters (Eustice, Pizarro, Singh, & Howland, 2002) and local binary patterns (LBP) (L. Nanni & Lumini, 2008; Ojala, Pietikäinen, & Mäenpää, 2002), and their variants, extract descriptors directly from matrices. Other methods, such as Two-Dimensional Principal Component Analysis (2DPCA) (Yang, Zhang, Frangi, & Yang, 2004) and Two-Dimensional Linear Discriminant Analysis (2DLDA) (J. Li, Janardan, & Li, 2002), allow classic transforms, such as PCA and Linear Discriminant Analysis (LDA) (Zhang, Jing, & Yang, 2006), to work directly on matrix data. By projecting matrix patterns via matrices, both 2DPCA and 2DLDA avoid the singular scatter matrix problem. Classifier systems that are designed to handle two-dimensional data include