Page 125 - Data Science Algorithms in a Week

Texture Descriptors for The Generic Pattern Classification Problem   109

                          5. FFT: the same procedure as DCT but, instead of a discrete cosine transform,
                       the fast Fourier transform (FFT), an efficient algorithm for computing the discrete
                       Fourier transform (DFT), is used. Like the DCT, the DFT decomposes a finite-length
                       vector into a sum of scaled-and-shifted basis functions. The difference lies in the
                       type of basis function used by each transform: the DCT uses only (real-valued)
                       cosine functions, whereas the DFT uses a set of harmonically related complex
                       exponential functions. After several tests, we obtained the best performance using
                       the first FFT coefficient (i.e., the sum of the values of the vector).
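
                       The remark that the first coefficient equals the sum of the vector can be checked
                       with a minimal sketch. It uses a naive DFT written out directly rather than a fast
                       algorithm, and the helper names `dft` and `first_coefficient` and the example vector
                       are illustrative assumptions, not taken from the chapter. An FFT routine computes
                       the same coefficients, only faster.

```python
import cmath


def dft(vector):
    """Naive O(n^2) discrete Fourier transform of a 1-D sequence.

    Illustrative only: an FFT computes the same coefficients faster.
    """
    n = len(vector)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(vector))
        for k in range(n)
    ]


def first_coefficient(vector):
    """Return the index-0 DFT coefficient.

    At frequency index 0 every basis value is exp(0) = 1, so the
    coefficient reduces to the plain sum of the entries.
    """
    return dft(vector)[0]


# Hypothetical example vector (not data from the chapter).
v = [1.0, 2.0, 3.0, 4.0]
coeff = first_coefficient(v)

# The real part matches the sum of the vector; the imaginary part is zero.
print(coeff.real, sum(v))
```

                       Because this coefficient has zero frequency, it carries no phase information: the
                       resulting feature is a single real number per vector.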

                          The following methods were used to describe a given matrix:
                          • Multiscale Local Phase Quantization (MLPQ) (Chan, Tahir, Kittler, &
                              Pietikainen, 2013; Ojansivu & Heikkila, 2008), with the neighborhood radius R
                              set to R=3 and R=5. MLPQ is a multiscale evolution of LPQ, a blur-robust
                              image descriptor. The main idea behind LPQ is to extract the phase
                              information in the frequency domain so that the descriptor is robust to
                              blur variation. The local phase information is extracted using a 2D
                              windowed Fourier transform on a local window surrounding each pixel
                              position. MLPQ is computed regionally and adopts a component-based
                              framework to maximize the insensitivity to misalignment, a phenomenon
                              frequently encountered in blurring. Regional features are combined using
                              kernel fusion;
                          • Complete local binary pattern (CLBP) (Guo, Zhang, & Zhang, 2010), with values
                              (R=1; P=8) and (R=2; P=16), where R is the radius and P is the number of
                              neighboring points. CLBP is a variant of LBP, an effective texture
                              descriptor used in various image processing and computer vision
                              applications. LBP is obtained from the neighboring region of a pixel by
                              thresholding the neighbors against the center pixel to generate a binary
                              number. LBP uses only the sign of each local difference and ignores its
                              magnitude. In the CLBP scheme, the local differences of the image are
                              decomposed into two complementary components: the signs and the
                              magnitudes. In our experiments we used two values of R and P and
                              concatenated the resulting descriptors;
                          • Histogram of Oriented Gradients (HoG) (Dalal & Triggs, 2005): HoG represents
                              an image by a set of local histograms that count occurrences of gradient
                              orientations in local subwindows of the image. The HoG descriptor is
                              extracted by computing the gradients of the image and then dividing the
                              image into small subwindows, with a histogram of gradient directions built
                              for each subwindow. In this work the input matrix is divided into 5×6
                              non-overlapping subwindows, and the gradient orientation histograms
                              extracted from each subwindow are first normalized, to achieve better
                              invariance to changes in illumination or shadowing, and then concatenated
                              to represent the original input matrix;
                          • Wavelet features (WAVE): a wavelet is a “small wave” which has its energy
                              concentrated in time. In image processing, wavelets are used as a transformation