Texture Descriptors for The Generic Pattern Classification Problem 109
5. FFT: the same procedure as DCT, except that the fast Fourier transform (FFT) replaces
the discrete cosine transform. Like the DCT, the discrete Fourier transform (DFT) computed
by the FFT decomposes a finite-length vector into a sum of scaled-and-shifted basis
functions. The two transforms differ in the type of basis function: the DCT uses only
real-valued cosine functions, whereas the DFT uses a set of harmonically related complex
exponential functions. After several tests, we obtained the best performance using the
first FFT coefficient (i.e., the sum of the values of the vector).
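
The remark that the first FFT coefficient equals the sum of the values of the vector can be checked with a minimal sketch. It uses a naive DFT written with the Python standard library (an FFT returns the same coefficients, only faster); the function names and the sample vector are hypothetical and are not taken from our experiments.

```python
import cmath

def dft(values):
    """Naive O(N^2) discrete Fourier transform of a finite-length vector."""
    n = len(values)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(values))
        for k in range(n)
    ]

def first_fft_coefficient(values):
    """Coefficient at frequency index 0, where every basis function equals 1."""
    return dft(values)[0]

vector = [0.5, 1.25, -2.0, 3.0]      # hypothetical input vector
dc = first_fft_coefficient(vector)   # same value as sum(vector), as a complex number
```

At index 0 every complex exponential reduces to 1, so the coefficient collapses to the plain sum and its imaginary part vanishes.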
The following methods were used to describe a given matrix:
Multiscale Local Phase Quantization (MLPQ) (Chan, Tahir, Kittler, &
Pietikainen, 2013; Ojansivu & Heikkila, 2008), where R, the radius of the
neighborhood, is set to R=3 and R=5. MLPQ is a multiscale evolution of LPQ, a
blur-robust image descriptor. The main idea behind LPQ is to extract the phase
information in the frequency domain so that the descriptor is robust to blur
variation. The local phase information is extracted using a 2D windowed Fourier
transform on a local window surrounding each pixel position. MLPQ is computed
regionally and adopts a component-based framework to maximize the
insensitivity to misalignment, a phenomenon frequently encountered in
blurring. Regional features are combined using kernel fusion;
Complete local binary pattern (CLBP) (Guo, Zhang, & Zhang, 2010): with values
(R=1; P=8) and (R=2; P=16), where R is the radius and P is the number of
neighbors. CLBP is a variant of LBP, which is an effective texture descriptor
used in various image processing and computer vision applications. LBP is
obtained from the neighboring region of a pixel by thresholding the neighbors
with the center pixel to generate a binary number. LBP uses only the sign
information of a local difference and ignores the magnitude information. In
the CLBP scheme, the local differences of the image are decomposed into two
complementary components: the signs and the magnitudes. In our experiments we
used two settings of R and P and concatenated the resulting descriptors.
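
The sign and magnitude decomposition described for CLBP can be illustrated with a minimal sketch for (R=1; P=8). It assumes that the eight neighbors are read directly from the square pixel grid without interpolation, and it shows only the decomposition and the plain LBP code, not the complete CLBP descriptor; all names and the 3×3 patch are hypothetical.

```python
# (row, col) offsets of the 8 neighbors at R=1, read from the square grid
# without interpolation (an assumption made to keep the sketch short).
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def local_differences(image, r, c):
    """Differences between each neighbor and the center pixel at (r, c)."""
    center = image[r][c]
    return [image[r + dr][c + dc] - center for dr, dc in NEIGHBORS]

def sign_and_magnitude(diffs):
    """Split local differences into the two complementary components."""
    signs = [1 if d >= 0 else 0 for d in diffs]   # used by LBP
    magnitudes = [abs(d) for d in diffs]          # ignored by LBP, kept by CLBP
    return signs, magnitudes

def lbp_code(signs):
    """Pack the thresholded neighbors into one binary number."""
    return sum(s << p for p, s in enumerate(signs))

image = [            # hypothetical 3x3 gray-level patch
    [5, 9, 1],
    [4, 6, 7],
    [2, 6, 3],
]
diffs = local_differences(image, 1, 1)
signs, magnitudes = sign_and_magnitude(diffs)
code = lbp_code(signs)
```

For this patch the signs are `[0, 1, 0, 1, 0, 1, 0, 0]` and the magnitudes are `[1, 3, 5, 1, 3, 0, 4, 2]`; each difference can be recovered from the pair as `(2 * s - 1) * m`.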
Histogram of Oriented Gradients (HoG) (Dalal & Triggs, 2005): HoG represents an
image by a set of local histograms that count occurrences of gradient
orientation in a local subwindow of the image. The HoG descriptor is extracted
by computing the gradients of the image and then dividing the image into small
subwindows, where a histogram of gradient directions is built for each
subwindow. In this work the input matrix is divided into 5×6 non-overlapping
subwindows; the gradient orientation histograms extracted from each subwindow
are first normalized, to achieve better invariance to changes in illumination
or shadowing, and then concatenated to represent the original input matrix;
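
The HoG steps above (gradients, 5×6 non-overlapping subwindows, per-subwindow histograms, normalization, concatenation) can be sketched as follows. This is an illustration under stated assumptions rather than the configuration used in our experiments: the 9 orientation bins, the unsigned orientations, the magnitude-weighted voting, the L2 normalization, and the test matrix are all hypothetical choices.

```python
import math

def gradients(image):
    """Central-difference gradients; border pixels reuse the nearest valid index."""
    rows, cols = len(image), len(image[0])
    mag = [[0.0] * cols for _ in range(rows)]
    ang = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            gx = image[r][min(c + 1, cols - 1)] - image[r][max(c - 1, 0)]
            gy = image[min(r + 1, rows - 1)][c] - image[max(r - 1, 0)][c]
            mag[r][c] = math.hypot(gx, gy)
            ang[r][c] = math.atan2(gy, gx) % math.pi  # unsigned orientation
    return mag, ang

def hog(image, grid_rows=5, grid_cols=6, n_bins=9):
    """Concatenate one normalized orientation histogram per subwindow."""
    mag, ang = gradients(image)
    rows, cols = len(image), len(image[0])
    features = []
    for i in range(grid_rows):
        r0, r1 = i * rows // grid_rows, (i + 1) * rows // grid_rows
        for j in range(grid_cols):
            c0, c1 = j * cols // grid_cols, (j + 1) * cols // grid_cols
            hist = [0.0] * n_bins
            for r in range(r0, r1):
                for c in range(c0, c1):
                    # Clamp the bin index to guard against rounding at the upper edge.
                    b = min(int(ang[r][c] / math.pi * n_bins), n_bins - 1)
                    hist[b] += mag[r][c]          # magnitude-weighted vote (assumption)
            norm = math.sqrt(sum(h * h for h in hist)) or 1.0  # L2 norm (assumption)
            features.extend(h / norm for h in hist)
    return features

# Hypothetical 10x12 input matrix, so each subwindow covers 2x2 entries.
image = [[(r * c) % 7 for c in range(12)] for r in range(10)]
features = hog(image)   # 5 * 6 * 9 = 270 values
```

Normalizing each histogram separately is what limits the effect of local illumination changes: scaling the values inside one subwindow scales its gradient magnitudes, and the division by the norm cancels that factor.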
Wavelet features (WAVE): a wavelet is a “small wave” which has its energy
concentrated in time. In image processing, wavelets are used as a transformation