Page 82 - Data Science Algorithms in a Week

66                        Olmer Garcia and Cesar Diaz

                       Pre-Processing and Data Augmentation

                          The input images to the neural network went through a few preprocessing steps to
                       help train the network. Pre-processing can include:

                            Resizing the image: A specific size is required. 32x32 is a good value based on
                              the literature.
                            Color Space Conversion: The images can be converted to grayscale if you think
                              that color does not matter for the classification, or changed from the RGB
                              (Red, Green, and Blue) space to another color space such as HSV (Hue,
                              Saturation, and Value). Other approaches include balancing the brightness
                              and contrast of the images.
                            Normalization: This part is very important because the algorithms in neural
                              networks work best when the data lie in some interval, normally between 0 and 1
                              or -1 and 1. A common way to do this is to zero-center each dimension and then
                              divide it by its standard deviation. This gives each feature a similar range
                              so that the gradients do not go out of control (Heaton, 2013).
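
                       The three steps above can be sketched in a few lines of NumPy. This is a minimal
                       illustration, not the authors' implementation: the function names, the
                       nearest-neighbor resizing, the luminance weights (0.299, 0.587, 0.114), and the
                       small `eps` constant are all assumptions made for the example.

```python
import numpy as np

def resize_nearest(image, size=(32, 32)):
    """Resize an H x W x C image to `size` with nearest-neighbor sampling."""
    h, w = image.shape[:2]
    rows = np.arange(size[0]) * h // size[0]   # source row for each output row
    cols = np.arange(size[1]) * w // size[1]   # source column for each output column
    return image[rows[:, None], cols]

def to_grayscale(image):
    """Convert an RGB image to grayscale with assumed luminance weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return image.astype(np.float64) @ weights

def standardize(batch, eps=1e-8):
    """Zero-center each pixel position over the batch, then divide by its std."""
    mean = batch.mean(axis=0)
    std = batch.std(axis=0)
    return (batch - mean) / (std + eps)        # eps avoids division by zero

def preprocess(images):
    """Resize, convert to grayscale, and standardize a list of RGB images."""
    gray = np.stack([to_grayscale(resize_nearest(img)) for img in images])
    return standardize(gray)
```

                       In practice the mean and standard deviation would be computed on the training
                       set only and then reused for the validation and test images.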

                          Unbalanced data, as shown in Figure 6, means that there are many more samples of
                       some traffic signs than of others. This could cause overfitting and/or other problems in
                       the learning process. One solution is to generate new images by taking some images at
                       random and changing them through a random combination of the following techniques:

                            Translation: Move the image horizontally or vertically by a few pixels around
                              the center of the image.
                            Rotation: Rotate the image by a random angle about the center of the image.
                            Affine transformations: Make a zoom over the image or change the perspective
                              of the image.
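
                       A random combination of translation, rotation, and zoom can be expressed as a
                       single warp about the image center. The sketch below is one possible way to do
                       it with NumPy alone; the function names, the parameter ranges (2 pixels, 15
                       degrees, 0.9 to 1.1 zoom), the nearest-neighbor sampling, and the edge
                       replication at the borders are assumptions for illustration. It does not cover
                       perspective changes.

```python
import numpy as np

def random_transform(rng, max_shift=2.0, max_angle_deg=15.0, zoom_range=(0.9, 1.1)):
    """Draw a random rotation and zoom (2 x 2 matrix) and a (row, col) shift."""
    angle = np.deg2rad(rng.uniform(-max_angle_deg, max_angle_deg))
    zoom = rng.uniform(*zoom_range)
    shift = rng.uniform(-max_shift, max_shift, size=2)
    c, s = np.cos(angle), np.sin(angle)
    forward = zoom * np.array([[c, -s], [s, c]])
    return forward, shift

def warp(image, forward, shift):
    """Apply out = forward @ (src - center) + center + shift by inverse mapping."""
    h, w = image.shape[:2]
    center = np.array([(h - 1) / 2.0, (w - 1) / 2.0])[:, None]
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    out_coords = np.stack([rows.ravel(), cols.ravel()]).astype(np.float64)
    # For every output pixel, find the source pixel it comes from.
    src = np.linalg.inv(forward) @ (out_coords - center - shift[:, None]) + center
    src = np.rint(src).astype(int)
    src[0] = np.clip(src[0], 0, h - 1)   # replicate edge pixels outside the image
    src[1] = np.clip(src[1], 0, w - 1)
    return image[src[0], src[1]].reshape(image.shape)

def augment(image, rng):
    """Return a randomly shifted, rotated, and zoomed copy of `image`."""
    forward, shift = random_transform(rng)
    return warp(image, forward, shift)
```

                       To balance the classes, `augment` could be called repeatedly on randomly chosen
                       images of the under-represented signs until each class has a similar number of
                       samples.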


                       Definition of an Initial CNN Architecture

                          A good way to start assembling your own deep neural network is to review the
                       literature and look for a deep learning architecture that has been used on a similar
                       problem. An early example is LeNet-5, the architecture presented by LeCun et al. (1998)
                       (Figure 7). Let's assume that we select LeNet-5. The first step is then to understand
                       LeNet-5, which is composed of 8 layers. LeNet-5 is explained as follows: