Page 44 - Data Science Algorithms in a Week
Using Deep Learning to Configure Parallel Distributed Discrete-Event Simulators
Hinton, Osindero, & Teh (2006) provided novel algorithms for training deep belief networks (DBNs) with multiple hidden layers. Their work introduced a greedy learning algorithm that trains the stack of restricted Boltzmann machines (RBMs) composing a DBN one layer at a time. The central concept in accurately training a DBN that extracts complex patterns in data is to find the matrix of synaptic connection weights that produces the smallest error for the training (input-data) vectors.

The fundamental learning blocks of a DBN are stacked restricted Boltzmann machines. The greedy algorithm proposed by Hinton et al. (2006) allows each RBM in the stack to process a different representation of the data. Each model transforms its input vectors non-linearly and generates output vectors that are then used as input for the next RBM in the sequence.
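
The layer-by-layer procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation of Hinton et al. (2006): each RBM is trained with one-step contrastive divergence (CD-1), hidden probabilities rather than sampled states are passed to the next layer, and the names `train_rbm`, `train_stack`, and `transform`, as well as the learning rate and epoch count, are hypothetical choices made for this example.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train_rbm(data, n_hidden, epochs=50, lr=0.1, rng=None):
    """Train one binary RBM with CD-1 (assumed here); returns (W, a, b)."""
    rng = np.random.default_rng(0) if rng is None else rng
    n_samples, n_visible = data.shape
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))  # connection weights
    a = np.zeros(n_visible)  # visible biases
    b = np.zeros(n_hidden)   # hidden biases
    for _ in range(epochs):
        # Positive phase: hidden activations driven by the training vectors.
        h_prob = sigmoid(data @ W + b)
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        v_recon = sigmoid(h_sample @ W.T + a)
        h_recon = sigmoid(v_recon @ W + b)
        # CD-1 update: data-driven statistics minus reconstruction statistics.
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / n_samples
        a += lr * (data - v_recon).mean(axis=0)
        b += lr * (h_prob - h_recon).mean(axis=0)
    return W, a, b


def train_stack(data, layer_sizes, **kwargs):
    """Greedy training: each RBM is fitted to the output of the one below it."""
    layers = []
    layer_input = data
    for n_hidden in layer_sizes:
        W, a, b = train_rbm(layer_input, n_hidden, **kwargs)
        layers.append((W, a, b))
        # The hidden activities of this RBM become the input of the next one.
        layer_input = sigmoid(layer_input @ W + b)
    return layers


def transform(data, layers):
    """Propagate data upward through the trained stack."""
    out = data
    for W, _, b in layers:
        out = sigmoid(out @ W + b)
    return out
```

For example, `train_stack(toy, [4, 3])` on a binary array `toy` with six columns would fit a 6-4 RBM first and then a 4-3 RBM on the first model's hidden activities; `transform(toy, layers)` then returns the top-level representation.
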
When RBMs are stacked, they form a composite generative model. RBMs are generative probabilistic models defined over visible (input) units and hidden (latent) units (Längkvist, Karlsson, & Loutfi, 2014). Zhang, Zhang, Ji, & Guo (2014) also define an RBM as a parameterized generative model representing a probability distribution. Figure 4 shows an RBM (at the lower level) with binary variables in the visible layer and stochastic binary variables in the hidden layer (Hinton et al., 2012). Visible units have no synaptic connections between them. Similarly, hidden units are not interconnected. This absence of hidden-hidden and visible-visible connections is what makes the Boltzmann machine restricted.
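
The effect of the restriction can be made concrete with the standard formulation of a binary RBM. The notation below is an assumption introduced for illustration and does not appear in the text above: $v_i$ and $h_j$ are the binary visible and hidden units, $W_{ij}$ is the weight between them, $a_i$ and $b_j$ are their biases, $Z$ is the normalizing constant, and $\sigma$ is the logistic function.

```latex
\begin{align}
E(\mathbf{v}, \mathbf{h}) &= -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i W_{ij} h_j \\
P(\mathbf{v}, \mathbf{h}) &= \frac{1}{Z} \exp\bigl(-E(\mathbf{v}, \mathbf{h})\bigr) \\
P(h_j = 1 \mid \mathbf{v}) &= \sigma\Bigl(b_j + \sum_i v_i W_{ij}\Bigr) \\
P(v_i = 1 \mid \mathbf{h}) &= \sigma\Bigl(a_i + \sum_j W_{ij} h_j\Bigr)
\end{align}
```

Because the energy contains no $v_i v_{i'}$ or $h_j h_{j'}$ terms, the hidden units are conditionally independent given the visible units, and vice versa, so each layer can be sampled in a single step.
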
During learning, the higher-level RBM (Figure 4) uses the data generated by the hidden activities of the lower-level RBM.
                       Figure 4: Two RBMs.





