Each node computes its output as y = f(Σ_i w_i x_i + b), where x_i is the value of each input to the node, the w_i are weight parameters that multiply each input, b is known as the bias parameter, and f(·) is known as the activation function. The commonly used activation functions are the sigmoid, the hyperbolic tangent, and the rectified linear unit (ReLU). Heaton (2015) proposes that while most current literature in deep learning suggests using the ReLU activation function exclusively, it is necessary to understand the sigmoid and hyperbolic tangent functions to see the benefits of ReLU.
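The node computation and the three activation functions can be sketched in a few lines of Python; the variable names and example values below are illustrative, not taken from the text.

```python
import numpy as np

# Minimal sketch of a single node: y = f(sum_i w_i * x_i + b).

def sigmoid(z):
    """Sigmoid activation: squashes z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    """Hyperbolic tangent: squashes z into (-1, 1)."""
    return np.tanh(z)

def relu(z):
    """Rectified linear unit: max(0, z)."""
    return np.maximum(0.0, z)

def node_output(x, w, b, f):
    """Weighted sum of the inputs plus bias, passed through activation f."""
    return f(np.dot(w, x) + b)

x = np.array([0.5, -1.2, 3.0])   # inputs x_i (illustrative values)
w = np.array([0.8, 0.1, -0.4])   # weights w_i
b = 0.2                          # bias

for f in (sigmoid, tanh, relu):
    print(f.__name__, node_output(x, w, b, f))
```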
Varying the weights and the bias varies the amount of influence any given input has on the output. The learning aspect of neural networks takes place through a process known as backpropagation, the most common training algorithm, developed in the 1980s. In the learning process, the network modifies the weights and biases to improve the network's output, as in any machine learning algorithm. Backpropagation is an optimization process that uses the chain rule of the derivative to minimize the error and thereby improve the output accuracy. The optimization is carried out by numerical methods, among which stochastic gradient descent (SGD) is the dominant scheme.
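As a rough illustration of how the chain rule and SGD combine, the following minimal sketch trains a single sigmoid node on a squared-error loss. The toy data, learning rate, and variable names are assumptions made for the example, not part of the text.

```python
import numpy as np

# Sketch: SGD on one sigmoid node with squared-error loss E = 0.5*(y - t)^2.
# The chain rule gives dE/dw_i = (y - t) * y * (1 - y) * x_i.
# Toy data and learning rate are illustrative only.

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                  # toy inputs
t = (X @ np.array([1.0, -2.0, 0.5]) > 0)       # toy binary targets
t = t.astype(float)

w = np.zeros(3)
b = 0.0
lr = 0.1                                       # learning rate

for epoch in range(20):
    for x, target in zip(X, t):
        y = 1.0 / (1.0 + np.exp(-(w @ x + b)))  # forward pass (sigmoid node)
        grad = (y - target) * y * (1.0 - y)     # chain rule (backpropagated error)
        w -= lr * grad * x                      # SGD update for the weights
        b -= lr * grad                          # SGD update for the bias
```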
Finally, the way in which nodes are connected defines the architecture of the neural
network. Some of the best-known algorithms are as follows:
Self-organizing maps (Kohonen, 1998): Unsupervised learning algorithm used for clustering problems, applied principally to extract structure from data in perception problems.
Feedforward artificial neural networks (Widrow & Lehr, 1990): Supervised learning algorithm used for classification and regression; it has been applied to robotics and vision problems. This architecture is very common in traditional neural networks (NNs) and was heavily used in the multilayer perceptron. Such networks can serve as universal function approximators.
Boltzmann machines (Hinton, Sejnowski, & Ackley, 1984): Supervised learning
algorithm that is used for classification and optimization problems. A Boltzmann
machine is essentially a fully connected two-layer neural network.
Hopfield neural networks (Hopfield, 1982): Supervised learning algorithm used for classification and optimization problems. It is a fully connected, single-layer, auto-associative network. It works well with incomplete or distorted patterns and can be used for optimization problems such as the traveling salesman problem.
Convolutional neural networks (CNNs): Although Fukushima (1980) introduced the concepts behind CNNs, many authors have since worked on them. LeCun et al. (1998) developed the LeNet-5 architecture, which has become one of the most widely accepted. A CNN is a supervised learning algorithm that maps its input onto 2D grids. CNNs have taken image recognition to a higher level of capability. This advance in CNNs is due to years of research on