Page 191 - Data Science Algorithms in a Week
P. 191
Predictive Analytics using Genetic Programming 175
GENETIC PROGRAMMING
Evolutionary algorithms are search and optimization procedures that are motivated
by the principles of natural genetics and natural selection (Deb, 2001). This concept was
first developed during the 70’s by John Holland and his students at the University of
Michigan, Ann Arbor (Deb, 1989). The goals of their research have been twofold: (1) to
abstract and rigorously explain the adaptive processes of natural systems, and (2) to
design artificial systems software that retains the important mechanics natural selection
(Goldberg, 1989). Eventually, this approach has led to important discoveries and
advancements in both natural and artificial systems science.
Over the last two decades, EAs have been extensively used as search and
optimization tools in various problem domains, including science, commerce and
engineering (Deb, 2001). These algorithms have been found to be very successful in
arriving at an optimal/near-optimal solution to complex optimization problems, where
traditional search techniques fail or converge to a local optimum solution. The primary
reasons for their success are their wide applicability, ease of use, and global dimension.
There are several variations of EAs. Blum et al. (2011) stated that standard EA includes a
set of principles and a common cycle. This set of principles is explained as follows:
1. Populations of individuals which represent solutions or strategies
2. Populations change dynamically due to the different “natural” processes of
reproduction and generations
3. An individual survives and reproduces according to the “advantages” given by its
level of fitness.
4. New generations resemble their parents but are not identical.
EAs follow a cycle similar to the one depicted in Figure 1. An initial population is
built based on random creation of individuals with their respective chromosomes. Some
individuals of this initial population can be generated using metaheuristics and other
optimization engineering schemes. The population is mapped from the genetic
representation (i.e., chromosome instance) to a fitness based one (representation required
to be assessed by the environment). That means that the particular individual needs to be
represented in a different way to obtain the value of the objective function (as given by
the assessment process). For example, a chromosome instance (representing a particular
individual) can represent now a discrete-event simulation program that needs to be
executed to obtain the value of the objective function. If the performance criterion is met,
this cycle (i.e., evolution) stops. Otherwise, the evolutionary cycle continues with the
generation of the next population. That means that after the values of the objective
function are obtained for each member of the population, the fitness values are
determined in a relative manner. The mating pool is formed by the members which have
the best relative fitness.