Page 193 - Data Science Algorithms in a Week
P. 193
Predictive Analytics using Genetic Programming 177
GP is basically a variation of the genetic algorithm and it follows the standards of
EAs as outlined above. The population of individuals in GP are computer programs. The
representation of individuals/computer programs are trees. These hierarchical or
structured trees are the population and they can have different sizes. Given the tree
representation, genetic operators such as tree crossover must be modified accordingly.
The following computer programs are parental programs as displayed in Figure 2.
The first one is: 0.25Y + X + 1.75 and using a LISP S-expression (+ (* 0.25 Y) (+ X
1.75)). The second program is: XY(X / 0.455Z) and using a LISP S-expression (* (* Y
X) (/ X (* 0.455 Z)). These parental programs have “point-labeled” trees with ordered
branches.
Figure 2: Two parental computer programs which are rooted with ordered branches.
The standard crossover operation of an EA in the case of GP creates offspring by the
exchange of subtrees. Subtrees can be considered subroutines, subprocedures or
subfunctions. These subtrees selected by GP are essentially for crossover (see Figure 3).
For example, it can select specific labeled sections of the parental computer programs of
Figure 2 and decide on a subtree from each parent to be used for the creation of offspring
(Figure 3). The first one is selected from point labeled 2 (randomly selected) of the first
parental computer program: 0.25Y and using a LISP S-expression (* 0.25 Y). The second
subtree is selected from point labeled 5 (randomly selected) of the second parental
computer program: X / 0.455Z and using a LISP S-expression (/ X (* 0.455 Z)).
Figure 3: Subtrees of two parental computer programs selected for crossover.