Page 203 - Data Science Algorithms in a Week
Predictive Analytics using Genetic Programming 187
Figure 12: Predicted responses for each decile (from top to bottom).
The GenIQ Response Model Tree in Figure 13 reflects the best model from the decile
table shown in Table 1. The model is expressed as a tree structure. The output of the
GenIQ Model is two-fold (Ratner, 2008): a graph known as a parse tree (as in Figure 13)
and the corresponding computer code (model equation). A parse tree is composed of
variables that are connected to other variables by functions (e.g., arithmetic
{+, -, /, ×}, trigonometric {sine, tangent, cosine}, Boolean {and, or, xor}). In this
case, the model predicts when to perform the overhaul. The model was very simple, and
its performance on the validation set (74%) was comparable to that of other models
based on neural networks trained with backpropagation.
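To make the parse-tree idea concrete, the following is a minimal sketch of how such a tree can be stored, evaluated, and printed as an equation. It is not the GenIQ implementation: the class names (Var, Const, Func), the variable names x1, x2, x3, the example tree, and the protected division (returning 1.0 on a zero denominator, a common genetic programming convention) are all assumptions made for illustration.

```python
import math
import operator
from dataclasses import dataclass
from typing import Dict, Tuple, Union


def protected_div(a: float, b: float) -> float:
    """Divide a by b, returning 1.0 when b is zero (assumed GP convention)."""
    return a / b if b != 0 else 1.0


# Function set mirroring the families named in the text:
# arithmetic, trigonometric, and Boolean.
FUNCTIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": protected_div,
    "sin": math.sin,
    "cos": math.cos,
    "tan": math.tan,
    "and": lambda a, b: float(bool(a) and bool(b)),
    "or": lambda a, b: float(bool(a) or bool(b)),
    "xor": lambda a, b: float(bool(a) != bool(b)),
}
INFIX = {"+", "-", "*", "/", "and", "or", "xor"}


@dataclass(frozen=True)
class Var:
    """Leaf node: an input variable looked up by name."""
    name: str


@dataclass(frozen=True)
class Const:
    """Leaf node: a numeric constant."""
    value: float


@dataclass(frozen=True)
class Func:
    """Internal node: a function applied to child subtrees."""
    op: str
    args: Tuple["Node", ...]


Node = Union[Var, Const, Func]


def evaluate(node: Node, row: Dict[str, float]) -> float:
    """Recursively evaluate the tree for one record of input values."""
    if isinstance(node, Var):
        return float(row[node.name])
    if isinstance(node, Const):
        return node.value
    return FUNCTIONS[node.op](*(evaluate(a, row) for a in node.args))


def to_equation(node: Node) -> str:
    """Render the tree as a model equation string."""
    if isinstance(node, Var):
        return node.name
    if isinstance(node, Const):
        return repr(node.value)
    parts = [to_equation(a) for a in node.args]
    if node.op in INFIX and len(parts) == 2:
        return f"({parts[0]} {node.op} {parts[1]})"
    return f"{node.op}({', '.join(parts)})"


# Hypothetical tree: (x1 * 2.0) + (x2 / x3)
tree = Func("+", (Func("*", (Var("x1"), Const(2.0))),
                  Func("/", (Var("x2"), Var("x3")))))
row = {"x1": 3.0, "x2": 8.0, "x3": 4.0}

print(to_equation(tree))    # ((x1 * 2.0) + (x2 / x3))
print(evaluate(tree, row))  # 8.0
```

The same tree yields both outputs described above: the graph (the nested Func, Var, and Const nodes) and the model equation (the string produced by to_equation). A genetic programming run would evolve many such trees and keep the one that scores best on the chosen fitness measure.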
Figure 13: Example of one of the earlier GP Models developed to calibrate the genetic process and the
generation of specific data. The model tries to predict when to do the overhaul.
After this moderate performance, the emphasis shifted to synthetic variables to be used
with neural networks. It was decided to develop a synthetic variable called the Quality
Index (the value obtained from thermography). This synthetic variable is
displayed in Figure 14. The GenIQ Response Model computer code (model equation) is