Page 199 - Data Science Algorithms in a Week
Predictive Analytics using Genetic Programming 183
predictive modeling system to find symptoms of damage, deterioration, or excessive wear
in future flights.
Figure 10: RCC is a lightweight heat-shielding material (NASA, 2008).
Over the years 2008 through 2011, NASA assembled a Tiger Team to study
potential issues with the shuttle’s Reinforced Carbon-Carbon (RCC) leading-edge panels
(Dale, 2008). The Tiger Team’s investigation generated large volumes of structured and
unstructured data on the RCC panels. These data could be used with different
methodologies to build analysis and prediction models. One of the methodologies studied
was genetic programming (GP).
USING GENETIC PROGRAMMING
We now explain in more detail step 6 of the framework outlined in the section
Complexity and Predictive Analytics. We assume that steps 1 to 5 have been
completed successfully (an effort that can take several months for this case study).
Knowledge Discovery and Predictive Modeling
Input engineering investigates which predictors are the most important. It comprises
several phases, one of which is attribute selection: choosing the most relevant attributes
by removing those that are redundant and/or irrelevant. The result is simpler models that
are easier to interpret and to which structural knowledge can be added.
Several filters can be used for this purpose, each with its own objective, such as:
Information Gain
Gain ratio
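
As an illustration of the first two filters named above, the following is a minimal sketch of how information gain and gain ratio can score categorical attributes against a class label. It is not the case study's actual pipeline: the function names, the attribute names (`wear_level`, `batch`, `shift`), and the four-row data set are all hypothetical and invented for this example.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def information_gain(attribute, labels):
    """Reduction in label entropy obtained by splitting on the attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(attribute, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the labels within each attribute value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(attribute, labels):
    """Information gain divided by the attribute's own entropy (split information)."""
    split_info = entropy(attribute)
    if split_info == 0:  # constant attribute: it cannot split the data
        return 0.0
    return information_gain(attribute, labels) / split_info


# Hypothetical toy data (not from the RCC study): one label per inspected item.
labels = ["damage", "damage", "ok", "ok"]
attributes = {
    "wear_level": ["high", "high", "low", "low"],  # separates the classes perfectly
    "batch": ["x", "x", "x", "x"],                 # constant, carries no information
    "shift": ["p", "q", "p", "q"],                 # varies, but is unrelated to the label
}

# Rank attributes by gain ratio, highest first.
ranking = sorted(attributes, key=lambda name: gain_ratio(attributes[name], labels), reverse=True)
for name in ranking:
    ig = information_gain(attributes[name], labels)
    gr = gain_ratio(attributes[name], labels)
    print(f"{name}: information gain = {ig:.3f}, gain ratio = {gr:.3f}")
```

In this toy data set only `wear_level` receives a non-zero score, so a filter with any positive threshold would keep it and drop the other two attributes. Gain ratio differs from information gain in that it penalizes attributes with many distinct values, which information gain alone tends to favor.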