Page 205 - Data Science Algorithms in a Week

Predictive Analytics using Genetic Programming            189

                                                       CONCLUSION


                          Our experience working with complex problems, incomplete data, and high noise
                       levels has given us a more comprehensive methodology in which machine
                       learning base-models can be used with other types of empirical and exact models. Data
                       science is very popular in the marketing domain, where first-principle models are not
                       common. However, the next frontier of big data analytics is to use information fusion,
                       also known as multi-source data fusion (Sala-Diakanda, Sepulveda & Rabelo, 2010). Hall
                       and  Llinas  (1997)  define  data  fusion  as  “a  formal  framework  in  which  are  expressed
                       means and tools for the alliance of data originating from different sources, with the aim
                       of  obtaining  information  of  greater  quality”.  Information  fusion  is  going  to  be  very
                       important to create predictive models for complex problems. AI paradigms such as GP
                       follow a “data fits the model” philosophy. This viewpoint has many advantages for
                       automatic programming and the future of predictive analytics.
                          As future research, we propose combining GP concepts with operations research and
                       operations management techniques to develop methodologies in which the data helps
                       create the model in support of prescriptive analytics (Bertsimas & Kallus, 2014). As
                       shown in this paper, these methodologies are applicable to decision problems. In
                       addition, the prescriptive analytics community is currently seeking better metrics for
                       measuring the efficiency of models, beyond the confusion matrix and decile tables.
                       Another important point for engineered systems is the use of model-based systems
                       engineering. SysML can be combined with ontologies to develop better GP
                       models (Rabelo & Clark, 2015). One point is clear: GP has the potential to outperform
                       regression/classification trees because its operator set is larger and includes the
                       operators used by regression/classification trees.
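
The closing claim about operators can be illustrated with a minimal sketch. It is not the GenIQ Model or any particular GP library; the function set, the `if_gt` name, the variables `x1` and `x2`, and all thresholds are assumptions chosen for illustration. A threshold test acts as a tree split, so a classification/regression tree is one special case of a GP expression, while arithmetic operators let GP express splits and leaf values that an axis-aligned tree cannot state directly.

```python
import operator

# Hypothetical function set (an assumption for illustration only):
# arithmetic operators plus "if_gt", which mimics a decision-tree split:
# if a > b then c else d.
FUNCTIONS = {
    "add": (2, operator.add),
    "sub": (2, operator.sub),
    "mul": (2, operator.mul),
    "if_gt": (4, lambda a, b, c, d: c if a > b else d),
}


def evaluate(node, row):
    """Recursively evaluate an expression tree on one data row (a dict)."""
    if isinstance(node, (int, float)):
        return node              # numeric constant
    if isinstance(node, str):
        return row[node]         # input variable looked up by name
    name, *children = node       # internal node: (function name, child, ...)
    arity, fn = FUNCTIONS[name]
    if len(children) != arity:
        raise ValueError(f"{name} expects {arity} arguments")
    return fn(*(evaluate(child, row) for child in children))


# A regression tree written with the split operator only:
# split on x1 > 5, then on x2 > 2, with constant leaves.
tree_like = ("if_gt", "x1", 5.0,
             ("if_gt", "x2", 2.0, 30.0, 20.0),
             10.0)

# An expression that also uses arithmetic: the split tests a linear
# combination of inputs, and the leaves are formulas, not constants.
gp_only = ("if_gt", ("add", "x1", ("mul", 2.0, "x2")), 5.0,
           ("mul", "x1", "x2"),
           ("sub", "x1", "x2"))

row = {"x1": 6.0, "x2": 1.0}
tree_prediction = evaluate(tree_like, row)
gp_prediction = evaluate(gp_only, row)
```

In this sketch both expressions are evaluated by the same routine; the only difference is which operators each one draws on, which is the sense in which the tree operators are contained in the larger GP operator set.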


                                                  ACKNOWLEDGMENTS


                          We would like to thank Dr. Bruce Ratner, who provided the GenIQ Model
                       for this project (www.GenIQModel.com). We would also like to thank the
                       NASA Kennedy Space Center (KSC). KSC is the best place to learn about complexity.
                          The  views  expressed  in  this  paper  are  solely  those  of  the  authors  and  do  not
                       necessarily reflect the views of NASA.


                                                       REFERENCES

                       Bertsimas, D., & Kallus, N. (2014). From predictive to prescriptive analytics. arXiv
                          preprint arXiv:1402.5481.