Page 333 - Data Science Algorithms in a Week

Mayra Bornacelli, Edgar Gutierrez and John Pastrana

   Figure 9 shows the performance of the neural network (NN) developed with the most
important variables identified by the sensitivity analysis. The network takes 8 variables
as input, has 10 neurons in a single hidden layer, and its output is the quarterly thermal
coal futures price in US$.
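The 8-10-1 architecture just described can be sketched as a forward pass. This is a minimal illustration only: the random weights, the tanh activation, and the example input are assumptions, since the chapter does not report the trained weights.

```python
import math
import random

# Sketch of the 8-10-1 network described above: 8 input variables,
# 10 hidden neurons in one hidden layer, and a single output (the
# quarterly thermal coal futures price in US$). Weights are random
# placeholders, not fitted values from the chapter.
random.seed(42)
N_IN, N_HIDDEN = 8, 10

# hidden_w[j]: weights from the 8 inputs (plus a bias) to hidden neuron j
hidden_w = [[random.uniform(-1, 1) for _ in range(N_IN + 1)]
            for _ in range(N_HIDDEN)]
output_w = [random.uniform(-1, 1) for _ in range(N_HIDDEN + 1)]

def predict(x):
    """Forward pass: tanh hidden layer, linear output neuron."""
    hidden = [math.tanh(sum(w * v for w, v in zip(ws, x + [1.0])))
              for ws in hidden_w]
    return sum(w * v for w, v in zip(output_w, hidden + [1.0]))

price = predict([0.5] * N_IN)   # one observation of the 8 input variables
```

With trained weights in place of the random ones, `predict` would return the forecast price for the quarter.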


                       Using Regression Trees to Predict the Price of Thermal Coal

   A second artificial intelligence paradigm, regression trees, was used to verify the
results obtained with neural networks. This provided a good opportunity to compare the
two methodologies. In regression trees, the objective is to model the dependence of a
response variable on one or more predictor variables. The analysis method MARS
(Multivariate Adaptive Regression Splines; Friedman, 1991) expresses the relationship
among a set of variables as a linear-combination equation, describing the problem in
terms of this equation and identifying its most influential variables. It is a non-parametric
regression technique: MARS is an extension of linear models that automatically models
nonlinearities and interactions between variables. The analysis determines the best
possible variable with which to split the data into separate sets. The splitting variable is
chosen to maximize the average "purity" of the two child nodes, and each node is
assigned a predicted outcome. This process is repeated recursively until no further split is
possible. The result is the maximum-sized tree, which fits the training data perfectly. The
next step is to prune the tree to create a generalized model that will work on outside data
sets. Pruning reduces the cost-complexity of the tree while maximizing its prediction
capability; the optimal tree is the one that provides the best prediction on outside data
sets with the least degree of complexity.
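The splitting step above can be illustrated with a toy example: for a single predictor, every threshold is tried and the one that minimizes the summed squared error of the two child nodes (i.e., maximizes their "purity") is kept, each child predicting its mean response. The data and helper names below are illustrative, not from the chapter.

```python
def sse(ys):
    """Sum of squared errors around the node mean (node impurity)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def best_split(xs, ys):
    """Try every threshold; keep the one with the purest child nodes."""
    best_t, best_score = None, float("inf")
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        score = sse(left) + sse(right)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

# Illustrative data with a low-price and a high-price regime.
xs = [1, 2, 3, 10, 11, 12]
ys = [5.0, 5.5, 5.2, 20.0, 21.0, 19.5]
threshold, impurity = best_split(xs, ys)
print(threshold)   # -> 10, separating the two regimes
```

A full tree would apply `best_split` recursively to each child until no further split is possible, then prune back, as the text describes.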
                          Models based on MARS have the following form:

f(X) = \alpha_0 + \sum_{m=1}^{M} \alpha_m h_m(X)                    (3)

where h_m(X) is a function from a set of candidate functions (which can include products
of two or more such functions), and the coefficients \alpha_m are obtained by minimizing
the residual sum of squares.
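The model form in Eq. (3) can be sketched directly: a prediction is the intercept \alpha_0 plus a weighted sum of basis functions h_m(X), where MARS builds the h_m from reflected pairs of hinge functions max(0, x - t) and max(0, t - x). The knot and coefficients below are illustrative assumptions, not fitted values from the chapter.

```python
def hinge(x, t):
    """Basis function max(0, x - t)."""
    return max(0.0, x - t)

def hinge_mirror(x, t):
    """Reflected partner max(0, t - x)."""
    return max(0.0, t - x)

# Candidate basis functions h_m (a reflected pair at an assumed
# knot t = 2.0) and their coefficients alpha_m.
basis = [lambda x: hinge(x, 2.0),
         lambda x: hinge_mirror(x, 2.0)]
alpha0 = 1.0
alphas = [0.5, -0.25]

def f(x):
    """Eq. (3): f(x) = alpha_0 + sum_m alpha_m * h_m(x)."""
    return alpha0 + sum(a * h(x) for a, h in zip(alphas, basis))

print(f(4.0))   # 1.0 + 0.5*max(0, 4-2) - 0.25*max(0, 2-4) = 2.0
```

Because each hinge is zero on one side of its knot, the fitted function is piecewise linear, which is how MARS captures nonlinearities within a linear-combination form.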
   The process of building a model with MARS is straightforward. The procedure
calculates a set of candidate functions using reflected pairs of basis functions; in
addition, the number of constraints/restrictions and the allowed degree of interaction
must be specified. A forward pass follows, in which new function products are tried to
see which ones decrease the training error. After the forward pass comes a backward
pass, which corrects the overfitting by pruning terms. Finally, generalized cross-
validation (GCV) is estimated in order to find the optimal number of terms in the model.
GCV is defined by:
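The GCV equation itself falls on the following page; as a sketch, the standard Friedman (1991) form is GCV(M) = (RSS/N) / (1 - C(M)/N)^2, where N is the number of observations and C(M) the effective number of parameters. The data below are illustrative.

```python
# GCV trades training error against model size: larger models fit
# better (smaller RSS) but pay a complexity penalty in the denominator.
# This uses the standard Friedman (1991) form; the residuals and
# effective parameter counts are illustrative assumptions.
def gcv(residuals, effective_params):
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return (rss / n) / (1.0 - effective_params / n) ** 2

small_model = gcv([1.0, -1.0, 0.5, -0.5], effective_params=1)
large_model = gcv([0.9, -0.9, 0.4, -0.4], effective_params=3)
print(small_model < large_model)  # penalty outweighs the small RSS gain
```

The backward pass deletes terms until the model with the lowest GCV is found, yielding the optimal number of terms.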