Page 328 - Data Science Algorithms in a Week
Predictive Analytics for Thermal Coal Prices Using Neural Networks … 309
Renewable energies, unlike coal, are not pollutants, and some countries have
developed and implemented them to a high degree. In practice, however, the
availability and cost/benefit of coal still make it the best choice for producing
energy in many countries. In the short term, renewable energy sources are unlikely
to be a major threat to coal prices.
With these results and other variables, such as the price of electricity, the cost of coal
transportation, and the oversupply in the market, we began collecting the data available
for a 25-year period. These data can be analyzed using neural networks and regression trees.
NEURAL NETWORKS AND REGRESSION TREES
Our goal was now to identify the most important variables and to justify them
using historical data. The Delphi process demonstrated the importance of both
quantitative and qualitative variables. We decided to use two techniques from the
data mining domain: Neural Networks and Classification/Regression Trees. Using the
variables that resulted from the Delphi process, we investigated 25 years of data at
quarterly intervals (the granularity at which the data were available). The data were
retrieved from institutions that collect statistical data on the coal market
(Finley, 2013; EIA, 2013; DANE, 2013). In addition, terms for seasonality and for
dependence on previous periods were added to the formulations.
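As an illustration of how seasonality and dependence on previous periods can enter a formulation, the following minimal Python sketch builds lagged values and quarter indicators from a quarterly series. The function name `make_features`, the number of lags, and the price values are assumptions made for this example; they are not the formulation or the data used in the study.

```python
def make_features(series, n_lags=4):
    """Build (features, targets) from a chronological quarterly series.

    series: list of (quarter, value) tuples, with quarter in 1..4.
    Each feature row holds the previous n_lags values (most recent first)
    followed by four one-hot seasonal indicators for the target quarter.
    The target is the change in value relative to the previous quarter.
    """
    rows, targets = [], []
    for i in range(n_lags, len(series)):
        quarter, value = series[i]
        lags = [series[i - k][1] for k in range(1, n_lags + 1)]
        season = [1.0 if quarter == q else 0.0 for q in (1, 2, 3, 4)]
        rows.append(lags + season)
        targets.append(value - series[i - 1][1])
    return rows, targets


# Made-up prices for illustration only (not data from the study).
prices = [(1, 50.0), (2, 52.0), (3, 51.0), (4, 55.0), (1, 58.0), (2, 57.0)]
X, y = make_features(prices, n_lags=4)
```

Each row of `X` pairs the four preceding quarterly values with an indicator of the quarter being predicted, and `y` holds the quarter-on-quarter price increment.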
Neural Networks
The analysis uses neural networks to determine the most important factors and to
build a series of predictive models. This study used supervised learning, in which
a database of known examples is used for learning (Singh & Chauhan, 2009). In
supervised learning, a neural network is adapted so that its outputs (μ) approach
the targets (t) from a historical dataset. The aim is to adjust the parameters of
the network so that it also performs well on samples from outside the training set.
The neural networks are trained with 120 input variables, which represent the
relevant factors and their values over sequential quarterly and annual cycles; the
output represents the increment in the price of thermal coal for the following
quarter. We have 95 data samples, of which 63 are used for training and validation
and 32 are used exclusively for prediction.
Figure 5 represents a generic diagram for a neural network with a feedforward
architecture.
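The supervised setup described above can be sketched in a few lines of plain Python. The sketch uses the sample counts and input dimension given in the text (95 samples, 63 for training and 32 held out, 120 inputs). Everything else is an assumption made for illustration: the single hidden layer of 4 tanh units, the learning rate, the number of epochs, and the synthetic data, which stands in for the study's dataset.

```python
import math
import random

random.seed(0)

N_INPUTS, N_HIDDEN = 120, 4   # 120 inputs as in the text; hidden size is assumed
N_SAMPLES, N_TRAIN = 95, 63   # 63 for training/validation, 32 held out

# Synthetic stand-in data (NOT the study's data): random inputs and a
# target that depends on the first two inputs plus a little noise.
X = [[random.uniform(-1, 1) for _ in range(N_INPUTS)] for _ in range(N_SAMPLES)]
t = [0.5 * x[0] - 0.3 * x[1] + random.gauss(0, 0.05) for x in X]

# Chronological split: earlier samples train the model, later ones test it.
X_train, t_train = X[:N_TRAIN], t[:N_TRAIN]
X_test, t_test = X[N_TRAIN:], t[N_TRAIN:]

# Weights: input-to-hidden rows and hidden-to-output, bias stored last.
W1 = [[random.uniform(-0.1, 0.1) for _ in range(N_INPUTS + 1)]
      for _ in range(N_HIDDEN)]
W2 = [random.uniform(-0.1, 0.1) for _ in range(N_HIDDEN + 1)]


def forward(x):
    """Return the hidden activations and the network output mu for input x."""
    h = [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in W1]
    mu = sum(w * v for w, v in zip(W2, h)) + W2[-1]
    return h, mu


def mse(inputs, targets):
    """Mean squared error between outputs mu and targets t."""
    return sum((forward(x)[1] - y) ** 2
               for x, y in zip(inputs, targets)) / len(targets)


def train(inputs, targets, epochs=30, lr=0.01):
    """Per-sample gradient descent on the squared error (mu - t)^2 / 2."""
    for _ in range(epochs):
        for x, y in zip(inputs, targets):
            h, mu = forward(x)
            err = mu - y
            for j in range(N_HIDDEN):
                # Backpropagate through tanh: d tanh(a)/da = 1 - tanh(a)^2.
                delta = err * W2[j] * (1.0 - h[j] ** 2)
                W2[j] -= lr * err * h[j]
                row = W1[j]
                for i in range(N_INPUTS):
                    row[i] -= lr * delta * x[i]
                row[-1] -= lr * delta
            W2[-1] -= lr * err


mse_before = mse(X_train, t_train)
train(X_train, t_train)
mse_after = mse(X_train, t_train)
mse_test = mse(X_test, t_test)
print(f"train MSE {mse_before:.4f} -> {mse_after:.4f}; held-out MSE {mse_test:.4f}")
```

Because the data are synthetic, the printed errors carry no meaning for coal prices; the sketch only shows the flow from inputs through a feedforward network to an output μ that is pulled toward the target t, with the 32 held-out samples kept out of training.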