Modern Geomatics Technologies and Applications
Comparison of CART and C4.5 decision tree algorithms for classification of particulate
matter pollution less than 2.5 microns (PM2.5)
Mohamadreza Heydari 1, Parham Pahlavani 2*, Behnaz Bigdeli 3
1 GIS M.Sc. Student at School of Surveying and Geospatial Engineering, College of Engineering, University of
Tehran, Tehran, Iran
2 Assistant Professor at School of Surveying and Geospatial Engineering, College of Engineering, University of
Tehran, Tehran, Iran
3 Assistant Professor at School of Civil Engineering, Shahrood University of Technology, Shahrood, Iran
* pahlavani@ut.ac.ir
Abstract: With the development of industry and the growth of cities, many environmental problems have arisen, and air
pollution is among the most important of them. A major part of air pollution is caused by suspended particles smaller
than 2.5 microns (PM2.5). Classifying PM2.5 pollution in order to control and reduce air pollution is of great importance
for achieving sustainable development in cities. In this paper, meteorological parameters (wind speed, wind direction,
temperature, relative humidity, air pressure, rainfall), topographic status, intensity of temperature inversion, pollution at
the two nearest stations, and time dependence were used as the influencing factors for classification. Among the
classification methods, tree-based methods were chosen because they are simple and easy to interpret. The CART
(Classification And Regression Tree) and C4.5 decision tree algorithms were used to classify PM2.5 pollution. Of the
two, the C4.5 method, with an overall accuracy of 78.3% and a kappa index of 74.8%, was selected as the better
classification method. In the superior method, the factors with the greatest impact on the classification were, in
descending order of impact: pollution at the two nearest neighbours, topography, temperature, air pressure, rainfall,
intensity of temperature inversion, relative humidity, wind speed, wind direction, month of the year, day of the week,
and hour of the day.
Keywords: Air pollution, CART, C4.5, PM2.5, Overall accuracy, Kappa index.
1. Introduction
During the past few years, severe air pollution has garnered worldwide attention due to its effect on the health and
wellbeing of individuals[1]. Six environmental pollutants contribute to air pollution: carbon monoxide, lead, nitrogen
dioxide, ozone, sulfur dioxide, and particulate matter less than 10 and 2.5 microns[2]. Because of their small size,
suspended particles can pass through the first defensive barrier (nose and throat) and damage the lungs[3]. In this paper,
particulate matter less than 2.5 microns (PM2.5) is considered as an influential air pollutant. In addition, hour of the day,
day of the week, month of the year, topographic status, meteorological parameters (air pressure, temperature, humidity,
rainfall, wind speed, wind direction), intensity of temperature inversion, and pollution at the two nearest neighbours are
considered as the effective factors for preparing a classification map of pollution risk. There are various techniques and
methods for classification; in this paper, the CART and C4.5 decision tree algorithms are used to classify PM2.5
pollution. Because air pollution is higher in the cold season of the year due to temperature inversion[4], data from the
autumn seasons of 2017 and 2018 were used in this study. Valuable research has been done on particulate matter
pollution, the identification of its effective factors, and its classification.
Elangasinghe et al.[5] proposed a neural network model for estimating the concentration of nitrogen dioxide
by considering meteorological parameters including wind speed, wind direction, radiation intensity, temperature, relative
humidity, time of day, day of the week and month of the year. In another study, Klein Dieters et al.[6] modelled urban pollution