Modern Geomatics Technologies and Applications
3. Implementation and results
This study uses Boston accident data and 311 non-emergency service requests covering the four years from 2015 to 2018.
The data comprise 17,360 crash records, 524,525 service request records, 166,248 parcel records and 19,006 street segments.
The dataset was obtained from BostonMaps Open Data¹. Table 1 shows the frequency and percentage of crashes by land use,
vehicle type, street type, time of day and day of the week on which the accident happened. Table 2 depicts the
citizen-collected non-emergency reports, which are classified into 44 topics.
Rule mining was performed using the Apriori algorithm, following the methodology introduced by Agrawal [1]. The
WEKA 3.6.9 machine learning toolkit was used to run the Apriori algorithm.
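The level-wise search behind Apriori can be illustrated with a minimal Python sketch. It is not the WEKA implementation used in this study; the function names apriori and derive_rules, the restriction to single-item consequents and the toy records are assumptions made only for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset (illustrative sketch)."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every distinct item as a 1-itemset.
    current = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep candidates whose (k-1)-item subsets are all frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

def derive_rules(frequent, min_conf, min_lift):
    """Build rules with a single-item consequent from the frequent itemsets."""
    found = []
    for itemset, sup in frequent.items():
        if len(itemset) < 2:
            continue
        for item in itemset:
            antecedent = itemset - {item}
            conf = sup / frequent[antecedent]           # sup(A and B) / sup(A)
            lift = conf / frequent[frozenset([item])]   # conf / sup(B)
            if conf >= min_conf and lift >= min_lift:
                found.append((antecedent, item, sup, conf, lift))
    return found

# Toy records (hypothetical, not taken from the Boston dataset).
records = [
    {"motor", "commercial", "intersection"},
    {"motor", "commercial", "intersection"},
    {"motor", "residential", "street"},
    {"bike", "commercial", "intersection"},
]
for antecedent, item, sup, conf, lift in derive_rules(apriori(records, 0.5), 0.5, 1.1):
    print(sorted(antecedent), "->", item, round(sup, 2), round(conf, 2), round(lift, 2))
```

Each crash is treated as one transaction whose items are its categorical attributes, so a rule reads as "crashes with these attributes tend to also have that attribute".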
To determine the spatial and temporal relations between accidents and land use, and between accidents and environmental
reports, the presented approach was applied first to the accident and land use data and then to the accident and
environmental report data, separately.
To set the optimum spatial threshold, the results for several candidate thresholds (e.g. 10 m, 25 m, 50 m,
75 m, 100 m, 200 m … 1500 m) were examined. The 100-meter threshold was then adopted as the spatial threshold.
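How a distance threshold changes the features associated with a crash can be sketched as follows. This is a simplified illustration under stated assumptions: coordinates are taken to be in a projected system measured in meters, features are reduced to labeled points (the parcels in the study are polygons), and the names within_threshold and neighbours_by_threshold are hypothetical.

```python
import math

def within_threshold(crash_xy, feature_xy, threshold_m):
    """True if the planar distance between two projected points is <= threshold_m."""
    dx = crash_xy[0] - feature_xy[0]
    dy = crash_xy[1] - feature_xy[1]
    return math.hypot(dx, dy) <= threshold_m

def neighbours_by_threshold(crashes, features, thresholds):
    """For each threshold, list the feature labels found near each crash."""
    result = {}
    for d in thresholds:
        result[d] = [
            sorted({label for xy, label in features if within_threshold(c, xy, d)})
            for c in crashes
        ]
    return result

# Hypothetical coordinates in meters: one crash and two labeled features.
crashes = [(0.0, 0.0)]
features = [((60.0, 0.0), "commercial"), ((300.0, 0.0), "residential")]
for d, labels in neighbours_by_threshold(crashes, features, [50, 100, 500]).items():
    print(d, labels)
```

A small threshold leaves many crashes without any associated feature, while a large one attaches almost every label to every crash, which is why several candidate values are compared before one is fixed.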
When a large number of rules is discovered, type I errors may occur; that is, some rules may be extracted by chance
rather than reflecting a hidden pattern in the dataset [22]. To reduce the risk of type I error, the dataset was
divided randomly into training and test samples: the training samples comprise 75% of the crashes in the dataset and the
test samples the remaining 25%. The training samples were used to generate the rule model based on the pre-defined
threshold values minsup, minconf and minL (minimum support, minimum confidence and minimum lift). Afterward, the test
samples were used to evaluate the resulting association rules.
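One plausible reading of this hold-out check is sketched below in Python: rules are mined on the 75% training part, and each rule's support, confidence and lift are then recomputed on the 25% test part and compared with the same thresholds. The text does not spell out the exact evaluation criterion, so the acceptance test in holds_on_test, the fixed seed and all function names are assumptions.

```python
import random

def split_train_test(records, train_fraction=0.75, seed=0):
    """Randomly split records into training and test samples."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

def rule_metrics(records, antecedent, consequent):
    """Support, confidence and lift of antecedent -> consequent on records."""
    a, c = frozenset(antecedent), frozenset(consequent)
    n = len(records)
    n_a = sum(1 for r in records if a <= r)
    n_c = sum(1 for r in records if c <= r)
    n_ac = sum(1 for r in records if (a | c) <= r)
    support = n_ac / n if n else 0.0
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift

def holds_on_test(test, antecedent, consequent, min_sup, min_conf, min_lift):
    """Assumed criterion: the rule must also meet all thresholds on the test samples."""
    sup, conf, lift = rule_metrics(test, antecedent, consequent)
    return sup >= min_sup and conf >= min_conf and lift >= min_lift
```

A rule that passes on the training samples but fails on the test samples is a candidate chance finding and can be discarded.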
The results of the proposed approach are presented in two separate sub-sections, as follows:
3.1. Crash characteristics and land use relations
Fig. 4 presents statistics of the citizen request reports, and Table 2 shows the association rules that the proposed
method generated between accident and land use data, based on the pre-defined threshold values Sup > 10%,
minConf ≥ 50% and minLift ≥ 1.1. For instance, for crashes involving a motor vehicle in a commercial land use area on a
weekday, the crash occurred at a road intersection with 61% confidence.
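For reference, the three thresholds can be read against the standard definitions of the rule measures, written here for a rule A ⇒ B over a set of crash records D. These are the textbook definitions and are assumed, not confirmed, to be the ones applied here; sup(A) and sup(B) denote the fraction of records containing A and B respectively.

```latex
\begin{align}
\mathrm{sup}(A \Rightarrow B)  &= \frac{\lvert \{\, t \in D : A \cup B \subseteq t \,\} \rvert}{\lvert D \rvert}, \\
\mathrm{conf}(A \Rightarrow B) &= \frac{\mathrm{sup}(A \Rightarrow B)}{\mathrm{sup}(A)}, \\
\mathrm{lift}(A \Rightarrow B) &= \frac{\mathrm{conf}(A \Rightarrow B)}{\mathrm{sup}(B)}.
\end{align}
```

In the example above, A = {motor vehicle, commercial, weekday}, B = {intersection} and conf(A ⇒ B) = 0.61. As a rough illustration only, taking the overall intersection share in Table 1 (50.10%) as sup(B) would give a lift of about 0.61 / 0.501 ≈ 1.22; this figure is back-of-the-envelope arithmetic and not a reported result, since the rules were mined on the training samples rather than the full dataset.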
TABLE 1 TRAFFIC ACCIDENTS CHARACTERISTICS
Crash characteristics    Frequency    Percent
Vehicle Type
  Motor Vehicle          12,489       71.94%
  Pedestrian             3,130        18.03%
  Bike                   1,741        10.03%
Street Type
  Intersection           8,698        50.10%
  Street                 7,237        41.69%
  Other                  1,425        8.21%
Time Of Day
  Morning                4,581        26.39%
  Noon                   6,112        35.21%
  Night                  4,609        26.55%
  Mid Night              2,058        11.85%
Day Of Week
  Weekday                12,680       73%
  Weekend                4,683        27%
Land Use
  Commercial             3,722        21.44%
  Exempt                 7,786        44.85%
  Residential            5,009        28.86%
  Industrial             247          1.42%
  Agricultural           238          1.37%
  Parking                34           0.19%
  Mixed-use              325          1.87%
1 http://bostonopendata-boston.opendata.arcgis.com
https://data.boston.gov/dataset/311-service-requests