Page 205 - Big Data Analytics for Connected Vehicles and Smart Cities
define how the proposed data lake will fit into the overall picture of citywide
data and information exchange.
Existing Data Scattered and Not Well Understood
Existing transportation data is very likely to be scattered and poorly
understood. A data catalogue may not exist, and even if one does, it may be
incomplete or out of date. To create a data lake, it is necessary to identify
the sources of data and arrange access to the data that will be placed in the
data lake. In many cases this takes considerable time and consumes significant
resources.
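The catalogue review described above can be sketched in code. The following is
a minimal illustration only, assuming a hypothetical catalogue structure: the
CatalogueEntry fields, the find_gaps helper, the one-year review threshold, and
the sample sources are all invented for this example and do not come from any
particular city or product.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CatalogueEntry:
    """One data source in a hypothetical transportation data catalogue."""
    name: str
    owner: Optional[str]           # responsible department, if known
    last_reviewed: Optional[date]  # when the entry was last confirmed accurate
    access_agreed: bool            # whether access for the data lake is arranged


def find_gaps(entries, today, max_age_days=365):
    """Return {source name: [issues]} for entries that need attention."""
    gaps = {}
    for e in entries:
        issues = []
        if not e.owner:
            issues.append("no owner recorded")
        if e.last_reviewed is None:
            issues.append("never reviewed")
        elif (today - e.last_reviewed).days > max_age_days:
            issues.append("entry out of date")
        if not e.access_agreed:
            issues.append("access not yet arranged")
        if issues:
            gaps[e.name] = issues
    return gaps


# Illustrative sample data (assumed, not taken from a real catalogue).
catalogue = [
    CatalogueEntry("traffic_signal_timing", "Traffic Operations", date(2024, 3, 1), True),
    CatalogueEntry("transit_ridership", None, date(2021, 6, 15), False),
    CatalogueEntry("parking_occupancy", "Parking Authority", None, True),
]

gaps = find_gaps(catalogue, today=date(2024, 9, 1))
for name, issues in gaps.items():
    print(f"{name}: {', '.join(issues)}")
```

A listing of this kind gives a rough measure of how much source identification
and access work remains before data can be loaded into the data lake.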
Difficulty in Turning Data into Action
Bringing the data together into a data lake does not guarantee results. To
harness the value of the data lake, it is necessary to support an entire process
that produces actionable insights and the strategies to be applied in response
to the new insight and understanding. In many cases, this requires some
organizational adjustment to empower staff to take advantage of the analytics
developed from the data lake.
Lack of Big Data Skills
The use of big data techniques and analytics is relatively new to transportation
in smart cities. The big data skills required to establish and operate the data
lake successfully are therefore unlikely to exist within smart city or
transportation organizations. When planning for the establishment of a data
lake, it will be necessary to identify the required skills and decide how they
will be sourced, whether by outsourcing or by hiring new staff.
Insufficient Governance and Security
A bottom-up approach that is not guided by a clear strategy or an unambiguous
understanding of the final big picture can lead to insufficient governance and
security. Taking advantage of the power and flexibility of available technology
can support rapid progress, but it can also allow essential activities related
to governance and security to be bypassed.
The Degradation of Data Over Time without Data Quality Control
An unfortunate trend in the application of advanced technologies to
transportation is a recurring trajectory of progress followed by decline. In
this trajectory, considerable progress is made in the implementation of advanced
technologies, and the target levels of service are attained. These service
levels then degrade over time because insufficient resources are allocated to
the operation and maintenance of the initial technology deployment. The same
challenge exists with respect to data lakes for smart cities and
transportation. It is necessary to take steps to not