Page 209 - Big Data Analytics for Connected Vehicles and Smart Cities
• The development of plans and briefing materials for senior executives to
form the basis for further decision-making;
• The definition of a complete hardware and software environment or
architecture to support the full-capability data lake;
• An implementation plan including activities, schedule, and budget estimates
for the expansion of the pilot project to full capability.
Table 9.1 provides a summary of how the proposed approach meets the
challenges defined in Section 9.7.
9.9 Organizing for Success
It is hoped that when a transportation organization embarks on the creation of
a data lake, the effort will also result in a new approach to data collection and
acquisition. All too often in transportation, data is collected on a speculative
basis with little concern for its eventual use. Establishing a data lake and
aligning the organization to it should provide guidance and insight into a
new approach in which data collection is based on the purpose for which the
data is needed. Ideally, data
collection and acquisition will lead to the conversion of data to information
using analytics, and experience gained using analytics will provide feedback on
the need for additional or higher-quality data. In this respect, the data lake
functions as a feedback mechanism to guide data collection and acquisition. In
an ideal environment, data collection and acquisition will be driven by a clearer
understanding of the use to which the data will be put. For example, the use of
the initial data lake to conduct some preliminary analytics may result in a much
more detailed understanding of the complete data set required to get results.
This will also provide insight into the required accuracy of the data. Early
results remain valid, since they may provide insight that was not previously
available, but even better results may be possible with better data. Accordingly, the
use of a data lake can provide significant input into a structured and planned
approach to data collection and acquisition.
Prior experience in the application of information and communication
technologies to transportation within a system engineering framework has
indicated that, in most cases, the technology solution that results from the
planning and design process represents a theoretical ideal. Approaches to the
development of system architectures assume
that the ideal technology solution will be applied and that it will be necessary
to adjust or fine-tune organizational arrangements to match the needs of the
technology solution. It is more likely that the ideal technological solution will
be adjusted and perhaps suboptimized to fit existing organizational arrange-