Page 209 - Big Data Analytics for Connected Vehicles and Smart Cities



               • The development of plans and briefing materials for senior executives to
                form the basis for further decision-making;
               • The definition of a complete hardware and software environment or
                architecture to support the full-capability data lake;
               • An implementation plan including activities, schedule, and budget
                estimates for the expansion of the pilot project to full capability.


               Table 9.1 provides a summary of how the proposed approach meets the
          challenges defined in Section 9.7.


          9.9  Organizing for Success

          It is hoped that when a transportation organization embarks on the creation of
          a data lake, the effort will also result in a new approach to data collection and
          acquisition. All too often in transportation, data is collected on a speculative
          basis with little concern for its eventual use. The establishment of
          a data lake, and organizational alignment to it, should provide some
          guidance and insight into a new approach to data collection based on the need
          for the data and the purpose for which it is collected. Ideally, data
          collection and acquisition will lead to the conversion of data to information
          using analytics, and experience gained using analytics will provide feedback on
          the need for additional or higher-quality data. In this respect, the data lake
          functions as a feedback mechanism to guide data collection and acquisition. In
          an ideal environment, data collection and acquisition will be driven by a clearer
          understanding of the use to which the data will be put. For example, the use of
          the initial data lake to conduct some preliminary analytics may result in a much
          more detailed understanding of the complete data set required to get results.
          This will also provide insight into the required accuracy of the data. Early
          results are not invalid, since they may provide insight that was not previously
          available, but even better results may be possible with better data. Accordingly, the
          use of a data lake can provide significant input into a structured and planned
          approach to data collection and acquisition.
               Prior experience in the application of information and communication
          technologies to transportation within a system engineering framework
          has indicated that, in most cases, the technology solution that results
          from the planning and design process represents a theoretical ideal.
          Approaches to the development of system architectures assume
          that the ideal technology solution will be applied and that it will be necessary
          to adjust or fine-tune organizational arrangements to match the needs of the
          technology solution. It is more likely that the ideal technological solution will
          be adjusted and perhaps suboptimized to fit existing organizational arrange-