Page 202 - Big Data Analytics for Connected Vehicles and Smart Cities


            The Foundation for Large-Scale Proactive Analytics

            The data lake is also the basis for further efforts related to large-scale proactive
            analytics. While this will also require cultural and organizational change, the
            existence of the data lake opens the way for the smart city organization to apply
            large-scale analytics that will guide many aspects of planning and delivery for
            smart city transportation. This enables the adoption of results-driven actions
            and the establishment of scientific approaches to transportation service delivery,
            based on observation, understanding of mechanisms, and data.

            Steppingstone Toward Automation Through Predictive Analytics and Machine Learning
            There is considerable interest and activity in the concept of an automated vehicle,
            and it would seem relevant to also consider how automation can be applied
            to back-office processes in the smart city. While it may not be appropriate or
            even desirable to leap toward an automated back office overnight, the establishment
            of analytics and the development of the ability to make predictions can form
            the basis for the path toward automation. The availability of the data in the
            data lake can also provide the raw material for machine learning
            and deep learning techniques that support the staged development of artificial
            intelligence in the smart city back office. It is likely that this will begin with
            sophisticated decision support for the humans involved, with full automation a
            possibility over the longer term.

            Reduce Costs Due to Data Management Duplication and Processing Duplication
            Adopting a fragmented approach to data collection, storage, and management
            will inevitably lead to duplication. In fact, the cost of duplication may be buried
            within the overall cost of operating and maintaining the current data collec-
            tion, storage, and management system. The process of creating a data lake is
            likely to shine a light on the volume of duplication and provide estimates of
            the costs involved. Cost savings are likely to be identified in data collection, as
            well as data storage and processing. Based on experience, the average transportation
            agency carries multiple redundancies in data collection, with a
            considerable amount of ad hoc data collection for project-specific purposes. If
            such data is not visible across the organization, then it is likely that other ad
            hoc initiatives will collect the same or similar data. In some cases, awareness of
            the data is not enough: an inability to access the data in a reasonable time
            frame forces project-specific data collection to go ahead even when the duplication is
            understood. Cost savings are also likely to be realized with respect to software
            licenses. Multiple software licenses may have been procured to support a frag-
            mented approach to data storage and management. As the data lake is created,
            opportunities may be revealed to save money by consolidating software licenses.
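
            As a rough illustration of how a data lake inventory might surface duplicated
            data collection, the sketch below groups a small catalog of collection efforts
            by data type and location and estimates the cost that overlaps. The catalog
            entries, field names, and dollar figures are all hypothetical, and the savings
            rule (keep the most expensive effort in each group, count the rest as duplicate)
            is a simplifying assumption rather than a method taken from the text.

```python
from collections import defaultdict

# Hypothetical catalog of data collection efforts. Every name and figure
# here is invented for illustration only.
CATALOG = [
    {"project": "Corridor study A", "data_type": "traffic_counts",
     "location": "Main St", "annual_cost": 40000},
    {"project": "Signal retiming B", "data_type": "traffic_counts",
     "location": "Main St", "annual_cost": 35000},
    {"project": "Transit plan C", "data_type": "ridership",
     "location": "Route 7", "annual_cost": 20000},
    {"project": "Safety audit D", "data_type": "traffic_counts",
     "location": "Main St", "annual_cost": 25000},
]


def find_duplicates(catalog):
    """Group entries by (data_type, location); keep groups with 2+ entries."""
    groups = defaultdict(list)
    for entry in catalog:
        groups[(entry["data_type"], entry["location"])].append(entry)
    return {key: entries for key, entries in groups.items() if len(entries) > 1}


def estimated_duplicate_cost(entries):
    """Assumed rule: retain the costliest effort, treat the others as duplicate."""
    costs = sorted(entry["annual_cost"] for entry in entries)
    return sum(costs[:-1])


if __name__ == "__main__":
    for key, entries in find_duplicates(CATALOG).items():
        projects = ", ".join(entry["project"] for entry in entries)
        print(key, "collected by:", projects)
        print("  estimated duplicate annual cost:", estimated_duplicate_cost(entries))
```

            In practice the grouping key would need to reflect how an agency actually
            describes its data (time period, collection method, resolution), and the cost
            rule would be replaced by whatever estimate the organization trusts.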