Page 207 - Big Data Analytics for Connected Vehicles and Smart Cities

          Building a Data Lake
















          Figure 9.3 A proposed approach methodology for the implementation of a smart city and
          transportation data lake.


               The overall philosophy behind the approach is to make incremental,
          stepwise progress toward establishing and operating the data lake. Initially,
          the focus is on a very small number of use cases, which are addressed in a
          pilot project and serve as the basis for working through the methodology.
          During the pilot project, all elements of data lake operation are brought into
          play, including ingestion, preparation, discovery, and exchange of data. Each
          step in the methodology is explained in the following sections.


          Preparing for the Data Lake
          Preparation or planning for the data lake includes exploring requirements and
          objectives with the target end users. In the case of a smart city transportation
          initiative, end users typically consist of city officials and other city
          transportation partners, such as departments of transportation, transit agencies,
          and other transportation service providers. It is also helpful to bring relevant
          private-sector participants into the dialogue at this early stage.

          Identifying Pilot Subjects
          The use case concept is a major tool for identifying subjects for the pilot.
          The use case pilot described in Chapter 5 can be an effective format for
          capturing the use case information. Whether or not that format is used, it
          is essential to capture the primary ingredients for the pilot, including the data
          to be used, the expected value to be achieved, and an initial understanding of
          the analytics to be applied. The selection criteria for the use cases are as follows:

               • The use cases deliver immediate value to the city.
               • Data to support the analytics required for the use cases is readily available
                 from city sources, transportation partners, or the private sector.
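
          As a rough illustration only, the primary ingredients of a pilot use case and the
          two criteria listed above could be recorded and checked along the following lines.
          The class, field names, and example use cases are hypothetical and are not taken
          from the text or from the Chapter 5 format.

```python
from dataclasses import dataclass, field


@dataclass
class PilotUseCase:
    """Hypothetical record of the primary ingredients of a pilot use case."""
    name: str
    data_sources: dict[str, bool]   # source name -> readily available today?
    expected_value: str             # the value the city expects to achieve
    planned_analytics: list[str] = field(default_factory=list)
    delivers_immediate_value: bool = False


def meets_selection_criteria(use_case: PilotUseCase) -> bool:
    """Check the two criteria above: immediate value and readily available data."""
    has_data = bool(use_case.data_sources) and all(use_case.data_sources.values())
    return use_case.delivers_immediate_value and has_data


# Made-up candidates, purely for illustration.
candidates = [
    PilotUseCase(
        name="Transit on-time performance",
        data_sources={"transit agency vehicle location feed": True},
        expected_value="Identify routes with chronic delays",
        planned_analytics=["schedule adherence by route and hour"],
        delivers_immediate_value=True,
    ),
    PilotUseCase(
        name="Citywide freight routing",
        data_sources={"private fleet telematics": False},
        expected_value="Reduce truck traffic on local streets",
        planned_analytics=["origin-destination clustering"],
        delivers_immediate_value=False,
    ),
]

pilot_subjects = [uc.name for uc in candidates if meets_selection_criteria(uc)]
print(pilot_subjects)
```

          A record of this kind keeps the data, expected value, and planned analytics for
          each candidate in one place, so the same check can be rerun as data availability
          changes.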