Page 206 - Big Data Analytics for Connected Vehicles and Smart Cities



            just launch the data lake, but to ensure that it does not degrade over time. This
            also includes taking the necessary steps to monitor, manage, and ensure the
            quality of data, both as it enters the data lake and while it is held there.
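
            As an illustration of what monitoring data quality can involve, the
            following is a minimal sketch rather than a prescribed implementation. The
            record fields (vehicle_id, speed_kph), the rule names, the 0 to 250 speed
            range, and the 95 percent pass threshold are all hypothetical, chosen only
            for the example. The same report could be run on data as it enters the
            data lake and periodically on data already stored in it.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class QualityRule:
    """A named check that returns True when a record is acceptable."""
    name: str
    check: Callable[[Record], bool]


def _speed_ok(record: Record) -> bool:
    # Hypothetical plausibility range for a vehicle speed reading.
    speed = record.get("speed_kph")
    return isinstance(speed, (int, float)) and 0 <= speed <= 250


# Hypothetical rules for a connected-vehicle feed (assumed field names).
RULES: List[QualityRule] = [
    QualityRule("has_vehicle_id", lambda r: bool(r.get("vehicle_id"))),
    QualityRule("speed_in_range", _speed_ok),
]


def quality_report(records: List[Record], rules: List[QualityRule]) -> Dict[str, float]:
    """Return, for each rule, the fraction of records that pass it."""
    total = len(records)
    report: Dict[str, float] = {}
    for rule in rules:
        passed = sum(1 for record in records if rule.check(record))
        # An empty batch is treated as passing: there is nothing to flag.
        report[rule.name] = passed / total if total else 1.0
    return report


def failing_rules(report: Dict[str, float], threshold: float = 0.95) -> List[str]:
    """List the rules whose pass rate falls below the (assumed) threshold."""
    return [name for name, rate in report.items() if rate < threshold]
```

            Tracking these pass rates over time is one simple way to notice that the
            data lake is degrading before end users do.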

            Lack of Self-Service Capabilities and Long Development Times
            Very often in technology-driven data lake implementations, the data lake is
            developed as a tool that requires highly specialized operation. As a result,
            end users have no direct access to the data or the analytics. The absence of
            such self-service capabilities can place a heavy workload on a few staff
            members, leading to long development times and slow responses to end-user
            needs.

            Lack of Features to Motivate and Enable Smart City and Transportation Exponents
            As with the preceding challenge, an information technology–driven data
            lake program can ignore the need to motivate and enable end users in smart city
            and transportation contexts. This can lead to a lack of interest on the part of the
            end users and a consequent inability to monetize the investment made in the
            data lake. Early experience indicates that it is not sufficient to merely share data;
            it is also necessary to provide models and illustrations that can motivate the end
            users. This includes helping end users to understand the data lake and its
            analytics possibilities by sharing model analytics and by supporting a dialogue
            on the development of custom analytics for each end user’s specific job function.


            9.8  An Approach to Building a Data Lake

            Early experience in the creation of data lakes for multiple organization types in
            both the public and private sectors has revealed a few pitfalls to avoid in
            developing a data lake. Drawing on the lessons of this early experience, it is
            possible to put together a robust approach that minimizes the chances of
            encountering these pitfalls while maximizing the chances of success. To provide
            practical advice on the creation of a transportation data lake, a data lake
            creation methodology has been identified and is defined in subsequent sections
            of this chapter. The approach is based on the experiences of a company called
            Think Big [4], and it has evolved as a direct result of experience gained in
            working with public- and private-sector clients. The original methodology has
            been adapted, based on experience with several transportation clients, to meet
            the specific needs of transportation and smart cities. Figure 9.3 presents an
            overview of the methodology.