Page 193 - Big Data Analytics for Connected Vehicles and Smart Cities

          Figure 9.1 Chapter 9 word cloud.

          9.3  Introduction

          Chapters 2, 3, and 5–7 touch on data lakes, with a basic definition provided in
          Chapter 2, where the term was first introduced. This chapter explains the term
          data lake in detail, describes the value of the data lake, and suggests a robust
          approach toward building one. The data lake is an analogy for bringing data
          together from different sources, making it accessible, and transforming it into
          a format that can be useful across the enterprise or organization. The visual
          image offered by the term is an extremely useful communication tool, and it is
          valuable when introducing new data science and data analytics concepts to
          transportation professionals. Ideally, transportation specialists will be able
          to maintain a focus on their area of expertise while making use of new data
          science and analytics tools to gain new insights and understanding. The data
          lake analogy allows the overall characteristics of the concept to be discussed
          and its value to be defined without diving into the weeds of data science and
          data analytics. Because this book is designed to provide an overview of big data
          and data analytics techniques for smart cities, the data lake analogy provides
          an ideal communication
          tool. The purpose of this chapter is not to provide a how-to guide on selecting
          and using technology related to big data and analytics. Rather, it is intended to
          provide an overview of how the data lake fits within the bigger picture and the
          value that can be delivered by taking this radically new approach to the storage
          of data. Providing an overview, rather than a detailed exposition, also avoids
          technology selection or bias toward a specific set of tools. While some specific
          solutions and approaches are used to illustrate the data lake concept, the
          overall approach allows multiple technologies and solutions to be selected to
          fit the needs of the specific smart city.