Page 54 - Big Data Analytics for Connected Vehicles and Smart Cities
• Type;
• Volume;
• Velocity;
• Variety;
• Variability;
• Complexity;
• Veracity.
Let’s take a look at each of these in turn.
Type
There are two major categories of data: real-time and archive. The literature
gives these many different names; for example, real-time data may be referred
to as transactional data and archive data as static data. They are also often
called “hot” and “cold” data, conveying that hot data is live and used in the
short term while cold data is stored for longer-term use. The terms data at rest
and data in motion are likewise used to differentiate static and dynamic data.
The distinction lies in how the data is being used at any given time. Real-time
data must be kept in a form that can be accessed quickly. To support this, less
frequently used data can be moved to an archive, where it can be stored in large
volumes for long periods of time at lower cost. It is now also possible to
conduct analytics on a real-time data stream while it is on its way to being
stored. The use of real-time analytics is another reason for separating
real-time data from archive data.
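The hot/cold split and in-stream analytics described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Reading` and `TieredStore` names, the vehicle-speed field, and the 60-second hot window are hypothetical, readings are assumed to arrive in timestamp order, and the in-memory containers merely stand in for a fast store and a low-cost archive.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Reading:
    timestamp: float  # seconds (hypothetical unit)
    speed_kph: float  # hypothetical vehicle-speed measurement


@dataclass
class TieredStore:
    hot_window_s: float                        # how long a reading stays "hot"
    hot: deque = field(default_factory=deque)  # stand-in for fast, short-term storage
    cold: list = field(default_factory=list)   # stand-in for a low-cost archive
    count: int = 0                             # running statistics, updated in-stream
    mean_speed: float = 0.0

    def ingest(self, reading: Reading) -> None:
        # Real-time analytics: update the running mean as the reading
        # arrives, before it is placed in storage.
        self.count += 1
        self.mean_speed += (reading.speed_kph - self.mean_speed) / self.count
        self.hot.append(reading)
        self._age_out(now=reading.timestamp)

    def _age_out(self, now: float) -> None:
        # Move readings older than the hot window into the archive.
        while self.hot and now - self.hot[0].timestamp > self.hot_window_s:
            self.cold.append(self.hot.popleft())


store = TieredStore(hot_window_s=60.0)
for t, v in [(0.0, 50.0), (30.0, 60.0), (90.0, 70.0), (100.0, 80.0)]:
    store.ingest(Reading(timestamp=t, speed_kph=v))
```

In this sketch the running mean is available at any moment without querying the archive, which is one way to see why the real-time path is kept separate from long-term storage.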
Volume
The volume dimension of big data is the most obvious one: the adjective big
itself signals that this part of data science is about volume. In the past, there
was a tendency to fragment larger data sets in order to store data more
efficiently and enable fast access. These days, with the advent of fast and
low-cost data storage, the tendency is to consolidate and bring data into a
central repository. This has the effect of creating an enterprise-wide view of
the data, which would be difficult to achieve if the data were fragmented and
stored in silos across the organization.
So how big is big data? Here are a few examples, from beyond transportation:
• Approximately 1 PB of data is uploaded to YouTube every day [5].
• It is estimated that the human brain has a functional memory capacity
of 2.5 PB [6].
• Netflix users stream approximately 4.7 PB of data every year [7, 8].