Page 426 - ITGC_Audit Guides
However, the use of unstructured data is becoming increasingly common within
organizations. This type of data is not confined to traditional data structures or constraints. It is
typically more difficult to manage, due to its evolving and unpredictable nature, and it is usually
sourced from large, disparate, and often external data sources. Consequently, new solutions have
been developed to manage and analyze this type of data. See Figure 2 for a diagram that shows
the difference between structured and unstructured data.
Figure 2: Examples of Structured and Unstructured Data
Source: The IIA
Data Storage
A large repository of enterprisewide data specifically designed for analytics and reporting is known
as a data warehouse. Data warehouses are typically relational databases that store information
from multiple sources. Data is uploaded from operational systems that contain transactional data
to data warehouses, which store complete information about one or more subjects. ETL (extract,
transform, and load) or ELT (extract, load, and transform) tools are configured to move data from
the operational system to the data warehouse. The data is loaded in the format and structure of
the data warehouse, and it is often aggregated in the process.
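The ETL flow described above can be illustrated with a minimal sketch. It assumes a hypothetical list of sales transactions standing in for the operational system and an in-memory SQLite database standing in for the data warehouse. The table name, column names, and sample values are illustrative only and do not come from any specific product.

```python
import sqlite3

# Hypothetical transactional records from an operational system:
# (order_id, region, amount_in_cents). Values are illustrative only.
operational_rows = [
    (1, "east", 1200),
    (2, "east", 800),
    (3, "west", 500),
]


def extract():
    """Extract: read the transactional rows from the source system."""
    return list(operational_rows)


def transform(rows):
    """Transform: aggregate individual orders into one summary row per region."""
    totals = {}
    for _order_id, region, cents in rows:
        count, total = totals.get(region, (0, 0))
        totals[region] = (count + 1, total + cents)
    return [(region, count, total) for region, (count, total) in sorted(totals.items())]


def load(conn, summary):
    """Load: write the summary rows in the warehouse's own table structure."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_by_region ("
        "region TEXT PRIMARY KEY, order_count INTEGER, total_cents INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO sales_by_region VALUES (?, ?, ?)", summary
    )
    conn.commit()


# An in-memory SQLite database stands in for the data warehouse (assumption).
warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract()))

warehouse_rows = warehouse.execute(
    "SELECT region, order_count, total_cents FROM sales_by_region ORDER BY region"
).fetchall()
print(warehouse_rows)
```

In this sketch the individual order rows do not reach the warehouse; only the per-region totals do. An ELT variant would load the raw rows first and run the aggregation inside the target system.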
Data lakes are becoming an increasingly popular solution to support big data storage and data
discovery. Data lakes are similar to data warehouses in that they store large amounts of data from
various sources, but they also store additional data attributes from source systems at a level of
granularity that would ordinarily be lost in data aggregation for data warehouses. This provides big
data solutions with all available data elements at a sufficient level of granularity to perform a
complete analysis. It also gives organizations the flexibility to solve unanticipated problems,
because the data lake maintains all data in a readily available format.
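The difference in granularity can be sketched as follows. The example assumes a hypothetical set of sales events with a "channel" attribute that a region-level warehouse summary would drop. A list of JSON lines stands in for the data lake. All names and values are illustrative only.

```python
import json
from collections import defaultdict

# Hypothetical source events. The "channel" attribute is not part of the
# warehouse summary below, so aggregation would discard it.
raw_events = [
    {"order_id": 1, "region": "east", "amount_cents": 1200, "channel": "web"},
    {"order_id": 2, "region": "east", "amount_cents": 800, "channel": "store"},
    {"order_id": 3, "region": "west", "amount_cents": 500, "channel": "web"},
]

# Data lake (assumption: a list of JSON lines): every record is kept as-is,
# with all source attributes.
lake = [json.dumps(event) for event in raw_events]

# Warehouse-style summary: totals per region only; channel detail is lost.
warehouse_summary = defaultdict(int)
for event in raw_events:
    warehouse_summary[event["region"]] += event["amount_cents"]


def total_by(attribute):
    """Answer a question that was not planned for by re-reading the raw lake records."""
    totals = defaultdict(int)
    for line in lake:
        record = json.loads(line)
        totals[record[attribute]] += record["amount_cents"]
    return dict(totals)


by_channel = total_by("channel")
print(dict(warehouse_summary))
print(by_channel)
```

The region summary cannot be broken down by sales channel after the fact, whereas the raw records in the lake can be regrouped by any attribute they contain.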