Page 426 - ITGC_Audit Guides

However, the use of unstructured data is growing and becoming more common within
                   organizations. This type of data is not confined to traditional data structures or constraints. It is
                   typically more difficult to manage, due to its evolving and unpredictable nature, and it is usually
                   sourced from large, disparate, and often external data sources. Consequently, new solutions have
                   been developed to manage and analyze this type of data. See Figure 2 for a diagram that shows
                   the difference between structured and unstructured data.

                   Figure 2: Examples of Structured and Unstructured Data

                   Source: The IIA

                   Data Storage

                   A large repository of enterprisewide data specifically designed for analytics and reporting is known
                   as a data warehouse. Data warehouses are typically relational databases that store information
                   from multiple sources. Data is uploaded from operational systems that contain transactional data
                   to data warehouses, which store complete information about one or more subjects. ETL (extract,
                   transform, and load) or ELT (extract, load, and transform) tools are configured to move data from
                   the operational system to the data warehouse. The data is loaded in the format and structure of
                   the data warehouse, and it is often aggregated in the process.
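
The ETL flow described above can be sketched in a few lines. This is a minimal illustration, not a production design: the sample rows, the `sales_by_region` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical transactional rows from an operational system:
# (order_id, region, amount_cents)
OPERATIONAL_ROWS = [
    (1, "east", 1250),
    (2, "west", 800),
    (3, "east", 250),
]


def extract():
    """Extract: read the raw transactional rows from the operational system."""
    return list(OPERATIONAL_ROWS)


def transform(rows):
    """Transform: aggregate amounts per region to match the warehouse structure."""
    totals = {}
    for _order_id, region, amount_cents in rows:
        totals[region] = totals.get(region, 0) + amount_cents
    return sorted(totals.items())


def load(conn, aggregated):
    """Load: write the aggregated rows into the (assumed) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_by_region "
        "(region TEXT PRIMARY KEY, total_cents INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO sales_by_region VALUES (?, ?)", aggregated
    )
    conn.commit()


def run_etl():
    """Run extract, transform, and load in order, then read the result back."""
    conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse
    load(conn, transform(extract()))
    return conn.execute(
        "SELECT region, total_cents FROM sales_by_region ORDER BY region"
    ).fetchall()
```

In an ELT variant, the raw rows would be loaded first and the aggregation would run inside the target system instead of before the load.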

                   Data lakes are becoming an increasingly popular solution to support big data storage and data
                   discovery. Data lakes are similar to data warehouses in that they store large amounts of data from
                   various sources, but they also store additional data attributes from source systems at a level of
                   granularity that would ordinarily be lost in data aggregation for data warehouses. This provides big
                   data solutions with all available data elements at a sufficient level of granularity to perform a
                   complete analysis. This approach also gives organizations the flexibility to solve unanticipated
                   problems, because it keeps all data in a readily available format.
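
The difference in granularity can be illustrated with a small sketch. The event records, the `channel` attribute, and both function names are assumptions made for illustration: the warehouse-style view keeps only totals per region, while the lake-style query works from the raw records and can therefore group by an attribute nobody planned for.

```python
import json

# Hypothetical raw events, stored as-is in the data lake (one JSON document each).
RAW_EVENTS = [
    '{"order_id": 1, "region": "east", "amount_cents": 1250, "channel": "web"}',
    '{"order_id": 2, "region": "west", "amount_cents": 800, "channel": "store"}',
    '{"order_id": 3, "region": "east", "amount_cents": 250, "channel": "store"}',
]


def warehouse_view(raw_events):
    """Aggregate to totals per region; every other attribute is dropped."""
    totals = {}
    for line in raw_events:
        event = json.loads(line)
        region = event["region"]
        totals[region] = totals.get(region, 0) + event["amount_cents"]
    return totals


def lake_query(raw_events, attribute):
    """Group the raw events by any attribute still present in the records."""
    totals = {}
    for line in raw_events:
        event = json.loads(line)
        key = event[attribute]
        totals[key] = totals.get(key, 0) + event["amount_cents"]
    return totals
```

Calling `lake_query(RAW_EVENTS, "channel")` answers a question about sales channels that the aggregated view cannot, because the `channel` attribute does not survive the aggregation by region.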




