
Appendix C







             Questions to Help Evaluate the Outputs of Data Science







             Kenett and Shmueli (2014) defined information quality (InfoQ) as the utility derived from a
             specific analysis of a specific data set, conditioned on the analysis goals. InfoQ is determined
             by eight dimensions discussed in Chapter 13. Questions to help assess these dimensions for a
             specific report are listed below (see also Kenett and Shmueli 2016a).


             Dimension               Questions
             1.  Data resolution     1.1  Is the data scale aligned with the stated goal?
                                     1.2  How reliable and precise are the measuring devices or data sources?
                                     1.3  Is the data analysis suitable for the data aggregation level?
             2.  Data structure      2.1  Is the type of data used aligned with the stated goal?
                                     2.2  Are data integrity details (corrupted/missing values) described and
                                          handled appropriately?
                                     2.3  Are the analysis methods suitable for the data structure?
             3.  Data integration    3.1  Is the data from multiple sources properly integrated? If so, what is
                                          the credibility of each source?
                                     3.2  How is the integration done? Are there linkage issues that lead to
                                          crucial information being dropped?
                                     3.3  Does the data integration add value in terms of the stated goal?
                                     3.4  Does the data integration cause any privacy or confidentiality concerns?
             4.  Temporal relevance  4.1  Are data collection, data analysis, and deployment time-sensitive?
                                     4.2  Does the time gap between data collection and analysis cause any
                                          concern?
                                     4.3  Is the time gap between the data collection and analysis and the
                                          intended use of the model (e.g. in terms of policy recommendations)
                                          of any concern?
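
A few of the questions above can be backed by simple numbers before a reviewer makes a judgment. The Python sketch below is only an illustration and is not part of the InfoQ framework itself: the function names (missing_rate, unmatched_keys, gap_in_days) and the sample values are hypothetical, and a missing value is assumed to be represented by None. It shows one possible way to quantify question 2.2 (missing values), question 3.2 (records that linkage would drop), and question 4.2 (time gap between collection and analysis).

```python
from datetime import date


def missing_rate(values):
    """Share of entries that are None (supports question 2.2)."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def unmatched_keys(left_keys, right_keys):
    """Keys found in only one of two sources, i.e. records that a
    strict key match would drop during integration (question 3.2)."""
    left, right = set(left_keys), set(right_keys)
    return sorted(left ^ right)


def gap_in_days(collected_on, analyzed_on):
    """Days elapsed between data collection and analysis (question 4.2)."""
    return (analyzed_on - collected_on).days


# Hypothetical sample values, for illustration only.
print(missing_rate([4.1, None, 3.9, 4.0]))                 # 0.25
print(unmatched_keys(["a", "b", "c"], ["b", "c", "d"]))    # ['a', 'd']
print(gap_in_days(date(2018, 1, 1), date(2018, 3, 1)))     # 59
```

These numbers do not answer the questions on their own. Whether a 25% missing rate or a 59-day gap is a concern still depends on the stated goal of the analysis.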





             The Real Work of Data Science: Turning Data into Information, Better Decisions, and Stronger Organizations,
             First Edition. Ron S. Kenett and Thomas C. Redman.
             © 2019 Ron S. Kenett and Thomas C. Redman. Published 2019 by John Wiley & Sons Ltd.
             Companion website: www.wiley.com/go/kenett-redman/datascience