Page 73 - The Real Work Of Data Science Turning Data Into Information, Better Decisions, And Stronger Organizations by Ron S. Kenett, Thomas C. Redman (z-lib.org)

Evaluating Data Science Outputs More Formally

In the last chapter, we focused on teaching your colleagues some basics and providing a starter
set of questions for decision‐makers. Of course, this business of helping decision‐makers
become increasingly better consumers of data science never ends. As they gain experience,
you need to provide them with a more formal template based on the eight dimensions of the
information quality (InfoQ) model (Kenett and Shmueli 2016a). This will help them go deeper,
facilitate discussions regarding trade‐offs, and improve the quality of information
generated in their organizations. Breiman (2001) describes two cultures in the use of statistical
modeling to reach conclusions from data: data modeling and algorithmic analysis. The InfoQ
framework addresses outputs from both approaches, in the context of business, academic,
services, and industrial work.

Assessing Information Quality

The InfoQ framework provides a structured approach for evaluating analytic work. InfoQ
is defined as the utility, U, derived by conducting a certain analysis, f, on a given data set, X,
with respect to a given goal, g. For the mathematically inclined:

$$\mathrm{InfoQ}(U, f, X, g) = U\bigl(f(X \mid g)\bigr).$$

As an example, consider cellular operators who want to reduce churn by launching a customer
retention campaign. Their goal, g, is to correctly identify customers with high potential
for churn, the logical target of the campaign. The data, X, consists of customer usage,
lists of customers who’ve changed operators, traffic patterns, and problems reported to the
call center. The data scientist plans to use a decision tree, f, which will help him or her
define business rules that identify groups of customers with similar churn probabilities. The
utility, U, is the increase in profits from targeting the campaign only at customers with a high
churn potential.
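
To make the four components concrete, here is a minimal Python sketch of the churn example. Everything in it is an illustrative assumption rather than anything taken from the book: the six customer records, the `Customer` fields, the one-split rule standing in for a full decision tree, and the profit figures (a contact cost of 10 and a retained-customer value of 100, with every contacted churner assumed to stay). It only shows how U, f, X, and g fit together as U(f(X | g)).

```python
from dataclasses import dataclass


@dataclass
class Customer:
    monthly_minutes: float  # usage (hypothetical field)
    complaints: int         # problems reported to the call center
    churned: bool           # known outcome from historical records


# X: a tiny, made-up data set
X = [
    Customer(120, 4, True),
    Customer(300, 0, False),
    Customer(80, 3, True),
    Customer(450, 1, False),
    Customer(200, 2, False),
    Customer(60, 5, True),
]


def f(data, complaint_threshold=3):
    """Analysis f, chosen with goal g in mind (flag likely churners).

    A single split on complaints stands in for a decision tree.
    Returns one True/False flag per customer.
    """
    return [c.complaints >= complaint_threshold for c in data]


def U(flags, data, retained_value=100.0, contact_cost=10.0):
    """Utility U: campaign profit under the simplifying assumptions above."""
    profit = 0.0
    for flagged, customer in zip(flags, data):
        if flagged:
            profit -= contact_cost        # every contact costs money
            if customer.churned:
                profit += retained_value  # assumed: contacted churners stay
    return profit


def info_q(U, f, X):
    """InfoQ(U, f, X, g) = U(f(X | g)); g is implicit in the choice of f."""
    return U(f(X), X)


targeted = info_q(U, f, X)        # contact only flagged customers
blanket = U([True] * len(X), X)   # contact everyone
print(targeted, blanket)          # 270.0 240.0
```

With these made-up numbers, the targeted campaign yields more profit than contacting every customer, which is the sense in which the analysis adds utility with respect to the goal. Changing any one component (a different threshold in f, noisier data in X, a different cost structure in U) changes the result, which is why the framework evaluates all four together.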

The Real Work of Data Science: Turning Data into Information, Better Decisions, and Stronger Organizations,
First Edition. Ron S. Kenett and Thomas C. Redman.
© 2019 Ron S. Kenett and Thomas C. Redman. Published 2019 by John Wiley & Sons Ltd.
Companion website: www.wiley.com/go/kenett-redman/datascience
