The Real Work of Data Science: Turning Data Into Information, Better Decisions, and Stronger Organizations, by Ron S. Kenett and Thomas C. Redman
Understand the Real Problem
The second warning involves scope creep. The discussion with the manager in the example
above could just as easily have gone along the following lines:
sample size → general lack of management data → a need for new business intelligence (BI) tools → a culture that doesn’t invest in technology
All of these might be real, even important, problems, but they quickly grow far beyond
scope. And beware, there are always forces that will complicate even the simplest problem.
To guard against scope creep, especially early on, we recommend that you limit yourself to
problems that can be solved quickly, using existing resources, and within existing budgets.
Redman and the manager in the example above did just that. After they solved the first
problem, they defined the next problem, and the next. In doing so, they took on larger, more
complex issues. But they did so guided by hard facts, experience, and increasing confidence.
There are, of course, problems that require you to think longer term and spend real money
from the very beginning. Developing a predictive model that optimizes profits from credit
decisions, or one that works out the proper dosage for a new cancer medication, is a good example. But
the same thinking applies.
Understanding the problem is the first step in the life cycle we introduced in Chapter 1.
Here the domain expert ecosystem is translated into the analytic ecosystem, based on our
understanding of the problem. A poor translation can have disastrous results. As an example,
Schmarzo (2017) describes the negative effects of the Medicare Access and CHIP
Reauthorization Act of 2015, or MACRA. Among the major provisions of MACRA is the
Quality Payment Program. Under the Quality Payment Program, physicians and nurses receive
positive, neutral, or negative Medicare payment adjustments based upon a “Patient Satisfaction
Score.” But satisfying patients and helping them get better are not always the same thing, and
the program had a negative effect on patient outcomes.
On a more humorous side, Box (2001) told the story of a man who was very tall and his
4‐year‐old son. They were walking to get a newspaper, and the father suddenly realized that
the little boy had to run to keep up with him. So he said, “Sorry, Tommy, am I walking too
fast?” And the little boy said, “No, Daddy. I am.”
More examples of good problem elicitation as a prerequisite to proper statistical analysis
are provided in Kenett and Thyregod (2006).
Implications
A recurring theme throughout our careers has been how often managers who admit that
something “just doesn’t feel right” have been spot‐on. There are few real facts, and the initial
problem is to get some. We call these “We don’t know anything and need to sort out what’s
going on” problems. They occupy one end of a continuum of problems on which data scientists
should engage. On the other end of the continuum are what we call “optimality problems.”
Current efforts work tolerably well, and the problem is to optimize performance or save
money. And some problems occupy the middle ranges of the continuum.
Thus, the real work of data scientists involves listening to decision‐makers, learning their
languages, translating their languages into yours and back, engaging in ways that make them
feel comfortable, and working together to sort out real problems that they can address. Get
very good at this.