The Real Work of Data Science: Turning Data Into Information, Better Decisions, and Stronger Organizations, by Ron S. Kenett and Thomas C. Redman



           V{P} = value of the problem to be solved
           V{PS} = value of the problem actually solved
           P{S} = probability that the problem actually gets solved
           P{I} = probability that the solution is actually implemented
           T{I} = time the solution stays implemented
           E{R} = expected number of replications.
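
To make the list concrete, here is a minimal Python sketch of how the eight components might be combined into a single PSE score. It assumes the components are multiplied together, which is how the text's references to "this part of the equation" read; the class name `PSEComponents`, the field names, and the scoring scale are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, fields
from math import prod


@dataclass
class PSEComponents:
    """Hypothetical container for the eight components, each scored by the analyst."""
    v_d: float   # V{D}: value of the collected data
    v_m: float   # V{M}: value of the analytic methods employed
    v_p: float   # V{P}: value of the problem to be solved
    v_ps: float  # V{PS}: value of the problem actually solved
    p_s: float   # P{S}: probability the problem actually gets solved
    p_i: float   # P{I}: probability the solution is actually implemented
    t_i: float   # T{I}: time the solution stays implemented
    e_r: float   # E{R}: expected number of replications

    def score(self) -> float:
        # Assumption: the components combine as a simple product,
        # so one weak component drags down the whole score.
        return prod(getattr(self, f.name) for f in fields(self))
```

Under this assumption, a component scored at zero (for example, a solution that is never implemented) drives the whole score to zero, however strong the other seven are.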

           Let’s explore each in turn.

           V{D} = Value of the Collected Data
           The application of data science depends on data, so obtaining the right data of the right quality
           is critical. A high V{D} corresponds to data being most relevant to the problem, trusted,
           clearly understood by relevant stakeholders, and collected comprehensively without bias. We
           discussed this in Chapter 6.

           V{M} = Value of the Analytic Methods Employed
           This concept is closest to the original idea of mathematical statistical efficiency and includes
           the idea that the method should be as efficient as possible. As an example, suppose a manager
           wishes to reduce billing errors and must first obtain an accurate baseline error rate. Suppose
           there are two candidate methods, A and B. Method A is more efficient than method B if
           method A requires a smaller sample to provide the required estimate with the same prespecified
           error. More generally, a high V{M} is assigned to methods with proven mathematical
           properties, such as unbiasedness and consistency.
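
The efficiency comparison can be sketched in Python. The example assumes the baseline error rate is estimated as a simple proportion and uses the standard normal-approximation sample-size formula. The guessed error rate (10%), the margin of error (2 percentage points), and the variance inflation factor of 1.5 for method B are all illustrative assumptions, not figures from the text.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(p_guess, margin, confidence=0.95, variance_factor=1.0):
    """Sample size needed to estimate a proportion to within +/- margin.

    variance_factor is a hypothetical multiplier: 1.0 for the more efficient
    method, larger values for a method whose estimator has higher variance.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return ceil(variance_factor * z**2 * p_guess * (1 - p_guess) / margin**2)


# Illustrative inputs only: a 10% guessed error rate, estimated to +/- 2 points.
n_a = required_sample_size(p_guess=0.10, margin=0.02)                       # method A
n_b = required_sample_size(p_guess=0.10, margin=0.02, variance_factor=1.5)  # method B

print(f"Method A needs {n_a} bills; method B needs {n_b}.")
print(f"Sample size ratio B/A: {n_b / n_a:.2f}")
```

In this sketch method A is the more efficient one because it reaches the same prespecified error with the smaller sample.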

           V{P} = Value of the Problem to Be Solved
           Data scientists sometimes forget this part of the equation. Some might choose problems on the
           basis of technical depth rather than the value of solving them. To illustrate, one of us spent
           time figuring out how to reduce billing errors that were worth over $700,000/year, a fact
           crucial to management, even though solving the problem was not particularly difficult. A high
           V{P} is assigned to problems of strategic importance to the organization.

           V{PS} = Value of the Problem Actually Solved
           Usually no one method actually solves the entire problem, only part of it, so this part of the
           equation is expressed as a fraction of V{P}. In the case of the billing example, the manager
           expected to reduce the billing errors from 24,000 to 3,000 per billing cycle, a success rate of
           87.5%. Problems with high V{P} that are fully solved get a high V{PS}.
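
The 87.5% figure follows directly from the two error counts. A small Python check, with the hypothetical helper name `fraction_solved`:

```python
def fraction_solved(errors_before: int, errors_after: int) -> float:
    """Share of the original errors removed by the solution."""
    if errors_before <= 0:
        raise ValueError("errors_before must be positive")
    return (errors_before - errors_after) / errors_before


# Billing example from the text: 24,000 errors reduced to 3,000 per billing cycle.
print(f"{fraction_solved(24_000, 3_000):.1%}")  # 87.5%
```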

           P{S} = Probability the Problem Actually Gets Solved
           This is both a statistical question and a management question. Did the method work and lead
           to a solution that worked, and were the data, information, and resources available to solve the
           problem? Part of this PSE component depends on buy-in from management and technical
           personnel and on their willingness to take on the problem at hand. This is achieved by getting
           the relevant stakeholders to play an active role in specifying the problem and interpreting
           results. A high value of P{S} implies that proper planning and effective execution have been
           carried out.