The Real Work of Data Science: Turning Data Into Information, Better Decisions, and Stronger Organizations, by Ron S. Kenett and Thomas C. Redman




           [Figure 8.1  Modes of generalization. The diagram connects data (hard data, soft data)
           through modes of generalization (laws of nature, statistical generalization, domain
           generalization, intuition) and approaches (mechanistic models, predictive analytics,
           transportability) to decisions about the population from which the data were drawn,
           the future, or a related population.]



           Modes of Generalization

           The best, most reliable form of generalization involves the laws of nature. These include
           conservation of mass, conservation of energy, conservation of momentum, Newton’s laws, the
           principle of least action, the laws of thermodynamics, and Maxwell’s equations. Sometimes
           these are called “mechanistic models of modes of action.” These laws started as empirical
           laws that were embraced as laws of nature and have stood the test of time. They have been
           verified time and again, and today, we do not need further data to invoke them, only knowledge
           of physics, chemistry, biology, or other scientific disciplines.
             Mathematics, the queen of the sciences, offers a unique context. Paul Erdős, the famous
           mathematician, used to talk about The Book, in which God maintains the perfect proofs of
           mathematical theorems (Aigner and Ziegler 2000). The laws of nature build on The Book.
             Now consider statistical generalizability. Sorting it out requires deep understanding of the
           goals (Chapter 4). In making inference about a population parameter from a sample, statistical
           generalizability and sampling bias are the focus, and the question of interest is, “What
           population does the sample represent?” (Rao 1985). In contrast, for predicting the values of
           new observations, the question is whether the analysis captures associations in the training
           data (i.e. the data used in model building) that generalize to the to‐be‐predicted situations.
           Control charts present a good example. The logic goes like this: “Assuming the process
           remains stable, we expect performance to vary within the upper and lower control limits. We
           further expect average performance to be close to the center line.”
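
The control chart logic above can be illustrated with a minimal sketch. It assumes a Shewhart individuals chart, with three-sigma limits estimated from the average moving range; the text does not say which chart type is meant, so this choice, the function names, and the measurements are all hypothetical. Limits are computed from baseline data collected while the process is assumed stable, and new observations are then checked against them.

```python
from statistics import mean

# Standard bias-correction constant (d2) for moving ranges of two observations.
D2_FOR_N2 = 1.128


def control_limits(baseline):
    """Return (lcl, center, ucl) for an individuals chart built from baseline data."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline observations")
    center = mean(baseline)
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    sigma_hat = mean(moving_ranges) / D2_FOR_N2
    return center - 3 * sigma_hat, center, center + 3 * sigma_hat


def out_of_control(values, lcl, ucl):
    """Return the indices of observations that fall outside the control limits."""
    return [i for i, x in enumerate(values) if x < lcl or x > ucl]


# Hypothetical baseline measurements from a process assumed to be stable.
baseline = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 10.0, 9.7, 10.1, 9.9]
lcl, center, ucl = control_limits(baseline)

# Hypothetical new observations; the last one is deliberately far from the center line.
new_values = [10.0, 10.2, 9.8, 12.5]
flagged = out_of_control(new_values, lcl, ucl)

print(f"LCL={lcl:.3f}  center={center:.3f}  UCL={ucl:.3f}")
print("Indices outside the limits:", flagged)
```

The generalization rests entirely on the stability assumption: the limits describe future performance only for as long as the process behaves as it did during the baseline period. A point outside the limits is a signal to question that assumption, not proof that it has failed.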