The Real Work of Data Science: Turning Data into Information, Better Decisions, and Stronger Organizations, by Ron S. Kenett and Thomas C. Redman
Great data scientists cast a much deeper and wider net. They “go deep” by studying past
polls to get a sense of their strengths and weaknesses. In doing so, they will have learned
(for example) that people lie to pollsters. In mixed company, not a single person we hang out
with confessed that he or she planned to vote for Trump. But privately many admitted, “I’m
going to vote for Trump. I just don’t want my wife [or husband] to know.”
Similarly, a few in the media commented on how much more energy they felt at Trump
rallies than at Clinton rallies. They concluded that those who said they were going to vote
for Trump were more likely to actually do so. Even a small amount of lying or misplaced optimism about
voting could skew poll results. The great data scientist will conduct some simple simulations
to learn more.
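A simple simulation of this kind might look like the following Python sketch. It is an illustration only, not the authors' method: the function name `simulate_poll` and every parameter value (the 5% chance that a supporter misreports, the turnout rates) are hypothetical, chosen to show how modest misreporting or a turnout gap can pull a poll away from the actual vote.

```python
import random


def simulate_poll(true_support=0.50, lie_rate=0.05,
                  turnout_a=0.60, turnout_b=0.60, n=1000, seed=0):
    """Simulate n respondents in a two-candidate race (A versus B).

    Returns (share who tell the pollster "A", A's share of votes actually cast).
    All default values are hypothetical assumptions, not data from any real poll.
    """
    rng = random.Random(seed)
    said_a = 0
    votes_a = 0
    votes_total = 0
    for _ in range(n):
        supports_a = rng.random() < true_support
        # A supporter of A may tell the pollster "B" (a "shy" voter).
        reports_a = supports_a and rng.random() >= lie_rate
        said_a += int(reports_a)
        # Supporters of each candidate may turn out at different rates.
        turnout = turnout_a if supports_a else turnout_b
        if rng.random() < turnout:
            votes_total += 1
            votes_a += int(supports_a)
    vote_share = votes_a / votes_total if votes_total else float("nan")
    return said_a / n, vote_share


if __name__ == "__main__":
    # Compare an honest, equal-turnout baseline with a "shy and energized" case.
    for label, kwargs in [
        ("baseline", dict(lie_rate=0.0)),
        ("5% shy, higher turnout for A", dict(lie_rate=0.05, turnout_a=0.65)),
    ]:
        polled, voted = simulate_poll(n=20000, **kwargs)
        print(f"{label}: polled A = {polled:.3f}, actual A vote share = {voted:.3f}")
```

Varying `lie_rate` and the two turnout rates over plausible ranges gives a feel for how sensitive a poll-based forecast is to each effect.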
Further, there are plenty of other predictors of presidential victors, based on the economy,
the rate of employment, the winner of the previous Super Bowl, and so forth. Thus a great data
scientist will “go wide” also. To illustrate, some note that Americans eschew political dynasty.
So, after one party has held the presidency for two terms, Americans will lean toward the
other. Prior to 2016, we count eight relevant elections, the “other party” having won six. By
this logic, one would estimate the probability of a Trump victory as 6/8 = 75%.
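Because this estimate rests on only eight elections, it is worth asking how much uncertainty it carries. The sketch below is one hedged way to do that. It treats the eight elections as independent trials and uses a Wilson score interval; both are assumptions of this illustration, not something the text prescribes, and `wilson_interval` is a hypothetical helper name.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z ** 2 / trials
    center = (p + z ** 2 / (2 * trials)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / trials + z ** 2 / (4 * trials ** 2)
    ) / denom
    return center - half_width, center + half_width


# Counts taken from the text: the "other party" won 6 of 8 relevant elections.
point_estimate = 6 / 8
low, high = wilson_interval(6, 8)
print(f"point estimate = {point_estimate:.2f}, interval = ({low:.2f}, {high:.2f})")
```

With so few observations the interval is wide (roughly 41% to 93% under these assumptions), which is one reason to treat the 75% figure as one perspective among several.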
Note that great data scientists are not simply searching for the single best set of data, explanation,
or model. They are seeking to understand many perspectives, to see which support one
another, which conflict, how much variation they portend, and anything else that bears. They
talk to all sorts of people, try out new theories, ruthlessly discard those that do not satisfy, and
are always on the lookout for more and different data. This is how they find out the way the
world works!
Appendix A lists some of the traits of such data scientists.
Over the years we’ve had the privilege of working with dozens, maybe hundreds, of good
data scientists, statisticians, and analysts. And a few great ones. This relentless focus on learning
about the world is the key differentiator. The great ones possess four other traits as well:
1. They grow and take advantage of large networks. They need them. They are interested in
many things and can’t possibly be expert in all of them. Great data scientists cultivate
relationships with people who have different perspectives than their own. So much the
better to explore the world, learn of new sources of data, and try out interim theories.
2. They have a certain quantitative knack. Great data scientists simply see things that others
don’t. For example, a summer intern (who now uses his analytical prowess as head of a
media company) on his second day at an investment bank exhibited this inherent capability.
His boss had given him a stack of things to read, and in scanning through, he spotted an
error in a returns calculation. It took him about an hour to verify the error and determine
the correction.
What’s important here is that thousands of others did not see the error. It was obvious
to him, but not to anyone else. And this was a top‐tier investment bank. Presumably, at
least a few good analysts read the same material and did not spot it. Mathematics
has turned out to provide a convenient, amazingly effective language (the physicist Eugene
Wigner wrote of its “unreasonable effectiveness”) for describing the real world. The great data
scientist taps into that language intuitively and easily in ways that even good data
scientists cannot.
3. They have persistence. The great data scientists are persistent, and in many ways. The
intern in the vignette above made his discovery at a glance and confirmed it in an hour.