Page 223 - Deep Learning


206 Adaptation

Every action is a probe that bounces off the environment like a sonar signal,
returning with the outlines of otherwise unseen causes, objects and processes.
Deviations between expected and observed returns tell us that the world is not
as we assumed and hence provide an opportunity to correct our assumptions.
The prevalence of error must have been one of the factors that exerted selective
pressure on early humans once they set out on the unique evolutionary
pathway of relying more on acquired than innate skills. As the hunter-gatherer
bands moved through habitat after habitat on their great migration across the
globe,[3] their survival strategies were forever becoming maladaptive. It is plausible
that they evolved a special-purpose cognitive mechanism for making use of
the information that resides in errors, failures and other undesirable outcomes
to improve the fit between their strategies and their environments. If so, errors
are not merely eliminated as a side effect of successful adaptation. Errors play an
active role in their own elimination; we unlearn errors by learning from them.
The theoretical questions are these: What information resides in erroneous outcomes?
How, by what cognitive processes, can that information be extracted and
utilized? What behavioral implications follow from those processes?

FRAMING THE PROBLEM

The old proverbs “burnt child dreads the fire” and “once bitten, twice shy” suggest
that learning from error is straightforward: The cure is to refrain from performing
whatever action produced the bad outcome. Edward Thorndike codified
this idea in the second half of his Law of Effect:[4] What he called an “annoying
aftereffect” (i.e., an undesirable outcome) lowers the probability that the
learner – adult, animal or child – will perform that same action in the future.[5]
Repeated negative outcomes cause the erroneous action to disappear from the
learner’s behavior. In Thorndike’s colorful terminology, the “futile impulse” is
“stamped out.”[6] In the terminology introduced in Chapter 6, this effect can be
modeled by reducing – weakening – the strength of the rule that produced the
offending action. That rule will then lose to competing rules during conflict resolution
and hence apply less often and therefore generate fewer errors. Learning
a skill is to a considerable extent a matter of learning what not to do.
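
The weakening account just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the model from Chapter 6: the numeric strengths, the halving factor, the rule names and the "strongest rule wins" form of conflict resolution are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str        # hypothetical label for the rule
    action: str      # the action the rule proposes
    strength: float  # assumed numeric strength used in conflict resolution


def resolve_conflict(rules):
    """Assumed conflict resolution: the strongest competing rule wins."""
    return max(rules, key=lambda r: r.strength)


def weaken(rule, factor=0.5):
    """Reduce a rule's strength after an undesirable outcome (factor is assumed)."""
    rule.strength *= factor


def run_trials(rules, bad_actions, n_trials):
    """Apply the winning rule on each trial; weaken it if its action is an error."""
    history = []
    for _ in range(n_trials):
        chosen = resolve_conflict(rules)
        history.append(chosen.action)
        if chosen.action in bad_actions:
            weaken(chosen)
    return history


# Two competing rules; the stronger one initially produces the erroneous action.
rules = [
    Rule("R1", "touch-flame", 1.0),
    Rule("R2", "keep-away", 0.4),
]
history = run_trials(rules, bad_actions={"touch-flame"}, n_trials=4)
print(history)
```

In this toy run the erroneous rule wins the first two trials, is weakened each time, and from the third trial on loses to its competitor, so the erroneous action drops out of the learner's behavior.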
This “don’t do it again” response to error works with respect to sticking
fingers into flames, but it does not work as a general explanation. Unlike
burning one’s fingers, most actions are not intrinsically correct or incorrect.
For example, if an absent-minded professor tries to open the door to his
home with the key to his office, the lesson cannot be “never use the office key,”
lest he lock himself out of his office forever. The more plausible lesson is not
