Page 64 - Deep Learning
The Nature of the Enterprise 47
existence and prevalence, interactions, scaling and transition to practical
applications.
It is important to distinguish between two types of empirical support for
a hypothesized learning mechanism. A laboratory experiment might dem-
onstrate that a particular learning mechanism exists – that is, that people do
possess such a process and they can be induced to execute it by the right
experimental manipulation. The process might nevertheless be unimportant
because it is rarely triggered in everyday life and so does not explain a large
number of cognitive changes occurring outside the laboratory. Existence does
not guarantee prevalence. A laboratory experiment – an artificial situation
specifically arranged to enable observation – is in principle unable to provide
information about prevalence. As a result, information about prevalence is
almost always missing, complicating the evaluation of hypothesized learning
mechanisms.
The duration of individual learning events varies from a fraction of a second
to a few seconds. To explain large patterns in cognitive change (learning
curves, developmental patterns, the life-time growth of expertise, etc.), a the-
ory has to show how such events combine to generate the observed effects at
longer time scales. If there is a basic process that creates new links in memory
(under some set of triggering conditions), then what kind of memory network
does the repeated application of that process create over time? For example,
does it produce hierarchical structures? If not, it might not be a good hypoth-
esis about the acquisition of conceptual knowledge. If the mind composes cog-
nitive operations that repeatedly occur in sequence into a single operation,
what type of structure does that process produce in the long run? Deriving the
cumulative effect over time is difficult, but proposed mechanisms must pro-
duce realistic results over days, years and decades to be plausible. 54
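One way to make the scaling question concrete is to apply a candidate link-creating process many times in simulation and inspect the structure that results. The sketch below is a toy illustration only, not a model proposed in the text: the two growth rules (grow_by_attachment, grow_by_cooccurrence), the is_hierarchy check and all parameter values are hypothetical names and numbers chosen for the example.

```python
import random

def grow_by_attachment(n_events, rng):
    """Hypothetical rule A: each learning event adds one new node
    beneath one node that already exists."""
    parent = {0: None}  # node 0 is the root
    for new_node in range(1, n_events + 1):
        parent[new_node] = rng.randrange(new_node)  # any earlier node
    return {(p, c) for c, p in parent.items() if p is not None}

def grow_by_cooccurrence(n_nodes, n_events, rng):
    """Hypothetical rule B: each learning event links two distinct
    existing nodes chosen at random.
    Requires n_events <= n_nodes * (n_nodes - 1)."""
    links = set()
    while len(links) < n_events:
        a, b = rng.sample(range(n_nodes), 2)
        links.add((a, b))  # directed link from a to b
    return links

def is_hierarchy(links):
    """True if no node has more than one parent and no chain of
    parent links loops back on itself."""
    parent = {}
    for p, c in links:
        if c in parent:
            return False  # a second parent: not a strict hierarchy
        parent[c] = p
    for start in parent:
        seen = {start}
        node = start
        while node in parent:
            node = parent[node]
            if node in seen:
                return False  # cycle
            seen.add(node)
    return True

rng = random.Random(0)
print(is_hierarchy(grow_by_attachment(200, rng)))        # True
print(is_hierarchy(grow_by_cooccurrence(50, 200, rng)))  # False
```

Under these assumptions, rule A always yields a hierarchy, because every node receives exactly one parent that was created earlier. Rule B cannot yield one once the number of distinct links exceeds the number of nodes, because some node must then have two parents. The cumulative outcome depends on the form of the basic process, which is the property a proposed mechanism would have to get right.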
Scaling over time is closely related to scaling across system levels. If a basic
change process produces such-and-such an effect at the level of individual knowl-
edge representations, what are the implications for the behavior of the cognitive
architecture as a whole? For example, if an association process creates, say, 10,000
new associations over, say, 20 years of living, what is the effect on the person’s
cognitive functioning? If every new link is a potential retrieval path, will work-
ing memory be continuously flooded by retrieved information items of dubious
relevance for the task at hand? For learning theories that postulate multiple basic
change processes, scaling to the cognitive system as a whole also requires atten-
tion to how these processes interact. If there is more than one learning mecha-
nism, observable behavior is to be explained as the composite outcome of the
simultaneous operation of these multiple interacting mechanisms. For example,