Page 22 - Understanding Machine Learning
digitally recorded data, it becomes obvious that there are treasures of meaningful
information buried in data archives that are way too large and too
complex for humans to make sense of. Learning to detect meaningful patterns
in large and complex data sets is a promising domain in which the
combination of programs that learn with the almost unlimited memory
capacity and ever increasing processing speed of computers opens up new
horizons.
Adaptivity. One limiting feature of programmed tools is their rigidity – once the
program has been written down and installed, it stays unchanged. However,
many tasks change over time or from one user to another. Machine learning
tools – programs whose behavior adapts to their input data – offer a solution to
such issues; they are, by nature, adaptive to changes in the environment they
interact with. Typical successful applications of machine learning to such problems
include programs that decode handwritten text, where a fixed program can
adapt to variations between the handwriting of different users; spam detection
programs, adapting automatically to changes in the nature of spam e-mails; and
speech recognition programs.
1.3 TYPES OF LEARNING
Learning is, of course, a very wide domain. Consequently, the field of machine
learning has branched into several subfields dealing with different types of learning
tasks. We give a rough taxonomy of learning paradigms, aiming to provide some
perspective of where the content of this book sits within the wide field of machine
learning.
We describe four parameters along which learning paradigms can be classified.
Supervised versus Unsupervised. Since learning involves an interaction between the
learner and the environment, one can divide learning tasks according to the
nature of that interaction. The first distinction to note is the difference between
supervised and unsupervised learning. As an illustrative example, consider the
task of learning to detect spam e-mail versus the task of anomaly detection.
For the spam detection task, we consider a setting in which the learner receives
training e-mails for which the label spam/not-spam is provided. On the basis of
such training the learner should figure out a rule for labeling a newly arriving
e-mail message. In contrast, for the task of anomaly detection, all the learner
gets as training is a large body of e-mail messages (with no labels) and the
learner’s task is to detect “unusual” messages.
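The contrast between the two settings can be made concrete with a minimal sketch. Everything in it is an illustrative assumption rather than a method from this text: the function names (train_spam_rule, predict_spam, find_unusual), the toy messages, the word-based rules, and the 0.6 threshold are all hypothetical. The point is only what each learner receives: the supervised learner gets (message, label) pairs, while the anomaly detector gets messages alone.

```python
from collections import Counter


def tokenize(text):
    """Split a message into lowercase words (a deliberately crude tokenizer)."""
    return text.lower().split()


# Supervised setting: every training e-mail arrives with a spam/not-spam label.
def train_spam_rule(labeled_emails):
    """Return the set of words seen in spam but never in non-spam messages.

    labeled_emails is a list of (text, is_spam) pairs. This toy rule is an
    assumption made for illustration, not a recommended classifier.
    """
    spam_words, ham_words = set(), set()
    for text, is_spam in labeled_emails:
        (spam_words if is_spam else ham_words).update(tokenize(text))
    return spam_words - ham_words


def predict_spam(rule, text):
    """Label a new message as spam if it contains any spam-only word."""
    return any(word in rule for word in tokenize(text))


# Unsupervised setting: the learner sees messages only, with no labels.
def find_unusual(emails, threshold=0.6):
    """Flag messages whose vocabulary is mostly unshared with other messages.

    A message is "unusual" when more than `threshold` of its distinct words
    occur in no other message. The threshold is a hypothetical choice.
    """
    doc_freq = Counter()
    for text in emails:
        doc_freq.update(set(tokenize(text)))
    unusual = []
    for text in emails:
        words = set(tokenize(text))
        rare = sum(1 for w in words if doc_freq[w] == 1)
        if words and rare / len(words) > threshold:
            unusual.append(text)
    return unusual


labeled = [
    ("win free money now", True),
    ("free prize claim now", True),
    ("meeting at noon tomorrow", False),
    ("lunch tomorrow at noon", False),
]
rule = train_spam_rule(labeled)

unlabeled = [
    "meeting at noon tomorrow",
    "lunch at noon tomorrow",
    "meeting moved to noon",
    "zxqv lottery wire transfer urgent",
]
flagged = find_unusual(unlabeled)
```

Note that train_spam_rule cannot run without the labels, whereas find_unusual never looks at one; it can only compare messages with each other.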
More abstractly, viewing learning as a process of “using experience to gain
expertise,” supervised learning describes a scenario in which the “experience,”
a training example, contains significant information (say, the spam/not-spam
labels) that is missing in the unseen “test examples” to which the learned expertise
is to be applied. In this setting, the acquired expertise aims to predict
that missing information for the test data. In such cases, we can think of the
environment as a teacher that “supervises” the learner by providing the extra
information (labels). In unsupervised learning, however, there is no distinction
between training and test data. The learner processes input data with the goal