Page 21 - Understanding Machine Learning



              of some food is followed by the administration of unpleasant electrical shock, the
              rats do not tend to avoid that food. Similar failure of conditioning occurs when the
              characteristic of the food that implies nausea (such as taste or smell) is replaced
              by a vocal signal. The rats seem to have some “built in” prior knowledge telling
              them that, while temporal correlation between food and nausea can be causal, it is
              unlikely that there would be a causal relationship between food consumption and
              electrical shocks or between sounds and nausea.
                 We conclude that one distinguishing feature between the bait shyness learn-
              ing and the pigeon superstition is the incorporation of prior knowledge that biases
              the learning mechanism. This is also referred to as inductive bias. The pigeons in
              the experiment are willing to adopt any explanation for the occurrence of food.
              However, the rats “know” that food cannot cause an electric shock and that the
              co-occurrence of noise with some food is not likely to affect the nutritional value
              of that food. The rats’ learning process is biased toward detecting some kind of
              patterns while ignoring other temporal correlations between events.
                 It turns out that the incorporation of prior knowledge, biasing the learning pro-
              cess, is inevitable for the success of learning algorithms (this is formally stated and
              proved as the “No-Free-Lunch theorem” in Chapter 5). The development of tools
              for expressing domain expertise, translating it into a learning bias, and quantifying
              the effect of such a bias on the success of learning is a central theme of the theory
              of machine learning. Roughly speaking, the stronger the prior knowledge (or prior
              assumptions) that one starts the learning process with, the easier it is to learn from
              further examples. However, the stronger these prior assumptions are, the less flex-
              ible the learning is – it is bound, a priori, by the commitment to these assumptions.
              We shall discuss these issues explicitly in Chapter 5.
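The trade-off described above can be made concrete with a small illustrative sketch (not from the text; the target rule, sample sizes, and all names here are hypothetical). A learner whose prior assumption is that the labeling rule is a threshold on [0, 1] is compared with an unbiased learner that merely memorizes its training sample:

```python
import random

random.seed(0)

# Hypothetical target rule, unknown to both learners: label is 1 iff x >= 0.6.
def target(x):
    return int(x >= 0.6)

train = [(x, target(x)) for x in (random.random() for _ in range(20))]
test = [(x, target(x)) for x in (random.random() for _ in range(1000))]

def fit_threshold(sample):
    """Strong inductive bias: assume the rule has the form 'x >= t' and
    pick the candidate threshold with the fewest training mistakes."""
    candidates = [0.0] + sorted(x for x, _ in sample) + [1.0]
    def train_err(t):
        return sum(int(x >= t) != y for x, y in sample)
    best = min(candidates, key=train_err)
    return lambda x: int(x >= best)

def fit_memorizer(sample):
    """No inductive bias: memorize the sample, guess 0 on unseen points."""
    table = dict(sample)
    return lambda x: table.get(x, 0)

def error(h, sample):
    return sum(h(x) != y for x, y in sample) / len(sample)

h_bias = fit_threshold(train)
h_memo = fit_memorizer(train)
print("biased learner test error:", error(h_bias, test))
print("memorizer test error:", error(h_memo, test))
```

Because its prior assumption happens to match the target, the biased learner generalizes well from 20 examples, while the memorizer errs on essentially every unseen point with a positive label. The flip side of the trade-off also shows up here: had the true rule not been a threshold, the biased learner would be bound, a priori, by its commitment to that hypothesis class.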



              1.2 WHEN DO WE NEED MACHINE LEARNING?
               When do we need machine learning rather than directly programming our computers to
              carry out the task at hand? Two aspects of a given problem may call for the use of
              programs that learn and improve on the basis of their “experience”: the problem’s
              complexity and the need for adaptivity.

              Tasks That Are Too Complex to Program.

                    Tasks Performed by Animals/Humans: There are numerous tasks that we
                     human beings perform routinely, yet our introspection concerning how
                      we do them is not sufficiently elaborate to extract a well-defined
                      program. Examples of such tasks include driving, speech recognition,
                      and image understanding. In all of these tasks, state-of-the-art machine
                      learning programs, programs that “learn from their experience,” achieve
                      quite satisfactory results, once exposed to sufficiently many training examples.
                     Tasks beyond Human Capabilities: Another wide family of tasks that
                      benefits from machine learning techniques is related to the analysis of very
                     large and complex data sets: astronomical data, turning medical archives
                     into medical knowledge, weather prediction, analysis of genomic data, Web
                     search engines, and electronic commerce. With more and more available