Page 213 - Deep Learning



            Stage 2: Mastery
            The second stage of practice begins when the learner completes the task for the
            first time and it lasts until he can reliably perform the task correctly. During
            this stage, the main theoretical problem is how the learner can improve his
            incomplete and possibly incorrect version of the strategy-to-be-learned. The
            most important source of information during this middle stage consists of the
            outcomes of the actions generated by the current version of the target skill. To
            act tentatively in an unfamiliar environment is to ask questions of that
            environment. Pushing a button is to ask, what does this button do? and the machine’s
            response is the answer; poking a piece of coral with the tip of a diver’s knife
            is to ask, what is this?; if it scampers off, the information gained is that this is a
            well-camouflaged fish. In general, the relation between action and outcome
            contains information. Since the emergence of cybernetics in the 1940s, infor-
            mation to the effect that the learner’s action was appropriate, correct or useful
            is called positive feedback. The source of such information can be an instructor
            or a peer (good job; that’s the right answer; etc.), but it can also be the material
            task environment (oh, I see; this button does turn on the red light). Information
            to the effect that the learner’s action was erroneous, incorrect, inappropriate or
            unhelpful in some respect is called negative feedback. It, too, can come from a
            tutor (perhaps you should double-check that answer) or originate in the mate-
            rial environment, as when a novice driver fishtails when trying to drive on ice
            for the first time.
               Different mechanisms are required to learn from these two types of
            information. In their 1966 review of learning theories, Ernest R. Hilgard and
            Gordon H. Bower stated this point succinctly, using Right to refer to positive
            feedback and Wrong to refer to negative feedback:

               There is a logical difference between responding in the intelligent direc-
               tion to Right and Wrong. The intelligent response to Right is to do again
               what was last done. This makes possible immediate rehearsal; the task is
               clear. The intelligent response to Wrong is to do something different, but
               what to do is less clear. It is necessary both to remember what not to do
               and to form some sort of hypothesis as to what to do. 58
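
               The asymmetry described in this passage can be illustrated with a small
               sketch. It is only a toy under stated assumptions, not a model proposed
               in the text: the function name learn_by_feedback, the rehearsals
               parameter, and the use of a random guess as the "hypothesis" are all
               hypothetical choices made for illustration.

```python
import random


def learn_by_feedback(actions, is_right, rehearsals=3, seed=0):
    """Toy learner (illustrative assumption, not the author's mechanism).

    Right -> do again what was last done.
    Wrong -> remember what not to do, then guess something different.
    Returns (learned_action_or_None, history of (action, feedback) pairs).
    """
    rng = random.Random(seed)
    ruled_out = set()              # memory of actions that drew "Wrong"
    history = []
    current = rng.choice(actions)  # initial tentative action
    successes = 0

    while successes < rehearsals:
        right = is_right(current)
        history.append((current, right))
        if right:
            # Response to Right: the task is clear, repeat the same action.
            successes += 1
        else:
            # Response to Wrong: record what not to do ...
            successes = 0
            ruled_out.add(current)
            candidates = [a for a in actions if a not in ruled_out]
            if not candidates:
                return None, history  # every known action was ruled out
            # ... and form a hypothesis (here, simply a random untried action).
            current = rng.choice(candidates)

    return current, history
```

               Calling learn_by_feedback(["a", "b", "c", "d"], lambda a: a == "c")
               settles on "c" after at most three errors. The Right branch needs no
               search at all, whereas the Wrong branch needs both a memory and a way
               of generating alternatives, which is the logical difference the quotation
               points to.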

               The main conundrum regarding learning from positive feedback is what
            there is to learn: to elicit positive feedback, the learner must have done the
            right thing. But if he already knows what to do, then how does the positive
            feedback help? It almost certainly has some role to play, because giving positive
            feedback is a common move by experienced tutors. 59 One possibility is
            that many actions taken during skill acquisition are tentative. The learner