Page 38 - Zoo Animal Learning and Training

10  1  Learning Theory

  Generally, schedules of reinforcement can be continuous or intermittent. On a continuous reinforcement schedule, every emitted target behaviour is followed by a reinforcer. An example of this in the natural environment would be a baby learning to drink from their mother’s teat. Every time they suck on the teat they will get milk, which increases their chances of sucking on the teat in the future.

  However, schedules of natural contingencies are often intermittent: the behaviour isn’t reinforced every time it occurs. A classic example of an intermittent schedule in nature is bees’ foraging behaviour. A single bee will visit several different flowers to find nectar. However, if they visit the same flowers, they may not find nectar every time because the flower needs time to fill. We also see this in zoos/aquariums if you vary whether food is put in environmental enrichment items. The animal might check the environmental enrichment item every time it is placed into the enclosure, but will only manipulate the item if it is filled with food.

  Continuous and intermittent reinforcement can be broken down into four schedules of reinforcement: fixed ratio, variable ratio, fixed interval, and variable interval (Table 1.2).

  Schedules of reinforcement can differ in two ways. Firstly, they can differ in whether reinforcement depends on the number of responses or on the amount of time passed. In ratio schedules, reinforcement depends on the number of responses made. Ratio schedules are set to deliver reinforcement following a particular number of responses. Interval schedules are set to deliver reinforcement when one response is made after some amount of time has passed. Secondly, schedules can be fixed or variable. In fixed schedules, the number of responses needed to obtain reinforcement is the same every time. The number of responses can be 1 or 1000, but that number is fixed. In variable schedules, the number of responses required for reinforcement varies around some average.

  Let’s go over some examples. A keeper training an elephant to touch a target delivers food on a variable ratio 5 (written as VR 5). This means that, on average, the elephant will receive food on every fifth response. So the elephant might receive a piece of sweet potato after the first response, then after a further six responses, then after two, eight, five, and eight responses, and so on. If we were to train that same elephant


            Table 1.2  Reinforcement schedules.

Reinforcement       Definition                          Example
schedule

Fixed interval      Reinforcement is delivered at a     Turning out the animals to the yard: every morning at
                    predictable time interval.          10 am the keeper opens the night enclosure door, but the
                                                        animal’s behaviour of checking the door to go outside
                                                        isn’t reinforced until they check the door after 10 am.

Variable interval   Response is reinforced after an     Animal feedings: the time of feeding an animal may
                    interval of time which varies but   vary from day to day, but on average a keeper gives
                    centres around some average         food every 4 hours. Therefore, the animal’s response of
                    amount of time.                     checking the bowl will not be reinforced until an
                                                        average of 4 hours has passed.

Fixed ratio         Response is reinforced only         Multiple repetitions: you want the animal you are
                    after a specified number of         training to do multiple repetitions of the same
                    responses.                          behaviour. Therefore, you deliver reinforcement after
                                                        every 2 correct responses.

Variable ratio      Response is reinforced after an     Laboratory study: the lever in a Skinner box gives a
                    average number of responses.        pellet on average after 20 pulls. Thus the rat might
                                                        receive a pellet after 2 pulls or after 15 pulls, but on
                                                        average it is 20 pulls of the lever to receive a pellet.
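
The four schedules in Table 1.2 can also be read as simple rules for deciding whether a given response is reinforced. The Python sketch below is only an illustration under stated assumptions: the class names `RatioSchedule` and `IntervalSchedule` are hypothetical, and drawing the variable requirement from a uniform distribution centred on the average is a choice made here for simplicity, not something the text specifies.

```python
import random


class RatioSchedule:
    """Reinforce a response once a required number of responses has been made."""

    def __init__(self, mean, variable=False, rng=None):
        self.mean = mean              # e.g. 2 for FR 2, 5 for VR 5
        self.variable = variable      # False = fixed, True = variable
        self.rng = rng or random.Random()
        self.count = 0
        self.required = self._next_requirement()

    def _next_requirement(self):
        if not self.variable:
            return self.mean
        # Assumption: uniform draw from 1 .. 2*mean - 1, which averages to `mean`.
        return self.rng.randint(1, 2 * self.mean - 1)

    def respond(self, now=None):
        """Register one response; return True if it is reinforced."""
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._next_requirement()
            return True
        return False


class IntervalSchedule:
    """Reinforce the first response made after a required amount of time."""

    def __init__(self, mean, variable=False, rng=None):
        self.mean = mean              # e.g. 4 (hours) for a VI 4 schedule
        self.variable = variable
        self.rng = rng or random.Random()
        self.start = 0.0              # time of the last reinforcer
        self.required = self._next_requirement()

    def _next_requirement(self):
        if not self.variable:
            return self.mean
        # Assumption: uniform draw from 0 .. 2*mean, which averages to `mean`.
        return self.rng.uniform(0, 2 * self.mean)

    def respond(self, now):
        """Register a response at time `now`; return True if it is reinforced."""
        if now - self.start >= self.required:
            self.start = now
            self.required = self._next_requirement()
            return True
        return False


# Fixed ratio 2 (the "multiple repetitions" row): every second response pays off.
fr2 = RatioSchedule(2)
fr2_outcomes = [fr2.respond() for _ in range(6)]

# Fixed interval 10: only the first response after 10 time units is reinforced.
fi10 = IntervalSchedule(10)
fi10_outcomes = [fi10.respond(t) for t in [3, 9, 10, 12, 19, 21]]

# The VR 5 elephant example: responses needed for each piece of sweet potato.
vr5_requirements = [1, 6, 2, 8, 5, 8]
vr5_average = sum(vr5_requirements) / len(vr5_requirements)
```

Here `fr2_outcomes` alternates between unreinforced and reinforced responses, `fi10_outcomes` is reinforced only at times 10 and 21, and `vr5_average` confirms that the response counts in the elephant example average to 5.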