Page 265 - Deep Learning

248                         Adaptation

            that has received much attention in educational research. 54 We began by
            defining the basic cognitive capabilities that are needed to do subtraction. These
            include the ability to allocate visual attention as well as motor skills like writ-
            ing and crossing out digits. The initial rules for subtraction were sufficient for
            canonical subtraction problems in which the minuend digit is greater than
            the subtrahend digit in each column, for example, 678 – 234 = ?. Regrouping
            (popularly known as “borrowing”) was not necessary to solve the canonical
            problems. We then taught the model to perform correctly on problems that
            require regrouping as well. This two-step instructional sequence – canonical
            problems followed by problems that require regrouping – corresponds to the
            one observed in classroom teaching. 55
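As a minimal sketch of the distinction between canonical and noncanonical problems, the check below compares the two numbers column by column. The function names are hypothetical and not part of the model described here, and treating equal digits as canonical (no regrouping is needed in such a column) is an assumption of this sketch.

```python
def digits(n, width):
    """Return the decimal digits of n, least significant first, padded to width."""
    return [(n // 10**i) % 10 for i in range(width)]


def is_canonical(minuend, subtrahend):
    """True if no column requires regrouping.

    Assumption of this sketch: a column with equal digits counts as canonical.
    """
    width = len(str(max(minuend, subtrahend)))
    top = digits(minuend, width)
    bottom = digits(subtrahend, width)
    return all(m >= s for m, s in zip(top, bottom))


print(is_canonical(678, 234))  # every minuend digit exceeds the subtrahend digit
print(is_canonical(642, 278))  # the units and tens columns need regrouping
```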
               We tutored the subtraction model in two different methods for noncanon-
            ical problems, the regrouping method taught in most American schools and the
            augmenting method preferred in some European schools. The two procedures
            differ primarily in how they handle noncanonical columns, either by
            decrementing the minuend in the column with the next higher place value or by
            incrementing the subtrahend in that column. This difference was once thought
            by mathematics educators to be of pedagogical importance. 56 We tutored each
            of these subtraction methods in two ways that we referred to as procedural
            and conceptual. In procedural (rote) arithmetic, the learner sees an arithmetic
            problem as a spatial arrangement of digits on a page. In this version, the con-
            straints referred to that arrangement, for example, the answer should have a
            single digit in each column. In contrast, a conceptual arithmetician thinks about
            subtraction mathematically instead of typographically. The characters are
            symbols for numbers. In this version, the model’s internal representation explicitly
            encoded the place value of each digit, and the constraints encoded mathematical
            relations, such as if the value of a digit D in column N is $D \times 10^{m}$, then the
            value of a digit E in column N+1 should be $E \times 10^{m+1}$. Mathematics educators
            regard the conceptual approach to arithmetic as pedagogically superior.
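The two methods can be illustrated with a short sketch. It is not the simulation model itself: the function names are hypothetical, the code assumes 0 <= subtrahend <= minuend, and it handles regrouping across a zero by letting a digit go temporarily negative, which is a simplification of the paper-and-pencil procedure. Both functions add ten to the minuend digit of a noncanonical column; they differ only in how the next higher column compensates.

```python
def to_digits(n, width):
    """Decimal digits of n, least significant first, padded to width."""
    return [(n // 10**i) % 10 for i in range(width)]


def from_digits(ds):
    """Inverse of to_digits: the digit in column i is worth d * 10**i."""
    return sum(d * 10**i for i, d in enumerate(ds))


def subtract(minuend, subtrahend, method):
    """Column-wise subtraction; method is 'regrouping' or 'augmenting'."""
    if not 0 <= subtrahend <= minuend:
        raise ValueError("this sketch assumes 0 <= subtrahend <= minuend")
    if method not in ("regrouping", "augmenting"):
        raise ValueError("unknown method")
    width = len(str(minuend))
    top = to_digits(minuend, width)
    bottom = to_digits(subtrahend, width)
    answer = []
    for i in range(width):
        if top[i] < bottom[i]:            # noncanonical column
            top[i] += 10
            if method == "regrouping":
                top[i + 1] -= 1           # decrement the next minuend digit
            else:
                bottom[i + 1] += 1        # increment the next subtrahend digit
        answer.append(top[i] - bottom[i])
    return from_digits(answer)


print(subtract(642, 278, "regrouping"))
print(subtract(642, 278, "augmenting"))
```

Because the highest column never needs compensation when the subtrahend does not exceed the minuend, the index `i + 1` stays in range under the stated assumption.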
               The distinction between procedural (rote) and conceptual learning
            combined with the two different algorithms, regrouping and augmenting,
            to define four different learning scenarios. We tutored HS until mastery in
            each scenario with the same procedure we would use to tutor a student: We
            watched the model solve a problem until it made an error. We interrupted
            the model and typed in a constraint that we thought would allow the model
            to correct that error. The constraint was added to the constraint base in the
            model, and the model was restarted. This cycle continued until the model
            performed correctly. Mastery was assessed by running the model on a test set of
            66 subtraction problems that were not used during training.
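The tutoring cycle can be sketched as a loop. Everything in the block is a hypothetical stand-in: `toy_solve` is not HS and does not implement its learning mechanism, the constraint is reduced to a plain string, and the problem lists are invented for illustration. The sketch only shows the control flow of watching for an error, adding a constraint, restarting, and then checking held-out problems.

```python
def tutor_until_correct(solve, constraint_base, problems, propose_constraint,
                        max_cycles=100):
    """Repeat: run the model, stop at the first error, add a constraint, restart.

    Returns the number of constraints added before the model solved every
    training problem correctly.
    """
    for cycle in range(max_cycles):
        error = None
        for minuend, subtrahend in problems:
            answer = solve(minuend, subtrahend, constraint_base)
            if answer != minuend - subtrahend:      # the tutor spots an error
                error = (minuend, subtrahend, answer)
                break                               # interrupt the model
        if error is None:
            return cycle
        constraint_base.append(propose_constraint(*error))
    raise RuntimeError("model did not reach correct performance")


def toy_solve(minuend, subtrahend, constraint_base):
    """Toy model: subtracts the smaller digit from the larger in each column
    until the (hypothetical) constraint below is in its constraint base."""
    if "no-smaller-from-larger" in constraint_base:
        return minuend - subtrahend
    width = len(str(minuend))
    return sum(
        abs((minuend // 10**i) % 10 - (subtrahend // 10**i) % 10) * 10**i
        for i in range(width)
    )


training = [(678, 234), (642, 278)]      # invented training problems
held_out = [(905, 617), (830, 45)]       # invented test problems
base = []
added = tutor_until_correct(
    toy_solve, base, training,
    propose_constraint=lambda m, s, a: "no-smaller-from-larger",
)
mastered = all(toy_solve(m, s, base) == m - s for m, s in held_out)
print(added, mastered)
```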