Page 218 - Linear Models for the Prediction of Animal Breeding Values 3rd Edition

11.9   Cross-validation and Genomic Reliabilities

         As described in previous sections, SNP effects are usually estimated in a reference
         population of animals with observations. In the dairy industry, SNP effects have
         mostly been estimated using bulls with high reliability as the reference population,
         with deregressed breeding values (DRP) used as observations. Recently, some
         countries have started including cows in the reference populations, which requires
         weighting the cow records appropriately. Ideally, the estimates of SNP effects should
         be validated in another data set that has not contributed any information to the
         reference population, in order to assess the accuracy of prediction. In practice, the
         cross-validation should be carried out in several different randomly sampled
         validation data sets to avoid any bias.
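
The repeated random sampling of validation sets can be sketched as follows. This is a minimal illustration, not a procedure taken from the text: the function name `random_validation_splits`, its parameters and the fixed seed are all hypothetical, and animals are represented only by their row indices.

```python
import numpy as np

def random_validation_splits(n_animals, n_validation, n_repeats, seed=0):
    """Yield (reference_idx, validation_idx) pairs of disjoint animal indices.

    Hypothetical helper: each repeat draws a fresh random validation set,
    and all remaining animals form the reference population for that repeat.
    """
    rng = np.random.default_rng(seed)
    for _ in range(n_repeats):
        shuffled = rng.permutation(n_animals)
        # First n_validation shuffled animals are held out; the rest are reference.
        yield shuffled[n_validation:], shuffled[:n_validation]

# Example: 10 animals, 3 held out for validation, 2 independent repeats.
for ref_idx, val_idx in random_validation_splits(10, 3, n_repeats=2):
    print(len(ref_idx), len(val_idx))
```

SNP effects would then be re-estimated from the reference animals of each repeat, so that no validation animal contributes information to the estimates it is used to test.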
            The DGV computed for the validation data sets are compared with their DRP.
         The correlation between the DGV and the DRP in the validation animals provides
         an estimate of the accuracy of genomic predictions, although this does not take
         into account the accuracy of the DRP themselves. For the purposes of illustration,
         the correlation between the DGVs from the SNP or GBLUP models and the DRPs
         for the validation animals in the data for Example 11.1 is 0.49, which gives a
         reliability of 0.49^2 = 0.24. The accuracies or reliabilities from the cross-validation
         studies are usually referred to as realized reliabilities.
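
The calculation of a realized accuracy and reliability can be sketched as below, taking the reliability as the squared correlation between DGV and DRP, as in the 0.49 and 0.24 figures above. The function name `realized_reliability` and the toy numbers are hypothetical; they are not the data of Example 11.1.

```python
import numpy as np

def realized_reliability(dgv, drp):
    """Return (accuracy, reliability) for a set of validation animals.

    accuracy    = Pearson correlation between DGV and DRP
    reliability = accuracy squared
    """
    accuracy = np.corrcoef(dgv, drp)[0, 1]
    return accuracy, accuracy ** 2

# Toy values for four validation animals (made up for illustration only).
dgv = np.array([0.2, -0.1, 0.5, 0.3])
drp = np.array([0.4, 0.1, 0.3, 0.6])
accuracy, reliability = realized_reliability(dgv, drp)
print(accuracy, reliability)
```

As the text notes, this correlation ignores the accuracy of the DRP themselves, so the sketch makes no adjustment for it.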
            Theoretical reliabilities, as calculated in traditional BLUP, can also be computed
         from the inverse of equations similar to those used to compute DGVs. For individuals
         with observations, reliabilities for the DGV can be computed (VanRaden, 2008) by
         first computing B as follows:
            \[ B = G\left[G + R\left(\frac{\sigma_e^2}{\sigma_a^2}\right)\right]^{-1} G \]
         Then reliability for animal i = $rel_i = 1 - (b_{ii}\,\sigma_e^2/\sigma_a^2)$, where $b_{ii}$ is the diagonal element
         of B for the animal. Similarly, for validation candidates with no records, B is:
            \[ B = C\left[G + R\left(\frac{\sigma_e^2}{\sigma_a^2}\right)\right]^{-1} C' \]
         Then reliability is computed from the diagonal elements of B as described for the
         reference animals.
            However, these theoretical reliability estimates tend to be too high. They can be
         scaled by the realized reliabilities from the cross-validation study. In addition, with a
         large data set, the matrix inversion required to compute the reliabilities could limit
         the use of the methodology.



         11.10   Understanding SNP Solutions from the Various Models

         The vector g can be computed from the second row of the MME in Eqn 11.7. Thus:
            \[ \hat{g} = (Z'R^{-1}Z + I\alpha)^{-1}\,(Z'R^{-1}(y - X\hat{b})) \]
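
A minimal numerical sketch of this expression is given below. It assumes that the estimate of b is already available, that alpha is a known scalar, and that the inverse of R is supplied directly. The function name `snp_solutions` and the toy inputs are hypothetical and are not taken from any example in the text.

```python
import numpy as np

def snp_solutions(Z, R_inv, y, X, b_hat, alpha):
    """Solve (Z'R^-1 Z + I*alpha) g = Z'R^-1 (y - X b_hat) for g.

    Z      : (n, m) matrix of SNP covariates
    R_inv  : (n, n) inverse of the residual (weight) matrix R
    y      : (n,)   observations
    X      : (n, p) design matrix for the fixed effects
    b_hat  : (p,)   fixed-effect solutions, assumed already computed
    alpha  : scalar added to the diagonal, assumed known
    """
    lhs = Z.T @ R_inv @ Z + alpha * np.eye(Z.shape[1])
    rhs = Z.T @ R_inv @ (y - X @ b_hat)
    # Solving the linear system avoids forming the inverse explicitly.
    return np.linalg.solve(lhs, rhs)

# Toy inputs, chosen only so the arithmetic is easy to follow by hand.
Z = np.eye(2)
R_inv = np.eye(2)
y = np.array([3.0, 5.0])
X = np.ones((2, 1))
b_hat = np.array([1.0])
g_hat = snp_solutions(Z, R_inv, y, X, b_hat, alpha=1.0)
print(g_hat)
```

With these toy inputs the left-hand side is 2I and the right-hand side is (2, 4)', so the solutions are simply the corrected observations halved.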