Page 218 - Linear Models for the Prediction of Animal Breeding Values 3rd Edition
11.9 Cross-validation and Genomic Reliabilities
As described in previous sections, SNP effects are usually estimated in a reference
population of animals with observations. In the dairy industry, the reference
population has consisted mostly of bulls with high reliability, with deregressed
proofs (DRP) used as observations. Recently, some countries have started including
cows in the reference populations, which requires weighting the cow records
appropriately. Ideally, the estimates of SNP effects should be validated in another
data set that has contributed no information to the reference population, in order to
assess the accuracy of prediction. In practice, the cross-validation should be carried
out on several different randomly sampled validation data sets to avoid bias.
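The repeated random sampling of validation sets can be sketched as follows. This is a minimal illustration and not a procedure given in the text: the function name `random_validation_splits`, its parameters and the use of NumPy are all assumptions, and animals are represented only by their row indices.

```python
import numpy as np


def random_validation_splits(n_animals, n_validation, n_repeats, seed=0):
    """Yield (reference_idx, validation_idx) pairs of animal row indices.

    Hypothetical helper: each repeat draws a fresh random validation set,
    and the remaining animals form the reference population, so validation
    animals contribute no information to the SNP estimates in that repeat.
    """
    if not 0 < n_validation < n_animals:
        raise ValueError("n_validation must be between 1 and n_animals - 1")
    rng = np.random.default_rng(seed)
    for _ in range(n_repeats):
        perm = rng.permutation(n_animals)
        validation_idx = np.sort(perm[:n_validation])
        reference_idx = np.sort(perm[n_validation:])
        yield reference_idx, validation_idx


# Illustrative sizes only (not taken from Example 11.1).
for ref_idx, val_idx in random_validation_splits(10, 3, n_repeats=2):
    print("reference:", ref_idx, "validation:", val_idx)
```

Within each repeat, SNP effects would be estimated from the reference animals only and the resulting DGV compared with the DRP of the validation animals.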
The DGV computed for the animals in the validation data sets are compared with their
DRP. The correlation between the DGV and the DRP in the validation animals provides
an estimate of the accuracy of genomic predictions, although this does not take into
account the accuracy of the DRP themselves. For the purposes of illustration, the
correlation between the DGV from the SNP or GBLUP models and the DRP for the
validation animals in the data for Example 11.1 is 0.49, which gives a reliability
of 0.49^2 = 0.24. The accuracies or reliabilities from cross-validation studies are
usually referred to as realized reliabilities.
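The step from correlation to realized reliability can be sketched in a few lines. The function name `realized_reliability` and the DGV and DRP values below are hypothetical, chosen only to show the calculation; they are not the data of Example 11.1.

```python
import numpy as np


def realized_reliability(dgv, drp):
    """Return (accuracy, reliability) for a set of validation animals.

    Accuracy is the Pearson correlation between DGV and DRP; reliability
    is its square. The accuracy of the DRP themselves is ignored, as in
    the text.
    """
    dgv = np.asarray(dgv, dtype=float)
    drp = np.asarray(drp, dtype=float)
    if dgv.ndim != 1 or dgv.shape != drp.shape or dgv.size < 2:
        raise ValueError("dgv and drp must be 1-D, equal length, n >= 2")
    accuracy = np.corrcoef(dgv, drp)[0, 1]
    return accuracy, accuracy ** 2


# Hypothetical validation values, for illustration only.
dgv = [0.12, -0.30, 0.45, 0.05, -0.10]
drp = [0.30, -0.10, 0.20, -0.25, 0.15]
accuracy, reliability = realized_reliability(dgv, drp)
print(f"accuracy = {accuracy:.2f}, reliability = {reliability:.2f}")
```

When several random validation sets are used, the same function could be applied to each set and the results averaged.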
Theoretical reliabilities, as calculated in traditional BLUP, can also be computed
from the inverse of equations similar to those used to compute DGVs. For individuals
with observations, reliabilities for the DGV can be computed (VanRaden, 2008) by
first computing B as follows:
$$B = G\left(G + R\left(\frac{\sigma_e^2}{\sigma_a^2}\right)\right)^{-1} G$$
Then the reliability for animal i is $rel_i = 1 - (b_{ii}\,\sigma_e^2/\sigma_a^2)$, where $b_{ii}$ is the diagonal
element of B for the animal. Similarly, for validation candidates with no records, B is:
$$B = C\left(G + R\left(\frac{\sigma_e^2}{\sigma_a^2}\right)\right)^{-1} C'$$
Then reliability is computed from the diagonal elements of B as described for the
reference animals.
However, these theoretical reliabilities tend to be too high; they can be scaled
using the realized reliabilities from the cross-validation study. In addition, with a
large data set, the matrix inversion required to compute the reliabilities can limit
the use of this methodology.
11.10 Understanding SNP Solutions from the Various Models
The vector g can be computed from the second row of the MME in Eqn 11.7. Thus:
$$\hat{g} = (Z'R^{-1}Z + I\alpha)^{-1}\,(Z'R^{-1}(y - X\hat{b}))$$
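The expression for the SNP solutions can be evaluated numerically as in the sketch below. It assumes that R is diagonal, so that its inverse can be passed as a vector of weights, and that an estimate of b is already available. The function name `snp_solutions`, its argument names and the toy numbers are hypothetical and are not taken from Example 11.1.

```python
import numpy as np


def snp_solutions(Z, r_inv_diag, y, X, b_hat, alpha):
    """Solve (Z'R^-1 Z + I*alpha) g = Z'R^-1 (y - X b_hat) for g.

    Assumes R is diagonal; r_inv_diag holds the diagonal of R^-1.
    """
    Z = np.asarray(Z, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    b_hat = np.asarray(b_hat, dtype=float)
    r_inv_diag = np.asarray(r_inv_diag, dtype=float)

    residual = y - X @ b_hat            # observations corrected for fixed effects
    zt_rinv = Z.T * r_inv_diag          # Z'R^-1 when R is diagonal
    lhs = zt_rinv @ Z + alpha * np.eye(Z.shape[1])
    rhs = zt_rinv @ residual
    # Solve the linear system rather than forming the inverse explicitly.
    return np.linalg.solve(lhs, rhs)


# Toy data, for illustration only: 4 animals, 2 SNPs, overall mean as b.
Z = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]])
r_inv_diag = np.array([1.0, 2.0, 1.0, 0.5])
y = np.array([1.0, 2.0, 3.0, 4.0])
X = np.ones((4, 1))
b_hat = np.array([2.0])
print(snp_solutions(Z, r_inv_diag, y, X, b_hat, alpha=3.0))
```

Using `np.linalg.solve` avoids an explicit matrix inverse, which is usually preferable numerically when only the solutions are needed.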