Page 199 - Linear Models for the Prediction of Animal Breeding Values 3rd Edition

2. A model estimating breeding values directly, with the (co)variance among breeding
        values $G\sigma_a^2$ fitted, where G is the genomic relationship matrix. The matrix G
        represents the realized proportion of the genome that animals share in common and is
        estimated from the SNPs.
            These models will now be described in more detail.

        11.5.1  SNP-BLUP model


        In matrix form, the mixed linear model for estimating SNP effects can be written as
        (Meuwissen et al., 2001; VanRaden, 2008):
            y = Xb + Zg + e                                                 (11.6)
        where g is a vector of additive genetic effects corresponding to allele substitution
        effects for each SNP, and all other terms are defined as in Eqn 11.3. The matrix Z relates
        SNP effects to the phenotypes. The SNP effects, weighted by each animal's genotype
        codes in Z and summed over all marker loci, are assumed equal to the vector of
        breeding values (a), i.e. DGV = a = Zg. The MME for Eqn 11.6 are:
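        $$
        \begin{pmatrix} X'R^{-1}X & X'R^{-1}Z \\ Z'R^{-1}X & Z'R^{-1}Z + I\alpha \end{pmatrix}
        \begin{pmatrix} \hat{b} \\ \hat{g} \end{pmatrix}
        =
        \begin{pmatrix} X'R^{-1}y \\ Z'R^{-1}y \end{pmatrix} \tag{11.7}
        $$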
        where $\alpha = \sigma_e^2/\sigma_g^2$ and R is a diagonal matrix of weights (see Eqn 11.5). The MME in
        Eqn 11.7 can easily be set up and solutions obtained for each SNP and the fixed
        effects. However, in practice, the value of $\sigma_g^2$ may not be known and $\sigma_g^2$ could be
        obtained either as $\sigma_g^2 = \sigma_a^2/m$, with m = the number of markers, or as
        $\sigma_g^2 = \sigma_a^2/[2\sum_j p_j(1 - p_j)]$. The latter is preferred as it takes into account the
        differences in allele frequencies. With the latter, $\alpha = 2\sum_j p_j(1 - p_j)\,[\sigma_e^2/\sigma_a^2]$, with
        $\sigma_a^2$ being the additive genetic variance for the trait and $p_j$ as defined in Section 11.3.
        Hayes and Daetwyler (2013) indicated that there is a potential problem with this estimate
        as it assumes the LD between SNP and QTL is perfect and all genetic variance is captured
        by the SNP. This may not be the case in practice and they recommended the method
        described by Moser et al. (2010) for estimating $\alpha$ through cross-validation. The method
        involves estimating SNP effects with different values of $\alpha$ and predicting DGV in
        validation data sets that have not contributed to the estimation of SNP effects. The value
        of $\alpha$ that minimizes the mean square error between the DGV and y is taken as the
        appropriate estimate. This process can be repeated, dropping out different subsets of the
        data and obtaining an estimate of $\alpha$ by averaging across data sets.
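
The cross-validation procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of Moser et al. (2010): the function names (`fit_snp_effects`, `alpha_by_cross_validation`), the grid of candidate values, the use of an overall mean as the only fixed effect, unweighted records (R = I) and the simulated genotypes and phenotypes are all hypothetical. The prediction compared with y is taken as the estimated mean plus DGV, which is also an assumption.

```python
import numpy as np


def fit_snp_effects(Z, y, alpha):
    """Solve the SNP-BLUP MME with an overall mean only and R = I (assumed)."""
    n, m = Z.shape
    X = np.ones((n, 1))
    lhs = np.block([[X.T @ X, X.T @ Z],
                    [Z.T @ X, Z.T @ Z + alpha * np.eye(m)]])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    sol = np.linalg.solve(lhs, rhs)
    return sol[0], sol[1:]          # estimated mean, estimated SNP effects


def alpha_by_cross_validation(Z, y, alphas, n_folds=5, seed=0):
    """Pick the best alpha per held-out subset, then average across subsets."""
    alphas = np.asarray(alphas, dtype=float)
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    best_per_fold = []
    for val_idx in folds:
        # Validation animals do not contribute to the estimation of SNP effects.
        train_idx = np.setdiff1d(np.arange(n), val_idx)
        mse = []
        for alpha in alphas:
            mu, g = fit_snp_effects(Z[train_idx], y[train_idx], alpha)
            pred = mu + Z[val_idx] @ g          # mean + DGV
            mse.append(np.mean((y[val_idx] - pred) ** 2))
        best_per_fold.append(float(alphas[int(np.argmin(mse))]))
    return float(np.mean(best_per_fold)), best_per_fold


# Hypothetical simulated data, for illustration only.
rng = np.random.default_rng(42)
Z = rng.integers(0, 3, size=(60, 10)).astype(float)   # genotype codes 0/1/2
true_g = rng.normal(0.0, 1.0, size=10)
y = 10.0 + Z @ true_g + rng.normal(0.0, 3.0, size=60)

candidate_alphas = [0.1, 1.0, 10.0, 100.0]
alpha_hat, per_fold = alpha_by_cross_validation(Z, y, candidate_alphas)
print(alpha_hat, per_fold)
```

Averaging the per-subset optimum follows the wording of the text; averaging the mean square error over subsets before choosing a single value is a common alternative.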
        Example 11.2
        Using the data and genetic parameters given in Example 11.1, SNP effects are
        predicted using Eqn 11.6 and all ten SNPs. Then DGVs are computed for the reference
        and validation animals. Initially, analyses are carried out without weights, thus $R = I\sigma_e^2$.
        Then the data are re-analysed using EDCs as weights, with R in Eqn 11.7 being a
        diagonal matrix containing EDCs for reference bulls.
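
The two analyses outlined above (unweighted, then weighted) can be sketched with NumPy as follows. This is an illustrative sketch only: the function name `solve_snp_blup`, the argument `r_inv_diag` and all data values are hypothetical and are not the data of Example 11.1. The argument `r_inv_diag` is taken to be the diagonal of $R^{-1}$; how EDCs map onto that diagonal depends on Eqn 11.5 and is left as an assumption here. The unweighted run uses identity weights, on the assumption that $\alpha$ already carries the variance ratio, and the genotype codes are centred on their column means, which is also an assumption.

```python
import numpy as np


def solve_snp_blup(X, Z, y, alpha, r_inv_diag=None):
    """Set up and solve the MME of Eqn 11.7; returns (b_hat, g_hat)."""
    X = np.asarray(X, dtype=float)
    Z = np.asarray(Z, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    m = Z.shape[1]
    # Diagonal of R^{-1}; identity when no weights are supplied.
    w = np.ones(n) if r_inv_diag is None else np.asarray(r_inv_diag, dtype=float)
    XtRi = X.T * w                  # X'R^{-1}
    ZtRi = Z.T * w                  # Z'R^{-1}
    lhs = np.block([[XtRi @ X, XtRi @ Z],
                    [ZtRi @ X, ZtRi @ Z + alpha * np.eye(m)]])
    rhs = np.concatenate([XtRi @ y, ZtRi @ y])
    sol = np.linalg.solve(lhs, rhs)
    return sol[:k], sol[k:]


# Hypothetical toy data: 8 reference and 3 validation animals, 10 SNPs.
rng = np.random.default_rng(1)
n_ref, n_val, n_snp = 8, 3, 10
M = rng.integers(0, 3, size=(n_ref + n_val, n_snp)).astype(float)
Z_all = M - M.mean(axis=0)          # centring on column means (assumption)
Z_ref, Z_val = Z_all[:n_ref], Z_all[n_ref:]
X = np.ones((n_ref, 1))             # overall mean as the only fixed effect
y = rng.normal(10.0, 2.0, size=n_ref)
alpha = 24.598                      # value computed in the text

# Unweighted analysis.
b_hat, g_hat = solve_snp_blup(X, Z_ref, y, alpha)
dgv_ref = Z_ref @ g_hat             # DGV of reference animals
dgv_val = Z_val @ g_hat             # DGV of validation animals

# Weighted analysis with hypothetical weights on the diagonal of R^{-1}.
weights = rng.uniform(0.5, 2.0, size=n_ref)
b_w, g_w = solve_snp_blup(X, Z_ref, y, alpha, r_inv_diag=weights)
dgv_val_w = Z_val @ g_w
print(dgv_val, dgv_val_w)
```

Validation animals have no records in y; their DGVs come only from their genotypes and the SNP solutions estimated in the reference animals.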
        COMPUTING THE REQUIRED MATRICES AND $\alpha$
        The allele frequencies for the ten SNPs have been calculated in Example 11.1. Using
        those frequencies, $2\sum_j p_j(1 - p_j) = 3.5383$. Thus $\alpha = 3.5383 \times (245/35.242) = 24.598$.
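
The arithmetic above can be checked with a few lines of Python. The function names are hypothetical. The values 3.5383, 245 and 35.242 are those quoted in the text; the two allele frequencies in `toy_p` are made up solely to exercise the summation, since the frequencies of Example 11.1 are not reproduced here.

```python
import numpy as np


def sum_two_pq(p):
    """Return 2 * sum_j p_j (1 - p_j) for a vector of allele frequencies."""
    p = np.asarray(p, dtype=float)
    return float(2.0 * np.sum(p * (1.0 - p)))


def alpha_from_variances(two_pq_sum, var_e, var_a):
    """alpha = 2 * sum_j p_j (1 - p_j) * (var_e / var_a)."""
    return two_pq_sum * var_e / var_a


# Values quoted in the text for Example 11.2.
alpha = alpha_from_variances(3.5383, 245.0, 35.242)
print(round(alpha, 3))      # 24.598

# Hypothetical frequencies, only to show how the sum is formed.
toy_p = [0.5, 0.25]
print(sum_two_pq(toy_p))    # 0.875
```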
        Computation of Genomic Breeding Values and Genomic Selection         183