Page 195 - Linear Models for the Prediction of Animal Breeding Values 3rd Edition

2 and 0 for the two homozygotes (AA and BB) and 1 for the heterozygotes (AB or
        BA). If alleles are expressed in terms of nucleotides, and the reference allele at a locus
        is G and the alternative allele is C, then the code is 0 = GG, 1 = GC and 2 = CC.
        The diagonal elements of MM′ then indicate the individual relationship with itself
        (inbreeding) and the off-diagonals indicate the number of alleles shared by relatives
        (VanRaden, 2007).
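        The coding described above can be illustrated with a minimal sketch. The genotype
        strings, the three loci and the three animals are hypothetical, and every locus is
        assumed to have G as the reference allele and C as the alternative allele.

```python
import numpy as np

# Number of copies of the alternative allele (C) in each genotype.
# The two heterozygote orderings (GC and CG) receive the same code.
CODE = {"GG": 0, "GC": 1, "CG": 1, "CC": 2}

# Hypothetical genotypes: one row per animal, one column per SNP.
genotypes = [
    ["GG", "GC", "CC"],
    ["GC", "GC", "GG"],
    ["CC", "GG", "GC"],
]

# M holds the 0/1/2 codes, animals in rows and SNPs in columns.
M = np.array([[CODE[g] for g in animal] for animal in genotypes], dtype=float)

# MM' is an animals-by-animals matrix of cross-products of genotype codes.
MMt = M @ M.T
print(MMt)
```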
            Commonly, in genomic evaluations (VanRaden, 2008), the elements of M are
        scaled to set the mean values of the allele effects to zero and account for differences
        in allele frequencies of the various SNPs. Let the frequency of the second or
        alternative allele at locus j be $p_j$; the elements of M can then be scaled by
        subtracting $2p_j$. Let every element in column j of a matrix P equal $2p_j$; then the
        matrix Z, which contains the scaled elements of M, can be computed as Z = M − P.
        Note that the sum of the elements of each column of Z equals zero. Furthermore, the
        elements of Z can be normalized by dividing the column for marker j by its standard
        deviation, which is assumed to be $\sqrt{2p_j(1 - p_j)}$. This assumes that the locus
        is in Hardy-Weinberg equilibrium. However, in this chapter Z computed as M − P has
        been used.
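        As a rough illustration of this scaling, the sketch below builds P and Z from a
        small genotype matrix. The genotypes are hypothetical, and the allele frequencies
        are assumed to be estimated from the same sample, which is what makes the column
        sums of Z equal zero.

```python
import numpy as np

# Hypothetical genotypes: rows are animals, columns are SNPs, coded as the
# number of copies of the alternative allele (0, 1 or 2).
M = np.array([
    [2, 0, 1],
    [1, 1, 0],
    [0, 2, 1],
    [1, 1, 0],
], dtype=float)

# Frequency of the alternative allele at each locus, estimated here from the
# sample itself (an assumption; other frequency estimates could be used).
p = M.mean(axis=0) / 2.0

# Every element in column j of P equals 2 * p_j, so Z = M - P centres each column.
P = np.tile(2.0 * p, (M.shape[0], 1))
Z = M - P

# Column sums of Z are zero when p is estimated from the same genotypes.
col_sums = Z.sum(axis=0)

# Optional normalization by the standard deviation expected under
# Hardy-Weinberg equilibrium, sqrt(2 * p_j * (1 - p_j)).
Z_norm = Z / np.sqrt(2.0 * p * (1.0 - p))
```

        The normalized matrix is shown only for completeness; the unnormalized Z is the
        one used in the rest of the chapter.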



        11.4 Fixed Effect Model for SNP Effects

        Several methods for genomic selection were presented by Meuwissen et al. (2001), and
        one such method includes the least squares approach with chromosome segments or
        SNPs considered as fixed. There is no assumption made about the distribution of the
        SNP effects and it usually involves two steps.
        1. Analysis of each SNP using the simple model in Eqn 11.1, with $g_i$ defined as the
        vector of the fixed ith SNP effect.
        2. Select the k most significant SNPs and estimate their effects simultaneously (in the
        same data) using a multiple regression with the term for SNP effects in Eqn 11.1 equal to:
            $\sum_{i=1}^{k} M_i g_i$
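        The two steps can be sketched as follows. This is an illustration under stated
        assumptions rather than the procedure of Meuwissen et al. (2001): the simple model
        is taken to contain only an intercept and one SNP, significance is ranked by the
        absolute t statistic, and the function names, simulated genotypes and simulated
        effects are all hypothetical.

```python
import numpy as np

def single_snp_scan(y, Z):
    """Step 1: regress y on an intercept and each SNP in turn.

    Returns the slope estimate and its t statistic for every SNP.
    """
    n, m = Z.shape
    estimates = np.empty(m)
    t_stats = np.empty(m)
    for j in range(m):
        X = np.column_stack([np.ones(n), Z[:, j]])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sigma2 = resid @ resid / (n - 2)
        var_slope = sigma2 * np.linalg.inv(X.T @ X)[1, 1]
        estimates[j] = beta[1]
        t_stats[j] = beta[1] / np.sqrt(var_slope)
    return estimates, t_stats

def fit_selected(y, Z, k):
    """Step 2: keep the k SNPs with the largest |t| and fit them jointly."""
    _, t_stats = single_snp_scan(y, Z)
    selected = np.sort(np.argsort(-np.abs(t_stats))[:k])
    X = np.column_stack([np.ones(len(y)), Z[:, selected]])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return selected, beta[1:]

# Simulated example: 200 animals, 20 SNPs, two of which have a true effect.
rng = np.random.default_rng(0)
M = rng.binomial(2, 0.5, size=(200, 20)).astype(float)
Z = M - M.mean(axis=0)                      # centred genotypes (M - P)
y = 10.0 + 5.0 * Z[:, 3] - 4.0 * Z[:, 7] + rng.normal(size=200)

selected, effects = fit_selected(y, Z, k=2)
print(selected, effects)
```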
        This approach suffers from two major limitations. First, estimating effects for
        SNPs selected by single-SNP analysis results in overestimation of the SNP effects,
        because the large amount of multiple testing ensures that the selected SNPs are
        those with positive error terms. Second, determining the level of significance for the
        choice of SNPs to include in the final analysis is far from straightforward.
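        The first limitation can be made concrete with a small simulation. In the sketch
        below, which uses hypothetical dimensions and simulated data, no SNP has any true
        effect, yet the SNPs ranked as most significant still carry estimates that are
        clearly larger in absolute value than the average estimate.

```python
import numpy as np

rng = np.random.default_rng(1)
n_animals, n_snps, k = 100, 500, 10

# Simulated genotypes and a phenotype that is pure noise:
# the true effect of every SNP is zero.
M = rng.binomial(2, 0.5, size=(n_animals, n_snps)).astype(float)
Z = M - M.mean(axis=0)
y = rng.normal(size=n_animals)
yc = y - y.mean()

# Single-SNP least squares for all SNPs at once (intercept absorbed by centring).
zz = (Z ** 2).sum(axis=0)
slopes = Z.T @ yc / zz
rss = yc @ yc - slopes ** 2 * zz
t_stats = slopes / np.sqrt(rss / (n_animals - 2) / zz)

# Keep the k most significant SNPs and compare the size of their estimates
# with the size of the estimates over all SNPs.
top = np.argsort(-np.abs(t_stats))[:k]
mean_abs_all = np.abs(slopes).mean()
mean_abs_selected = np.abs(slopes[top]).mean()
print(mean_abs_all, mean_abs_selected)
```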
            In an animal breeding context, assuming the few SNPs that have significant effects
        on a trait have been identified, these SNPs can be fitted as fixed effects in a model
        that includes the polygenic effect as a random effect. Thus the genomic breeding value
        for animal i ($GEBV_i$) can be computed as the sum of the direct genomic breeding value
        ($DGV_i$), calculated from the marker (SNP) effects as $M_i \hat{g}_i$, and the polygenic
        effect ($\hat{u}_i$).
            Such a linear model could be written as:
            y = Xb + Zg + Wu + e                                            (11.3)
        where g represents the fixed marker or SNP effects, Z is the scaled matrix of geno-
        types defined in Section 11.2, which relates SNPs to phenotypes, and other terms are
        defined as in Eqn 11.2.
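        A minimal sketch of how Eqn 11.3 might be solved through mixed model equations is
        given below. Several things are assumed for illustration and are not taken from
        the text: W is an identity matrix (one record per animal), the animals are
        unrelated so that the inverse relationship matrix is an identity matrix, the
        variance ratio alpha is arbitrary, and the data are hypothetical. The DGV is
        computed here from the centred genotypes in Z.

```python
import numpy as np

# Hypothetical data: five animals with one record each and two selected SNPs.
y = np.array([4.5, 2.9, 3.9, 3.5, 5.0])
M = np.array([[2, 0], [1, 1], [0, 2], [1, 1], [2, 1]], dtype=float)

n = len(y)
X = np.ones((n, 1))            # overall mean as the only other fixed effect
Z = M - M.mean(axis=0)         # M - P, with p_j estimated from the sample
W = np.eye(n)                  # assumption: each animal has exactly one record
A_inv = np.eye(n)              # assumption: unrelated, non-inbred animals
alpha = 2.0                    # assumed ratio of residual to polygenic variance

# Fixed effects b and g are stacked; only u receives the alpha * A_inv term.
F = np.hstack([X, Z])
lhs = np.block([
    [F.T @ F, F.T @ W],
    [W.T @ F, W.T @ W + alpha * A_inv],
])
rhs = np.concatenate([F.T @ y, W.T @ y])
sol = np.linalg.solve(lhs, rhs)

n_snps = Z.shape[1]
b_hat = sol[:1]
g_hat = sol[1:1 + n_snps]
u_hat = sol[1 + n_snps:]

# GEBV as the sum of the direct genomic value and the polygenic effect.
dgv = Z @ g_hat
gebv = dgv + u_hat
print(gebv)
```

        With real data, A_inv would be built from the pedigree and alpha from estimated
        variance components; the identity matrix and the value 2.0 only keep the example
        small.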


        Computation of Genomic Breeding Values and Genomic Selection         179