Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Hickey JM, Gorjanc G. Simulated data for genomic selection and genome-wide association studies using a combination of coalescent and gene drop methods. G3 (Bethesda) 2012;2:425-7. [PMID: 22540033 DOI: 10.1534/g3.111.001297] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Received: 09/30/2011] [Accepted: 11/09/2011] [Indexed: 11/18/2022]

Cited by Other Article(s)

Anilkumar C, Muhammed Azharudheen TP, Sah RP, Sunitha NC, Devanna BN, Marndi BC, Patra BC. Gene based markers improve precision of genome-wide association studies and accuracy of genomic predictions in rice breeding. Heredity (Edinb) 2023;130:335-345. [PMID: 36792661 PMCID: PMC10163052 DOI: 10.1038/s41437-023-00599-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/18/2022] [Revised: 02/02/2023] [Accepted: 02/03/2023] [Indexed: 02/17/2023]

Klosa J, Simon N, Liebscher V, Wittenburg D. A Fitted Sparse-Group Lasso for Genome-Based Evaluations. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023;20:30-38. [PMID: 35254989 DOI: 10.1109/tcbb.2022.3156805] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]

Gowane GR, Alex R, Mukherjee A, Vohra V. Impact and utility of shallow pedigree using single-step genomic BLUP for prediction of unbiased genomic breeding values. Trop Anim Health Prod 2022;54:339. [DOI: 10.1007/s11250-022-03340-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2022] [Accepted: 10/04/2022] [Indexed: 11/28/2022]

Bonnett D, Li Y, Crossa J, Dreisigacker S, Basnet B, Pérez-Rodríguez P, Alvarado G, Jannink JL, Poland J, Sorrells M. Response to Early Generation Genomic Selection for Yield in Wheat. Frontiers in Plant Science 2022;12:718611. [PMID: 35087542 PMCID: PMC8787636 DOI: 10.3389/fpls.2021.718611] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/01/2021] [Accepted: 10/22/2021] [Indexed: 06/14/2023]

Abstract

We investigated increasing genetic gain for grain yield using early generation genomic selection (GS). A training set of 1,334 elite wheat breeding lines tested over three field seasons was used to generate Genomic Estimated Breeding Values (GEBVs) for grain yield under irrigated conditions applying markers and three different prediction methods: (1) Genomic Best Linear Unbiased Predictor (GBLUP), (2) GBLUP with the imputation of missing genotypic data by Ridge Regression BLUP (rrGBLUP_imp), and (3) Reproducing Kernel Hilbert Space (RKHS) a.k.a. Gaussian Kernel (GK). F2 GEBVs were generated for 1,924 individuals from 38 biparental cross populations between 21 parents selected from the training set. Results showed that F2 GEBVs from the different methods were not correlated. Experiment 1 consisted of selecting F2s with the highest average GEBVs and advancing them to form genomically selected bulks and make intercross populations aiming to combine favorable alleles for yield. F4:6 lines were derived from genomically selected bulks, intercrosses, and conventional breeding methods with similar numbers from each. Results of field-testing for Experiment 1 did not find any difference in yield with genomic compared to conventional selection. Experiment 2 compared the predictive ability of the different GEBV calculation methods in F2 using a set of single plant-derived F2:4 lines from randomly selected F2 plants. Grain yield results from Experiment 2 showed a significant positive correlation between observed yields of F2:4 lines and predicted yield GEBVs of F2 single plants from GK (the predictive ability of 0.248, P < 0.001) and GBLUP (0.195, P < 0.01) but no correlation with rrGBLUP_imp. Results demonstrate the potential for the application of GS in early generations of wheat breeding and the importance of using the appropriate statistical model for GEBV calculation, which may not be the same as the best model for inbreds.

Rios EF, Andrade MHML, Resende MFR, Kirst M, de Resende MDV, de Almeida Filho JE, Gezan SA, Munoz P. Genomic prediction in family bulks using different traits and cross-validations in pine. G3: Genes, Genomes, Genetics 2021;11:6321952. [PMID: 34544139 PMCID: PMC8496210 DOI: 10.1093/g3journal/jkab249] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/09/2021] [Accepted: 07/02/2021] [Indexed: 11/13/2022]

Casellas J, Martín de Hijas-Villalba M, Vázquez-Gómez M, Id-Lahoucine S. Low-coverage whole-genome sequencing in livestock species for individual traceability and parentage testing. Livest Sci 2021. [DOI: 10.1016/j.livsci.2021.104629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]

Gaynor RC, Gorjanc G, Hickey JM. AlphaSimR: an R package for breeding program simulations. G3: Genes, Genomes, Genetics 2021;11:6025179. [PMID: 33704430 PMCID: PMC8022926 DOI: 10.1093/g3journal/jkaa017] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 08/10/2020] [Accepted: 11/05/2020] [Indexed: 01/03/2023]

Obšteter J, Jenko J, Hickey JM, Gorjanc G. Efficient use of genomic information for sustainable genetic improvement in small cattle populations. J Dairy Sci 2019;102:9971-9982. [PMID: 31477287 DOI: 10.3168/jds.2019-16853] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/24/2019] [Accepted: 07/13/2019] [Indexed: 11/19/2022]

Abstract

In this study, we compared genetic gain, genetic variation, and the efficiency of converting variation into gain under different genomic selection scenarios with truncation or optimum contribution selection in a small dairy population by simulation. Breeding programs have to maximize genetic gain but also ensure sustainability by maintaining genetic variation. Numerous studies have shown that genomic selection increases genetic gain. Although genomic selection is a well-established method, small populations still struggle with choosing the most sustainable strategy to adopt this type of selection. We developed a simulator of a dairy population and simulated a model after the Slovenian Brown Swiss population with ∼10,500 cows. We compared different truncation selection scenarios by varying (1) the method of sire selection and their use on cows or bull-dams, and (2) selection intensity and the number of years a sire is in use. Furthermore, we compared different optimum contribution selection scenarios with optimization of sire selection and their usage. We compared scenarios in terms of genetic gain, selection accuracy, generation interval, genetic and genic variance, rate of coancestry, effective population size, and conversion efficiency. The results showed that early use of genomically tested sires increased genetic gain compared with progeny testing, as expected from changes in selection accuracy and generation interval. A faster turnover of sires from year to year and higher intensity increased the genetic gain even further but increased the loss of genetic variation per year. Although maximizing intensity gave the lowest conversion efficiency, faster turnover of sires gave an intermediate conversion efficiency. The largest conversion efficiency was achieved with the simultaneous use of genomically and progeny-tested sires that were used over several years. 
Compared with truncation selection, optimizing sire selection and their usage increased the conversion efficiency by achieving either comparable genetic gain for a smaller loss of genetic variation or higher genetic gain for a comparable loss of genetic variation. Our results will help breeding organizations implement sustainable genomic selection.

Genomic Prediction of Additive and Non-additive Effects Using Genetic Markers and Pedigrees. G3: Genes, Genomes, Genetics 2019;9:2739-2748. [PMID: 31263059 PMCID: PMC6686920 DOI: 10.1534/g3.119.201004] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 11/18/2022]

Gowane GR, Lee SH, Clark S, Moghaddar N, Al-Mamun HA, van der Werf JHJ. Effect of selection and selective genotyping for creation of reference on bias and accuracy of genomic prediction. J Anim Breed Genet 2019;136:390-407. [PMID: 31215699 DOI: 10.1111/jbg.12420] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 02/02/2019] [Revised: 05/22/2019] [Accepted: 05/23/2019] [Indexed: 01/17/2023]

Mota RR, Vanderick S, Colinet FG, Hammami H, Wiggans GR, Gengler N. Additional considerations to the use of single-step genomic predictions in a dominance setting. J Anim Breed Genet 2019;136:430-440. [PMID: 31161675 DOI: 10.1111/jbg.12406] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 03/05/2019] [Revised: 04/23/2019] [Accepted: 05/03/2019] [Indexed: 11/27/2022]

Abstract

Recent publications indicate that single-step models are suitable to estimate breeding values, dominance deviations and total genetic values with acceptable quality. Additive single-step methods implicitly extend known number of allele information from genotyped to non-genotyped animals. This theory is well derived in an additive setting. It was recently shown, at least empirically, that this basic strategy can be extended to dominance with reasonable prediction quality. Our study addressed two additional issues. It illustrated the theoretical basis for extension and validated genomic predictions to dominance based on single-step genomic best linear unbiased prediction theory. This development was then extended to include inbreeding into dominance relationships, which is a currently not yet solved issue. Different parametrizations of dominance relationship matrices were proposed. Five dominance single-step inverse matrices were tested and described as C¹, C², C³, C⁴ and C⁵. Genotypes were simulated for a real pig population (n = 11,943 animals). In order to avoid any confounding issues with additive effects, pseudo-records including only dominance deviations and residuals were simulated. SNP effects of heterozygous genotypes were summed up to generate true dominance deviations. We added random noise to those values and used them as phenotypes. Accuracy was defined as correlation between true and predicted dominance deviations. We conducted five replicates and estimated accuracies in three sets: between all (S₁), non-genotyped (S₂) and inbred non-genotyped (S₃) animals. Potential bias was assessed by regressing true dominance deviations on predicted values. Matrices accounting for inbreeding (C³, C⁴ and C⁵) best fit. Accuracies were on average 0.77, 0.40 and 0.46 in S₁, S₂ and S₃, respectively. In addition, C³, C⁴ and C⁵ scenarios have shown better accuracies than C¹ and C², and dominance deviations were less biased.
Better matrix compatibility (accuracy and bias) was observed by re-scaling diagonal elements to 1 minus the inbreeding coefficient (C⁵).
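
As a rough illustration of the two evaluation measures named in the abstract above (accuracy as the correlation between true and predicted dominance deviations, and bias assessed by regressing true values on predicted values), here is a minimal Python sketch. The function name, the toy data and the use of NumPy are assumptions made for illustration; none of them comes from the cited study.

```python
import numpy as np

def accuracy_and_bias(true_values, predicted_values):
    """Return (accuracy, slope) for a set of predictions.

    accuracy: Pearson correlation between true and predicted values.
    slope:    coefficient from regressing true on predicted values;
              a slope near 1 suggests little over- or under-dispersion.
    """
    t = np.asarray(true_values, dtype=float)
    p = np.asarray(predicted_values, dtype=float)
    accuracy = np.corrcoef(t, p)[0, 1]
    # np.polyfit(x, y, 1) fits y = slope * x + intercept
    slope, _intercept = np.polyfit(p, t, 1)
    return accuracy, slope

# Hypothetical toy data: noisy predictions of simulated "true" deviations.
rng = np.random.default_rng(0)
true_dev = rng.normal(size=500)
pred_dev = 0.5 * true_dev + rng.normal(scale=0.5, size=500)
acc, slope = accuracy_and_bias(true_dev, pred_dev)
print(f"accuracy = {acc:.2f}, regression slope = {slope:.2f}")
```

In a setting like the one described, the same function would presumably be applied separately to each subset of animals (all, non-genotyped, inbred non-genotyped).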

Pégard M, Rogier O, Bérard A, Faivre-Rampant P, Paslier MCL, Bastien C, Jorge V, Sánchez L. Sequence imputation from low density single nucleotide polymorphism panel in a black poplar breeding population. BMC Genomics 2019;20:302. [PMID: 30999856 PMCID: PMC6471894 DOI: 10.1186/s12864-019-5660-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 08/10/2018] [Accepted: 03/29/2019] [Indexed: 12/30/2022]

Abstract

Background

Genomic selection accuracy increases with the use of high SNP (single nucleotide polymorphism) coverage. However, such gains in coverage come at high costs, preventing their prompt operational implementation by breeders. Low density panels imputed to higher densities offer a cheaper alternative during the first stages of genomic resources development. Our study is the first to explore the imputation in a tree species: black poplar. About 1000 pure-breed Populus nigra trees from a breeding population were selected and genotyped with a 12K custom Infinium Bead-Chip. Forty-three of those individuals corresponding to nodal trees in the pedigree were fully sequenced (reference), while the remaining majority (target) was imputed from 8K to 1.4 million SNPs using FImpute. Each SNP and individual was evaluated for imputation errors by leave-one-out cross validation in the training sample of 43 sequenced trees. Some summary statistics such as Hardy-Weinberg Equilibrium exact test p-value, quality of sequencing, depth of sequencing per site and per individual, minor allele frequency, marker density ratio or SNP information redundancy were calculated. Principal component and Boruta analyses were used on all these parameters to rank the factors affecting the quality of imputation. Additionally, we characterize the impact of the relatedness between reference population and target population.

Results

During the imputation process, we used 7540 SNPs from the chip to impute 1,438,827 SNPs from sequences. At the individual level, imputation accuracy was high with a proportion of SNPs correctly imputed between 0.84 and 0.99. The variation in accuracies was mostly due to differences in relatedness between individuals. At a SNP level, the imputation quality depended on genotyped SNP density and on the original minor allele frequency. The imputation did not appear to result in an increase of linkage disequilibrium. The genotype densification not only brought a better distribution of markers all along the genome, but also we did not detect any substantial bias in annotation categories.

Conclusions

This study shows that it is possible to impute low-density marker panels to whole genome sequence with good accuracy under certain conditions that could be common to many breeding populations.

Genomic Prediction Using Individual-Level Data and Summary Statistics from Multiple Populations. Genetics 2018;210:53-69. [PMID: 30021793 DOI: 10.1534/genetics.118.301109] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 05/03/2018] [Accepted: 07/16/2018] [Indexed: 01/27/2023]

A method for allocating low-coverage sequencing resources by targeting haplotypes rather than individuals. Genet Sel Evol 2017;49:78. [PMID: 29070022 PMCID: PMC5655873 DOI: 10.1186/s12711-017-0353-y] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 04/03/2017] [Accepted: 10/18/2017] [Indexed: 11/25/2022]

Abstract

Background

This paper describes a heuristic method for allocating low-coverage sequencing resources by targeting haplotypes rather than individuals. Low-coverage sequencing assembles high-coverage sequence information for every individual by accumulating data from the genome segments that they share with many other individuals into consensus haplotypes. Deriving the consensus haplotypes accurately is critical for achieving a high phasing and imputation accuracy. In order to enable accurate phasing and imputation of sequence information for the whole population, we allocate the available sequencing resources among individuals with existing phased genomic data by targeting the sequencing coverage of their haplotypes.

Results

Our method, called AlphaSeqOpt, prioritizes haplotypes using a score function that is based on the frequency of the haplotypes in the sequencing set relative to the target coverage. AlphaSeqOpt has two steps: (1) selection of an initial set of individuals by iteratively choosing the individuals that have the maximum score conditional on the current set, and (2) refinement of the set through several rounds of exchanges of individuals. AlphaSeqOpt is very effective for distributing a fixed amount of sequencing resources evenly across haplotypes, which results in a reduction of the proportion of haplotypes that are sequenced below the target coverage. AlphaSeqOpt can provide a greater proportion of haplotypes sequenced at the target coverage by sequencing less individuals, as compared with other methods that use a score function based on haplotype frequencies in the population. A refinement of the initially selected set can provide a larger more diverse set with more unique individuals, which is beneficial in the context of low-coverage sequencing. We extend the method with an approach for filtering rare haplotypes based on their flanking haplotypes, so that only those that are likely to derive from a recombination event are targeted.

Conclusions

We present a method for allocating sequencing resources so that a greater proportion of haplotypes are sequenced at a coverage that is sufficiently high for population-based imputation with low-coverage sequencing. The haplotype score function, the refinement step, and the new approach for filtering rare haplotypes make AlphaSeqOpt more effective for that purpose than previously reported methods for reducing sequencing redundancy.

Gonen S, Battagin M, Johnston SE, Gorjanc G, Hickey JM. The potential of shifting recombination hotspots to increase genetic gain in livestock breeding. Genet Sel Evol 2017;49:55. [PMID: 28676070 PMCID: PMC5496647 DOI: 10.1186/s12711-017-0330-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 03/10/2017] [Accepted: 06/26/2017] [Indexed: 01/01/2023]

Abstract

Background

This study uses simulation to explore and quantify the potential effect of shifting recombination hotspots on genetic gain in livestock breeding programs.

Methods

We simulated three scenarios that differed in the locations of quantitative trait nucleotides (QTN) and recombination hotspots in the genome. In scenario 1, QTN were randomly distributed along the chromosomes and recombination was restricted to occur within specific genomic regions (i.e. recombination hotspots). In the other two scenarios, both QTN and recombination hotspots were located in specific regions, but differed in whether the QTN occurred outside of (scenario 2) or inside (scenario 3) recombination hotspots. We split each chromosome into 250, 500 or 1000 regions per chromosome of which 10% were recombination hotspots and/or contained QTN. The breeding program was run for 21 generations of selection, after which recombination hotspot regions were kept the same or were shifted to adjacent regions for a further 80 generations of selection. We evaluated the effect of shifting recombination hotspots on genetic gain, genetic variance and genic variance.

Results

Our results show that shifting recombination hotspots reduced the decline of genetic and genic variance by releasing standing allelic variation in the form of new allele combinations. This in turn resulted in larger increases in genetic gain. However, the benefit of shifting recombination hotspots for increased genetic gain was only observed when QTN were initially outside recombination hotspots. If QTN were initially inside recombination hotspots then shifting them decreased genetic gain.

Discussion

Shifting recombination hotspots to regions of the genome where recombination had not occurred for 21 generations of selection (i.e. recombination deserts) released more of the standing allelic variation available in each generation and thus increased genetic gain. However, whether and how much increase in genetic gain was achieved by shifting recombination hotspots depended on the distribution of QTN in the genome, the number of recombination hotspots and whether QTN were initially inside or outside recombination hotspots.

Conclusions

Our findings show future scope for targeted modification of recombination hotspots e.g. through changes in zinc-finger motifs of the PRDM9 protein to increase genetic gain in production species.

Casellas J, Cañas-Álvarez JJ, Fina M, Piedrafita J, Cecchinato A. Fine mapping by composite genome-wide association analysis. Genet Res (Camb) 2017;99:e4. [PMID: 28583209 PMCID: PMC6865146 DOI: 10.1017/s0016672317000027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2016] [Revised: 02/22/2017] [Accepted: 03/07/2017] [Indexed: 11/06/2022]

Gonen S, Ros-Freixedes R, Battagin M, Gorjanc G, Hickey JM. A method for the allocation of sequencing resources in genotyped livestock populations. Genet Sel Evol 2017;49:47. [PMID: 28521728 PMCID: PMC5437657 DOI: 10.1186/s12711-017-0322-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 08/10/2016] [Accepted: 05/12/2017] [Indexed: 11/18/2022]

Abstract

Background

This paper describes a method, called AlphaSeqOpt, for the allocation of sequencing resources in livestock populations with existing phased genomic data to maximise the ability to phase and impute sequenced haplotypes into the whole population.

Methods

We present two algorithms. The first selects focal individuals that collectively represent the maximum possible portion of the haplotype diversity in the population. The second allocates a fixed sequencing budget among the families of focal individuals to enable phasing of their haplotypes at the sequence level. We tested the performance of the two algorithms in simulated pedigrees. For each pedigree, we evaluated the proportion of population haplotypes that are carried by the focal individuals and compared our results to a variant of the widely-used key ancestors approach and to two haplotype-based approaches. We calculated the expected phasing accuracy of the haplotypes of a focal individual at the sequence level given the proportion of the fixed sequencing budget allocated to its family.

Results

AlphaSeqOpt maximises the ability to capture and phase the most frequent haplotypes in a population in three ways. First, it selects focal individuals that collectively represent a larger portion of the population haplotype diversity than existing methods. Second, it selects focal individuals from across the pedigree whose haplotypes can be easily phased using family-based phasing and imputation algorithms, thus maximises the ability to impute sequence into the rest of the population. Third, it allocates more of the fixed sequencing budget to focal individuals whose haplotypes are more frequent in the population than to focal individuals whose haplotypes are less frequent. Unlike existing methods, we additionally present an algorithm to allocate part of the sequencing budget to the families (i.e. immediate ancestors) of focal individuals to ensure that their haplotypes can be phased at the sequence level, which is essential for enabling and maximising subsequent sequence imputation.

Conclusions

We present a new method for the allocation of a fixed sequencing budget to focal individuals and their families such that the final sequenced haplotypes, when phased at the sequence level, represent the maximum possible portion of the haplotype diversity in the population that can be sequenced and phased at that budget.

Garcia-Baccino CA, Legarra A, Christensen OF, Misztal I, Pocrnic I, Vitezica ZG, Cantet RJC. Metafounders are related to Fst fixation indices and reduce bias in single-step genomic evaluations. Genet Sel Evol 2017;49:34. [PMID: 28283016 PMCID: PMC5439149 DOI: 10.1186/s12711-017-0309-2] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 10/25/2016] [Accepted: 03/03/2017] [Indexed: 01/03/2023]

Abstract

Background

Metafounders are pseudo-individuals that encapsulate genetic heterozygosity and relationships within and across base pedigree populations, i.e. ancestral populations. This work addresses the estimation and usefulness of metafounder relationships in single-step genomic best linear unbiased prediction (ssGBLUP).

Results

We show that ancestral relationship parameters are proportional to standardized covariances of base allelic frequencies across populations, such as Fst fixation indexes. These covariances of base allelic frequencies can be estimated from marker genotypes of related recent individuals and pedigree. Simple methods for their estimation include naïve computation of allele frequencies from marker genotypes or a method of moments that equates average pedigree-based and marker-based relationships. Complex methods include generalized least squares (best linear unbiased estimator (BLUE)) or maximum likelihood based on pedigree relationships. To our knowledge, methods to infer Fst coefficients from marker data have not been developed for related individuals. We derived a genomic relationship matrix, compatible with pedigree relationships, that is constructed as a cross-product of {−1,0,1} codes and that is equivalent (apart from scale factors) to an identity-by-state relationship matrix at genome-wide markers. Using a simulation with a single population under selection in which only males and youngest animals are genotyped, we observed that generalized least squares or maximum likelihood gave accurate and unbiased estimates of the ancestral relationship parameter (true value: 0.40) whereas the naïve method and the method of moments were biased (average estimates of 0.43 and 0.35).
We also observed that genomic evaluation by ssGBLUP using metafounders was less biased in terms of estimates of genetic trend (bias of 0.01 instead of 0.12), resulted in less overdispersed (0.94 instead of 0.99) and as accurate (0.74) estimates of breeding values than ssGBLUP without metafounders and provided consistent estimates of heritability.

Conclusions

Estimation of metafounder relationships can be achieved using BLUP-like methods with pedigree and markers. Inclusion of metafounder relationships reduces bias of genomic predictions with no loss in accuracy.
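
The genomic relationship matrix mentioned in the Results above, built as a cross-product of {−1,0,1} codes, might be sketched as follows. This is a minimal illustration only: the function name, the 0/1/2 input coding and the division by m/2 (m being the number of markers) are assumptions of this sketch, since the abstract says only that the matrix is defined "apart from scale factors".

```python
import numpy as np

def cross_product_grm(genotypes):
    """Relationship matrix from a cross-product of {-1, 0, 1} codes.

    genotypes: individuals x markers array of 0/1/2 allele counts
               (this input coding is an assumption of the sketch).
    The counts are shifted to {-1, 0, 1} and the cross-product is
    divided by m/2; that scale factor is also an assumption.
    """
    counts = np.asarray(genotypes, dtype=float)
    z = counts - 1.0                  # 0/1/2 -> -1/0/1
    n_markers = z.shape[1]
    return (z @ z.T) / (n_markers / 2.0)

# Hypothetical toy genotypes: 3 individuals, 4 markers.
toy = [[0, 1, 2, 2],
       [2, 1, 0, 0],
       [1, 1, 1, 2]]
print(cross_product_grm(toy))
```

With this coding, individuals carrying opposite homozygotes at a marker contribute negatively to their off-diagonal element, while heterozygotes contribute nothing.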

Antolín R, Nettelblad C, Gorjanc G, Money D, Hickey JM. A hybrid method for the imputation of genomic data in livestock populations. Genet Sel Evol 2017;49:30. [PMID: 28253858 PMCID: PMC5439152 DOI: 10.1186/s12711-017-0300-y] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 09/05/2016] [Accepted: 02/13/2017] [Indexed: 11/24/2022]

Abstract

Background

This paper describes a combined heuristic and hidden Markov model (HMM) method to accurately impute missing genotypes in livestock datasets. Genomic selection in breeding programs requires high-density genotyping of many individuals, making algorithms that economically generate this information crucial. There are two common classes of imputation methods, heuristic methods and probabilistic methods, the latter being largely based on hidden Markov models. Heuristic methods are robust, but fail to impute markers in regions where the thresholds of heuristic rules are not met, or the pedigree is inconsistent. Hidden Markov models are probabilistic methods which typically do not require specific family structures or pedigree information, making them very flexible, but they are computationally expensive and, in some cases, less accurate.

Results

We implemented a new hybrid imputation method that combined heuristic and HMM methods, AlphaImpute and MaCH, and compared the computation time and imputation accuracy of the three methods. AlphaImpute was the fastest, followed by the hybrid method and then the HMM. The computation time of the hybrid method and the HMM increased linearly with the number of iterations used in the hidden Markov model, however, the computation time of the hybrid method increased almost linearly and that of the HMM quadratically with the number of template haplotypes. The hybrid method was the most accurate imputation method for low-density panels when pedigree information was missing, especially if minor allele frequency was also low. The accuracy of the hybrid method and the HMM increased with the number of template haplotypes. The imputation accuracy of all three methods increased with the marker density of the low-density panels. Excluding the pedigree information reduced imputation accuracy for the hybrid method and AlphaImpute. Finally, the imputation accuracy of the three methods decreased with decreasing minor allele frequency.

CONCLUSIONS

The hybrid heuristic and probabilistic imputation method is able to impute all markers for all individuals in a population, as the HMM does. The hybrid method is usually more accurate and never significantly less accurate than a purely heuristic method or a purely probabilistic method, and it is faster than a standard probabilistic method.


Gonen S, Jenko J, Gorjanc G, Mileham AJ, Whitelaw CBA, Hickey JM. Potential of gene drives with genome editing to increase genetic gain in livestock breeding programs. Genet Sel Evol 2017;49:3. [PMID: 28093068 PMCID: PMC5240390 DOI: 10.1186/s12711-016-0280-3] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 07/20/2016] [Accepted: 12/14/2016] [Indexed: 01/10/2023] Open

Abstract

BACKGROUND

This paper uses simulation to explore how gene drives can increase genetic gain in livestock breeding programs. Gene drives are naturally occurring phenomena that cause a mutation on one chromosome to copy itself onto its homologous chromosome.

METHODS

We simulated nine different breeding and editing scenarios with a common overall structure. Each scenario began with 21 generations of selection, followed by 20 generations of selection based on true breeding values where the breeder used selection alone, selection in combination with genome editing, or selection with genome editing and gene drives. In the scenarios that used gene drives, we varied the probability of successfully incorporating the gene drive. For each scenario, we evaluated genetic gain, genetic variance ([Formula: see text]), rate of change in inbreeding ([Formula: see text]), number of distinct quantitative trait nucleotides (QTN) edited, rate of increase in favourable allele frequencies of edited QTN and the time to fix favourable alleles.

RESULTS

Gene drives enhanced the benefits of genome editing in seven ways: (1) they amplified the increase in genetic gain brought about by genome editing; (2) they amplified the rate of increase in the frequency of favourable alleles and reduced the time it took to fix them; (3) they enabled more rapid targeting of QTN with lesser effect for genome editing; (4) they distributed fixed editing resources across a larger number of distinct QTN across generations; (5) they focussed editing on a smaller number of QTN within a given generation; (6) they reduced the level of inbreeding when editing a subset of the sires; and (7) they increased the efficiency of converting genetic variation into genetic gain.

CONCLUSIONS

Genome editing in livestock breeding results in short-, medium- and long-term increases in genetic gain. The increase in genetic gain occurs because editing increases the frequency of favourable alleles in the population. Gene drives accelerate the increase in allele frequency caused by editing, which results in even higher genetic gain over a shorter period of time with no impact on inbreeding.


Faux AM, Gorjanc G, Gaynor RC, Battagin M, Edwards SM, Wilson DL, Hearne SJ, Gonen S, Hickey JM. AlphaSim: Software for Breeding Program Simulation. THE PLANT GENOME 2016;9. [PMID: 27902803 DOI: 10.3835/plantgenome2016.02.0013] [Citation(s) in RCA: 57] [Impact Index Per Article: 7.1] [Indexed: 05/18/2023]

Battagin M, Gorjanc G, Faux AM, Johnston SE, Hickey JM. Effect of manipulating recombination rates on response to selection in livestock breeding programs. Genet Sel Evol 2016;48:44. [PMID: 27335010 PMCID: PMC4917950 DOI: 10.1186/s12711-016-0221-1] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 11/27/2015] [Accepted: 06/07/2016] [Indexed: 11/10/2022] Open

Abstract

Background

In this work, we performed simulations to explore the potential of manipulating recombination rates to increase response to selection in livestock breeding programs.

Methods

We carried out ten replicates of several scenarios that followed a common overall structure but differed in the average rate of recombination along the genome (expressed as the length of a chromosome in Morgan), the genetic architecture of the trait under selection, and the selection intensity under truncation selection (expressed as the proportion of males selected). Recombination rates were defined by simulating nine different chromosome lengths: 0.10, 0.25, 0.50, 1, 2, 5, 10, 15 and 20 Morgan, respectively. One Morgan was considered to be the typical chromosome length for current livestock species. The genetic architecture was defined by the number of quantitative trait variants (QTV) that affected the trait under selection. Either a large (10,000) or a small (1000 or 500) number of QTV was simulated. Finally, the proportions of males selected under truncation selection as sires for the next generation were equal to 1.2, 2.4, 5, or 10 %.

Results

Increasing recombination rate increased the overall response to selection and decreased the loss of genetic variance. The difference in cumulative response between low and high recombination rates increased over generations. At low recombination rates, cumulative response to selection tended to asymptote sooner and the genetic variance was completely eroded. If the trait under selection was affected by few QTV, differences between low and high recombination rates still existed, but the selection limit was reached at all rates of recombination.

Conclusions

Higher recombination rates can enhance the efficiency of breeding programs to turn genetic variation into response to selection. However, to increase response to selection significantly, the recombination rate would need to be increased 10- or 20-fold. The biological feasibility and consequences of such large increases in recombination rates are unknown.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-016-0221-1) contains supplementary material, which is available to authorized users.


de Almeida Filho JE, Guimarães JFR, E Silva FF, de Resende MDV, Muñoz P, Kirst M, Resende MFR. The contribution of dominance to phenotype prediction in a pine breeding and simulated population. Heredity (Edinb) 2016;117:33-41. [PMID: 27118156 PMCID: PMC4901355 DOI: 10.1038/hdy.2016.23] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.8] [Received: 07/09/2015] [Revised: 12/07/2015] [Accepted: 03/04/2016] [Indexed: 02/01/2023] Open

Pérez-Enciso M, Legarra A. A combined coalescence gene-dropping tool for evaluating genomic selection in complex scenarios (ms2gs). J Anim Breed Genet 2016;133:85-91. [PMID: 26995218 DOI: 10.1111/jbg.12200] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/28/2015] [Accepted: 12/07/2015] [Indexed: 11/28/2022]

Gorjanc G, Jenko J, Hearne SJ, Hickey JM. Initiating maize pre-breeding programs using genomic selection to harness polygenic variation from landrace populations. BMC Genomics 2016;17:30. [PMID: 26732811 PMCID: PMC4702314 DOI: 10.1186/s12864-015-2345-z] [Citation(s) in RCA: 58] [Impact Index Per Article: 7.3] [Received: 10/14/2015] [Accepted: 12/21/2015] [Indexed: 11/23/2022] Open

Abstract

Background

The limited genetic diversity of elite maize germplasms raises concerns about the potential to breed for new challenges. Initiatives have been formed over the years to identify and utilize useful diversity from landraces to overcome this issue. The aim of this study was to evaluate the proposed designs to initiate a pre-breeding program within the Seeds of Discovery (SeeD) initiative with emphasis on harnessing polygenic variation from landraces using genomic selection. We evaluated these designs with stochastic simulation to provide decision support about the effect of several design factors on the quality of resulting (pre-bridging) germplasm. The evaluated design factors were: i) the approach to initiate a pre-breeding program from the selected landraces, doubled haploids of the selected landraces, or testcrosses of the elite hybrid and selected landraces, ii) the genetic parameters of landraces and phenotypes, and iii) logistical factors related to the size and management of a pre-breeding program.

Results

The results suggest a pre-breeding program should be initiated directly from landraces. Initiating from testcrosses leads to a rapid reconstruction of the elite donor genome during further improvement of the pre-bridging germplasm. The analysis of accuracy of genomic predictions across the various design factors indicates the power of genomic selection for pre-breeding programs with large genetic diversity and constrained resources for data recording. The joint effect of design factors was summarized with decision trees that give easy-to-follow guidelines to optimize pre-breeding efforts of SeeD and similar initiatives.

Conclusions

Results of this study provide guidelines for SeeD and similar initiatives on how to initiate pre-breeding programs that aim to harness polygenic variation from landraces.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-2345-z) contains supplementary material, which is available to authorized users.


Casellas J, Piedrafita J. Accuracy and expected genetic gain under genetic or genomic evaluation in sheep flocks with different amounts of pedigree, genomic and phenotypic data. Livest Sci 2015. [DOI: 10.1016/j.livsci.2015.10.014] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 02/01/2023]

Gorjanc G, Bijma P, Hickey JM. Reliability of pedigree-based and genomic evaluations in selected populations. Genet Sel Evol 2015;47:65. [PMID: 26271246 PMCID: PMC4536753 DOI: 10.1186/s12711-015-0145-1] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 08/31/2014] [Accepted: 07/29/2015] [Indexed: 11/14/2022] Open

Abstract

Background

Reliability is an important parameter in breeding. It measures the precision of estimated breeding values (EBV) and, thus, potential response to selection on those EBV. The precision of EBV is commonly measured by relating the prediction error variance (PEV) of EBV to the base population additive genetic variance (base PEV reliability), while the potential for response to selection is commonly measured by the squared correlation between the EBV and breeding values (BV) on selection candidates (reliability of selection). While these two measures are equivalent for unselected populations, they are not equivalent for selected populations. The aim of this study was to quantify the effect of selection on these two measures of reliability and to show how this affects comparison of breeding programs using pedigree-based or genomic evaluations.

Methods

Two scenarios with random and best linear unbiased prediction (BLUP) selection were simulated, where the EBV of selection candidates were estimated using only pedigree, pedigree and phenotype, genome-wide marker genotypes and phenotype, or only genome-wide marker genotypes. The base PEV reliabilities of these EBV were compared to the corresponding reliabilities of selection. Realized genetic selection intensity was evaluated to quantify the potential of selection on the different types of EBV and, thus, to validate differences in reliabilities. Finally, the contribution of different underlying processes to changes in additive genetic variance and reliabilities was quantified.

Results

The simulations showed that, for selected populations, the base PEV reliability substantially overestimates the reliability of selection of EBV that are mainly based on old information from the parental generation, as is the case with pedigree-based prediction. Selection on such EBV gave very low realized genetic selection intensities, confirming the overestimation and importance of genotyping both male and female selection candidates. The two measures of reliability matched when the reductions in additive genetic variance due to the Bulmer effect, selection, and inbreeding were taken into account.

Conclusions

For populations under selection, EBV based on genome-wide information are more valuable than suggested by the comparison of the base PEV reliabilities between the different types of EBV. This implies that genome-wide marker information is undervalued for selected populations and that genotyping un-phenotyped female selection candidates should be reconsidered.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-015-0145-1) contains supplementary material, which is available to authorized users.
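
The two reliability measures contrasted in this abstract can be illustrated with a minimal sketch. It assumes the standard definitions: base PEV reliability as one minus the ratio of PEV to the base-population additive genetic variance, and reliability of selection as the squared correlation between EBV and true BV. The function names and input values are illustrative assumptions, not taken from the article.

```python
from math import sqrt


def base_pev_reliability(pev, base_additive_variance):
    """Base PEV reliability: 1 - PEV / base-population additive variance."""
    return 1.0 - pev / base_additive_variance


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def reliability_of_selection(ebv, bv):
    """Squared correlation between EBV and true BV of selection candidates."""
    return pearson(ebv, bv) ** 2
```

In a simulation the true BV are known, so both quantities can be computed for the same candidates and compared directly, which is the comparison the abstract describes for selected populations.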


Jenko J, Gorjanc G, Cleveland MA, Varshney RK, Whitelaw CBA, Woolliams JA, Hickey JM. Potential of promotion of alleles by genome editing to improve quantitative traits in livestock breeding programs. Genet Sel Evol 2015;47:55. [PMID: 26133579 PMCID: PMC4487592 DOI: 10.1186/s12711-015-0135-3] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.9] [Received: 11/25/2014] [Accepted: 06/15/2015] [Indexed: 12/29/2022] Open

Abstract

Background

Genome editing (GE) is a method that enables specific nucleotides in the genome of an individual to be changed. To date, use of GE in livestock has focussed on simple traits that are controlled by a few quantitative trait nucleotides (QTN) with large effects. The aim of this study was to evaluate the potential of GE to improve quantitative traits that are controlled by many QTN, referred to here as promotion of alleles by genome editing (PAGE).

Methods

Multiple scenarios were simulated to test alternative PAGE strategies for a quantitative trait. They differed in (i) the number of edits per sire (0 to 100), (ii) the number of edits per generation (0 to 500), and (iii) the extent of use of PAGE (i.e. editing all sires or only a proportion of them). The baseline scenario involved selecting individuals on true breeding values for several generations (i.e. genomic selection only (GS only), meaning genomic selection with perfect accuracy). Alternative scenarios complemented this baseline scenario with PAGE (GS + PAGE). The effect of different PAGE strategies was quantified by comparing response to selection, changes in allele frequencies, the number of distinct QTN edited, the sum of absolute effects of the edited QTN per generation, and inbreeding.

Results

Response to selection after 20 generations was between 1.08 and 4.12 times higher with GS + PAGE than with GS only. Increases in response to selection were larger with more edits per sire and more sires edited. When the total resources for PAGE were limited, editing a few sires for many QTN resulted in greater response to selection and inbreeding compared to editing many sires for a few QTN. Between the scenarios GS only and GS + PAGE, there was little difference in the average change in QTN allele frequencies, but there was a major difference for the QTN with the largest effects. The sum of the effects of the edited QTN decreased across generations.

Conclusions

This study showed that PAGE has great potential for application in livestock breeding programs, but inbreeding needs to be managed.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-015-0135-3) contains supplementary material, which is available to authorized users.


Gorjanc G, Cleveland MA, Houston RD, Hickey JM. Potential of genotyping-by-sequencing for genomic selection in livestock populations. Genet Sel Evol 2015;47:12. [PMID: 25887531 PMCID: PMC4344748 DOI: 10.1186/s12711-015-0102-z] [Citation(s) in RCA: 67] [Impact Index Per Article: 7.4] [Received: 05/11/2014] [Accepted: 01/29/2015] [Indexed: 12/12/2022] Open

Abstract

Background

Next-generation sequencing techniques, such as genotyping-by-sequencing (GBS), provide alternatives to single nucleotide polymorphism (SNP) arrays. The aim of this work was to evaluate the potential of GBS compared to SNP array genotyping for genomic selection in livestock populations.

Methods

The value of GBS was quantified by simulation analyses in which three parameters were varied: (i) genome-wide sequence read depth (x) per individual from 0.01x to 20x or using SNP array genotyping; (ii) number of genotyped markers from 3000 to 300 000; and (iii) size of training and prediction sets from 500 to 50 000 individuals. The latter was achieved by distributing the total available x of 1000x, 5000x, or 10 000x per genotyped locus among the varying number of individuals. With SNP arrays, genotypes were called from sequence data directly. With GBS, genotypes were called from sequence reads that varied between loci and individuals according to a Poisson distribution with mean equal to x. Simulated data were analyzed with ridge regression and the accuracy and bias of genomic predictions and response to selection were quantified under the different scenarios.

Results

Accuracies of genomic predictions using GBS data or SNP array data were comparable when large numbers of markers were used and x per individual was ~1x or higher. The bias of genomic predictions was very high at a very low x. When the total available x was distributed among the training individuals, the accuracy of prediction was maximized when a large number of individuals was used that had GBS data with low x for a large number of markers. Similarly, response to selection was maximized under the same conditions due to increasing both accuracy and selection intensity.

Conclusions

GBS offers great potential for developing genomic selection in livestock populations because it makes it possible to cover large fractions of the genome and to vary the sequence read depth per individual. Thus, the accuracy of predictions is improved by increasing the size of training populations and the intensity of selection is increased by genotyping a larger number of selection candidates.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-015-0102-z) contains supplementary material, which is available to authorized users.
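
The read-depth model described in the Methods of this abstract (reads per locus and individual drawn from a Poisson distribution with mean x) might be sketched as follows. The allele sampling step is an assumption added for illustration: each read is taken to show either allele of the individual with equal probability and without sequencing error. The function names are hypothetical and do not come from the article.

```python
import math
import random


def poisson(mean, rng):
    """Draw a Poisson variate with Knuth's method (adequate for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def sample_reads(genotype, mean_depth, rng):
    """Return (ref_reads, alt_reads) at one locus for one individual.

    genotype is the number of alternative alleles (0, 1 or 2).
    Assumption: every read shows the alternative allele with
    probability genotype / 2, with no sequencing error.
    """
    depth = poisson(mean_depth, rng)
    alt_reads = sum(1 for _ in range(depth) if rng.random() < genotype / 2)
    return depth - alt_reads, alt_reads
```

At low x many loci receive zero or one read, so a heterozygote is often observed as a homozygote or not observed at all. This is one plausible reason for the high bias at very low x reported in the Results.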


Zhang Z, Erbe M, He J, Ober U, Gao N, Zhang H, Simianer H, Li J. Accuracy of whole-genome prediction using a genetic architecture-enhanced variance-covariance matrix. G3 (BETHESDA, MD.) 2015;5:615-27. [PMID: 25670771 PMCID: PMC4390577 DOI: 10.1534/g3.114.016261] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.7] [Received: 01/08/2015] [Accepted: 02/05/2015] [Indexed: 01/22/2023]

Abstract

Obtaining accurate predictions of unobserved genetic or phenotypic values for complex traits in animal, plant, and human populations is possible through whole-genome prediction (WGP), a combined analysis of genotypic and phenotypic data. Because the underlying genetic architecture of the trait of interest is an important factor affecting model selection, we propose a new strategy, termed BLUP|GA (BLUP-given genetic architecture), which can use genetic architecture information within the dataset at hand rather than from public sources. This is achieved by using a trait-specific covariance matrix (T), which is a weighted sum of a genetic architecture part (S matrix) and the realized relationship matrix (G). The algorithm of BLUP|GA is provided and illustrated with real and simulated datasets. Predictive ability of BLUP|GA was validated with three model traits in a dairy cattle dataset and 11 traits in three public datasets with a variety of genetic architectures and compared with GBLUP and other approaches. Results show that BLUP|GA outperformed GBLUP in 20 of 21 scenarios in the dairy cattle dataset and outperformed GBLUP, BayesA, and BayesB in 12 of 13 traits in the analyzed public datasets. Further analyses showed that the difference in accuracy between BLUP|GA and GBLUP correlates significantly with the distance between the T and G matrices. The new strategy applied in BLUP|GA is a favorable and flexible alternative to the standard GBLUP model, making it possible to account for the genetic architecture of the quantitative trait under consideration when necessary. This feature is mainly due to the increased similarity between the trait-specific relationship matrix (T matrix) and the genetic relationship matrix at unobserved causal loci. Applying BLUP|GA in WGP would ease the burden of model selection.
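
The construction of the trait-specific matrix as a weighted sum of S and G can be illustrated with a small sketch. The abstract only states that T is a weighted sum; the convex form used here, with a single weight between 0 and 1, is an assumption, and the function name and example matrices are hypothetical.

```python
def trait_specific_matrix(S, G, weight):
    """Combine S and G element by element as weight * S + (1 - weight) * G.

    S and G are square matrices given as lists of rows with the same
    dimensions. A single convex weight is an assumption for illustration.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie between 0 and 1")
    return [
        [weight * s + (1.0 - weight) * g for s, g in zip(row_s, row_g)]
        for row_s, row_g in zip(S, G)
    ]


# Toy example with two individuals.
S = [[1.0, 0.0], [0.0, 1.0]]
G = [[1.0, 0.5], [0.5, 1.0]]
T = trait_specific_matrix(S, G, 0.5)
```

With a weight of 0 the result is G itself, so under this assumed form the standard GBLUP relationship matrix is recovered as a special case.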


Onogi A, Ideta O, Inoshita Y, Ebana K, Yoshioka T, Yamasaki M, Iwata H. Exploring the areas of applicability of whole-genome prediction methods for Asian rice (Oryza sativa L.). TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2015;128:41-53. [PMID: 25341369 DOI: 10.1007/s00122-014-2411-y] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 07/09/2014] [Accepted: 10/03/2014] [Indexed: 05/25/2023]

Evaluation of measures of correctness of genotype imputation in the context of genomic prediction: a review of livestock applications. Animal 2014;8:1743-53. [PMID: 25045914 DOI: 10.1017/s1751731114001803] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.7] [Indexed: 11/06/2022] Open

Abstract

In livestock, many studies have reported the results of imputation to 50k single nucleotide polymorphism (SNP) genotypes for animals that are genotyped with low-density SNP panels. The objective of this paper is to review different measures of correctness of imputation, and to evaluate their utility depending on the purpose of the imputed genotypes. Across studies, two measures of imputation correctness are commonly used: imputation accuracy, computed as the correlation between true and imputed genotypes, and imputation error rate, which counts the number of incorrectly imputed alleles. Based on the nature of both measures and results reported in the literature, imputation accuracy appears to be a more useful measure of the correctness of imputation than imputation error rate, because imputation accuracy does not depend on minor allele frequency (MAF), whereas imputation error rate does. Imputation accuracy can therefore be compared more readily across loci with different MAF. Imputation accuracy depends on the ability to identify the correct haplotype of a SNP, but many other factors have been identified as well, including the number of genotyped immediate ancestors, the number of animals with genotypes at the high-density panel, the SNP density on the low- and high-density panels, the MAF of the imputed SNP and whether the imputed SNP are located at the end of a chromosome. Some of these factors directly contribute to the linkage disequilibrium between imputed SNP and SNP on the low-density panel. When imputation accuracy is assessed as a predictor for the accuracy of subsequent genomic prediction, we recommend that: (1) individual-specific imputation accuracies should be used that are computed after centring and scaling both true and imputed genotypes; and (2) imputation of gene dosage is preferred over imputation of the most likely genotype, as this increases accuracy and reduces bias of the imputed genotypes and the subsequent genomic predictions.
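
The two measures compared in this review, and the recommended individual-specific accuracy, can be sketched as below. Several details are assumptions made for illustration: genotypes are coded 0/1/2, the error rate treats imputed values as integer best-guess genotypes, and each locus is centred and scaled with the mean and standard deviation of the true genotypes. The function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def allele_error_rate(true_genotypes, imputed_genotypes):
    """Fraction of incorrectly imputed alleles (two alleles per genotype)."""
    wrong = sum(abs(t - i) for t, i in zip(true_genotypes, imputed_genotypes))
    return wrong / (2 * len(true_genotypes))


def individual_accuracy(true_rows, imputed_rows, individual):
    """Correlation across loci for one individual after per-locus scaling.

    Rows are individuals and columns are loci. Each locus is centred and
    scaled with the mean and SD of the true genotypes (an assumption).
    """
    true_std, imputed_std = [], []
    for locus in range(len(true_rows[0])):
        column = [row[locus] for row in true_rows]
        mean = sum(column) / len(column)
        sd = sqrt(sum((g - mean) ** 2 for g in column) / len(column))
        if sd == 0:
            continue  # skip monomorphic loci, which cannot be scaled
        true_std.append((true_rows[individual][locus] - mean) / sd)
        imputed_std.append((imputed_rows[individual][locus] - mean) / sd)
    return pearson(true_std, imputed_std)
```

Imputed gene dosages (non-integer values between 0 and 2) can be passed to `pearson` and `individual_accuracy` unchanged, which matches the recommendation to impute dosage instead of the most likely genotype.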


Hickey JM, Gorjanc G, Hearne S, Huang BE. AlphaMPSim: flexible simulation of multi-parent crosses. Bioinformatics 2014;30:2686-8. [DOI: 10.1093/bioinformatics/btu206] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022] Open

DAIRRy-BLUP: a high-performance computing approach to genomic prediction. Genetics 2014;197:813-22. [PMID: 24736932 DOI: 10.1534/genetics.114.163683] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 11/18/2022] Open

Bouwman AC, Hickey JM, Calus MPL, Veerkamp RF. Imputation of non-genotyped individuals based on genotyped relatives: assessing the imputation accuracy of a real case scenario in dairy cattle. Genet Sel Evol 2014;46:6. [PMID: 24490796 PMCID: PMC3929150 DOI: 10.1186/1297-9686-46-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 07/15/2013] [Accepted: 01/07/2014] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Imputation of genotypes for ungenotyped individuals could enable the use of valuable phenotypes created before the genomic era in analyses that require genotypes. The objective of this study was to investigate the accuracy of imputation of non-genotyped individuals using genotype information from relatives.

METHODS

Genotypes were simulated for all individuals in the pedigree of a real (historical) dataset of phenotyped dairy cows in which part of the pedigree was genotyped. The software AlphaImpute was used for imputation in its standard settings but also without phasing, i.e. using basic inheritance rules and segregation analysis only. Three scenarios were evaluated: (1) the real data scenario, (2) addition of genotypes of sires and maternal grandsires of the ungenotyped individuals, and (3) addition of one, two, or four genotyped offspring of the ungenotyped individuals to the reference population.

RESULTS

The imputation accuracy using AlphaImpute in its standard settings was lower than without phasing. Including genotypes of sires and maternal grandsires in the reference population improved imputation accuracy, i.e. the correlation of the true genotypes with the imputed genotype dosages, corrected for mean gene content, across all animals increased from 0.47 (real situation) to 0.60. Including one, two and four genotyped offspring increased the accuracy of imputation across all animals from 0.57 (no offspring) to 0.73, 0.82, and 0.92, respectively.

CONCLUSIONS

At present, the use of basic inheritance rules and segregation analysis appears to be the best imputation method for ungenotyped individuals. Comparison of our empirical animal-specific imputation accuracies to predictions based on selection index theory suggested that not correcting for mean gene content considerably overestimates the true accuracy. Imputation of ungenotyped individuals can help to include valuable phenotypes for genome-wide association studies or for genomic prediction, especially when the ungenotyped individuals have genotyped offspring.


Lourenco DAL, Misztal I, Wang H, Aguilar I, Tsuruta S, Bertrand JK. Prediction accuracy for a simulated maternally affected trait of beef cattle using different genomic evaluation models. J Anim Sci 2013;91:4090-8. [PMID: 23893997 DOI: 10.2527/jas.2012-5826] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 11/13/2022] Open

Abstract

Different methods for genomic evaluation were compared for accuracy and feasibility of evaluation using phenotypic, pedigree, and genomic information for a trait influenced by a maternal effect. A simulated population was constructed that included 15,800 animals in 5 generations. Genotypes from 45,000 SNP were available for 1,500 animals in the last 3 generations. Genotyped animals in the last generation had no phenotypes. Weaning weight data were simulated using an animal model with direct and maternal effects. Additive direct and maternal effects were considered either noncorrelated (formula in text) or negatively correlated (formula in text). Methods of analysis were traditional BLUP, BayesC using phenotypes and ignoring maternal effects (BayesCPR), BayesC using deregressed EBV (BayesCDEBV), and single-step genomic BLUP (ssGBLUP). Whereas BayesCPR can be used when phenotypes of only genotyped animals are available, BayesCDEBV can be used when BLUP EBV of genotyped animals are available, and ssGBLUP is suitable when genotypes, phenotypes, and pedigrees are jointly available. For all genotyped and young genotyped animals, mean accuracies from BayesCPR and BayesCDEBV were lower than accuracies from BLUP for direct and maternal effects. The differences in mean accuracy were greater when genetic correlation was negative. Gains in accuracy were observed when ssGBLUP was compared with BLUP; for the direct (maternal) effect the average gain was 0.01 (0.02) for all genotyped animals and 0.03 (0.02) for young genotyped animals without phenotypes. Similar gains were observed for 0 and negative genetic correlation. Accuracy with BayesCPR was affected by ignoring phenotypes of nongenotyped animals and maternal effect and by not accounting for parent average. Accuracy with BayesCDEBV was affected by approximations needed for deregression, not accounting for parent average, and sequential rather than simultaneous fitting of genomic and nongenomic information. 
Whereas BayesCDEBV presented a considerable bias, especially for maternal effect, ssGBLUP was unbiased for both effects. The computing time was 1 s for BLUP, 44 s for ssGBLUP, and over 2,000 s for BayesC. Greatest computational efficiency and accuracy of genomic prediction for a maternally affected trait was obtained when information from all nongenotyped but related individuals was included and phenotypes, pedigree, and genotypes were available and considered jointly. Increasing the gain in accuracy of genomic predictions obtained by ssGBLUP over BLUP may require an increase in the number of genotyped animals.


Casellas J, Esquivelzeta C, Legarra A. Short communication: Accounting for new mutations in genomic prediction models. J Dairy Sci 2013;96:5398-402. [PMID: 23746579 DOI: 10.3168/jds.2012-6468] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/11/2012] [Accepted: 04/22/2013] [Indexed: 11/19/2022]

A novel generalized ridge regression method for quantitative genetics. Genetics 2013;193:1255-68. [PMID: 23335338 DOI: 10.1534/genetics.112.146720] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Indexed: 01/15/2023] Open

Hickey JM, Kinghorn BP, Tier B, Clark SA, van der Werf JHJ, Gorjanc G. Genomic evaluations using similarity between haplotypes. J Anim Breed Genet 2012;130:259-69. [PMID: 23855628 DOI: 10.1111/jbg.12020] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 03/27/2012] [Accepted: 11/07/2012] [Indexed: 10/27/2022]

Genomic prediction in animals and plants: simulation of data, validation, reporting, and benchmarking. Genetics 2012;193:347-65. [PMID: 23222650 DOI: 10.1534/genetics.112.147983] [Citation(s) in RCA: 239] [Impact Index Per Article: 19.9] [Indexed: 12/26/2022] Open

Abstract

The genomic prediction of phenotypes and breeding values in animals and plants has developed rapidly into its own research field. Results of genomic prediction studies are often difficult to compare because data simulation varies, real or simulated data are not fully described, and not all relevant results are reported. In addition, some new methods have been compared only in limited genetic architectures, leading to potentially misleading conclusions. In this article we review simulation procedures, discuss validation and reporting of results, and apply benchmark procedures for a variety of genomic prediction methods in simulated and real example data. Plant and animal breeding programs are being transformed by the use of genomic data, which are becoming widely available and cost-effective to predict genetic merit. A large number of genomic prediction studies have been published using both simulated and real data. The relative novelty of this area of research has made the development of scientific conventions difficult with regard to description of the real data, simulation of genomes, validation and reporting of results, and forward in time methods. In this review article we discuss the generation of simulated genotype and phenotype data, using approaches such as the coalescent and forward in time simulation. We outline ways to validate simulated data and genomic prediction results, including cross-validation. The accuracy and bias of genomic prediction are highlighted as performance indicators that should be reported. We suggest that a measure of relatedness between the reference and validation individuals be reported, as its impact on the accuracy of genomic prediction is substantial. A large number of methods were compared in example simulated and real (pine and wheat) data sets, all of which are publicly available. 
In our limited simulations, most methods performed similarly in traits with a large number of quantitative trait loci (QTL), whereas in traits with fewer QTL variable selection did have some advantages. In the real data sets examined here all methods had very similar accuracies. We conclude that no single method can serve as a benchmark for genomic prediction. We recommend comparing accuracy and bias of new methods to results from genomic best linear prediction and a variable selection approach (e.g., BayesB), because, together, these methods are appropriate for a range of genetic architectures. An accompanying article in this issue provides a comprehensive review of genomic prediction methods and discusses a selection of topics related to application of genomic prediction in plants and animals.
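The two performance indicators named in this abstract, accuracy and bias, can be illustrated with a short sketch. The definitions used here are assumptions, since the abstract does not spell them out: accuracy is taken as the Pearson correlation between true (simulated) and predicted breeding values, and bias is assessed through the slope of the regression of true on predicted values, where a slope of 1 indicates no inflation or deflation. The function name and the example numbers are hypothetical.

```python
from math import sqrt


def accuracy_and_bias(true_bv, predicted_bv):
    """Return (accuracy, slope) for a set of validation individuals.

    Assumed definitions (not taken from the abstract):
      accuracy = Pearson correlation between true and predicted values
      slope    = regression coefficient of true on predicted values
                 (1.0 means the predictions are neither inflated nor deflated)
    """
    n = len(true_bv)
    if n != len(predicted_bv) or n < 2:
        raise ValueError("need two equally long lists with at least 2 values")

    mean_t = sum(true_bv) / n
    mean_p = sum(predicted_bv) / n

    # Sums of squares and cross-products around the means.
    s_tt = sum((t - mean_t) ** 2 for t in true_bv)
    s_pp = sum((p - mean_p) ** 2 for p in predicted_bv)
    s_tp = sum((t - mean_t) * (p - mean_p)
               for t, p in zip(true_bv, predicted_bv))

    if s_tt == 0 or s_pp == 0:
        raise ValueError("values must vary for the statistics to be defined")

    accuracy = s_tp / sqrt(s_tt * s_pp)
    slope = s_tp / s_pp
    return accuracy, slope


# Hypothetical validation set: predictions rank the individuals perfectly
# but are shrunk to half the scale of the true values.
true_values = [1.0, 2.0, 3.0, 4.0]
predictions = [0.5, 1.0, 1.5, 2.0]
acc, slope = accuracy_and_bias(true_values, predictions)
print(f"accuracy = {acc:.2f}, slope = {slope:.2f}")
```

In this toy case the ranking is perfect, so the accuracy is 1, while the slope of 2 shows that the predictions understate the spread of the true values. Reporting both numbers separates ranking ability from scaling.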

Setting the standard: a special focus on genomic selection in GENETICS and G3. Genetics 2012;190:1151-2. [PMID: 22491887 PMCID: PMC3316635 DOI: 10.1534/genetics.112.139907] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 01/07/2023] Open

Setting the Standard: A Special Focus on Genomic Selection in GENETICS and G3. G3-GENES GENOMES GENETICS 2012;2:423. [PMID: 22540032 PMCID: PMC3337469 DOI: 10.1534/g3.112.002295] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]

A Bayesian antedependence model for whole genome prediction. Genetics 2011;190:1491-501. [PMID: 22135352 DOI: 10.1534/genetics.111.131540] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.5] [Indexed: 12/25/2022] Open