Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Weigel KA, de los Campos G, González-Recio O, Naya H, Wu XL, Long N, Rosa GJM, Gianola D. Predictive ability of direct genomic values for lifetime net merit of Holstein sires using selected subsets of single nucleotide polymorphism markers. J Dairy Sci 2009;92:5248-57. [PMID: 19762843 DOI: 10.3168/jds.2009-2092] [Citation(s) in RCA: 98] [Impact Index Per Article: 6.5] [Indexed: 11/19/2022]

Cited by Other Article(s)

Su G, Brøndum RF, Ma P, Guldbrandtsen B, Aamand GP, Lund MS. Comparison of genomic predictions using medium-density (∼54,000) and high-density (∼777,000) single nucleotide polymorphism marker panels in Nordic Holstein and Red Dairy Cattle populations. J Dairy Sci 2012;95:4657-65. [PMID: 22818480 DOI: 10.3168/jds.2012-5379] [Citation(s) in RCA: 95] [Impact Index Per Article: 7.9] [Received: 01/25/2012] [Accepted: 04/17/2012] [Indexed: 12/22/2022]

Abraham G, Kowalczyk A, Zobel J, Inouye M. Performance and robustness of penalized and unpenalized methods for genetic prediction of complex human disease. Genet Epidemiol 2012. [PMID: 23203348 DOI: 10.1002/gepi.21698] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.8] [Indexed: 11/06/2022]

Colombani C, Legarra A, Fritz S, Guillaume F, Croiseau P, Ducrocq V, Robert-Granié C. Application of Bayesian least absolute shrinkage and selection operator (LASSO) and BayesCπ methods for genomic selection in French Holstein and Montbéliarde breeds. J Dairy Sci 2012;96:575-91. [PMID: 23127905 DOI: 10.3168/jds.2011-5225] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.7] [Received: 12/05/2011] [Accepted: 09/14/2012] [Indexed: 11/19/2022]

Abstract

Recently, the amount of available single nucleotide polymorphism (SNP) marker data has considerably increased in dairy cattle breeds, both for research purposes and for application in commercial breeding and selection programs. Bayesian methods are currently used in the genomic evaluation of dairy cattle to handle very large sets of explanatory variables with a limited number of observations. In this study, we applied 2 Bayesian methods, BayesCπ and Bayesian least absolute shrinkage and selection operator (LASSO), to 2 genotyped and phenotyped reference populations consisting of 3,940 Holstein bulls and 1,172 Montbéliarde bulls with approximately 40,000 polymorphic SNP. We compared the accuracy of the Bayesian methods for the prediction of 3 traits (milk yield, fat content, and conception rate) with pedigree-based BLUP, genomic BLUP, partial least squares (PLS) regression, and sparse PLS regression, a variable selection PLS variant. The results showed that the correlations between observed and predicted phenotypes were similar in BayesCπ (with or without pedigree information) and Bayesian LASSO for most of the traits, regardless of breed. In the Holstein breed, Bayesian methods led to higher correlations than other approaches for fat content and were similar to genomic BLUP for milk yield and to genomic BLUP and PLS regression for the conception rate. In the Montbéliarde breed, no method dominated the others, except BayesCπ for fat content. The better performance of the Bayesian methods for fat content in Holstein and Montbéliarde breeds is probably due to the effect of the DGAT1 gene. The SNP identified by the BayesCπ, Bayesian LASSO, and sparse PLS regression methods, based on their effect on the different traits of interest, were located at almost the same position on the genome. As the Bayesian methods resulted in regressions of direct genomic values on daughter trait deviations closer to 1 than for the other methods tested in this study, Bayesian methods are suggested for genomic evaluations of French dairy cattle.

Khatkar MS, Moser G, Hayes BJ, Raadsma HW. Strategies and utility of imputed SNP genotypes for genomic analysis in dairy cattle. BMC Genomics 2012;13:538. [PMID: 23043356 PMCID: PMC3531262 DOI: 10.1186/1471-2164-13-538] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.2] [Received: 04/05/2012] [Accepted: 10/06/2012] [Indexed: 12/21/2022]

Abstract

Background

We investigated strategies and factors affecting accuracy of imputing genotypes from lower-density SNP panels (Illumina 3K, 7K, Affymetrix 15K and 25K, and evenly spaced subsets) up to one medium (Illumina 50K) and one high-density (Illumina 800K) SNP panel. We also evaluated the utility of imputed genotypes on the accuracy of genomic selection using Australian Holstein-Friesian cattle data from 2727 and 845 animals genotyped with 50K and 800K SNP chip, respectively. Animals were divided into reference and test sets (genotyped with higher and lower density SNP panels, respectively) for evaluating the accuracies of imputation. For the accuracy of genomic selection, a comparison of direct genetic values (DGV) was made by dividing the data into training and validation sets under a range of imputation scenarios.

Results

Of the three methods compared for imputation, IMPUTE2 outperformed Beagle and fastPhase for almost all scenarios. Higher SNP densities in the test animals, larger reference sets and higher relatedness between test and reference animals increased the accuracy of imputation. 50K specific genotypes were imputed with moderate allelic error rates from 15K (2.85%) and 25K (2.75%) genotypes. Using IMPUTE2, SNP genotypes up to 800K were imputed with low allelic error rate (0.79% genome-wide) from 50K genotypes, and with moderate error rate from 3K (4.78%) and 7K (2.00%) genotypes. The error rate of imputing up to 800K from 3K or 7K was further reduced when an additional middle tier of 50K genotypes was incorporated in a 3-tiered framework. Accuracies of DGV for five production traits using imputed 50K genotypes were close to those obtained with the actual 50K genotypes and higher compared to using 3K or 7K genotypes. The loss in accuracy of DGV was small when most of the training animals also had imputed (50K) genotypes. Additional gains in DGV accuracies were small when SNP densities increased from 50K to imputed 800K.

Conclusion

Population-based genotype imputation can be used to predict and combine genotypes from different low, medium and high-density SNP chips with a high level of accuracy. Imputing genotypes from low-density SNP panels to at least 50K SNP density increases the accuracy of genomic selection.

Segelke D, Chen J, Liu Z, Reinhardt F, Thaller G, Reents R. Reliability of genomic prediction for German Holsteins using imputed genotypes from low-density chips. J Dairy Sci 2012;95:5403-5411. [DOI: 10.3168/jds.2012-5466] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 02/24/2012] [Accepted: 05/23/2012] [Indexed: 02/05/2023]

de los Campos G, Klimentidis YC, Vazquez AI, Allison DB. Prediction of expected years of life using whole-genome markers. PLoS One 2012;7:e40964. [PMID: 22848416 PMCID: PMC3405107 DOI: 10.1371/journal.pone.0040964] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 09/08/2011] [Accepted: 06/15/2012] [Indexed: 01/27/2023]

Abstract

Genetic factors are believed to account for 25% of the interindividual differences in Years of Life (YL) among humans. However, the genetic loci that have thus far been found to be associated with YL explain a very small proportion of the expected genetic variation in this trait, perhaps reflecting the complexity of the trait and the limitations of traditional association studies when applied to traits affected by a large number of small-effect genes. Using data from the Framingham Heart Study and statistical methods borrowed largely from the field of animal genetics (whole-genome prediction, WGP), we developed a WGP model for the study of YL and evaluated the extent to which thousands of genetic variants across the genome examined simultaneously can be used to predict interindividual differences in YL. We find that a sizable proportion of differences in YL--which were unexplained by age at entry, sex, smoking and BMI--can be accounted for and predicted using WGP methods. The contribution of genomic information to prediction accuracy was even higher than that of smoking and body mass index (BMI) combined; two predictors that are considered among the most important life-shortening factors. We evaluated the impacts of familial relationships and population structure (as described by the first two marker-derived principal components) and concluded that in our dataset population structure explained partially, but not fully the gains in prediction accuracy obtained with WGP. Further inspection of prediction accuracies by age at death indicated that most of the gains in predictive ability achieved with WGP were due to the increased accuracy of prediction of early mortality, perhaps reflecting the ability of WGP to capture differences in genetic risk to deadly diseases such as cancer, which are most often responsible for early mortality in our sample.

An ensemble-based approach to imputation of moderate-density genotypes for genomic selection with application to Angus cattle. Genet Res (Camb) 2012;94:133-50. [PMID: 22809677 DOI: 10.1017/s001667231200033x] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Indexed: 01/25/2023]

Abstract

Imputation of moderate-density genotypes from low-density panels is of increasing interest in genomic selection, because it can dramatically reduce genotyping costs. Several imputation software packages have been developed, but they vary in imputation accuracy, and imputed genotypes may be inconsistent among methods. An AdaBoost-like approach is proposed to combine imputation results from several independent software packages, i.e. Beagle(v3.3), IMPUTE(v2.0), fastPHASE(v1.4), AlphaImpute, findhap(v2) and Fimpute(v2), with each package serving as a basic classifier in an ensemble-based system. The ensemble-based method computes weights sequentially for all classifiers, and combines results from component methods via weighted majority 'voting' to determine unknown genotypes. The data included 3078 registered Angus cattle, each genotyped with the Illumina BovineSNP50 BeadChip. SNP genotypes on three chromosomes (BTA1, BTA16 and BTA28) were used to compare imputation accuracy among methods, and the application involved the imputation of 50K genotypes covering 29 chromosomes based on a set of 5K genotypes. Beagle and Fimpute had the greatest accuracy among the six imputation packages, which ranged from 0.8677 to 0.9858. The proposed ensemble method was better than any of these packages, but the sequence of independent classifiers in the voting scheme affected imputation accuracy. The ensemble systems yielding the best imputation accuracies were those that had Beagle as first classifier, followed by one or two methods that utilized pedigree information. A salient feature of the proposed ensemble method is that it can solve imputation inconsistencies among different imputation methods, hence leading to a more reliable system for imputing genotypes relative to independent methods.
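
The weighted majority vote described in this abstract can be illustrated with a minimal Python sketch. It assumes genotypes are coded as 0, 1 or 2 copies of an allele and that each package already has a fixed weight. The method names and weight values are hypothetical, and the sequential, AdaBoost-like computation of the weights reported by the authors is not reproduced here.

```python
from collections import defaultdict


def weighted_vote(calls, weights):
    """Combine genotype calls from several imputation methods.

    calls   -- dict: method name -> list of genotype codes (0, 1, 2), one per marker
    weights -- dict: method name -> non-negative weight (hypothetical values)
    """
    n_markers = len(next(iter(calls.values())))
    consensus = []
    for i in range(n_markers):
        score = defaultdict(float)
        for method, genotypes in calls.items():
            # Each method adds its weight to the genotype it called.
            score[genotypes[i]] += weights[method]
        # Largest total weight wins; ties go to the smaller genotype code
        # so that the result is deterministic.
        best = min(score, key=lambda g: (-score[g], g))
        consensus.append(best)
    return consensus


# Hypothetical calls from three methods at four markers.
calls = {
    "method_a": [0, 1, 2, 1],
    "method_b": [0, 2, 2, 0],
    "method_c": [1, 2, 2, 0],
}
weights = {"method_a": 0.4, "method_b": 0.35, "method_c": 0.3}
consensus = weighted_vote(calls, weights)
print(consensus)
```

In this toy input, the two lower-weighted methods outvote the highest-weighted one wherever they agree with each other, which is the behaviour a weighted majority vote is meant to have.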

Mulder HA, Calus MPL, Druet T, Schrooten C. Imputation of genotypes with low-density chips and its effect on reliability of direct genomic values in Dutch Holstein cattle. J Dairy Sci 2012;95:876-89. [PMID: 22281352 DOI: 10.3168/jds.2011-4490] [Citation(s) in RCA: 77] [Impact Index Per Article: 6.4] [Received: 04/27/2011] [Accepted: 10/25/2011] [Indexed: 11/19/2022]

Abstract

Genomic selection using 50,000 single nucleotide polymorphism (50k SNP) chips has been implemented in many dairy cattle breeding programs. Cheap, low-density chips make genotyping of a larger number of animals cost effective. A commonly proposed strategy is to impute low-density genotypes up to 50,000 genotypes before predicting direct genomic values (DGV). The objectives of this study were to investigate the accuracy of imputation for animals genotyped with a low-density chip and to investigate the effect of imputation on reliability of DGV. Low-density chips contained 384, 3,000, or 6,000 SNP. The SNP were selected based either on the highest minor allele frequency in a bin or the middle SNP in a bin, and DAGPHASE, CHROMIBD, and multivariate BLUP were used for imputation. Genotypes of 9,378 animals were used, from which approximately 2,350 animals had deregressed proofs. Bayesian stochastic search variable selection was used for estimating SNP effects of the 50k chip. Imputation accuracies and imputation error rates were poor for low-density chips with 384 SNP. Imputation accuracies were higher with 3,000 and 6,000 SNP. Performance of DAGPHASE and CHROMIBD was very similar and much better than that of multivariate BLUP for both imputation accuracy and reliability of DGV. With 3,000 SNP and using CHROMIBD or DAGPHASE for imputation, 84 to 90% of the increase in DGV reliability using the 50k chip, compared with a pedigree index, was obtained. With multivariate BLUP, the increase in reliability was only 40%. With 384 SNP, the reliability of DGV was lower than for a pedigree index, whereas with 6,000 SNP, about 93% of the increase in reliability of DGV based on the 50k chip was obtained when using DAGPHASE for imputation. Using genotype probabilities to predict gene content increased imputation accuracy and the reliability of DGV and is therefore recommended for applications of imputation for genomic prediction. A deterministic equation was derived to predict accuracy of DGV based on imputation accuracy, which fitted closely with the observed relationship. The deterministic equation can be used to evaluate the effect of differences in imputation accuracy on accuracy and reliability of DGV.

Maltecca C, Parker KL, Cassady JP. Application of multiple shrinkage methods to genomic predictions. J Anim Sci 2012;90:1777-87. [DOI: 10.2527/jas.2011-4350] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022]

Imputation of genotypes from low- to high-density genotyping platforms and implications for genomic selection. Animal 2012;5:1162-9. [PMID: 22440168 DOI: 10.1017/s1751731111000309] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.9] [Indexed: 02/02/2023]

Abstract

The objective of this study was to quantify the accuracy achievable from imputing genotypes from a commercially available low-density marker panel (2730 single nucleotide polymorphisms (SNPs) following edits) to a commercially available higher density marker panel (51 602 SNPs following edits) in Holstein-Friesian cattle using Beagle, a freely available software package. A population of 764 Holstein-Friesian animals born since 2006 were used as the test group to quantify the accuracy of imputation, all of which had genotypes for the high-density panel; only SNPs on the low-density panel were retained with the remaining SNPs to be imputed. The reference population for imputation consisted of 4732 animals born before 2006 also with genotypes on the higher density marker panel. The concordance between the actual and imputed genotypes in the test group of animals did not vary across chromosomes and was on average 95%; the concordance between actual and imputed alleles was, on average, 97% across all SNPs. Genomic predictions were undertaken across a range of production and functional traits for the 764 test group animals using either their real or imputed genotypes. Little or no mean difference in the genomic predictions was evident when comparing direct genomic values (DGVs) using real or imputed genotypes. The average correlation between the DGVs estimated using the real or imputed genotypes for the 15 traits included in the Irish total merit index was 0.97 (range of 0.92 to 0.99), indicating good concordance between proofs from real or imputed genotypes. Results show that a commercially available high-density marker panel can be imputed from a commercially available lower density marker panel, which will also have a lower cost, thereby facilitating a reduction in the cost of genomic selection. Increased available numbers of genotyped and phenotyped animals also have implications for increasing the accuracy of genomic prediction in the entire population and thus genetic gain using genomic selection.
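
The abstract reports two concordance figures, one for genotypes (95%) and one for alleles (97%). The sketch below shows one way such figures can be computed, assuming genotypes are coded as 0, 1 or 2 copies of an allele and that a genotype differing by one copy counts as one wrong allele out of two. This allele-level definition and the toy data are assumptions made for illustration; the abstract does not state the exact formula the authors used.

```python
def concordance(actual, imputed):
    """Return (genotype concordance, allele concordance) for coded genotypes.

    Genotypes are counts of one allele (0, 1 or 2). A genotype is concordant
    only when both codes match. Allele concordance gives partial credit:
    a difference of one copy means one of the two alleles is still correct.
    """
    n = len(actual)
    pairs = list(zip(actual, imputed))
    genotype = sum(a == b for a, b in pairs) / n
    wrong_alleles = sum(abs(a - b) for a, b in pairs)
    allele = 1 - wrong_alleles / (2 * n)
    return genotype, allele


# Hypothetical true and imputed genotypes at ten markers for one animal.
actual = [0, 1, 2, 1, 0, 2, 1, 1, 0, 2]
imputed = [0, 1, 2, 2, 0, 2, 1, 1, 0, 0]
genotype_conc, allele_conc = concordance(actual, imputed)
print(genotype_conc, allele_conc)
```

Under this definition allele concordance can never be lower than genotype concordance, which is consistent with the ordering of the two figures reported above.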

Use of partial least squares regression to predict single nucleotide polymorphism marker genotypes when some animals are genotyped with a low-density panel. Animal 2012;5:833-7. [PMID: 22440021 DOI: 10.1017/s1751731110002600] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/05/2022]

Pérez-Cabal MA, Vazquez AI, Gianola D, Rosa GJM, Weigel KA. Accuracy of Genome-Enabled Prediction in a Dairy Cattle Population using Different Cross-Validation Layouts. Front Genet 2012;3:27. [PMID: 22403583 PMCID: PMC3288819 DOI: 10.3389/fgene.2012.00027] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Received: 11/08/2011] [Accepted: 02/13/2012] [Indexed: 11/26/2022]

A simple method for genomic selection of moderately sized dairy cattle populations. Animal 2012;6:193-202. [DOI: 10.1017/s1751731111001704] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/07/2022]

Long N, Gianola D, Rosa GJM, Weigel KA. Application of support vector regression to genome-assisted prediction of quantitative traits. Theor Appl Genet 2011;123:1065-1074. [PMID: 21739137 DOI: 10.1007/s00122-011-1648-y] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Received: 03/14/2011] [Accepted: 06/22/2011] [Indexed: 05/31/2023]

Zhang Z, Ding X, Liu J, Zhang Q, de Koning DJ. Accuracy of genomic prediction using low-density marker panels. J Dairy Sci 2011;94:3642-50. [PMID: 21700054 DOI: 10.3168/jds.2010-3917] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Received: 10/10/2010] [Accepted: 03/15/2011] [Indexed: 12/26/2022]

Abstract

Genomic selection has been widely implemented in national and international genetic evaluation in the dairy cattle industry, because of its potential advantages over traditional selection methods and the availability of commercial high-density (HD) single nucleotide polymorphism (SNP) panels. However, this method may not be cost-effective for cow selection and for other livestock species, because the cost of HD SNP panels is still relatively high. One possible solution that can enable other species to benefit from this promising method is genomic selection with low-density (LD) SNP panels. In this simulation study, LD SNP panels designed with different strategies and different SNP densities were compared. The effects of number of quantitative trait loci, heritability, and effective population size were evaluated in the framework of genomic selection with LD SNP panels. Methodologies of Bayesian variable selection; BLUP with a trait-specific, marker-derived relationship matrix; and BLUP with a realized relationship matrix were employed to predict genomic estimated breeding values with both HD and LD SNP panels. Up to 95% of accuracy obtained by using an HD panel can be obtained by using only a small proportion of markers. The LD panel with markers selected on the basis of their effects always performs better than the LD panel with evenly spaced markers. Both the genetic architecture of the trait and the effective population size have a significant effect on the performance of the LD panels. We concluded that, to implement genomic selection with LD panels, a training population of sufficient size and genotyped with an HD panel is necessary. The trade-off between the LD panels with evenly spaced markers and selected markers must be considered, which depends on the number of target traits in a breeding program and the genetic architecture of these traits. Genomic selection with LD panels could be feasible and cost-effective, though before implementation, a further detailed genetic and economic analysis is recommended.

Advances in genomic selection in domestic animals. Chin Sci Bull 2011. [DOI: 10.1007/s11434-011-4632-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 10/17/2022]

Pre-selection of markers for genomic selection. BMC Proc 2011;5 Suppl 3:S12. [PMID: 21624168 PMCID: PMC3103197 DOI: 10.1186/1753-6561-5-s3-s12] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 11/26/2022]

Weller JI, Ron M. Invited review: quantitative trait nucleotide determination in the era of genomic selection. J Dairy Sci 2011;94:1082-90. [PMID: 21338774 DOI: 10.3168/jds.2010-3793] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.2] [Received: 09/07/2010] [Accepted: 11/09/2010] [Indexed: 12/24/2022]

Makowsky R, Pajewski NM, Klimentidis YC, Vazquez AI, Duarte CW, Allison DB, de los Campos G. Beyond missing heritability: prediction of complex traits. PLoS Genet 2011;7:e1002051. [PMID: 21552331 PMCID: PMC3084207 DOI: 10.1371/journal.pgen.1002051] [Citation(s) in RCA: 210] [Impact Index Per Article: 16.2] [Received: 10/21/2010] [Accepted: 03/02/2011] [Indexed: 01/25/2023]

Long N, Gianola D, Rosa G, Weigel K. Dimension reduction and variable selection for genomic selection: application to predicting milk yield in Holsteins. J Anim Breed Genet 2011;128:247-57. [DOI: 10.1111/j.1439-0388.2011.00917.x] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Indexed: 01/31/2023]

VanRaden PM, O'Connell JR, Wiggans GR, Weigel KA. Genomic evaluations with many more genotypes. Genet Sel Evol 2011;43:10. [PMID: 21366914 PMCID: PMC3056758 DOI: 10.1186/1297-9686-43-10] [Citation(s) in RCA: 180] [Impact Index Per Article: 13.8] [Received: 09/24/2010] [Accepted: 03/02/2011] [Indexed: 01/17/2023]

Abstract

BACKGROUND

Genomic evaluations in Holstein dairy cattle have quickly become more reliable over the last two years in many countries as more animals have been genotyped for 50,000 markers. Evaluations can also include animals genotyped with more or fewer markers using new tools such as the 777,000 or 2,900 marker chips recently introduced for cattle. Gains from more markers can be predicted using simulation, whereas strategies to use fewer markers have been compared using subsets of actual genotypes. The overall cost of selection is reduced by genotyping most animals at less than the highest density and imputing their missing genotypes using haplotypes. Algorithms to combine different densities need to be efficient because numbers of genotyped animals and markers may continue to grow quickly.

METHODS

Genotypes for 500,000 markers were simulated for the 33,414 Holsteins that had 50,000 marker genotypes in the North American database. Another 86,465 non-genotyped ancestors were included in the pedigree file, and linkage disequilibrium was generated directly in the base population. Mixed density datasets were created by keeping 50,000 (every tenth) of the markers for most animals. Missing genotypes were imputed using a combination of population haplotyping and pedigree haplotyping. Reliabilities of genomic evaluations using linear and nonlinear methods were compared.

RESULTS

Differing marker sets for a large population were combined with just a few hours of computation. About 95% of paternal alleles were determined correctly, and > 95% of missing genotypes were called correctly. Reliability of breeding values was already high (84.4%) with 50,000 simulated markers. The gain in reliability from increasing the number of markers to 500,000 was only 1.6%, but more than half of that gain resulted from genotyping just 1,406 young bulls at higher density. Linear genomic evaluations had reliabilities 1.5% lower than the nonlinear evaluations with 50,000 markers and 1.6% lower with 500,000 markers.

CONCLUSIONS

Methods to impute genotypes and compute genomic evaluations were affordable with many more markers. Reliabilities for individual animals can be modified to reflect success of imputation. Breeders can improve reliability at lower cost by combining marker densities to increase both the numbers of markers and animals included in genomic evaluation. Larger gains are expected from increasing the number of animals than the number of markers.

Vazquez AI, Rosa GJM, Weigel KA, de los Campos G, Gianola D, Allison DB. Predictive ability of subsets of single nucleotide polymorphisms with and without parent average in US Holsteins. J Dairy Sci 2011;93:5942-9. [PMID: 21094768 DOI: 10.3168/jds.2010-3335] [Citation(s) in RCA: 63] [Impact Index Per Article: 4.8] [Received: 04/11/2010] [Accepted: 08/12/2010] [Indexed: 01/15/2023]

Abstract

Genome-enabled prediction of breeding values using high-density panels (HDP) can be highly accurate, even for young sires. However, the cost of the assay may limit its use to elite animals only. Low-density panels (LDP) containing a subset of single nucleotide polymorphisms (SNP) may give reasonably accurate predictions and could be used cost-effectively with young males and females. This study evaluates strategies for selecting subsets of SNP for several traits, compares predictive ability of LDP with that of HDP, and assesses the benefits of including parent average (PA) as a predictor in models using LDP. Data consisting of progeny-test predicted transmitting ability (PTA) for net merit and 6 other traits of economic interest from 4,783 Holstein sires were evaluated using testing and training sets with regressions on their high-density genotypes and parent averages for net merit index. Additionally, SNP subsets of different sizes were selected using different strategies, including the "best" SNP based on the absolute values of their estimated effects from HDP models for either the trait itself or lifetime net merit, and evenly spaced (ES) SNP across the genome. Overall, HDP models had the best predictive ability, setting an upper bound for the predictive ability of LDP sets. Low-density panels targeting the SNP with strongest effects (for either a single trait or lifetime net merit) provided reasonably accurate predictions and generally outperformed predictions based on evenly spaced SNP. For example, evenly spaced sets would require at least 5,000 to 7,500 SNP to reach 95% of the predictive ability provided by HDP. On the other hand, this level of predictive ability can be achieved with sets of 2,000 SNP when SNP are selected based on magnitude of estimated effects for the trait. Accuracy of predictions based on LDP can be improved markedly by including parent average as a fixed effect in the model; for example, a set with the 1,000 best SNP using the parent average achieved 95% of the accuracy of an HDP model.
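
The two subset-selection strategies compared in this abstract, the "best" SNP by absolute estimated effect and evenly spaced SNP, can be sketched in a few lines of Python. The effect values are hypothetical, and SNP are assumed to be listed in genome order so that list position stands in for physical position; the abstract does not describe how the authors implemented the spacing.

```python
def top_snps_by_effect(effects, k):
    """Indices of the k SNP with the largest absolute estimated effects."""
    ranked = sorted(range(len(effects)), key=lambda j: abs(effects[j]), reverse=True)
    return sorted(ranked[:k])


def evenly_spaced_snps(n_snps, k):
    """Indices of k SNP spread evenly over n_snps ordered positions.

    Each selected index is the midpoint of one of k equal-width bins,
    computed with integer arithmetic to avoid rounding surprises.
    """
    return [(2 * i + 1) * n_snps // (2 * k) for i in range(k)]


# Hypothetical effects for eight SNP, as might come from a high-density model.
effects = [0.02, -0.90, 0.10, 0.45, -0.05, 0.30, -0.60, 0.01]
best = top_snps_by_effect(effects, 3)
spaced = evenly_spaced_snps(len(effects), 3)
print(best, spaced)
```

The two strategies generally pick different markers, which is why their predictive ability can differ at the same panel size.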

Zhang Z, Druet T. Marker imputation with low-density marker panels in Dutch Holstein cattle. J Dairy Sci 2011;93:5487-94. [PMID: 20965364 DOI: 10.3168/jds.2010-3501] [Citation(s) in RCA: 88] [Impact Index Per Article: 6.8] [Received: 06/03/2010] [Accepted: 08/13/2010] [Indexed: 11/19/2022]

Improved Lasso for genomic selection. Genet Res (Camb) 2010;93:77-87. [PMID: 21144129 DOI: 10.1017/s0016672310000534] [Citation(s) in RCA: 82] [Impact Index Per Article: 5.9] [Indexed: 11/07/2022]

L2-Boosting algorithm applied to high-dimensional problems in genomic selection. Genet Res (Camb) 2010;92:227-37. [PMID: 20667166 DOI: 10.1017/s0016672310000261] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Indexed: 11/07/2022]

Abstract

The L2-Boosting algorithm is one of the most promising machine-learning techniques to have appeared in recent decades. It may be applied to high-dimensional problems such as whole-genome studies, and it is relatively simple from a computational point of view. In this study, we used this algorithm in a genomic selection context to make predictions of yet-to-be-observed outcomes. Two data sets were used: (1) productive lifetime predicted transmitting abilities from 4702 Holstein sires genotyped for 32 611 single nucleotide polymorphisms (SNPs) derived from the Illumina BovineSNP50 BeadChip, and (2) progeny averages of food conversion rate, pre-corrected by environmental and mate effects, in 394 broilers genotyped for 3481 SNPs. Each of these data sets was split into training and testing sets, the latter comprising dairy or broiler sires whose ancestors were in the training set. Two weak learners, ordinary least squares (OLS) and non-parametric (NP) regression, were used for the L2-Boosting algorithm, to provide a stringent evaluation of the procedure. This algorithm was compared with Bayesian LASSO (BL; least absolute shrinkage and selection operator) and BayesA regression. Learning tasks were carried out in the training set, whereas validation of the models was performed in the testing set. Pearson correlations between predicted and observed responses in the dairy cattle (broiler) data set were 0.65 (0.33), 0.53 (0.37), 0.66 (0.26) and 0.63 (0.27) for OLS-Boosting, NP-Boosting, BL and BayesA, respectively. The smallest bias and mean-squared error (MSE) were obtained with OLS-Boosting in both the dairy cattle (0.08 and 1.08, respectively) and broiler (-0.011 and 0.006, respectively) data sets. In the dairy cattle data set, the BL was more accurate (bias=0.10 and MSE=1.10) than BayesA (bias=1.26 and MSE=2.81), whereas no differences between these two methods were found in the broiler data set. L2-Boosting with a suitable learner was found to be a competitive alternative for genomic selection applications, providing high accuracy and low bias in genomic-assisted evaluations with a relatively short computational time.
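
As a rough illustration of the procedure named in this abstract, the sketch below implements componentwise L2-Boosting with a one-predictor OLS weak learner in plain Python, together with the three validation statistics the abstract reports. It is not the authors' implementation: the function names (`l2_boost`, `predict`, `pearson`, `bias_and_mse`), the shrinkage factor of 0.1, the fixed number of boosting steps, and the definition of bias as the mean of predicted minus observed values are all assumptions made for this example.

```python
import math


def l2_boost(X, y, n_steps=200, shrinkage=0.1):
    """Componentwise L2-Boosting with a one-predictor OLS weak learner.

    X is a list of rows (e.g. SNP genotypes coded 0/1/2), y a list of responses.
    n_steps and shrinkage are illustrative defaults, not values from the paper.
    """
    n, p = len(X), len(X[0])
    intercept = sum(y) / n
    coef = [0.0] * p
    resid = [yi - intercept for yi in y]
    # Center each column once so every one-predictor fit needs no intercept.
    means = [sum(row[j] for row in X) / n for j in range(p)]
    cols = [[row[j] - means[j] for row in X] for j in range(p)]
    ss = [sum(v * v for v in col) for col in cols]
    for _ in range(n_steps):
        best_j, best_b, best_gain = None, 0.0, 0.0
        for j in range(p):
            if ss[j] == 0.0:  # a monomorphic marker carries no information
                continue
            b = sum(c * r for c, r in zip(cols[j], resid)) / ss[j]
            gain = b * b * ss[j]  # drop in residual sum of squares
            if gain > best_gain:
                best_j, best_b, best_gain = j, b, gain
        if best_j is None:
            break
        # Add a shrunken copy of the best weak learner, then update residuals.
        coef[best_j] += shrinkage * best_b
        resid = [r - shrinkage * best_b * c
                 for r, c in zip(resid, cols[best_j])]
    # Express the fitted model on the original (uncentered) scale.
    intercept -= sum(c * m for c, m in zip(coef, means))
    return intercept, coef


def predict(X, intercept, coef):
    """Linear prediction for each row of X."""
    return [intercept + sum(c * x for c, x in zip(coef, row)) for row in X]


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def bias_and_mse(predicted, observed):
    """Mean error (assumed definition of bias) and mean-squared error."""
    errors = [p - o for p, o in zip(predicted, observed)]
    return sum(errors) / len(errors), sum(e * e for e in errors) / len(errors)
```

In use, `l2_boost` would be fitted on the training animals only, and `pearson` and `bias_and_mse` would be evaluated on predictions for the testing animals, mirroring the training/testing split described above. In practice the number of steps would be tuned (for example by cross-validation) rather than fixed.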

de los Campos G, Gianola D, Allison DB. Predicting genetic predisposition in humans: the promise of whole-genome markers. Nat Rev Genet 2010;11:880-6. [DOI: 10.1038/nrg2898] [Citation(s) in RCA: 211] [Impact Index Per Article: 15.1] [Indexed: 12/15/2022]

Weigel K, de los Campos G, Vazquez A, Rosa G, Gianola D, Van Tassell C. Accuracy of direct genomic values derived from imputed single nucleotide polymorphism genotypes in Jersey cattle. J Dairy Sci 2010;93:5423-35. [DOI: 10.3168/jds.2010-3149] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.0] [Received: 02/08/2010] [Accepted: 07/05/2010] [Indexed: 01/22/2023]

Weigel KA, Van Tassell CP, O'Connell JR, VanRaden PM, Wiggans GR. Prediction of unobserved single nucleotide polymorphism genotypes of Jersey cattle using reference panels and population-based imputation algorithms. J Dairy Sci 2010;93:2229-38. [PMID: 20412938 DOI: 10.3168/jds.2009-2849] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.8] [Received: 10/22/2009] [Accepted: 01/04/2010] [Indexed: 12/30/2022]

Abstract

The availability of dense single nucleotide polymorphism (SNP) genotypes for dairy cattle has created exciting research opportunities and revolutionized practical breeding programs. Broader application of this technology will lead to situations in which genotypes from different low-, medium-, or high-density platforms must be combined. In this case, missing SNP genotypes can be imputed using family- or population-based algorithms. Our objective was to evaluate the accuracy of imputation in Jersey cattle, using reference panels comprising 2,542 animals with 43,385 SNP genotypes and study samples of 604 animals for which genotypes were available for 1, 2, 5, 10, 20, 40, or 80% of loci. Two population-based algorithms, fastPHASE 1.2 (P. Scheet and M. Stevens; University of Washington TechTransfer Digital Ventures Program, Seattle, WA) and IMPUTE 2.0 (B. Howie and J. Marchini; Department of Statistics, University of Oxford, UK), were used to impute genotypes on Bos taurus autosomes 1, 15, and 28. The mean proportion of genotypes imputed correctly ranged from 0.659 to 0.801 when 1 to 2% of genotypes were available in the study samples, from 0.733 to 0.964 when 5 to 20% of genotypes were available, and from 0.896 to 0.995 when 40 to 80% of genotypes were available. In the absence of pedigrees or genotypes of close relatives, the accuracy of imputation may be modest (generally <0.80) when low-density platforms with fewer than 1,000 SNP are used, but population-based algorithms can provide reasonably good accuracy (0.80 to 0.95) when medium-density platforms of 2,000 to 4,000 SNP are used in conjunction with high-density genotypes (e.g., >40,000 SNP) from a reference population. Accurate imputation of high-density genotypes from inexpensive low- or medium-density platforms could greatly enhance the efficiency of whole-genome selection programs in dairy cattle.
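
The accuracy measure quoted in this abstract, the proportion of genotypes imputed correctly, can be sketched as follows. This is only an illustration under stated assumptions: the function name `imputation_accuracy`, the 0/1/2 genotype coding, and the boolean `observed_mask` marking loci that were genotyped in the study sample (and are therefore excluded from scoring) are hypothetical, and the abstract does not say exactly how the authors scored their results.

```python
def imputation_accuracy(true_genotypes, imputed_genotypes, observed_mask):
    """Proportion of masked genotypes that were imputed correctly.

    true_genotypes / imputed_genotypes: sequences of genotype codes (assumed 0/1/2).
    observed_mask: True where the genotype was available to the imputation
    algorithm; such loci are skipped so they cannot inflate the score.
    """
    correct = 0
    total = 0
    for truth, imputed, observed in zip(true_genotypes, imputed_genotypes,
                                        observed_mask):
        if observed:
            continue
        total += 1
        if truth == imputed:
            correct += 1
    if total == 0:
        raise ValueError("no masked genotypes to evaluate")
    return correct / total
```

Averaging this value over animals (and over chromosomes) would give a mean proportion comparable in spirit to the ranges reported above.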

Moser G, Khatkar MS, Hayes BJ, Raadsma HW. Accuracy of direct genomic values in Holstein bulls and cows using subsets of SNP markers. Genet Sel Evol 2010;42:37. [PMID: 20950478 PMCID: PMC2964565 DOI: 10.1186/1297-9686-42-37] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.4] [Received: 05/26/2010] [Accepted: 10/16/2010] [Indexed: 08/26/2023]

Abstract

Background

At the current price, the use of high-density single nucleotide polymorphism (SNP) genotyping assays in genomic selection of dairy cattle is limited to applications involving elite sires and dams. The objective of this study was to evaluate the use of low-density assays to predict direct genomic value (DGV) on five milk production traits, an overall conformation trait, a survival index, and two profit index traits (APR, ASI).

Methods

Dense SNP genotypes were available for 42,576 SNP for 2,114 Holstein bulls and 510 cows. A subset of 1,847 bulls born between 1955 and 2004 was used as a training set to fit models with various sets of pre-selected SNP. A group of 297 bulls born between 2001 and 2004 and all cows born between 1992 and 2004 were used to evaluate the accuracy of DGV prediction. Ridge regression (RR) and partial least squares regression (PLSR) were used to derive prediction equations and to rank SNP based on the absolute value of the regression coefficients. Four alternative strategies were applied to select subsets of SNP, namely: subsets of the highest ranked SNP for each individual trait, or a single subset of evenly spaced SNP, where SNP were selected based on their rank for ASI, APR or minor allele frequency within intervals of approximately equal length.

Results

RR and PLSR performed very similarly in predicting DGV, with PLSR performing better for low-density assays and RR for higher-density SNP sets. When using all SNP, DGV predictions for production traits, which have a higher heritability, were more accurate (0.52-0.64) than for survival (0.19-0.20), which has a low heritability. The gain in accuracy using subsets that included the highest ranked SNP for each trait was marginal (5-6%) over a common set of evenly spaced SNP when at least 3,000 SNP were used. Subsets containing 3,000 SNP provided more than 90% of the accuracy that could be achieved with a high-density assay for cows, and 80% of the high-density assay for young bulls.

Conclusions

Accurate genomic evaluation of the broader bull and cow population can be achieved with a single genotyping assay containing ~3,000 to 5,000 evenly spaced SNP.
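
The evenly spaced selection strategy described in the Methods of this abstract (one SNP chosen per interval of approximately equal length, according to its rank) might look roughly like the sketch below. The function name `select_evenly_spaced`, the use of a single chromosome with positions in arbitrary units, and the choice of the highest-scoring SNP per interval are assumptions for illustration; `scores` stands in for whatever ranking criterion is used, such as the absolute regression coefficient or minor allele frequency.

```python
def select_evenly_spaced(positions, scores, n_intervals):
    """Pick the best-scoring SNP in each of n_intervals equal-length intervals.

    positions: map position of each SNP on one chromosome (any consistent unit).
    scores: ranking criterion per SNP, higher is better (e.g. |coefficient|).
    Returns the indices of the selected SNP, ordered along the chromosome.
    Intervals that contain no SNP contribute nothing to the result.
    """
    lo, hi = min(positions), max(positions)
    # Fall back to a width of 1.0 if all SNP share the same position.
    width = (hi - lo) / n_intervals or 1.0
    best = {}
    for idx, (pos, score) in enumerate(zip(positions, scores)):
        # Clamp so the right-most SNP falls in the last interval.
        k = min(int((pos - lo) / width), n_intervals - 1)
        if k not in best or score > scores[best[k]]:
            best[k] = idx
    return [best[k] for k in sorted(best)]
```

For a genome-wide panel the same function would be applied chromosome by chromosome, with `n_intervals` set so that the total number of selected SNP matches the target assay size.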

Prediction of genetic values of quantitative traits in plant breeding using pedigree and molecular markers. Genetics 2010;186:713-24. [PMID: 20813882 DOI: 10.1534/genetics.110.118521] [Citation(s) in RCA: 402] [Impact Index Per Article: 28.7] [Indexed: 12/21/2022]

Bayesian inference of genetic parameters based on conditional decompositions of multivariate normal distributions. Genetics 2010;185:645-54. [PMID: 20351218 DOI: 10.1534/genetics.110.114249] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 11/18/2022]

Garrick DJ, Taylor JF, Fernando RL. Deregressing estimated breeding values and weighting information for genomic regression analyses. Genet Sel Evol 2009;41:55. [PMID: 20043827 PMCID: PMC2817680 DOI: 10.1186/1297-9686-41-55] [Citation(s) in RCA: 417] [Impact Index Per Article: 27.8] [Received: 07/02/2009] [Accepted: 12/31/2009] [Indexed: 11/15/2022]