1
Qu J, Runcie D, Cheng H. Mega-scale Bayesian regression methods for genome-wide prediction and association studies with thousands of traits. Genetics 2023; 223:6931802. [PMID: 36529897 PMCID: PMC9991502 DOI: 10.1093/genetics/iyac183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2022] [Revised: 05/06/2022] [Accepted: 11/17/2022] [Indexed: 12/23/2022] Open
Abstract
Large-scale phenotype data are expected to increase the accuracy of genome-wide prediction and the power of genome-wide association analyses. However, genomic analyses of high-dimensional, highly correlated traits are challenging. We developed a method for implementing high-dimensional Bayesian multivariate regression to simultaneously analyze genetic variants underlying thousands of traits. As a demonstration, we implemented the BayesC prior in the R package MegaLMM. Applied to Genomic Prediction, MegaBayesC effectively integrated hyperspectral reflectance data from 620 hyperspectral wavelengths to improve the accuracy of genetic value prediction on grain yield in a wheat dataset. Applied to Genome-Wide Association Studies, we used simulations to show that MegaBayesC can accurately estimate the effect sizes of QTL across a range of genetic architectures and causes of correlations among traits. To apply MegaBayesC to a realistic scenario involving whole-genome marker data, we developed a 2-stage procedure involving a preliminary step of candidate marker selection prior to multivariate regression. We then used MegaBayesC to identify genetic associations with flowering time in Arabidopsis thaliana, leveraging expression data from 20,843 genes. MegaBayesC selected 15 single nucleotide polymorphisms as important for flowering time, with 13 located within 100 kb of known flowering-time related genes, a higher validation rate than achieved by a single-stage analysis using only the flowering time data itself. These results demonstrate that MegaBayesC can efficiently and effectively leverage high-dimensional phenotypes in genetic analyses.
Affiliation(s)
- Jiayi Qu
  - Department of Animal Science, University of California Davis, Davis, CA 95616, USA
- Daniel Runcie
  - Department of Plant Sciences, University of California Davis, Davis, CA 95616, USA
- Hao Cheng
  - Department of Plant Sciences, University of California Davis, Davis, CA 95616, USA
2
Wolc A, Dekkers JCM. Application of Bayesian genomic prediction methods to genome-wide association analyses. Genet Sel Evol 2022; 54:31. [PMID: 35562659 PMCID: PMC9103490 DOI: 10.1186/s12711-022-00724-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/28/2022] [Accepted: 04/27/2022] [Indexed: 11/19/2022] Open
Abstract
Background Bayesian genomic prediction methods were developed to simultaneously fit all genotyped markers to a set of available phenotypes for prediction of breeding values for quantitative traits, allowing for differences in the genetic architecture (distribution of marker effects) of traits. These methods also provide a flexible and reliable framework for genome-wide association (GWA) studies. The objective here was to review developments in Bayesian hierarchical and variable selection models for GWA analyses. Results By fitting all genotyped markers simultaneously, Bayesian GWA methods implicitly account for population structure and the multiple-testing problem of classical single-marker GWA. Implemented using Markov chain Monte Carlo methods, Bayesian GWA methods allow for control of error rates using probabilities obtained from posterior distributions. Power of GWA studies using Bayesian methods can be enhanced by using informative priors based on previous association studies, gene expression analyses, or functional annotation information. Applied to multiple traits, Bayesian GWA analyses can give insight into pleiotropic effects by multi-trait, structural equation, or graphical models. Bayesian methods can also be used to combine genomic, transcriptomic, proteomic, and other -omics data to infer causal genotype to phenotype relationships and to suggest external interventions that can improve performance. Conclusions Bayesian hierarchical and variable selection methods provide a unified and powerful framework for genomic prediction, GWA, integration of prior information, and integration of information from other -omics platforms to identify causal mutations for complex quantitative traits.
Affiliation(s)
- Anna Wolc
  - Department of Animal Science, Iowa State University, 806 Stange Road, 239 Kildee Hall, Ames, IA, 50010, USA
  - Hy-Line International, 2583 240th Street, Dallas Center, IA, 50063, USA
- Jack C M Dekkers
  - Department of Animal Science, Iowa State University, 806 Stange Road, 239 Kildee Hall, Ames, IA, 50010, USA
3
Yoosefzadeh-Najafabadi M, Eskandari M, Belzile F, Torkamaneh D. Genome-Wide Association Study Statistical Models: A Review. Methods Mol Biol 2022; 2481:43-62. [PMID: 35641758 DOI: 10.1007/978-1-0716-2237-7_4] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 06/15/2023]
Abstract
Statistical models are at the core of the genome-wide association study (GWAS). In this chapter, we provide an overview of single- and multilocus statistical models, Bayesian, and machine learning approaches for association studies in plants. These models are discussed based on their basic methodology, cofactors adjustment accounted for, statistical power and computational efficiency. New statistical models and machine learning algorithms are both showing improved performance in detecting missed signals, rare mutations and prioritizing causal genetic variants; nevertheless, further optimization and validation studies are required to maximize the power of GWAS.
Affiliation(s)
- Milad Eskandari
  - Department of Plant Agriculture, University of Guelph, Guelph, ON, Canada
- François Belzile
  - Département de Phytologie, Université Laval, Quebec City, QC, Canada
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Quebec City, QC, Canada
- Davoud Torkamaneh
  - Département de Phytologie, Université Laval, Quebec City, QC, Canada
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Quebec City, QC, Canada
4
McGaugh SE, Lorenz AJ, Flagel LE. The utility of genomic prediction models in evolutionary genetics. Proc Biol Sci 2021; 288:20210693. [PMID: 34344180 PMCID: PMC8334854 DOI: 10.1098/rspb.2021.0693] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/23/2021] [Accepted: 07/15/2021] [Indexed: 12/25/2022] Open
Abstract
Variation in complex traits is the result of contributions from many loci of small effect. Based on this principle, genomic prediction methods are used to make predictions of breeding value for an individual using genome-wide molecular markers. In breeding, genomic prediction models have been used in plant and animal breeding for almost two decades to increase rates of genetic improvement and reduce the length of artificial selection experiments. However, evolutionary genomics studies have been slow to incorporate this technique to select individuals for breeding in a conservation context or to learn more about the genetic architecture of traits, the genetic value of missing individuals or microevolution of breeding values. Here, we outline the utility of genomic prediction and provide an overview of the methodology. We highlight opportunities to apply genomic prediction in evolutionary genetics of wild populations and the best practices when using these methods on field-collected phenotypes.
Affiliation(s)
- Suzanne E. McGaugh
  - Ecology, Evolution, and Behavior, University of Minnesota, 140 Gortner Lab, 1479 Gortner Avenue, Saint Paul, MN 55108, USA
- Aaron J. Lorenz
  - Agronomy and Plant Genetics, University of Minnesota, 411 Borlaug Hall, 1991 Upper Buford Circle, Saint Paul, MN 55108, USA
- Lex E. Flagel
  - Plant and Microbial Biology, University of Minnesota, 140 Gortner Lab, 1479 Gortner Avenue, Saint Paul, MN 55108, USA
  - Bayer Crop Science, 700 W Chesterfield Parkway, Chesterfield, MO 63017, USA
5
Joshi R, Skaarud A, Alvarez AT, Moen T, Ødegård J. Bayesian genomic models boost prediction accuracy for survival to Streptococcus agalactiae infection in Nile tilapia (Oreochromus nilioticus). Genet Sel Evol 2021; 53:37. [PMID: 33882834 PMCID: PMC8058985 DOI: 10.1186/s12711-021-00629-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/14/2020] [Accepted: 04/06/2021] [Indexed: 11/10/2022] Open
Abstract
Background Streptococcosis is a major bacterial disease in Nile tilapia that is caused by Streptococcus agalactiae infection, and development of resistant strains of Nile tilapia represents a sustainable approach towards combating this disease. In this study, we performed a controlled disease trial on 120 full-sib families to (i) quantify and characterize the potential of genomic selection for survival to S. agalactiae infection in Nile tilapia, and (ii) identify the best genomic model and the optimal density of single nucleotide polymorphisms (SNPs) for this trait. Methods In total, 40 fish per family (15 fish intraperitoneally injected and 25 fish as cohabitants) were used in the challenge test. Mortalities were recorded every 3 h for 35 days. After quality control, genotypes (50,690 SNPs) and phenotypes (0 for dead and 1 for alive) for 2472 cohabitant fish were available. Genetic parameters were obtained using various genomic selection models (genomic best linear unbiased prediction (GBLUP), BayesB, BayesC, BayesR and BayesS) and a traditional pedigree-based model (PBLUP). The pedigree-based analysis used a deep 17-generation pedigree. Prediction accuracy and bias were evaluated using five replicates of tenfold cross-validation. The genomic models were further analyzed using 10 subsets of SNPs at different densities to explore the effect of pruning and SNP density on predictive accuracy. Results Moderate estimates of heritabilities ranging from 0.15 ± 0.03 to 0.26 ± 0.05 were obtained with the different models. Compared to a pedigree-based model, GBLUP (using all the SNPs) increased prediction accuracy by 15.4%. Furthermore, use of the most appropriate Bayesian genomic selection model and SNP density increased the prediction accuracy up to 71%. The 40 to 50 SNPs with non-zero effects were consistent for all BayesB, BayesC and BayesS models with respect to marker id and/or marker locations. 
Conclusions These results demonstrate the potential of genomic selection for survival to S. agalactiae infection in Nile tilapia. Compared to the PBLUP and GBLUP models, Bayesian genomic models were found to boost the prediction accuracy significantly. Supplementary Information The online version contains supplementary material available at 10.1186/s12711-021-00629-y.
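The evaluation scheme named in this abstract (five replicates of tenfold cross-validation, reporting accuracy and bias) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: ridge regression on markers stands in for the genomic models, the trait is simulated and continuous rather than binary survival, and the function names, the shrinkage value `lam`, and the simulation settings are all hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge estimate of marker effects; lam is an assumed shrinkage value."""
    mu = y.mean()
    p = X.shape[1]
    beta = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ (y - mu))
    return mu, beta

def repeated_kfold_accuracy(X, y, k=10, replicates=5, lam=50.0, seed=0):
    """Return per-fold accuracy (correlation) and bias (slope of observed on predicted)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    accuracies, slopes = [], []
    for _ in range(replicates):
        # A fresh random partition into k folds for every replicate.
        folds = np.array_split(rng.permutation(n), k)
        for test_idx in folds:
            train = np.ones(n, dtype=bool)
            train[test_idx] = False
            mu, beta = ridge_fit(X[train], y[train], lam)
            pred = mu + X[test_idx] @ beta
            obs = y[test_idx]
            accuracies.append(np.corrcoef(pred, obs)[0, 1])
            slopes.append(np.cov(obs, pred)[0, 1] / np.var(pred, ddof=1))
    return np.array(accuracies), np.array(slopes)

def simulate(n=300, p=200, n_qtl=20, h2=0.5, seed=1):
    """Simulate centered genotypes and a continuous trait (illustrative values only)."""
    rng = np.random.default_rng(seed)
    X = rng.binomial(2, 0.3, size=(n, p)).astype(float)
    X -= X.mean(axis=0)
    beta = np.zeros(p)
    beta[rng.choice(p, n_qtl, replace=False)] = rng.normal(size=n_qtl)
    g = X @ beta
    e = rng.normal(scale=np.sqrt(g.var() * (1 - h2) / h2), size=n)
    return X, g + e

X, y = simulate()
acc, slope = repeated_kfold_accuracy(X, y)
print(f"mean accuracy {acc.mean():.2f}, mean slope {slope.mean():.2f}")
```

A slope near 1 would indicate predictions that are neither inflated nor deflated; the 5 x 10 = 50 fold-level values can be averaged within replicate to obtain a replicate-level standard error.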
Affiliation(s)
- Rajesh Joshi
  - GenoMar Genetics AS, Tjuvholmen allé 11, 0252, Oslo, Norway
- Anders Skaarud
  - GenoMar Genetics AS, Tjuvholmen allé 11, 0252, Oslo, Norway
- Thomas Moen
  - AquaGen AS, Sluppen, P.O. Box 1240, 7462, Trondheim, Norway
- Jørgen Ødegård
  - AquaGen AS, Sluppen, P.O. Box 1240, 7462, Trondheim, Norway
6
Dissecting the Genetic Architecture of Biofuel-Related Traits in a Sorghum Breeding Population. G3-GENES GENOMES GENETICS 2020; 10:4565-4577. [PMID: 33051261 PMCID: PMC7718745 DOI: 10.1534/g3.120.401582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
In sorghum [Sorghum bicolor (L.) Moench], hybrid cultivars for the biofuel industry are desired. Along with selection based on testcross performance, evaluation of the breeding population per se is also important for the success of hybrid breeding. In addition to additive genetic effects, non-additive (i.e., dominance and epistatic) effects are expected to contribute to the performance of early generations. Unfortunately, studies on early generations in sorghum breeding programs are limited. In this study, we analyzed a breeding population for bioenergy sorghum, which was previously developed based on testcross performance, to compare genomic selection models both trained on and evaluated for the per se performance of the 3rd generation S0 individuals. Of over 200 ancestral inbred accessions in the base population, only 13 founders contributed to the 3rd generation as progenitors. Compared to the founders, the performances of the population per se were improved for target traits. The total genetic variance within the S0 generation progenies themselves for all traits was mainly additive, although non-additive variances contributed to each trait to some extent. For genomic selection, linear regression models explicitly considering all genetic components showed a higher predictive ability than other linear and non-linear models. Although the number and effect distribution of underlying loci was different among the traits, the influence of priors for marker effects was relatively small. These results indicate the importance of considering non-additive effects for dissecting the genetic architecture of early breeding generations and predicting the performance per se.
7
van Bergen GHH, Duenk P, Albers CA, Bijma P, Calus MPL, Wientjes YCJ, Kappen HJ. Bayesian neural networks with variable selection for prediction of genotypic values. Genet Sel Evol 2020; 52:26. [PMID: 32414320 PMCID: PMC7227313 DOI: 10.1186/s12711-020-00544-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/27/2019] [Accepted: 04/28/2020] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Estimating the genetic component of a complex phenotype is a complicated problem, mainly because there are many allele effects to estimate from a limited number of phenotypes. In spite of this difficulty, linear methods with variable selection have been able to give good predictions of additive effects of individuals. However, prediction of non-additive genetic effects is challenging with the usual prediction methods. In machine learning, non-additive relations between inputs can be modeled with neural networks. We developed a novel method (NetSparse) that uses Bayesian neural networks with variable selection for the prediction of genotypic values of individuals, including non-additive genetic effects. RESULTS We simulated several populations with different phenotypic models and compared NetSparse to genomic best linear unbiased prediction (GBLUP), BayesB, their dominance variants, and an additive by additive method. We found that when the number of QTL was relatively small (10 or 100), NetSparse had 2 to 28 percentage points higher accuracy than the reference methods. For scenarios that included dominance or epistatic effects, NetSparse had 0.0 to 3.9 percentage points higher accuracy for predicting phenotypes than the reference methods, except in scenarios with extreme overdominance, for which reference methods that explicitly model dominance had 6 percentage points higher accuracy than NetSparse. CONCLUSIONS Bayesian neural networks with variable selection are promising for prediction of the genetic component of complex traits in animal breeding, and their performance is robust across different genetic models. However, their large computational costs can hinder their use in practice.
Affiliation(s)
- Giel H. H. van Bergen
  - SNN Machine Learning Group, Biophysics Department, Donders Institute for Brain Cognition and Behavior, Radboud University, 6525 AJ Nijmegen, The Netherlands
- Pascal Duenk
  - Animal Breeding and Genomics, Wageningen University and Research, 6700 AH Wageningen, The Netherlands
- Cornelis A. Albers
  - Department of Molecular Developmental Biology, Radboud Institute for Molecular Life Sciences, Radboud University, 6500 HB Nijmegen, The Netherlands
  - Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, 6500 HB Nijmegen, The Netherlands
  - Present Address: Euretos B.V., Yalelaan 1, 3584 CL Utrecht, The Netherlands
- Piter Bijma
  - Animal Breeding and Genomics, Wageningen University and Research, 6700 AH Wageningen, The Netherlands
- Mario P. L. Calus
  - Animal Breeding and Genomics, Wageningen University and Research, 6700 AH Wageningen, The Netherlands
- Yvonne C. J. Wientjes
  - Animal Breeding and Genomics, Wageningen University and Research, 6700 AH Wageningen, The Netherlands
- Hilbert J. Kappen
  - SNN Machine Learning Group, Biophysics Department, Donders Institute for Brain Cognition and Behavior, Radboud University, 6525 AJ Nijmegen, The Netherlands
8
Wolc A, Drobik-Czwarno W, Jankowski T, Arango J, Settar P, Fulton JE, Fernando RL, Garrick DJ, Dekkers JCM. Accuracy of genomic prediction of shell quality in a White Leghorn line. Poult Sci 2020; 99:2833-2840. [PMID: 32475416 PMCID: PMC7597664 DOI: 10.1016/j.psj.2020.01.019] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/26/2019] [Revised: 01/13/2020] [Accepted: 01/20/2020] [Indexed: 11/16/2022] Open
Abstract
Several genomic methods were applied for predicting shell quality traits recorded at 4 different hen ages in a White Leghorn line. The accuracies of genomic prediction of single-step GBLUP and single-trait Bayes B were compared with predictions of breeding values based on pedigree-BLUP under single-trait or multitrait models. Breaking strength (BS) and dynamic stiffness (Kdyn) measurements were collected on 18,524 birds from 3 consecutive generations, of which 4,164 animals also had genotypes from an Affymetrix 50K panel containing 49,591 SNPs after quality control edits. All traits had low to moderate heritability, ranging from 0.17 for BS to 0.34 for Kdyn. The highest accuracies of prediction were obtained for the multitrait single-step model. The use of marker information resulted in higher prediction accuracies than pedigree-based models for almost all traits. A genome-wide association study based on a Bayes B model was conducted to detect regions explaining the largest proportion of genetic variance. Across all 8 shell quality traits analyzed, 7 regions each explaining over 2% of genetic variance and 54 regions each explaining over 1% of genetic variance were identified. The windows explaining a large proportion of genetic variance overlapped with several potential candidate genes with biological functions linked to shell formation. A multitrait repeatability model using a single-step method is recommended for genomic evaluation of shell quality in layer chickens.
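The idea of flagging genomic regions by the share of genetic variance they explain can be illustrated with a small sketch. It is a generic calculation under stated assumptions, not the procedure used in this study: marker effects are taken as fixed point estimates rather than MCMC samples, windows are supplied as an integer label per marker, and the function name `window_variance_share` and the toy data are hypothetical.

```python
import numpy as np

def window_variance_share(X, beta, window_of_marker):
    """Share of total genetic variance attributed to each window.

    X: genotype matrix (individuals x markers); beta: marker effect estimates;
    window_of_marker: integer window label for every marker.
    Shares need not sum to 1 when markers in different windows are correlated.
    """
    X = np.asarray(X, dtype=float)
    beta = np.asarray(beta, dtype=float)
    window_of_marker = np.asarray(window_of_marker)
    total = np.var(X @ beta)
    shares = {}
    for w in np.unique(window_of_marker):
        cols = window_of_marker == w
        # Variance of the genetic value contributed by this window alone.
        shares[int(w)] = float(np.var(X[:, cols] @ beta[cols]) / total)
    return shares

# Toy example with two uncorrelated markers, one per window.
X = np.array([[1, 1], [-1, 1], [1, -1], [-1, -1]])
beta = np.array([2.0, 1.0])
shares = window_variance_share(X, beta, [0, 1])
flagged = [w for w, s in shares.items() if s > 0.01]  # windows above a 1% threshold
print(shares, flagged)
```

In a Bayesian analysis the same ratio would typically be computed for every posterior sample of the marker effects and then summarized, which this point-estimate version does not attempt.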
Affiliation(s)
- A Wolc
  - Department of Animal Sciences, Iowa State University, Ames, IA 50011-1178, USA
  - Hy-Line International, Dallas Center, IA 50063, USA
- W Drobik-Czwarno
  - Department of Animal Genetics and Conservation, Institute of Animal Science, Warsaw University of Life Sciences, 02-787 Warsaw, Poland
- J Arango
  - Hy-Line International, Dallas Center, IA 50063, USA
- P Settar
  - Hy-Line International, Dallas Center, IA 50063, USA
- J E Fulton
  - Hy-Line International, Dallas Center, IA 50063, USA
- R L Fernando
  - Department of Animal Sciences, Iowa State University, Ames, IA 50011-1178, USA
- D J Garrick
  - Department of Animal Sciences, Iowa State University, Ames, IA 50011-1178, USA
- J C M Dekkers
  - Department of Animal Sciences, Iowa State University, Ames, IA 50011-1178, USA
9
Xavier A. Efficient Estimation of Marker Effects in Plant Breeding. G3 (BETHESDA, MD.) 2019; 9:3855-3866. [PMID: 31690600 PMCID: PMC6829119 DOI: 10.1534/g3.119.400728] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 08/21/2019] [Accepted: 09/18/2019] [Indexed: 12/15/2022]
Abstract
The evaluation of prediction machines is an important step for a successful implementation of genomic-enabled selection in plant breeding. Computation time and predictive ability constitute key metrics to determine the methodology utilized for the consolidation of genomic prediction pipeline. This study introduces two methods designed to couple high prediction accuracy with efficient computational performance: 1) a non-MCMC method to estimate marker effects with a Laplace prior; and 2) an iterative framework that allows solving whole-genome regression within mixed models with replicated observations in a single-stage. The investigation provides insights on predictive ability and marker effect estimates. Various genomic prediction techniques are compared based on cross-validation, assessing predictions across and within family. Properties of quantitative trait loci detection and single-stage method were evaluated on simulated plot-level data from unbalanced data structures. Estimation of marker effects by the new model is compared to a genome-wide association analysis and whole-genome regression methods. The single-stage approach is compared to a GBLUP fitted via restricted maximum likelihood, and a two-stages approaches where genetic values fit a whole-genome regression. The proposed framework provided high computational efficiency, robust prediction across datasets, and accurate estimation of marker effects.
Affiliation(s)
- Alencar Xavier
  - Corteva Agrisciences, 8305 NW 62nd Ave. Johnston IA
  - Purdue University, 915 W State St. West Lafayette IN
10
Yáñez JM, Yoshida GM, Parra Á, Correa K, Barría A, Bassini LN, Christensen KA, López ME, Carvalheiro R, Lhorente JP, Pulgar R. Comparative Genomic Analysis of Three Salmonid Species Identifies Functional Candidate Genes Involved in Resistance to the Intracellular Bacterium Piscirickettsia salmonis. Front Genet 2019; 10:665. [PMID: 31428125 PMCID: PMC6690157 DOI: 10.3389/fgene.2019.00665] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 03/24/2019] [Accepted: 06/25/2019] [Indexed: 12/23/2022] Open
Abstract
Piscirickettsia salmonis is the etiologic agent of salmon rickettsial syndrome (SRS) and is responsible for considerable economic losses in salmon aquaculture. The bacterium affects coho salmon (CS; Oncorhynchus kisutch), Atlantic salmon (AS; Salmo salar), and rainbow trout (RT; Oncorhynchus mykiss) in several countries, including Norway, Canada, Scotland, Ireland, and Chile. We used Bayesian genome-wide association study analyses to investigate the genetic architecture of resistance to P. salmonis in farmed populations of these species. Resistance to SRS was defined as the number of days to death and as binary survival (BS). A total of 828 CS, 2130 RT, and 2601 AS individuals were phenotyped and then genotyped using double-digest restriction site-associated DNA sequencing and 57K and 50K Affymetrix® Axiom® single nucleotide polymorphism (SNP) panels, respectively. Both traits of SRS resistance in CS and RT appeared to be under oligogenic control. In AS, there was evidence of polygenic control of SRS resistance. To identify candidate genes associated with resistance, we applied a comparative genomics approach in which we systematically explored the complete set of genes adjacent to SNPs, which explained more than 1% of the genetic variance of resistance in each salmonid species (533 genes in total). Thus, genes were classified based on the following criteria: i) shared function of their protein domains among species, ii) shared orthology among species, iii) proximity to the SNP explaining the highest proportion of the genetic variance, and iv) presence in more than one genomic region explaining more than 1% of the genetic variance within species. Our results allowed us to identify 120 candidate genes belonging to at least one of the four criteria described above. Of these, 21 of them were part of at least two of the criteria defined above and are suggested to be strong functional candidates influencing P. salmonis resistance. 
These genes are related to diverse biological processes, such as kinase activity, GTP hydrolysis, helicase activity, lipid metabolism, cytoskeletal dynamics, inflammation, and innate immune response, which seem essential in the host response against P. salmonis infection. These results provide fundamental knowledge on the potential functional genes underpinning resistance against P. salmonis in three salmonid species.
Affiliation(s)
- José M. Yáñez
  - Facultad de Ciencias Veterinarias y Pecuarias, Universidad de Chile, Santiago, Chile
  - Núcleo Milenio INVASAL, Concepción, Chile
- Grazyella M. Yoshida
  - Facultad de Ciencias Veterinarias y Pecuarias, Universidad de Chile, Santiago, Chile
- Ángel Parra
  - Facultad de Ciencias Veterinarias y Pecuarias, Universidad de Chile, Santiago, Chile
  - Instituto de Nutrición y Tecnología de los Alimentos, Universidad de Chile, Santiago, Chile
  - Doctorado en Acuicultura. Programa Cooperativo Universidad de Chile, Universidad Católica del Norte, Pontificia Universidad Católica de Valparaíso, Valparaíso, Chile
  - Facultad de Ciencias del Mar, Universidad Católica del Norte, Coquimbo, Chile
- Agustín Barría
  - Facultad de Ciencias Veterinarias y Pecuarias, Universidad de Chile, Santiago, Chile
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh Easter Bush, Midlothian, United Kingdom
- Liane N. Bassini
  - Escuela de Medicina Veterinaria, Facultad de Ciencias de la Vida, Universidad Andres Bello, Santiago, Chile
- Maria E. López
  - Facultad de Ciencias Veterinarias y Pecuarias, Universidad de Chile, Santiago, Chile
  - Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Roberto Carvalheiro
  - School of Agricultural and Veterinarian Sciences, São Paulo State University (Unesp), Jaboticabal, Brazil
  - National Council for Scientific and Technological Development (CNPq), Brasília, Brazil
- Rodrigo Pulgar
  - Instituto de Nutrición y Tecnología de los Alimentos, Universidad de Chile, Santiago, Chile
11
Genomic Prediction Accuracy for Resistance Against Piscirickettsia salmonis in Farmed Rainbow Trout. G3-GENES GENOMES GENETICS 2018; 8:719-726. [PMID: 29255117 PMCID: PMC5919750 DOI: 10.1534/g3.117.300499] [Citation(s) in RCA: 64] [Impact Index Per Article: 10.7] [Indexed: 11/24/2022]
Abstract
Salmonid rickettsial syndrome (SRS), caused by the intracellular bacterium Piscirickettsia salmonis, is one of the main diseases affecting rainbow trout (Oncorhynchus mykiss) farming. To accelerate genetic progress, genomic selection methods can be used as an effective approach to control the disease. The aims of this study were: (i) to compare the accuracy of estimated breeding values using pedigree-based best linear unbiased prediction (PBLUP) with genomic BLUP (GBLUP), single-step GBLUP (ssGBLUP), Bayes C, and Bayesian Lasso (LASSO); and (ii) to test the accuracy of genomic prediction and PBLUP using different marker densities (0.5, 3, 10, 20, and 27 K) for resistance against P. salmonis in rainbow trout. Phenotypes were recorded as number of days to death (DD) and binary survival (BS) from 2416 fish challenged with P. salmonis. A total of 1934 fish were genotyped using a 57 K single-nucleotide polymorphism (SNP) array. All genomic prediction methods achieved higher accuracies than PBLUP. The relative increase in accuracy for different genomic models ranged from 28 to 41% for both DD and BS at 27 K SNP. Between different genomic models, the highest relative increase in accuracy was obtained with Bayes C (∼40%), where 3 K SNP was enough to achieve a similar accuracy to that of the 27 K SNP for both traits. For resistance against P. salmonis in rainbow trout, we showed that genomic predictions using GBLUP, ssGBLUP, Bayes C, and LASSO can increase accuracy compared with PBLUP. Moreover, it is possible to use relatively low-density SNP panels for genomic prediction without compromising accuracy predictions for resistance against P. salmonis in rainbow trout.
12
Chen C, Steibel JP, Tempelman RJ. Genome-Wide Association Analyses Based on Broadly Different Specifications for Prior Distributions, Genomic Windows, and Estimation Methods. Genetics 2017; 206:1791-1806. [PMID: 28637709 PMCID: PMC5560788 DOI: 10.1534/genetics.117.202259] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 03/26/2017] [Accepted: 06/19/2017] [Indexed: 11/18/2022] Open
Abstract
A currently popular strategy (EMMAX) for genome-wide association (GWA) analysis infers association for the specific marker of interest by treating its effect as fixed while treating all other marker effects as classical Gaussian random effects. It may be more statistically coherent to specify all markers as sharing the same prior distribution, whether that distribution is Gaussian, heavy-tailed (BayesA), or has variable selection specifications based on a mixture of, say, two Gaussian distributions [stochastic search and variable selection (SSVS)]. Furthermore, all such GWA inference should be formally based on posterior probabilities or test statistics as we present here, rather than merely being based on point estimates. We compared these three broad categories of priors within a simulation study to investigate the effects of different degrees of skewness for quantitative trait loci (QTL) effects and numbers of QTL using 43,266 SNP marker genotypes from 922 Duroc-Pietrain F2-cross pigs. Genomic regions were based either on single SNP associations, on nonoverlapping windows of various fixed sizes (0.5-3 Mb), or on adaptively determined windows that cluster the genome into blocks based on linkage disequilibrium. We found that SSVS and BayesA lead to the best receiver operating curve properties in almost all cases. We also evaluated approximate maximum a posteriori (MAP) approaches to BayesA and SSVS as potential computationally feasible alternatives; however, MAP inferences were not promising, particularly due to their sensitivity to starting values. We determined that it is advantageous to use variable selection specifications based on adaptively constructed genomic window lengths for GWA studies.
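Window-level inference from posterior samples, as opposed to point estimates, can be sketched as below. This is a simplified illustration and not the test statistics developed in the paper: it assumes a variable selection prior with a point mass at zero (so that a nonzero sampled effect means the marker is in the model), fixed nonoverlapping windows defined by base-pair position, and hypothetical names and toy values throughout.

```python
import numpy as np

def window_ids(positions_bp, window_bp):
    """Assign each marker to a fixed-size, nonoverlapping window."""
    return (np.asarray(positions_bp) // window_bp).astype(int)

def window_inclusion_probability(effect_samples, positions_bp, window_bp=1_000_000):
    """Posterior probability that a window holds at least one nonzero effect.

    effect_samples: array (MCMC iterations x markers) of sampled marker effects,
    where exactly zero means the marker was excluded in that iteration.
    """
    ids = window_ids(positions_bp, window_bp)
    included = np.asarray(effect_samples) != 0.0
    probs = {}
    for w in np.unique(ids):
        # Fraction of iterations in which any marker of the window is included.
        probs[int(w)] = float(included[:, ids == w].any(axis=1).mean())
    return probs

# Four toy MCMC iterations for four markers on one chromosome.
samples = np.array([
    [0.0, 0.4, 0.0, 0.0],
    [0.3, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.2],
])
positions = [100_000, 600_000, 1_200_000, 2_500_000]
probs = window_inclusion_probability(samples, positions)
print(probs)
```

Pooling markers within a window lets the signal of a QTL that is split across markers in linkage disequilibrium accumulate, which is one motivation for window-based rather than single-SNP inference.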
Collapse
Affiliation(s)
- Chunyu Chen, Department of Animal Science, Michigan State University, East Lansing, Michigan 48824
- Juan P Steibel, Department of Animal Science, Michigan State University, East Lansing, Michigan 48824
- Robert J Tempelman, Department of Animal Science, Michigan State University, East Lansing, Michigan 48824
|
13
|
Persistency of Prediction Accuracy and Genetic Gain in Synthetic Populations Under Recurrent Genomic Selection. G3-GENES GENOMES GENETICS 2017; 7:801-811. [PMID: 28064189 PMCID: PMC5345710 DOI: 10.1534/g3.116.036582] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Indexed: 11/18/2022]
Abstract
Recurrent selection (RS) has been used in plant breeding to successively improve synthetic and other multiparental populations. Synthetics are generated from a limited number of parents [Formula: see text] but little is known about how [Formula: see text] affects genomic selection (GS) in RS, especially the persistency of prediction accuracy ([Formula: see text]) and genetic gain. Synthetics were simulated by intermating [Formula: see text]= 2-32 parent lines from an ancestral population with short- or long-range linkage disequilibrium ([Formula: see text]) and subjected to multiple cycles of GS. We determined [Formula: see text] and genetic gain across 30 cycles for different training set (TS) sizes, marker densities, and generations of recombination before model training. Contributions to [Formula: see text] and genetic gain from pedigree relationships, as well as from cosegregation and [Formula: see text] between QTL and markers, were analyzed via four scenarios differing in (i) the relatedness between TS and selection candidates and (ii) whether selection was based on markers or pedigree records. Persistency of [Formula: see text] was high for small [Formula: see text] where predominantly cosegregation contributed to [Formula: see text], but also for large [Formula: see text] where [Formula: see text] replaced cosegregation as the dominant information source. Together with increasing genetic variance, this compensation resulted in relatively constant long- and short-term genetic gain for increasing [Formula: see text] > 4, given long-range LDA in the ancestral population. Although our scenarios suggest that information from pedigree relationships contributed to [Formula: see text] for only very few generations in GS, we expect a longer contribution than in pedigree BLUP, because capturing Mendelian sampling by markers reduces selective pressure on pedigree relationships. 
Larger TS size ([Formula: see text]) and higher marker density improved persistency of [Formula: see text] and hence genetic gain, but additional recombinations could not increase genetic gain.
|
14
|
Mehrban H, Lee DH, Moradi MH, IlCho C, Naserkheil M, Ibáñez-Escriche N. Predictive performance of genomic selection methods for carcass traits in Hanwoo beef cattle: impacts of the genetic architecture. Genet Sel Evol 2017; 49:1. [PMID: 28093066 PMCID: PMC5240470 DOI: 10.1186/s12711-016-0283-0] [Citation(s) in RCA: 51] [Impact Index Per Article: 7.3] [Received: 12/17/2015] [Accepted: 12/22/2016] [Indexed: 12/15/2022]
Abstract
Background: Hanwoo beef is known for its marbled fat, tenderness, juiciness and characteristic flavor, as well as for its low cholesterol and high omega 3 fatty acid contents. As yet, there has been no comprehensive investigation to estimate genomic selection accuracy for carcass traits in Hanwoo cattle using dense markers. This study aimed at evaluating the accuracy of alternative statistical methods that differed in assumptions about the underlying genetic model for various carcass traits: backfat thickness (BT), carcass weight (CW), eye muscle area (EMA), and marbling score (MS). Methods: Accuracies of direct genomic breeding values (DGV) for carcass traits were estimated by applying fivefold cross-validation to a dataset including 1183 animals and approximately 34,000 single nucleotide polymorphisms (SNPs). Results: Accuracies of BayesC, Bayesian LASSO (BayesL) and genomic best linear unbiased prediction (GBLUP) methods were similar for BT, EMA and MS. However, for CW, DGV accuracy was 7% higher with BayesC than with BayesL and GBLUP. The increased accuracy of BayesC, compared to GBLUP and BayesL, was maintained for CW, regardless of the training sample size, but not for BT, EMA, and MS. Genome-wide association studies detected consistent large effects for SNPs on chromosomes 6 and 14 for CW. Conclusions: The predictive performance of the models depended on the trait analyzed. For CW, the results showed a clear superiority of BayesC compared to GBLUP and BayesL. These findings indicate the importance of using a proper variable selection method for genomic selection of traits and also suggest that the genetic architecture that underlies CW differs from that of the other carcass traits analyzed. Thus, our study provides significant new insights into the carcass traits of Hanwoo cattle.
Affiliation(s)
- Hossein Mehrban, Department of Animal Science, Shahrekord University, P.O. Box 115, Shahrekord, 88186-34141, Iran
- Deuk Hwan Lee, Department of Animal Life and Environment Science, Hankyong National University, Jungang-ro 327, Anseong-si, Gyeonggi-do, 456-749, Korea
- Mohammad Hossein Moradi, Department of Animal Science, Faculty of Agriculture and Natural Resources, Arak University, Arāk, 38156-8-8349, Iran
- Chung IlCho, Hanwoo Improvement Center, National Agricultural Cooperative Federation, Haeun-ro 691, Unsan-myeon, Seosan-si, Chungnam-do, 356-831, Korea
- Masoumeh Naserkheil, Department of Animal Science, University College of Agriculture and Natural Resources, University of Tehran, P.O. Box 4111, Karaj, 31587-11167, Iran
- Noelia Ibáñez-Escriche, The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Roslin, UK
|
15
|
Lee J, Cheng H, Garrick D, Golden B, Dekkers J, Park K, Lee D, Fernando R. Comparison of alternative approaches to single-trait genomic prediction using genotyped and non-genotyped Hanwoo beef cattle. Genet Sel Evol 2017; 49:2. [PMID: 28093065 PMCID: PMC5240330 DOI: 10.1186/s12711-016-0279-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 06/20/2016] [Accepted: 12/09/2016] [Indexed: 12/25/2022]
Abstract
Background: Genomic predictions from BayesA and BayesB use training data that include animals with both phenotypes and genotypes. Single-step methodologies allow additional information from non-genotyped relatives to be included in the analysis. The single-step genomic best linear unbiased prediction (SSGBLUP) method uses a relationship matrix computed from marker and pedigree information, in which missing genotypes are imputed implicitly. Single-step Bayesian regression (SSBR) extends SSGBLUP to BayesB-like models using explicitly imputed genotypes for non-genotyped individuals. Methods: Carcass records included 988 genotyped Hanwoo steers with 35,882 SNPs and 1438 non-genotyped steers that were measured for back-fat thickness (BFT), carcass weight (CWT), eye-muscle area, and marbling score (MAR). Single-trait pedigree-based BLUP, Bayesian methods using only genotyped individuals, SSGBLUP and SSBR methods were compared using cross-validation. Results: Methods using genomic information always outperformed pedigree-based BLUP when the same phenotypic data were modeled from either genotyped individuals only or both genotyped and non-genotyped individuals. For BFT and MAR, accuracies were higher with single-step methods than with BayesB, BayesC and BayesCπ. Gains in accuracy with the single-step methods ranged from +0.06 to +0.09 for BFT and from +0.05 to +0.07 for MAR. For CWT, SSBR always outperformed the corresponding Bayesian methods that used only genotyped individuals. However, although SSGBLUP incorporated information from non-genotyped individuals, prediction accuracies were lower with SSGBLUP than with BayesC (π = 0.9999) and BayesB (π = 0.98) for CWT because, for this particular trait, there was a benefit from the mixture priors of the effects of the single nucleotide polymorphisms. Conclusions: Single-step methods are the preferred approaches for prediction combining genotyped and non-genotyped animals. Alternative priors allow SSBR to outperform SSGBLUP in some cases.
Affiliation(s)
- Joonho Lee, Department of Animal Science, Iowa State University, Ames, IA, 50011, USA
- Hao Cheng, Department of Animal Science, Iowa State University, Ames, IA, 50011, USA; Department of Statistics, Iowa State University, Ames, IA, 50011, USA
- Dorian Garrick, Department of Animal Science, Iowa State University, Ames, IA, 50011, USA; Institute of Veterinary, Animal and Biomedical Sciences, Massey University, Palmerston North, New Zealand; ThetaSolutions, LLC, Atascadero, CA, USA
- Jack Dekkers, Department of Animal Science, Iowa State University, Ames, IA, 50011, USA
- Kyungdo Park, Department of Animal Biotechnology, Chonbuk National University, Chonju, Jeollabuk-do, South Korea
- Deukhwan Lee, Department of Animal Science, Hankyong National University, Anseong, Gyeonggi-do, South Korea
- Rohan Fernando, Department of Animal Science, Iowa State University, Ames, IA, 50011, USA
|