1
Sharif-Islam M, van der Werf JHJ, Wood BJ, Hermesch S. The predicted benefits of genomic selection on pig breeding objectives. J Anim Breed Genet 2024; 141:685-701. [PMID: 38779724 DOI: 10.1111/jbg.12873] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 04/30/2024] [Accepted: 05/04/2024] [Indexed: 05/25/2024]
Abstract
We tested the premise that genomic selection (GS) achieves additional genetic gain in the overall breeding objective of a pig breeding program compared with a conventional breeding program, although some traits achieve larger gains than others. GS scenarios based on different reference population sizes were evaluated. The scenarios were compared using a deterministic simulation model to predict genetic gain with and without genomic information as an additional information source. All scenarios were compared based on selection accuracy and predicted genetic gain per round of selection for objective traits in both sire and dam lines. The results showed that GS scenarios increased overall response in the breeding objectives by 9% to 56% and 3.5% to 27% in the dam and sire lines, respectively. The difference in response resulted from differences in the size of the reference population. Although all traits achieved higher selection accuracy in GS, traits with limited phenotypic information at the time of selection or with low heritability, such as sow longevity, number of piglets born alive, pre- and post-weaning survival, as well as meat and carcass quality traits, achieved the largest additional response. This additional response came at the expense of smaller responses in GS than in the conventional breeding program for traits that are easy to measure, such as back fat and average daily gain. Sow longevity and drip loss percentage did not change in a favourable direction in GS with a reference population of 500 pigs. With a reference population of 1000 pigs or more, sow longevity and drip loss percentage began to change in a favourable direction. Despite the smaller responses for average daily gain and back fat thickness in GS, the overall breeding objective achieved additional gain in GS.
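To make the link between selection accuracy and predicted gain concrete, the following is a minimal single-trait sketch based on the standard breeder's equation, R = i × r × σ_A. It is not the multi-trait deterministic model used in the study; the function name and all numerical inputs (selection intensity, accuracies, genetic standard deviation) are hypothetical values chosen only for illustration.

```python
def predicted_response(selection_intensity, accuracy, genetic_sd):
    """Single-trait response per round of selection: R = i * r * sigma_A."""
    return selection_intensity * accuracy * genetic_sd


# Hypothetical inputs: same intensity and genetic SD, different accuracies.
conventional = predicted_response(selection_intensity=1.4, accuracy=0.40, genetic_sd=1.0)
genomic = predicted_response(selection_intensity=1.4, accuracy=0.55, genetic_sd=1.0)

# Relative extra response attributable to the higher accuracy alone.
relative_gain = genomic / conventional - 1.0
print(f"Extra response with higher accuracy: {relative_gain:.1%}")
```

In this simplified form the response is proportional to accuracy, so any trait-specific difference in accuracy gain translates directly into a trait-specific difference in predicted response.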
Affiliation(s)
- Md Sharif-Islam
- AGBU, a Joint Venture of NSW Department of Primary Industries, University of New England, Armidale, New South Wales, Australia
- Julius H J van der Werf
- School of Environmental and Rural Science, University of New England, Armidale, New South Wales, Australia
- Benjamin J Wood
- School of Veterinary Science, The University of Queensland, Lawes, Queensland, Australia
- Susanne Hermesch
- AGBU, a Joint Venture of NSW Department of Primary Industries, University of New England, Armidale, New South Wales, Australia
2
Oget-Ebrad C, Heumez E, Duchalais L, Goudemand-Dugué E, Oury FX, Elsen JM, Bouchet S. Validation of cross-progeny variance genomic prediction using simulations and experimental data in winter elite bread wheat. Theor Appl Genet 2024; 137:226. [PMID: 39292265 PMCID: PMC11410863 DOI: 10.1007/s00122-024-04718-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Accepted: 08/16/2024] [Indexed: 09/19/2024]
Abstract
KEY MESSAGE From simulations and experimental data, the quality of cross progeny variance genomic predictions may be high, but depends on trait architecture and necessitates sufficient number of progenies. Genomic predictions are used to select genitors and crosses in plant breeding. The usefulness criterion (UC) is a cross-selection criterion that necessitates the estimation of parental mean (PM) and progeny standard deviation (SD). This study evaluates the parameters that affect the predictive ability of UC and its two components using simulations. Predictive ability increased with heritability and progeny size and decreased with QTL number, most notably for SD. Comparing scenarios where marker effects were known or estimated using prediction models, SD was strongly impacted by the quality of marker effect estimates. We proposed a new algebraic formula for SD estimation that takes into account the uncertainty of the estimation of marker effects. It improved predictions when the number of QTL was superior to 300, especially when heritability was low. We also compared estimated and observed UC using experimental data for heading date, plant height, grain protein content and yield. PM and UC estimates were significantly correlated for all traits (PM: 0.38, 0.63, 0.51 and 0.91; UC: 0.45, 0.52, 0.54 and 0.74; for yield, grain protein content, plant height and heading date, respectively), while SD was correlated only for heading date and plant height (0.64 and 0.49, respectively). According to simulations, SD estimations in the field would necessitate large progenies. This pioneering study experimentally validates genomic prediction of UC but the predictive ability depends on trait architecture and precision of marker effect estimates. We advise the breeders to adjust progeny size to realize the SD potential of a cross.
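As an illustration of how the two components combine, the sketch below uses the classical form of the usefulness criterion, UC = PM + i × h × SD, where i is the selection intensity and h the selection accuracy. This is a sketch under stated assumptions, not the estimation procedure of the study: the function name, the default intensity (1.755, corresponding to selecting the top 10%), the default accuracy of 1.0, and the two example crosses are all hypothetical.

```python
def usefulness_criterion(parental_mean, progeny_sd,
                         selection_intensity=1.755, accuracy=1.0):
    """Classical usefulness criterion: UC = PM + i * h * SD."""
    return parental_mean + selection_intensity * accuracy * progeny_sd


# Hypothetical crosses given as (parental mean, progeny standard deviation).
crosses = {
    "cross_A": (10.0, 1.0),  # higher parental mean, low progeny variance
    "cross_B": (9.5, 2.0),   # lower parental mean, high progeny variance
}

# Rank crosses from highest to lowest UC.
ranked = sorted(crosses,
                key=lambda name: usefulness_criterion(*crosses[name]),
                reverse=True)
for name in ranked:
    print(name, round(usefulness_criterion(*crosses[name]), 3))
```

With these made-up inputs, the cross with the lower parental mean ranks first because of its larger progeny SD, which is why the quality of the SD prediction matters for cross selection.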
Affiliation(s)
- Claire Oget-Ebrad
- UMR1095, GDEC, INRAE-Université Clermont-Auvergne, Clermont-Ferrand, France
- Emmanuel Heumez
- INRAE-UE Lille, 2 Chaussée Brunehaut, Estrées Mons, BP50136, 80203, Peronne Cedex, France
- Laure Duchalais
- Agri-Obtentions, Ferme de Gauvilliers, 78660, Orsonville, France
- Jean-Michel Elsen
- UMR1388, GenPhySE, INRAE-Université de Toulouse, Castanet-Tolosan, France
- Sophie Bouchet
- UMR1095, GDEC, INRAE-Université Clermont-Auvergne, Clermont-Ferrand, France
3
Tian R, Mahmoodi M, Tian J, Esmailizadeh Koshkoiyeh S, Zhao M, Saminzadeh M, Li H, Wang X, Li Y, Esmailizadeh A. Leveraging Functional Genomics for Understanding Beef Quality Complexities and Breeding Beef Cattle for Improved Meat Quality. Genes (Basel) 2024; 15:1104. [PMID: 39202463 PMCID: PMC11353656 DOI: 10.3390/genes15081104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2024] [Revised: 08/17/2024] [Accepted: 08/19/2024] [Indexed: 09/03/2024] Open
Abstract
Consumer perception of beef is heavily influenced by overall meat quality, a critical factor in the cattle industry. Genomics has the potential to improve important beef quality traits and identify genetic markers and causal variants associated with these traits through genomic selection (GS) and genome-wide association studies (GWAS) approaches. Transcriptomics, proteomics, and metabolomics provide insights into underlying genetic mechanisms by identifying differentially expressed genes, proteins, and metabolic pathways linked to quality traits, complementing GWAS data. Leveraging these functional genomics techniques can optimize beef cattle breeding for enhanced quality traits to meet high-quality beef demand. This paper provides a comprehensive overview of the current state of applications of omics technologies in uncovering functional variants underlying beef quality complexities. By highlighting the latest findings from GWAS, GS, transcriptomics, proteomics, and metabolomics studies, this work seeks to serve as a valuable resource for fostering a deeper understanding of the complex relationships between genetics, gene expression, protein dynamics, and metabolic pathways in shaping beef quality.
Affiliation(s)
- Rugang Tian
- Inner Mongolia Academy of Agricultural & Animal Husbandry Sciences, Hohhot 010031, China; (J.T.); (M.Z.); (H.L.); (X.W.); (Y.L.)
- Maryam Mahmoodi
- Department of Animal Science, Faculty of Agriculture, Shahid Bahonar University of Kerman, Kerman P.O. Box 76169-133, Iran; (M.M.); (S.E.K.); (M.S.); (A.E.)
- Jing Tian
- Inner Mongolia Academy of Agricultural & Animal Husbandry Sciences, Hohhot 010031, China; (J.T.); (M.Z.); (H.L.); (X.W.); (Y.L.)
- Sina Esmailizadeh Koshkoiyeh
- Department of Animal Science, Faculty of Agriculture, Shahid Bahonar University of Kerman, Kerman P.O. Box 76169-133, Iran; (M.M.); (S.E.K.); (M.S.); (A.E.)
- Meng Zhao
- Inner Mongolia Academy of Agricultural & Animal Husbandry Sciences, Hohhot 010031, China; (J.T.); (M.Z.); (H.L.); (X.W.); (Y.L.)
- Mahla Saminzadeh
- Department of Animal Science, Faculty of Agriculture, Shahid Bahonar University of Kerman, Kerman P.O. Box 76169-133, Iran; (M.M.); (S.E.K.); (M.S.); (A.E.)
- Hui Li
- Inner Mongolia Academy of Agricultural & Animal Husbandry Sciences, Hohhot 010031, China; (J.T.); (M.Z.); (H.L.); (X.W.); (Y.L.)
- Xiao Wang
- Inner Mongolia Academy of Agricultural & Animal Husbandry Sciences, Hohhot 010031, China; (J.T.); (M.Z.); (H.L.); (X.W.); (Y.L.)
- Yuan Li
- Inner Mongolia Academy of Agricultural & Animal Husbandry Sciences, Hohhot 010031, China; (J.T.); (M.Z.); (H.L.); (X.W.); (Y.L.)
- Ali Esmailizadeh
- Department of Animal Science, Faculty of Agriculture, Shahid Bahonar University of Kerman, Kerman P.O. Box 76169-133, Iran; (M.M.); (S.E.K.); (M.S.); (A.E.)
4
Truong B, Hull LE, Ruan Y, Huang QQ, Hornsby W, Martin H, van Heel DA, Wang Y, Martin AR, Lee SH, Natarajan P. Integrative polygenic risk score improves the prediction accuracy of complex traits and diseases. Cell Genom 2024; 4:100523. [PMID: 38508198 PMCID: PMC11019356 DOI: 10.1016/j.xgen.2024.100523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 09/15/2023] [Accepted: 02/20/2024] [Indexed: 03/22/2024]
Abstract
Polygenic risk scores (PRSs) are an emerging tool to predict the clinical phenotypes and outcomes of individuals. We propose PRSmix, a framework that leverages the PRS corpus of a target trait to improve prediction accuracy, and PRSmix+, which incorporates genetically correlated traits to better capture the human genetic architecture for 47 and 32 diseases/traits in European and South Asian ancestries, respectively. PRSmix demonstrated a mean prediction accuracy improvement of 1.20-fold (95% confidence interval [CI], [1.10; 1.3]; p = 9.17 × 10⁻⁵) and 1.19-fold (95% CI, [1.11; 1.27]; p = 1.92 × 10⁻⁶), and PRSmix+ improved the prediction accuracy by 1.72-fold (95% CI, [1.40; 2.04]; p = 7.58 × 10⁻⁶) and 1.42-fold (95% CI, [1.25; 1.59]; p = 8.01 × 10⁻⁷) in European and South Asian ancestries, respectively. Compared with previous cross-trait-combination methods that use scores from pre-defined correlated traits, we demonstrated that our method improved prediction accuracy for coronary artery disease up to 3.27-fold (95% CI, [2.1; 4.44]; p value after false discovery rate (FDR) correction = 2.6 × 10⁻⁴). Our method provides a comprehensive framework to benchmark and leverage the combined power of PRS for maximal performance in a desired target population.
Affiliation(s)
- Buu Truong
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA 02142, USA; Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, 185 Cambridge Street, Boston, MA 02114, USA
- Leland E Hull
- Division of General Internal Medicine, Massachusetts General Hospital, 100 Cambridge Street, Boston, MA 02114, USA; Department of Medicine, Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA
- Yunfeng Ruan
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA 02142, USA; Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, 185 Cambridge Street, Boston, MA 02114, USA
- Qin Qin Huang
- Department of Human Genetics, Wellcome Sanger Institute, Cambridge, UK
- Whitney Hornsby
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA 02142, USA; Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, 185 Cambridge Street, Boston, MA 02114, USA
- Hilary Martin
- Department of Human Genetics, Wellcome Sanger Institute, Cambridge, UK
- David A van Heel
- Blizard Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK
- Ying Wang
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA 02142, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Alicia R Martin
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- S Hong Lee
- Australian Centre for Precision Health, University of South Australia Cancer Research Institute, University of South Australia, Adelaide, SA 5000, Australia
- Pradeep Natarajan
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA 02142, USA; Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, 185 Cambridge Street, Boston, MA 02114, USA; Department of Medicine, Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA.
5
Mota LFM, Arikawa LM, Santos SWB, Fernandes Júnior GA, Alves AAC, Rosa GJM, Mercadante MEZ, Cyrillo JNSG, Carvalheiro R, Albuquerque LG. Benchmarking machine learning and parametric methods for genomic prediction of feed efficiency-related traits in Nellore cattle. Sci Rep 2024; 14:6404. [PMID: 38493207 PMCID: PMC10944497 DOI: 10.1038/s41598-024-57234-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Accepted: 03/15/2024] [Indexed: 03/18/2024] Open
Abstract
Genomic selection (GS) offers a promising opportunity for selecting animals that use consumed energy more efficiently for maintenance and growth functions, impacting profitability and environmental sustainability. Here, we compared the prediction accuracy of multi-layer neural network (MLNN) and support vector regression (SVR) against single-trait (STGBLUP) and multi-trait (MTGBLUP) genomic best linear unbiased prediction, and Bayesian regression (BayesA, BayesB, BayesC, BRR, and BLasso) for feed efficiency (FE) traits. FE-related traits were measured in 1156 Nellore cattle from an experimental breeding program genotyped for ~300 K markers after quality control. Prediction accuracy (Acc) was evaluated using a forward validation that split the dataset based on birth year, considering the phenotypes adjusted for the fixed effects and covariates as pseudo-phenotypes. The MLNN and SVR approaches were trained by randomly splitting the training population into five folds to select the best hyperparameters. The results show that the machine learning methods (MLNN and SVR) and MTGBLUP outperformed STGBLUP and the Bayesian regression approaches, increasing the Acc by approximately 8.9%, 14.6%, and 13.7% using MLNN, SVR, and MTGBLUP, respectively. Acc for SVR and MTGBLUP differed only slightly, ranging from 0.62 to 0.69 and 0.62 to 0.68, respectively, and both models were empirically unbiased (0.97 and 1.09). Our results indicated that the SVR and MTGBLUP approaches were more accurate in predicting FE-related traits than Bayesian regression and STGBLUP and seemed competitive for GS of complex phenotypes with various degrees of inheritance.
Affiliation(s)
- Lucio F M Mota
- School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Jaboticabal, SP, 14884-900, Brazil.
- Leonardo M Arikawa
- School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Jaboticabal, SP, 14884-900, Brazil
- Samuel W B Santos
- School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Jaboticabal, SP, 14884-900, Brazil
- Gerardo A Fernandes Júnior
- School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Jaboticabal, SP, 14884-900, Brazil
- Anderson A C Alves
- School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Jaboticabal, SP, 14884-900, Brazil
- Guilherme J M Rosa
- Department of Animal and Dairy Sciences, University of Wisconsin, Madison, WI, 53706, USA
- Maria E Z Mercadante
- Institute of Animal Science, Beef Cattle Research Center, Sertãozinho, SP, 14174-000, Brazil
- National Council for Science and Technological Development, Brasilia, DF, 71605-001, Brazil
- Joslaine N S G Cyrillo
- Institute of Animal Science, Beef Cattle Research Center, Sertãozinho, SP, 14174-000, Brazil
- Roberto Carvalheiro
- School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Jaboticabal, SP, 14884-900, Brazil
- National Council for Science and Technological Development, Brasilia, DF, 71605-001, Brazil
- Lucia G Albuquerque
- School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Jaboticabal, SP, 14884-900, Brazil.
- National Council for Science and Technological Development, Brasilia, DF, 71605-001, Brazil.
6
Cuyabano BCD, Boichard D, Gondro C. Expected values for the accuracy of predicted breeding values accounting for genetic differences between reference and target populations. Genet Sel Evol 2024; 56:15. [PMID: 38424504 PMCID: PMC11234767 DOI: 10.1186/s12711-024-00876-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2022] [Accepted: 01/08/2024] [Indexed: 03/02/2024] Open
Abstract
BACKGROUND Genetic merit, or breeding values as referred to in livestock and crop breeding programs, is one of the keys to the successful selection of animals in commercial farming systems. The developments in statistical methods during the twentieth century and single nucleotide polymorphism (SNP) chip technologies in the twenty-first century have revolutionized agricultural production, by allowing highly accurate predictions of breeding values for selection candidates at a very early age. Nonetheless, for many breeding populations, realized accuracies of predicted breeding values (PBV) remain below the theoretical maximum, even when the reference population is sufficiently large, and SNPs included in the model are in sufficient linkage disequilibrium (LD) with the quantitative trait locus (QTL). This is particularly noticeable over generations, as we observe the so-called erosion of the effects of SNPs due to recombinations, accompanied by the erosion of the accuracy of prediction. While accurately quantifying the erosion at the individual SNP level is a difficult and unresolved task, quantifying the erosion of the accuracy of prediction is a more tractable problem. In this paper, we describe a method that uses the relationship between reference and target populations to calculate expected values for the accuracies of predicted breeding values for non-phenotyped individuals accounting for erosion. The accuracy of the expected values was evaluated through simulations, and a further evaluation was performed on real data.
RESULTS Using simulations, we empirically confirmed that our expected values for the accuracy of PBV accounting for erosion were able to correctly determine the prediction accuracy of breeding values for non-phenotyped individuals. When comparing the expected to the realized accuracies of PBV with real data, only one out of the four traits evaluated presented accuracies that were significantly higher than the expected, approaching h².
CONCLUSIONS We defined an index of genetic correlation between reference and target populations, which summarizes the expected overall erosion due to differences in allele frequencies and LD patterns between populations. We used this correlation along with a trait's heritability to derive expected values for the accuracy (R) of PBV accounting for the erosion, and demonstrated that our derived E[R | erosion] is a reliable metric.
Affiliation(s)
- Beatriz C D Cuyabano
- INRAE, AgroParisTech, GABI, Université Paris Saclay, 78350, Jouy-en-Josas, France
- Didier Boichard
- INRAE, AgroParisTech, GABI, Université Paris Saclay, 78350, Jouy-en-Josas, France
- Cedric Gondro
- Department of Animal Science, Michigan State University, 474 S Shaw Ln, East Lansing, MI, 48824, USA
7
Roques S, Martinez-Fernandez G, Ramayo-Caldas Y, Popova M, Denman S, Meale SJ, Morgavi DP. Recent Advances in Enteric Methane Mitigation and the Long Road to Sustainable Ruminant Production. Annu Rev Anim Biosci 2024; 12:321-343. [PMID: 38079599 DOI: 10.1146/annurev-animal-021022-024931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2024]
Abstract
Mitigation of methane emission, a potent greenhouse gas, is a worldwide priority to limit global warming. A substantial part of anthropogenic methane is emitted by the livestock sector, as methane is a normal product of ruminant digestion. We present the latest developments and challenges ahead of the main efficient mitigation strategies of enteric methane production in ruminants. Numerous mitigation strategies have been developed in the last decades, from dietary manipulation and breeding to targeting of methanogens, the microbes that produce methane. The most recent advances focus on specific inhibition of key enzymes involved in methanogenesis. But these inhibitors, although efficient, are not affordable and not adapted to the extensive farming systems prevalent in low- and middle-income countries. Effective global mitigation of methane emissions from livestock should be based not only on scientific progress but also on the feasibility and accessibility of mitigation strategies.
Affiliation(s)
- Simon Roques
- Université Clermont Auvergne, INRAE, VetAgro Sup, UMR Herbivores, Saint-Genes-Champanelle, France
- Yuliaxis Ramayo-Caldas
- Animal Breeding and Genetics Program, Institute of Agrifood Research and Technology (IRTA), Torre Marimon, Caldes de Montbui, Spain
- Milka Popova
- Université Clermont Auvergne, INRAE, VetAgro Sup, UMR Herbivores, Saint-Genes-Champanelle, France
- Stuart Denman
- Agriculture and Food, CSIRO, St. Lucia, Queensland, Australia
- Sarah J Meale
- School of Agriculture and Food Sustainability, Faculty of Science, University of Queensland, Gatton, Queensland, Australia
- Diego P Morgavi
- Université Clermont Auvergne, INRAE, VetAgro Sup, UMR Herbivores, Saint-Genes-Champanelle, France
8
Di Scipio M, Khan M, Mao S, Chong M, Judge C, Pathan N, Perrot N, Nelson W, Lali R, Di S, Morton R, Petch J, Paré G. A versatile, fast and unbiased method for estimation of gene-by-environment interaction effects on biobank-scale datasets. Nat Commun 2023; 14:5196. [PMID: 37626057 PMCID: PMC10457310 DOI: 10.1038/s41467-023-40913-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 04/23/2021] [Accepted: 08/16/2023] [Indexed: 08/27/2023] Open
Abstract
Identification of gene-by-environment interactions (GxE) is crucial to understand the interplay of environmental effects on complex traits. However, current methods evaluating GxE on biobank-scale datasets have limitations. We introduce MonsterLM, a multiple linear regression method that does not rely on model specification and provides unbiased estimates of variance explained by GxE. We demonstrate robustness of MonsterLM through comprehensive genome-wide simulations using real genetic data from 325,989 individuals. We estimate GxE using waist-to-hip-ratio, smoking, and exercise as the environmental variables on 13 outcomes (N = 297,529-325,989) in the UK Biobank. GxE variance is significant for 8 environment-outcome pairs, ranging from 0.009 - 0.071. The majority of GxE variance involves SNPs without strong marginal or interaction associations. We observe modest improvements in polygenic score prediction when incorporating GxE. Our results imply a significant contribution of GxE to complex trait variance and we show MonsterLM to be well-purposed to handle this with biobank-scale data.
Affiliation(s)
- Matteo Di Scipio
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, ON, Canada
- Department of Medicine, Faculty of Health Sciences, McMaster University, Hamilton, ON, Canada
- Mohammad Khan
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, ON, Canada
- Department of Medicine, Faculty of Health Sciences, McMaster University, Hamilton, ON, Canada
- Shihong Mao
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, ON, Canada
- Michael Chong
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, ON, Canada
- Thrombosis and Atherosclerosis Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton, ON, Canada
- Department of Pathology and Molecular Medicine, McMaster University, Michael G. DeGroote School of Medicine, Hamilton, ON, Canada
- Conor Judge
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, ON, Canada
- Nazia Pathan
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, ON, Canada
- Department of Medicine, Faculty of Health Sciences, McMaster University, Hamilton, ON, Canada
- Nicolas Perrot
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, ON, Canada
- Walter Nelson
- Centre for Data Science and Digital Health, Hamilton Health Sciences, Hamilton, ON, Canada
- Department of Statistical Sciences, University of Toronto, Toronto, ON, Canada
- Ricky Lali
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, ON, Canada
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Shuang Di
- Centre for Data Science and Digital Health, Hamilton Health Sciences, Hamilton, ON, Canada
- Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
- Robert Morton
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, ON, Canada
- Department of Pathology and Molecular Medicine, McMaster University, Michael G. DeGroote School of Medicine, Hamilton, ON, Canada
- Jeremy Petch
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, ON, Canada
- Department of Medicine, Faculty of Health Sciences, McMaster University, Hamilton, ON, Canada
- Centre for Data Science and Digital Health, Hamilton Health Sciences, Hamilton, ON, Canada
- Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, ON, Canada
- Guillaume Paré
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, ON, Canada.
- Thrombosis and Atherosclerosis Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton, ON, Canada.
- Department of Pathology and Molecular Medicine, McMaster University, Michael G. DeGroote School of Medicine, Hamilton, ON, Canada.
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada.
9
Lee HS, Kim Y, Lee DH, Seo D, Lee DJ, Do CH, Dinh PTN, Ekanayake W, Lee KH, Yoon D, Lee SH, Koo YM. Comparison of accuracy of breeding value for cow from three methods in Hanwoo (Korean cattle) population. J Anim Sci Technol 2023; 65:720-734. [PMID: 37970511 PMCID: PMC10640958 DOI: 10.5187/jast.2023.e5] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/25/2022] [Revised: 01/08/2023] [Accepted: 01/09/2023] [Indexed: 11/17/2023]
Abstract
In Korea, the Korea Proven Bulls (KPN) program is well developed. Breeding and evaluation of cows are also essential to increase earnings and genetic gain. This study aimed to evaluate the accuracy of cow breeding values using three methods (pedigree index [PI], pedigree-based best linear unbiased prediction [PBLUP], and genomic BLUP [GBLUP]). The reference population (n = 16,971) was used to estimate breeding values for 481 females as a test population. The accuracy of GBLUP was 0.63, 0.66, 0.62 and 0.63 for carcass weight (CWT), eye muscle area (EMA), back-fat thickness (BFT), and marbling score (MS), respectively. For the PBLUP method, the accuracy of prediction was 0.43 for CWT, 0.45 for EMA, 0.43 for MS, and 0.44 for BFT. The accuracy of the PI method was the lowest (0.28 to 0.29 for carcass traits). The increase of approximately 20% in the accuracy of GBLUP over the other methods may arise because genomic information can explain the Mendelian sampling error that pedigree information cannot detect. Bias can reduce the accuracy of estimated breeding values (EBV) for selected animals. The regression coefficients between true breeding value (TBV) and GBLUP EBV, PBLUP EBV, and PI EBV were 0.78, 0.625, and 0.35, respectively, for CWT. This showed that genomic EBV (GEBV) was less biased than PBLUP and PI EBV in this study. In addition, the number of effective chromosome segments (Me), which indicates the number of independent loci, is one of the important factors affecting the accuracy of BLUP. The correlation between Me and the accuracy of GBLUP is related to the genetic relationship between the reference and test populations. The correlations between Me and accuracy were strongly negative: -0.74 for CWT, -0.75 for EMA, -0.73 for MS, and -0.75 for BFT. These results showed that estimating genetic ability using genomic data is the most effective approach and that the smaller the Me, the higher the accuracy of EBV.
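The inverse relationship between Me and accuracy described in this abstract is consistent with a widely used deterministic approximation for genomic prediction accuracy, r = sqrt(N·h² / (N·h² + Me)), where N is the reference population size and h² the heritability. The sketch below illustrates that approximation only; it is not the authors' calculation, and the heritability and Me values (as well as the function name) are hypothetical.

```python
import math


def expected_accuracy(n_reference, heritability, me):
    """Approximate accuracy: r = sqrt(N * h2 / (N * h2 + Me))."""
    signal = n_reference * heritability
    return math.sqrt(signal / (signal + me))


# Reference size taken from the abstract; h2 and the Me grid are assumptions.
n_reference = 16971
heritability = 0.3
for me in (1000, 5000, 20000):
    acc = expected_accuracy(n_reference, heritability, me)
    print(f"Me = {me:>6}: expected accuracy = {acc:.3f}")
```

Under this approximation, accuracy falls monotonically as Me grows for a fixed reference size and heritability, which matches the direction of the negative correlations reported above.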
Collapse
Affiliation(s)
- Hyo Sang Lee
  - Genetic Information Division, Korea Animal Improvement Association, Livestock Hall, Seoul 06668, Korea
- Yeongkuk Kim
  - Department of Bio-AI Convergence, Chungnam National University, Daejeon 34134, Korea
- Doo Ho Lee
  - Division of Animal and Dairy Science, Chungnam National University, Daejeon 34148, Korea
- Dong Jae Lee
  - Division of Animal and Dairy Science, Chungnam National University, Daejeon 34148, Korea
- Chang Hee Do
  - Institute of Agricultural Science, Chungnam National University, Daejeon 34134, Korea
- Phuong Thanh N. Dinh
  - Department of Bio-AI Convergence, Chungnam National University, Daejeon 34134, Korea
- Waruni Ekanayake
  - Division of Animal and Dairy Science, Chungnam National University, Daejeon 34148, Korea
- Kil Hwan Lee
  - Genetic Information Division, Korea Animal Improvement Association, Livestock Hall, Seoul 06668, Korea
- Duhak Yoon
  - Department of Animal Science and Biotechnology, Kyungpook National University, Sangju 37224, Korea
- Seung Hwan Lee
  - Division of Animal and Dairy Science, Chungnam National University, Daejeon 34148, Korea
- Yang Mo Koo
  - Genetic Information Division, Korea Animal Improvement Association, Livestock Hall, Seoul 06668, Korea
|
10
|
Richards TJ, McGuigan K, Aguirre JD, Humanes A, Bozec YM, Mumby PJ, Riginos C. Moving beyond heritability in the search for coral adaptive potential. GLOBAL CHANGE BIOLOGY 2023; 29:3869-3882. [PMID: 37310164 DOI: 10.1111/gcb.16719] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Revised: 03/31/2023] [Accepted: 04/04/2023] [Indexed: 06/14/2023]
Abstract
Global environmental change is happening at unprecedented rates. Coral reefs are among the ecosystems most threatened by global change. For wild populations to persist, they must adapt. Knowledge shortfalls about corals' complex ecological and evolutionary dynamics, however, stymie predictions about potential adaptation to future conditions. Here, we review adaptation through the lens of quantitative genetics. We argue that coral adaptation studies can benefit greatly from "wild" quantitative genetic methods, where traits are studied in wild populations undergoing natural selection, genomic relationship matrices can replace breeding experiments, and analyses can be extended to examine genetic constraints among traits. In addition, individuals with advantageous genotypes for anticipated future conditions can be identified. Finally, genomic genotyping supports simultaneous consideration of how genetic diversity is arrayed across geographic and environmental distances, providing greater context for predictions of phenotypic evolution at a metapopulation scale.
Affiliation(s)
- Thomas J Richards
  - School of Biological Sciences, The University of Queensland, Queensland, St Lucia, Australia
- Katrina McGuigan
  - School of Biological Sciences, The University of Queensland, Queensland, St Lucia, Australia
- J David Aguirre
  - School of Natural Sciences, Massey University, Auckland, New Zealand
- Adriana Humanes
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, UK
- Yves-Marie Bozec
  - School of Biological Sciences, The University of Queensland, Queensland, St Lucia, Australia
- Peter J Mumby
  - School of Biological Sciences, The University of Queensland, Queensland, St Lucia, Australia
- Cynthia Riginos
  - School of Biological Sciences, The University of Queensland, Queensland, St Lucia, Australia
|
11
|
Neshat M, Lee S, Momin MM, Truong B, van der Werf JHJ, Lee SH. An effective hyper-parameter can increase the prediction accuracy in a single-step genetic evaluation. Front Genet 2023; 14:1104906. [PMID: 37359380 PMCID: PMC10285379 DOI: 10.3389/fgene.2023.1104906] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/22/2022] [Accepted: 05/23/2023] [Indexed: 06/28/2023] Open
Abstract
The H-matrix best linear unbiased prediction (HBLUP) method has been widely used in livestock breeding programs. It can integrate all information, including pedigree, genotypes, and phenotypes on both genotyped and non-genotyped individuals into one single evaluation that can provide reliable predictions of breeding values. The existing HBLUP method requires hyper-parameters that should be adequately optimised as otherwise the genomic prediction accuracy may decrease. In this study, we assess the performance of HBLUP using various hyper-parameters such as blending, tuning, and scale factor in simulated and real data on Hanwoo cattle. In both simulated and cattle data, we show that blending is not necessary, indicating that the prediction accuracy decreases when using a blending hyper-parameter <1. The tuning process (adjusting genomic relationships accounting for base allele frequencies) improves prediction accuracy in the simulated data, confirming previous studies, although the improvement is not statistically significant in the Hanwoo cattle data. We also demonstrate that a scale factor, α, which determines the relationship between allele frequency and per-allele effect size, can improve the HBLUP accuracy in both simulated and real data. Our findings suggest that an optimal scale factor should be considered to increase prediction accuracy, in addition to blending and tuning processes, when using HBLUP.
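The blending hyper-parameter discussed in this abstract can be sketched as follows. The sketch assumes the common convention in which the blended genomic relationship matrix is a weighted average of the genomic matrix G and the pedigree matrix A22 for the genotyped animals, with weight theta on G, so that theta = 1 means no blending. The function name `blend` and the small matrices are hypothetical, and the exact parameterisation used in the paper may differ.

```python
def blend(G, A22, theta):
    """Return theta * G + (1 - theta) * A22, element by element.

    G and A22 are square matrices (lists of rows) of the same size for the
    genotyped animals. theta = 1.0 leaves G unchanged (no blending).
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    if len(G) != len(A22) or any(len(g) != len(a) for g, a in zip(G, A22)):
        raise ValueError("G and A22 must have the same dimensions")
    return [
        [theta * g + (1.0 - theta) * a for g, a in zip(g_row, a_row)]
        for g_row, a_row in zip(G, A22)
    ]


# Hypothetical relationship matrices for two genotyped half-sibs.
G = [[1.02, 0.24], [0.24, 0.98]]
A22 = [[1.00, 0.25], [0.25, 1.00]]

for theta in (1.0, 0.95, 0.5):
    print(theta, blend(G, A22, theta))
```

In this convention, the abstract's finding that accuracy drops with a blending hyper-parameter below 1 corresponds to the first case in the loop (theta = 1.0) performing best.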
Affiliation(s)
- Mehdi Neshat
  - Australian Centre for Precision Health, University of South Australia, Adelaide, SA, Australia
  - UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, Australia
  - South Australian Health and Medical Research Institute (SAHMRI), Adelaide, SA, Australia
- Soohyun Lee
  - Division of Animal Breeding and Genetics, National Institute of Animal Science (NIAS), Cheonan, Republic of Korea
- Md. Moksedul Momin
  - Australian Centre for Precision Health, University of South Australia, Adelaide, SA, Australia
  - UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, Australia
  - South Australian Health and Medical Research Institute (SAHMRI), Adelaide, SA, Australia
  - Department of Genetics and Animal Breeding, Faculty of Veterinary Medicine, Chattogram Veterinary and Animal Sciences University (CVASU), Chattogram, Bangladesh
- Buu Truong
  - Australian Centre for Precision Health, University of South Australia, Adelaide, SA, Australia
  - Cardiovascular Research Centre, Massachusetts General Hospital, Boston, MA, United States
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, United States
  - Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, MA, United States
- S. Hong Lee
  - Australian Centre for Precision Health, University of South Australia, Adelaide, SA, Australia
  - UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, Australia
  - South Australian Health and Medical Research Institute (SAHMRI), Adelaide, SA, Australia
|
12
|
Truong B, Hull LE, Ruan Y, Huang QQ, Hornsby W, Martin H, van Heel DA, Wang Y, Martin AR, Lee SH, Natarajan P. Integrative polygenic risk score improves the prediction accuracy of complex traits and diseases. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.02.21.23286110. [PMID: 36865265 PMCID: PMC9980241 DOI: 10.1101/2023.02.21.23286110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2023]
Abstract
Polygenic risk scores (PRS) are an emerging tool to predict the clinical phenotypes and outcomes of individuals. Validation and transferability of existing PRS across independent datasets and diverse ancestries are limited, which hinders the practical utility and exacerbates health disparities. We propose PRSmix, a framework that evaluates and leverages the PRS corpus of a target trait to improve prediction accuracy, and PRSmix+, which incorporates genetically correlated traits to better capture the human genetic architecture. We applied PRSmix to 47 and 32 diseases/traits in European and South Asian ancestries, respectively. PRSmix demonstrated a mean prediction accuracy improvement of 1.20-fold (95% CI: [1.10; 1.3]; P-value = 9.17 × 10⁻⁵) and 1.19-fold (95% CI: [1.11; 1.27]; P-value = 1.92 × 10⁻⁶), and PRSmix+ improved the prediction accuracy by 1.72-fold (95% CI: [1.40; 2.04]; P-value = 7.58 × 10⁻⁶) and 1.42-fold (95% CI: [1.25; 1.59]; P-value = 8.01 × 10⁻⁷) in European and South Asian ancestries, respectively. Compared to the previously established cross-trait-combination method with scores from pre-defined correlated traits, we demonstrated that our method can improve prediction accuracy for coronary artery disease up to 3.27-fold (95% CI: [2.1; 4.44]; P-value after FDR correction = 2.6 × 10⁻⁴). Our method provides a comprehensive framework to benchmark and leverage the combined power of PRS for maximal performance in a desired target population.
Affiliation(s)
- Buu Truong
  - Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA 02142
  - Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, 185 Cambridge Street, Boston, MA, 02114
- Leland E. Hull
  - Division of General Internal Medicine, 100 Cambridge Street, Massachusetts General Hospital, Boston, MA, 02114
  - Department of Medicine, Harvard Medical School, 25 Shattuck Street, Boston, MA 02115
- Yunfeng Ruan
  - Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA 02142
  - Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, 185 Cambridge Street, Boston, MA, 02114
- Qin Qin Huang
  - Department of Human Genetics, Wellcome Sanger Institute, Cambridge, UK
- Whitney Hornsby
  - Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA 02142
  - Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, 185 Cambridge Street, Boston, MA, 02114
- Hilary Martin
  - Department of Human Genetics, Wellcome Sanger Institute, Cambridge, UK
- David A. van Heel
  - Blizard Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK
- Ying Wang
  - Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA 02142
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Alicia R. Martin
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- S. Hong Lee
  - Australian Centre for Precision Health, University of South Australia Cancer Research Institute, University of South Australia, Adelaide, SA, 5000, Australia
- Pradeep Natarajan
  - Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA 02142
  - Center for Genomic Medicine and Cardiovascular Research Center, Massachusetts General Hospital, 185 Cambridge Street, Boston, MA, 02114
  - Department of Medicine, Harvard Medical School, 25 Shattuck Street, Boston, MA 02115
|
13
|
Zhang R, Zhang Y, Liu T, Jiang B, Li Z, Qu Y, Chen Y, Li Z. Utilizing Variants Identified with Multiple Genome-Wide Association Study Methods Optimizes Genomic Selection for Growth Traits in Pigs. Animals (Basel) 2023; 13:ani13040722. [PMID: 36830509 PMCID: PMC9952664 DOI: 10.3390/ani13040722] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/25/2022] [Revised: 02/09/2023] [Accepted: 02/15/2023] [Indexed: 02/22/2023] Open
Abstract
Improving the prediction accuracies of economically important traits in genomic selection (GS) is a main objective for researchers and breeders in the livestock industry. This study aims at utilizing potentially functional SNPs and QTLs identified with various genome-wide association study (GWAS) models in GS of pig growth traits. We used three well-established GWAS methods, including the mixed linear model, Bayesian model and meta-analysis, as well as 60K SNP-chip and whole genome sequence (WGS) data from 1734 Yorkshire and 1123 Landrace pigs to detect SNPs related to four growth traits: average daily gain, backfat thickness, body weight and birth weight. A total of 1485 significant loci and 24 candidate genes which are involved in skeletal muscle development, fatty deposition, lipid metabolism and insulin resistance were identified. Compared with using all SNP-chip data, GS with the pre-selected functional SNPs in the standard genomic best linear unbiased prediction (GBLUP), and a two-kernel based GBLUP model yielded average gains in accuracy by 4 to 46% (from 0.19 ± 0.07 to 0.56 ± 0.07) and 5 to 27% (from 0.16 ± 0.06 to 0.57 ± 0.05) for the four traits, respectively, suggesting that the prioritization of preselected functional markers in GS models had the potential to improve prediction accuracies for certain traits in livestock breeding.
Affiliation(s)
- Ruifeng Zhang
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou 510006, China
- Yi Zhang
  - Institute of Neuroscience, Panzhihua University, Panzhihua 617000, China
- Tongni Liu
  - Genetic Data Center, Faculty of Forestry, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Bo Jiang
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou 510006, China
- Zhenyang Li
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou 510006, China
- Youping Qu
  - Guangdong IPIG Technology Co., Ltd., Guangzhou 510006, China
- Yaosheng Chen
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou 510006, China
- Zhengcao Li
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou 510006, China
|
14
|
Yan H, Guo H, Li T, Zhang H, Xu W, Xie J, Zhu X, Yu Y, Chen J, Zhao S, Xu J, Hu M, Jiang Y, Zhang H, Ma M, He Z. High-precision early warning system for rice cadmium accumulation risk assessment. THE SCIENCE OF THE TOTAL ENVIRONMENT 2023; 859:160135. [PMID: 36375547 DOI: 10.1016/j.scitotenv.2022.160135] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/10/2022] [Revised: 10/01/2022] [Accepted: 11/07/2022] [Indexed: 06/16/2023]
Abstract
Rapid global industrialization has resulted in widespread cadmium (Cd) contamination in agricultural soils and products. A considerable proportion of rice consumers are exposed to Cd levels above the provisional safe intake limit, raising widespread concern about risk management. Therefore, a generalized approach is urgently needed to enable correct evaluation and early warning of cadmium contamination in rice products. Combining big data and computer science, this study developed a system named "SMART Cd Early Warning" for early warning of rice cadmium accumulation, which integrates four modules: genotype-to-phenotype (G2P) modelling, high-throughput sequencing, G2P prediction, and rice Cd contamination risk assessment. This system can rapidly assess the risk of rice cadmium accumulation by genotyping leaves at the seedling stage. Parameters including statistical method, population size, training-to-testing population ratio, and SNP density were assessed to ensure that the G2P model exhibited superior performance in terms of prediction precision (up to 0.76 ± 0.003) and computing efficiency (within 2 h). In field trials on cadmium-contaminated farmland in Wenling and Fuyang, Zhejiang Province, "SMART Cd Early Warning" exhibited superior capability for identifying at-risk rice varieties, suggesting the potential of the system for OsGCd risk assessment and early warning in the smart era.
Affiliation(s)
- Huili Yan
  - Key Laboratory of Plant Resources, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
- Hanyao Guo
  - Hebei Normal University, Shijiazhuang 050024, China
- Ting Li
  - Key Laboratory of Plant Resources, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Hezifan Zhang
  - Key Laboratory of Plant Resources, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Wenxiu Xu
  - Key Laboratory of Plant Resources, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
- Jianyin Xie
  - Key Lab of Crop Heterosis and Utilization of Ministry of Education, Beijing Key Lab of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Xiaoyang Zhu
  - Key Lab of Crop Heterosis and Utilization of Ministry of Education, Beijing Key Lab of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Yijun Yu
  - Zhejiang Station for Management of Arable Land Quality and Fertilizer, Hangzhou 310020, China
- Jian Chen
  - Plant Protection, Fertilizer and Rural Energy Agency of Wenling, Wenling 317500, China
- Shouqing Zhao
  - Plant Protection, Fertilizer and Rural Energy Agency of Wenling, Wenling 317500, China
- Jun Xu
  - Fuyang Agricultural Technology Extension Center, Fuyang 311400, China
- Minjun Hu
  - Fuyang Agricultural Technology Extension Center, Fuyang 311400, China
- Yugen Jiang
  - Fuyang Agricultural Technology Extension Center, Fuyang 311400, China
- Hongliang Zhang
  - Key Lab of Crop Heterosis and Utilization of Ministry of Education, Beijing Key Lab of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China; Sanya Institute of China Agricultural University, Sanya 572024, China
- Mi Ma
  - Key Laboratory of Plant Resources, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
- Zhenyan He
  - Key Laboratory of Plant Resources, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China.
|
15
|
Nantongo JS, Potts BM, Klápště J, Graham NJ, Dungey HS, Fitzgerald H, O'Reilly-Wapstra JM. Genomic selection for resistance to mammalian bark stripping and associated chemical compounds in radiata pine. G3 (BETHESDA, MD.) 2022; 12:jkac245. [PMID: 36218439 PMCID: PMC9635650 DOI: 10.1093/g3journal/jkac245] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/22/2022] [Accepted: 08/29/2022] [Indexed: 07/28/2023]
Abstract
The integration of genomic data into genetic evaluations can facilitate the rapid selection of superior genotypes and accelerate the breeding cycle in trees. In this study, 390 trees from 74 control-pollinated families were genotyped using a 36K Axiom SNP array. A total of 15,624 high-quality SNPs were used to develop genomic prediction models for mammalian bark stripping, tree height, and selected primary and secondary chemical compounds in the bark. Genetic parameters from three genomic prediction methods were compared to equivalent single- or multitrait pedigree-based approaches (ABLUP): single-trait best linear unbiased prediction based on a marker-based relationship matrix (genomic best linear unbiased prediction); multitrait single-step genomic best linear unbiased prediction, which integrated the marker-based and pedigree-based relationship matrices; and single-trait generalized ridge regression. The influence of the statistical distribution of data on the genetic parameters was assessed. Results indicated that the heritability estimates were increased nearly 2-fold with genomic models compared to the equivalent pedigree-based models. Predictive accuracy of the single-step genomic best linear unbiased prediction was higher than the ABLUP for most traits. Allowing for heterogeneity in marker effects through the use of generalized ridge regression did not markedly improve predictive ability over genomic best linear unbiased prediction, arguing that most of the chemical traits are modulated by many genes with small effects. Overall, the traits with low pedigree-based heritability benefited more from genomic models compared to the traits with high pedigree-based heritability. There was no evidence that data skewness or the presence of outliers affected the genomic or pedigree-based genetic estimates.
Affiliation(s)
- Judith S Nantongo
  - Corresponding author: National Agricultural Research Organization, P.O Box 1752, Mukono, Uganda.
- Brad M Potts
  - School of Natural Sciences, University of Tasmania, Hobart, TAS 7001, Australia
  - ARC Training Centre for Forest Value, Hobart, TAS 7001, Australia
- Jaroslav Klápště
  - Scion (New Zealand Forest Research Institute Ltd.), Rotorua 3046, New Zealand
- Natalie J Graham
  - Scion (New Zealand Forest Research Institute Ltd.), Rotorua 3046, New Zealand
- Heidi S Dungey
  - Scion (New Zealand Forest Research Institute Ltd.), Rotorua 3046, New Zealand
- Hugh Fitzgerald
  - School of Natural Sciences, University of Tasmania, Hobart, TAS 7001, Australia
- Julianne M O'Reilly-Wapstra
  - School of Natural Sciences, University of Tasmania, Hobart, TAS 7001, Australia
  - ARC Training Centre for Forest Value, Hobart, TAS 7001, Australia
|
16
|
Atanda SA, Govindan V, Singh R, Robbins KR, Crossa J, Bentley AR. Sparse testing using genomic prediction improves selection for breeding targets in elite spring wheat. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2022; 135:1939-1950. [PMID: 35348821 PMCID: PMC9205816 DOI: 10.1007/s00122-022-04085-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 01/10/2022] [Accepted: 03/16/2022] [Indexed: 06/08/2023]
Abstract
Sparse testing using genomic prediction can be efficiently used to increase the number of testing environments while maintaining selection intensity in the early yield testing stage without increasing the breeding budget. Sparse testing using genomic prediction enables expanded use of selection environments in early-stage yield testing without increasing phenotyping cost. We evaluated different sparse testing strategies in the yield testing stage of a CIMMYT spring wheat breeding pipeline characterized by multiple populations each with small family sizes of 1-9 individuals. Our results indicated that a substantial overlap between lines across environments should be used to achieve optimal prediction accuracy. As sparse testing leverages information generated within and across environments, the genetic correlations between environments and genomic relationships of lines across environments were the main drivers of prediction accuracy in multi-environment yield trials. Including information from previous evaluation years did not consistently improve the prediction performance. Genomic best linear unbiased prediction was found to be the best predictor of true breeding value, and therefore, we propose that it should be used as a selection decision metric in the early yield testing stages. We also propose it as a proxy for assessing prediction performance to mirror breeder's advancement decisions in a breeding program so that it can be readily applied for advancement decisions by breeding programs.
Affiliation(s)
- Velu Govindan
  - International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
- Ravi Singh
  - International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
- Kelly R Robbins
  - Section of Plant Breeding and Genetics, School of Integrative Plant Sciences, Cornell University, Ithaca, NY, USA
- Jose Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
- Alison R Bentley
  - International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico.
|
17
|
Rodriguez Neira JD, Peripolli E, de Negreiros MPM, Espigolan R, López-Correa R, Aguilar I, Lobo RB, Baldi F. Prediction ability for growth and maternal traits using SNP arrays based on different marker densities in Nellore cattle using the ssGBLUP. J Appl Genet 2022; 63:389-400. [PMID: 35133621 DOI: 10.1007/s13353-022-00685-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2021] [Revised: 01/25/2022] [Accepted: 02/02/2022] [Indexed: 11/25/2022]
Abstract
This study aimed to investigate the prediction ability for growth and maternal traits using different low-density customized SNP arrays, selected by informativeness and distribution of markers across the genome, employing single-step genomic BLUP (ssGBLUP). Phenotypic records for adjusted weight at 210 and 450 days of age were utilized. A total of 945 animals were genotyped with a high-density (HD) chip, and 267 individuals born after 2008 were selected as the validation population. We evaluated 11 scenarios using five customized array densities (40 k, 20 k, 10 k, 5 k and 2 k), with the HD array as the reference (desirable) scenario. The GEBV predictions and BIF (Beef Improvement Federation) accuracy were obtained with the BLUPF90 family of programs. Linear regression was used to evaluate the prediction ability, inflation, and bias of the GEBV from each customized array. Partial GEBVs were overestimated relative to complete GEBVs, and BIF accuracy increased as array density decreased. For all traits, prediction ability increased with array density and was similar among customized arrays with more than 10 k SNPs. Inflation decreased as array density increased and was higher for the MW210 effect. Bias, in the form of overestimated GEBVs, grew as the density of the customized arrays decreased. These results revealed that BIF accuracy is sensitive to overestimation when low-density customized arrays are used, whereas arrays with at least 10,000 informative SNPs obtained from the Illumina BovineHD BeadChip give accurate and less biased predictions. Low-density customized arrays under the ssGBLUP method could be feasible and cost-effective in genomic selection.
Affiliation(s)
- Juan Diego Rodriguez Neira
  - Departamento de Zootecnia, Faculdade de Ciências Agrarias e Veterinárias, Universidade Estadual Paulista (Unesp), Jaboticabal, 14884-900, Brazil.
- Elisa Peripolli
  - Departamento de Zootecnia, Faculdade de Ciências Agrarias e Veterinárias, Universidade Estadual Paulista (Unesp), Jaboticabal, 14884-900, Brazil
- Maria Paula Marinho de Negreiros
  - Departamento de Medicina Veterinária, Faculdade de Zootecnia e Engenharia de Alimentos, Universidade de São Paulo (Usp), Pirassununga, 13535-900, Brazil
- Rafael Espigolan
  - Departamento de Medicina Veterinária, Faculdade de Zootecnia e Engenharia de Alimentos, Universidade de São Paulo (Usp), Pirassununga, 13535-900, Brazil
- Rodrigo López-Correa
  - Departamento de Genética y Mejoramiento Animal, Facultad de Veterinaria, Universidad de La República, Montevideo, Uruguay
- Ignacio Aguilar
  - Instituto Nacional de Investigación Agropecuaria (INIA), Montevideo, Uruguay
- Raysildo B Lobo
  - Associação Nacional de Criadores e Pesquisadores (ANCP), Ribeirão Preto, Brazil
- Fernando Baldi
  - Departamento de Zootecnia, Faculdade de Ciências Agrarias e Veterinárias, Universidade Estadual Paulista (Unesp), Jaboticabal, 14884-900, Brazil
|
18
|
Elsen JM. Genomic Prediction of Complex Traits, Principles, Overview of Factors Affecting the Reliability of Genomic Prediction, and Algebra of the Reliability. Methods Mol Biol 2022; 2467:45-76. [PMID: 35451772 DOI: 10.1007/978-1-0716-2205-6_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
The quality of the predictions of genetic values based on the genotyping of neutral markers (GEBVs) is key information for deciding whether or not to implement genomic selection. This quality depends on the part of the genetic variability captured by the markers and on the precision of the estimates of their effects. Selection index theory provides the framework for evaluating the accuracy of GEBVs once the information has been gathered, with the genomic relationship matrix (GRM) playing a central role. When this accuracy must be known a priori, the theory of quantitative genetics gives clues for calculating the expectation of this GRM. This chapter makes a critical inventory of the methods developed to calculate these accuracies a posteriori and a priori. The most significant factors affecting this accuracy are described (size of the reference population, number of markers, linkage disequilibrium, heritability).
Affiliation(s)
- Jean-Michel Elsen
  - GenPhySE, Université de Toulouse, INRAE, ENVT, Castanet Tolosan, France.
|
19
|
Exploring the size of reference population for expected accuracy of genomic prediction using simulated and real data in Japanese Black cattle. BMC Genomics 2021; 22:799. [PMID: 34742249 PMCID: PMC8572443 DOI: 10.1186/s12864-021-08121-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/10/2021] [Accepted: 10/21/2021] [Indexed: 11/19/2022] Open
Abstract
Background: The size of the reference population is a crucial factor affecting the accuracy of prediction of the genomic estimated breeding value (GEBV). Few studies in beef cattle have compared accuracies achieved using real data with those achieved using simulated data and deterministic predictions. Thus, the extent to which traits of interest affect the accuracy of genomic prediction in Japanese Black cattle remains obscure. This study aimed to explore the size of the reference population needed for the expected accuracy of genomic prediction for simulated and carcass traits in Japanese Black cattle using a large number of samples. Results: A simulation analysis showed that heritability and the size of the reference population substantially impacted the accuracy of GEBV, whereas the number of quantitative trait loci did not. The estimated numbers of independent chromosome segments (Me) and the related weighting factor (w), derived from simulation results and a maximum likelihood (ML) approach, were 1900–3900 and 1, respectively. The expected accuracy for traits with heritability of 0.1–0.5 fitted well with empirical values when the reference population comprised > 5000 animals. The heritability of carcass traits was estimated to be 0.29–0.41, and the accuracy of GEBVs was relatively consistent with the simulation results. When the reference population comprised 7000–11,000 animals, the accuracy of GEBV for carcass traits ranged from 0.73 to 0.79, which is comparable to the estimated breeding value obtained in the progeny test. Conclusion: Our simulation analysis demonstrated that the expected accuracy of GEBV for a polygenic trait with low-to-moderate heritability could be practical in the Japanese Black cattle population. For carcass traits, a total of 7000–11,000 animals can be a sufficient reference population size for genomic prediction. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-021-08121-z.
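The dependence of expected accuracy on reference population size, heritability, and Me described in this abstract can be illustrated with a minimal sketch. It assumes the widely used deterministic approximation r = sqrt(N h² / (N h² + Me)) attributed to Daetwyler and colleagues; whether the study used exactly this form, and how its weighting factor w enters, is not stated in the abstract, so w is left out here. The function names are hypothetical.

```python
import math


def expected_accuracy(n_ref, h2, me):
    """Expected GEBV accuracy under r = sqrt(N*h2 / (N*h2 + Me)).

    n_ref: reference population size (N), h2: heritability,
    me: number of independent chromosome segments (Me).
    """
    if n_ref <= 0 or me <= 0 or not 0.0 < h2 <= 1.0:
        raise ValueError("need n_ref > 0, me > 0 and 0 < h2 <= 1")
    nh2 = n_ref * h2
    return math.sqrt(nh2 / (nh2 + me))


def required_reference_size(target_r, h2, me):
    """Invert the same approximation: N = r^2 * Me / (h2 * (1 - r^2))."""
    if not 0.0 < target_r < 1.0 or me <= 0 or not 0.0 < h2 <= 1.0:
        raise ValueError("need 0 < target_r < 1, me > 0 and 0 < h2 <= 1")
    r2 = target_r ** 2
    return r2 * me / (h2 * (1.0 - r2))


# Grid over the ranges quoted in the abstract (values of Me, h2 and N only;
# the resulting accuracies come from the approximation, not from the study).
for me in (1900, 3900):
    for h2 in (0.29, 0.41):
        for n_ref in (7000, 11000):
            r = expected_accuracy(n_ref, h2, me)
            print(f"Me={me} h2={h2} N={n_ref} -> r={r:.2f}")
```

The inverse function answers the planning question the study title raises: for a target accuracy, heritability, and Me, how many reference animals the approximation would call for.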
|
20
|
Zhou X, Lee SH. An integrative analysis of genomic and exposomic data for complex traits and phenotypic prediction. Sci Rep 2021; 11:21495. [PMID: 34728654 PMCID: PMC8564528 DOI: 10.1038/s41598-021-00427-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/11/2021] [Accepted: 10/12/2021] [Indexed: 12/18/2022]
Abstract
Complementary to the genome, the concept of exposome has been proposed to capture the totality of human environmental exposures. While there has been some recent progress on the construction of the exposome, few tools exist that can integrate the genome and exposome for complex trait analyses. Here we propose a linear mixed model approach to bridge this gap, which jointly models the random effects of the two omics layers on phenotypes of complex traits. We illustrate our approach using traits from the UK Biobank (e.g., BMI and height for N ~ 35,000) with a small fraction of the exposome that comprises 28 lifestyle factors. The joint model of the genome and exposome explains substantially more phenotypic variance and significantly improves phenotypic prediction accuracy, compared to the model based on the genome alone. The additional phenotypic variance captured by the exposome includes its additive effects as well as non-additive effects such as genome-exposome (gxe) and exposome-exposome (exe) interactions. For example, 19% of variation in BMI is explained by additive effects of the genome, while an additional 7.2% is explained by additive effects of the exposome, 1.9% by exe interactions and 4.5% by gxe interactions. Correspondingly, the prediction accuracy for BMI, computed using Pearson's correlation between the observed and predicted phenotypes, improves from 0.15 (based on the genome alone) to 0.35 (based on the genome and exposome). We also show, using established theories, that integrating genomic and exposomic data can be an effective way of attaining a clinically meaningful level of prediction accuracy for disease traits. In conclusion, the genomic and exposomic effects can contribute to phenotypic variation via their latent relationships, i.e. genome-exposome correlation, and gxe and exe interactions, and modelling these effects has the potential to improve phenotypic prediction accuracy and thus holds great promise for future clinical practice.
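The BMI figures quoted in this abstract can be put side by side with a few lines of arithmetic. The sketch below sums the reported variance components and, as a simplifying assumption not made in the abstract, treats them as uncorrelated so that the square root of the variance explained gives the correlation a predictor would reach if every effect were known without error. The dictionary keys are illustrative labels.

```python
import math

# Proportions of BMI variance quoted in the abstract.
variance_explained = {
    "genome (additive)": 0.19,
    "exposome (additive)": 0.072,
    "exe interactions": 0.019,
    "gxe interactions": 0.045,
}

genome_only = variance_explained["genome (additive)"]
joint = sum(variance_explained.values())

# Assumption: uncorrelated components and perfectly estimated effects, so the
# correlation ceiling is the square root of the variance explained.
ceiling_genome = math.sqrt(genome_only)
ceiling_joint = math.sqrt(joint)

print(f"variance explained: genome only {genome_only:.3f}, joint {joint:.3f}")
print(f"correlation ceiling: genome only {ceiling_genome:.2f}, joint {ceiling_joint:.2f}")
print("reported accuracy:   genome only 0.15, joint 0.35")
```

The reported accuracies sit below these ceilings, as expected when effects are estimated from a finite sample.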
Affiliation(s)
- Xuan Zhou
  - Australian Centre for Precision Health, University of South Australia, Adelaide, SA, 5000, Australia
  - UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, 5000, Australia
  - South Australian Health and Medical Research Institute, Adelaide, SA, 5000, Australia
- S Hong Lee
  - Australian Centre for Precision Health, University of South Australia, Adelaide, SA, 5000, Australia
  - UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, 5000, Australia
  - South Australian Health and Medical Research Institute, Adelaide, SA, 5000, Australia
|
21
|
Ahmar S, Ballesta P, Ali M, Mora-Poblete F. Achievements and Challenges of Genomics-Assisted Breeding in Forest Trees: From Marker-Assisted Selection to Genome Editing. Int J Mol Sci 2021; 22:10583. [PMID: 34638922 PMCID: PMC8508745 DOI: 10.3390/ijms221910583] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/03/2021] [Revised: 09/26/2021] [Accepted: 09/27/2021] [Indexed: 12/23/2022]
Abstract
Forest tree breeding efforts have focused mainly on improving traits of economic importance, selecting trees suited to new environments or generating trees that are more resilient to biotic and abiotic stressors. This review describes various methods of forest tree selection assisted by genomics and the main technological challenges and achievements in research at the genomic level. Due to the long rotation time of a forest plantation and the resulting long generation times necessary to complete a breeding cycle, the use of advanced techniques alongside traditional breeding has been necessary, allowing the use of more precise methods for determining the genetic architecture of traits of interest, such as genome-wide association studies (GWASs) and genomic selection (GS). The main factors that determine the accuracy of genomic prediction models are also addressed. In turn, the introduction of genome editing opens the door to new possibilities in forest trees, especially with clustered regularly interspaced short palindromic repeats and CRISPR-associated protein 9 (CRISPR/Cas9). It is a highly efficient genome editing technique that has been used to implement targeted changes at specific places in the genome of a forest tree. However, forest trees still lack an efficient transformation method and a sufficient number of genotypes suitable for CRISPR/Cas9. This challenge could be addressed with the use of the newly developed GRF-GIF technique combined with speed breeding.
Affiliation(s)
- Sunny Ahmar
  - Institute of Biological Sciences, University of Talca, 1 Poniente 1141, Talca 3460000, Chile
- Paulina Ballesta
  - The National Fund for Scientific and Technological Development, Av. del Agua 3895, Talca 3460000, Chile
- Mohsin Ali
  - Department of Forestry and Range Management, University of Agriculture Faisalabad, Faisalabad 38000, Pakistan
- Freddy Mora-Poblete
  - Institute of Biological Sciences, University of Talca, 1 Poniente 1141, Talca 3460000, Chile
|
22
|
Esuma W, Ozimati A, Kulakow P, Gore MA, Wolfe MD, Nuwamanya E, Egesi C, Kawuki RS. Effectiveness of genomic selection for improving provitamin A carotenoid content and associated traits in cassava. G3 (Bethesda, Md.) 2021; 11:jkab160. [PMID: 33963852 PMCID: PMC8496257 DOI: 10.1093/g3journal/jkab160] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/31/2021] [Accepted: 04/26/2021] [Indexed: 11/14/2022]
Abstract
Global efforts are underway to develop cassava with enhanced levels of provitamin A carotenoids to sustainably meet increasing demands for food and nutrition where the crop is a major staple. Herein, we tested the effectiveness of genomic selection (GS) for rapid improvement of cassava for total carotenoids content and associated traits. We evaluated 632 clones from Uganda's provitamin A cassava breeding pipeline and 648 West African introductions. At harvest, each clone was assessed for level of total carotenoids, dry matter content, and resistance to cassava brown streak disease (CBSD). All clones were genotyped with diversity array technology and imputed to a set of 23,431 single nucleotide polymorphic markers. We assessed the predictive ability of four genomic prediction methods in scenarios of cross-validation, across-population prediction, and inclusion of quantitative trait loci markers. Cross-validations produced the highest mean predictive ability for total carotenoids content (0.52) and the lowest for CBSD resistance (0.20), with G-BLUP outperforming the other models tested. Across-population predictions showed a low ability of the Ugandan population to predict the performance of West African clones, with the highest predictive ability recorded for total carotenoids content (0.34) and the lowest for CBSD resistance (0.12) using G-BLUP. By incorporating chromosome 1 markers associated with carotenoids content as an independent kernel in the G-BLUP model of a cross-validation scenario, predictive ability improved slightly from 0.52 to 0.58. These results reinforce ongoing efforts aimed at integrating GS into cassava breeding and demonstrate the utility of this tool for rapid genetic improvement.
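Predictive ability in a cross-validation scenario, as reported in this abstract, is typically the correlation between observed phenotypes and G-BLUP predictions for clones held out of training. The sketch below illustrates the idea on simulated data with NumPy. It is a generic illustration under stated assumptions (a VanRaden-style relationship matrix, a fixed heritability rather than estimated variance components, random folds), not the authors' pipeline; the function name and all simulated values are hypothetical.

```python
import numpy as np

def gblup_cv_predictive_ability(markers, phenotypes, h2=0.5, n_folds=5, seed=0):
    """Mean Pearson correlation between observed and predicted phenotypes over folds."""
    rng = np.random.default_rng(seed)
    n = markers.shape[0]
    # Centre marker columns and build a VanRaden-style genomic relationship matrix.
    freqs = markers.mean(axis=0) / 2.0
    z = markers - 2.0 * freqs
    g = z @ z.T / (2.0 * np.sum(freqs * (1.0 - freqs)))
    lam = (1.0 - h2) / h2  # assumed residual-to-genetic variance ratio
    folds = np.array_split(rng.permutation(n), n_folds)
    correlations = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        y_train = phenotypes[train_idx]
        mu = y_train.mean()
        g_tt = g[np.ix_(train_idx, train_idx)]
        g_vt = g[np.ix_(test_idx, train_idx)]
        # BLUP of held-out genomic values: G_vt (G_tt + lam*I)^-1 (y_train - mu)
        alpha = np.linalg.solve(g_tt + lam * np.eye(len(train_idx)), y_train - mu)
        predicted = mu + g_vt @ alpha
        correlations.append(np.corrcoef(predicted, phenotypes[test_idx])[0, 1])
    return float(np.mean(correlations))

# Simulated example: 300 diploid individuals, 500 independent markers, h2 near 0.5.
sim = np.random.default_rng(1)
n_ind, n_markers = 300, 500
allele_freqs = sim.uniform(0.1, 0.9, size=n_markers)
markers = sim.binomial(2, allele_freqs, size=(n_ind, n_markers)).astype(float)
genetic_values = markers @ sim.normal(0.0, 1.0, n_markers)
phenotypes = genetic_values + sim.normal(0.0, genetic_values.std(), n_ind)

ability = gblup_cv_predictive_ability(markers, phenotypes)
print(f"mean predictive ability over folds: {ability:.2f}")
```

An across-population scenario would replace the random folds with a split by population of origin, training on one group and predicting the other.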
Affiliation(s)
- Williams Esuma
  - National Crops Resources Research Institute, Kampala, Uganda
- Alfred Ozimati
  - National Crops Resources Research Institute, Kampala, Uganda
- Peter Kulakow
  - International Institute for Tropical Agriculture, Ibadan, Nigeria
- Michael A Gore
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
- Marnin D Wolfe
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
- Chiedozie Egesi
  - International Institute for Tropical Agriculture, Ibadan, Nigeria
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
- Robert S Kawuki
  - National Crops Resources Research Institute, Kampala, Uganda
|
23
|
Dekkers JCM, Su H, Cheng J. Predicting the accuracy of genomic predictions. Genet Sel Evol 2021; 53:55. [PMID: 34187354 PMCID: PMC8244147 DOI: 10.1186/s12711-021-00647-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/10/2020] [Accepted: 06/11/2021] [Indexed: 11/22/2022]
Abstract
Background Mathematical models are needed for the design of breeding programs using genomic prediction. While deterministic models for selection on pedigree-based estimates of breeding values (PEBV) are available, these have not been fully developed for genomic selection, with a key missing component being the accuracy of genomic EBV (GEBV) of selection candidates. Here, a deterministic method was developed to predict this accuracy within a closed breeding population based on the accuracy of GEBV and PEBV in the reference population and the distance of selection candidates from their closest ancestors in the reference population. Methods The accuracy of GEBV was modeled as a combination of the accuracy of PEBV and of EBV based on genomic relationships deviated from pedigree (DEBV). Loss of the accuracy of DEBV from the reference to the target population was modeled based on the effective number of independent chromosome segments in the reference population (Me). Measures of Me derived from the inverse of the variance of relationships and from the accuracies of GEBV and PEBV in the reference population, derived using either a Fisher information or a selection index approach, were compared by simulation. Results Using simulation, both the Fisher and the selection index approach correctly predicted accuracy in the target population over time, both with and without selection. The index approach, however, resulted in estimates of Me that were less affected by heritability, reference size, and selection, and which are, therefore, more appropriate as a population parameter. The variance of relationships underpredicted Me and was greatly affected by selection. A leave-one-out cross-validation approach was proposed to estimate required accuracies of EBV in the reference population. Aspects of the methods were validated using real data. Conclusions A deterministic method was developed to predict the accuracy of GEBV in selection candidates in a closed breeding population. 
The population parameter Me that is required for these predictions can be derived from an available reference data set, and applied to other reference data sets and traits for that population. This method can be used to evaluate the benefit of genomic prediction and to optimize genomic selection breeding programs. Supplementary Information The online version contains supplementary material available at 10.1186/s12711-021-00647-w.
Affiliation(s)
- Jack C M Dekkers
  - Department of Animal Science, Iowa State University, Ames, Iowa, USA
- Hailin Su
  - Department of Animal Science, Iowa State University, Ames, Iowa, USA
- Jian Cheng
  - Department of Animal Science, Iowa State University, Ames, Iowa, USA
|
24
|
Cai M, Xiao J, Zhang S, Wan X, Zhao H, Chen G, Yang C. A unified framework for cross-population trait prediction by leveraging the genetic correlation of polygenic traits. Am J Hum Genet 2021; 108:632-655. [PMID: 33770506 PMCID: PMC8059341 DOI: 10.1016/j.ajhg.2021.03.002] [Citation(s) in RCA: 60] [Impact Index Per Article: 20.0] [Received: 10/25/2020] [Accepted: 03/01/2021] [Indexed: 12/29/2022]
Abstract
The development of polygenic risk scores (PRSs) has proved useful to stratify the general European population into different risk groups. However, PRSs are less accurate in non-European populations due to genetic differences across different populations. To improve the prediction accuracy in non-European populations, we propose a cross-population analysis framework for PRS construction with both individual-level (XPA) and summary-level (XPASS) GWAS data. By leveraging trans-ancestry genetic correlation, our methods can borrow information from the Biobank-scale European population data to improve risk prediction in the non-European populations. Our framework can also incorporate population-specific effects to further improve construction of PRS. With innovations in data structure and algorithm design, our methods provide a substantial saving in computational time and memory usage. Through comprehensive simulation studies, we show that our framework provides accurate, efficient, and robust PRS construction across a range of genetic architectures. In a Chinese cohort, our methods achieved 7.3%-198.0% accuracy gain for height and 19.5%-313.3% accuracy gain for body mass index (BMI) in terms of predictive R2 compared to existing PRS approaches. We also show that XPA and XPASS can achieve substantial improvement for construction of height PRSs in the African population, suggesting the generality of our framework across global populations.
Affiliation(s)
- Mingxuan Cai
  - Guangzhou HKUST Fok Ying Tung Research Institute, Guangzhou 511458, China; Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR, China
- Jiashun Xiao
  - Guangzhou HKUST Fok Ying Tung Research Institute, Guangzhou 511458, China; Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR, China
- Shunkang Zhang
  - Guangzhou HKUST Fok Ying Tung Research Institute, Guangzhou 511458, China; Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR, China
- Xiang Wan
  - Shenzhen Research Institute of Big Data, Shenzhen 518172, China
- Hongyu Zhao
  - SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai 201111, China; Department of Biostatistics, Yale School of Public Health, New Haven, CT 06510, USA
- Gang Chen
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Can Yang
  - Guangzhou HKUST Fok Ying Tung Research Institute, Guangzhou 511458, China; Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR, China
|
25
|
Tong H, Phan NVT, Nguyen TT, Nguyen DV, Vo NS, Le L. Review on Databases and Bioinformatic Approaches on Pharmacogenomics of Adverse Drug Reactions. Pharmacogenomics & Personalized Medicine 2021; 14:61-75. [PMID: 33469342 PMCID: PMC7812041 DOI: 10.2147/pgpm.s290781] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/06/2020] [Accepted: 12/26/2020] [Indexed: 12/27/2022]
Abstract
Pharmacogenomics has been used effectively in studying adverse drug reactions by determining the person-specific genetic factors associated with individual response to a drug. Current approaches have revealed the importance of sequencing technologies and sequence analysis strategies for interpreting the contribution of genetic variation to the development of adverse reactions. Advances in next-generation sequencing and platforms bring new opportunities for validating the genetic candidates in certain reactions, and could be used to develop preemptive tests that predict the effect of a variant on a person's response to a drug. With the large amount of data accumulated recently, in silico approaches based on data analysis and modeling serve as another important alternative, significantly supporting final decisions in the translation from research to clinical applications such as diagnosis and treatment of various types of adverse responses.
Affiliation(s)
- Hang Tong
  - School of Biotechnology, International University, Ho Chi Minh City, Vietnam; Vietnam National University, Ho Chi Minh City, Vietnam
- Nga V T Phan
  - School of Biotechnology, International University, Ho Chi Minh City, Vietnam; Vietnam National University, Ho Chi Minh City, Vietnam
- Thanh T Nguyen
  - Department of Translational Biomedical Informatics, Vingroup Big Data Institute, Hanoi, Vietnam
- Dinh V Nguyen
  - Department of Respiratory, Allergy and Clinical Immunology, Vinmec International Hospital, Hanoi, Vietnam; College of Health Sciences, VinUniversity, Hanoi, Vietnam
- Nam S Vo
  - Department of Translational Biomedical Informatics, Vingroup Big Data Institute, Hanoi, Vietnam
- Ly Le
  - School of Biotechnology, International University, Ho Chi Minh City, Vietnam; Vietnam National University, Ho Chi Minh City, Vietnam; Department of Translational Biomedical Informatics, Vingroup Big Data Institute, Hanoi, Vietnam
|
26
|
Abstract
A suitable pairwise relatedness estimation is key to genetic studies. Several methods have been proposed to compute relatedness in autopolyploids based on molecular data. However, unlike diploids, autopolyploids still need further study in scenarios with many linked molecular markers of known dosage. In this study, we provide guidelines for plant geneticists and breeders to obtain trustworthy pairwise relatedness estimates. To this end, we simulated populations considering different ploidy levels, meiotic pairing patterns, numbers of loci and alleles, and inbreeding levels. Analyses were performed to assess the accuracy of distinct methods and to demonstrate the usefulness of molecular markers in practical situations. Overall, our results suggest that at least 100 effective biallelic molecular markers are required for good pairwise relatedness estimation if methods based on correlation are used. For this number of loci, current methods based on multiallelic markers show lower performance than biallelic ones. Estimating relatedness in cases of inbreeding or close relationships (such as parent-offspring, full-sibs, or half-sibs) is more challenging. Methods to estimate pairwise relatedness based on molecular markers, for different ploidy levels or pedigrees, were implemented in the AGHmatrix R package.
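A correlation-type relatedness estimate from biallelic dosages can be sketched as a direct extension of the diploid VanRaden matrix, with dosages centred at ploidy times the allele frequency and scaled by ploidy times the sum of p(1 - p). The code below assumes that form; it is an illustration in plain Python and makes no claim about how the AGHmatrix R package implements its methods. The function name and the tiny tetraploid data set are hypothetical, and four markers are far fewer than the 100 or more the abstract recommends.

```python
def dosage_relationship_matrix(dosages, ploidy):
    """VanRaden-style relationship matrix from biallelic dosages (0..ploidy).

    dosages: list of per-individual lists, one dosage per marker, no missing data.
    """
    n_ind = len(dosages)
    n_loci = len(dosages[0])
    # Allele frequency per marker, estimated from the sample itself.
    freqs = [sum(ind[j] for ind in dosages) / (ploidy * n_ind) for j in range(n_loci)]
    scale = ploidy * sum(p * (1.0 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all markers are monomorphic")
    centred = [[ind[j] - ploidy * freqs[j] for j in range(n_loci)] for ind in dosages]
    return [
        [sum(a * b for a, b in zip(centred[i], centred[k])) / scale for k in range(n_ind)]
        for i in range(n_ind)
    ]

# Hypothetical autotetraploid sample: the first two individuals are identical.
tetraploid_dosages = [
    [0, 2, 4, 1],
    [0, 2, 4, 1],
    [4, 2, 0, 3],
    [2, 1, 3, 2],
]
relationship = dosage_relationship_matrix(tetraploid_dosages, ploidy=4)
for row in relationship:
    print("  ".join(f"{value:6.2f}" for value in row))
```

Because dosages are centred on sample frequencies, each row of the matrix sums to zero, so the values measure relatedness relative to the average of the sampled individuals.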
|
27
|
Accuracy of genomic evaluation using imputed high-density genotypes for carcass traits in commercial Hanwoo population. Livest Sci 2020. [DOI: 10.1016/j.livsci.2020.104256] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022]
|
28
|
Cappetta E, Andolfo G, Di Matteo A, Barone A, Frusciante L, Ercolano MR. Accelerating Tomato Breeding by Exploiting Genomic Selection Approaches. Plants (Basel, Switzerland) 2020; 9:E1236. [PMID: 32962095 PMCID: PMC7569914 DOI: 10.3390/plants9091236] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 04/15/2020] [Revised: 05/13/2020] [Accepted: 09/15/2020] [Indexed: 01/16/2023]
Abstract
Genomic selection (GS) is a predictive approach that was developed to increase the rate of genetic gain per unit of time and reduce the generation interval by utilizing genome-wide markers in breeding programs. It has emerged as a valuable method for improving complex traits that are controlled by many genes with small effects. GS enables the prediction of the breeding value of candidate genotypes for selection. In this work, we address important issues related to GS and its implementation in the plant context with special emphasis on tomato breeding. Genomic constraints and critical parameters affecting the accuracy of prediction such as the number of markers, statistical model, phenotyping and complexity of trait, training population size and composition should be carefully evaluated. The comparison of GS approaches for facilitating the selection of superior tomato genotypes during breeding programs is also discussed. GS applied to tomato breeding has already been shown to be feasible. We illustrated how GS can improve the rate of gain in elite line selection, and descendent and backcross schemes. The GS schemes have begun to be delineated and computer science can provide support for future selection strategies. A new promising breeding framework is beginning to emerge for optimizing tomato improvement procedures.
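The link between prediction accuracy, generation interval and the rate of genetic gain mentioned in this abstract is usually expressed with the breeder's equation, gain per year = i * r * sigma_A / L. The sketch below applies it to two scenarios; the equation is standard, but the function name and every input value are hypothetical illustrations, not figures from the review.

```python
def genetic_gain_per_year(intensity, accuracy, sigma_a, generation_interval):
    """Breeder's equation: i * r * sigma_A / L, in trait units per year."""
    if generation_interval <= 0:
        raise ValueError("generation_interval must be positive")
    return intensity * accuracy * sigma_a / generation_interval

# Hypothetical comparison: GS slightly raises accuracy and halves the cycle length.
conventional = genetic_gain_per_year(intensity=1.4, accuracy=0.5, sigma_a=1.0, generation_interval=2.0)
genomic = genetic_gain_per_year(intensity=1.4, accuracy=0.6, sigma_a=1.0, generation_interval=1.0)

print(f"conventional: {conventional:.2f} sigma_A per year")
print(f"genomic:      {genomic:.2f} sigma_A per year")
print(f"ratio:        {genomic / conventional:.1f}")
```

In this illustration most of the advantage comes from the shorter generation interval rather than from the gain in accuracy.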
Affiliation(s)
- Maria Raffaella Ercolano
  - Department of Agricultural Sciences, University of Naples Federico II, Via Università 100, 80055 Naples, Italy; (E.C.); (G.A.); (A.D.M.); (A.B.); (L.F.)
|
29
|
Alam MZ, Lee YM, Son HJ, Hanna LH, Riley DG, Mannen H, Sasazaki S, Park SP, Kim JJ. Genetic characteristics of Korean Jeju Black cattle with high density single nucleotide polymorphisms. Anim Biosci 2020; 34:789-800. [PMID: 32882779 PMCID: PMC8100474 DOI: 10.5713/ajas.19.0888] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 11/15/2019] [Accepted: 06/29/2020] [Indexed: 11/27/2022]
Abstract
Objective Conservation and genetic improvement of cattle breeds require information about genetic diversity and population structure of the cattle. In this study, we investigated the genetic diversity and population structure of the three cattle breeds in the Korean peninsula. Methods Jeju Black, Hanwoo, and Holstein cattle in Korea, together with six foreign breeds, were examined. Genetic diversity within the cattle breeds was analyzed with minor allele frequency (MAF), observed and expected heterozygosity (HO and HE), inbreeding coefficient (FIS) and past effective population size. Molecular variance and population structure between the nine breeds were analyzed using a model-based clustering method. Genetic distances between breeds were evaluated with Nei’s genetic distance and Weir and Cockerham’s FST. Results Our results revealed that Jeju Black cattle had the lowest level of heterozygosity (HE = 0.21) among the studied taurine breeds, and an average MAF of 0.16. The level of inbreeding was −0.076 for Jeju Black, and ranged from −0.018 to −0.118 for the other breeds. Principal component analysis and a neighbor-joining tree showed a clear separation of Jeju Black cattle from other local (Hanwoo and Japanese cattle) and taurine/indicine cattle breeds in the evolutionary process, and a distinct pattern of admixture, with Jeju Black cattle not clustering with the other studied populations. The FST value between Jeju Black cattle and Hanwoo was 0.106, which was the lowest across the pairs of breeds, the others ranging from 0.161 to 0.274, indicating some degree of genetic closeness of Jeju Black cattle to Hanwoo. The past effective population size of Jeju Black cattle was very small, i.e. 38 at 13 generations ago, compared with 209 for Hanwoo. Conclusion This study indicates genetic uniqueness of Jeju Black cattle. 
However, a small effective population size of Jeju Black cattle indicates the requirement for an implementation of a sustainable breeding policy to increase the population for genetic improvement and future conservation.
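The within-breed diversity statistics named in this abstract (MAF, HO, HE and FIS) can be computed from SNP genotype counts as sketched below. The sketch assumes biallelic genotypes coded 0, 1 or 2, no missing data, HE = 2p(1 - p) per locus and FIS = 1 - HO/HE on the locus averages; the study may have used different estimators or software. The function name and the toy genotypes are hypothetical.

```python
def diversity_summary(genotypes):
    """Average MAF, HO, HE and FIS over loci.

    genotypes: list of per-individual lists of 0/1/2 alternate-allele counts.
    """
    n_ind = len(genotypes)
    n_loci = len(genotypes[0])
    maf_sum = ho_sum = he_sum = 0.0
    for j in range(n_loci):
        column = [ind[j] for ind in genotypes]
        p = sum(column) / (2 * n_ind)                        # alternate-allele frequency
        maf_sum += min(p, 1 - p)                             # minor allele frequency
        ho_sum += sum(1 for g in column if g == 1) / n_ind   # observed heterozygosity
        he_sum += 2 * p * (1 - p)                            # expected under Hardy-Weinberg
    maf, ho, he = maf_sum / n_loci, ho_sum / n_loci, he_sum / n_loci
    fis = 1 - ho / he if he > 0 else float("nan")
    return {"MAF": maf, "HO": ho, "HE": he, "FIS": fis}

# Toy data: four animals, two SNPs; the second SNP is heterozygous in every animal.
toy_genotypes = [
    [0, 1],
    [1, 1],
    [2, 1],
    [1, 1],
]
summary = diversity_summary(toy_genotypes)
print(summary)
```

A negative FIS, as reported for all breeds in the abstract, indicates more heterozygotes than expected under Hardy-Weinberg proportions.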
Affiliation(s)
- M Zahangir Alam
  - Department of Biotechnology, Yeungnam University, Gyeongsan 38541, Korea; Department of Genetic Engineering and Biotechnology, Shahjalal University of Science and Technology, Sylhet 3114, Bangladesh
- Yun-Mi Lee
  - Department of Biotechnology, Yeungnam University, Gyeongsan 38541, Korea
- Hyo-Jung Son
  - Department of Biotechnology, Yeungnam University, Gyeongsan 38541, Korea
- Lauren H Hanna
  - Department of Animal Sciences, North Dakota State University, Fargo, ND 58105, USA
- David G Riley
  - Department of Animal Sciences, Texas A&M University, College Station, TX 77843, USA
- Hideyuki Mannen
  - Graduate School of Agricultural Science, Kobe University, Kobe 657-8501, Japan
- Shinji Sasazaki
  - Graduate School of Agricultural Science, Kobe University, Kobe 657-8501, Japan
- Se Pill Park
  - Faculty of Biotechnology, Jeju National University, Jeju 13557, Korea
- Jong-Joo Kim
  - Department of Biotechnology, Yeungnam University, Gyeongsan 38541, Korea
|
30
|
Folkersen L, Pain O, Ingason A, Werge T, Lewis CM, Austin J. Impute.me: An Open-Source, Non-profit Tool for Using Data From Direct-to-Consumer Genetic Testing to Calculate and Interpret Polygenic Risk Scores. Front Genet 2020; 11:578. [PMID: 32714365 PMCID: PMC7340159 DOI: 10.3389/fgene.2020.00578] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 11/29/2019] [Accepted: 05/11/2020] [Indexed: 01/07/2023]
Abstract
To date, interpretation of genomic information has focused on single variants conferring disease risk, but most disorders of major public concern have a polygenic architecture. Polygenic risk scores (PRSs) give a single measure of disease liability by summarizing disease risk across hundreds of thousands of genetic variants. They can be calculated in any genome-wide genotype data-source, using a prediction model based on genome-wide summary statistics from external studies. As genome-wide association studies increase in power, the predictive ability for disease risk will also increase. Although PRSs are unlikely ever to be fully diagnostic, they may give valuable medical information for risk stratification, prognosis, or treatment response prediction. Public engagement is therefore becoming important on the potential use and acceptability of PRSs. However, the current public perception of genetics is that it provides "yes/no" answers about the presence/absence of a condition, or the potential for developing a condition, which is not the case for common, complex disorders with polygenic architecture. Meanwhile, unregulated third-party applications are being developed to satisfy consumer demand for information on the impact of lower-risk variants on common diseases that are highly polygenic. Often, applications report results from single-nucleotide polymorphisms (SNPs) and disregard effect size, which is highly inappropriate for common, complex disorders where everybody carries risk variants. Tools are therefore needed to communicate our understanding of genetic vulnerability as a continuous trait, where a genetic liability confers risk for disease. Impute.me is one such tool, whose focus is on education and information on common, complex disorders with polygenic architecture. Its research-focused open-source website allows users to upload consumer genetics data to obtain PRSs, with results reported on a population-level normal distribution. 
Diseases can only be browsed by International Classification of Diseases, 10th Revision (ICD-10) chapter-location or alphabetically, thus prompting the user to consider genetic risk scores in a medical context of relevance to the individual. Here, we present an overview of the implementation of the impute.me site, along with analysis of typical usage patterns, which may advance public perception of genomic risk and precision medicine.
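The core calculation described in this abstract, a score that sums allele dosages weighted by effect sizes and is then placed on a population-level normal distribution, can be sketched as follows. This is a generic illustration using only the Python standard library, not the impute.me implementation; the function names, effect sizes and reference genotypes are hypothetical.

```python
from statistics import NormalDist, mean, pstdev

def polygenic_score(dosages, effect_sizes):
    """Weighted sum of risk-allele dosages (0, 1 or 2) and per-variant effect sizes."""
    return sum(d * b for d, b in zip(dosages, effect_sizes))

def score_percentile(score, reference_scores):
    """Z-score and percentile of a score against a reference population,
    assuming the reference scores are approximately normal."""
    z = (score - mean(reference_scores)) / pstdev(reference_scores)
    return z, NormalDist().cdf(z) * 100.0

# Hypothetical effect sizes (e.g. log odds ratios) for three variants.
effect_sizes = [0.10, -0.05, 0.20]
reference_genotypes = [[1, 1, 1], [0, 2, 0], [2, 1, 2], [1, 0, 0]]
reference_scores = [polygenic_score(g, effect_sizes) for g in reference_genotypes]

user_score = polygenic_score([2, 0, 1], effect_sizes)
z, percentile = score_percentile(user_score, reference_scores)
print(f"score={user_score:.2f}  z={z:.2f}  percentile={percentile:.0f}")
```

Weighting by effect size is what separates this from counting risk alleles, which is the practice the abstract criticises in some third-party applications.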
Affiliation(s)
- Lasse Folkersen
  - Institute of Biological Psychiatry, Mental Health Centre Sankt Hans, Copenhagen, Denmark
- Oliver Pain
  - Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London, United Kingdom
- Andrés Ingason
  - Institute of Biological Psychiatry, Mental Health Centre Sankt Hans, Copenhagen, Denmark
- Thomas Werge
  - Institute of Biological Psychiatry, Mental Health Centre Sankt Hans, Copenhagen, Denmark
- Cathryn M. Lewis
  - Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London, United Kingdom
  - Department of Medical & Molecular Genetics, Faculty of Life Sciences & Medicine, King’s College London, London, United Kingdom
- Jehannine Austin
  - Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada
  - Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
|
31
|
Efficient polygenic risk scores for biobank scale data by exploiting phenotypes from inferred relatives. Nat Commun 2020; 11:3074. [PMID: 32555176 PMCID: PMC7299943 DOI: 10.1038/s41467-020-16829-x] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 01/07/2020] [Accepted: 05/25/2020] [Indexed: 01/06/2023]
Abstract
Polygenic risk scores are emerging as a potentially powerful tool to predict future phenotypes of target individuals, typically using unrelated individuals, thereby devaluing information from relatives. Here, for 50 traits from the UK Biobank data, we show that a design of 5,000 individuals with first-degree relatives of target individuals can achieve a prediction accuracy similar to that of around 220,000 unrelated individuals (mean prediction accuracy = 0.26 vs. 0.24, mean fold-change = 1.06 (95% CI: 0.99-1.13), P-value = 0.08), despite a 44-fold difference in sample size. For lifestyle traits, the prediction accuracy with 5,000 individuals including first-degree relatives of target individuals is significantly higher than that with 220,000 unrelated individuals (mean prediction accuracy = 0.22 vs. 0.16, mean fold-change = 1.40 (1.17-1.62), P-value = 0.025). Our findings suggest that polygenic prediction integrating family information may help to accelerate precision health and clinical intervention. Genetic data from large cohorts of unrelated individuals can be used to create polygenic risk scores, which could be used to predict individual risk of developing a specific disease. Here the authors show that smaller cohorts of related individuals can provide similarly powerful predictive ability.
|
32
|
Raymond B, Wientjes YCJ, Bouwman AC, Schrooten C, Veerkamp RF. A deterministic equation to predict the accuracy of multi-population genomic prediction with multiple genomic relationship matrices. Genet Sel Evol 2020; 52:21. [PMID: 32345213 PMCID: PMC7189707 DOI: 10.1186/s12711-020-00540-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/11/2019] [Accepted: 04/14/2020] [Indexed: 11/10/2022]
Abstract
BACKGROUND A multi-population genomic prediction (GP) model in which important pre-selected single nucleotide polymorphisms (SNPs) are differentially weighted (MPMG) has been shown to result in better prediction accuracy than a multi-population, single genomic relationship matrix ([Formula: see text]) GP model (MPSG) in which all SNPs are weighted equally. Our objective was to underpin theoretically the advantages and limits of the MPMG model over the MPSG model, by deriving and validating a deterministic prediction equation for its accuracy. METHODS Using selection index theory, we derived an equation to predict the accuracy of estimated total genomic values of selection candidates from population [Formula: see text] ([Formula: see text]), when individuals from two populations, [Formula: see text] and [Formula: see text], are combined in the training population and two [Formula: see text], made respectively from pre-selected and remaining SNPs, are fitted simultaneously in MPMG. We used simulations to validate the prediction equation in scenarios that differed in the level of genetic correlation between populations, heritability, and proportion of genetic variance explained by the pre-selected SNPs. Empirical accuracy of the MPMG model in each scenario was calculated and compared to the predicted accuracy from the equation. RESULTS In general, the derived prediction equation resulted in accurate predictions of [Formula: see text] for the scenarios evaluated. Using the prediction equation, we showed that an important advantage of the MPMG model over the MPSG model is its ability to benefit from the small number of independent chromosome segments ([Formula: see text]) due to the pre-selected SNPs, both within and across populations, whereas for the MPSG model, there is only a single value for [Formula: see text], calculated based on all SNPs, which is very large. 
However, this advantage is dependent on the pre-selected SNPs that explain some proportion of the total genetic variance for the trait. CONCLUSIONS We developed an equation that gives insight into why, and under which conditions the MPMG outperforms the MPSG model for GP. The equation can be used as a deterministic tool to assess the potential benefit of combining information from different populations, e.g., different breeds or lines for GP in livestock or plants, or different groups of people based on their ethnic background for prediction of disease risk scores.
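To make the role of the number of independent chromosome segments concrete, the sketch below uses the widely cited single-population approximation r = sqrt(N·h² / (N·h² + Me)). This is not the multi-population, multi-GRM equation derived in the paper; the function name and the example numbers are illustrative assumptions only.

```python
import math


def expected_accuracy(n_train: int, h2: float, me: float) -> float:
    """Single-population approximation of genomic prediction accuracy.

    r = sqrt(N * h2 / (N * h2 + Me)), where N is the training size,
    h2 the heritability and Me the number of independent chromosome segments.
    This is a simplification, not the MPMG equation from the paper.
    """
    if n_train < 0 or not 0.0 <= h2 <= 1.0 or me <= 0:
        raise ValueError("need n_train >= 0, 0 <= h2 <= 1 and me > 0")
    signal = n_train * h2
    return math.sqrt(signal / (signal + me))


# Hypothetical values: a small Me (few pre-selected SNPs) versus a large Me (all SNPs).
small_me = expected_accuracy(n_train=2000, h2=0.3, me=50)
large_me = expected_accuracy(n_train=2000, h2=0.3, me=5000)
print(f"small Me: {small_me:.3f}, large Me: {large_me:.3f}")
```

With the same training size and heritability, the smaller Me gives the higher expected accuracy, which is the intuition the abstract attributes to the pre-selected SNPs.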
Affiliation(s)
- Biaty Raymond
- Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands; Biometris, Wageningen University and Research, 6700 AA, Wageningen, The Netherlands
| | - Yvonne C J Wientjes
- Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands
| | - Aniek C Bouwman
- Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands
| | | | - Roel F Veerkamp
- Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands
| |
|
33
|
Dechow CD, Liu WS, Specht LW, Blackburn H. Reconstitution and modernization of lost Holstein male lineages using samples from a gene bank. J Dairy Sci 2020; 103:4510-4516. [PMID: 32171516 DOI: 10.3168/jds.2019-17753] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2019] [Accepted: 01/10/2020] [Indexed: 11/19/2022]
Abstract
More than 99% of all known Holstein artificial insemination (AI) bulls in the United States can be traced through their male lineage to just 2 bulls born in the 1950s, and all Holstein bulls can be traced back to 2 bulls born in the late 1800s. As the Y chromosome is passed exclusively from sire to son, this suggests that variation is limited for much of the Y chromosome. Two additional male lineages that are separate from modern lineages before 1890 were present at the start of the AI era and had semen available from the USDA National Animal Germplasm Program (Fort Collins, CO). Semen from representatives of those lineages was used for in vitro embryo production by mating to elite modern genetic females, resulting in the birth of 7 bulls and 8 heifers. Genomic evaluation of the bulls suggested that lineages from the beginning of the AI era could be reconstituted to breed average for total economic merit in 1 generation when mated to elite females due to high genetic merit for fertility, near-average genetic merit for fat and protein yield, and below-average genetic merit for udder and physical conformation. Semen from the bulls is commercially available to facilitate Y chromosome research and efforts to restore lost genetic diversity.
Affiliation(s)
- C D Dechow
- Department of Animal Science, Pennsylvania State University, University Park 16802.
| | - W S Liu
- Department of Animal Science, Pennsylvania State University, University Park 16802
| | - L W Specht
- Department of Animal Science, Pennsylvania State University, University Park 16802
| | - H Blackburn
- National Animal Germplasm Program, Fort Collins, CO 80521
| |
|
34
|
Corredor FA, Sanglard LP, Leach RJ, Ross JW, Keating AF, Serão NVL. Genetic and genomic characterization of vulva size traits in Yorkshire and Landrace gilts. BMC Genet 2020; 21:28. [PMID: 32164558 PMCID: PMC7068987 DOI: 10.1186/s12863-020-0834-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2019] [Accepted: 02/26/2020] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND Reproductive performance is critical for efficient swine production. Recent results indicated that vulva size (VS) may be predictive of reproductive performance in sows. Study objectives were to estimate genetic parameters, identify associated genomic regions, and estimate genomic prediction accuracies (GPA) for VS traits. RESULTS Heritability estimates for the VS traits vulva area (VA), height (VH), and width (VW) were moderate to high in Yorkshire, with 0.46 ± 0.10, 0.55 ± 0.10, and 0.31 ± 0.09, respectively, whereas these estimates were low to moderate in Landrace, with 0.16 ± 0.09, 0.24 ± 0.11, and 0.08 ± 0.06, respectively. Genetic correlations within VS traits were very high for both breeds, with the lowest of 0.67 ± 0.29 for VH and VW for Landrace. Genome-wide association studies (GWAS) for Landrace revealed genomic regions associated with VS traits on Sus scrofa chromosome (SSC) 2 (154-157 Mb), 7 (107-110 Mb), 8 (4-6 Mb), and 10 (8-19 Mb). For Yorkshire, genomic regions on SSC 1 (87-91 and 282-287 Mb) and 5 (67 Mb) were identified. All regions explained at least 3.4% of the genetic variance. Accuracies of genomic prediction were moderate in Landrace, ranging from 0.30 (VH) to 0.61 (VA), and lower for Yorkshire, with 0.07 (VW) to 0.11 (VH). Between-breed and multi-breed genomic prediction accuracies were low. CONCLUSIONS Our findings suggest that VS traits are heritable in Landrace and Yorkshire gilts. Genomic analyses show that major QTL control these traits, and they differ between breeds. Genomic information can be used to increase genetic gains for these traits in gilts. Additional research must be done to validate the GWAS and genomic prediction results reported in our study.
Affiliation(s)
| | | | | | - Jason W. Ross
- Department of Animal Science, Iowa State University, Ames, IA 50010, USA
- Iowa Pork Industry Center, Iowa State University, Ames, IA 50010 USA
| | - Aileen F. Keating
- Department of Animal Science, Iowa State University, Ames, IA 50010, USA
| | - Nick V. L. Serão
- Department of Animal Science, Iowa State University, Ames, IA 50010, USA
| |
|
35
|
Mota LFM, Fernandes GA, Herrera AC, Scalez DCB, Espigolan R, Magalhães AFB, Carvalheiro R, Baldi F, Albuquerque LG. Genomic reaction norm models exploiting genotype × environment interaction on sexual precocity indicator traits in Nellore cattle. Anim Genet 2020; 51:210-223. [PMID: 31944356 DOI: 10.1111/age.12902] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 12/13/2019] [Indexed: 12/31/2022]
Abstract
Brazilian beef cattle are raised predominantly on pasture in a wide range of environments. In this scenario, genotype by environment (G×E) interaction is an important source of phenotypic variation in the reproductive traits. Hence, the evaluation of G×E interactions for heifer's early pregnancy (HP) and scrotal circumference (SC) traits in Nellore cattle, belonging to three breeding programs, was carried out to determine the animal's sensitivity to the environmental conditions (EC). The dataset consisted of 85 874 records for HP and 151 553 records for SC, from which 1800 heifers and 3343 young bulls were genotyped with the BovineHD BeadChip. Genotypic information for 826 sires was also used in the analyses. EC levels were based on the contemporary group solutions for yearling body weight. Linear reaction norm models (RNM), using pedigree information (RNM_A) or pedigree and genomic information (RNM_H), were used to infer G×E interactions. Two validation schemes were used to assess the predictive ability, with the following training populations: (a) forward scheme: the dataset was split based on year of birth from 2008 for HP and from 2011 for SC; and (b) environment-specific scheme: low EC (-3.0 and -1.5) and high EC (1.5 and 3.0). The inclusion of the H matrix in RNM increased the genetic variance of the intercept and slope by 18.55 and 23.00% on average, respectively, and provided genetic parameter estimates that were more accurate than those considering pedigree only. The same trend was observed for heritability estimates, which were 0.28-0.56 for SC and 0.26-0.49 for HP, using RNM_H, and 0.26-0.52 for SC and 0.22-0.45 for HP, using RNM_A. The lowest correlations observed between unfavorable (-3.0) and favorable (3.0) EC levels were 0.30 for HP and -0.12 for SC, indicating the presence of G×E interaction. The G×E interaction effect implied differences in animals' genetic merit and re-ranking of animals across different environmental conditions. 
SNP marker-environment interaction was detected for Nellore sexual precocity indicator traits with changes in effect and variance across EC levels. The RNM_H captured G×E interaction effects better than RNM_A and improved the predictive ability by around 14.04% for SC and 21.31% for HP. Using the forward scheme increased the overall predictive ability for SC (20.55%) and HP (11.06%) compared with the environment-specific scheme. The results suggest that the inclusion of genomic information combined with the pedigree to assess the G×E interaction leads to more accurate variance components and genetic parameter estimates.
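The correlation between EC levels reported above follows from the structure of a linear reaction norm, in which an animal's genetic value at environment x is intercept + slope·x. The sketch below computes that correlation from intercept and slope (co)variances; the function names and the variance components are hypothetical and are not estimates from the study.

```python
import math


def rn_genetic_covariance(x1, x2, var_int, var_slope, cov_int_slope):
    """Genetic covariance between environments x1 and x2 under g(x) = a + b*x.

    cov(g(x1), g(x2)) = var(a) + (x1 + x2) * cov(a, b) + x1 * x2 * var(b)
    """
    return var_int + (x1 + x2) * cov_int_slope + x1 * x2 * var_slope


def rn_genetic_correlation(x1, x2, var_int, var_slope, cov_int_slope):
    """Genetic correlation of the same trait expressed at EC levels x1 and x2."""
    cov = rn_genetic_covariance(x1, x2, var_int, var_slope, cov_int_slope)
    v1 = rn_genetic_covariance(x1, x1, var_int, var_slope, cov_int_slope)
    v2 = rn_genetic_covariance(x2, x2, var_int, var_slope, cov_int_slope)
    if v1 <= 0 or v2 <= 0:
        raise ValueError("variance components imply a non-positive genetic variance")
    return cov / math.sqrt(v1 * v2)


# Hypothetical variance components, chosen only to illustrate re-ranking.
r_extremes = rn_genetic_correlation(-3.0, 3.0, var_int=1.0, var_slope=0.1, cov_int_slope=0.0)
r_neighbours = rn_genetic_correlation(1.5, 3.0, var_int=1.0, var_slope=0.1, cov_int_slope=0.0)
print(f"r(-3, 3) = {r_extremes:.3f}, r(1.5, 3) = {r_neighbours:.3f}")
```

A correlation well below 1 between the extreme EC levels, as in this toy case, is what signals re-ranking of animals across environments.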
Affiliation(s)
- L F M Mota
- School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Via de Acesso Prof. Paulo Donato Castelane, 14884-900, Jaboticabal, Brazil
| | - G A Fernandes
- School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Via de Acesso Prof. Paulo Donato Castelane, 14884-900, Jaboticabal, Brazil
| | - A C Herrera
- School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Via de Acesso Prof. Paulo Donato Castelane, 14884-900, Jaboticabal, Brazil
| | - D C B Scalez
- School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Via de Acesso Prof. Paulo Donato Castelane, 14884-900, Jaboticabal, Brazil
| | - R Espigolan
- School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Via de Acesso Prof. Paulo Donato Castelane, 14884-900, Jaboticabal, Brazil
| | - A F B Magalhães
- School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Via de Acesso Prof. Paulo Donato Castelane, 14884-900, Jaboticabal, Brazil
| | - R Carvalheiro
- School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Via de Acesso Prof. Paulo Donato Castelane, 14884-900, Jaboticabal, Brazil; National Council for Science and Technological Development, 71605-001, Brasilia, Brazil
| | - F Baldi
- School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Via de Acesso Prof. Paulo Donato Castelane, 14884-900, Jaboticabal, Brazil
| | - L G Albuquerque
- School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Via de Acesso Prof. Paulo Donato Castelane, 14884-900, Jaboticabal, Brazil; National Council for Science and Technological Development, 71605-001, Brasilia, Brazil
| |
|
36
|
Xu Y, Liu X, Fu J, Wang H, Wang J, Huang C, Prasanna BM, Olsen MS, Wang G, Zhang A. Enhancing Genetic Gain through Genomic Selection: From Livestock to Plants. PLANT COMMUNICATIONS 2020; 1:100005. [PMID: 33404534 PMCID: PMC7747995 DOI: 10.1016/j.xplc.2019.100005] [Citation(s) in RCA: 88] [Impact Index Per Article: 22.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/28/2022]
Abstract
Although long-term genetic gain has been achieved through increasing use of modern breeding methods and technologies, the rate of genetic gain needs to be accelerated to meet humanity's demand for agricultural products. In this regard, genomic selection (GS) has been considered most promising for genetic improvement of the complex traits controlled by many genes each with minor effects. Livestock scientists pioneered GS application largely due to livestock's significantly higher individual values and the greater reduction in generation interval that can be achieved in GS. Large-scale application of GS in plants can be achieved by refining field management to improve heritability estimation and prediction accuracy and developing optimum GS models with the consideration of genotype-by-environment interaction and non-additive effects, along with significant cost reduction. Moreover, it would be more effective to integrate GS with other breeding tools and platforms for accelerating the breeding process and thereby further enhancing genetic gain. In addition, establishing an open-source breeding network and developing transdisciplinary approaches would be essential in enhancing breeding efficiency for small- and medium-sized enterprises and agricultural research systems in developing countries. New strategies centered on GS for enhancing genetic gain need to be developed.
Affiliation(s)
- Yunbi Xu
- Institute of Crop Science/CIMMYT-China, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- CIMMYT-China Tropical Maize Research Center, Foshan University, Foshan 528231, China
- CIMMYT-China Specialty Maize Research Center, Shanghai Academy of Agricultural Sciences, Shanghai 201400, China
| | - Xiaogang Liu
- Institute of Crop Science/CIMMYT-China, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Junjie Fu
- Institute of Crop Science/CIMMYT-China, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Hongwu Wang
- Institute of Crop Science/CIMMYT-China, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Jiankang Wang
- Institute of Crop Science/CIMMYT-China, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Changling Huang
- Institute of Crop Science/CIMMYT-China, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Boddupalli M. Prasanna
- CIMMYT (International Maize and Wheat Improvement Center), ICRAF Campus, United Nations Avenue, Nairobi, Kenya
| | - Michael S. Olsen
- CIMMYT (International Maize and Wheat Improvement Center), ICRAF Campus, United Nations Avenue, Nairobi, Kenya
| | - Guoying Wang
- Institute of Crop Science/CIMMYT-China, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Aimin Zhang
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
| |
|
37
|
The breeding structure for the small ruminant resources in India. Trop Anim Health Prod 2020; 52:1717-1724. [PMID: 31898023 DOI: 10.1007/s11250-019-02188-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2019] [Accepted: 12/22/2019] [Indexed: 10/25/2022]
Abstract
Intense selection for a few desired traits has reduced the effective population size (Ne) in most plant and livestock populations across the world. The objective of the research was to assess the impact of Ne on the genetic architecture of the population in simulated data with variable Ne for a general population under selection. In addition, the estimate of Ne and its ratio to the adult breeding population (NB) in the census data of small ruminants of India were investigated. Results indicated that the average inbreeding ([Formula: see text]) decreases as Ne increases; similarly, the increase in [Formula: see text] per generation was highest in the population with the lowest Ne. The correlation of estimated breeding value (EBV) with true breeding value (TBV) was not much affected by effective population size. The effective number of chromosome segments (Me) in the populations under selection was significantly affected by the magnitude of Ne, with a positive linear relation between Ne and Me. Results on livestock census data revealed that all the sheep and goat breeds have sufficiently large Ne based on derived and actual census data. The median ratio of effective population size to adult census size was 0.120 in sheep breeds and 0.131 in goat breeds. Karnah and Poonchi sheep share the status of endangered breeds because of their small number of breeding females and hence need attention for conservation. The Ne was large in sheep and goats owing to low selection pressure resulting from the low coverage of breed improvement programs, the availability of a large number of breeding males, and the absence of artificial insemination (AI) in the field flocks. The estimates of Ne and its ratio to the adult census size (NB) excluded several factors such as fluctuating population size and overlapping generations. The study calls for introspection by most industrial breeding programs on the issue of Ne for populations under selection. 
Similarly, in small ruminants, the large Ne indicates substantial genetic diversity and scope for improving productivity in the near future.
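The relation between Ne and inbreeding described above can be illustrated with two textbook formulas: Wright's sex-ratio formula Ne = 4·Nm·Nf / (Nm + Nf) and the expected rate of inbreeding ΔF = 1 / (2·Ne). The abstract does not state which estimator the authors used, so the choice of formulas, the function names, and the flock sizes below are assumptions; like the estimates in the abstract, this sketch ignores fluctuating population size and overlapping generations.

```python
def effective_size_sex_ratio(n_males: float, n_females: float) -> float:
    """Wright's formula for unequal numbers of breeding males and females."""
    if n_males <= 0 or n_females <= 0:
        raise ValueError("both sexes must have a positive number of breeders")
    return 4.0 * n_males * n_females / (n_males + n_females)


def inbreeding_rate(ne: float) -> float:
    """Expected increase in inbreeding per generation, dF = 1 / (2 * Ne)."""
    return 1.0 / (2.0 * ne)


def expected_inbreeding(ne: float, generations: int) -> float:
    """Expected inbreeding after t generations: F_t = 1 - (1 - dF) ** t."""
    return 1.0 - (1.0 - inbreeding_rate(ne)) ** generations


# Hypothetical flock: few breeding males, many breeding females.
ne = effective_size_sex_ratio(n_males=10, n_females=200)
ratio_to_census = ne / (10 + 200)
print(f"Ne = {ne:.1f}, Ne/N = {ratio_to_census:.3f}, dF = {inbreeding_rate(ne):.4f}")
```

Because the rarer sex dominates the formula, reducing the number of breeding males (for example through AI) lowers Ne sharply, which is consistent with the abstract's explanation for the large Ne in field flocks.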
|
38
|
Genomic prediction based on selected variants from imputed whole-genome sequence data in Australian sheep populations. Genet Sel Evol 2019; 51:72. [PMID: 31805849 PMCID: PMC6896509 DOI: 10.1186/s12711-019-0514-2] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2019] [Accepted: 11/25/2019] [Indexed: 12/13/2022] Open
Abstract
Background Whole-genome sequence (WGS) data could contain information on genetic variants at or in high linkage disequilibrium with causative mutations that underlie the genetic variation of polygenic traits. Thus far, genomic prediction accuracy has shown limited increase when using such information in dairy cattle studies, in which one or few breeds with limited diversity predominate. The objective of our study was to evaluate the accuracy of genomic prediction in a multi-breed Australian sheep population of relatively less related target individuals, when using information on imputed WGS genotypes. Methods Between 9626 and 26,657 animals with phenotypes were available for nine economically important sheep production traits and all had WGS imputed genotypes. About 30% of the data were used to discover predictive single nucleotide polymorphisms (SNPs) based on a genome-wide association study (GWAS) and the remaining data were used for training and validation of genomic prediction. Prediction accuracy using selected variants from imputed sequence data was compared to that using a standard array of 50k SNP genotypes, thereby comparing genomic best linear unbiased prediction (GBLUP) and Bayesian methods (BayesR/BayesRC). Accuracy of genomic prediction was evaluated in two independent populations that were each lowly related to the training set, one being purebred Merino and the other crossbred Border Leicester x Merino sheep. Results A substantial improvement in prediction accuracy was observed when selected sequence variants were fitted alongside 50k genotypes as a separate variance component in GBLUP (2GBLUP) or in Bayesian analysis as a separate category of SNPs (BayesRC). From an average accuracy of 0.27 in both validation sets for the 50k array, the average absolute increase in accuracy across traits with 2GBLUP was 0.083 and 0.073 for purebred and crossbred animals, respectively, whereas with BayesRC it was 0.102 and 0.087. 
The average gain in accuracy was smaller when selected sequence variants were treated in the same category as 50k SNPs. Very little improvement over 50k prediction was observed when using all WGS variants. Conclusions Accuracy of genomic prediction in diverse sheep populations increased substantially by using variants selected from whole-genome sequence data based on an independent multi-breed GWAS, when compared to genomic prediction using standard 50K genotypes.
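Fitting selected variants as a separate variance component, as in 2GBLUP, requires one genomic relationship matrix (GRM) per SNP set. The sketch below builds two GRMs with the commonly used VanRaden-type scaling, G = ZZ' / (2·Σ p(1 − p)). The abstract does not say how the GRMs were constructed, so this scaling, the helper names, and the toy genotypes are assumptions; the mixed-model fit itself is not shown.

```python
def take_columns(genotypes, columns):
    """Return the genotype matrix restricted to the given SNP columns."""
    return [[row[j] for j in columns] for row in genotypes]


def vanraden_grm(genotypes):
    """GRM from 0/1/2 genotypes: G = Z Z' / (2 * sum(p * (1 - p))).

    Z holds genotypes centred by twice the observed allele frequency.
    """
    n, m = len(genotypes), len(genotypes[0])
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all SNPs are monomorphic; GRM is undefined")
    z = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in genotypes]
    return [[sum(a * b for a, b in zip(zi, zk)) / scale for zk in z] for zi in z]


# Toy data: 4 animals x 6 SNPs; the first two SNPs play the role of selected variants.
geno = [
    [0, 1, 2, 0, 1, 2],
    [1, 1, 0, 2, 0, 1],
    [2, 0, 1, 1, 2, 0],
    [1, 2, 1, 0, 1, 1],
]
g_selected = vanraden_grm(take_columns(geno, [0, 1]))
g_remaining = vanraden_grm(take_columns(geno, [2, 3, 4, 5]))
```

In a two-component model, `g_selected` and `g_remaining` would each receive their own variance, so a small set of informative variants is not diluted by the much larger background set.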
|
39
|
Iqbal A, Choi TJ, Kim YS, Lee YM, Zahangir Alam M, Jung JH, Choe HS, Kim JJ. Comparison of genomic predictions for carcass and reproduction traits in Berkshire, Duroc and Yorkshire populations in Korea. ASIAN-AUSTRALASIAN JOURNAL OF ANIMAL SCIENCES 2019; 32:1657-1663. [PMID: 31480201 PMCID: PMC6817783 DOI: 10.5713/ajas.18.0672] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Received: 09/07/2018] [Accepted: 06/02/2019] [Indexed: 11/27/2022]
Abstract
Objective A genome-based best linear unbiased prediction (GBLUP) method was applied to evaluate accuracies of genomic estimated breeding value (GEBV) of carcass and reproductive traits in Berkshire, Duroc and Yorkshire populations in Korean swine breeding farms. Methods The data comprised a total of 1,870, 696, and 1,723 genotyped pigs belonging to Berkshire, Duroc and Yorkshire breeds, respectively. Reference populations for carcass traits consisted of 888 Berkshire, 466 Duroc, and 1,208 Yorkshire pigs, and those for reproductive traits comprised 210, 154, and 890 dams for the respective breeds. The carcass traits analyzed were backfat thickness (BFT) and carcass weight (CWT), and the reproductive traits were total number born (TNB) and number born alive (NBA). For each trait, GEBV accuracies were evaluated with a GEBV BLUP model and realized GEBVs. Results The accuracies under the GBLUP model for BFT and CWT ranged from 0.33 to 0.72 and from 0.33 to 0.63, respectively. For NBA and TNB, the model accuracies ranged from 0.32 to 0.54 and from 0.39 to 0.56, respectively. The realized accuracy estimates for BFT and CWT ranged from 0.30 to 0.46 and from 0.09 to 0.27, respectively, and from 0.50 to 0.70 and from 0.70 to 0.87 for NBA and TNB, respectively. For the carcass traits, the GEBV accuracies under the GBLUP model were higher than the realized GEBV accuracies across the breed populations, while for reproductive traits the realized accuracies were higher than the model-based GEBV accuracies. Conclusion The genomic prediction accuracy increased with reference population size and heritability of the trait. The GEBV accuracies were also influenced by the GEBV estimation method, such that careful selection of animals based on the estimated GEBVs is needed. GEBV accuracy will increase with a larger reference population, which would be more beneficial for traits with low heritability such as reproductive traits.
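The abstract does not spell out how the model-based accuracy was computed. A common convention, assumed here, derives it from the prediction error variance (PEV) of each GEBV as r = sqrt(1 − PEV / σ²g). The function name and the numbers are hypothetical.

```python
import math


def model_accuracy(pev: float, var_g: float) -> float:
    """Model-based GEBV accuracy from prediction error variance.

    r = sqrt(1 - PEV / var_g); PEV must lie between 0 and var_g.
    """
    if var_g <= 0 or not 0.0 <= pev <= var_g:
        raise ValueError("need var_g > 0 and 0 <= pev <= var_g")
    return math.sqrt(1.0 - pev / var_g)


# Hypothetical animal: additive genetic variance 1.0, PEV 0.75.
acc = model_accuracy(pev=0.75, var_g=1.0)
print(f"model-based accuracy = {acc:.2f}")
```

Under this convention, a larger reference population lowers PEV and therefore raises the model-based accuracy, which matches the direction of the conclusion above.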
Affiliation(s)
- Asif Iqbal
- Department of Biotechnology, Yeungnam University, Gyeongsan 38541, Korea
| | - Tae-Jeong Choi
- Swine Science Division, National Institute of Animal Science, RDA, Wanju 55365, Korea
| | - You-Sam Kim
- Department of Biotechnology, Yeungnam University, Gyeongsan 38541, Korea
| | - Yun-Mi Lee
- Department of Biotechnology, Yeungnam University, Gyeongsan 38541, Korea
| | - M Zahangir Alam
- Department of Biotechnology, Yeungnam University, Gyeongsan 38541, Korea
| | | | - Ho-Sung Choe
- Department of Animal Biotechnology, Chonbuk National University, Jeonju 54896, Korea
| | - Jong-Joo Kim
- Department of Biotechnology, Yeungnam University, Gyeongsan 38541, Korea
| |
|
40
|
Al Kalaldeh M, Gibson J, Duijvesteijn N, Daetwyler HD, MacLeod I, Moghaddar N, Lee SH, van der Werf JHJ. Using imputed whole-genome sequence data to improve the accuracy of genomic prediction for parasite resistance in Australian sheep. Genet Sel Evol 2019; 51:32. [PMID: 31242855 PMCID: PMC6595562 DOI: 10.1186/s12711-019-0476-4] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2018] [Accepted: 06/18/2019] [Indexed: 01/16/2023] Open
Abstract
Background This study aimed at (1) comparing the accuracies of genomic prediction for parasite resistance in sheep based on whole-genome sequence (WGS) data to those based on 50k and high-density (HD) single nucleotide polymorphism (SNP) panels; (2) investigating whether the use of variants within quantitative trait loci (QTL) regions that were selected from regional heritability mapping (RHM) in an independent dataset improved the accuracy more than variants selected from genome-wide association studies (GWAS); and (3) comparing the prediction accuracies between variants selected from WGS data to variants selected from the HD SNP panel. Results The accuracy of genomic prediction improved marginally from 0.16 ± 0.02 and 0.18 ± 0.01 when using all the variants from 50k and HD genotypes, respectively, to 0.19 ± 0.01 when using all the variants from WGS data. Fitting a GRM from the selected variants alongside a GRM from the 50k SNP genotypes improved the prediction accuracy substantially compared to fitting the 50k SNP genotypes alone. The gain in prediction accuracy was slightly more pronounced when variants were selected from WGS data compared to when variants were selected from the HD panel. When sequence variants that passed the GWAS -log10(p-value) threshold of 3 across the entire genome were selected, the prediction accuracy improved by 5% (up to 0.21 ± 0.01), whereas when selection was limited to sequence variants that passed the same GWAS -log10(p-value) threshold of 3 in regions identified by RHM, the accuracy improved by 9% (up to 0.25 ± 0.01). Conclusions Our results show that through careful selection of sequence variants from the QTL regions, the accuracy of genomic prediction for parasite resistance in sheep can be improved. These findings have important implications for genomic prediction in sheep.
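The two selection rules compared above (a genome-wide threshold versus the same threshold restricted to RHM regions) can be sketched as a simple filter. The record layout, the function name, the region coordinates, and the p-values are all hypothetical; "passed the threshold" is read here as -log10(p) >= 3, which is an assumption.

```python
import math


def select_variants(gwas_rows, threshold=3.0, regions=None):
    """Return IDs of variants with -log10(p) >= threshold.

    gwas_rows: iterable of (variant_id, chromosome, position, p_value), p_value > 0.
    regions:   optional list of (chromosome, start, end); when given, a variant
               must also fall inside at least one region.
    """
    selected = []
    for variant_id, chrom, pos, p_value in gwas_rows:
        if -math.log10(p_value) < threshold:
            continue
        if regions is not None and not any(
            chrom == r_chrom and start <= pos <= end
            for r_chrom, start, end in regions
        ):
            continue
        selected.append(variant_id)
    return selected


# Hypothetical GWAS results and one hypothetical RHM region.
gwas_rows = [
    ("v1", 6, 1_200_000, 1e-5),
    ("v2", 6, 1_250_000, 0.02),
    ("v3", 13, 40_000_000, 4e-4),
    ("v4", 2, 9_000_000, 0.5),
]
rhm_regions = [(6, 1_000_000, 2_000_000)]

genome_wide = select_variants(gwas_rows)
within_rhm = select_variants(gwas_rows, regions=rhm_regions)
```

The region-restricted set is always a subset of the genome-wide set, so any gain from it reflects better targeting rather than more variants.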
Affiliation(s)
- Mohammad Al Kalaldeh
- Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia; School of Environmental and Rural Science, University of New England, Armidale, NSW, 2351, Australia
| | - John Gibson
- Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia; School of Environmental and Rural Science, University of New England, Armidale, NSW, 2351, Australia
| | - Naomi Duijvesteijn
- Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia; School of Environmental and Rural Science, University of New England, Armidale, NSW, 2351, Australia
| | - Hans D Daetwyler
- Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia; Centre for AgriBioscience, Agriculture Victoria, Bundoora, VIC, 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, VIC, 3083, Australia
| | - Iona MacLeod
- Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia; Centre for AgriBioscience, Agriculture Victoria, Bundoora, VIC, 3083, Australia
| | - Nasir Moghaddar
- Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia; School of Environmental and Rural Science, University of New England, Armidale, NSW, 2351, Australia
| | - Sang Hong Lee
- Australian Centre for Precision Health, University of South Australia Cancer Research Institute, University of South Australia, Adelaide, SA, 5000, Australia
| | - Julius H J van der Werf
- Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia; School of Environmental and Rural Science, University of New England, Armidale, NSW, 2351, Australia
| |
|
41
|
Gowane GR, Lee SH, Clark S, Moghaddar N, Al-Mamun HA, van der Werf JHJ. Effect of selection and selective genotyping for creation of reference on bias and accuracy of genomic prediction. J Anim Breed Genet 2019; 136:390-407. [PMID: 31215699 DOI: 10.1111/jbg.12420] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2019] [Revised: 05/22/2019] [Accepted: 05/23/2019] [Indexed: 01/17/2023]
Abstract
Reference populations for genomic selection usually involve selected individuals, which may result in biased prediction of genomic estimated breeding values (GEBV). In a simulation study, bias and accuracy of GEBV were explored for various genetic models with individuals selectively genotyped in a typical nucleus breeding program. We compared the performance of three existing methods, that is, Best Linear Unbiased Prediction of breeding values using pedigree-based relationships (PBLUP), genomic relationships for genotyped animals only (GBLUP) and a Single-Step approach (SSGBLUP) using both. For a scenario with no selection and random mating (RR), prediction was unbiased. However, lower accuracy and bias were observed for scenarios with selection and random mating (SR) or selection and positive assortative mating (SA). As expected, bias disappeared when all individuals were genotyped and used in GBLUP. SSGBLUP showed higher accuracy compared to GBLUP, and bias of prediction was negligible with SR. However, PBLUP and SSGBLUP still showed bias in SA due to high inbreeding. SSGBLUP and PBLUP were unbiased provided that inbreeding was accounted for in the relationship matrices. Selective genotyping based on extreme phenotypic contrasts increased the prediction accuracy, but prediction was biased when using GBLUP. SSGBLUP could correct the bias while gaining higher accuracy than GBLUP. In a typical animal breeding program, where it is too expensive to genotype all animals, it would be appropriate to genotype phenotypically contrasting selection candidates and use a Single-Step approach to obtain accurate and unbiased prediction of GEBV.
Affiliation(s)
- Gopal R Gowane
- Animal Genetics & Breeding Division, ICAR-Central Sheep & Wool Research Institute, Avikanagar, India
| | - Sang Hong Lee
- Australian Centre for Precision Health, University of South Australia Cancer Research Institute, Adelaide, South Australia, Australia
| | - Sam Clark
- School of Environmental and Rural Sciences, University of New England, Armidale, New South Wales, Australia
| | - Nasir Moghaddar
- School of Environmental and Rural Sciences, University of New England, Armidale, New South Wales, Australia
| | | | - Julius H J van der Werf
- School of Environmental and Rural Sciences, University of New England, Armidale, New South Wales, Australia
| |
|
42
|
Ge T, Chen CY, Ni Y, Feng YCA, Smoller JW. Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat Commun 2019; 10:1776. [PMID: 30992449 PMCID: PMC6467998 DOI: 10.1038/s41467-019-09718-5] [Citation(s) in RCA: 788] [Impact Index Per Article: 157.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2018] [Accepted: 03/25/2019] [Indexed: 01/23/2023] Open
Abstract
Polygenic risk scores (PRS) have shown promise in predicting human complex traits and diseases. Here, we present PRS-CS, a polygenic prediction method that infers posterior effect sizes of single nucleotide polymorphisms (SNPs) using genome-wide association summary statistics and an external linkage disequilibrium (LD) reference panel. PRS-CS utilizes a high-dimensional Bayesian regression framework, and is distinct from previous work by placing a continuous shrinkage (CS) prior on SNP effect sizes, which is robust to varying genetic architectures, provides substantial computational advantages, and enables multivariate modeling of local LD patterns. Simulation studies using data from the UK Biobank show that PRS-CS outperforms existing methods across a wide range of genetic architectures, especially when the training sample size is large. We apply PRS-CS to predict six common complex diseases and six quantitative traits in the Partners HealthCare Biobank, and further demonstrate the improvement of PRS-CS in prediction accuracy over alternative methods.
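Once posterior effect sizes are available, the score for an individual is a weighted sum of allele dosages. The sketch below shows only that final scoring step; it does not implement the continuous-shrinkage inference of PRS-CS, and the SNP identifiers, effect sizes, and function name are hypothetical.

```python
def polygenic_score(dosages: dict, effects: dict) -> float:
    """Weighted sum of allele dosages (0, 1 or 2) using per-SNP effect sizes.

    SNPs that have an effect size but no genotype for this individual are skipped.
    """
    return sum(beta * dosages[snp] for snp, beta in effects.items() if snp in dosages)


# Hypothetical posterior effect sizes and one individual's dosages.
effects = {"rs_a": 0.10, "rs_b": -0.05, "rs_c": 0.20}
dosages = {"rs_a": 2, "rs_b": 1, "rs_c": 0}

score = polygenic_score(dosages, effects)
print(f"polygenic score = {score:.2f}")
```

Shrinkage methods differ in how the values in `effects` are obtained; the scoring step itself is the same across them.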
Affiliation(s)
- Tian Ge
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA
- Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, MA, 02114, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Chia-Yen Chen
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA
- Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, MA, 02114, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Analytic and Translational Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA
- Yang Ni
- Department of Statistics, Texas A&M University, College Station, TX, 77843, USA
- Yen-Chen Anne Feng
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA
- Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, MA, 02114, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Analytic and Translational Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA
- Jordan W Smoller
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA
- Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, MA, 02114, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
43
Mangin B, Rincent R, Rabier CE, Moreau L, Goudemand-Dugue E. Training set optimization of genomic prediction by means of EthAcc. PLoS One 2019; 14:e0205629. [PMID: 30779753 PMCID: PMC6380617 DOI: 10.1371/journal.pone.0205629] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 09/25/2018] [Accepted: 01/03/2019] [Indexed: 12/17/2022] Open
Abstract
Genomic prediction is a useful tool for plant and animal breeding programs and is starting to be used to predict human diseases as well. A shortcoming that slows down the genomic selection deployment is that the accuracy of the prediction is not known a priori. We propose EthAcc (Estimated THeoretical ACCuracy) as a method for estimating the accuracy given a training set that is genotyped and phenotyped. EthAcc is based on a causal quantitative trait loci model estimated by a genome-wide association study. This estimated causal model is crucial; therefore, we compared different methods to find the one yielding the best EthAcc. The multilocus mixed model was found to perform the best. We compared EthAcc to accuracy estimators that can be derived via a mixed marker model. We showed that EthAcc is the only approach to correctly estimate the accuracy. Moreover, in case of a structured population, in accordance with the achieved accuracy, EthAcc showed that the biggest training set is not always better than a smaller and closer training set. We then performed training set optimization with EthAcc and compared it to CDmean. EthAcc outperformed CDmean on real datasets from sugar beet, maize, and wheat. Nonetheless, its performance was mainly due to the use of an optimal but inaccessible set as a start of the optimization algorithm. EthAcc's precision and algorithm issues prevent it from reaching a good training set with a random start. Despite this drawback, we demonstrated that a substantial gain in accuracy can be obtained by performing training set optimization.
Affiliation(s)
- Brigitte Mangin
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
- Charles-Elie Rabier
- ISEM, Univ. Montpellier, CNRS, EPHE, IRD, Montpellier, France
- LIRMM, Univ. Montpellier, CNRS, Montpellier, France
- Laurence Moreau
- GQE-Le Moulon, INRA, Univ Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, Gif-sur-Yvette, France
44
Lee S, Dang C, Choy Y, Do C, Cho K, Kim J, Kim Y, Lee J. Comparison of genome-wide association and genomic prediction methods for milk production traits in Korean Holstein cattle. Asian-Australas J Anim Sci 2019; 32:913-921. [PMID: 30744323 PMCID: PMC6601072 DOI: 10.5713/ajas.18.0847] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 11/12/2018] [Accepted: 01/11/2019] [Indexed: 11/27/2022]
Abstract
OBJECTIVE: The objectives of this study were to compare informative regions identified through two genome-wide association study (GWAS) approaches and to determine the accuracy and bias of the direct genomic value (DGV) for milk production traits in Korean Holstein cattle, using two genomic prediction approaches: single-step genomic best linear unbiased prediction (ss-GBLUP) and Bayes-B. METHODS: Records on production traits such as adjusted 305-day milk (MY305), fat (FY305), and protein (PY305) yields were collected from 265,271 first parity cows. After quality control, 50,765 single-nucleotide polymorphism genotypes were available for analysis. In GWAS for ss-GBLUP (ssGWAS) and Bayes-B (BayesGWAS), the proportion of genetic variance for each 1-Mb genomic window was calculated and used to identify informative genomic regions. Accuracy of the DGV was estimated by a five-fold cross-validation with random clustering. As a measure of accuracy for DGV, we also assessed the correlation between DGV and deregressed estimated breeding value (DEBV). The bias of DGV for each method was obtained by determining regression coefficients. RESULTS: A total of nine and five significant windows (1 Mb) were identified for MY305 using ssGWAS and BayesGWAS, respectively. Using ssGWAS and BayesGWAS, we also detected multiple significant regions for FY305 (12 and 7) and PY305 (14 and 2), respectively. Single-step DGV and Bayes DGV both showed moderate accuracy ranges for MY305 (0.32 to 0.34), FY305 (0.37 to 0.39), and PY305 (0.35 to 0.36). The mean biases of DGVs determined using the single-step and Bayesian methods were 1.50±0.21 and 1.18±0.26 for MY305, 1.75±0.33 and 1.14±0.20 for FY305, and 1.59±0.20 and 1.14±0.15 for PY305, respectively. CONCLUSION: From the bias perspective, we believe that genomic selection based on the application of Bayesian approaches would be more suitable than application of ss-GBLUP in Korean Holstein populations.
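The validation scheme described in this abstract (five-fold cross-validation with random fold assignment, accuracy as a correlation, bias as a regression coefficient) can be sketched roughly as follows. This is an illustration only and not the authors' pipeline: ridge regression stands in for ss-GBLUP and Bayes-B, the genotypes and phenotypes are simulated, and all names (`ridge_fit`, `cross_validate`, `lam`) are hypothetical. The abstract does not say which variable is regressed on which, so the sketch assumes the common convention of regressing the observed value on the prediction, where a slope of 1 indicates no bias.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression (a stand-in for ss-GBLUP / Bayes-B).

    Returns the intercept and the vector of marker effects.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]),
                           Xc.T @ (y - y_mean))
    return y_mean - x_mean @ beta, beta

def cross_validate(X, y, k=5, lam=100.0, seed=0):
    """k-fold cross-validation with random fold assignment.

    For each fold, returns (accuracy, bias), where accuracy is the Pearson
    correlation between predicted and observed values, and bias is the slope
    of the regression of observed on predicted values (assumed convention).
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    results = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        intercept, beta = ridge_fit(X[train_idx], y[train_idx], lam)
        predicted = intercept + X[test_idx] @ beta
        accuracy = np.corrcoef(predicted, y[test_idx])[0, 1]
        slope = np.polyfit(predicted, y[test_idx], 1)[0]
        results.append((accuracy, slope))
    return results

# Simulated data (hypothetical sizes): 500 animals, 200 markers coded 0/1/2.
rng = np.random.default_rng(1)
n_animals, n_markers = 500, 200
X = rng.integers(0, 3, size=(n_animals, n_markers)).astype(float)
true_effects = rng.normal(0.0, 0.1, n_markers)
y = X @ true_effects + rng.normal(0.0, 1.0, n_animals)

results = cross_validate(X, y)
for fold, (accuracy, slope) in enumerate(results, start=1):
    print(f"fold {fold}: accuracy={accuracy:.2f}, bias (slope)={slope:.2f}")
```

Under this convention, a slope above 1 means the predictions are less dispersed than the observed values, and a slope below 1 means they are more dispersed.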
Affiliation(s)
- SeokHyun Lee
- Animal Breeding and Genetics Division, National Institute of Animal Science, RDA, Cheonan 31000, Korea
- ChangGwon Dang
- Animal Breeding and Genetics Division, National Institute of Animal Science, RDA, Cheonan 31000, Korea
- YunHo Choy
- Animal Breeding and Genetics Division, National Institute of Animal Science, RDA, Cheonan 31000, Korea
- ChangHee Do
- Division of Animal and Dairy Science, Chungnam National University, Daejeon 34134, Korea
- Kwanghyun Cho
- Department of Dairy Science, Korea National College of Agriculture and Fisheries, Jeonju 54874, Korea
- Jongjoo Kim
- Division of Applied Life Science, Yeungnam University, Gyeongsan 38541, Korea
- Yousam Kim
- Division of Applied Life Science, Yeungnam University, Gyeongsan 38541, Korea
- Jungjae Lee
- Jun P&C Institute, INC., Yongin 16950, Korea
45
van den Berg I, Meuwissen THE, MacLeod IM, Goddard ME. Predicting the effect of reference population on the accuracy of within, across, and multibreed genomic prediction. J Dairy Sci 2019; 102:3155-3174. [PMID: 30738664 DOI: 10.3168/jds.2018-15231] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 06/18/2018] [Accepted: 12/08/2018] [Indexed: 01/24/2023]
Abstract
Genomic prediction is widely used to select candidates for breeding. Size and composition of the reference population are important factors influencing prediction accuracy. In Holstein dairy cattle, large reference populations are used, but this is difficult to achieve in numerically small breeds and for traits that are not routinely recorded. The prediction accuracy is usually estimated using cross-validation, requiring the full data set. It would be useful to have a method to predict the benefit of multibreed reference populations that does not require the availability of the full data set. Our objective was to study the effect of the size and breed composition of the reference population on the accuracy of genomic prediction using genomic BLUP and Bayes R. We also examined the effect of trait heritability and validation breed on prediction accuracy. Using these empirical results, we investigated the use of a formula to predict the effect of the size and composition of the reference population on the accuracy of genomic prediction. Phenotypes were simulated in a data set containing real genotypes of imputed sequence variants for 22,752 dairy bulls and cows, including Holstein, Jersey, Red Holstein, and Australian Red cattle. Different reference populations were constructed, varying in size and composition, to study within-breed, multibreed, and across-breed prediction. Phenotypes were simulated varying in heritability, number of chromosomes, and number of quantitative trait loci. Genomic prediction was carried out using genomic BLUP and Bayes R. We used either the genomic relationship matrix (GRM) to estimate the number of independent chromosomal segments and subsequently to predict accuracy, or the accuracies obtained from single-breed reference populations to predict the accuracies of larger or multibreed reference populations. Using the GRM overestimated the accuracy; this overestimation was likely due to close relationships among some of the reference animals. Consequently, the GRM could not be used to predict the accuracy of genomic prediction reliably. However, a method based on the prediction accuracies obtained by cross-validation with a small, single-breed reference population slightly overestimated the accuracy for a larger reference population of the same breed but gave a reasonably close estimate of the accuracy for a multibreed reference population. This method could be useful for making decisions regarding the size and composition of the reference population.
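The abstract does not name the formula it evaluates. As a rough illustration of the general idea, the sketch below assumes the widely used deterministic expression attributed to Daetwyler et al., r = sqrt(N h² / (N h² + Me)), where N is the reference population size, h² the heritability, and Me the number of independent chromosomal segments. It first backs out Me from an accuracy observed with a small reference population, then predicts the accuracy for a larger one. The function names and all input values are hypothetical and are not taken from the paper.

```python
import math

def predicted_accuracy(n_ref, h2, m_e):
    """Assumed formula: r = sqrt(N * h2 / (N * h2 + Me))."""
    return math.sqrt(n_ref * h2 / (n_ref * h2 + m_e))

def implied_segments(accuracy, n_ref, h2):
    """Invert the formula above: Me = N * h2 * (1 - r^2) / r^2."""
    r2 = accuracy ** 2
    return n_ref * h2 * (1.0 - r2) / r2

# Hypothetical inputs: cross-validation accuracy of 0.40 observed with a
# reference population of 1000 animals for a trait with heritability 0.3.
m_e = implied_segments(accuracy=0.40, n_ref=1000, h2=0.3)

# Predicted accuracy if the reference population grew to 5000 animals,
# holding Me and h2 fixed (an assumption of this sketch).
larger = predicted_accuracy(n_ref=5000, h2=0.3, m_e=m_e)

print(f"implied Me: {m_e:.0f}")
print(f"predicted accuracy at N=5000: {larger:.2f}")
```

Holding Me fixed while N grows is reasonable only within the same breed; a multibreed reference population would generally call for a different effective Me.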
Affiliation(s)
- I van den Berg
- Faculty of Veterinary & Agricultural Science, University of Melbourne, 3010 Parkville, Victoria, Australia; Agriculture Victoria, AgriBio, Centre for AgriBioscience, 3083 Bundoora, Victoria, Australia
- T H E Meuwissen
- Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, 1432 Ås, Norway
- I M MacLeod
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 3083 Bundoora, Victoria, Australia
- M E Goddard
- Faculty of Veterinary & Agricultural Science, University of Melbourne, 3010 Parkville, Victoria, Australia; Agriculture Victoria, AgriBio, Centre for AgriBioscience, 3083 Bundoora, Victoria, Australia
46
Poppe M, Mulder H, Ducro B, de Jong G. Genetic analysis of udder conformation traits derived from automatic milking system recording in dairy cows. J Dairy Sci 2019; 102:1386-1396. [DOI: 10.3168/jds.2018-14838] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/28/2018] [Accepted: 10/24/2018] [Indexed: 11/19/2022]
47
Nishino J, Ochi H, Kochi Y, Tsunoda T, Matsui S. Sample Size for Successful Genome-Wide Association Study of Major Depressive Disorder. Front Genet 2018; 9:227. [PMID: 30002671 PMCID: PMC6032046 DOI: 10.3389/fgene.2018.00227] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 07/26/2017] [Accepted: 06/07/2018] [Indexed: 12/29/2022] Open
Abstract
Major depressive disorder (MDD) is a complex, heritable psychiatric disorder. Advanced statistical genetics for genome-wide association studies (GWASs) have suggested that the heritability of MDD is largely explained by common single nucleotide polymorphisms (SNPs). However, until recently, there has been little success in identifying MDD-associated SNPs. Here, based on an empirical Bayes estimation of a semi-parametric hierarchical mixture model using summary statistics from GWASs, we show that MDD has a distinctive polygenic architecture consisting of a relatively small number of risk variants (~17%), e.g., compared to schizophrenia (~42%). In addition, these risk variants were estimated to have very small effects (genotypic odds ratio ≤ 1.04 under the additive model). Based on the estimated architecture, the required sample size for detecting significant SNPs in a future GWAS was predicted to be exceptionally large. It is noteworthy that the number of genome-wide significant MDD-associated SNPs would rapidly increase when collecting 50,000 or more MDD-cases (and the same number of controls); it can reach as much as 100 SNPs out of nearly independent (linkage disequilibrium pruned) 100,000 SNPs for ~120,000 MDD-cases.
Affiliation(s)
- Jo Nishino
- Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
- CREST, JST, Tokyo, Japan
- Hidenori Ochi
- CREST, JST, Tokyo, Japan
- Division of Frontier Medical Science, Programs for Biomedical Research Graduate School of Biomedical Science, Department of Gastroenterology and Metabolism, Hiroshima University, Hiroshima, Japan
- Laboratory for Digestive Diseases, RIKEN Center for Integrative Medical Sciences, Hiroshima, Japan
- Yuta Kochi
- CREST, JST, Tokyo, Japan
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Tatsuhiko Tsunoda
- Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
- CREST, JST, Tokyo, Japan
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Risk Analysis Research Center, The Institute of Statistical Mathematics, Tachikawa, Tokyo, Japan
- Shigeyuki Matsui
- CREST, JST, Tokyo, Japan
- Risk Analysis Research Center, The Institute of Statistical Mathematics, Tachikawa, Tokyo, Japan
- Department of Biostatistics, Nagoya University Graduate School of Medicine, Nagoya, Japan
48
Turley P, Walters RK, Maghzian O, Okbay A, Lee JJ, Fontana MA, Nguyen-Viet TA, Wedow R, Zacher M, Furlotte NA, Magnusson P, Oskarsson S, Johannesson M, Visscher PM, Laibson D, Cesarini D, Neale BM, Benjamin DJ. Multi-trait analysis of genome-wide association summary statistics using MTAG. Nat Genet 2018; 50:229-237. [PMID: 29292387 PMCID: PMC5805593 DOI: 10.1038/s41588-017-0009-4] [Citation(s) in RCA: 556] [Impact Index Per Article: 92.7] [Received: 03/20/2017] [Accepted: 11/06/2017] [Indexed: 12/28/2022]
Abstract
We introduce multi-trait analysis of GWAS (MTAG), a method for joint analysis of summary statistics from genome-wide association studies (GWAS) of different traits, possibly from overlapping samples. We apply MTAG to summary statistics for depressive symptoms (N_eff = 354,862), neuroticism (N = 168,105), and subjective well-being (N = 388,538). As compared to the 32, 9, and 13 genome-wide significant loci identified in the single-trait GWAS (most of which are themselves novel), MTAG increases the number of associated loci to 64, 37, and 49, respectively. Moreover, association statistics from MTAG yield more informative bioinformatics analyses and increase the variance explained by polygenic scores by approximately 25%, matching theoretical expectations.
Affiliation(s)
- Patrick Turley
- Broad Institute, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Cambridge, MA, USA
- Raymond K Walters
- Broad Institute, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Cambridge, MA, USA
- Omeed Maghzian
- Department of Economics, Harvard University, Cambridge, MA, USA
- Aysu Okbay
- Department of Complex Trait Genetics, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- James J Lee
- Department of Psychology, University of Minnesota, Minneapolis, MN, USA
- Tuan Anh Nguyen-Viet
- Center for Economic and Social Research, University of Southern California, Los Angeles, CA, USA
- Robbee Wedow
- Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, CO, USA
- Institute of Behavioral Science, University of Colorado Boulder, Boulder, CO, USA
- Department of Sociology, University of Colorado Boulder, Boulder, CO, USA
- Meghan Zacher
- Department of Sociology, Harvard University, Cambridge, MA, USA
- Patrik Magnusson
- Institutionen för Medicinsk Epidemiologi och Biostatistik, Karolinska Institutet, Stockholm, Sweden
- Sven Oskarsson
- Department of Government, Uppsala Universitet, Uppsala, Sweden
- Magnus Johannesson
- Department of Economics, Stockholm School of Economics, Stockholm, Sweden
- Peter M Visscher
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia
- Queensland Brain Institute, University of Queensland, Brisbane, Queensland, Australia
- David Laibson
- Department of Economics, Harvard University, Cambridge, MA, USA
- National Bureau of Economic Research, Cambridge, MA, USA
- David Cesarini
- National Bureau of Economic Research, Cambridge, MA, USA
- Department of Economics and Center for Experimental Social Science, New York University, New York, NY, USA
- Institutet för Näringslivsforskning, Stockholm, Sweden
- Benjamin M Neale
- Broad Institute, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Cambridge, MA, USA
- Daniel J Benjamin
- Center for Economic and Social Research, University of Southern California, Los Angeles, CA, USA
- National Bureau of Economic Research, Cambridge, MA, USA
- Department of Economics, University of Southern California, Los Angeles, CA, USA
49
Zeng P, Zhou X. Non-parametric genetic prediction of complex traits with latent Dirichlet process regression models. Nat Commun 2017; 8:456. [PMID: 28878256 PMCID: PMC5587666 DOI: 10.1038/s41467-017-00470-2] [Citation(s) in RCA: 71] [Impact Index Per Article: 10.1] [Received: 11/26/2016] [Accepted: 06/30/2017] [Indexed: 01/03/2023] Open
Abstract
Using genotype data to perform accurate genetic prediction of complex traits can facilitate genomic selection in animal and plant breeding programs, and can aid in the development of personalized medicine in humans. Because most complex traits have a polygenic architecture, accurate genetic prediction often requires modeling all genetic variants together via polygenic methods. Here, we develop such a polygenic method, which we refer to as the latent Dirichlet process regression model. Dirichlet process regression is non-parametric in nature, relies on the Dirichlet process to flexibly and adaptively model the effect size distribution, and thus enjoys robust prediction performance across a broad spectrum of genetic architectures. We compare Dirichlet process regression with several commonly used prediction methods in simulations. We further apply Dirichlet process regression to predict gene expression, to conduct a PrediXcan-based gene set test, to perform genomic selection of four traits in two species, and to predict eight complex traits in a human cohort. Genetic prediction of complex traits with polygenic architecture has wide application from animal breeding to disease prevention. Here, Zeng and Zhou develop a non-parametric genetic prediction method based on latent Dirichlet process regression models.