1
Joukhadar R, Li Y, Thistlethwaite R, Forrest KL, Tibbits JF, Trethowan R, Hayden MJ. Optimising desired gain indices to maximise selection response. FRONTIERS IN PLANT SCIENCE 2024; 15:1337388. [PMID: 38978519 PMCID: PMC11228337 DOI: 10.3389/fpls.2024.1337388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Accepted: 05/23/2024] [Indexed: 07/10/2024]
Abstract
Introduction: In plant breeding, we often aim to improve multiple traits at once. However, without knowing the economic value of each trait, it is hard to decide which traits to focus on. This is where "desired gain selection indices" come in handy, which can yield optimal gains in each trait based on the breeder's prioritisation of desired improvements when economic weights are not available. However, they lack the ability to maximise the selection response and determine the correlation between the index and net genetic merit.
Methods: Here, we report the development of an iterative desired gain selection index method that optimises the sampling of the desired gain values to achieve a targeted or a user-specified selection response for multiple traits. This targeted selection response can be constrained or unconstrained for either a subset or all the studied traits.
Results: We tested the method using genomic estimated breeding values (GEBVs) for seven traits in a bread wheat (Triticum aestivum) reference breeding population comprising 3,331 lines and achieved prediction accuracies ranging between 0.29 and 0.47 across the seven traits. The indices were validated using 3,005 double haploid lines that were derived from crosses between parents selected from the reference population. We tested three user-specified response scenarios: a constrained equal weight (INDEX1), a constrained yield dominant weight (INDEX2), and an unconstrained weight (INDEX3). Our method achieved an equivalent response to the user-specified selection response when constraining a set of traits, and this response was much better than the response of the traditional desired gain selection indices method without iteration. Interestingly, when using unconstrained weight, our iterative method maximised the selection response and shifted the average GEBVs of the selection candidates towards the desired direction.
Discussion: Our results show that the method is an optimal choice not only when economic weights are unavailable, but also when constraining the selection response is an unfavourable option.
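For orientation, the classical non-iterative desired gain index can be sketched as follows. This is not the authors' iterative method, which the abstract does not spell out; it assumes the textbook form in which the index weights b solve G b = d, where G is the genetic variance-covariance matrix and d is the vector of desired gains. The matrices G and P, the vector d, and all function names are hypothetical and chosen only for illustration.

```python
from math import sqrt

def solve(A, y):
    """Solve A x = y by Gauss-Jordan elimination with partial pivoting."""
    n = len(y)
    M = [list(map(float, row)) + [float(v)] for row, v in zip(A, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def matvec(A, x):
    """Matrix-vector product for lists of lists."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def desired_gain_weights(G, d):
    """Index weights b such that G b = d (assumed textbook form)."""
    return solve(G, d)

def expected_response(G, P, b, intensity=1.0):
    """Expected per-trait response to selection on the index I = b'x,
    where P is the variance-covariance matrix of the records x."""
    sigma_index = sqrt(sum(bi * pi for bi, pi in zip(b, matvec(P, b))))
    return [intensity * g / sigma_index for g in matvec(G, b)]

# Hypothetical two-trait example (values invented for illustration).
G = [[1.0, 0.3], [0.3, 0.5]]   # genetic (co)variances
P = [[2.0, 0.4], [0.4, 1.5]]   # (co)variances of the records in the index
d = [2.0, 1.0]                 # desired gains: trait 1 should gain twice as much

b = desired_gain_weights(G, d)
response = expected_response(G, P, b)
```

Under these assumptions the expected response is proportional to d: the ratio between trait gains is fixed by the breeder, but the overall magnitude is not optimised, which is the limitation the abstract refers to.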
Affiliation(s)
- Reem Joukhadar
  - Agriculture Victoria, Centre for AgriBioscience, AgriBio, Bundoora, VIC, Australia
- Yongjun Li
  - Agriculture Victoria, Centre for AgriBioscience, AgriBio, Bundoora, VIC, Australia
- Rebecca Thistlethwaite
  - School of Life and Environmental Sciences, Plant Breeding Institute, Sydney Institute of Agriculture, The University of Sydney, Narrabri, NSW, Australia
- Kerrie L. Forrest
  - Agriculture Victoria, Centre for AgriBioscience, AgriBio, Bundoora, VIC, Australia
- Josquin F. Tibbits
  - Agriculture Victoria, Centre for AgriBioscience, AgriBio, Bundoora, VIC, Australia
- Richard Trethowan
  - School of Life and Environmental Sciences, Plant Breeding Institute, Sydney Institute of Agriculture, The University of Sydney, Narrabri, NSW, Australia
  - School of Life and Environmental Sciences, Plant Breeding Institute, Sydney Institute of Agriculture, The University of Sydney, Cobbitty, NSW, Australia
- Matthew J. Hayden
  - Agriculture Victoria, Centre for AgriBioscience, AgriBio, Bundoora, VIC, Australia
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC, Australia
2
Gebremedhin A, Li Y, Shunmugam ASK, Sudheesh S, Valipour-Kahrood H, Hayden MJ, Rosewarne GM, Kaur S. Genomic selection for target traits in the Australian lentil breeding program. FRONTIERS IN PLANT SCIENCE 2024; 14:1284781. [PMID: 38235201 PMCID: PMC10791954 DOI: 10.3389/fpls.2023.1284781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 12/07/2023] [Indexed: 01/19/2024]
Abstract
Genomic selection (GS) uses associations between markers and phenotypes to predict the breeding values of individuals. It can be applied early in the breeding cycle to reduce the cross-to-cross generation interval and thereby increase genetic gain per unit of time. The development of cost-effective, high-throughput genotyping platforms has revolutionized plant breeding programs by enabling the implementation of GS at the scale required to achieve impact. As a result, GS is becoming routine in plant breeding, even in minor crops such as pulses. Here we examined 2,081 breeding lines from Agriculture Victoria's national lentil breeding program for a range of target traits including grain yield, ascochyta blight resistance, botrytis grey mould resistance, salinity and boron stress tolerance, 100-grain weight, seed size index and protein content. A broad range of narrow-sense heritabilities was observed across these traits (0.24-0.66). Genomic prediction models were developed based on 64,781 genome-wide SNPs using Bayesian methodology and genomic estimated breeding values (GEBVs) were calculated. Forward cross-validation was applied to examine the prediction accuracy of GS for these targeted traits. The accuracy of GEBVs was consistently higher (0.34-0.83) than BLUP estimated breeding values (EBVs) (0.22-0.54), indicating a higher expected rate of genetic gain with GS. GS-led parental selection using early generation breeding materials also resulted in higher genetic gain compared to BLUP-based selection performed using later generation breeding lines. Our results show that implementing GS in lentil breeding will fast track the development of high-yielding cultivars with increased resistance to biotic and abiotic stresses, as well as improved seed quality traits.
Affiliation(s)
- Alem Gebremedhin
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
- Yongjun Li
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
- Shimna Sudheesh
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
- Matthew J. Hayden
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC, Australia
- Sukhjiwan Kaur
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC, Australia
3
Xiang R, Fang L, Liu S, Macleod IM, Liu Z, Breen EJ, Gao Y, Liu GE, Tenesa A, Mason BA, Chamberlain AJ, Wray NR, Goddard ME. Gene expression and RNA splicing explain large proportions of the heritability for complex traits in cattle. CELL GENOMICS 2023; 3:100385. [PMID: 37868035 PMCID: PMC10589627 DOI: 10.1016/j.xgen.2023.100385] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 06/08/2022] [Revised: 08/10/2022] [Accepted: 07/26/2023] [Indexed: 10/24/2023]
Abstract
Many quantitative trait loci (QTLs) are in non-coding regions. Therefore, QTLs are assumed to affect gene regulation. Gene expression and RNA splicing are primary steps of transcription, so DNA variants changing gene expression (eVariants) or RNA splicing (sVariants) are expected to significantly affect phenotypes. We quantify the contribution of eVariants and sVariants detected from 16 tissues (n = 4,725) to 37 traits of ∼120,000 cattle (average magnitude of genetic correlation between traits = 0.13). Analyzed in Bayesian mixture models, averaged across 37 traits, cis and trans eVariants and sVariants detected from 16 tissues jointly explain 69.2% (SE = 0.5%) of heritability, 44% more than expected from the same number of random variants. This 69.2% includes an average of 24% from trans e-/sVariants (14% more than expected). Averaged across 56 lipidomic traits, multi-tissue cis and trans e-/sVariants also explain 71.5% (SE = 0.3%) of heritability, demonstrating the essential role of proximal and distal regulatory variants in shaping mammalian phenotypes.
Affiliation(s)
- Ruidong Xiang
  - Faculty of Veterinary & Agricultural Science, the University of Melbourne, Parkville, VIC 3052, Australia
  - Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, VIC 3083, Australia
  - Cambridge-Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia
- Lingzhao Fang
  - MRC Human Genetics Unit at the Institute of Genetics and Cancer, the University of Edinburgh, Edinburgh, UK
  - Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus, Denmark
- Shuli Liu
  - Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang 310024, China
- Iona M. Macleod
  - Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, VIC 3083, Australia
- Zhiqian Liu
  - Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, VIC 3083, Australia
- Edmond J. Breen
  - Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, VIC 3083, Australia
- Yahui Gao
  - Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
- George E. Liu
  - Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
- Albert Tenesa
  - MRC Human Genetics Unit at the Institute of Genetics and Cancer, the University of Edinburgh, Edinburgh, UK
  - The Roslin Institute, Royal (Dick) School of Veterinary Studies, the University of Edinburgh, Midlothian EH25 9RG, UK
- CattleGTEx Consortium
  - Faculty of Veterinary & Agricultural Science, the University of Melbourne, Parkville, VIC 3052, Australia
  - Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, VIC 3083, Australia
  - Cambridge-Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia
  - MRC Human Genetics Unit at the Institute of Genetics and Cancer, the University of Edinburgh, Edinburgh, UK
  - Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus, Denmark
  - Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang 310024, China
  - Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
  - The Roslin Institute, Royal (Dick) School of Veterinary Studies, the University of Edinburgh, Midlothian EH25 9RG, UK
  - Institute for Molecular Bioscience, the University of Queensland, Brisbane, QLD 4072, Australia
  - Queensland Brain Institute, the University of Queensland, Brisbane, QLD 4072, Australia
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC 3083, Australia
- Brett A. Mason
  - Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, VIC 3083, Australia
- Amanda J. Chamberlain
  - Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, VIC 3083, Australia
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC 3083, Australia
- Naomi R. Wray
  - Institute for Molecular Bioscience, the University of Queensland, Brisbane, QLD 4072, Australia
  - Queensland Brain Institute, the University of Queensland, Brisbane, QLD 4072, Australia
- Michael E. Goddard
  - Faculty of Veterinary & Agricultural Science, the University of Melbourne, Parkville, VIC 3052, Australia
  - Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, VIC 3083, Australia
4
Zhao T, Cheng H. Interpreting single-step genomic evaluation as a neural network of three layers: pedigree, genotypes, and phenotypes. Genet Sel Evol 2023; 55:68. [PMID: 37789273 PMCID: PMC10546757 DOI: 10.1186/s12711-023-00838-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Accepted: 09/08/2023] [Indexed: 10/05/2023]
Abstract
The single-step approach has become the most widely used methodology for genomic evaluations when only a subset of phenotyped individuals in the pedigree are genotyped, where the genotypes for non-genotyped individuals are imputed based on gene contents (i.e., genotypes) of genotyped individuals through their pedigree relationships. We proposed a new method named single-step neural network with mixed models (NNMM) to represent single-step genomic evaluations as a neural network of three sequential layers: pedigree, genotypes, and phenotypes. These three sequential layers of information create a unified network instead of two separate steps, allowing the unobserved gene contents of non-genotyped individuals to be sampled based on pedigree, observed genotypes of genotyped individuals, and phenotypes. In addition to imputation of genotypes using all three sources of information, including phenotypes, genotypes, and pedigree, single-step NNMM provides a more flexible framework to allow nonlinear relationships between genotypes and phenotypes, and for individuals to be genotyped with different single-nucleotide polymorphism (SNP) panels. The single-step NNMM has been implemented in the software package "JWAS".
Affiliation(s)
- Tianjing Zhao
  - Department of Animal Science, University of California Davis, Davis, CA, 95616, USA
  - Integrative Genetics and Genomics Graduate Group, University of California Davis, Davis, CA, 95616, USA
- Hao Cheng
  - Department of Animal Science, University of California Davis, Davis, CA, 95616, USA
5
An Improved Bayesian Shrinkage Regression Algorithm for Genomic Selection. Genes (Basel) 2022; 13:genes13122193. [PMID: 36553460 PMCID: PMC9778053 DOI: 10.3390/genes13122193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Revised: 11/14/2022] [Accepted: 11/18/2022] [Indexed: 11/25/2022]
Abstract
Currently a hot topic, genomic selection (GS) has consistently provided powerful support for breeding studies and achieved more comprehensive and reliable selection in animal and plant breeding. GS estimates the effects of all single nucleotide polymorphisms (SNPs) and thereby predicts the genomic estimated breeding value (GEBV), accelerating breeding progress and overcoming the limitations of conventional breeding. The successful application of GS primarily depends on the accuracy of the GEBV. Adopting appropriate advanced algorithms to improve the accuracy of the GEBV is time-saving and efficient for breeders, and the available algorithms can be further improved in the big data era. In this study, we develop a new algorithm under the Bayesian Shrinkage Regression (BSR, also called BayesA) framework: an improved expectation-maximization algorithm for BayesA (emBAI). The emBAI algorithm first corrects the polygenic and environmental noise and then calculates the GEBV by emBayesA. We conduct two simulation experiments and a real dataset analysis for flowering time-related Arabidopsis phenotypes to validate the new algorithm. Compared to established methods, emBAI is more powerful in terms of prediction accuracy, mean square error (MSE), mean absolute error (MAE), the area under the receiver operating characteristic curve (AUC) and correlation of prediction in simulation studies. In addition, emBAI performs well under the increasing genetic background. The analysis of the Arabidopsis real dataset further illustrates the benefits of emBAI for genomic prediction according to prediction accuracy, MSE, MAE and correlation of prediction. Furthermore, the new method shows advantages in detecting significant loci and estimating effect coefficients, which are confirmed by The Arabidopsis Information Resource (TAIR) gene bank. In conclusion, the emBAI algorithm provides powerful support for GS in high-dimensional genomic datasets.