1
Mascher M, Jayakodi M, Shim H, Stein N. Promises and challenges of crop translational genomics. Nature 2024:10.1038/s41586-024-07713-5. [PMID: 39313530 PMCID: PMC7616746 DOI: 10.1038/s41586-024-07713-5] [Received: 03/27/2023] [Accepted: 06/13/2024] [Indexed: 09/25/2024]
Abstract
Crop translational genomics applies breeding techniques based on genomic datasets to improve crops. Technological breakthroughs in the past ten years have made it possible to sequence the genomes of increasing numbers of crop varieties and have assisted in the genetic dissection of crop performance. However, translating research findings to breeding applications remains challenging. Here we review recent progress and future prospects for crop translational genomics in bringing results from the laboratory to the field. Genetic mapping, genomic selection and sequence-assisted characterization and deployment of plant genetic resources utilize rapid genotyping of large populations. These approaches have all had an impact on breeding for qualitative traits, where single genes with large phenotypic effects exert their influence. Characterization of the complex genetic architectures that underlie quantitative traits such as yield and flowering time, especially in newly domesticated crops, will require further basic research, including research into regulation and interactions of genes and the integration of genomic approaches and high-throughput phenotyping, before targeted interventions can be designed. Future priorities for translation include supporting genomics-assisted breeding in low-income countries and adaptation of crops to changing environments.
Affiliation(s)
- Martin Mascher
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
- Murukarthick Jayakodi
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Hyeonah Shim
  - Department of Agriculture, Forestry and Bioresources, Plant Genomics and Breeding Institute, Research Institute of Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, Korea
- Nils Stein
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
  - Martin Luther University Halle-Wittenberg, Halle, Germany
2
Lee AMJ, Foong MYM, Song BK, Chew FT. Genomic selection for crop improvement in fruits and vegetables: a systematic scoping review. Molecular Breeding: New Strategies in Plant Improvement 2024; 44:60. [PMID: 39267903 PMCID: PMC11391014 DOI: 10.1007/s11032-024-01497-2] [Received: 05/08/2024] [Accepted: 09/01/2024] [Indexed: 09/15/2024]
Abstract
To meet the nutritional needs of an expanding global population, it is crucial to optimize the growing capabilities and breeding values of fruit and vegetable crops. While genomic selection, initially implemented in animal breeding, holds tremendous potential, its utilization in fruit and vegetable crops remains underexplored. In this systematic review, we reviewed 63 articles covering genomic selection and its applications across 25 different types of fruit and vegetable crops over the last decade. The traits examined were directly related to the edible parts of the crops and carried significant economic importance. Comparative analysis with WHO/FAO data identified potential economic drivers underlying the study focus of some crops and highlighted crops with potential for further genomic selection research and application. Factors affecting genomic selection accuracy in fruit and vegetable studies are discussed, and suggestions are made to assist in their implementation into plant breeding schemes. Genetic gain in fruits and vegetables can be improved by utilizing genomic selection to improve selection intensity, accuracy, and integration of genetic variation. However, the reduction of breeding cycle times may not be as beneficial in crops with shorter life cycles, such as leafy greens, as in fruit trees. There is an urgent need to integrate genomic selection methods into ongoing breeding programs and assess the actual genomic estimated breeding values of progeny resulting from these breeding programs against the prediction models.
Supplementary Information: The online version contains supplementary material available at 10.1007/s11032-024-01497-2.
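The levers named in this abstract (selection intensity, accuracy, genetic variation and cycle time) are conventionally tied together by the breeder's equation. The form below is the usual textbook one and is offered only as a reading aid; the abstract itself does not state it. The symbols are assumptions of this sketch: $\Delta G$ is genetic gain per unit time, $i$ the selection intensity, $r$ the accuracy of selection, $\sigma_A$ the additive genetic standard deviation, and $L$ the length of the breeding cycle.

```latex
\Delta G = \frac{i \, r \, \sigma_A}{L}
```

Because $L$ sits in the denominator, shortening a long cycle can raise gain per year substantially, whereas a crop whose cycle is already short has less to gain from that lever. This is consistent with the abstract's remark about leafy greens versus fruit trees.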
Affiliation(s)
- Adrian Ming Jern Lee
  - Department of Biological Sciences, National University of Singapore, 14 Science Drive 4, Singapore, 117543 Republic of Singapore
  - NUS Agritech Centre, National University of Singapore, 85 Science Park Dr, #01-03, Singapore, 118258 Republic of Singapore
- Melissa Yuin Mern Foong
  - School of Science, Monash University Malaysia, Bandar Sunway, 47500 Subang Jaya, Selangor Darul Ehsan Malaysia
- Beng Kah Song
  - School of Science, Monash University Malaysia, Bandar Sunway, 47500 Subang Jaya, Selangor Darul Ehsan Malaysia
- Fook Tim Chew
  - Department of Biological Sciences, National University of Singapore, 14 Science Drive 4, Singapore, 117543 Republic of Singapore
  - NUS Agritech Centre, National University of Singapore, 85 Science Park Dr, #01-03, Singapore, 118258 Republic of Singapore
3
Weber SE, Roscher-Ehrig L, Kox T, Abbadi A, Stahl A, Snowdon RJ. Genomic prediction in Brassica napus: evaluating the benefit of imputed whole-genome sequencing data. Genome 2024; 67:210-222. [PMID: 38708850 DOI: 10.1139/gen-2023-0126] [Indexed: 05/07/2024]
Abstract
Advances in sequencing technology allow whole plant genomes to be sequenced with high quality. Combining genotypic and phenotypic data in genomic prediction helps breeders to select crossing partners in partially phenotyped populations. In plant breeding programs, the cost of sequencing entire breeding populations still exceeds available genotyping budgets. Hence, the method for genotyping is still mainly single nucleotide polymorphism (SNP) arrays; however, arrays are unable to assess the entire genome- and population-wide diversity. A compromise involves genotyping the entire population using an SNP array and a subset of the population with whole-genome sequencing. Both datasets can then be used to impute markers from whole-genome sequencing onto the entire population. Here, we evaluate whether imputation of whole-genome sequencing data enhances genomic predictions, using data from a nested association mapping population of rapeseed (Brassica napus). Employing two cross-validation schemes that mimic scenarios for the prediction of close and distant relatives, we show that imputed marker data do not significantly improve prediction accuracy, likely due to redundancy in relationship estimates and imputation errors. In simulation studies, only small improvements were observed, further corroborating the findings. We conclude that SNP arrays are already equipped with the information that is added by imputation through relationship and linkage disequilibrium.
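The abstract attributes the lack of improvement partly to "redundancy in relationship estimates". One common way to turn marker data into such relationship estimates is a genomic relationship matrix; the sketch below uses the widely cited VanRaden-style scaling, which is an assumption here and not necessarily what the authors used. The function name `vanraden_grm` and the toy genotype matrix are hypothetical.

```python
import numpy as np

def vanraden_grm(genotypes: np.ndarray) -> np.ndarray:
    """Genomic relationship matrix from an (individuals x markers) matrix
    of allele counts coded 0/1/2. Assumes no missing genotypes."""
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0             # frequency of the counted allele per marker
    z = m - 2.0 * p                      # centre every marker column
    scale = 2.0 * np.sum(p * (1.0 - p))  # expected variance summed over markers
    return z @ z.T / scale

# Toy data standing in for SNP-array genotypes (hypothetical, random).
rng = np.random.default_rng(0)
array_snps = rng.integers(0, 3, size=(6, 200))
G = vanraden_grm(array_snps)
```

If relationship matrices built from array markers and from imputed sequence markers are nearly identical, a model based on them has little new information to exploit, which is one way to read the redundancy argument.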
Affiliation(s)
- Sven E Weber
  - Department of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig University, Giessen, Germany
- Lennard Roscher-Ehrig
  - Department of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig University, Giessen, Germany
- Andreas Stahl
  - Julius Kuehn Institute (JKI), Federal Research Centre for Cultivated Plants, Institute for Resistance Research and Stress Tolerance, Quedlinburg, Germany
- Rod J Snowdon
  - Department of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig University, Giessen, Germany
4
Thieffry S, Aubert J, Devers-Lamrani M, Martin-Laurent F, Romdhane S, Rouard N, Siol M, Spor A. Engineering multi-degrading bacterial communities to bioremediate soils contaminated with pesticides residues. Journal of Hazardous Materials 2024; 471:134454. [PMID: 38688223 DOI: 10.1016/j.jhazmat.2024.134454] [Received: 02/29/2024] [Revised: 04/09/2024] [Accepted: 04/25/2024] [Indexed: 05/02/2024]
Abstract
Parallel to the extensive use of pesticides in conventional agriculture, there is a growing interest in green technologies to clear contaminated soil of pesticides and their degradation products. Bioaugmentation, i.e. the inoculation of degrading microorganisms into polluted soil, is a promising method still in need of further development. Specifically, a better understanding of how degrading microorganisms overcome abiotic filters and interact with the autochthonous microbial communities is needed in order to design bioremediation strategies efficiently. Here we designed a protocol for studying the degradation of two herbicides, glyphosate (GLY) and isoproturon (IPU), via experimental modifications of two source bacterial communities. We used statistical methods stemming from genomic prediction to link community composition to herbicide degradation potentials. Our approach proved to be efficient, with correlation estimates over 0.8 between model predictions and measured pesticide degradation values. Multi-degrading bacterial communities were obtained by coalescing bacterial communities with high GLY or IPU degradation ability based on their community-level properties. Finally, we evaluated the efficiency of the constructed multi-degrading communities in removing pesticide contamination in a different soil. While results are less clear in the case of GLY, we showed an efficient transfer of degrading capacities towards the receiving soil, even at relatively low inoculation levels, in the case of IPU. Altogether, we developed an innovative protocol for building multi-degrading simplified bacterial communities with the help of genomic prediction tools and coalescence, and proved their efficiency in a contaminated soil.
Affiliation(s)
- Sylvia Thieffry
  - INRAE, Institut Agro, Université de Bourgogne, Université de Bourgogne Franche-Comté, Agroécologie, 21000 Dijon, France
  - Université Paris-Saclay, AgroParisTech, INRAE, UMR MIA Paris-Saclay, 91120 Palaiseau, France
- Julie Aubert
  - Université Paris-Saclay, AgroParisTech, INRAE, UMR MIA Paris-Saclay, 91120 Palaiseau, France
- Marion Devers-Lamrani
  - INRAE, Institut Agro, Université de Bourgogne, Université de Bourgogne Franche-Comté, Agroécologie, 21000 Dijon, France
- Fabrice Martin-Laurent
  - INRAE, Institut Agro, Université de Bourgogne, Université de Bourgogne Franche-Comté, Agroécologie, 21000 Dijon, France
- Sana Romdhane
  - INRAE, Institut Agro, Université de Bourgogne, Université de Bourgogne Franche-Comté, Agroécologie, 21000 Dijon, France
- Nadine Rouard
  - INRAE, Institut Agro, Université de Bourgogne, Université de Bourgogne Franche-Comté, Agroécologie, 21000 Dijon, France
- Mathieu Siol
  - INRAE, Institut Agro, Université de Bourgogne, Université de Bourgogne Franche-Comté, Agroécologie, 21000 Dijon, France
- Aymé Spor
  - INRAE, Institut Agro, Université de Bourgogne, Université de Bourgogne Franche-Comté, Agroécologie, 21000 Dijon, France
5
Carvalho WA, Gaspar EB, Domingues R, Regitano LCA, Cardoso FF. Genetic factors underlying host resistance to Rhipicephalus microplus tick infestation in Braford cattle: a systems biology perspective. Mamm Genome 2024; 35:186-200. [PMID: 38480585 DOI: 10.1007/s00335-024-10030-x] [Received: 02/17/2023] [Accepted: 01/29/2024] [Indexed: 05/29/2024]
Abstract
Approximately 80% of the world's cattle are raised in regions with a high risk of tick-borne diseases, resulting in significant economic losses due to parasitism by Rhipicephalus (Boophilus) microplus. However, the lack of a systems biology approach hampers a comprehensive understanding of the tick-host interactions that mediate tick resistance phenotypes. Here, we conducted a genome-wide association study (GWAS) of 2933 Braford cattle and found 340 single-nucleotide polymorphisms (SNPs) associated with tick counts. Gene expression analyses were performed on skin samples obtained from previously tick-exposed heifers with extremely high or low estimated breeding values for R. microplus counts. Evaluations were performed both before and after artificial infestation with ticks. Differentially expressed genes were found within 1-Mb windows centered at significant SNPs from GWAS. A total of 330 genes were related to the breakdown of homeostasis that was induced by larval attachment to bovine skin. Enrichment analysis pointed to a key role of proteolysis and signal transduction via JAK/STAT, NFKB and WNT/beta-catenin signaling pathways. Integrative analysis using matrixEQTL revealed two cis-eQTLs and four significant SNPs in the genes peptidyl arginine deiminase type IV (PADI4) and LOC11449251. The integration of genomic data from QTL maps and transcriptome analyses has identified a set of twelve key genes that show significant associations with tick loads. These genes could be key candidates to improve the accuracy of genomic predictions for tick resistance in Braford cattle.
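The step of looking for genes "within 1-Mb windows centered at significant SNPs" can be illustrated with a small interval lookup. This is a minimal sketch under stated assumptions: the `Gene` record, the function `genes_in_window`, the gene names and all coordinates are hypothetical, and a gene is counted if any part of it overlaps the window, which the abstract does not specify.

```python
from typing import List, NamedTuple

class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # first base of the gene (bp)
    end: int    # last base of the gene (bp)

def genes_in_window(snp_chrom: str, snp_pos: int, genes: List[Gene],
                    window_bp: int = 1_000_000) -> List[str]:
    """Names of genes overlapping a window of `window_bp` centred on the SNP."""
    half = window_bp // 2
    lo, hi = snp_pos - half, snp_pos + half
    return [g.name for g in genes
            if g.chrom == snp_chrom and g.start <= hi and g.end >= lo]

# Hypothetical annotation and one hypothetical significant SNP at chr1:1,000,000.
genes = [
    Gene("geneA", "chr1", 100_000, 120_000),
    Gene("geneB", "chr1", 1_450_000, 1_480_000),
    Gene("geneC", "chr2", 900_000, 950_000),
    Gene("geneD", "chr1", 2_600_000, 2_650_000),
]
hits = genes_in_window("chr1", 1_000_000, genes)
print(hits)  # ['geneB']
```

Only `geneB` falls inside the 500,000 to 1,500,000 bp window on the same chromosome; the other genes are either too far away or on a different chromosome.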
6
Bose S, Banerjee S, Kumar S, Saha A, Nandy D, Hazra S. Review of applications of artificial intelligence (AI) methods in crop research. J Appl Genet 2024; 65:225-240. [PMID: 38216788 DOI: 10.1007/s13353-023-00826-z] [Received: 08/13/2023] [Revised: 12/23/2023] [Accepted: 12/26/2023] [Indexed: 01/14/2024]
Abstract
Sophisticated, modern crop improvement techniques can help close the gap in feeding an ever-increasing population. Artificial intelligence (AI) refers to the simulation of human intelligence in machines through computational algorithms, including machine learning (ML) and deep learning (DL) techniques. These methods aim to generalise patterns and relationships from historical data using various mathematical optimisation techniques, yielding prediction models that facilitate the selection of superior genotypes. They are less resource intensive and work from the analysis of large-scale phenotypic datasets. ML for genomic selection (GS) uses high-throughput genotyping technologies to gather genetic information on a large number of markers across the genome. GS models base their predictions on the mathematical relationship between genotypic and phenotypic data from the training population. ML techniques have also emerged as powerful tools for genome editing, analysing large-scale genomic data and facilitating the development of accurate prediction models. Precise phenotyping is a prerequisite for advancing crop breeding to address agricultural production-related issues. ML algorithms can support it by generating predictive models based on the analysis of large-scale phenotypic datasets, and DL models also have the potential to deliver reliable, precise phenotyping. This review provides a comprehensive overview of various ML and DL models, their applications, and their potential to enhance the efficiency, specificity and safety of advanced crop improvement protocols such as genomic selection and genome editing, along with phenotypic prediction, to promote accelerated breeding.
Affiliation(s)
- Suvojit Bose
  - Department of Vegetables and Spice Crops, Uttar Banga Krishi Viswavidyalaya, Pundibari, Cooch Behar, 736165, West Bengal, India
- Soumya Kumar
  - School of Agricultural Sciences, JIS University, Kolkata, 700109, West Bengal, India
- Akash Saha
  - School of Agricultural Sciences, JIS University, Kolkata, 700109, West Bengal, India
- Debalina Nandy
  - School of Agricultural Sciences, JIS University, Kolkata, 700109, West Bengal, India
- Soham Hazra
  - Department of Agriculture, Brainware University, Barasat, 700125, West Bengal, India
7
Chen C, Bhuiyan SA, Ross E, Powell O, Dinglasan E, Wei X, Atkin F, Deomano E, Hayes B. Genomic prediction for sugarcane diseases including hybrid Bayesian-machine learning approaches. Frontiers in Plant Science 2024; 15:1398903. [PMID: 38751840 PMCID: PMC11095127 DOI: 10.3389/fpls.2024.1398903] [Received: 03/11/2024] [Accepted: 04/15/2024] [Indexed: 05/18/2024]
Abstract
Sugarcane smut and Pachymetra root rot are two serious diseases of sugarcane, with susceptible infected crops losing over 30% of yield. A heritable component to both diseases has been demonstrated, suggesting selection could improve disease resistance. Genomic selection could accelerate gains even further, enabling early selection of resistant seedlings for breeding and clonal propagation. In this study we evaluated four types of algorithms for genomic prediction of clonal performance for disease resistance. These algorithms were: genomic best linear unbiased prediction (GBLUP), including extensions to model dominance and epistasis; Bayesian methods, including BayesC and BayesR; and machine learning methods, including random forest, multilayer perceptron (MLP), modified convolutional neural network (CNN) and attention networks designed to capture epistasis across the genome-wide markers. Simple hybrid methods, which first used BayesR/GWAS to identify a subset of 1000 markers with moderate to large marginal additive effects and then used attention networks to derive predictions from these effects and their interactions, were also developed and evaluated. The hypothesis for this approach was that using a subset of markers more likely to have an effect would enable better estimation of interaction effects than when there is an extremely large number of possible interactions, especially with our limited data set size. To evaluate the methods, we applied both random five-fold cross-validation and a structured PCA-based cross-validation that separated 4702 sugarcane clones (which had disease phenotypes and were genotyped for 26k genome-wide SNP markers) by genomic relationship. The Bayesian methods (BayesR and BayesC) gave the highest accuracy of prediction, followed closely by hybrid methods with attention networks. The hybrid methods with attention networks gave the lowest variation in accuracy of prediction across validation folds (and the lowest MSE), which may be a criterion worth considering in practical breeding programs. This suggests that hybrid methods incorporating the attention mechanism could be useful for genomic prediction of clonal performance, particularly where non-additive effects may be important.
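The two validation schemes mentioned in this abstract, random five-fold cross-validation and a PCA-based split that separates clones by genomic relationship, differ only in how individuals are assigned to folds. The sketch below contrasts them on toy data. The abstract does not describe how the PCA-based folds were formed, so ordering individuals along the first principal component and cutting that ordering into contiguous folds is purely an assumption of this example, as are the function names.

```python
import numpy as np

def random_folds(n_individuals: int, k: int, seed: int = 0):
    """Random k-fold split: shuffle indices, then cut into k parts."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_individuals), k)

def pca_structured_folds(genotypes: np.ndarray, k: int):
    """Structured split: order individuals along PC1 of the centred marker
    matrix and cut the ordering into k contiguous folds, so that each fold
    groups genomically similar individuals (an assumed scheme)."""
    x = np.asarray(genotypes, dtype=float)
    x = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    pc1 = u[:, 0] * s[0]               # scores on the first principal component
    return np.array_split(np.argsort(pc1), k)

# Toy genotypes (hypothetical): 20 clones x 50 markers coded 0/1/2.
rng = np.random.default_rng(1)
toy = rng.integers(0, 3, size=(20, 50))
rand = random_folds(20, 5)
struct = pca_structured_folds(toy, 5)
```

With random folds, each validation clone usually has close relatives in the training set; with the structured folds it tends not to, which makes the second scheme the harder and often more realistic test.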
Affiliation(s)
- Chensong Chen
  - Center for Animal Science, The Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Shamsul A. Bhuiyan
  - Sugar Research Australia, Woodford, QLD, Australia
  - Queensland Micro- and Nanotechnology Centre, Griffith University, Nathan, QLD, Australia
- Elizabeth Ross
  - Center for Animal Science, The Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Owen Powell
  - Center for Crop Science, The Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Eric Dinglasan
  - Center for Animal Science, The Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Xianming Wei
  - Sugar Research Australia, Indooroopilly, QLD, Australia
- Emily Deomano
  - Sugar Research Australia, Indooroopilly, QLD, Australia
- Ben Hayes
  - Center for Animal Science, The Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
8
Li W, Li W, Song Z, Gao Z, Xie K, Wang Y, Wang B, Hu J, Zhang Q, Ning C, Wang D, Fan X. Marker Density and Models to Improve the Accuracy of Genomic Selection for Growth and Slaughter Traits in Meat Rabbits. Genes (Basel) 2024; 15:454. [PMID: 38674388 PMCID: PMC11050255 DOI: 10.3390/genes15040454] [Received: 03/11/2024] [Revised: 03/25/2024] [Accepted: 03/27/2024] [Indexed: 04/28/2024]
Abstract
The selection and breeding of good meat rabbit breeds are fundamental to their industrial development, and genomic selection (GS) can employ genomic information to make up for the shortcomings of traditional phenotype-based breeding methods. For the practical implementation of GS in meat rabbit breeding, it is necessary to assess different marker densities and GS models. Here, we obtained low-coverage whole-genome sequencing (lcWGS) data from 1515 meat rabbits (including parent herd and half-sibling offspring). The specific objectives were (1) to derive a baseline for heritability estimates and genomic predictions based on randomly selected marker densities and (2) to assess the accuracy of genomic predictions for single- and multiple-trait linear mixed models. We found that a marker density of 50 K can be used as a baseline for heritability estimation and genomic prediction. For GS, the multi-trait genomic best linear unbiased prediction (GBLUP) model results in more accurate predictions for virtually all traits compared to the single-trait model, with improvements greater than 15% for all of them, which may be attributed to the use of information on genetically related traits. In addition, we discovered a positive correlation between the performance of the multi-trait GBLUP and the genetic correlation between the traits. We anticipate that this approach will provide solutions for GS, as well as optimize breeding programs, in meat rabbits.
Affiliation(s)
- Wenjie Li
  - Department of Animal Genetics and Breeding, Shandong Agricultural University, Taian 271000, China
  - Department of Animal Genetics and Breeding, University of Anhui Agricultural, Hefei 230031, China
- Wenqiang Li
  - Department of Animal Genetics and Breeding, Shandong Agricultural University, Taian 271000, China
- Zichen Song
  - Department of Animal Genetics and Breeding, Shandong Agricultural University, Taian 271000, China
- Zihao Gao
  - Department of Animal Genetics and Breeding, Shandong Agricultural University, Taian 271000, China
- Kerui Xie
  - Department of Animal Genetics and Breeding, Shandong Agricultural University, Taian 271000, China
- Yubing Wang
  - Department of Animal Genetics and Breeding, Shandong Agricultural University, Taian 271000, China
- Bo Wang
  - Department of Animal Genetics and Breeding, Shandong Agricultural University, Taian 271000, China
- Jiaqing Hu
  - Department of Animal Genetics and Breeding, Shandong Agricultural University, Taian 271000, China
- Qin Zhang
  - Department of Animal Genetics and Breeding, Shandong Agricultural University, Taian 271000, China
- Chao Ning
  - Department of Animal Genetics and Breeding, Shandong Agricultural University, Taian 271000, China
- Dan Wang
  - Key Laboratory of Efficient Utilization of Non-Grain Feed Resources (Co-Construction by Ministry and Province), College of Animal Science and Veterinary Medicine, Shandong Agricultural University, Ministry of Agriculture and Rural Affairs, Taian 271000, China
- Xinzhong Fan
  - Department of Animal Genetics and Breeding, Shandong Agricultural University, Taian 271000, China
9
Lee J, Mun H, Koo Y, Park S, Kim J, Yu S, Shin J, Lee J, Son J, Park C, Lee S, Song H, Kim S, Dang C, Park J. Enhancing Genomic Prediction Accuracy for Body Conformation Traits in Korean Holstein Cattle. Animals (Basel) 2024; 14:1052. [PMID: 38612291 PMCID: PMC11011013 DOI: 10.3390/ani14071052] [Received: 12/31/2023] [Revised: 03/18/2024] [Accepted: 03/28/2024] [Indexed: 04/14/2024]
Abstract
The Holstein breed is the mainstay of dairy production in Korea. In this study, we evaluated the genomic prediction accuracy for body conformation traits in Korean Holstein cattle, using a range of π levels (0.75, 0.90, 0.99, and 0.995) in Bayesian methods (BayesB and BayesC). Focusing on 24 traits, we analyzed the impact of different π levels on prediction accuracy. We observed a general increase in accuracy at higher levels for specific traits, with variations depending on the Bayesian method applied. Notably, the highest accuracy was achieved for rear teat angle when using deregressed estimated breeding values including parent average as a response variable. We further demonstrated that incorporating parent average into deregressed estimated breeding values enhances genomic prediction accuracy, showcasing the effectiveness of the model in integrating both offspring and parental genetic information. Additionally, we identified 18 significant window regions through genome-wide association studies, which are crucial for future fine mapping and discovery of causal mutations. These findings provide valuable insights into the efficiency of genomic selection for body conformation traits in Korean Holstein cattle and highlight the potential for advancements in the prediction accuracy using larger datasets and more sophisticated genomic models.
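For readers unfamiliar with the π levels compared in this abstract: in the standard formulations of BayesB and BayesC, π is the prior probability that a marker has no effect on the trait. The display below is the usual textbook form of the BayesC mixture prior, given here as a reading aid under that assumption; the abstract does not spell out the authors' exact parameterisation. Here $\beta_j$ is the effect of marker $j$ and $\sigma_{\beta}^{2}$ is a variance shared by all markers with non-zero effects.

```latex
\beta_j \mid \pi, \sigma_{\beta}^{2} \sim
\begin{cases}
0 & \text{with probability } \pi, \\
\mathcal{N}\!\left(0, \sigma_{\beta}^{2}\right) & \text{with probability } 1 - \pi
\end{cases}
```

BayesB has the same structure but gives each non-zero marker its own variance $\sigma_{j}^{2}$ instead of a shared one. As a purely illustrative calculation with a hypothetical panel of 50,000 markers, π = 0.99 corresponds to an expected 500 markers with non-zero effects and π = 0.75 to an expected 12,500.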
Affiliation(s)
- Jungjae Lee
  - Department of Animal Science and Technology, College of Biotechnology and Natural Resources, Chung-Ang University, Anseong 17546, Republic of Korea
- Hyosik Mun
  - Korea Animal Improvement Association, Seoul 06668, Republic of Korea
- Yangmo Koo
  - Korea Animal Improvement Association, Seoul 06668, Republic of Korea
- Sangchul Park
  - Korea Animal Improvement Association, Seoul 06668, Republic of Korea
- Junsoo Kim
  - Korea Animal Improvement Association, Seoul 06668, Republic of Korea
- Seongpil Yu
  - Korea Animal Improvement Association, Seoul 06668, Republic of Korea
- Jiseob Shin
  - Dairy Cattle Improvement Center of NH-Agree Business Group, National Agricultural Cooperative Federation, Goyang 10292, Republic of Korea
- Jaegu Lee
  - Animal Breeding and Genetics Division, National Institute of Animal Science, Rural Development Administration, Cheonan 31000, Republic of Korea
- Jihyun Son
  - Korea Animal Improvement Association, Seoul 06668, Republic of Korea
- Chanhyuk Park
  - Korea Animal Improvement Association, Seoul 06668, Republic of Korea
- Seokhyun Lee
  - Dairy Cattle Improvement Center of NH-Agree Business Group, National Agricultural Cooperative Federation, Goyang 10292, Republic of Korea
- Hyungjun Song
  - Dairy Cattle Improvement Center of NH-Agree Business Group, National Agricultural Cooperative Federation, Goyang 10292, Republic of Korea
- Sungjin Kim
  - Korea Animal Improvement Association, Seoul 06668, Republic of Korea
- Changgwon Dang
  - Animal Breeding and Genetics Division, National Institute of Animal Science, Rural Development Administration, Cheonan 31000, Republic of Korea
- Jun Park
  - Department of Animal Biotechnology, Jeonbuk National University, Jeonju 54896, Republic of Korea
10
Lin YC, Mayer M, Valle Torres D, Pook T, Hölker AC, Presterl T, Ouzunova M, Schön CC. Genomic prediction within and across maize landrace derived populations using haplotypes. Frontiers in Plant Science 2024; 15:1351466. [PMID: 38584949 PMCID: PMC10995330 DOI: 10.3389/fpls.2024.1351466] [Received: 12/06/2023] [Accepted: 03/05/2024] [Indexed: 04/09/2024]
Abstract
Genomic prediction (GP) using haplotypes is considered advantageous compared to GP solely reliant on single nucleotide polymorphisms (SNPs), owing to haplotypes' enhanced ability to capture ancestral information and their higher linkage disequilibrium with quantitative trait loci (QTL). Many empirical studies supported the advantages of haplotype-based GP over SNP-based approaches. Nevertheless, the performance of haplotype-based GP can vary significantly depending on multiple factors, including the traits being studied, the genetic structure of the population under investigation, and the particular method employed for haplotype construction. In this study, we compared haplotype and SNP based prediction accuracies in four populations derived from European maize landraces. Populations comprised either doubled haploid lines (DH) derived directly from landraces, or gamete capture lines (GC) derived from crosses of the landraces with an inbred line. For two different landraces, both types of populations were generated, genotyped with 600k SNPs and phenotyped as lines per se for five traits. Our study explores three prediction scenarios: (i) within each of the four populations, (ii) across DH and GC populations from the same landrace, and (iii) across landraces using either DH or GC populations. Three haplotype construction methods were evaluated: 1. fixed-window blocks (FixedHB), 2. LD-based blocks (HaploView), and 3. IBD-based blocks (HaploBlocker). In within population predictions, FixedHB and HaploView methods performed as well as or slightly better than SNPs for all traits. HaploBlocker improved accuracy for certain traits but exhibited inferior performance for others. In prediction across populations, the parameter setting from HaploBlocker which controls the construction of shared haplotypes between populations played a crucial role for obtaining optimal results. 
When predicting across landraces, accuracies were low for both SNP and haplotype approaches, but for specific traits substantial improvement was observed with HaploBlocker. This study provides recommendations for optimal haplotype construction and identifies relevant parameters for constructing haplotypes in the context of genomic prediction.
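To make the idea of fixed-window haplotype blocks concrete, the following is a minimal sketch of how SNP genotypes of fully homozygous lines (such as doubled haploids) might be recoded into haplotype alleles. It is not the FixedHB implementation used in the study: the function name `fixed_window_haplotypes`, the window size of three SNPs, and the toy genotypes are all assumptions made for illustration.

```python
def fixed_window_haplotypes(genotypes, window=3):
    """Recode SNP alleles of homozygous lines into fixed-window haplotype alleles.

    genotypes: list of lines, each a list of 0/1 SNP alleles (one haplotype per line).
    window:    number of adjacent SNPs per block (hypothetical default).
    Returns (design, labels), where labels[j] = (block_index, allele_tuple) and
    design[i][j] is 1 if line i carries that haplotype allele, else 0.
    """
    n_snps = len(genotypes[0])
    starts = list(range(0, n_snps, window))

    # Collect the distinct haplotype alleles observed in each block.
    labels = []
    for b, start in enumerate(starts):
        seen = sorted({tuple(line[start:start + window]) for line in genotypes})
        labels.extend((b, allele) for allele in seen)
    index = {label: j for j, label in enumerate(labels)}

    # One indicator column per (block, allele) combination.
    design = []
    for line in genotypes:
        row = [0] * len(labels)
        for b, start in enumerate(starts):
            row[index[(b, tuple(line[start:start + window]))]] = 1
        design.append(row)
    return design, labels


# Toy data: three lines, six SNPs, i.e. two blocks of three SNPs each.
lines = [
    [0, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 1],
    [1, 1, 0, 1, 0, 1],
]
design, labels = fixed_window_haplotypes(lines, window=3)
for row in design:
    print(row)
```

The resulting indicator matrix can take the place of the SNP matrix in a prediction model. The window size acts as a tuning parameter, which is one reason results can differ between haplotype construction methods.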
Affiliation(s)
- Yan-Cheng Lin
- Chair of Plant Breeding, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Manfred Mayer
- Chair of Plant Breeding, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Bayer CropScience Deutschland GmbH, Borken, Germany
- Daniel Valle Torres
- Chair of Plant Breeding, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Sugar Beet Breeding, Strube Research GmbH & Co. KG, Söllingen, Germany
- Torsten Pook
- Animal Breeding and Genomics, Wageningen University & Research, Wageningen, Netherlands
- Armin C. Hölker
- Product Development Maize and Oil Crops, KWS SAAT SE & Co. KGaA, Einbeck, Germany
- Thomas Presterl
- Product Development Maize and Oil Crops, KWS SAAT SE & Co. KGaA, Einbeck, Germany
- Milena Ouzunova
- Product Development Maize and Oil Crops, KWS SAAT SE & Co. KGaA, Einbeck, Germany
- Chris-Carolin Schön
- Chair of Plant Breeding, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
11
Wang H, Bai Y, Biligetu B. Effects of SNP marker density and training population size on prediction accuracy in alfalfa (Medicago sativa L.) genomic selection. THE PLANT GENOME 2024; 17:e20431. [PMID: 38263612 DOI: 10.1002/tpg2.20431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Revised: 11/29/2023] [Accepted: 01/04/2024] [Indexed: 01/25/2024]
Abstract
The effects of individual single-nucleotide polymorphism (SNP) markers and the sizes of the "training" and "test" populations influence prediction accuracy in genomic selection (GS). This study evaluated 11 subsets of 4932 SNPs using six additive genetic methods to understand the role of marker density in GS prediction in alfalfa (Medicago sativa L.). The effect of the ratio of "training" to "test" population size was also evaluated. Fourteen alfalfa populations sampled from long-term grazing sites were genotyped using genotyping by sequencing to identify SNPs. These populations were also phenotyped for six agromorphological and three nutritive traits from 2018 to 2020. The accuracy of GS prediction improved across all six GS methods when the ratio of "training" to "test" population size increased. However, the prediction accuracy of the six GS methods dropped to a range of -0.27 to 0.11 when random, uninformative SNPs were used. In this study, five Bayesian methods and the ridge-regression best linear unbiased prediction (rrBLUP) method had similar GS accuracies for "training" sets, but rrBLUP tended to outperform the Bayesian methods in independent "test" sets when SNP subsets with high mean-squared-estimated-marker effect were used. These findings can enhance the application of GS in alfalfa genetic improvement.
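As an illustration of the ridge-regression step behind rrBLUP and of prediction accuracy measured as a correlation in a "test" set, a small sketch follows. It makes several assumptions: the shrinkage parameter `lam` is fixed by hand, whereas rrBLUP normally derives it from estimated variance components; the helper names (`solve`, `rr_fit`, `rr_predict`, `pearson`) are hypothetical; and the toy marker matrix and phenotypes are invented.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def rr_fit(Z, y, lam):
    """Ridge estimates of marker effects: u = (Zc'Zc + lam*I)^-1 Zc'(y - mean)."""
    n, p = len(Z), len(Z[0])
    col_means = [sum(row[j] for row in Z) / n for j in range(p)]
    mu = sum(y) / n
    Zc = [[row[j] - col_means[j] for j in range(p)] for row in Z]  # centered markers
    lhs = [[sum(r[j] * r[k] for r in Zc) + (lam if j == k else 0.0)
            for k in range(p)] for j in range(p)]
    rhs = [sum(r[j] * (yi - mu) for r, yi in zip(Zc, y)) for j in range(p)]
    return mu, col_means, solve(lhs, rhs)


def rr_predict(model, Z):
    """Predicted values for new genotypes from a fitted (mu, col_means, u) model."""
    mu, col_means, u = model
    return [mu + sum((row[j] - col_means[j]) * u[j] for j in range(len(u)))
            for row in Z]


def pearson(a, b):
    """Pearson correlation, used here as the measure of prediction accuracy."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


# Toy "training" and "test" sets with two markers (invented numbers).
Z_train, y_train = [[1, 0], [0, 1], [1, 1], [0, 0]], [2.0, 3.0, 5.0, 0.0]
Z_test, y_test = [[1, 0], [0, 1], [1, 1]], [2.1, 2.9, 5.2]

model = rr_fit(Z_train, y_train, lam=1.0)
preds = rr_predict(model, Z_test)
acc = pearson(preds, y_test)
print(model[2], preds, round(acc, 3))
```

Re-running such a fit with different splits between training and test lines, or with different marker subsets, is the kind of comparison the abstract describes, although the study itself used real data and six different methods.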
Affiliation(s)
- Hu Wang
- Department of Plant Sciences, College of Agriculture and Bioresources, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Yuguang Bai
- Department of Plant Sciences, College of Agriculture and Bioresources, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Bill Biligetu
- Department of Plant Sciences, College of Agriculture and Bioresources, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
12
Cuyabano BCD, Boichard D, Gondro C. Expected values for the accuracy of predicted breeding values accounting for genetic differences between reference and target populations. Genet Sel Evol 2024; 56:15. [PMID: 38424504 PMCID: PMC11234767 DOI: 10.1186/s12711-024-00876-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2022] [Accepted: 01/08/2024] [Indexed: 03/02/2024] Open
Abstract
BACKGROUND Genetic merit, or breeding values as referred to in livestock and crop breeding programs, is one of the keys to the successful selection of animals in commercial farming systems. The developments in statistical methods during the twentieth century and single nucleotide polymorphism (SNP) chip technologies in the twenty-first century have revolutionized agricultural production, by allowing highly accurate predictions of breeding values for selection candidates at a very early age. Nonetheless, for many breeding populations, realized accuracies of predicted breeding values (PBV) remain below the theoretical maximum, even when the reference population is sufficiently large, and SNPs included in the model are in sufficient linkage disequilibrium (LD) with the quantitative trait locus (QTL). This is particularly noticeable over generations, as we observe the so-called erosion of the effects of SNPs due to recombinations, accompanied by the erosion of the accuracy of prediction. While accurately quantifying the erosion at the individual SNP level is a difficult and unresolved task, quantifying the erosion of the accuracy of prediction is a more tractable problem. In this paper, we describe a method that uses the relationship between reference and target populations to calculate expected values for the accuracies of predicted breeding values for non-phenotyped individuals accounting for erosion. The accuracy of the expected values was evaluated through simulations, and a further evaluation was performed on real data.
RESULTS Using simulations, we empirically confirmed that our expected values for the accuracy of PBV accounting for erosion were able to correctly determine the prediction accuracy of breeding values for non-phenotyped individuals. When comparing the expected to the realized accuracies of PBV with real data, only one out of the four traits evaluated presented accuracies that were significantly higher than the expected, approaching h².
CONCLUSIONS We defined an index of genetic correlation between reference and target populations, which summarizes the expected overall erosion due to differences in allele frequencies and LD patterns between populations. We used this correlation along with a trait's heritability to derive expected values for the accuracy (R) of PBV accounting for the erosion, and demonstrated that our derived E[R | erosion] is a reliable metric.
Affiliation(s)
- Beatriz C D Cuyabano
- INRAE, AgroParisTech, GABI, Université Paris Saclay, 78350, Jouy-en-Josas, France.
- Didier Boichard
- INRAE, AgroParisTech, GABI, Université Paris Saclay, 78350, Jouy-en-Josas, France
- Cedric Gondro
- Department of Animal Science, Michigan State University, 474 S Shaw Ln, East Lansing, MI, 48824, USA
13
Hoque A, Anderson JV, Rahman M. Genomic prediction for agronomic traits in a diverse Flax (Linum usitatissimum L.) germplasm collection. Sci Rep 2024; 14:3196. [PMID: 38326469 PMCID: PMC10850546 DOI: 10.1038/s41598-024-53462-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Accepted: 01/31/2024] [Indexed: 02/09/2024] Open
Abstract
Breeding programs require exhaustive phenotyping of germplasm, which is time-consuming and expensive. Genomic prediction helps breeders harness the diversity of a collection while bypassing phenotyping. Here, we examined the potential of genomic prediction for seed yield and nine agronomic traits using 26,171 single nucleotide polymorphism (SNP) markers in a set of 337 flax (Linum usitatissimum L.) germplasm accessions phenotyped in five environments. We evaluated 14 prediction models and several factors affecting predictive ability based on cross-validation schemes. Predictive ability varied significantly among models and across traits for the whole marker set. The ridge regression (RR) model covering additive gene action yielded better predictive ability for most of the traits, whereas models capturing epistatic gene action gave higher predictive ability for traits with low heritability. Marker subsets based on linkage disequilibrium decay distance gave significantly higher predictive abilities than the whole marker set, but for randomly selected markers, predictive ability reached a plateau above 3000 markers. Markers significantly associated with traits improved predictive abilities compared to the whole marker set when marker selection was made on the whole population instead of the training set, indicating clear overfitting. Correcting for population structure did not increase predictive abilities compared to the whole collection. However, stratified sampling by picking representative genotypes from each cluster improved predictive abilities. The indirect predictive ability for a trait was proportional to its correlation with other traits. These results will help breeders to select the best models, optimum marker set, and suitable genotype set to perform indirect selection for quantitative traits in this diverse flax germplasm collection.
Affiliation(s)
- Ahasanul Hoque
- Department of Plant Sciences, North Dakota State University, Fargo, ND, USA
- Department of Genetics and Plant Breeding, Bangladesh Agricultural University, Mymensingh, 2202, Bangladesh
- James V Anderson
- USDA-ARS, Edward T. Schafer Agricultural Research Center, Fargo, ND, USA
- Mukhlesur Rahman
- Department of Plant Sciences, North Dakota State University, Fargo, ND, USA.
14
Montesinos-López OA, Crespo-Herrera L, Xavier A, Godwa M, Beyene Y, Pierre CS, de la Rosa-Santamaria R, Salinas-Ruiz J, Gerard G, Vitale P, Dreisigacker S, Lillemo M, Grignola F, Sarinelli M, Pozzo E, Quiroga M, Montesinos-López A, Crossa J. A marker weighting approach for enhancing within-family accuracy in genomic prediction. G3 (BETHESDA, MD.) 2024; 14:jkad278. [PMID: 38079160 PMCID: PMC10849334 DOI: 10.1093/g3journal/jkad278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2023] [Accepted: 11/27/2023] [Indexed: 02/09/2024]
Abstract
Genomic selection is revolutionizing plant breeding. However, its practical implementation is still very challenging, since predicted values do not necessarily correspond closely to the observed phenotypic values. When the goal is to predict within a family, it is not always possible to obtain reasonable accuracies, which are of paramount importance for improving the selection process. For this reason, we propose the Adversaria-Boruta (AB) method, which combines the virtues of the adversarial validation (AV) method and the Boruta feature selection method. The AB method operates primarily by minimizing the disparity between training and testing distributions. This is accomplished by reducing the weight assigned to markers that display the most significant differences between the training and testing sets. The AB method thus builds a weighted genomic relationship matrix that is implemented with the genomic best linear unbiased predictor (GBLUP) model. The proposed AB method is compared, using 12 real data sets, with the GBLUP model that uses a nonweighted genomic relationship matrix. Our results show that the proposed AB method outperforms the GBLUP by 8.6, 19.7, and 9.8% in terms of Pearson's correlation, mean square error, and normalized root mean square error, respectively. Our results support that the proposed AB method is a useful tool to improve the prediction accuracy of a complete family. However, we encourage other investigators to evaluate the AB method to increase the empirical evidence of its potential.
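The role of marker weights in a genomic relationship matrix can be sketched as follows. This is not the AB procedure itself: the abstract does not give the formula, so the sketch assumes a common VanRaden-style construction, G = Z D Z' / (2 Σ p_j (1 - p_j)), in which Z holds genotypes centered by twice the allele frequency and D is a diagonal matrix of marker weights. The function name `weighted_grm`, the weights, and the genotypes are hypothetical.

```python
def weighted_grm(M, weights):
    """Weighted genomic relationship matrix (VanRaden-style, assumed form).

    M:       n x p genotype matrix coded as 0/1/2 copies of the counted allele.
    weights: p non-negative marker weights; all ones gives the unweighted matrix.
    """
    n, p = len(M), len(M[0])
    freq = [sum(row[j] for row in M) / (2 * n) for j in range(p)]   # allele frequencies
    Z = [[row[j] - 2 * freq[j] for j in range(p)] for row in M]     # centered genotypes
    scale = sum(2 * f * (1 - f) for f in freq)
    return [[sum(weights[j] * Z[a][j] * Z[b][j] for j in range(p)) / scale
             for b in range(n)] for a in range(n)]


# Toy genotypes for four individuals and three markers (invented).
M = [
    [0, 2, 1],
    [2, 0, 1],
    [1, 1, 2],
    [1, 1, 0],
]
G_plain = weighted_grm(M, [1.0, 1.0, 1.0])
# Down-weight the third marker, as one might for a marker whose distribution
# differs strongly between training and testing sets (hypothetical weight).
G_weighted = weighted_grm(M, [1.0, 1.0, 0.5])
print(G_plain[2][2], G_weighted[2][2])
```

In this toy case only the relationships that depend on the third marker shrink, which shows how down-weighting a marker reduces its contribution to the covariance structure used by GBLUP.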
Affiliation(s)
- Leonardo Crespo-Herrera
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, CP 52640, Edo. de México, Mexico
- Alencar Xavier
- Corteva Agrisciences, 8305 NW 62nd Ave, Johnston, IA 50131, USA
- Purdue University, 915W State Street, West Lafayette, IN 47907, USA
- Manje Godwa
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, CP 52640, Edo. de México, Mexico
- Yoseph Beyene
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, CP 52640, Edo. de México, Mexico
- Carolina Saint Pierre
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, CP 52640, Edo. de México, Mexico
- Josafhat Salinas-Ruiz
- Colegio de Postgraduados Campus Córdoba, Carretera Federal Córdoba-Veracruz km 348, Manuel León, Amatlán de los Reyes, Veracruz, CP 94953, Mexico
- Guillermo Gerard
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, CP 52640, Edo. de México, Mexico
- Paolo Vitale
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, CP 52640, Edo. de México, Mexico
- Susanne Dreisigacker
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, CP 52640, Edo. de México, Mexico
- Morten Lillemo
- Department of Plant Science, Norwegian University of Life Sciences (NMBU), P.O. Box 5003, 1433 As, Norway
- Abelardo Montesinos-López
- Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, 44430, Guadalajara, Jalisco, Mexico
- José Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, CP 52640, Edo. de México, Mexico
- Colegio de Postgraduados, Montecillos, Edo. de México CP 56230, Mexico
15
Ousmael KM, Cappa EP, Hansen JK, Hendre P, Hansen OK. Genomic evaluation for breeding and genetic management in Cordia africana, a multipurpose tropical tree species. BMC Genomics 2024; 25:9. [PMID: 38166623 PMCID: PMC10759591 DOI: 10.1186/s12864-023-09907-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Accepted: 12/14/2023] [Indexed: 01/05/2024] Open
Abstract
BACKGROUND Planting tested forest reproductive material is crucial to ensure the increased resilience of intensively managed productive stands for timber and wood product markets under climate change scenarios. Single-step Genomic Best Linear Unbiased Prediction (ssGBLUP) analysis is a cost-effective option for using genomic tools to enhance the accuracy of predicted breeding values and genetic parameter estimation in forest tree species. Here, we tested the efficiency of ssGBLUP in a tropical multipurpose tree species, Cordia africana, by partial population genotyping. A total of 8070 trees from three breeding seedling orchards (BSOs) were phenotyped for height. We genotyped 6.1% of the phenotyped individuals with 4373 single nucleotide polymorphisms. The results of ssGBLUP were compared with pedigree-based best linear unbiased prediction (ABLUP) and genomic best linear unbiased prediction (GBLUP), based on genetic parameters, theoretical accuracy of breeding values, selection candidate ranking, genetic gain, and predictive accuracy and prediction bias. RESULTS Genotyping a subset of the study population provided insights into the level of relatedness in BSOs, allowing better genetic management. Due to the inbreeding detected within the genotyped provenances, we estimated genetic parameters both with and without accounting for inbreeding. The ssGBLUP model showed improved performance in terms of additive genetic variance and theoretical breeding value accuracy. Similarly, ssGBLUP showed improved predictive accuracy and lower bias than the pedigree-based relationship matrix (ABLUP). CONCLUSIONS This study of C. africana, a species in decline due to deforestation and selective logging, revealed inbreeding depression. The provenance exhibiting the highest level of inbreeding had the poorest overall performance. The use of different relationship matrices and accounting for inbreeding did not substantially affect the ranking of candidate individuals. 
This is the first study of this approach in a tropical multipurpose tree species, and the analysed BSOs represent the primary effort to breed C. africana.
Affiliation(s)
- Kedra M Ousmael
- Department of Geosciences and Natural Resource Management, University of Copenhagen, Rolighedsvej 23, 1958, Frederiksberg C, Denmark.
- Eduardo P Cappa
- Instituto Nacional de Tecnología Agropecuaria (INTA), Instituto de Recursos Biológicos, Centro de Investigación en Recursos Naturales, De Los Reseros y Dr. Nicolás Repetto s/n, 1686, Hurlingham, Buenos Aires, Argentina
- Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Buenos Aires, Argentina
- Jon K Hansen
- Department of Geosciences and Natural Resource Management, University of Copenhagen, Rolighedsvej 23, 1958, Frederiksberg C, Denmark
- Prasad Hendre
- World Agroforestry Centre (ICRAF), United Nations Avenue, Nairobi, 00100, Kenya
- Ole K Hansen
- Department of Geosciences and Natural Resource Management, University of Copenhagen, Rolighedsvej 23, 1958, Frederiksberg C, Denmark
16
Zhang Y, Zhang M, Ye J, Xu Q, Feng Y, Xu S, Hu D, Wei X, Hu P, Yang Y. Integrating genome-wide association study into genomic selection for the prediction of agronomic traits in rice (Oryza sativa L.). MOLECULAR BREEDING : NEW STRATEGIES IN PLANT IMPROVEMENT 2023; 43:81. [PMID: 37965378 PMCID: PMC10641074 DOI: 10.1007/s11032-023-01423-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 10/09/2023] [Indexed: 11/16/2023]
Abstract
Accurately identifying varieties with targeted agronomic traits is thought to contribute to genetic selection and accelerate rice breeding progress. Genomic selection (GS) is a promising technique that uses markers covering the whole genome to predict genomic-estimated breeding values (GEBV), making it possible to select before phenotypes are measured. To choose appropriate GS models for breeding work, we analyzed the predictability of nine agronomic traits measured in a population of 459 diverse rice varieties. Comparing eight representative GS models, we found that prediction accuracies ranged from 0.407 to 0.896, with reproducing kernel Hilbert space (RKHS) having the highest predictive ability for most traits. Further results demonstrated that the predictive ability of GS is altered by several factors. Moreover, we assessed the integration of genome-wide association study (GWAS) results into various GS models. The predictive abilities of GS combined with peak-associated markers generated from six different GWAS models were significantly different; Mixed Linear Model (MLM)-RKHS is recommended for GWAS-GS-integrated prediction. Finally, based on the above results, we experimented with incorporating the P-values obtained from optimal GWAS models into ridge regression best linear unbiased prediction (rrBLUP), which benefited the traits with low predictive ability in rice.
Affiliation(s)
- Yuanyuan Zhang
- Zhejiang Lab, Hangzhou, 311121 China
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- Mengchen Zhang
- Zhejiang Lab, Hangzhou, 311121 China
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- National Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences, Sanya, 572024 China
- Junhua Ye
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- Qun Xu
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- Yue Feng
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- National Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences, Sanya, 572024 China
- Siliang Xu
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- Dongxiu Hu
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- Xinghua Wei
- Zhejiang Lab, Hangzhou, 311121 China
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- National Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences, Sanya, 572024 China
- Peisong Hu
- Zhejiang Lab, Hangzhou, 311121 China
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- Yaolong Yang
- Zhejiang Lab, Hangzhou, 311121 China
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- National Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences, Sanya, 572024 China
17
Liu H, Yu S. A dimensionality-reduction genomic prediction method without direct inverse of the genomic relationship matrix for large genomic data. PLANT CELL REPORTS 2023; 42:1825-1832. [PMID: 37750948 DOI: 10.1007/s00299-023-03069-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 09/08/2023] [Indexed: 09/27/2023]
Abstract
KEY MESSAGE A new genomic prediction method (RHPP) was developed by combining randomized Haseman-Elston regression (RHE-reg), PCR based on genomic information of a core population, and the preconditioned conjugate gradient (PCG) algorithm. Computational efficiency is becoming a pressing issue in the practical application of genomic prediction because of the large volume of data generated by high-throughput genotyping technology. In this study, we developed a fast genomic prediction method, RHPP, by combining RHE-reg, PCR based on genomic information of a core population, and the PCG algorithm. The simulation results demonstrated similar prediction accuracy between RHPP and GBLUP, and significantly higher computational efficiency of the former as the number of individuals increased. The results for real datasets of both bread wheat and loblolly pine demonstrated that RHPP had similar or better predictive accuracy than GBLUP in most cases. In the future, RHPP may be an attractive choice for analyzing large-scale and high-dimensional data.
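The preconditioned conjugate gradient (PCG) component can be illustrated in isolation. The sketch below solves a symmetric positive definite system A x = b iteratively, which is how a system such as (G + λI) x = y can be handled without forming an explicit inverse. It is a generic textbook PCG with a Jacobi (diagonal) preconditioner, not the RHPP implementation; the function name, tolerance, and the 2 x 2 example system are assumptions.

```python
def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A by Jacobi-preconditioned CG."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(a * c for a, c in zip(u, v))

    m_inv = [1.0 / A[i][i] for i in range(n)]   # inverse of the diagonal of A
    x = [0.0] * n
    r = list(b)                                  # residual b - A x for x = 0
    z = [m_inv[i] * r[i] for i in range(n)]      # preconditioned residual
    p = list(z)                                  # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:               # converged
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)                  # step length along p
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [m_inv[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x


# Small symmetric positive definite example (invented numbers).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = pcg(A, b)
print(x)
```

Each iteration needs only matrix-vector products, so the cost per step grows with the size of the matrix rather than with the cost of inverting it. This is a plausible reason why iterative solvers of this kind are attractive when the number of individuals is large.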
Affiliation(s)
- Hailan Liu
- Maize Research Institute, Sichuan Agricultural University, Chengdu, 611130, Sichuan, China.
- Shizhou Yu
- Molecular Genetics Key Laboratory of China Tobacco, Guizhou Academy of Tobacco Science, Guiyang, 550081, Guizhou, China.
18
Weber SE, Chawla HS, Ehrig L, Hickey LT, Frisch M, Snowdon RJ. Accurate prediction of quantitative traits with failed SNP calls in canola and maize. FRONTIERS IN PLANT SCIENCE 2023; 14:1221750. [PMID: 37936929 PMCID: PMC10627008 DOI: 10.3389/fpls.2023.1221750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Accepted: 10/05/2023] [Indexed: 11/09/2023]
Abstract
In modern plant breeding, genomic selection is becoming the gold standard to select superior genotypes in large breeding populations that are only partially phenotyped. Many breeding programs rely on single-nucleotide polymorphism (SNP) markers to capture genome-wide data for selection candidates. For this purpose, SNP arrays with moderate to high marker density represent a robust and cost-effective tool to generate reproducible, easy-to-handle, high-throughput genotype data from large-scale breeding populations. However, SNP arrays are prone to technical errors that lead to failed allele calls. To overcome this problem, failed calls are often imputed, based on the assumption that failed SNP calls are purely technical. However, this ignores biological causes of failed calls, such as deletions, and there is increasing evidence that gene presence-absence and other kinds of genome structural variants can play a role in phenotypic expression. Because deletions are frequently not in linkage disequilibrium with their flanking SNPs, imputation of missing SNP calls can potentially obscure valuable marker-trait associations. In this study, we analyze published datasets for canola and maize using four parametric and two machine learning models and demonstrate that failed allele calls in genomic prediction are highly predictive for important agronomic traits. We present two statistical pipelines, based on population structure and linkage disequilibrium, that enable the filtering of failed SNP calls that are likely caused by biological reasons. For the population and trait examined, prediction accuracy based on these filtered failed allele calls was competitive with standard SNP-based prediction, underlining the potential value of missing data in genomic prediction approaches. 
Combining SNPs with all failed allele calls or with the filtered allele calls did not outperform SNP-only prediction, owing to redundancy in genomic relationship estimates.
Affiliation(s)
- Sven E. Weber
- Department of Plant Breeding, Justus Liebig University, Giessen, Germany
- Lennard Ehrig
- Department of Plant Breeding, Justus Liebig University, Giessen, Germany
- Lee T. Hickey
- Centre for Crop Science, Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St Lucia, QLD, Australia
- Matthias Frisch
- Department of Biometry and Population Genetics, Justus Liebig University, Giessen, Germany
- Rod J. Snowdon
- Department of Plant Breeding, Justus Liebig University, Giessen, Germany
19
Weber SE, Frisch M, Snowdon RJ, Voss-Fels KP. Haplotype blocks for genomic prediction: a comparative evaluation in multiple crop datasets. FRONTIERS IN PLANT SCIENCE 2023; 14:1217589. [PMID: 37731980 PMCID: PMC10507710 DOI: 10.3389/fpls.2023.1217589] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Accepted: 08/21/2023] [Indexed: 09/22/2023]
Abstract
In modern plant breeding, genomic selection is becoming the gold standard for selection of superior genotypes. The basis for genomic prediction models is a set of phenotyped lines along with their genotypic profile. With high marker density and linkage disequilibrium (LD) between markers, genotype data in breeding populations tend to exhibit considerable redundancy. Therefore, interest is growing in the use of haplotype blocks to overcome redundancy by summarizing co-inherited features. Moreover, haplotype blocks can help to capture local epistasis caused by interacting loci. Here, we compared genomic prediction methods that used either single SNPs or haplotype blocks with regard to their prediction accuracy for important traits in crop datasets. We used four published datasets from canola, maize, wheat and soybean. Different approaches to construct haplotype blocks were compared, including blocks based on LD, physical distance, number of adjacent markers and the algorithms implemented in the software "Haploview" and "HaploBlocker". The tested prediction methods included Genomic Best Linear Unbiased Prediction (GBLUP), Extended GBLUP to account for additive by additive epistasis (EGBLUP), Bayesian LASSO and Reproducing Kernel Hilbert Space (RKHS) regression. We found improved prediction accuracy in some traits when using haplotype blocks compared to SNP-based predictions; however, the magnitude of improvement was very trait- and model-specific. Especially in settings with low marker density, haplotype blocks can improve genomic prediction accuracy. In most cases, physically large haplotype blocks yielded a strong decrease in prediction accuracy. Especially when prediction accuracy varies greatly across different prediction models, prediction based on haplotype blocks can improve prediction accuracy of underperforming models. However, there is no "best" method to build haplotype blocks, since prediction accuracy varied considerably across methods and traits. 
Hence, criteria used to define haplotype blocks should not be viewed as fixed biological parameters, but rather as hyperparameters that need to be adjusted for every dataset.
Affiliation(s)
- Sven E. Weber
- Department of Plant Breeding, Justus Liebig University, Giessen, Germany
- Matthias Frisch
- Department of Biometry and Population Genetics, Justus Liebig University, Giessen, Germany
- Rod J. Snowdon
- Department of Plant Breeding, Justus Liebig University, Giessen, Germany
- Kai P. Voss-Fels
- Institute for Grapevine Breeding, Hochschule Geisenheim University, Geisenheim, Germany
20
Liu Z, Sun H, Zhang Y, Du M, Xiang J, Li X, Chang Y, Sun J, Cheng X, Xiong M, Zhao Z, Liu E. Mining the candidate genes of rice panicle traits via a genome-wide association study. Front Genet 2023; 14:1239550. [PMID: 37732315 PMCID: PMC10507276 DOI: 10.3389/fgene.2023.1239550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Accepted: 08/16/2023] [Indexed: 09/22/2023] Open
Abstract
Panicle traits are important for improving the panicle architecture and grain yield of rice. Therefore, we performed a genome-wide association study (GWAS) to identify the genetic determinants of five panicle traits. A total of 1.29 million single nucleotide polymorphism (SNP) loci were detected in 162 rice materials. We carried out a GWAS of panicle length (PL), total grain number per panicle (TGP), filled grain number per panicle (FGP), seed setting rate (SSR) and grain weight per panicle (GWP) in 2019, 2020 and 2021. Four quantitative trait loci (QTLs) for PL were detected on chromosomes 1, 6, and 9; one QTL for TGP, FGP, and GWP was detected on chromosome 4; two QTLs for FGP were detected on chromosomes 4 and 7; and one QTL for SSR was detected on chromosome 1. These QTLs were detected via a general linear model (GLM) and mixed linear model (MLM) in both years of the study period. In this study, the genomic best linear unbiased prediction (BLUP) method was used to verify the accuracy of the GWAS results. Nine QTLs were detected by both the multi-environment GWAS method and the BLUP method. Moreover, further analysis revealed that three candidate genes, LOC_Os01g43700, LOC_Os09g25784, and LOC_Os04g47890, may be significantly related to panicle traits of rice. Haplotype analysis indicated that LOC_Os01g43700 and LOC_Os09g25784 are highly associated with PL and that LOC_Os04g47890 is highly associated with TGP, FGP, and GWP. Our results offer essential genetic information for the molecular improvement of panicle traits. The identified candidate genes and elite haplotypes could be used in marker-assisted selection to improve rice yield through pyramid breeding.
Affiliation(s)
- Erbao Liu
- College of Agronomy, Anhui Agricultural University, Hefei, China
|
21
|
Adekale D, Alkhoder H, Liu Z, Segelke D, Tetens J. Single-step SNPBLUP evaluation in six German beef cattle breeds. J Anim Breed Genet 2023; 140:496-507. [PMID: 37061869 DOI: 10.1111/jbg.12774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2023] [Revised: 04/03/2023] [Accepted: 04/04/2023] [Indexed: 04/17/2023]
Abstract
The implementation of genomic selection for six German beef cattle populations was evaluated. Although the multiple-step implementation of genomic selection is the status quo in most national dairy cattle evaluations, the breeding structure of German beef cattle, coupled with the shortcomings and complexity of the multiple-step method, makes the single-step approach a more attractive option for implementing genomic selection in German beef cattle populations. Our objective was to develop a national beef cattle single-step genomic evaluation for five economically important traits in six German beef cattle populations and investigate its impact on the accuracy and bias of genomic evaluations relative to the current pedigree-based evaluation. Across the six breeds in our study, 461,929 phenotyped and 14,321 genotyped animals were evaluated with a multi-trait single-step model. To validate the single-step model, phenotype data in the last 2 years were removed in a forward validation study. For the conventional and single-step approaches, the genomic estimated breeding values (GEBVs) of validation animals and other animals were compared between the truncated and the full evaluations. The correlation of the GEBVs between the full and truncated evaluations in the validation animals was slightly higher in the single-step evaluation. The regression of the full GEBVs on truncated GEBVs was close to the optimal value of 1 for both the pedigree-based and the single-step evaluations. The SNP effect estimates from the truncated evaluation were highly correlated with those from the full evaluation, with values ranging from 0.79 to 0.94. The correlation of the SNP effects was influenced by the number of genotyped animals shared between the full and truncated evaluations. The regression coefficients of the SNP effects of the full evaluation on the truncated evaluation were all close to the expected value of 1, indicating unbiased estimates of the SNP markers for the production traits. The Manhattan plot of the SNP effect estimates identified chromosomal regions harbouring major genes for muscling and body weight in breeds of French origin. Based on the regression intercept and slope of the GEBVs of validation animals, the single-step evaluation was neither inflated nor deflated across the six breeds. Overall, the single-step model resulted in a more accurate and stable evaluation. However, due to the small number of genotyped individuals, the single-step method only provided slightly better results when compared to the pedigree-based method.
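As a rough illustration of the forward-validation statistics this abstract describes, the sketch below regresses full-evaluation GEBVs on truncated-evaluation GEBVs and reports the correlation, slope, and intercept. The function name `validation_stats` and the five GEBV values are hypothetical and are not taken from the study; a slope near 1 is conventionally read as the absence of inflation or deflation.

```python
from statistics import mean


def validation_stats(truncated, full):
    """Return (correlation, slope, intercept) for regressing full on truncated GEBVs."""
    mx, my = mean(truncated), mean(full)
    sxx = sum((x - mx) ** 2 for x in truncated)
    syy = sum((y - my) ** 2 for y in full)
    sxy = sum((x - mx) * (y - my) for x, y in zip(truncated, full))
    slope = sxy / sxx              # about 1 when predictions are neither inflated nor deflated
    intercept = my - slope * mx    # about 0 when there is no level bias
    corr = sxy / (sxx * syy) ** 0.5
    return corr, slope, intercept


# Hypothetical GEBVs for five validation animals (illustrative numbers only).
gebv_truncated = [1.0, 2.0, 3.0, 4.0, 5.0]
gebv_full = [1.1, 1.9, 3.2, 3.8, 5.0]

corr, slope, intercept = validation_stats(gebv_truncated, gebv_full)
print(f"correlation={corr:.3f} slope={slope:.3f} intercept={intercept:.3f}")
```

In a real evaluation the two vectors would come from the truncated and full runs of the same model, restricted to the validation animals.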
Affiliation(s)
- Damilola Adekale
- Functional Breeding - Genetik und züchterische Verbesserung funktionaler Merkmale, GAU, Göttingen, Germany
- Biometrie, Vereinigte Informationssysteme Tierhaltung w.V., Verden, Germany
| | - Hatem Alkhoder
- Biometrie, Vereinigte Informationssysteme Tierhaltung w.V., Verden, Germany
| | - Zengting Liu
- Biometrie, Vereinigte Informationssysteme Tierhaltung w.V., Verden, Germany
| | - Dierck Segelke
- Biometrie, Vereinigte Informationssysteme Tierhaltung w.V., Verden, Germany
| | - Jens Tetens
- Functional Breeding - Genetik und züchterische Verbesserung funktionaler Merkmale, GAU, Göttingen, Germany
|
22
|
Feldmann MJ, Covarrubias-Pazaran G, Piepho HP. Complex traits and candidate genes: estimation of genetic variance components across multiple genetic architectures. G3 (Bethesda) 2023; 13:jkad148. [PMID: 37405459 PMCID: PMC10468314 DOI: 10.1093/g3journal/jkad148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 06/09/2023] [Accepted: 06/12/2023] [Indexed: 07/06/2023]
Abstract
Large-effect loci (those statistically significant loci discovered by genome-wide association studies or linkage mapping) associated with key traits segregate amidst a background of minor, often undetectable, genetic effects in wild and domesticated plants and animals. Accurately attributing mean differences and variance explained to the correct components in the linear mixed model analysis is vital for selecting superior progeny and parents in plant and animal breeding, gene therapy, and medical genetics in humans. Marker-assisted prediction and its successor, genomic prediction, have many advantages for selecting superior individuals and understanding disease risk. However, these two approaches are less often integrated to study complex traits with different genetic architectures. This simulation study demonstrates that the average semivariance can be applied to models incorporating Mendelian, oligogenic, and polygenic terms simultaneously and yields accurate estimates of the variance explained for all relevant variables. Our previous research focused on large-effect loci and polygenic variance separately. This work aims to synthesize and expand the average semivariance framework to various genetic architectures and the corresponding mixed models. This framework independently accounts for the effects of large-effect loci and the polygenic genetic background and is universally applicable to genetics studies in humans, plants, animals, and microbes.
Affiliation(s)
- Mitchell J Feldmann
- Department of Plant Sciences, University of California Davis, One Shields Ave, Davis, CA 95616, USA
| | - Giovanny Covarrubias-Pazaran
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, El Batán, 56130 Texcoco, Edo. de México, México
| | - Hans-Peter Piepho
- Biostatistics Unit, Institute of Crop Science, University of Hohenheim, Stuttgart 70599, Germany
|
23
|
Melchinger AE, Fernando R, Stricker C, Schön CC, Auinger HJ. Genomic prediction in hybrid breeding: I. Optimizing the training set design. Theor Appl Genet 2023; 136:176. [PMID: 37532821 PMCID: PMC10397156 DOI: 10.1007/s00122-023-04413-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Accepted: 06/23/2023] [Indexed: 08/04/2023]
Abstract
KEY MESSAGE: Training sets produced by maximizing the number of parent lines, each involved in one cross, had the highest prediction accuracy for H0 hybrids, but lowest for H1 and H2 hybrids. Genomic prediction holds great promise for hybrid breeding but optimum composition of the training set (TS) as determined by the number of parents ($n_{TS}$) and crosses per parent ($c$) has received little attention. Our objective was to examine prediction accuracy ([Formula: see text]) of GCA for lines used as parents of the TS (I1 lines) or not (I0 lines), and H0, H1 and H2 hybrids, comprising crosses of type I0 × I0, I1 × I0 and I1 × I1, respectively, as function of $n_{TS}$ and $c$. In the theory, we developed estimates for [Formula: see text] of GBLUPs for hybrids: (i) [Formula: see text] based on the expected prediction accuracy, and (ii) [Formula: see text] based on [Formula: see text] of GBLUPs of GCA and SCA effects. In the simulation part, hybrid populations were generated using molecular data from two experimental maize data sets. Additive and dominance effects of QTL borrowed from literature were used to simulate six scenarios of traits differing in the proportion ($\tau_{SCA}$ = 1%, 6%, 22%) of SCA variance in $\sigma_G^2$ and heritability ($h^2$ = 0.4, 0.8). Values of [Formula: see text] and [Formula: see text] closely agreed with [Formula: see text] for hybrids. For given size $N_{TS} = n_{TS} \times c$ of TS, [Formula: see text] of H0 hybrids and GCA of I0 lines was highest for $c = 1$. Conversely, for GCA of I1 lines and H1 and H2 hybrids, $c = 1$ yielded lowest [Formula: see text] with concordant results across all scenarios for both data sets. In view of these opposite trends, the optimum choice of $c$ for maximizing selection response across all types of hybrids depends on the size and resources of the breeding program.
Affiliation(s)
- Albrecht E Melchinger
- Plant Breeding, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany.
- Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, 70599, Stuttgart, Germany.
| | - Rohan Fernando
- Department of Animal Science, Iowa State University, Ames, IA, 50011, USA
| | - Christian Stricker
- Plant Breeding, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
| | - Chris-Carolin Schön
- Plant Breeding, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
| | - Hans-Jürgen Auinger
- Plant Breeding, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
|
24
|
Wicki M, Raoul J, Legarra A. Effect of subdivision of the Lacaune dairy sheep breed on the accuracy of genomic prediction. J Dairy Sci 2023; 106:5570-5581. [PMID: 37349212 DOI: 10.3168/jds.2022-23114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Accepted: 02/16/2023] [Indexed: 06/24/2023]
Abstract
Genomic selection was deployed in the Lacaune dairy sheep breed in 2015. The Lacaune population split in 1972 into 2 breeding companies with associated flocks, and there have been very few exchanges of animals between the subpopulations, leading to divergence of the 2 subpopulations. In spite of that, there is a joint genomic prediction. The objective of this study is to understand how this structuring affects prediction accuracy. We analyzed all the data available from the Lacaune breeding program for milk yield: around 6 million phenotypes, 2 million animals in the pedigree and more than 29,000 genotyped animals, including 3,434 and 2,868 AI rams for each company. To consider missing pedigree, we set up genetic groups using the theory of metafounders. First, we studied the pedigree and genomic structures of the 2 subpopulations by calculating Fst, the evolution of average pedigree relationships across time, and a principal components analysis of genomic relationships. In a second part, we compared the reliability between different scenarios: an evaluation with a single reference population (Alone), an evaluation with a joint reference population (Together) and an evaluation of one subpopulation based on the reference population of the other group (Indirect). The low Fst value (0.02) reveals that the 2 subpopulations are still genetically close. Nevertheless, a low and constant average relationship between the animals of the 2 subpopulations confirms the absence of recent connections between them. The principal component analysis results show that even if they are close, they diverge over time. Finally, we observe small gains in accuracy of Together versus Alone, in spite of doubling the reference population size in Together. These gains vary across years and subpopulations: less than 0.08 (0.46 to 0.54; ratio of accuracy for the partial and whole evaluations, corresponding to the greatest change in this ratio for breeding company 1, observed for the cohort 2016) for one subpopulation and between 0.03 (0.55 to 0.58) and 0.17 (0.48 to 0.65) for the other. To conclude, the 2 subpopulations remain close enough genetically so that their combined evaluation is advantageous, even if only slightly.
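To make the Fst statistic mentioned in this abstract concrete, here is a minimal sketch of Wright's Fst for two equally weighted subpopulations, computed from per-locus allele frequencies as a ratio of sums across loci. The abstract does not say which estimator the authors used, so this particular formula, the function name `fst_two_pops`, and the allele frequencies are all assumptions made for illustration.

```python
def fst_two_pops(freqs_a, freqs_b):
    """Wright's Fst for two equally weighted subpopulations (ratio of sums over loci)."""
    num = 0.0
    den = 0.0
    for pa, pb in zip(freqs_a, freqs_b):
        p_bar = (pa + pb) / 2
        h_t = 2 * p_bar * (1 - p_bar)                          # expected heterozygosity, pooled
        h_s = (2 * pa * (1 - pa) + 2 * pb * (1 - pb)) / 2      # mean within-subpopulation value
        num += h_t - h_s
        den += h_t
    return num / den


# Hypothetical allele frequencies at two loci in each subpopulation.
company_1 = [0.5, 0.2]
company_2 = [0.5, 0.4]

fst = fst_two_pops(company_1, company_2)
print(round(fst, 4))
```

Values near 0 indicate that most variation lies within rather than between the subpopulations, which is how the abstract interprets its reported Fst of 0.02.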
Affiliation(s)
- M Wicki
- INRAE, INP, UMR 1388 GenPhySE, F-31326 Castanet-Tolosan, France; Institut de l'Elevage, Castanet-Tolosan 31321, France.
| | - J Raoul
- INRAE, INP, UMR 1388 GenPhySE, F-31326 Castanet-Tolosan, France; Institut de l'Elevage, Castanet-Tolosan 31321, France
| | - A Legarra
- INRAE, INP, UMR 1388 GenPhySE, F-31326 Castanet-Tolosan, France
|
25
|
Wang N, Zhang W, Wang X, Zheng Z, Bai D, Li K, Zhao X, Xiang J, Liang Z, Qian Y, Wang W, Shi Y. Genome-Wide Association Study of Xian Rice Grain Shape and Weight in Different Environments. Plants (Basel) 2023; 12:2549. [PMID: 37447110 PMCID: PMC10347298 DOI: 10.3390/plants12132549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Revised: 06/29/2023] [Accepted: 07/03/2023] [Indexed: 07/15/2023]
Abstract
Drought is one of the key environmental factors affecting the growth and yield potential of rice. Grain shape, on the other hand, is an important factor determining the appearance, quality, and yield of rice grains. Here, we re-sequenced 275 Xian accessions and then conducted a genome-wide association study (GWAS) on six agronomic traits with the 404,411 single nucleotide polymorphisms (SNPs) derived by the best linear unbiased prediction (BLUP) for each trait. Under two years of drought stress (DS) and normal water (NW) treatments, a total of 16 QTLs associated with rice grain shape and grain weight were detected on chromosomes 1, 2, 3, 4, 5, 7, 8, 11, and 12. In addition, these QTLs were analyzed by haplotype analysis and functional annotation, and one cloned gene (GSN1) and five new candidate genes were identified in the candidate interval. The findings provide important genetic information for the molecular improvement of grain shape and weight in rice.
Affiliation(s)
- Nansheng Wang
- College of Agronomy, Anhui Agricultural University, Hefei 230000, China; (N.W.); (W.Z.); (X.W.); (Z.Z.); (D.B.); (K.L.); (X.Z.); (J.X.); (Z.L.); (Y.Q.)
| | - Wanyang Zhang
- College of Agronomy, Anhui Agricultural University, Hefei 230000, China; (N.W.); (W.Z.); (X.W.); (Z.Z.); (D.B.); (K.L.); (X.Z.); (J.X.); (Z.L.); (Y.Q.)
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Xinchen Wang
- College of Agronomy, Anhui Agricultural University, Hefei 230000, China; (N.W.); (W.Z.); (X.W.); (Z.Z.); (D.B.); (K.L.); (X.Z.); (J.X.); (Z.L.); (Y.Q.)
| | - Zhenzhen Zheng
- College of Agronomy, Anhui Agricultural University, Hefei 230000, China; (N.W.); (W.Z.); (X.W.); (Z.Z.); (D.B.); (K.L.); (X.Z.); (J.X.); (Z.L.); (Y.Q.)
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Di Bai
- College of Agronomy, Anhui Agricultural University, Hefei 230000, China; (N.W.); (W.Z.); (X.W.); (Z.Z.); (D.B.); (K.L.); (X.Z.); (J.X.); (Z.L.); (Y.Q.)
| | - Keyang Li
- College of Agronomy, Anhui Agricultural University, Hefei 230000, China; (N.W.); (W.Z.); (X.W.); (Z.Z.); (D.B.); (K.L.); (X.Z.); (J.X.); (Z.L.); (Y.Q.)
| | - Xueyu Zhao
- College of Agronomy, Anhui Agricultural University, Hefei 230000, China; (N.W.); (W.Z.); (X.W.); (Z.Z.); (D.B.); (K.L.); (X.Z.); (J.X.); (Z.L.); (Y.Q.)
| | - Jun Xiang
- College of Agronomy, Anhui Agricultural University, Hefei 230000, China; (N.W.); (W.Z.); (X.W.); (Z.Z.); (D.B.); (K.L.); (X.Z.); (J.X.); (Z.L.); (Y.Q.)
| | - Zhaojie Liang
- College of Agronomy, Anhui Agricultural University, Hefei 230000, China; (N.W.); (W.Z.); (X.W.); (Z.Z.); (D.B.); (K.L.); (X.Z.); (J.X.); (Z.L.); (Y.Q.)
| | - Yingzhi Qian
- College of Agronomy, Anhui Agricultural University, Hefei 230000, China; (N.W.); (W.Z.); (X.W.); (Z.Z.); (D.B.); (K.L.); (X.Z.); (J.X.); (Z.L.); (Y.Q.)
| | - Wensheng Wang
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Yingyao Shi
- College of Agronomy, Anhui Agricultural University, Hefei 230000, China; (N.W.); (W.Z.); (X.W.); (Z.Z.); (D.B.); (K.L.); (X.Z.); (J.X.); (Z.L.); (Y.Q.)
|
26
|
Chen ZQ, Klingberg A, Hallingbäck HR, Wu HX. Preselection of QTL markers enhances accuracy of genomic selection in Norway spruce. BMC Genomics 2023; 24:147. [PMID: 36973641 PMCID: PMC10041705 DOI: 10.1186/s12864-023-09250-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2022] [Accepted: 03/15/2023] [Indexed: 03/29/2023] [Open Access]
Abstract
Genomic prediction (GP) or genomic selection is a method to predict the accumulative effect of all quantitative trait loci (QTLs) in a population by estimating the realized genomic relationships between the individuals and by capturing the linkage disequilibrium between markers and QTLs. Thus, marker preselection is considered a promising method to capture Mendelian segregation effects. Using QTLs detected in a genome-wide association study (GWAS) may improve GP. Here, we performed GWAS and GP in a population with 904 clones from 32 full-sib families using a newly developed 50k SNP Norway spruce array. Through GWAS we identified 41 SNPs associated with budburst stage (BB), and the largest-effect association explained 5.1% of the phenotypic variation (PVE). For the other five traits, such as growth and wood quality traits, only 2-13 associations were observed and the PVE of the strongest effects ranged from 1.2% to 2.0%. GP using approximately 100 preselected SNPs, chosen by the smallest p-values from GWAS, showed the greatest predictive ability (PA) for the trait BB. For the other traits, a preselection of 2000-4000 SNPs was found to offer the best model fit, as judged by the minimum Akaike information criterion, but the PA from GP using such selections was still similar to that of GP using all markers. Analyses on both real-life and simulated data also showed that the inclusion of a large QTL SNP in the model as a fixed effect could improve PA and accuracy of GP provided that the PVE of the QTL was ≥ 2.5%.
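The marker-preselection step described in this abstract can be sketched as ranking markers by a single-marker association score and keeping the top k for the prediction model. The sketch below uses the squared correlation between marker dosage and phenotype as a stand-in for the GWAS p-value ranking; that substitution, the function names `marker_scores` and `preselect`, and the toy data are assumptions, not the authors' pipeline. Scores should be computed on training individuals only, otherwise predictive ability in the validation set will be overstated.

```python
from statistics import mean


def marker_scores(genotypes, phenotypes):
    """Squared correlation between each marker column (0/1/2 dosage) and the phenotype."""
    my = mean(phenotypes)
    syy = sum((y - my) ** 2 for y in phenotypes)
    scores = []
    for j in range(len(genotypes[0])):
        col = [row[j] for row in genotypes]
        mx = mean(col)
        sxx = sum((x - mx) ** 2 for x in col)
        sxy = sum((x - mx) * (y - my) for x, y in zip(col, phenotypes))
        # Monomorphic markers carry no information and get a score of zero.
        scores.append(0.0 if sxx == 0 or syy == 0 else sxy ** 2 / (sxx * syy))
    return scores


def preselect(genotypes, phenotypes, k):
    """Return the sorted column indices of the k highest-scoring markers."""
    scores = marker_scores(genotypes, phenotypes)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Hypothetical training data: six individuals, four markers (the last is monomorphic).
train_geno = [
    [0, 1, 0, 2],
    [1, 1, 0, 2],
    [2, 0, 1, 2],
    [0, 2, 1, 2],
    [1, 0, 2, 2],
    [2, 1, 2, 2],
]
train_pheno = [1.0, 1.2, 2.1, 1.9, 3.0, 3.1]

print(preselect(train_geno, train_pheno, 2))
```

The selected columns would then be passed to whatever prediction model is in use, in place of the full marker set.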
Affiliation(s)
- Zhi-Qiang Chen
- Umeå Plant Science Centre, Department Forest Genetics and Plant Physiology, Swedish University of Agricultural Sciences, 90183, Umeå, Sweden.
| | - Harry X Wu
- Umeå Plant Science Centre, Department Forest Genetics and Plant Physiology, Swedish University of Agricultural Sciences, 90183, Umeå, Sweden.
- Black Mountain Laboratory, CSIRO National Collection Research Australia, Canberra, ACT, 2601, Australia.
|
27
|
Jiménez NP, Feldmann MJ, Famula RA, Pincot DDA, Bjornson M, Cole GS, Knapp SJ. Harnessing underutilized gene bank diversity and genomic prediction of cross usefulness to enhance resistance to Phytophthora cactorum in strawberry. Plant Genome 2023; 16:e20275. [PMID: 36480594 DOI: 10.1002/tpg2.20275] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/29/2022] [Accepted: 09/19/2022] [Indexed: 05/10/2023]
Abstract
The development of strawberry (Fragaria × ananassa Duchesne ex Rozier) cultivars resistant to Phytophthora crown rot (PhCR), a devastating disease caused by the soil-borne pathogen Phytophthora cactorum (Lebert & Cohn) J. Schröt., has been challenging partly because the resistance phenotypes are quantitative and only moderately heritable. To develop deeper insights into the genetics of resistance and build the foundation for applying genomic selection, a genetically diverse training population was screened for resistance to California isolates of the pathogen. Here we show that genetic gains in breeding for resistance to PhCR have been negligible (3% of the cultivars tested were highly resistant and none surpassed early 20th century cultivars). Narrow-sense genomic heritability for PhCR resistance ranged from 0.41 to 0.75 among training population individuals. Using multivariate genome-wide association studies (GWAS), we identified a large-effect locus (predicted to be RPc2) that explained 43.6-51.6% of the genetic variance, was necessary but not sufficient for resistance, and was associated with calcium channel and other candidate genes with known plant defense functions. The addition of underutilized gene bank resources to our training population doubled additive genetic variance, increased the accuracy of genomic selection, and enabled the discovery of individuals carrying favorable alleles that are either rare or not present in modern cultivars. The incorporation of an RPc2-associated single-nucleotide polymorphism (SNP) as a fixed effect increased genomic prediction accuracy from 0.40 to 0.55. Finally, we show that parent selection using genomic-estimated breeding values, genetic variances, and cross usefulness holds promise for enhancing resistance to PhCR in strawberry.
Affiliation(s)
- Nicolás P Jiménez
- Dep. of Plant Sciences, Univ. of California, One Shields Ave, Davis, CA, 95616, USA
| | - Mitchell J Feldmann
- Dep. of Plant Sciences, Univ. of California, One Shields Ave, Davis, CA, 95616, USA
| | - Randi A Famula
- Dep. of Plant Sciences, Univ. of California, One Shields Ave, Davis, CA, 95616, USA
| | - Dominique D A Pincot
- Dep. of Plant Sciences, Univ. of California, One Shields Ave, Davis, CA, 95616, USA
| | - Marta Bjornson
- Dep. of Plant Sciences, Univ. of California, One Shields Ave, Davis, CA, 95616, USA
| | - Glenn S Cole
- Dep. of Plant Sciences, Univ. of California, One Shields Ave, Davis, CA, 95616, USA
| | - Steven J Knapp
- Dep. of Plant Sciences, Univ. of California, One Shields Ave, Davis, CA, 95616, USA
|
28
|
Bandillo NB, Jarquin D, Posadas LG, Lorenz AJ, Graef GL. Genomic selection performs as effectively as phenotypic selection for increasing seed yield in soybean. Plant Genome 2023; 16:e20285. [PMID: 36447395 DOI: 10.1002/tpg2.20285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Accepted: 10/08/2022] [Indexed: 05/10/2023]
Abstract
Increasing the rate of genetic gain for seed yield remains the primary breeding objective in both public and private soybean [Glycine max (L.) Merr.] breeding programs. Genomic selection (GS) has the potential to accelerate the rate of genetic gain for soybean seed yield. Limited studies to date have validated GS accuracy and directly compared GS with phenotypic selection (PS), and none have been reported in soybean. This study conducted the first empirical validation of GS for increasing seed yield using over 1,500 lines and over 7 yr (2010-2016) of replicated experiments in the University of Nebraska-Lincoln soybean breeding program. The study was designed to capture the varying genetic relatedness of the training population to three validation sets: two large biparental populations (TBP-1 and TBP-2) and a large validation set comprised of 457 preselected advanced lines derived from 45 biparental populations (TMP). We found that prediction accuracy (.54) realized in our validation experiments was comparable with what we obtained from a series of cross-validation experiments (.64). Both GS and PS were more effective for increasing the population mean performance compared with random selection (RS). We found a selection advantage of GS over PS, with genetic gain and identification of top-performing lines maximized at a selected proportion of 10-20%. Genomic selection led to small increases in genetic similarity when compared with PS and RS, presumably because of a significant shift in allele frequencies toward the extremes, suggesting that it could erode genetic diversity more quickly. Overall, we found that GS can perform as effectively as PS but that measures should be considered to protect against loss of genetic variance when using GS.
Affiliation(s)
- Nonoy B Bandillo
- Dep. of Agronomy and Horticulture, Univ. of Nebraska, 363 Keim Hall, Lincoln, NE, 68583, USA
- Dep. of Plant Sciences, North Dakota State Univ., NDSU Dep. 7670, P.O. Box 6050, Fargo, ND, 58108-6050, USA
| | - Diego Jarquin
- Dep. of Agronomy and Horticulture, Univ. of Nebraska, 363 Keim Hall, Lincoln, NE, 68583, USA
- Agronomy Dep., Univ. of Florida, 2089 McCarthy Hall B, Gainesville, FL, 32611, USA
| | - Luis G Posadas
- Dep. of Agronomy and Horticulture, Univ. of Nebraska, 363 Keim Hall, Lincoln, NE, 68583, USA
| | - Aaron J Lorenz
- Dep. of Agronomy and Horticulture, Univ. of Nebraska, 363 Keim Hall, Lincoln, NE, 68583, USA
- Dep. of Agronomy and Plant Genetics, Univ. of Minnesota, St. Paul, MN, 55108-6026, USA
| | - George L Graef
- Dep. of Agronomy and Horticulture, Univ. of Nebraska, 363 Keim Hall, Lincoln, NE, 68583, USA
|
29
|
Learning high-order interactions for polygenic risk prediction. PLoS One 2023; 18:e0281618. [PMID: 36763605 PMCID: PMC9916647 DOI: 10.1371/journal.pone.0281618] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/29/2022] [Accepted: 01/27/2023] [Indexed: 02/11/2023] [Open Access]
Abstract
Within the framework of precision medicine, the stratification of individual genetic susceptibility based on inherited DNA variation has paramount relevance. However, one of the most relevant pitfalls of traditional Polygenic Risk Score (PRS) approaches is their inability to model complex high-order non-linear SNP-SNP interactions and their effect on the phenotype (e.g. epistasis). Indeed, they incur a computational challenge, as the number of possible interactions grows exponentially with the number of SNPs considered, which also affects the statistical reliability of the model parameters. In this work, we address this issue by proposing a novel PRS approach, called High-order Interactions-aware Polygenic Risk Score (hiPRS), that incorporates high-order interactions in modeling polygenic risk. hiPRS combines an interaction search routine based on frequent itemsets mining and a novel interaction selection algorithm based on Mutual Information, to construct a simple and interpretable weighted model of user-specified dimensionality that can predict a given binary phenotype. Compared to traditional PRS methods, hiPRS does not rely on GWAS summary statistics nor any external information. Moreover, hiPRS differs from Machine Learning-based approaches that can include complex interactions in that it provides a readable and interpretable model and it is able to control overfitting, even on small samples. In the present work we demonstrate through a comprehensive simulation study the superior performance of hiPRS with respect to state-of-the-art methods, both in terms of scoring performance and interpretability of the resulting model. We also test hiPRS against small sample size, class imbalance and the presence of noise, showcasing its robustness to extreme experimental settings. Finally, we apply hiPRS to a case study on real data from the DACHS cohort, defining an interaction-aware scoring model to predict mortality of stage II-III Colon-Rectal Cancer patients treated with oxaliplatin.
|
30
|
Angarita Barajas BK, Cantet RJC, Steibel JP, Schrauf MF, Forneris NS. Heritability estimates and predictive ability for pig meat quality traits using identity-by-state and identity-by-descent relationships in an F2 population. J Anim Breed Genet 2023; 140:13-27. [PMID: 36300585 DOI: 10.1111/jbg.12742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2021] [Accepted: 10/05/2022] [Indexed: 12/13/2022]
Abstract
Genomic relationships can be computed with dense genome-wide genotypes through different methods, either based on identity-by-state (IBS) or identity-by-descent (IBD). The latter has been shown to increase the accuracy of both estimated relationships and predicted breeding values. However, it is not clear whether an IBD approach would achieve greater heritability ($h^2$) and predictive ability ($\hat{r}_{y,\hat{y}}$) than its IBS counterpart for data with low-depth pedigrees. Here, we compare both approaches in terms of the estimates of $h^2$ and $\hat{r}_{y,\hat{y}}$, using data on meat quality and carcass traits recorded in experimental crossbred pigs, with a pedigree constrained to only three generations. Three animal models were fitted which differed in the relationship matrix: an IBS model ($\mathbf{G}_{IBS}$), an IBD (defined within the known pedigree) model ($\mathbf{G}_{IBD}$), and a pedigree model ($\mathbf{A}_{22}$). In 9 of 20 traits, the estimates of $\sigma_u^2$ and $h^2$ were 1.2-2.9 times greater with the $\mathbf{G}_{IBS}$ and $\mathbf{G}_{IBD}$ models than with $\mathbf{A}_{22}$, whereas for all traits both parameters were similar between the genomic models. The $\hat{r}_{y,\hat{y}}$ of the genomic models was higher than that of $\mathbf{A}_{22}$. A slight increment in $\hat{r}_{y,\hat{y}}$ was found with $\mathbf{G}_{IBS}$ when compared to $\mathbf{G}_{IBD}$, most likely due to the former recovering sizeable relationships among founder F0 animals.
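For readers unfamiliar with IBS-based genomic relationship matrices, the sketch below builds one with VanRaden's first method, G = ZZ' / (2 Σ p_j(1 - p_j)), where Z holds allele dosages centred by twice the allele frequency. The abstract does not state which IBS estimator the authors used, so the choice of VanRaden's method is an assumption, as are the function name `vanraden_g` and the toy genotypes; the IBD-based matrix is not shown.

```python
def vanraden_g(genotypes):
    """IBS genomic relationship matrix from 0/1/2 dosages (VanRaden's first method)."""
    n = len(genotypes)
    m = len(genotypes[0])
    # Allele frequency of the counted allele at each marker, estimated from the sample.
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    # Centre each dosage by its expectation 2p.
    z = [[row[j] - 2 * freqs[j] for j in range(m)] for row in genotypes]
    return [
        [sum(a * b for a, b in zip(z[i], z[k])) / scale for k in range(n)]
        for i in range(n)
    ]


# Hypothetical dosages for three animals at two markers.
geno = [
    [0, 2],
    [1, 1],
    [2, 0],
]

G = vanraden_g(geno)
for row in G:
    print(row)
```

The resulting matrix would replace the pedigree relationship matrix in an animal model; with real data the allele frequencies are often taken from a base population rather than the genotyped sample.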
Affiliation(s)
| | - Rodolfo J C Cantet
- Instituto de Investigaciones en Producción Animal (INPA-CONICET-UBA), Buenos Aires, Argentina.,Departamento de Producción Animal, Facultad de Agronomía, Universidad de Buenos Aires, Buenos Aires, Argentina
| | - Juan P Steibel
- Department of Animal Science, Michigan State University, East Lansing, Michigan, USA.,Department of Fisheries and Wildlife, Michigan State University, East Lansing, Michigan, USA
| | - Matias F Schrauf
- Departamento de Métodos Cuantitativos y Sistemas de Información, Facultad de Agronomía, Universidad de Buenos Aires, Buenos Aires, Argentina.,Animal Breeding & Genomics, Wageningen Livestock Research, Wageningen University & Research, Wageningen, The Netherlands
| | - Natalia S Forneris
- Instituto de Investigaciones en Producción Animal (INPA-CONICET-UBA), Buenos Aires, Argentina.,Departamento de Producción Animal, Facultad de Agronomía, Universidad de Buenos Aires, Buenos Aires, Argentina
|
31
|
DoVale JC, Carvalho HF, Sabadin F, Fritsche-Neto R. Genotyping marker density and prediction models effects in long-term breeding schemes of cross-pollinated crops. Theor Appl Genet 2022; 135:4523-4539. [PMID: 36261658 DOI: 10.1007/s00122-022-04236-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2022] [Accepted: 10/09/2022] [Indexed: 06/16/2023]
Abstract
In genomic recurrent selection, the more markers, the better because they buffer the linkage disequilibrium losses caused by recombination over cycles, and consequently, provide higher responses to selection. Reductions of genotyping marker density have been extensively evaluated as potential strategies to reduce the genotyping costs of genomic selection (GS). Low-density marker panels are appealing in GS because they entail lower multicollinearity and computing time and allow more individuals to be genotyped for the same cost. However, statistical models used in GS are usually evaluated with empirical data, using "static" training sets and populations. This may be adequate for making predictions during a breeding program's initial cycles but not for the long term. Moreover, studies that focus on long selective breeding cycles generally do not consider GS models with the effect of dominance, which is particularly important for breeding outcomes in cross-pollinated crops. Hence, dominance effects are important and unexplored in GS for long-term programs involving allogamous species. To address this, we employed two approaches: analysis of empirical maize datasets and simulations of long-term breeding applying phenotypic and genomic recurrent selection (intrapopulation and reciprocal schemes). In both schemes, we simulated twenty breeding cycles and assessed the effect of marker density reduction on the population mean, the best crosses, additive variance, selective accuracy, and response to selection with four models [additive, additive-dominant, general combining ability (GCA), and GCA plus specific combining ability (GCA + SCA)]. Our results indicate that marker reduction based on linkage disequilibrium levels provides useful predictions only within a cycle, as accuracy significantly decreases over cycles. In the long term, without training set updating, high marker density provides the best responses to selection. The model to be used depends on the breeding scheme: additive for intrapopulation and additive-dominant or GCA + SCA for reciprocal.
Affiliation(s)
- Júlio César DoVale
- Department of Crop Science, Federal University of Ceará, Fortaleza, CE, Brazil.
- Felipe Sabadin
- Virginia Tech: Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
|
32
|
Ashraf B, Hunter DC, Bérénos C, Ellis PA, Johnston SE, Pilkington JG, Pemberton JM, Slate J. Genomic prediction in the wild: A case study in Soay sheep. Mol Ecol 2022; 31:6541-6555. [PMID: 34719074 DOI: 10.1111/mec.16262] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 07/29/2021] [Revised: 10/13/2021] [Accepted: 10/25/2021] [Indexed: 01/13/2023]
Abstract
Genomic prediction, the technique whereby an individual's genetic component of their phenotype is estimated from its genome, has revolutionised animal and plant breeding and medical genetics. However, despite being first introduced nearly two decades ago, it has hardly been adopted by the evolutionary genetics community studying wild organisms. Here, genomic prediction is performed on eight traits in a wild population of Soay sheep. The population has been the focus of a >30 year evolutionary ecology study and there is already considerable understanding of the genetic architecture of the focal Mendelian and quantitative traits. We show that the accuracy of genomic prediction is high for all traits, but especially those with loci of large effect segregating. Five different methods are compared, and the two methods that can accommodate zero-effect and large-effect loci in the same model tend to perform best. If the accuracy of genomic prediction is similar in other wild populations, then there is a real opportunity for pedigree-free molecular quantitative genetics research to be enabled in many more wild populations; currently the literature is dominated by studies that have required decades of field data collection to generate sufficiently deep pedigrees. Finally, some of the potential applications of genomic prediction in wild populations are discussed.
Affiliation(s)
- Bilal Ashraf
- School of Biosciences, University of Sheffield, Sheffield, UK; Department of Anthropology, Durham University, Durham, UK
- Darren C Hunter
- School of Biosciences, University of Sheffield, Sheffield, UK; School of Biology, University of St Andrews, St Andrews, UK
- Camillo Bérénos
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Philip A Ellis
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Susan E Johnston
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Jill G Pilkington
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Jon Slate
- School of Biosciences, University of Sheffield, Sheffield, UK
|
33
|
Hardner CM, Fikere M, Gasic K, da Silva Linge C, Worthington M, Byrne D, Rawandoozi Z, Peace C. Multi-environment genomic prediction for soluble solids content in peach (Prunus persica). FRONTIERS IN PLANT SCIENCE 2022; 13:960449. [PMID: 36275520 PMCID: PMC9583944 DOI: 10.3389/fpls.2022.960449] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/03/2022] [Accepted: 08/01/2022] [Indexed: 06/16/2023]
Abstract
Genotype-by-environment interaction (G × E) is a common phenomenon influencing genetic improvement in plants, and a good understanding of this phenomenon is important for breeding and cultivar deployment strategies. However, there is little information on G × E in horticultural tree crops, mostly due to evaluation costs, leading to a focus on the development and deployment of locally adapted germplasm. Using sweetness (measured as soluble solids content, SSC) in peach/nectarine assessed at four trials from three US peach-breeding programs as a case study, we evaluated the hypotheses that (i) complex data from multiple breeding programs can be connected using GBLUP models to improve the knowledge of G × E for breeding and deployment and (ii) accounting for a known large-effect quantitative trait locus (QTL) improves the prediction accuracy. Following a structured strategy using univariate and multivariate models containing additive and dominance genomic effects on SSC, a model that included a previously detected QTL and background genomic effects was a significantly better fit than a genome-wide model with completely anonymous markers. Estimates of an individual's narrow-sense and broad-sense heritability for SSC were high (0.57-0.73 and 0.66-0.80, respectively), with 19-32% of total genomic variance explained by the QTL. Genome-wide dominance effects and QTL effects were stable across environments. Significant G × E was detected for background genome effects, mostly due to the low correlation of these effects across seasons within a particular trial. The expected prediction accuracy, estimated from the linear model, was higher than the realised prediction accuracy estimated by cross-validation, suggesting that these two parameters measure different qualities of the prediction models. 
While prediction accuracy was improved in some cases by combining data across trials, particularly when phenotypic data for untested individuals were available from other trials, this improvement was not consistent. This study confirms that complex data can be combined into a single analysis using GBLUP methods to improve understanding of G × E and also incorporate known QTL effects. In addition, the study generated baseline information to account for population structure in genomic prediction models in horticultural crop improvement.
Affiliation(s)
- Craig M. Hardner
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Mulusew Fikere
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Ksenija Gasic
- Department of Plant and Environmental Sciences, Clemson University, Clemson, SC, United States
- Cassia da Silva Linge
- Department of Plant and Environmental Sciences, Clemson University, Clemson, SC, United States
- Margaret Worthington
- Faculty Horticulture, University of Arkansas System Division of Agriculture, Fayetteville, AR, United States
- David Byrne
- College of Agriculture and Life Sciences, Texas A&M University, College Station, TX, United States
- Zena Rawandoozi
- College of Agriculture and Life Sciences, Texas A&M University, College Station, TX, United States
- Cameron Peace
- Department of Horticulture, Washington State University, Pullman, WA, United States
|
34
|
Olasege BS, Porto-Neto LR, Tahir MS, Gouveia GC, Cánovas A, Hayes BJ, Fortes MRS. Correlation scan: identifying genomic regions that affect genetic correlations applied to fertility traits. BMC Genomics 2022; 23:684. [PMID: 36195838 PMCID: PMC9533527 DOI: 10.1186/s12864-022-08898-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2022] [Accepted: 09/19/2022] [Indexed: 11/10/2022]
Abstract
Although the genetic correlations between complex traits have been estimated for more than a century, only recently have we started to map and understand the precise localization of the genomic region(s) that underpin these correlations. Reproductive traits are often genetically correlated. Yet, we don't fully understand the complexities, synergism, or trade-offs between male and female fertility. In this study, we used reproductive traits in two cattle populations (Brahman; BB, Tropical Composite; TC) to develop a novel framework termed correlation scan (CS). This framework was used to identify local regions associated with the genetic correlations between male and female fertility traits. Animals were genotyped with a bovine high-density single nucleotide polymorphism (SNP) chip assay. The data used consisted of ~1000 individual records measured through frequent ovarian scanning for age at first corpus luteum (AGECL) and a laboratory assay for serum levels of insulin growth hormone (IGF1 measured in bulls, IGF1b, or cows, IGF1c). The methodology developed herein used correlations of 500-SNP effects in a 100-SNP sliding window in each chromosome to identify local genomic regions that either drive or antagonize the genetic correlations between traits. We used Fisher's Z-statistics through a permutation method to confirm which regions of the genome harboured significant correlations. About 30% of the total genomic regions were identified as driving and antagonizing genetic correlations between male and female fertility traits in the two populations. These regions confirmed the polygenic nature of the traits being studied and pointed to genes of interest. For BB, the most important local regions are often located on bovine chromosome (BTA) 14. However, the important regions are spread across a few different BTAs in TC. 
Quantitative trait loci (QTLs) and functional enrichment analysis revealed many significant windows co-localized with known QTLs related to milk production and fertility traits, especially puberty. In general, the enriched reproductive QTLs driving the genetic correlations between male and female fertility are the same for both cattle populations, while the antagonizing regions were population specific. Moreover, most of the antagonizing regions were mapped to chromosome X. These results suggest regions of chromosome X for further investigation into the trade-offs between male and female fertility. We compared the CS with two other recently proposed methods that map local genomic correlations. Some genomic regions were significant across methods. Yet, many significant regions identified with the CS were overlooked by other methods.
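The windowed scan described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes that SNP effects for two traits are already estimated and ordered along one chromosome, and it reads "500-SNP effects in a 100-SNP sliding window" as a 500-SNP window moved in steps of 100 SNPs, which is only one possible interpretation. The function names (`pearson`, `fisher_z`, `correlation_scan`) are hypothetical, and the permutation step used to declare significance is omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r, eps=1e-12):
    """Fisher z-transform, clamped so that |r| = 1 stays finite."""
    r = max(-1.0 + eps, min(1.0 - eps, r))
    return math.atanh(r)


def correlation_scan(effects_a, effects_b, window=500, step=100):
    """Correlate SNP effects of two traits inside each sliding window.

    Assumed reading of the abstract: windows of `window` SNPs, moved `step`
    SNPs at a time. Returns (start_index, r, z) for every full window.
    """
    if len(effects_a) != len(effects_b):
        raise ValueError("both traits need effects for the same SNPs")
    results = []
    for start in range(0, len(effects_a) - window + 1, step):
        r = pearson(effects_a[start:start + window],
                    effects_b[start:start + window])
        results.append((start, r, fisher_z(r)))
    return results
```

Windows with strongly positive z would be candidates for regions that drive a positive genetic correlation, and windows with the opposite sign for regions that antagonize it; in practice the thresholds would come from a permutation null, as the abstract describes.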
Affiliation(s)
- Babatunde S Olasege
- The University of Queensland, School of Chemistry and Molecular Biosciences, Saint Lucia Campus, Brisbane, QLD, 4072, Australia; CSIRO Agriculture and Food, Saint Lucia, QLD, 4067, Australia
- Muhammad S Tahir
- The University of Queensland, School of Chemistry and Molecular Biosciences, Saint Lucia Campus, Brisbane, QLD, 4072, Australia; CSIRO Agriculture and Food, Saint Lucia, QLD, 4067, Australia
- Gabriela C Gouveia
- Animal Science Department, Veterinary School, Federal University of Minas Gerais, Belo Horizonte, 31270-901, Brazil
- Angela Cánovas
- Department of Animal Biosciences, Centre for Genetic Improvement of Livestock, University of Guelph, 50 Stone Rd E, Guelph, ON, N1G 2W1, Canada
- Ben J Hayes
- The University of Queensland, Queensland Alliance for Agriculture and Food Innovation (QAAFI), Saint Lucia Campus, Brisbane, QLD, 4072, Australia
- Marina R S Fortes
- The University of Queensland, School of Chemistry and Molecular Biosciences, Saint Lucia Campus, Brisbane, QLD, 4072, Australia; The University of Queensland, Queensland Alliance for Agriculture and Food Innovation (QAAFI), Saint Lucia Campus, Brisbane, QLD, 4072, Australia.
|
35
|
Bhattarai G, Shi A, Mou B, Correll JC. Resequencing worldwide spinach germplasm for identification of field resistance QTLs to downy mildew and assessment of genomic selection methods. HORTICULTURE RESEARCH 2022; 9:uhac205. [PMID: 36467269 PMCID: PMC9715576 DOI: 10.1093/hr/uhac205] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/30/2022] [Accepted: 09/04/2022] [Indexed: 06/16/2023]
Abstract
Downy mildew, commercially the most important disease of spinach, is caused by the obligate oomycete Peronospora effusa. In the past two decades, new pathogen races have repeatedly overcome the resistance used in newly released cultivars, urging the need for more durable resistance. Commercial spinach cultivars are bred with major R genes to impart resistance to downy mildew pathogens and are effective against some pathogen races/isolates. This work aimed to evaluate the worldwide USDA spinach germplasm collections and commercial cultivars for resistance to downy mildew pathogen in the field condition under natural inoculum pressure and conduct genome wide association analysis (GWAS) to identify resistance-associated genomic regions (alleles). Another objective was to evaluate the prediction accuracy (PA) using several genomic prediction (GP) methods to assess the potential implementation of genomic selection (GS) to improve spinach breeding for resistance to downy mildew pathogen. More than four hundred diverse spinach genotypes comprising USDA germplasm accessions and commercial cultivars were evaluated for resistance to downy mildew pathogen between 2017-2019 in Salinas Valley, California and Yuma, Arizona. GWAS was performed using single nucleotide polymorphism (SNP) markers identified via whole genome resequencing (WGR) in GAPIT and TASSEL programs; detected 14, 12, 5, and 10 significantly associated SNP markers with the resistance from four tested environments, respectively; and the QTL alleles were detected at the previously reported region of chromosome 3 in three of the four experiments. In parallel, PA was assessed using six GP models and seven unique marker datasets for field resistance to downy mildew pathogen across four tested environments. The results suggest the suitability of GS to improve field resistance to downy mildew pathogen. 
The QTL, SNP markers, and PA estimates provide new information in spinach breeding to select resistant plants and breeding lines through marker-assisted selection (MAS) and GS, eventually helping to accumulate beneficial alleles for durable disease resistance.
|
36
|
Ayat M, Domaratzki M. Sparse bayesian learning for genomic selection in yeast. FRONTIERS IN BIOINFORMATICS 2022; 2:960889. [PMID: 36304259 PMCID: PMC9580947 DOI: 10.3389/fbinf.2022.960889] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2022] [Accepted: 08/02/2022] [Indexed: 11/13/2022]
Abstract
Genomic selection, which predicts phenotypes such as yield and drought resistance in crops from high-density markers positioned throughout the genome of the varieties, is moving towards machine learning techniques to make predictions on complex traits that are controlled by several genes. In this paper, we consider sparse Bayesian learning and ensemble learning as a technique for genomic selection and ranking markers based on their relevance to a trait. We define and explore two different forms of the sparse Bayesian learning for predicting phenotypes and identifying the most influential markers of a trait, respectively. We apply our methods on a Saccharomyces cerevisiae dataset, and analyse our results with respect to existing related works, trait heritability, as well as the accuracies obtained from linear and Gaussian kernel functions. We find that sparse Bayesian methods are not only competitive with other machine learning methods in predicting yeast growth in different environments, but are also capable of identifying the most important markers, including both positive and negative effects on the growth, from which biologists can get insight. This attribute can make our proposed ensemble of sparse Bayesian learners favourable in ranking markers based on their relevance to a trait.
Affiliation(s)
- Maryam Ayat
- Lactanet, Sainte-Anne-de-Bellevue, QC, Canada
- Mike Domaratzki
- Department of Computer Science, University of Western Ontario, London, ON, Canada
- *Correspondence: Mike Domaratzki,
|
37
|
Gamal El‐Dien O, Shalev TJ, Yuen MMS, Stirling R, Daniels LD, Breinholt JW, Neves LG, Kirst M, Van der Merwe L, Yanchuk AD, Ritland C, Russell JH, Bohlmann J. Genomic selection reveals hidden relatedness and increased breeding efficiency in western redcedar polycross breeding. Evol Appl 2022; 15:1291-1312. [PMID: 36051463 PMCID: PMC9423091 DOI: 10.1111/eva.13463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2022] [Revised: 06/19/2022] [Accepted: 07/26/2022] [Indexed: 11/29/2022]
Abstract
Western redcedar (WRC) is an ecologically and economically important forest tree species characterized by low genetic diversity with high self-compatibility and high heartwood durability. Using sequence capture genotyping of target genic and non-genic regions, we genotyped 44 parent trees and 1520 offspring trees representing 26 polycross (PX) families collected from three progeny test sites using 45,378 SNPs. Trees were phenotyped for eight traits related to growth, heartwood and foliar chemistry associated with wood durability and deer browse resistance. We used the genomic realized relationship matrix for paternity assignment, maternal pedigree correction, and to estimate genetic parameters. We compared genomics-based (GBLUP) and two pedigree-based (ABLUP: polycross and reconstructed full-sib [FS] pedigrees) models. Models were extended to estimate dominance genetic effects. Pedigree reconstruction revealed significant unequal male contribution and separated the 26 PX families into 438 FS families. Traditional maternal PX pedigree analysis resulted in up to 51% overestimation in genetic gain and 44% in diversity. Genomic analysis resulted in up to 22% improvement in offspring breeding value (BV) theoretical accuracy, 35% increase in expected genetic gain for forward selection, and doubled selection intensity for backward selection. Overall, all traits showed low to moderate heritability (0.09-0.28), moderate genotype by environment interaction (type-B genetic correlation: 0.51-0.80), low to high expected genetic gain (6.01%-55%), and no significant negative genetic correlation reflecting no large trade-offs for multi-trait selection. Only three traits showed a significant dominance effect. GBLUP resulted in smaller but more accurate heritability estimates for five traits, but larger estimates for the wood traits. Comparison between all, genic-coding, genic-non-coding and intergenic SNPs showed little difference in genetic estimates. 
In summary, we show that GBLUP overcomes the PX limitations, successfully captures expected historical and hidden relatedness as well as linkage disequilibrium (LD), and results in increased breeding efficiency in WRC.
Affiliation(s)
- Omnia Gamal El‐Dien
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Pharmacognosy Department, Faculty of Pharmacy, Alexandria University, Alexandria, Egypt
- Tal J. Shalev
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Macaire M. S. Yuen
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Lori D. Daniels
- Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, British Columbia, Canada
- Jesse W. Breinholt
- Rapid Genomics, Gainesville, Florida, USA
- Intermountain Healthcare, Intermountain Precision Genomics, St. George, Utah, USA
- Matias Kirst
- School of Forest, Fisheries and Geomatic Sciences, University of Florida, Gainesville, Florida, USA
- Lise Van der Merwe
- British Columbia Ministry of Forests, Lands and Natural Resource Operations and Rural Development, Victoria, British Columbia, Canada
- Alvin D. Yanchuk
- British Columbia Ministry of Forests, Lands and Natural Resource Operations and Rural Development, Victoria, British Columbia, Canada
- Carol Ritland
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, British Columbia, Canada
- John H. Russell
- British Columbia Ministry of Forests, Lands and Natural Resource Operations and Rural Development, Victoria, British Columbia, Canada
- Joerg Bohlmann
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, British Columbia, Canada
- Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada
|
38
|
Mancin E, Mota LFM, Tuliozi B, Verdiglione R, Mantovani R, Sartori C. Improvement of Genomic Predictions in Small Breeds by Construction of Genomic Relationship Matrix Through Variable Selection. Front Genet 2022; 13:814264. [PMID: 35664297 PMCID: PMC9158133 DOI: 10.3389/fgene.2022.814264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2021] [Accepted: 03/22/2022] [Indexed: 11/13/2022]
Abstract
Genomic selection has been increasingly implemented in the animal breeding industry, and it is becoming a routine method in many livestock breeding contexts. However, its use is still limited in several small-population local breeds, which are, nonetheless, an important source of genetic variability of great economic value. A major roadblock for their genomic selection is accuracy when population size is limited: to improve breeding value accuracy, variable selection models that assume heterogenous variance have been proposed over the last few years. However, while these models might outperform traditional and genomic predictions in terms of accuracy, they also carry a proportional increase of breeding value bias and dispersion. These mutual increases are especially striking when genomic selection is performed with a low number of phenotypes and high shrinkage value—which is precisely the situation that happens with small local breeds. In our study, we tested several alternative methods to improve the accuracy of genomic selection in a small population. First, we investigated the impact of using only a subset of informative markers regarding prediction accuracy, bias, and dispersion. We used different algorithms to select them, such as recursive feature eliminations, penalized regression, and XGBoost. We compared our results with the predictions of pedigree-based BLUP, single-step genomic BLUP, and weighted single-step genomic BLUP in different simulated populations obtained by combining various parameters in terms of number of QTLs and effective population size. We also investigated these approaches on a real data set belonging to the small local Rendena breed. Our results show that the accuracy of GBLUP in small-sized populations increased when performed with SNPs selected via variable selection methods both in simulated and real data sets. 
In addition, the use of variable selection models—especially those using XGBoost—in our real data set did not impact bias and the dispersion of estimated breeding values. We have discussed possible explanations for our results and how our study can help estimate breeding values for future genomic selection in small breeds.
Affiliation(s)
- Enrico Mancin
- Department of Agronomy, Food, Natural Resources, Animals and Environment, University of Padua, Legnaro, Italy
- Lucio Flavio Macedo Mota
- Department of Agronomy, Food, Natural Resources, Animals and Environment, University of Padua, Legnaro, Italy
- Beniamino Tuliozi
- Department of Agronomy, Food, Natural Resources, Animals and Environment, University of Padua, Legnaro, Italy
- Rina Verdiglione
- Department of Agronomy, Food, Natural Resources, Animals and Environment, University of Padua, Legnaro, Italy
- Roberto Mantovani
- Department of Agronomy, Food, Natural Resources, Animals and Environment, University of Padua, Legnaro, Italy
- Cristina Sartori
- Department of Agronomy, Food, Natural Resources, Animals and Environment, University of Padua, Legnaro, Italy
|
39
|
Juliana P, He X, Poland J, Roy KK, Malaker PK, Mishra VK, Chand R, Shrestha S, Kumar U, Roy C, Gahtyari NC, Joshi AK, Singh RP, Singh PK. Genomic selection for spot blotch in bread wheat breeding panels, full-sibs and half-sibs and index-based selection for spot blotch, heading and plant height. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2022; 135:1965-1983. [PMID: 35416483 PMCID: PMC9205839 DOI: 10.1007/s00122-022-04087-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/02/2022] [Accepted: 03/17/2022] [Indexed: 06/14/2023]
Abstract
KEY MESSAGE Genomic selection is a promising tool to select for spot blotch resistance and index-based selection can simultaneously select for spot blotch resistance, heading and plant height. A major biotic stress challenging bread wheat production in regions characterized by humid and warm weather is spot blotch caused by the fungus Bipolaris sorokiniana. Since genomic selection (GS) is a promising selection tool, we evaluated its potential for spot blotch in seven breeding panels comprising 6736 advanced lines from the International Maize and Wheat Improvement Center. Our results indicated moderately high mean genomic prediction accuracies of 0.53 and 0.40 within and across breeding panels, respectively which were on average 177.6% and 60.4% higher than the mean accuracies from fixed effects models using selected spot blotch loci. Genomic prediction was also evaluated in full-sibs and half-sibs panels and sibs were predicted with the highest mean accuracy (0.63) from a composite training population with random full-sibs and half-sibs. The mean accuracies when full-sibs were predicted from other full-sibs within families and when full-sibs panels were predicted from other half-sibs panels were 0.47 and 0.44, respectively. Comparison of GS with phenotypic selection (PS) of the top 10% of resistant lines suggested that GS could be an ideal tool to discard susceptible lines, as greater than 90% of the susceptible lines discarded by PS were also discarded by GS. We have also reported the evaluation of selection indices to simultaneously select non-late and non-tall genotypes with low spot blotch phenotypic values and genomic-estimated breeding values. Overall, this study demonstrates the potential of integrating GS and index-based selection for improving spot blotch resistance in bread wheat.
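The comparison of lines discarded by phenotypic selection (PS) and by genomic selection (GS) in this abstract comes down to a set overlap. The sketch below illustrates that arithmetic under stated assumptions: lower scores mean more resistant lines, the top 10% are kept and the rest discarded, and the helper names (`discarded`, `discard_agreement`) and input dictionaries are hypothetical rather than taken from the paper.

```python
def discarded(scores, keep_fraction=0.10):
    """Return the set of lines NOT in the best `keep_fraction`.

    `scores` maps line name -> value; lower is assumed to mean more resistant.
    """
    n_keep = max(1, round(len(scores) * keep_fraction))
    ranked = sorted(scores, key=scores.get)  # best (lowest) first
    return set(ranked[n_keep:])


def discard_agreement(phenotypes, gebvs, keep_fraction=0.10):
    """Fraction of lines discarded by PS that are also discarded by GS."""
    ps_out = discarded(phenotypes, keep_fraction)
    gs_out = discarded(gebvs, keep_fraction)
    if not ps_out:
        raise ValueError("phenotypic selection discarded no lines")
    return len(ps_out & gs_out) / len(ps_out)
```

With phenotypic scores and genomic-estimated breeding values for the same lines, a returned value above 0.9 would correspond to the level of agreement reported in the abstract.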
Affiliation(s)
- Philomin Juliana
- Borlaug Institute for South Asia (BISA), Ludhiana, Punjab, India
- Xinyao He
- International Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641, 06600, Mexico, DF, Mexico
- Jesse Poland
- Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Krishna K Roy
- Bangladesh Wheat and Maize Research Institute, Nashipur, Dinajpur, 5200, Bangladesh
- Paritosh K Malaker
- Bangladesh Wheat and Maize Research Institute, Nashipur, Dinajpur, 5200, Bangladesh
- Vinod K Mishra
- Institute of Agricultural Sciences, Banaras Hindu University, Varanasi, Uttar Pradesh, India
- Ramesh Chand
- Institute of Agricultural Sciences, Banaras Hindu University, Varanasi, Uttar Pradesh, India
- Sandesh Shrestha
- Department of Plant Pathology, Wheat Genetics Resource Center, Kansas State University, Manhattan, KS, USA
- Uttam Kumar
- Borlaug Institute for South Asia (BISA), Ludhiana, Punjab, India
- Chandan Roy
- Department of Plant Breeding and Genetics, Bihar Agricultural University, Sabour, Bihar, 813210, India
- Navin C Gahtyari
- ICAR-Vivekanand Parvatiya Krishi Anushandhan Sansthan, Almora, Uttarakhand, 263601, India
- Arun K Joshi
- Borlaug Institute for South Asia (BISA), Ludhiana, Punjab, India
- CIMMYT-India, NASC Complex, DPS Marg, New Delhi, India
- Ravi P Singh
- International Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641, 06600, Mexico, DF, Mexico.
- Pawan K Singh
- International Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641, 06600, Mexico, DF, Mexico.
|
40
|
Feldmann MJ, Piepho HP, Knapp SJ. Average semivariance directly yields accurate estimates of the genomic variance in complex trait analyses. G3 GENES|GENOMES|GENETICS 2022; 12:6571389. [PMID: 35442424 PMCID: PMC9157152 DOI: 10.1093/g3journal/jkac080] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/27/2021] [Accepted: 03/17/2022] [Indexed: 11/23/2022]
Abstract
Many important traits in plants, animals, and microbes are polygenic and challenging to improve through traditional marker-assisted selection. Genomic prediction addresses this by incorporating all genetic data in a mixed model framework. The primary method for predicting breeding values is genomic best linear unbiased prediction, which uses the realized genomic relationship or kinship matrix (K) to connect genotype to phenotype. Genomic relationship matrices share information among entries to estimate the observed entries’ genetic values and predict unobserved entries’ genetic values. One of the main parameters of such models is genomic variance ($\sigma_g^2$), or the variance of a trait associated with a genome-wide sample of DNA polymorphisms, and genomic heritability ($h_g^2$); however, the seminal papers introducing different forms of K often do not discuss their effects on the model estimated variance components despite their importance in genetic research and breeding. Here, we discuss the effect of several standard methods for calculating the genomic relationship matrix on estimates of $\sigma_g^2$ and $h_g^2$. With current approaches, we found that the genomic variance tends to be either overestimated or underestimated depending on the scaling and centering applied to the marker matrix (Z), the value of the average diagonal element of K, and the assortment of alleles and heterozygosity (H) in the observed population. Using the average semivariance, we propose a new matrix, $K_{ASV}$, that directly yields accurate estimates of $\sigma_g^2$ and $h_g^2$ in the observed population and produces best linear unbiased predictors equivalent to routine methods in plants and animals.
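To make the scaling idea in this abstract concrete, the sketch below builds a kinship matrix from a marker matrix and rescales it so that its average semivariance equals one. It is an illustration under assumptions rather than the paper's definition: markers are taken to be dosages coded 0/1/2, Z is centered on column means, and the average semivariance of a matrix K among n entries is computed as (tr(K) - 1'K1/n) / (n - 1). The function names are hypothetical, and the exact form of the published matrix may differ in detail.

```python
def center_columns(markers):
    """Subtract each marker's mean dosage (column mean) from every entry."""
    n = len(markers)
    m = len(markers[0])
    means = [sum(row[j] for row in markers) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in markers]


def cross_product(z):
    """Return Z Z^T as a list of lists (entries by entries)."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in z] for ri in z]


def average_semivariance(k):
    """Average semivariance of K: (trace(K) - sum(K) / n) / (n - 1)."""
    n = len(k)
    trace = sum(k[i][i] for i in range(n))
    total = sum(sum(row) for row in k)
    return (trace - total / n) / (n - 1)


def asv_scaled_kinship(markers):
    """Kinship matrix rescaled so that its average semivariance is 1.

    Assumption: this scaling mirrors the idea described in the abstract;
    it is a sketch, not a verified reimplementation of the paper's matrix.
    """
    zzt = cross_product(center_columns(markers))
    scale = average_semivariance(zzt)
    if scale == 0:
        raise ValueError("markers show no variation among entries")
    return [[value / scale for value in row] for row in zzt]
```

Because the kernel is scaled to unit average semivariance, the variance component fitted for it in a mixed model can be read on the scale of the observed population, which is the property the abstract highlights.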
Affiliation(s)
- Mitchell J Feldmann
- Department of Plant Sciences, University of California, Davis, CA 95616, USA
- Hans-Peter Piepho
- Biostatistics Unit, Institute of Crop Science, University of Hohenheim, 70593 Stuttgart, Germany
- Steven J Knapp
- Department of Plant Sciences, University of California, Davis, CA 95616, USA
|
41
|
Chasing genetic correlation breakers to stimulate population resilience to climate change. Sci Rep 2022; 12:8238. [PMID: 35581288 PMCID: PMC9114142 DOI: 10.1038/s41598-022-12320-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2021] [Accepted: 05/09/2022] [Indexed: 11/29/2022]
Abstract
Global climate change introduces new combinations of environmental conditions, which is expected to increase stress on plants. This could affect many traits in multiple ways that are as yet unknown but will likely require the modification of existing genetic relationships among functional traits potentially involved in local adaptation. Theoretical evolutionary studies have determined that it is an advantage to have an excess of recombination events under heterogeneous environmental conditions. Our study, conducted on a population of radiata pine (Pinus radiata D. Don), was able to identify individuals that show high genetic recombination at genomic regions, which potentially include pleiotropic or collocating QTLs responsible for the studied traits, reaching a prediction accuracy of 0.80 in random cross-validation and 0.72 when a whole family was removed from the training population and predicted. To identify these highly recombined individuals, a training population was constructed from correlation breakers, created through tandem selection of parents in the previous generation and their consequent mating. Although the correlation breakers showed lower observed heterogeneity, possibly due to direct selection in both studied traits, the genomic regions with statistically significant differences in the linkage disequilibrium pattern showed a higher level of heterozygosity, which has the effect of decomposing unfavourable genetic correlation. We propose undertaking selection of correlation breakers under current environmental conditions and using genomic predictions to increase the frequency of these 'recombined' individuals in future plantations, ensuring the resilience of planted forests to changing climates. The increased frequency of such individuals will decrease the strength of the population-level genetic correlations among traits, increasing the opportunity for new trait combinations to be developed in the future.
|
42
|
Massender E, Brito LF, Maignel L, Oliveira HR, Jafarikia M, Baes CF, Sullivan B, Schenkel FS. Single- and multiple-breed genomic evaluations for conformation traits in Canadian Alpine and Saanen dairy goats. J Dairy Sci 2022; 105:5985-6000. [DOI: 10.3168/jds.2021-21713] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/15/2021] [Accepted: 03/10/2022] [Indexed: 11/19/2022]
|
43
|
Building a Calibration Set for Genomic Prediction, Characteristics to Be Considered, and Optimization Approaches. Methods in Molecular Biology (Clifton, N.J.) 2022; 2467:77-112. [PMID: 35451773 DOI: 10.1007/978-1-0716-2205-6_3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 10/25/2022]
Abstract
The efficiency of genomic selection strongly depends on the prediction accuracy of the genetic merit of candidates. Numerous papers have shown that the composition of the calibration set is a key contributor to prediction accuracy. A poorly defined calibration set can result in low accuracies, whereas an optimized one can considerably increase accuracy compared to random sampling, for the same size. Alternatively, optimizing the calibration set can be a way of decreasing the costs of phenotyping by enabling similar levels of accuracy compared to random sampling but with fewer phenotypic units. We present here the different factors that have to be considered when designing a calibration set, and review the different criteria proposed in the literature. We classified these criteria into two groups: model-free criteria based on relatedness, and criteria derived from the linear mixed model. We introduce criteria targeting specific prediction objectives including the prediction of highly diverse panels, biparental families, or hybrids. We also review different ways of updating the calibration set, and different procedures for optimizing phenotyping experimental designs.
|
44
|
Chen CJ, Garrick D, Fernando R, Karaman E, Stricker C, Keehan M, Cheng H. XSim version 2: simulation of modern breeding programs. G3 Genes|Genomes|Genetics 2022; 12:6542309. [PMID: 35244161 PMCID: PMC8982375 DOI: 10.1093/g3journal/jkac032] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/28/2021] [Accepted: 01/06/2022] [Indexed: 11/25/2022]
Abstract
Simulation can be an efficient approach to design, evaluate, and optimize breeding programs. In the era of modern agriculture, breeding programs can benefit from a simulator that integrates various sources of big data and accommodates state-of-the-art statistical models. The initial release of XSim, in which stochastic descendants can be efficiently simulated with a drop-down strategy, has mainly been used to validate genomic selection results. In this article, we present XSim Version 2, an open-source tool that has been extensively redesigned with additional features to meet the needs of modern breeding programs. It seamlessly incorporates multiple statistical models for genetic evaluations, such as GBLUP, Bayesian alphabets, and neural networks, and it can effortlessly simulate successive generations of descendants based on complex mating schemes with the aid of its modular design. Case studies are presented to demonstrate the flexibility of XSim Version 2 in simulating crossbreeding in animal and plant populations. Modern biotechnology, including double haploids and embryo transfer, can all be simultaneously integrated into the mating plans that drive the simulation. From a computing perspective, XSim Version 2 is implemented in Julia, which is a computer language that retains the readability of scripting languages (e.g. R and Python) without sacrificing much computational speed compared to compiled languages (e.g. C). This makes XSim Version 2 a simulation tool that is relatively easy for both champions and community members to maintain, modify, or extend in order to improve their breeding programs. Functions and operators are overloaded for a better user interface so they may concatenate, subset, summarize, and organize simulated populations at each breeding step. With the strong and foreseeable demands in the community, XSim Version 2 will serve as a modern simulator bridging the gaps between theories and experiments with its flexibility, extensibility, and friendly interface.
Affiliation(s)
- Chunpeng James Chen
- Department of Animal Science, University of California, Davis, CA 95616, USA
- Rohan Fernando
- Department of Animal Science, Iowa State University, Ames, IA 50010, USA
- Emre Karaman
- Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus 8830, Denmark
- Chris Stricker
- agn Genetics GmbH, Davos-Dorf, Graubünden 7260, Switzerland
- Hao Cheng
- Department of Animal Science, University of California, Davis, CA 95616, USA
|
45
|
Mahmood Z, Ali M, Mirza JI, Fayyaz M, Majeed K, Naeem MK, Aziz A, Trethowan R, Ogbonnaya FC, Poland J, Quraishi UM, Hickey LT, Rasheed A, He Z. Genome-Wide Association and Genomic Prediction for Stripe Rust Resistance in Synthetic-Derived Wheats. Frontiers in Plant Science 2022; 13:788593. [PMID: 35283883 PMCID: PMC8908430 DOI: 10.3389/fpls.2022.788593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2021] [Accepted: 01/07/2022] [Indexed: 06/14/2023]
Abstract
Stripe rust caused by Puccinia striiformis (Pst) is an economically important disease attacking wheat all over the world. Identifying and deploying new genes for Pst resistance is an economical and long-term strategy for controlling Pst. A genome-wide association study (GWAS) using single nucleotide polymorphisms (SNPs) and functional haplotypes was used to identify loci associated with stripe rust resistance in synthetic-derived (SYN-DER) wheats in four environments. In total, 92 quantitative trait nucleotides (QTNs) distributed over 65 different loci were associated with resistance to Pst at seedling and adult plant stages. Nine additional loci were discovered by the linkage disequilibrium-based haplotype-GWAS approach. The durable rust-resistant gene Lr34/Yr18 provided resistance in all four environments, and against all the five Pst races used in this study. The analysis identified several SYN-DER accessions that carried major genes: either Yr24/Yr26 or Yr32. New loci were also identified on chr2B, chr5B, and chr7D, and 14 QTNs and three haplotypes identified on the D-genome possibly carry new alleles of the known genes contributed by the Ae. tauschii founders. We also evaluated eleven different models for genomic prediction of Pst resistance, and a prediction accuracy of up to 0.85 was achieved for adult plant resistance; however, genomic prediction accuracy for seedling resistance remained very low. A meta-analysis based on a large number of existing GWAS would enhance the identification of new genes and loci for stripe rust resistance in wheat. The genetic framework elucidated here for stripe rust resistance in SYN-DER identified the novel loci for resistance to Pst assembled in adapted genetic backgrounds.
Affiliation(s)
- Zahid Mahmood
- Department of Plant Sciences, Quaid-i-Azam University, Islamabad, Pakistan
- Crop Sciences Institute, National Agricultural Research Centre (NARC), Islamabad, Pakistan
- Mohsin Ali
- Institute of Crop Sciences, CIMMYT-China office, Chinese Academy of Agricultural Sciences (CAAS), Beijing, China
- Khawar Majeed
- Department of Plant Sciences, Quaid-i-Azam University, Islamabad, Pakistan
- Muhammad Kashif Naeem
- National Institute for Genomics and Advanced Biotechnology (NIGAB), National Agriculture Research Center (NARC), Islamabad, Pakistan
- Abdul Aziz
- Department of Plant Sciences, Quaid-i-Azam University, Islamabad, Pakistan
- Richard Trethowan
- Plant Breeding Institute, School of Life and Environmental Sciences, The University of Sydney, Sydney, NSW, Australia
- Jesse Poland
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
- Lee Thomas Hickey
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Saint Lucia, QLD, Australia
- Awais Rasheed
- Department of Plant Sciences, Quaid-i-Azam University, Islamabad, Pakistan
- Institute of Crop Sciences, CIMMYT-China office, Chinese Academy of Agricultural Sciences (CAAS), Beijing, China
- Zhonghu He
- Institute of Crop Sciences, CIMMYT-China office, Chinese Academy of Agricultural Sciences (CAAS), Beijing, China
|
46
|
Stejskal J, Klápště J, Čepl J, El-Kassaby YA, Lstibůrek M. Effect of clonal testing on the efficiency of genomic evaluation in forest tree breeding. Sci Rep 2022; 12:3033. [PMID: 35194102 PMCID: PMC8864020 DOI: 10.1038/s41598-022-06952-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2021] [Accepted: 02/02/2022] [Indexed: 11/09/2022]
Abstract
Through stochastic simulations, accuracies of breeding values and response to selection were assessed under traditional pedigree-based (BLUP) and genomic-based (GBLUP) evaluation methods in forest tree breeding. The latter provides a methodological foundation for genomic selection. We evaluated the impact of clonal replication in progeny testing on the response to selection realized in seed orchards under variable marker density and target effective population sizes. We found that clonal replication in progeny trials boosted selection accuracy, thus providing additional genetic gains under BLUP. While a similar trend was observed for GBLUP, the added gains did not surpass those under BLUP. Therefore, breeding programs deploying extensive progeny testing with clonal propagation might not benefit from the deployment of genomic information. These findings could be helpful in the context of operational breeding programs.
Affiliation(s)
- J Stejskal
- Faculty of Forestry and Wood Sciences, Czech University of Life Sciences, Kamýcká 1176, 165 21, Praha, Czech Republic.
- J Klápště
- Scion (New Zealand Forest Research Institute Ltd.), 49 Sala Street, Whakarewarewa, 3010, Rotorua, New Zealand
- J Čepl
- Faculty of Forestry and Wood Sciences, Czech University of Life Sciences, Kamýcká 1176, 165 21, Praha, Czech Republic
- Y A El-Kassaby
- Department of Forest and Conservation Sciences, Faculty of Forestry, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- M Lstibůrek
- Faculty of Forestry and Wood Sciences, Czech University of Life Sciences, Kamýcká 1176, 165 21, Praha, Czech Republic
|
47
|
Improving lodgepole pine genomic evaluation using spatial correlation structure and SNP selection with single-step GBLUP. Heredity (Edinb) 2022; 128:209-224. [PMID: 35181761 PMCID: PMC8986842 DOI: 10.1038/s41437-022-00508-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/17/2021] [Revised: 01/27/2022] [Accepted: 01/28/2022] [Indexed: 01/20/2023]
Abstract
Modeling environmental spatial heterogeneity can improve the efficiency of forest tree genomic evaluation. Furthermore, genotyping costs can be lowered by reducing the number of markers needed. We investigated the impact on variance components, breeding value accuracy, and bias of two phenotypic data adjustments (experimental design and autoregressive spatial models), and a relationship matrix calculated from a subset of markers selected for their ability to infer ancestry. Using a multiple-trait multiple-site single-step Genomic Best Linear Unbiased Prediction (ssGBLUP) approach, four scenarios (2 phenotype adjustments × 2 marker sets) were applied to diameter at breast height (DBH), height (HT), and resistance to western gall rust (WGR) in four open-pollinated progeny trials of lodgepole pine, with 1490 (out of 11,188) trees genotyped with 25,099 SNPs. As a control, we fitted the conventional ABLUP model using pedigree information. The highest heritability estimates were achieved for the ABLUP followed closely by the ssGBLUP with the full marker set and using the spatial phenotype adjustments. The highest predictive ability was obtained by using a reduced marker subset (8000 SNPs) when either the spatial (DBH: 0.429, and WGR: 0.513) or design (HT: 0.467) phenotype corrections were used. No significant difference was detected in prediction bias among the six fitted models, and all values were close to 1 (0.918-1.014). Results demonstrated that selecting informative markers, such as those capturing ancestry, can improve the predictive ability. The use of spatial correlation structure increased traits' heritability and reduced prediction bias, while increases in predictive ability were trait-dependent.
|
48
|
Abstract
Traditional tree improvement is cumbersome and costly. Our main objective was to assess the extent to which genomic data can currently accelerate and improve decision making in this field. We used diameter at breast height (DBH) and wood density (WD) data for 4430 tree genotypes and single-nucleotide polymorphism (SNP) data for 2446 tree genotypes. Pedigree reconstruction was performed using a combination of maximum likelihood parentage assignment and matching based on identity-by-state (IBS) similarity. In addition, we used best linear unbiased prediction (BLUP) methods to predict phenotypes using SNP markers (GBLUP), recorded pedigree information (ABLUP), and single-step “blended” BLUP (HBLUP) combining SNP and pedigree information. We substantially improved the accuracy of pedigree records, resolving the inconsistent parental information of 506 tree genotypes. This led to substantially increased predictive ability (i.e., by up to 87%) in HBLUP analyses compared to a baseline from ABLUP. Genomic prediction was possible across populations and within previously untested families with moderately large training populations (N = 800–1200 tree genotypes) and using as few as 2000–5000 SNP markers. HBLUP was generally more effective than traditional ABLUP approaches, particularly after dealing appropriately with pedigree uncertainties. Our study provides evidence that genome-wide marker data can significantly enhance tree improvement. The operational implementation of genomic selection has started in radiata pine breeding in New Zealand, but further reductions in DNA extraction and genotyping costs may be required to realise the full potential of this approach.
|
49
|
Budhlakoti N, Kushwaha AK, Rai A, Chaturvedi KK, Kumar A, Pradhan AK, Kumar U, Kumar RR, Juliana P, Mishra DC, Kumar S. Genomic Selection: A Tool for Accelerating the Efficiency of Molecular Breeding for Development of Climate-Resilient Crops. Front Genet 2022; 13:832153. [PMID: 35222548 PMCID: PMC8864149 DOI: 10.3389/fgene.2022.832153] [Citation(s) in RCA: 39] [Impact Index Per Article: 19.5] [Received: 12/09/2021] [Accepted: 01/10/2022] [Indexed: 12/17/2022]
Abstract
Since the inception of the theory and conceptual framework of genomic selection (GS), extensive research has been done on evaluating its efficiency for utilization in crop improvement. Although marker-assisted selection has proven its potential for improving qualitative traits controlled by one to a few genes with large effects, its role in improving quantitative traits controlled by several genes with small effects is limited. In this regard, GS, which utilizes genomic-estimated breeding values of individuals obtained from genome-wide markers to choose candidates for the next breeding cycle, is a powerful approach to improve quantitative traits. In the last two decades, GS has been widely adopted in animal breeding programs globally because of its potential to improve selection accuracy, minimize phenotyping, reduce cycle time, and increase genetic gains. In addition, given the promising initial evaluation outcomes of GS for the improvement of yield, biotic and abiotic stress tolerance, and quality in cereal crops like wheat, maize, and rice, prospects of integrating it in breeding crops are also being explored. Improved statistical models that leverage the genomic information to increase the prediction accuracies are critical for the effectiveness of GS-enabled breeding programs. Study on genetic architecture under drought and heat stress helps in developing production markers that can significantly accelerate the development of stress-resilient crop varieties through GS. This review focuses on the transition from traditional selection methods to GS, underlying statistical methods and tools used for this purpose, current status of GS studies in crop plants, and perspectives for its successful implementation in the development of climate-resilient crops.
Affiliation(s)
- Neeraj Budhlakoti
- ICAR- Indian Agricultural Statistics Research Institute, New Delhi, India
- Anil Rai
- ICAR- Indian Agricultural Statistics Research Institute, New Delhi, India
- K K Chaturvedi
- ICAR- Indian Agricultural Statistics Research Institute, New Delhi, India
- Anuj Kumar
- ICAR- Indian Agricultural Statistics Research Institute, New Delhi, India
- Uttam Kumar
- Borlaug Institute for South Asia (BISA), Ludhiana, India
- D C Mishra
- ICAR- Indian Agricultural Statistics Research Institute, New Delhi, India
- Sundeep Kumar
- ICAR- National Bureau of Plant Genetic Resources, New Delhi, India
|
50
|
Juliana P, He X, Marza F, Islam R, Anwar B, Poland J, Shrestha S, Singh GP, Chawade A, Joshi AK, Singh RP, Singh PK. Genomic Selection for Wheat Blast in a Diversity Panel, Breeding Panel and Full-Sibs Panel. Frontiers in Plant Science 2022; 12:745379. [PMID: 35069614 PMCID: PMC8782147 DOI: 10.3389/fpls.2021.745379] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/22/2021] [Accepted: 11/09/2021] [Indexed: 06/14/2023]
Abstract
Wheat blast is an emerging threat to wheat production, due to its recent migration to South Asia and Sub-Saharan Africa. Because genomic selection (GS) has emerged as a promising breeding strategy, the key objective of this study was to evaluate it for wheat blast phenotyped at precision phenotyping platforms in Quirusillas (Bolivia), Okinawa (Bolivia) and Jashore (Bangladesh) using three panels: (i) a diversity panel comprising 172 diverse spring wheat genotypes, (ii) a breeding panel comprising 248 elite breeding lines, and (iii) a full-sibs panel comprising 298 full-sibs. We evaluated two genomic prediction models (the genomic best linear unbiased prediction or GBLUP model and the Bayes B model) and compared the genomic prediction accuracies with accuracies from a fixed effects model (with selected blast-associated markers as fixed effects), a GBLUP + fixed effects model and a pedigree relationships-based model (ABLUP). On average, across all the panels and environments analyzed, the GBLUP + fixed effects model (0.63 ± 0.13) and the fixed effects model (0.62 ± 0.13) gave the highest prediction accuracies, followed by the Bayes B (0.59 ± 0.11), GBLUP (0.55 ± 0.1), and ABLUP (0.48 ± 0.06) models. The high prediction accuracies from the fixed effects model resulted from the markers tagging the 2NS translocation that had a large effect on blast in all the panels. This implies that in environments where the 2NS translocation-based blast resistance is effective, genotyping one to a few markers tagging the translocation is sufficient to predict the blast response and genome-wide markers may not be needed. We also observed that marker-assisted selection (MAS) based on a few blast-associated markers outperformed GS as it selected the highest mean percentage (88.5%) of lines also selected by phenotypic selection and discarded the highest mean percentage of lines (91.8%) also discarded by phenotypic selection, across all panels. In conclusion, while this study demonstrates that MAS might be a powerful strategy to select for the 2NS translocation-based blast resistance, we emphasize that further efforts to use genomic tools to identify non-2NS translocation-based blast resistance are critical.
Affiliation(s)
- Xinyao He
- International Maize and Wheat Improvement Center (CIMMYT), Mexico, Mexico
- Felix Marza
- Instituto Nacional de Innovación Agropecuaria y Forestal (INIAF), La Paz, Bolivia
- Rabiul Islam
- Bangladesh Wheat and Maize Research Institute (BWMRI), Dinajpur, Bangladesh
- Babul Anwar
- Bangladesh Wheat and Maize Research Institute (BWMRI), Dinajpur, Bangladesh
- Jesse Poland
- Department of Plant Pathology, Wheat Genetics Resource Center, Kansas State University, Manhattan, KS, United States
- Sandesh Shrestha
- Department of Plant Pathology, Wheat Genetics Resource Center, Kansas State University, Manhattan, KS, United States
- Gyanendra P. Singh
- Indian Council of Agricultural Research (ICAR)-Indian Institute of Wheat and Barley Research, Karnal, India
- Aakash Chawade
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
- Arun K. Joshi
- Borlaug Institute for South Asia (BISA), Ludhiana, India
- CIMMYT-India, New Delhi, India
- Ravi P. Singh
- International Maize and Wheat Improvement Center (CIMMYT), Mexico, Mexico
- Pawan K. Singh
- International Maize and Wheat Improvement Center (CIMMYT), Mexico, Mexico
|