1
Crossa J, Montesinos-Lopez OA, Costa-Neto G, Vitale P, Martini JWR, Runcie D, Fritsche-Neto R, Montesinos-Lopez A, Pérez-Rodríguez P, Gerard G, Dreisigacker S, Crespo-Herrera L, Pierre CS, Lillemo M, Cuevas J, Bentley A, Ortiz R. Machine learning algorithms translate big data into predictive breeding accuracy. TRENDS IN PLANT SCIENCE 2024:S1360-1385(24)00259-0. [PMID: 39462718 DOI: 10.1016/j.tplants.2024.09.011] [Received: 02/02/2024] [Revised: 08/23/2024] [Accepted: 09/23/2024] [Indexed: 10/29/2024]
Abstract
Statistical machine learning (ML) extracts patterns from extensive genomic, phenotypic, and environmental data. ML algorithms automatically identify relevant features and use cross-validation to ensure robust models and improve prediction reliability in new lines. Furthermore, ML analyses of genotype-by-environment (G×E) interactions can offer insights into the genetic factors that affect performance in specific environments. By leveraging historical breeding data, ML streamlines strategies and automates analyses to reveal genomic patterns. In this review we examine the transformative impact of big data, including multi-trait genomics, phenomics, and environmental covariables, on genomic-enabled prediction in plant breeding. We discuss how big data and ML are revolutionizing the field by enhancing prediction accuracy, deepening our understanding of G×E interactions, and optimizing breeding strategies through the analysis of extensive and diverse datasets.
Affiliation(s)
- José Crossa
- Louisiana State University, College of Agriculture, Baton Rouge, LA, USA; Colegio de Postgraduados, Montecillos, CP 56230, Estado de México, Mexico; International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km 45, El Batán, Texcoco, CP 56237, Estado de México, Mexico; Department of Statistics and Operations Research and Distinguished Scientist Fellowship Program, King Saud University, Riyadh 11451, Saudi Arabia
- Paolo Vitale
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km 45, El Batán, Texcoco, CP 56237, Estado de México, Mexico
- Daniel Runcie
- Department of Plant Sciences, University of California Davis, Davis, CA, USA
- Abelardo Montesinos-Lopez
- Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, 44430 Guadalajara, Jalisco, Mexico
- Guillermo Gerard
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km 45, El Batán, Texcoco, CP 56237, Estado de México, Mexico
- Susanna Dreisigacker
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km 45, El Batán, Texcoco, CP 56237, Estado de México, Mexico
- Leonardo Crespo-Herrera
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km 45, El Batán, Texcoco, CP 56237, Estado de México, Mexico
- Carolina Saint Pierre
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km 45, El Batán, Texcoco, CP 56237, Estado de México, Mexico
- Morten Lillemo
- Norwegian University of Life Science (NMBU), Department of Plant Science, Ås, Norway
- Jaime Cuevas
- Universidad de Quintana Roo, Chetumal, Quintana Roo, 77019, Mexico
- Alison Bentley
- Australian National University, Research School of Biology, Canberra, NSW, Australia.
- Rodomiro Ortiz
- Department of Plant Breeding, Swedish University of Agricultural Sciences (SLU), PO Box 190 Sundsvagen 10, SE 23422 Lomma, Sweden.
2
Jighly A. Boosting genome-wide association power and genomic prediction accuracy for date palm fruit traits with advanced statistics. PLANT SCIENCE : AN INTERNATIONAL JOURNAL OF EXPERIMENTAL PLANT BIOLOGY 2024; 344:112110. [PMID: 38704095 DOI: 10.1016/j.plantsci.2024.112110] [Received: 12/28/2023] [Revised: 03/05/2024] [Accepted: 04/30/2024] [Indexed: 05/06/2024]
Abstract
The date palm is economically vital in the Middle East and North Africa, providing essential fibres, vitamins, and carbohydrates. Understanding the genetic architecture of its traits remains complex due to the tree's perennial nature and long generation times. This study aims to address these complexities by employing advanced genome-wide association study (GWAS) and genomic prediction models using previously published data involving fruit acid content, sugar content, dimension, and colour traits. The multivariate GWAS model identified seven QTL, including five novel associations, that shed light on the genetic control of these traits. Furthermore, the research evaluates different genomic prediction models that considered genotype by environment and genotype by trait interactions. While colour traits demonstrate strong predictive power, other traits display moderate accuracies across different models and scenarios, in line with expectations when using small reference populations. When designing the cross-validation to predict new individuals, the accuracy of the best multi-trait model was significantly higher than all single-trait models for dimension traits, but not for the remaining traits, which showed similar performances. However, the cross-validation strategy that masked random phenotypic records (i.e., mimicking the unbalanced phenotypic records) showed significantly higher accuracy for all traits except acid contents. The findings underscore the importance of understanding genetic architecture for informed breeding strategies. The research emphasises the need for larger population sizes and multivariate models to enhance gene tagging power and predictive accuracy to advance date palm breeding programs. These findings support more targeted breeding in date palm, improving productivity and resilience to various environments.
3
Dreisigacker S, Martini JWR, Cuevas J, Pérez-Rodríguez P, Lozano-Ramírez N, Huerta J, Singh P, Crespo-Herrera L, Bentley AR, Crossa J. Genomic prediction of synthetic hexaploid wheat upon tetraploid durum and diploid Aegilops parental pools. THE PLANT GENOME 2024; 17:e20464. [PMID: 38764312 DOI: 10.1002/tpg2.20464] [Received: 11/30/2023] [Revised: 04/04/2024] [Accepted: 04/09/2024] [Indexed: 05/21/2024]
Abstract
Bread wheat (Triticum aestivum L.) is a globally important food crop, which was domesticated about 8,000-10,000 years ago. Bread wheat is an allopolyploid, and it evolved from two hybridization events of three species. To widen the genetic base in breeding, bread wheat has been re-synthesized by crossing durum wheat (Triticum turgidum ssp. durum) and goat grass (Aegilops tauschii Coss), leading to so-called synthetic hexaploid wheat (SHW). We applied the quantitative genetics tools of "hybrid prediction", originally developed for the prediction of wheat hybrids generated from different heterotic groups, to a situation of allopolyploidization. Our use-case predicts the phenotypes of SHW for three quantitatively inherited global wheat diseases, namely tan spot (TS), septoria nodorum blotch (SNB), and spot blotch (SB). Our results revealed prediction abilities comparable to studies in 'traditional' elite or hybrid wheat. Prediction abilities were highest using a marker model and performing random cross-validation, predicting the performance of untested SHW (0.483 for SB to 0.730 for TS). When testing parents not necessarily used in SHW, combination prediction abilities were slightly lower (0.378 for SB to 0.718 for TS), yet still promising. Despite the limited phenotypic data, our results provide a general example for predictive models targeting an allopolyploidization event and a method that can guide the use of genetic resources available in gene banks.
Affiliation(s)
- Jaime Cuevas
- Universidad Autónoma del Estado de Quintana Roo, Chetumal, México
- Julio Huerta
- International Maize and Wheat Improvement Center (CIMMYT), Texcoco, México
- Pawan Singh
- International Maize and Wheat Improvement Center (CIMMYT), Texcoco, México
- Alison R Bentley
- Australian National University, Research School of Biology, Canberra, Australia
- Jose Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Texcoco, México
- Colegio de Postgraduados, Campus Montecillos, Texcoco, México
4
Peixoto MA, Leach KA, Jarquin D, Flannery P, Zystro J, Tracy WF, Bhering L, Resende MFR. Utilizing genomic prediction to boost hybrid performance in a sweet corn breeding program. FRONTIERS IN PLANT SCIENCE 2024; 15:1293307. [PMID: 38726298 PMCID: PMC11080654 DOI: 10.3389/fpls.2024.1293307] [Received: 09/12/2023] [Accepted: 03/26/2024] [Indexed: 05/12/2024]
Abstract
Sweet corn breeding programs, like field corn, focus on the development of elite inbred lines to produce commercial hybrids. For this reason, genomic selection models can help the in silico prediction of hybrid crosses from the elite lines, which is hypothesized to improve the test cross scheme, leading to higher genetic gain in a breeding program. This study aimed to explore the potential of implementing genomic selection in a sweet corn breeding program through hybrid prediction in a within-site across-year and across-site framework. A total of 506 hybrids were evaluated in six environments (California, Florida, and Wisconsin, in the years 2020 and 2021). A total of 20 traits from three different groups were measured (plant-, ear-, and flavor-related traits) across the six environments. Eight statistical models were considered for prediction, as the combination of two genomic prediction models (GBLUP and RKHS) with two different kernels (additive and additive + dominance), and in a single- and multi-trait framework. Also, three different cross-validation schemes were tested (CV1, CV0, and CV00). The different models were then compared based on the correlation between the estimated breeding values/total genetic values and phenotypic measurements. Overall, heritabilities and correlations varied among the traits. The models implemented showed good accuracies for trait prediction. The GBLUP implementation outperformed RKHS in all cross-validation schemes and models. Models with additive plus dominance kernels presented a slight improvement over the models with only additive kernels for some of the models examined. In addition, models for within-site across-year and across-site performed better in the CV0 than the CV00 scheme, on average. Hence, GBLUP should be considered as a standard model for sweet corn hybrid prediction. 
In addition, we found that the implementation of genomic prediction in a sweet corn breeding program presented reliable results, which can improve the testcross stage by identifying the top candidates that will reach advanced field-testing stages.
Affiliation(s)
- Marco Antônio Peixoto
- Laboratório de Biometria, Universidade Federal de Viçosa, Viçosa, Minas Gerais, Brazil
- Department of Horticultural Sciences, University of Florida, Gainesville, FL, United States
- Kristen A. Leach
- Department of Horticultural Sciences, University of Florida, Gainesville, FL, United States
- Diego Jarquin
- Department of Agronomy, University of Florida, Gainesville, FL, United States
- Patrick Flannery
- Department of Plant and Agroecosystem Sciences, University of Wisconsin-Madison, Madison, WI, United States
- Jared Zystro
- Organic Seed Alliance, Port Townsend, WA, United States
- William F. Tracy
- Department of Plant and Agroecosystem Sciences, University of Wisconsin-Madison, Madison, WI, United States
- Leonardo Bhering
- Laboratório de Biometria, Universidade Federal de Viçosa, Viçosa, Minas Gerais, Brazil
- Márcio F. R. Resende
- Department of Horticultural Sciences, University of Florida, Gainesville, FL, United States
5
Hamazaki K, Iwata H. AI-assisted selection of mating pairs through simulation-based optimized progeny allocation strategies in plant breeding. FRONTIERS IN PLANT SCIENCE 2024; 15:1361894. [PMID: 38817943 PMCID: PMC11138345 DOI: 10.3389/fpls.2024.1361894] [Received: 12/27/2023] [Accepted: 03/06/2024] [Indexed: 06/01/2024]
Abstract
Emerging technologies such as genomic selection have been applied to modern plant and animal breeding to increase the speed and efficiency of variety release. However, breeding requires decisions regarding parent selection and mating pairs, which significantly impact the ultimate genetic gain of a breeding scheme. The selection of appropriate parents and mating pairs to increase genetic gain while maintaining genetic diversity is still an urgent need that breeders are facing. This study aimed to determine the best progeny allocation strategies by combining future-oriented simulations and numerical black-box optimization for an improved selection of parents and mating pairs. In this study, we focused on optimizing the allocation of progenies, and the breeding process was regarded as a black-box function whose input is a set of parameters related to the progeny allocation strategies and whose output is the ultimate genetic gain of breeding schemes. The allocation of progenies to each mating pair was parameterized according to a softmax function, whose input is a weighted sum of multiple features for the allocation, including expected genetic variance of progenies and selection criteria such as different types of breeding values, to balance genetic gains and genetic diversity optimally. The weighting parameters were then optimized by the black-box optimization algorithm called StoSOO via future-oriented breeding simulations. Simulation studies to evaluate the potential of our novel method revealed that the breeding strategy based on optimized weights attained almost 10% higher genetic gain than that with an equal allocation of progenies to all mating pairs within just four generations. Among the optimized strategies, those considering the expected genetic variance of progenies could maintain the genetic diversity throughout the breeding process, leading to a higher ultimate genetic gain than those without considering it. 
These results suggest that our novel method can significantly improve the speed and efficiency of variety development through optimized decisions regarding the selection of parents and mating pairs. In addition, by changing simulation settings, our future-oriented optimization framework for progeny allocation strategies can be easily implemented into general breeding schemes, contributing to accelerated plant and animal breeding with high efficiency.
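The softmax-based allocation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `allocate_progeny`, the feature layout, and the largest-remainder rounding are assumptions made for illustration, and the StoSOO optimization of the weights is not shown.

```python
import math

def allocate_progeny(features, weights, total_progeny):
    """Split total_progeny across mating pairs with a softmax over weighted feature sums.

    features: one list of feature values per mating pair (hypothetical layout,
              e.g. [breeding value of the pair, expected progeny variance]).
    weights:  one weight per feature; in the abstract these are the quantities
              tuned by black-box optimization (the tuning is not shown here).
    """
    # Score of each pair is the weighted sum of its features.
    scores = [sum(w * f for w, f in zip(weights, row)) for row in features]
    # Numerically stable softmax: subtract the maximum score before exponentiating.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    shares = [e / norm for e in exps]
    # Largest-remainder rounding (an assumption) so that counts sum to total_progeny.
    raw = [p * total_progeny for p in shares]
    counts = [int(math.floor(r)) for r in raw]
    leftover = total_progeny - sum(counts)
    by_remainder = sorted(range(len(raw)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts

# Hypothetical example: three mating pairs, two features each.
pair_features = [[1.0, 0.2], [0.5, 0.9], [0.1, 0.1]]
allocation = allocate_progeny(pair_features, [1.0, 0.5], 100)
```

With all weights set to zero the softmax is uniform, which reproduces the equal-allocation baseline that the abstract compares against.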
Affiliation(s)
- Hiroyoshi Iwata
- Laboratory of Biometry and Bioinformatics, Department of Agricultural and Environmental Biology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan
6
Yadav S, Ross EM, Wei X, Liu S, Nguyen LT, Powell O, Hickey LT, Deomano E, Atkin F, Voss-Fels KP, Hayes BJ. Use of continuous genotypes for genomic prediction in sugarcane. THE PLANT GENOME 2024; 17:e20417. [PMID: 38066702 DOI: 10.1002/tpg2.20417] [Received: 06/27/2023] [Revised: 10/30/2023] [Accepted: 11/14/2023] [Indexed: 03/22/2024]
Abstract
Genomic selection in sugarcane faces challenges due to limited genomic tools and high genomic complexity, particularly because of its high and variable ploidy. The classification of genotypes for single nucleotide polymorphisms (SNPs) becomes difficult due to the wide range of possible allele dosages. Previous genomic studies in sugarcane used pseudo-diploid genotyping, grouping all heterozygotes into a single class. In this study, we investigate the use of continuous genotypes as a proxy for allele-dosage in genomic prediction models. The hypothesis is that continuous genotypes could better reflect allele dosage at SNPs linked to mutations affecting target traits, resulting in phenotypic variation. The dataset included genotypes of 1318 clones at 58K SNP markers, with about 26K markers filtered using standard quality controls. Predictions for tonnes of cane per hectare (TCH), commercial cane sugar (CCS), and fiber content (Fiber) were made using parametric, non-parametric, and Bayesian methods. Continuous genotypes increased accuracy by 5%-7% for CCS and Fiber. The pseudo-diploid parametrization performed better for TCH. Reproducing kernel Hilbert spaces model with Gaussian kernel and AK4 (arc-cosine kernel with hidden layer 4) kernel outperformed other methods for TCH and CCS, suggesting that non-additive effects might influence these traits. The prevalence of low-dosage markers in the study may have limited the benefits of approximating allele-dosage information with continuous genotypes in genomic prediction models. Continuous genotypes simplify genomic prediction in polyploid crops, allowing additional markers to be used without adhering to pseudo-diploid inheritance. The approach can particularly benefit high ploidy species or emerging crops with unknown ploidy.
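The difference between pseudo-diploid and continuous genotype coding mentioned in this abstract can be sketched as follows. This is an illustration only: the input is assumed to be a per-marker allele ratio between 0 and 1, and the function names, the 0.1 and 0.9 cut-offs, and the scaling of the continuous code to the 0-2 range are hypothetical choices, not values taken from the study.

```python
def pseudo_diploid_code(allele_ratio, low=0.1, high=0.9):
    """Collapse an allele ratio into 0, 1 or 2; every heterozygote becomes class 1.

    The low/high cut-offs are illustrative assumptions.
    """
    if allele_ratio <= low:
        return 0
    if allele_ratio >= high:
        return 2
    return 1

def continuous_code(allele_ratio):
    """Keep the allele ratio as a continuous proxy for dosage (scaled to 0-2 by assumption)."""
    return 2.0 * allele_ratio

# Two hypothetical clones with different dosages of the same allele.
low_dosage, high_dosage = 0.25, 0.75
```

Under the pseudo-diploid coding both clones fall into the same heterozygous class, whereas the continuous coding keeps them distinct, which is the extra information the abstract hypothesizes could help prediction.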
Affiliation(s)
- Seema Yadav
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, Queensland, Australia
- Elizabeth M Ross
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, Queensland, Australia
- Xianming Wei
- Sugar Research Australia, Mackay, Queensland, Australia
- Shouye Liu
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- Loan To Nguyen
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, Queensland, Australia
- Owen Powell
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, Queensland, Australia
- Lee T Hickey
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, Queensland, Australia
- Emily Deomano
- Sugar Research Australia, Indooroopilly, Queensland, Australia
- Felicity Atkin
- Sugar Research Australia, Meringa Gordonvale, Queensland, Australia
- Kai P Voss-Fels
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, Queensland, Australia
- Department of Grapevine Breeding, Hochschule Geisenheim University, Geisenheim, Germany
- Ben J Hayes
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, Queensland, Australia
7
Cuyabano BCD, Boichard D, Gondro C. Expected values for the accuracy of predicted breeding values accounting for genetic differences between reference and target populations. Genet Sel Evol 2024; 56:15. [PMID: 38424504 PMCID: PMC11234767 DOI: 10.1186/s12711-024-00876-9] [Received: 11/25/2022] [Accepted: 01/08/2024] [Indexed: 03/02/2024]
Abstract
BACKGROUND Genetic merit, or breeding values as referred to in livestock and crop breeding programs, is one of the keys to the successful selection of animals in commercial farming systems. The developments in statistical methods during the twentieth century and single nucleotide polymorphism (SNP) chip technologies in the twenty-first century have revolutionized agricultural production, by allowing highly accurate predictions of breeding values for selection candidates at a very early age. Nonetheless, for many breeding populations, realized accuracies of predicted breeding values (PBV) remain below the theoretical maximum, even when the reference population is sufficiently large, and SNPs included in the model are in sufficient linkage disequilibrium (LD) with the quantitative trait locus (QTL). This is particularly noticeable over generations, as we observe the so-called erosion of the effects of SNPs due to recombinations, accompanied by the erosion of the accuracy of prediction. While accurately quantifying the erosion at the individual SNP level is a difficult and unresolved task, quantifying the erosion of the accuracy of prediction is a more tractable problem. In this paper, we describe a method that uses the relationship between reference and target populations to calculate expected values for the accuracies of predicted breeding values for non-phenotyped individuals accounting for erosion. The accuracy of the expected values was evaluated through simulations, and a further evaluation was performed on real data. RESULTS Using simulations, we empirically confirmed that our expected values for the accuracy of PBV accounting for erosion were able to correctly determine the prediction accuracy of breeding values for non-phenotyped individuals. When comparing the expected to the realized accuracies of PBV with real data, only one out of the four traits evaluated presented accuracies that were significantly higher than the expected, approaching h².
CONCLUSIONS We defined an index of genetic correlation between reference and target populations, which summarizes the expected overall erosion due to differences in allele frequencies and LD patterns between populations. We used this correlation along with a trait's heritability to derive expected values for the accuracy (R) of PBV accounting for the erosion, and demonstrated that our derived E[R | erosion] is a reliable metric.
Affiliation(s)
- Beatriz C D Cuyabano
- INRAE, AgroParisTech, GABI, Université Paris Saclay, 78350, Jouy-en-Josas, France.
- Didier Boichard
- INRAE, AgroParisTech, GABI, Université Paris Saclay, 78350, Jouy-en-Josas, France
- Cedric Gondro
- Department of Animal Science, Michigan State University, 474 S Shaw Ln, East Lansing, MI, 48824, USA
8
Hoque A, Anderson JV, Rahman M. Genomic prediction for agronomic traits in a diverse Flax (Linum usitatissimum L.) germplasm collection. Sci Rep 2024; 14:3196. [PMID: 38326469 PMCID: PMC10850546 DOI: 10.1038/s41598-024-53462-w] [Received: 07/28/2023] [Accepted: 01/31/2024] [Indexed: 02/09/2024]
Abstract
Breeding programs require exhaustive phenotyping of germplasms, which is time-demanding and expensive. Genomic prediction helps breeders harness the diversity of any collection to bypass phenotyping. Here, we examined the potential of genomic prediction for seed yield and nine agronomic traits using 26,171 single nucleotide polymorphism (SNP) markers in a set of 337 flax (Linum usitatissimum L.) germplasm, phenotyped in five environments. We evaluated 14 prediction models and several factors affecting predictive ability based on cross-validation schemes. With the whole marker set, predictive ability varied significantly among models and across traits. The ridge regression (RR) model covering additive gene action yielded better predictive ability for most of the traits, whereas models capturing epistatic gene action gave higher predictive ability for low-heritability traits. Marker subsets based on linkage disequilibrium decay distance gave significantly higher predictive abilities than the whole marker set, but for randomly selected markers, predictive ability reached a plateau above 3000 markers. Markers having significant association with traits improved predictive abilities compared to the whole marker set when marker selection was made on the whole population instead of the training set, indicating clear overfitting. The correction for population structure did not increase predictive abilities compared to the whole collection. However, stratified sampling by picking representative genotypes from each cluster improved predictive abilities. The indirect predictive ability for a trait was proportionate to its correlation with other traits. These results will help breeders to select the best models, optimum marker set, and suitable genotype set to perform an indirect selection for quantitative traits in this diverse flax germplasm collection.
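The overfitting noted in this abstract arises when trait-associated markers are chosen using the whole population, so that the test lines influence the marker set. A minimal sketch of keeping marker selection inside each training fold is given below. It is a simplification under stated assumptions: markers are ranked by absolute Pearson correlation rather than by a GWAS test, folds are assigned by row index, and all function names are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def top_markers(genotypes, phenotypes, rows, n_keep):
    """Rank markers by |correlation| with the phenotype using only the given rows."""
    y = [phenotypes[r] for r in rows]
    scored = []
    for m in range(len(genotypes[0])):
        x = [genotypes[r][m] for r in rows]
        scored.append((abs(pearson(x, y)), m))
    # Highest absolute correlation first; ties broken by marker index.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return sorted(m for _, m in scored[:n_keep])

def fold_marker_sets(genotypes, phenotypes, n_folds, n_keep):
    """For each fold, select markers from the training rows only."""
    n = len(phenotypes)
    out = []
    for fold in range(n_folds):
        test_rows = [i for i in range(n) if i % n_folds == fold]
        train_rows = [i for i in range(n) if i % n_folds != fold]
        out.append((test_rows, top_markers(genotypes, phenotypes, train_rows, n_keep)))
    return out

# Toy data (hypothetical): 6 lines, 2 markers; marker 0 tracks the phenotype exactly.
toy_genotypes = [[0, 1], [1, 0], [2, 1], [0, 2], [1, 1], [2, 0]]
toy_phenotypes = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0]
folds = fold_marker_sets(toy_genotypes, toy_phenotypes, n_folds=3, n_keep=1)
```

A prediction model would then be fitted on the training rows with the selected markers and scored on the held-out rows, so the test lines never inform which markers are kept.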
Affiliation(s)
- Ahasanul Hoque
- Department of Plant Sciences, North Dakota State University, Fargo, ND, USA
- Department of Genetics and Plant Breeding, Bangladesh Agricultural University, Mymensingh, 2202, Bangladesh
- James V Anderson
- USDA-ARS, Edward T. Schafer Agricultural Research Center, Fargo, ND, USA
- Mukhlesur Rahman
- Department of Plant Sciences, North Dakota State University, Fargo, ND, USA.
9
Montesinos-López A, Gutiérrez-Pulido H, Ramos-Pulido S, Montesinos-López JC, Montesinos-López OA, Crossa J. Bayesian discrete lognormal regression model for genomic prediction. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2024; 137:21. [PMID: 38221602 DOI: 10.1007/s00122-023-04526-4] [Received: 05/10/2023] [Accepted: 12/11/2023] [Indexed: 01/16/2024]
Abstract
KEY MESSAGE Genomic prediction models for quantitative traits assume continuous and normally distributed phenotypes. In this research, we proposed a novel Bayesian discrete lognormal regression model. Genomic selection is a powerful tool in modern breeding programs that uses genomic information to predict the performance of individuals and select those with desirable traits. It has revolutionized animal and plant breeding, as it allows breeders to identify the best candidates without labor-intensive and time-consuming phenotypic evaluations. While several statistical models have been developed, most of them have been for quantitative continuous traits and only a few for count responses. In this paper, we propose a discrete lognormal regression model in the Bayesian context, with a Gibbs sampler to explore the corresponding posterior distribution and make the predictions. Two datasets of disease resistance in wheat are used, and the proposed model is evaluated against the traditional Gaussian model and a lognormal model. The results indicate the proposed model is a competitive and natural model for predicting count genomic traits.
Affiliation(s)
- Abelardo Montesinos-López
- Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, C. P. 44430, Guadalajara, Jalisco, México
- Humberto Gutiérrez-Pulido
- Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, C. P. 44430, Guadalajara, Jalisco, México
- Sofía Ramos-Pulido
- Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, C. P. 44430, Guadalajara, Jalisco, México
- José Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km. 45, El Batán, C. P. 56237, Texcoco, Edo. de México, México.
- Colegio de Postgraduados, C. P. 56230, Montecillos, Edo. de México, México.
- Centre for Crop & Food Innovation, Food Futures Institute, Murdoch University, Murdoch, 6150, Australia.
10
Freudiger A, Jovanovic VM, Huang Y, Snyder-Mackler N, Conrad DF, Miller B, Montague MJ, Westphal H, Stadler PF, Bley S, Horvath JE, Brent LJN, Platt ML, Ruiz-Lambides A, Tung J, Nowick K, Ringbauer H, Widdig A. Taking identity-by-descent analysis into the wild: Estimating realized relatedness in free-ranging macaques. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.09.574911. [PMID: 38260273 PMCID: PMC10802400 DOI: 10.1101/2024.01.09.574911] [Indexed: 01/24/2024]
Abstract
Biological relatedness is a key consideration in studies of behavior, population structure, and trait evolution. Except for parent-offspring dyads, pedigrees capture relatedness imperfectly. The number and length of DNA segments that are identical-by-descent (IBD) yield the most precise estimates of relatedness. Here, we leverage novel methods for estimating locus-specific IBD from low coverage whole genome resequencing data to demonstrate the feasibility and value of resolving fine-scaled gradients of relatedness in free-living animals. Using primarily 4-6× coverage data from a rhesus macaque (Macaca mulatta) population with available long-term pedigree data, we show that we can call the number and length of IBD segments across the genome with high accuracy even at 0.5× coverage. The resulting estimates demonstrate substantial variation in genetic relatedness within kin classes, leading to overlapping distributions between kin classes. They identify cryptic genetic relatives that are not represented in the pedigree and reveal elevated recombination rates in females relative to males, which allows us to discriminate maternal and paternal kin using genotype data alone. Our findings represent a breakthrough in the ability to understand the predictors and consequences of genetic relatedness in natural populations, contributing to our understanding of a fundamental component of population structure in the wild.
Affiliation(s)
- Annika Freudiger
- Behavioral Ecology Research Group, Faculty of Life Sciences, Institute of Biology, Leipzig University, Leipzig, Germany
- Department of Primate Behavior and Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
| | - Vladimir M Jovanovic
- Human Biology and Primate Evolution, Institut für Zoologie, Freie Universität Berlin, Berlin, Germany
- Bioinformatics Solution Center, Freie Universität Berlin, Berlin, Germany
| | - Yilei Huang
- Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Bioinformatics Group, Institute of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig, Germany
| | - Noah Snyder-Mackler
- Center for Evolution & Medicine, School of Life Sciences, Arizona State University, Tempe, USA
| | - Donald F Conrad
- Division of Genetics, Oregon National Primate Research Center, Portland, Oregon, USA
| | - Brian Miller
- Division of Genetics, Oregon National Primate Research Center, Portland, Oregon, USA
| | - Michael J Montague
- Department of Neuroscience, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Hendrikje Westphal
- Behavioral Ecology Research Group, Faculty of Life Sciences, Institute of Biology, Leipzig University, Leipzig, Germany
- Department of Primate Behavior and Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Bioinformatics Group, Institute of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig, Germany
- Peter F Stadler
- Bioinformatics Group, Institute of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig, Germany
- Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
- Institute for Theoretical Chemistry, University of Vienna, Austria
- Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia
- Santa Fe Institute, Santa Fe, NM, USA
- Stefanie Bley
- Behavioral Ecology Research Group, Faculty of Life Sciences, Institute of Biology, Leipzig University, Leipzig, Germany
- Department of Primate Behavior and Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Julie E Horvath
- Department of Biological and Biomedical Sciences, North Carolina Central University, North Carolina, Durham, USA
- Research and Collections Section, North Carolina Museum of Natural Sciences, North Carolina, Raleigh, USA
- Department of Biological Sciences, North Carolina State University, North Carolina, Raleigh, USA
- Department of Evolutionary Anthropology, Duke University, North Carolina, Durham, USA
- Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Lauren J N Brent
- Centre for Research in Animal Behaviour, University of Exeter, Exeter, UK
- Michael L Platt
- Department of Neuroscience, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Marketing Department, the Wharton School of Business, University of Pennsylvania, Philadelphia, PA, USA
- Department of Psychology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA
- Angelina Ruiz-Lambides
- Cayo Santiago Field Station, Caribbean Primate Research Center, University of Puerto Rico, Punta Santiago, Puerto Rico
- Jenny Tung
- Department of Primate Behavior and Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Department of Evolutionary Anthropology, Duke University, North Carolina, Durham, USA
- Department of Biology, Duke University, Durham, North Carolina, USA
- Duke University Population Research Institute, Durham, North Carolina, USA
- Katja Nowick
- Human Biology and Primate Evolution, Institut für Zoologie, Freie Universität Berlin, Berlin, Germany
- Bioinformatics Solution Center, Freie Universität Berlin, Berlin, Germany
- Harald Ringbauer
- Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Anja Widdig
- Behavioral Ecology Research Group, Faculty of Life Sciences, Institute of Biology, Leipzig University, Leipzig, Germany
- Department of Primate Behavior and Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Germany
11
Dong L, Xie Y, Zhang Y, Wang R, Sun X. Genomic dissection of additive and non-additive genetic effects and genomic prediction in an open-pollinated family test of Japanese larch. BMC Genomics 2024; 25:11. [PMID: 38166605 PMCID: PMC10759612 DOI: 10.1186/s12864-023-09891-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Accepted: 12/11/2023] [Indexed: 01/05/2024] Open
Abstract
Genomic dissection of genetic effects on desirable traits and the subsequent use of genomic selection hold great promise for accelerating the rate of genetic improvement of forest tree species. In this study, a total of 661 offspring trees from 66 open-pollinated families of Japanese larch (Larix kaempferi (Lam.) Carrière) were sampled at a test site. The contributions of additive and non-additive effects (dominance, imprinting and epistasis) were evaluated for nine valuable traits related to growth, wood physical and chemical properties, and competitive ability using three pedigree-based and four Genomics-based Best Linear Unbiased Predictions (GBLUP) models and used to determine the genetic model. The predictive ability (PA) of two genomic prediction methods, GBLUP and Reproducing Kernel Hilbert Spaces (RKHS), was compared. The traits could be classified into two types based on different quantitative genetic architectures: for type I, including wood chemical properties and Pilodyn penetration, additive effect is the main source of variation (38.20-67.46%); for type II, including growth, competitive ability and acoustic velocity, epistasis plays a significant role (50.76-91.26%). Dominance and imprinting showed low to moderate contributions (< 36.26%). GBLUP was more suitable for traits of type I (PAs = 0.37-0.39 vs. 0.14-0.25), and RKHS was more suitable for traits of type II (PAs = 0.23-0.37 vs. 0.07-0.23). Non-additive effects make no meaningful contribution to the enhancement of PA of GBLUP method for all traits. These findings enhance our current understanding of the architecture of quantitative traits and lay the foundation for the development of genomic selection strategies in Japanese larch.
Affiliation(s)
- Leiming Dong
- State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of State Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China
- Key Laboratory of National Forestry and Grassland Administration on Plant Ex situ Conservation, Beijing Floriculture Engineering Technology Research Centre, Beijing Botanical Garden, Beijing, 100093, China
- Yunhui Xie
- State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of State Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China
- Yalin Zhang
- State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of State Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China
- Ruizhen Wang
- Key Laboratory of National Forestry and Grassland Administration on Plant Ex situ Conservation, Beijing Floriculture Engineering Technology Research Centre, Beijing Botanical Garden, Beijing, 100093, China
- Xiaomei Sun
- State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of State Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China.
12
Fradgley NS, Bentley AR, Gardner KA, Swarbreck SM, Kerton M. Maintenance of UK bread baking quality: Trends in wheat quality traits over 50 years of breeding and potential for future application of genomic-assisted selection. THE PLANT GENOME 2023; 16:e20326. [PMID: 37057385 DOI: 10.1002/tpg2.20326] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/15/2022] [Revised: 02/22/2023] [Accepted: 02/28/2023] [Indexed: 06/19/2023]
Abstract
Improved selection of wheat varieties with high end-use quality contributes to sustainable food systems by ensuring productive crops are suitable for human consumption end-uses. Here, we investigated the genetic control and genomic prediction of milling and baking quality traits in a panel of 379 historic and elite, high-quality UK bread wheat (Triticum aestivum L.) varieties and breeding lines. Analysis of the panel showed that genetic diversity has not declined over recent decades of selective breeding while phenotypic analysis found a clear trend of increased loaf baking quality of modern milling wheats despite declining grain protein content. Genome-wide association analysis identified 24 quantitative trait loci (QTL) across all quality traits, many of which had pleiotropic effects. Changes in the frequency of positive alleles of QTL over recent decades reflected trends in trait variation and revealed where progress has historically been made for improved baking quality traits. It also demonstrates opportunities for marker-assisted selection for traits such as Hagberg falling number and specific weight that do not appear to have been improved by recent decades of phenotypic selection. We demonstrate that applying genomic prediction in a commercial wheat breeding program for expensive late-stage loaf baking quality traits outperforms phenotypic selection based on early-stage predictive quality traits. Finally, trait-assisted genomic prediction combining both phenotypic and genomic selection enabled slightly higher prediction accuracy, but genomic prediction alone was the most cost-effective selection strategy considering genotyping and phenotyping costs per sample.
Affiliation(s)
- Nick S Fradgley
- Genetics and Pre-Breeding Department, National Institute of Agricultural Botany (NIAB), 93 Lawrence Weaver Road, Cambridge, UK
- Alison R Bentley
- Genetics and Pre-Breeding Department, National Institute of Agricultural Botany (NIAB), 93 Lawrence Weaver Road, Cambridge, UK
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, México
- Keith A Gardner
- Genetics and Pre-Breeding Department, National Institute of Agricultural Botany (NIAB), 93 Lawrence Weaver Road, Cambridge, UK
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, México
- Stéphanie M Swarbreck
- Genetics and Pre-Breeding Department, National Institute of Agricultural Botany (NIAB), 93 Lawrence Weaver Road, Cambridge, UK
13
Singh V, Krause M, Sandhu D, Sekhon RS, Kaundal A. Salinity stress tolerance prediction for biomass-related traits in maize (Zea mays L.) using genome-wide markers. THE PLANT GENOME 2023; 16:e20385. [PMID: 37667417 DOI: 10.1002/tpg2.20385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 07/18/2023] [Accepted: 08/14/2023] [Indexed: 09/06/2023]
Abstract
Maize (Zea mays L.) is the third most important cereal crop after rice (Oryza sativa) and wheat (Triticum aestivum). Salinity stress significantly affects vegetative biomass and grain yield and, therefore, reduces the food and silage productivity of maize. Selecting salt-tolerant genotypes is a cumbersome and time-consuming process that requires meticulous phenotyping. To predict salt tolerance in maize, we estimated breeding values for four biomass-related traits, including shoot length, shoot weight, root length, and root weight under salt-stressed and controlled conditions. A five-fold cross-validation method was used to select the best model among genomic best linear unbiased prediction (GBLUP), ridge-regression BLUP (rrBLUP), extended GBLUP, Bayesian Lasso, Bayesian ridge regression, BayesA, BayesB, and BayesC. Examination of the effect of different marker densities on prediction accuracy revealed that a set of low-density single nucleotide polymorphisms obtained through filtering based on a combination of analysis of variance and linkage disequilibrium provided the best prediction accuracy for all the traits. The average prediction accuracy in cross-validations ranged from 0.46 to 0.77 across the four derived traits. The GBLUP, rrBLUP, and all Bayesian models except BayesB demonstrated comparable levels of prediction accuracy that were superior to the other modeling approaches. These findings provide a roadmap for the deployment and optimization of genomic selection in breeding for salt tolerance in maize.
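The five-fold cross-validation used in this abstract to compare models can be illustrated with a minimal sketch. It assumes simulated marker data and a plain ridge regression as a stand-in for rrBLUP; the function names `ridge_fit` and `kfold_accuracy`, the penalty `lam`, and all data are hypothetical and not taken from the study.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression on centred data; returns (intercept, marker effects)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return y_mean - x_mean @ beta, beta

def kfold_accuracy(X, y, lam=1.0, k=5, seed=0):
    """Mean correlation between predicted and observed values over k random folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accuracies = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        intercept, beta = ridge_fit(X[train_idx], y[train_idx], lam)
        predicted = intercept + X[test_idx] @ beta
        accuracies.append(np.corrcoef(predicted, y[test_idx])[0, 1])
    return float(np.mean(accuracies))

# Simulated example (hypothetical): 200 lines, 50 markers coded 0/1/2.
rng = np.random.default_rng(1)
X = rng.integers(0, 3, size=(200, 50)).astype(float)
y = X @ rng.normal(size=50) + rng.normal(size=200)
acc = kfold_accuracy(X, y, lam=10.0)
print(f"mean five-fold prediction accuracy: {acc:.2f}")
```

In practice the penalty would be tuned or derived from variance components, and the same folds would be reused across all candidate models so that their accuracies are comparable.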
Affiliation(s)
- Vishal Singh
- Plants, Soils, and Climate, College of Agricultural and Applied Sciences, Utah State University, Logan, Utah, USA
- ICAR-Indian Institute of Maize Research, Ludhiana, Punjab, India
- Margaret Krause
- Plants, Soils, and Climate, College of Agricultural and Applied Sciences, Utah State University, Logan, Utah, USA
- Devinder Sandhu
- US Salinity Laboratory (USDA-ARS), Riverside, California, USA
- Rajandeep S Sekhon
- Department of Genetics and Biochemistry, Clemson University, Clemson, South Carolina, USA
- Amita Kaundal
- Plants, Soils, and Climate, College of Agricultural and Applied Sciences, Utah State University, Logan, Utah, USA
14
Lopez-Cruz M, Aguate FM, Washburn JD, de Leon N, Kaeppler SM, Lima DC, Tan R, Thompson A, De La Bretonne LW, de Los Campos G. Leveraging data from the Genomes-to-Fields Initiative to investigate genotype-by-environment interactions in maize in North America. Nat Commun 2023; 14:6904. [PMID: 37903778 PMCID: PMC10616096 DOI: 10.1038/s41467-023-42687-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/08/2023] [Accepted: 10/18/2023] [Indexed: 11/01/2023] Open
Abstract
Genotype-by-environment (G×E) interactions can significantly affect crop performance and stability. Investigating G×E requires extensive data sets with diverse cultivars tested over multiple locations and years. The Genomes-to-Fields (G2F) Initiative has tested maize hybrids in more than 130 year-locations in North America since 2014. Here, we curate and expand this data set by generating environmental covariates (using a crop model) for each of the trials. The resulting data set includes DNA genotypes and environmental data linked to more than 70,000 phenotypic records of grain yield and flowering traits for more than 4000 hybrids. We show how this valuable data set can serve as a benchmark in agricultural modeling and prediction, paving the way for countless G×E investigations in maize. We use multivariate analyses to characterize the data set's genetic and environmental structure, study the association of key environmental factors with traits, and provide benchmarks using genomic prediction models.
Affiliation(s)
- Marco Lopez-Cruz
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI, 48824, USA.
- Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA.
- Fernando M Aguate
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI, 48824, USA
- Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA
- Jacob D Washburn
- United States Department of Agriculture, Agricultural Research Service, University of Missouri, Columbia, MO, 65211, USA
- Natalia de Leon
- Department of Agronomy, University of Wisconsin, Madison, WI, 53706, USA
- Shawn M Kaeppler
- Department of Agronomy, University of Wisconsin, Madison, WI, 53706, USA
- Wisconsin Crop Innovation Center, University of Wisconsin, Middleton, WI, 53562, USA
- Ruijuan Tan
- Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI, 48824, USA
- Addie Thompson
- Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI, 48824, USA
- Plant Resilience Institute, Michigan State University, East Lansing, MI, 48824, USA
- Gustavo de Los Campos
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI, 48824, USA.
- Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA.
- Department of Statistics and Probability, Michigan State University, East Lansing, MI, 48824, USA.
15
Muqaddasi QH, Muqaddasi RK, Ebmeyer E, Korzun V, Argillier O, Mirdita V, Reif JC, Ganal MW, Röder MS. Genetic control and prospects of predictive breeding for European winter wheat's Zeleny sedimentation values and Hagberg-Perten falling number. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2023; 136:229. [PMID: 37874400 PMCID: PMC10598174 DOI: 10.1007/s00122-023-04450-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Accepted: 08/16/2023] [Indexed: 10/25/2023]
Abstract
KEY MESSAGE: Sedimentation values and falling number in the last decades have helped maintain high baking quality despite rigorous selection for grain yield in wheat. Allelic combinations of major loci sustained the bread-making quality while improving grain yield. Glu-D1, Pinb-D1, and non-gluten proteins are associated with sedimentation values and falling number in European wheat.
Zeleny sedimentation values (ZSV) and Hagberg-Perten falling number (HFN) are among the most important parameters that help determine the baking quality classes of wheat and, thus, influence the monetary benefits for growers. We used a published data set of 372 European wheat varieties evaluated in replicated field trials in multiple environments. ZSV and HFN traits hold a wide and significant genotypic variation and high broad-sense heritability. The genetic correlations revealed positive and significant associations of ZSV and HFN with each other, grain protein content (GPC) and grain hardness; however, they were all significantly negatively correlated with grain yield. Besides, GPC appeared to be the major predictor for ZSV and HFN. Our genome-wide association analyses based on high-quality SSR, SNP, and candidate gene markers revealed a strong quantitative genetic nature of ZSV and HFN by explaining their total genotypic variance as 41.49% and 38.06%, respectively. The association of known Glutenin (Glu-1) and Puroindoline (Pin-1) with ZSV provided positive analytic proof of our studies. We report novel candidate loci associated with globulins and albumins, the non-gluten monomeric proteins in wheat. In addition, predictive breeding analyses for ZSV and HFN suggest using genomic selection in the early stages of breeding programs with an average prediction accuracy of 81 and 59%, respectively.
Affiliation(s)
- Quddoos H Muqaddasi
- European Wheat Breeding Center, BASF Agricultural Solutions GmbH, Am Schwabeplan 8, 06466, Stadt Seeland OT Gatersleben, Germany.
- KWS SAAT SE & Co. KGaA, Einbeck, 37574, Germany.
- Roop Kamal Muqaddasi
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, 06466, Stadt Seeland OT Gatersleben, Germany
- Vilson Mirdita
- European Wheat Breeding Center, BASF Agricultural Solutions GmbH, Am Schwabeplan 8, 06466, Stadt Seeland OT Gatersleben, Germany
- Jochen C Reif
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, 06466, Stadt Seeland OT Gatersleben, Germany
- Martin W Ganal
- TraitGenetics GmbH, Am Schwabeplan 1B, 06466, Stadt Seeland OT Gatersleben, Germany
- Marion S Röder
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, 06466, Stadt Seeland OT Gatersleben, Germany
16
Sadeqi MB, Ballvora A, Dadshani S, Léon J. Genetic Parameter and Hyper-Parameter Estimation Underlie Nitrogen Use Efficiency in Bread Wheat. Int J Mol Sci 2023; 24:14275. [PMID: 37762585 PMCID: PMC10531695 DOI: 10.3390/ijms241814275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Revised: 09/07/2023] [Accepted: 09/14/2023] [Indexed: 09/29/2023] Open
Abstract
Estimation and prediction play a key role in breeding programs. Currently, phenotyping of complex traits such as nitrogen use efficiency (NUE) in wheat is still expensive, requires high-throughput technologies and is very time consuming compared to genotyping. Therefore, researchers are trying to predict phenotypes based on marker information. Genetic parameters such as population structure, genomic relationship matrix, marker density and sample size are major factors that increase the performance and accuracy of a model. However, they play an important role in adjusting the statistically significant false discovery rate (FDR) threshold in estimation. In parallel, there are many genetic hyper-parameters that are hidden and not represented in the given genomic selection (GS) model but have significant effects on the results, such as panel size, number of markers, minor allele frequency, number of call rates for each marker, number of cross validations and batch size in the training set of the genomic file. The main challenge is to ensure the reliability and accuracy of predicted breeding values (BVs) as results. Our study has confirmed the results of bias-variance tradeoff and adaptive prediction error for the ensemble-learning-based model STACK, which has the highest performance when estimating genetic parameters and hyper-parameters in a given GS model compared to other models.
Affiliation(s)
- Mohammad Bahman Sadeqi
- INRES-Plant Breeding, Rheinische Friedrich-Wilhelms-Universität Bonn, 53113 Bonn, Germany; (M.B.S.); (J.L.)
- Agim Ballvora
- INRES-Plant Breeding, Rheinische Friedrich-Wilhelms-Universität Bonn, 53113 Bonn, Germany; (M.B.S.); (J.L.)
- Said Dadshani
- INRES-Plant Nutrition, Rheinische Friedrich-Wilhelms-Universität Bonn, 53113 Bonn, Germany;
- Jens Léon
- INRES-Plant Breeding, Rheinische Friedrich-Wilhelms-Universität Bonn, 53113 Bonn, Germany; (M.B.S.); (J.L.)
17
Montesinos-López OA, Crossa J, Saint Pierre C, Gerard G, Valenzo-Jiménez MA, Vitale P, Valladares-Cellis PE, Buenrostro-Mariscal R, Montesinos-López A, Crespo-Herrera L. Multivariate Genomic Hybrid Prediction with Kernels and Parental Information. Int J Mol Sci 2023; 24:13799. [PMID: 37762107 PMCID: PMC10531250 DOI: 10.3390/ijms241813799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2023] [Revised: 08/28/2023] [Accepted: 09/01/2023] [Indexed: 09/29/2023] Open
Abstract
Genomic selection (GS) plays a pivotal role in hybrid prediction. It can enhance the selection of parental lines, accurately predict hybrid performance, and harness hybrid vigor. Likewise, it can optimize breeding strategies by reducing field trial requirements, expediting hybrid development, facilitating targeted trait improvement, and enhancing adaptability to diverse environments. Leveraging genomic information empowers breeders to make informed decisions and significantly improve the efficiency and success rate of hybrid breeding programs. To improve genomic prediction performance, we explored the incorporation of parental phenotypic information as covariates under a multi-trait framework. Approach 1, referred to as Pmean, directly utilized parental phenotypic information without any preprocessing. Approach 2, denoted as BV, replaced the direct use of phenotypic values of both parents with their respective breeding values. An improvement in prediction performance was observed in both approaches, with a minimum 4.24% reduction in the normalized root mean square error (NRMSE), but the direct incorporation of parental phenotypic information in the Pmean approach slightly outperformed the BV approach. We also compared these two approaches using linear and nonlinear kernels, but no relevant gain was observed. Finally, our results add to the empirical evidence that integrating parental phenotypic information helps increase the prediction performance of hybrids.
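The NRMSE comparison reported in this abstract can be made concrete with a small sketch. The abstract does not state which normalization was used, so dividing the RMSE by the mean of the observed values is an assumption here; `nrmse`, `percent_reduction`, and the toy numbers are hypothetical.

```python
import numpy as np

def nrmse(observed, predicted):
    """RMSE divided by the mean of the observed values (assumed normalization)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((observed - predicted) ** 2))
    return rmse / np.mean(observed)

def percent_reduction(baseline, improved):
    """Relative reduction of an error metric, in percent of the baseline value."""
    return 100.0 * (baseline - improved) / baseline

# Toy hybrid yields (hypothetical) predicted without and with parental covariates.
observed = np.array([5.1, 6.3, 4.8, 7.0])
without_parents = np.array([5.9, 5.6, 5.5, 6.1])
with_parents = np.array([5.5, 6.0, 5.1, 6.6])
gain = percent_reduction(nrmse(observed, without_parents), nrmse(observed, with_parents))
print(f"NRMSE reduction: {gain:.1f}%")
```

Other normalizations (range or standard deviation of the observed values) change the NRMSE itself but not the percent reduction, as long as both models are scored on the same observations.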
Affiliation(s)
- José Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, Texcoco 52640, México, Mexico; (J.C.); (C.S.P.); (G.G.); (P.V.)
- Colegio de Postgraduados, Montecillos 56230, México, Mexico
- Carolina Saint Pierre
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, Texcoco 52640, México, Mexico; (J.C.); (C.S.P.); (G.G.); (P.V.)
- Guillermo Gerard
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, Texcoco 52640, México, Mexico; (J.C.); (C.S.P.); (G.G.); (P.V.)
- Marco Alberto Valenzo-Jiménez
- Universidad Michoacana de San Nicolas de Hidalgo (UMSNH), Avenida Francisco J. Mujica S/N Ciudad Universitaria, Morelia 58030, Michoacán, Mexico
- Paolo Vitale
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, Texcoco 52640, México, Mexico; (J.C.); (C.S.P.); (G.G.); (P.V.)
- Abelardo Montesinos-López
- Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara 44430, Jalisco, Mexico
- Leonardo Crespo-Herrera
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, Texcoco 52640, México, Mexico; (J.C.); (C.S.P.); (G.G.); (P.V.)
18
El Hanafi S, Jiang Y, Kehel Z, Schulthess AW, Zhao Y, Mascher M, Haupt M, Himmelbach A, Stein N, Amri A, Reif JC. Genomic predictions to leverage phenotypic data across genebanks. FRONTIERS IN PLANT SCIENCE 2023; 14:1227656. [PMID: 37701801 PMCID: PMC10493331 DOI: 10.3389/fpls.2023.1227656] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/23/2023] [Accepted: 08/07/2023] [Indexed: 09/14/2023]
Abstract
Genome-wide prediction is a powerful tool in breeding. Initial results suggest that genome-wide approaches are also promising for enhancing the use of the genebank material: predicting the performance of plant genetic resources can unlock their hidden potential and fill the information gap in genebanks across the world and, hence, underpin prebreeding programs. As a proof of concept, we evaluated the power of across-genebank prediction for extensive germplasm collections relying on historical data on flowering/heading date, plant height, and thousand kernel weight of 9,344 barley (Hordeum vulgare L.) plant genetic resources from the German Federal Ex situ Genebank for Agricultural and Horticultural Crops (IPK) and of 1,089 accessions from the International Center for Agriculture Research in the Dry Areas (ICARDA) genebank. Based on prediction abilities for each trait, three scenarios for predictive characterization were compared: 1) a benchmark scenario, where test and training sets only contain ICARDA accessions, 2) across-genebank predictions using IPK as training and ICARDA as test set, and 3) integrated genebank predictions that include IPK with 30% of ICARDA accessions as a training set to predict the rest of ICARDA accessions. Within the population of ICARDA accessions, prediction abilities were low to moderate, which was presumably caused by a limited number of accessions used to train the model. Interestingly, ICARDA prediction abilities were boosted up to ninefold by using training sets composed of IPK plus 30% of ICARDA accessions. Pervasive genotype × environment interactions (GEIs) can become a potential obstacle to train robust genome-wide prediction models across genebanks. This suggests that the potential adverse effect of GEI on prediction ability was counterbalanced by the augmented training set with certain connectivity to the test set. 
Therefore, across-genebank predictions hold the promise to improve the curation of the world's genebank collections and contribute significantly to the long-term development of traditional genebanks toward biodigital resource centers.
Affiliation(s)
- Samira El Hanafi
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Yong Jiang
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Zakaria Kehel
- International Center for Agricultural Research in Dry Areas (ICARDA), Rabat, Morocco
- Albert W. Schulthess
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Yusheng Zhao
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Martin Mascher
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Max Haupt
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Axel Himmelbach
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Nils Stein
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Center for Integrated Breeding Research (CiBreed), Georg-August-University, Göttingen, Germany
- Ahmed Amri
- International Center for Agricultural Research in Dry Areas (ICARDA), Rabat, Morocco
- Jochen C. Reif
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
19
Susmitha P, Kumar P, Yadav P, Sahoo S, Kaur G, Pandey MK, Singh V, Tseng TM, Gangurde SS. Genome-wide association study as a powerful tool for dissecting competitive traits in legumes. FRONTIERS IN PLANT SCIENCE 2023; 14:1123631. [PMID: 37645459 PMCID: PMC10461012 DOI: 10.3389/fpls.2023.1123631] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/14/2022] [Accepted: 06/08/2023] [Indexed: 08/31/2023]
Abstract
Legumes are extremely valuable because of their high protein content and several other nutritional components. The major challenge lies in maintaining the quantity and quality of protein and other nutritional compounds in view of climate change conditions. The global need for plant-based proteins has increased the demand for seeds with a high protein content that includes essential amino acids. Genome-wide association studies (GWAS) have evolved as a standard approach in agricultural genetics for examining such intricate characters. Recent development in machine learning methods shows promising applications for dimensionality reduction, which is a major challenge in GWAS. With the advancement in biotechnology, sequencing, and bioinformatics tools, estimation of linkage disequilibrium (LD) based associations between a genome-wide collection of single-nucleotide polymorphisms (SNPs) and desired phenotypic traits has become accessible. The markers from GWAS could be utilized for genomic selection (GS) to predict superior lines by calculating genomic estimated breeding values (GEBVs). For prediction accuracy, an assortment of statistical models could be utilized, such as ridge regression best linear unbiased prediction (rrBLUP), genomic best linear unbiased predictor (gBLUP), Bayesian, and random forest (RF). Both naturally diverse germplasm panels and family-based breeding populations can be used for association mapping based on the nature of the breeding system (inbred or outbred) in the plant species. MAGIC, MCILs, RIAILs, NAM, and ROAM are being used for association mapping in several crops. Several modifications of NAM, such as doubled haploid NAM (DH-NAM), backcross NAM (BC-NAM), and advanced backcross NAM (AB-NAM), have also been used in crops like rice, wheat, maize, barley, mustard, etc. For reliable marker-trait associations (MTAs), phenotyping accuracy is as important as genotyping.
High-throughput genotyping, phenomics, and computational techniques have advanced during the past few years, making it possible to explore such enormous datasets. Each population has unique virtues and flaws at the genomics and phenomics levels, which will be covered in more detail in this review study. The current investigation includes utilizing elite breeding lines as an association mapping population, optimizing the choice of GWAS selection, population size, and hurdles in phenotyping, and statistical methods which will analyze competitive traits in legume breeding.
Affiliation(s)
- Pusarla Susmitha
- Regional Agricultural Research Station, Acharya N.G. Ranga Agricultural University, Andhra Pradesh, India
- Pawan Kumar
- Department of Genetics and Plant Breeding, College of Agriculture, Chaudhary Charan Singh (CCS) Haryana Agricultural University, Hisar, India
- Pankaj Yadav
- Department of Bioscience and Bioengineering, Indian Institute of Technology, Rajasthan, India
- Smrutishree Sahoo
- Department of Genetics and Plant Breeding, School of Agriculture, Gandhi Institute of Engineering and Technology (GIET) University, Odisha, India
- Gurleen Kaur
- Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
- Manish K. Pandey
- Department of Genomics, Prebreeding and Bioinformatics, International Crops Research Institute for the Semi-Arid Tropics, Hyderabad, India
- Varsha Singh
- Department of Plant and Soil Sciences, Mississippi State University, Starkville, MS, United States
- Te Ming Tseng
- Department of Plant and Soil Sciences, Mississippi State University, Starkville, MS, United States
- Sunil S. Gangurde
- Department of Plant Pathology, University of Georgia, Tifton, GA, United States
|
20
|
Alves AAC, Fernandes AFA, Lopes FB, Breen V, Hawken R, Gianola D, Rosa GJDM. (Quasi) multitask support vector regression with heuristic hyperparameter optimization for whole-genome prediction of complex traits: a case study with carcass traits in broilers. G3 (BETHESDA, MD.) 2023; 13:jkad109. [PMID: 37216670 PMCID: PMC10411556 DOI: 10.1093/g3journal/jkad109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Revised: 03/13/2023] [Accepted: 04/24/2023] [Indexed: 05/24/2023]
Abstract
This study investigates nonlinear kernels for multitrait (MT) genomic prediction using support vector regression (SVR) models. We assessed the predictive ability delivered by single-trait (ST) and MT models for 2 carcass traits (CT1 and CT2) measured in purebred broiler chickens. The MT models also included information on indicator traits measured in vivo [growth and feed efficiency trait (FE)]. We proposed an approach termed (quasi) multitask SVR (QMTSVR), with hyperparameter optimization performed via genetic algorithm. ST and MT Bayesian shrinkage and variable selection models [genomic best linear unbiased predictor (GBLUP), BayesC (BC), and reproducing kernel Hilbert space (RKHS) regression] were employed as benchmarks. MT models were trained using 2 validation designs (CV1 and CV2), which differ in whether information on secondary traits is available in the testing set. Models' predictive ability was assessed with prediction accuracy (ACC; i.e. the correlation between predicted and observed values, divided by the square root of phenotype accuracy), standardized root-mean-squared error (RMSE*), and inflation factor (b). To account for potential bias in CV2-style predictions, we also computed a parametric estimate of accuracy (ACCpar). Predictive ability metrics varied according to trait, model, and validation design (CV1 or CV2), ranging from 0.71 to 0.84 for ACC, 0.78 to 0.92 for RMSE*, and 0.82 to 1.34 for b. The highest ACC and smallest RMSE* were achieved with QMTSVR-CV2 in both traits. We observed that for CT1, model/validation design selection was sensitive to the choice of accuracy metric (ACC or ACCpar). Nonetheless, the higher predictive accuracy of QMTSVR over MTGBLUP and MTBC was replicated across accuracy metrics, as was the similar performance of the proposed method and the MTRKHS model. Results showed that the proposed approach is competitive with conventional MT Bayesian regression models using either Gaussian or spike-slab multivariate priors.
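The three predictive-ability metrics named in this abstract can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the abstract defines ACC explicitly, but the standardization of RMSE* (here, division by the standard deviation of the observations) and the definition of b (here, the slope of the regression of observed on predicted values) are assumptions, and the function name `prediction_metrics` is hypothetical.

```python
import numpy as np

def prediction_metrics(y_obs, y_pred, phenotype_accuracy=1.0):
    """Return (ACC, standardized RMSE, inflation factor b) for one validation set."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Correlation between predicted and observed values.
    r = np.corrcoef(y_obs, y_pred)[0, 1]
    # ACC as described in the abstract: correlation / sqrt(phenotype accuracy).
    acc = r / np.sqrt(phenotype_accuracy)
    # Assumption: RMSE is standardized by the standard deviation of the observations.
    rmse_std = np.sqrt(np.mean((y_obs - y_pred) ** 2)) / np.std(y_obs, ddof=1)
    # Assumption: b is the slope of the regression of observed on predicted values.
    b = np.cov(y_obs, y_pred, ddof=1)[0, 1] / np.var(y_pred, ddof=1)
    return acc, rmse_std, b

# Illustrative values only (not data from the study).
acc, rmse_std, b = prediction_metrics(
    [1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 1.9, 3.2, 3.8, 5.0], phenotype_accuracy=0.8
)
print(acc, rmse_std, b)
```

Under these assumed definitions, a value of b below 1 indicates predictions that are more dispersed than the observations, and a value above 1 indicates predictions that are less dispersed.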
Affiliation(s)
- Vivian Breen
- Cobb-Vantress Inc., Siloam Springs, AR 72761, USA
- Daniel Gianola
- Department of Animal and Dairy Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA
|
21
|
Islam MS, Corak K, McCord P, Hulse-Kemp AM, Lipka AE. A first look at the ability to use genomic prediction for improving the ratooning ability of sugarcane. FRONTIERS IN PLANT SCIENCE 2023; 14:1205999. [PMID: 37600177 PMCID: PMC10433174 DOI: 10.3389/fpls.2023.1205999] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/14/2023] [Accepted: 07/03/2023] [Indexed: 08/22/2023]
Abstract
The sugarcane ratooning ability (RA) is the most important target trait for breeders seeking to enhance the profitability of sugarcane production by reducing the planting cost. Understanding the genetics governing the RA could help breeders by identifying molecular markers that could be used for genomics-assisted breeding (GAB). A replicated field trial of 432 sugarcane clones was conducted for three crop cycles (plant cane, first ratoon, and second ratoon), and the data were used for genome-wide association and genomic prediction of five sugar and yield component traits of the RA. The RA traits for economic index (EI), stalk population (SP), stalk weight (SW), tonnes of cane per hectare (TCH), and tonnes of sucrose per hectare (TSH) were estimated from the yield and sugar data. A total of six putative quantitative trait loci and eight nonredundant single-nucleotide polymorphism (SNP) markers were associated with all five tested RA traits and appear to be unique. Seven putative candidate genes were colocated with significant SNPs associated with the five RA traits. The genomic prediction accuracies for the tested traits were moderate and ranged from 0.21 to 0.36. However, the models fitting fixed effects for the most significant associated markers for each respective trait did not give any advantage over the standard models without fixed effects. As a result of this study, more robust markers could be used in the future for clone selection in sugarcane, potentially helping resolve the genetic control of the RA in sugarcane.
Affiliation(s)
- Keo Corak
- Genomics and Bioinformatics Research Unit, USDA-ARS, Raleigh, NC, United States
- Per McCord
- Sugarcane Field Station, USDA-ARS, Canal Point, FL, United States
- Irrigated Agriculture Research and Extension Center, Washington State University, Prosser, WA, United States
- Amanda M. Hulse-Kemp
- Genomics and Bioinformatics Research Unit, USDA-ARS, Raleigh, NC, United States
- Department of Crop and Soil Sciences, North Carolina State University, Raleigh, NC, United States
- Alexander E. Lipka
- Department of Crop Sciences, University of Illinois, Urbana-Champaign, IL, United States
|
22
|
Adams J, de Vries M, van Eeuwijk F. Efficient Genomic Prediction of Yield and Dry Matter in Hybrid Potato. PLANTS (BASEL, SWITZERLAND) 2023; 12:2617. [PMID: 37514232 PMCID: PMC10385487 DOI: 10.3390/plants12142617] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/02/2023] [Revised: 06/27/2023] [Accepted: 07/07/2023] [Indexed: 07/30/2023]
Abstract
There is an ongoing endeavor within the potato breeding sector to rapidly adapt potato from a clonal polyploid crop to a diploid hybrid potato crop. While hybrid breeding allows for the efficient generation and selection of parental lines, it also increases breeding program complexity and results in longer breeding cycles. Over the past two decades, genomic prediction has revolutionized hybrid crop breeding through shorter breeding cycles, lower phenotyping costs, and better population improvement, resulting in increased genetic gains for genetically complex traits. In order to accelerate the genetic gains in hybrid potato, the proper implementation of genomic prediction is a crucial milestone in the rapid improvement of this crop. The authors of this paper set out to test genomic prediction in hybrid potato using current genotyped material with two alternative models: one model that predicts the general combining ability effects (GCA) and another that predicts both the general and specific combining ability effects (GCA+SCA). Using a training set comprising 769 hybrids and 456 genotyped parental lines, we found that a reasonable prediction accuracy could be achieved for most phenotypes with both zero (ρ=0.36-0.61) and one (ρ=0.50-0.68) common parent between the training and test sets. There was no benefit from including non-additive genetic effects in the GCA+SCA model, despite SCA variance contributing between 9% and 19% of the total genetic variance. Genotype-by-environment interactions, while present, did not appear to affect the prediction accuracy, though prediction errors did vary across the trial's targets. These results suggest that genomically estimated breeding values on parental lines are sufficient for hybrid yield prediction.
Affiliation(s)
- James Adams
- Biometris, Mathematical and Statistical Methods, Wageningen University and Research, 6708 PB Wageningen, The Netherlands
- Solynta, Dreijenlaan 2, 6703 HA Wageningen, The Netherlands
- Fred van Eeuwijk
- Biometris, Mathematical and Statistical Methods, Wageningen University and Research, 6708 PB Wageningen, The Netherlands
|
23
|
Johnsson M. Genomics in animal breeding from the perspectives of matrices and molecules. Hereditas 2023; 160:20. [PMID: 37149663 PMCID: PMC10163706 DOI: 10.1186/s41065-023-00285-w] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 01/12/2023] [Accepted: 05/03/2023] [Indexed: 05/08/2023] Open
Abstract
BACKGROUND This paper describes genomics from two perspectives that are in use in animal breeding and genetics: a statistical perspective concentrating on models for estimating breeding values, and a sequence perspective concentrating on the function of DNA molecules. MAIN BODY This paper reviews the development of genomics in animal breeding and speculates on its future from these two perspectives. From the statistical perspective, genomic data are large sets of markers of ancestry; animal breeding makes use of them while remaining agnostic about their function. From the sequence perspective, genomic data are a source of causative variants; what animal breeding needs is to identify and make use of them. CONCLUSION The statistical perspective, in the form of genomic selection, is the more applicable in contemporary breeding. Animal genomics researchers using the sequence perspective are still working towards the isolation of causative variants, equipped with new technologies but continuing a decades-long line of research.
Affiliation(s)
- Martin Johnsson
- Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Box 7023, Uppsala, 75007, Sweden.
|
24
|
Ficht A, Konkin DJ, Cram D, Sidebottom C, Tan Y, Pozniak C, Rajcan I. Genomic selection for agronomic traits in a winter wheat breeding program. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2023; 136:38. [PMID: 36897431 DOI: 10.1007/s00122-023-04294-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Accepted: 12/19/2022] [Indexed: 06/18/2023]
Abstract
rAMP-seq-based genomic selection for agronomic traits has been shown to be a useful tool for winter wheat breeding programs by increasing the rate of genetic gain. Genomic selection (GS) is an effective strategy to employ in a breeding program that focuses on optimizing quantitative traits, because it enables breeders to select the best genotypes. GS was incorporated into a breeding program to determine the potential for implementation on an annual basis, with emphasis on selecting optimal parents and decreasing the time and costs associated with phenotyping large numbers of genotypes. The design options for applying repeat amplification sequencing (rAMP-seq) in bread wheat were explored, and a low-cost single primer pair strategy was implemented. A total of 1870 winter wheat genotypes were phenotyped and genotyped using rAMP-seq. The optimization of training to testing population size showed that the 70:30 ratio provided the most consistent prediction accuracy. Three GS models (rrBLUP, RKHS, and feed-forward neural networks) were tested using the University of Guelph Winter Wheat Breeding Program (UGWWBP) and Elite-UGWWBP populations. The models performed equally well for both populations and did not differ in prediction accuracy (r) for most agronomic traits, with the exception of yield, where RKHS performed the best with r = 0.34 and 0.39 for the two populations, respectively. The ability to operate a breeding program where multiple selection strategies, including GS, are utilized will lead to higher efficiency in the program and ultimately to a higher rate of genetic gain.
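To make the 70:30 training-to-testing evaluation concrete, the following Python sketch fits a ridge-regression marker model (the estimator underlying rrBLUP) on 70% of a simulated population and reports the correlation r on the remaining 30%. It is a minimal sketch under stated assumptions, not the study's pipeline: the genotypes and phenotypes are simulated, the helper names `rrblup_fit` and `rrblup_predict` are hypothetical, and the shrinkage parameter `lam` is fixed at the simulated ratio of residual to marker variance rather than estimated (in practice it is usually estimated, for example by REML).

```python
import numpy as np

def rrblup_fit(Z, y, lam):
    """Ridge estimate of marker effects on column-centered genotypes."""
    mu = y.mean()
    col_means = Z.mean(axis=0)
    Zc = Z - col_means
    p = Z.shape[1]
    # Solve (Zc'Zc + lam * I) beta = Zc'(y - mu).
    beta = np.linalg.solve(Zc.T @ Zc + lam * np.eye(p), Zc.T @ (y - mu))
    return mu, col_means, beta

def rrblup_predict(model, Z_new):
    """Predict phenotypes for new genotypes from a fitted model tuple."""
    mu, col_means, beta = model
    return mu + (Z_new - col_means) @ beta

# Simulated data (assumption): 200 lines, 500 markers coded 0/1/2.
rng = np.random.default_rng(0)
n, p = 200, 500
Z = rng.integers(0, 3, size=(n, p)).astype(float)
true_beta = rng.normal(0.0, 0.1, size=p)
y = Z @ true_beta + rng.normal(0.0, 1.0, size=n)

# 70:30 split of lines into training and testing sets.
idx = rng.permutation(n)
n_train = n * 7 // 10
train, test = idx[:n_train], idx[n_train:]

# lam = residual variance / marker-effect variance = 1.0 / 0.01 in this simulation.
model = rrblup_fit(Z[train], y[train], lam=100.0)
y_hat = rrblup_predict(model, Z[test])
r = np.corrcoef(y[test], y_hat)[0, 1]
print(f"train={len(train)}, test={len(test)}, r={r:.2f}")
```

Repeating the split with different random permutations and averaging r would give a more stable estimate than a single partition.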
Affiliation(s)
- Alexandra Ficht
- Department of Plant Agriculture, University of Guelph, Crop Science Building, 50 Stone Road East, Guelph, ON, N1G 2W1, Canada
- David J Konkin
- Aquatic and Crop Resource Development Research Centre, National Research Council of Canada, Saskatoon, Canada
- Dustin Cram
- Aquatic and Crop Resource Development Research Centre, National Research Council of Canada, Saskatoon, Canada
- Christine Sidebottom
- Aquatic and Crop Resource Development Research Centre, National Research Council of Canada, Saskatoon, Canada
- Yifang Tan
- Aquatic and Crop Resource Development Research Centre, National Research Council of Canada, Saskatoon, Canada
- Curtis Pozniak
- Department of Plant Sciences, Crop Development Centre, University of Saskatchewan, Room 2E64, Agriculture Building, 51 Campus Drive, Saskatoon, SK, S7N 5A8, Canada
- Istvan Rajcan
- Department of Plant Agriculture, University of Guelph, Crop Science Building, 50 Stone Road East, Guelph, ON, N1G 2W1, Canada.
|
25
|
Fernández-González J, Akdemir D, Isidro Y Sánchez J. A comparison of methods for training population optimization in genomic selection. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2023; 136:30. [PMID: 36892603 PMCID: PMC9998580 DOI: 10.1007/s00122-023-04265-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Accepted: 11/21/2022] [Indexed: 06/18/2023]
Abstract
Maximizing CDmean and Avg_GRM_self were the best criteria for training set optimization. A training set size of 50-55% (targeted) or 65-85% (untargeted) is needed to obtain 95% of the accuracy. With the advent of genomic selection (GS) as a widespread breeding tool, mechanisms to efficiently design an optimal training set for GS models have become more relevant, since they allow maximizing the accuracy while minimizing the phenotyping costs. The literature describes many training set optimization methods, but a comprehensive comparison among them is lacking. This work aimed to provide an extensive benchmark of optimization methods and optimal training set sizes by testing a wide range of them in seven datasets covering six different species, different genetic architectures, population structures, and heritabilities, and with several GS models, in order to provide guidelines about their application in breeding programs. Our results showed that targeted optimization (which uses information from the test set) performed better than untargeted optimization (which does not use test set data), especially when heritability was low. The mean coefficient of determination was the best targeted method, although it was computationally intensive. Minimizing the average relationship within the training set was the best strategy for untargeted optimization. Regarding the optimal training set size, maximum accuracy was obtained when the training set was the entire candidate set. Nevertheless, 50-55% of the candidate set was enough to reach 95-100% of the maximum accuracy in the targeted scenario, while 65-85% was needed for untargeted optimization. Our results also suggested that a diverse training set makes GS robust against population structure, while including clustering information was less effective. The choice of the GS model did not have a significant influence on the prediction accuracies.
Affiliation(s)
- Javier Fernández-González
- Centro de Biotecnologia y Genómica de Plantas (CBGP, UPM-INIA), Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnologia Agraria y Alimentaria (INIA), Campus de Montegancedo-UPM, 28223, Madrid, Spain.
- Deniz Akdemir
- CIBMTR (Center for International Blood and Marrow Transplant Research), National Marrow Donor Program/Be The Match, Minneapolis, USA
- Julio Isidro Y Sánchez
- Centro de Biotecnologia y Genómica de Plantas (CBGP, UPM-INIA), Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnologia Agraria y Alimentaria (INIA), Campus de Montegancedo-UPM, 28223, Madrid, Spain.
|
26
|
Jiménez NP, Feldmann MJ, Famula RA, Pincot DDA, Bjornson M, Cole GS, Knapp SJ. Harnessing underutilized gene bank diversity and genomic prediction of cross usefulness to enhance resistance to Phytophthora cactorum in strawberry. THE PLANT GENOME 2023; 16:e20275. [PMID: 36480594 DOI: 10.1002/tpg2.20275] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/29/2022] [Accepted: 09/19/2022] [Indexed: 05/10/2023]
Abstract
The development of strawberry (Fragaria × ananassa Duchesne ex Rozier) cultivars resistant to Phytophthora crown rot (PhCR), a devastating disease caused by the soil-borne pathogen Phytophthora cactorum (Lebert & Cohn) J. Schröt., has been challenging partly because the resistance phenotypes are quantitative and only moderately heritable. To develop deeper insights into the genetics of resistance and build the foundation for applying genomic selection, a genetically diverse training population was screened for resistance to California isolates of the pathogen. Here we show that genetic gains in breeding for resistance to PhCR have been negligible (3% of the cultivars tested were highly resistant and none surpassed early 20th century cultivars). Narrow-sense genomic heritability for PhCR resistance ranged from 0.41 to 0.75 among training population individuals. Using multivariate genome-wide association studies (GWAS), we identified a large-effect locus (predicted to be RPc2) that explained 43.6-51.6% of the genetic variance, was necessary but not sufficient for resistance, and was associated with calcium channel and other candidate genes with known plant defense functions. The addition of underutilized gene bank resources to our training population doubled additive genetic variance, increased the accuracy of genomic selection, and enabled the discovery of individuals carrying favorable alleles that are either rare or not present in modern cultivars. The incorporation of an RPc2-associated single-nucleotide polymorphism (SNP) as a fixed effect increased genomic prediction accuracy from 0.40 to 0.55. Finally, we show that parent selection using genomic-estimated breeding values, genetic variances, and cross usefulness holds promise for enhancing resistance to PhCR in strawberry.
Affiliation(s)
- Nicolás P Jiménez
- Dep. of Plant Sciences, Univ. of California, One Shields Ave, Davis, CA, 95616, USA
- Mitchell J Feldmann
- Dep. of Plant Sciences, Univ. of California, One Shields Ave, Davis, CA, 95616, USA
- Randi A Famula
- Dep. of Plant Sciences, Univ. of California, One Shields Ave, Davis, CA, 95616, USA
- Dominique D A Pincot
- Dep. of Plant Sciences, Univ. of California, One Shields Ave, Davis, CA, 95616, USA
- Marta Bjornson
- Dep. of Plant Sciences, Univ. of California, One Shields Ave, Davis, CA, 95616, USA
- Glenn S Cole
- Dep. of Plant Sciences, Univ. of California, One Shields Ave, Davis, CA, 95616, USA
- Steven J Knapp
- Dep. of Plant Sciences, Univ. of California, One Shields Ave, Davis, CA, 95616, USA
|
27
|
Fradgley NS, Bacon J, Bentley AR, Costa-Neto G, Cottrell A, Crossa J, Cuevas J, Kerton M, Pope E, Swarbreck SM, Gardner KA. Prediction of near-term climate change impacts on UK wheat quality and the potential for adaptation through plant breeding. GLOBAL CHANGE BIOLOGY 2023; 29:1296-1313. [PMID: 36482280 PMCID: PMC10108302 DOI: 10.1111/gcb.16552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Revised: 11/17/2022] [Accepted: 11/29/2022] [Indexed: 05/26/2023]
Abstract
Wheat is a major crop worldwide, mainly cultivated for human consumption and animal feed. Grain quality is paramount in determining its value and downstream use. While we know that climate change threatens global crop yields, a better understanding of impacts on wheat end-use quality is also critical. Combining quantitative genetics with climate model outputs, we investigated UK-wide trends in genotypic adaptation for wheat quality traits. In our approach, we augmented genomic prediction models with environmental characterisation of field trials to predict trait values and climate effects in historical field trial data between 2001 and 2020. Addition of environmental covariates, such as temperature and rainfall, successfully enabled prediction of genotype by environment interactions (G × E), and increased prediction accuracy of most traits for new genotypes in new year cross validation. We then extended predictions from these models to much larger numbers of simulated environments using climate scenarios projected under Representative Concentration Pathways 8.5 for 2050-2069. We found geographically varying climate change impacts on wheat quality due to contrasting associations between specific weather covariables and quality traits across the UK. Notably, negative impacts on quality traits were predicted in the East of the UK due to increased summer temperatures while the climate in the North and South-west may become more favourable with increased summer temperatures. Furthermore, by projecting 167,040 simulated future genotype-environment combinations, we found only limited potential for breeding to exploit predictable G × E to mitigate year-to-year environmental variability for most traits except Hagberg falling number. This suggests low adaptability of current UK wheat germplasm across future UK climates. More generally, approaches demonstrated here will be critical to enable adaptation of global crops to near-term climate change.
Affiliation(s)
- Alison R. Bentley
- NIAB, Cambridge, UK
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, Mexico
- Jose Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, Mexico
- Jaime Cuevas
- Universidad Autonoma del Estado de Quintana Roo, Chetumal, Quintana Roo, Mexico
- Keith A. Gardner
- NIAB, Cambridge, UK
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, Mexico
|
28
|
Jeon D, Kang Y, Lee S, Choi S, Sung Y, Lee TH, Kim C. Digitalizing breeding in plants: A new trend of next-generation breeding based on genomic prediction. FRONTIERS IN PLANT SCIENCE 2023; 14:1092584. [PMID: 36743488 PMCID: PMC9892199 DOI: 10.3389/fpls.2023.1092584] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 11/08/2022] [Accepted: 01/05/2023] [Indexed: 06/18/2023]
Abstract
As the world's population grows and food needs diversify, the demand for cereals and horticultural crops with beneficial traits increases. In order to meet a variety of demands, suitable cultivars and innovative breeding methods need to be developed. Breeding methods have changed over time following advances in genetics. With the advent of new sequencing technology in the early 21st century, predictive breeding, such as genomic selection (GS), emerged when large-scale genomic information became available. GS shows good predictive ability for the selection of individuals with traits of interest, even for quantitative traits, by using various types of whole-genome-scanning markers, breaking away from the limitations of marker-assisted selection (MAS). In the current review, we briefly describe the history of breeding techniques, each breeding method, various statistical models applied to GS, and methods to increase GS efficiency. We then propose and define the term digital breeding. Digital breeding develops predictive breeding methods such as GS to a higher level, aiming to minimize human intervention by automating breeding design, the propagation of breeding populations, and selection in consideration of various environments, climates, and topography during the breeding process. We also classify the phases of digital breeding based on the technologies and methods applied to each phase. This review paper provides an understanding of, and a direction for, the final evolution of plant breeding in the future.
Affiliation(s)
- Donghyun Jeon
- Plant Computational Genomics Laboratory, Department of Science in Smart Agriculture Systems, Chungnam National University, Daejeon, Republic of Korea
- Yuna Kang
- Plant Computational Genomics Laboratory, Department of Crop Science, Chungnam National University, Daejeon, Republic of Korea
- Solji Lee
- Plant Computational Genomics Laboratory, Department of Crop Science, Chungnam National University, Daejeon, Republic of Korea
- Sehyun Choi
- Plant Computational Genomics Laboratory, Department of Crop Science, Chungnam National University, Daejeon, Republic of Korea
- Yeonjun Sung
- Plant Computational Genomics Laboratory, Department of Science in Smart Agriculture Systems, Chungnam National University, Daejeon, Republic of Korea
- Tae-Ho Lee
- Genomics Division, National Institute of Agricultural Sciences, Jeonju, Republic of Korea
- Changsoo Kim
- Plant Computational Genomics Laboratory, Department of Science in Smart Agriculture Systems, Chungnam National University, Daejeon, Republic of Korea
- Plant Computational Genomics Laboratory, Department of Crop Science, Chungnam National University, Daejeon, Republic of Korea
|
29
|
Nishio M, Inoue K, Arakawa A, Ichinoseki K, Kobayashi E, Okamura T, Fukuzawa Y, Ogawa S, Taniguchi M, Oe M, Takeda M, Kamata T, Konno M, Takagi M, Sekiya M, Matsuzawa T, Inoue Y, Watanabe A, Kobayashi H, Shibata E, Ohtani A, Yazaki R, Nakashima R, Ishii K. Application of linear and machine learning models to genomic prediction of fatty acid composition in Japanese Black cattle. Anim Sci J 2023; 94:e13883. [PMID: 37909231 DOI: 10.1111/asj.13883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 08/29/2023] [Accepted: 09/15/2023] [Indexed: 11/02/2023]
Abstract
We collected 3180 records of oleic acid (C18:1) and monounsaturated fatty acid (MUFA) measured using gas chromatography (GC) and 6960 records of C18:1 and MUFA measured using near-infrared spectroscopy (NIRS) in intermuscular fat samples of Japanese Black cattle. We compared genomic prediction performance for four linear models (genomic best linear unbiased prediction [GBLUP], kinship-adjusted multiple loci [KAML], BayesC, and BayesLASSO) and five machine learning models (Gaussian kernel [GK], deep kernel [DK], random forest [RF], extreme gradient boost [XGB], and convolutional neural network [CNN]). For GC-based C18:1 and MUFA, KAML showed the highest accuracies, followed by BayesC, XGB, DK, GK, and BayesLASSO, with a gain in accuracy of more than 6% for KAML over GBLUP. Meanwhile, DK had the highest prediction accuracy for NIRS-based C18:1 and MUFA, but the difference in accuracies between DK and KAML was slight. For all traits, accuracies of RF and CNN were lower than those of GBLUP. KAML extends GBLUP by weighting marker effects and involves only additive genetic effects, whereas machine learning methods capture non-additive genetic effects. Thus, KAML is the most suitable method for breeding of fatty acid composition in Japanese Black cattle.
Affiliation(s)
- Motohide Nishio
- Institute of Livestock and Grassland Science, NARO, Tsukuba, Japan
- Keiichi Inoue
- National Livestock Breeding Center, Fukushima, Japan
- University of Miyazaki, Miyazaki, Japan
- Aisaku Arakawa
- Institute of Livestock and Grassland Science, NARO, Tsukuba, Japan
- Eiji Kobayashi
- Institute of Livestock and Grassland Science, NARO, Tsukuba, Japan
- Yo Fukuzawa
- Institute of Livestock and Grassland Science, NARO, Tsukuba, Japan
- Shinichiro Ogawa
- Institute of Livestock and Grassland Science, NARO, Tsukuba, Japan
- Mika Oe
- Institute of Livestock and Grassland Science, NARO, Tsukuba, Japan
- Takehiro Kamata
- Aomori Prefectural Industrial Technology Research Center, Tsugaru, Japan
- Masaru Konno
- Iwate Agricultural Research Center Animal Industry Research Institute, Takizawa, Japan
- Michihiro Takagi
- Miyagi Prefecture Animal Industry Experiment Station, Osaki, Japan
- Mario Sekiya
- Akita Prefectural Livestock Experiment Station, Daisen, Japan
- Tamotsu Matsuzawa
- Livestock Research Centre, Fukushima Agricultural Technology Centre, Fukushima, Japan
- Yoshinobu Inoue
- Tottori Prefectural Livestock Research Center, Tottori, Japan
- Hiroshi Kobayashi
- Institute of Animal Production Okayama Prefectural Technology Center for Agriculture, Forestry and Fisheries, Misaki, Japan
- Eri Shibata
- Hiroshima Prefectural Technology Research Institute, Livestock Technology Research Center, Shobara, Japan
- Akihumi Ohtani
- Yamaguchi Prefectural Agriculture and Forestry General Technology Center, Mine, Japan
- Ryu Yazaki
- Oita Prefectural Agriculture, Forestry, and Fisheries Research Center, Takeda, Japan
- Ryotaro Nakashima
- Cattle Breeding Development Institute of Kagoshima Prefecture, Soo, Japan
- Kazuo Ishii
- Institute of Livestock and Grassland Science, NARO, Tsukuba, Japan
|
30
|
Angarita Barajas BK, Cantet RJC, Steibel JP, Schrauf MF, Forneris NS. Heritability estimates and predictive ability for pig meat quality traits using identity-by-state and identity-by-descent relationships in an F2 population. J Anim Breed Genet 2023; 140:13-27. [PMID: 36300585 DOI: 10.1111/jbg.12742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2021] [Accepted: 10/05/2022] [Indexed: 12/13/2022]
Abstract
Genomic relationships can be computed with dense genome-wide genotypes through different methods, either based on identity-by-state (IBS) or identity-by-descent (IBD). The latter has been shown to increase the accuracy of both estimated relationships and predicted breeding values. However, it is not clear whether an IBD approach would achieve greater heritability ($h^2$) and predictive ability ($\hat{r}_{y,\hat{y}}$) than its IBS counterpart for data with low-depth pedigrees. Here, we compare both approaches in terms of the estimates of $h^2$ and $\hat{r}_{y,\hat{y}}$, using data on meat quality and carcass traits recorded in experimental crossbred pigs, with a pedigree constrained to only three generations. Three animal models were fitted, which differed in the relationship matrix: an IBS model ($\mathbf{G}_{IBS}$), an IBD (defined within the known pedigree) model ($\mathbf{G}_{IBD}$), and a pedigree model ($\mathbf{A}_{22}$). In 9 of 20 traits, the estimates of $\sigma_u^2$ and $h^2$ were 1.2-2.9 times greater with the $\mathbf{G}_{IBS}$ and $\mathbf{G}_{IBD}$ models than with $\mathbf{A}_{22}$, whereas for all traits both parameters were similar between the genomic models. The $\hat{r}_{y,\hat{y}}$ of the genomic models was higher than that of $\mathbf{A}_{22}$. A slight increase in $\hat{r}_{y,\hat{y}}$ was found with $\mathbf{G}_{IBS}$ when compared to $\mathbf{G}_{IBD}$, most likely due to the former recovering sizeable relationships among founder F0 animals.
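For readers unfamiliar with how an IBS-type genomic relationship matrix is built from genotypes, the sketch below computes one widely used estimator (VanRaden's first method) with NumPy. The abstract does not state which IBS estimator the authors used, so this choice, the function name `vanraden_grm`, and the toy genotype matrix are assumptions for illustration only.

```python
import numpy as np

def vanraden_grm(M):
    """IBS-type genomic relationship matrix from an (animals x markers) matrix coded 0/1/2."""
    M = np.asarray(M, dtype=float)
    # Allele frequency of the counted allele at each marker, estimated from the sample.
    p = M.mean(axis=0) / 2.0
    # Center each marker by twice its allele frequency.
    W = M - 2.0 * p
    # Scale by the expected total marker variance, 2 * sum(p * (1 - p)).
    denom = 2.0 * np.sum(p * (1.0 - p))
    return W @ W.T / denom

# Toy genotypes (assumption): 4 animals, 3 markers.
M = np.array([
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [1, 2, 1],
])
G = vanraden_grm(M)
print(np.round(G, 3))
```

Because allele frequencies are estimated from the same animals, each row of the resulting matrix sums to zero; an IBD-based matrix would instead trace marker alleles through the recorded pedigree, which this sketch does not attempt.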
Affiliation(s)
- Rodolfo J C Cantet
- Instituto de Investigaciones en Producción Animal (INPA-CONICET-UBA), Buenos Aires, Argentina; Departamento de Producción Animal, Facultad de Agronomía, Universidad de Buenos Aires, Buenos Aires, Argentina
- Juan P Steibel
- Department of Animal Science, Michigan State University, East Lansing, Michigan, USA; Department of Fisheries and Wildlife, Michigan State University, East Lansing, Michigan, USA
- Matias F Schrauf
- Departamento de Métodos Cuantitativos y Sistemas de Información, Facultad de Agronomía, Universidad de Buenos Aires, Buenos Aires, Argentina; Animal Breeding & Genomics, Wageningen Livestock Research, Wageningen University & Research, Wageningen, The Netherlands
- Natalia S Forneris
- Instituto de Investigaciones en Producción Animal (INPA-CONICET-UBA), Buenos Aires, Argentina; Departamento de Producción Animal, Facultad de Agronomía, Universidad de Buenos Aires, Buenos Aires, Argentina
31
Morales L, Ametz C, Dallinger HG, Löschenberger F, Neumayer A, Zimmerl S, Buerstmayr H. Comparison of linear and semi-parametric models incorporating genomic, pedigree, and associated loci information for the prediction of resistance to stripe rust in an Austrian winter wheat breeding program. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2023; 136:23. [PMID: 36692839 PMCID: PMC9873752 DOI: 10.1007/s00122-023-04249-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2022] [Accepted: 11/11/2022] [Indexed: 06/17/2023]
Abstract
We used a historical dataset on stripe rust resistance across 11 years in an Austrian winter wheat breeding program to evaluate genomic and pedigree-based linear and semi-parametric prediction methods. Stripe rust (yellow rust) is an economically important foliar disease of wheat (Triticum aestivum L.) caused by the fungus Puccinia striiformis f. sp. tritici. Resistance to stripe rust is controlled by both qualitative (R-genes) and quantitative (small- to medium-effect quantitative trait loci, QTL) mechanisms. Genomic and pedigree-based prediction methods can accelerate selection for quantitative traits such as stripe rust resistance. Here we tested linear and semi-parametric models incorporating genomic, pedigree, and QTL information for cross-validated, forward, and pairwise prediction of adult plant resistance to stripe rust across 11 years (2008-2018) in an Austrian winter wheat breeding program. Semi-parametric genomic modeling had the greatest predictive ability and genetic variance overall, but differences between models were small. Including QTL as covariates improved predictive ability in some years where highly significant QTL had been detected via genome-wide association analysis. Predictive ability was moderate within years (cross-validated) but poor in cross-year frameworks.
Affiliation(s)
- Laura Morales
- Institute of Biotechnology in Plant Production, Department of Agrobiotechnology, University of Natural Resources and Life Sciences Vienna, Tulln, Austria
- Hermann Gregor Dallinger
- Institute of Biotechnology in Plant Production, Department of Agrobiotechnology, University of Natural Resources and Life Sciences Vienna, Tulln, Austria
- Saatzucht Donau GmbH and CoKG, Probstdorf, Austria
- Simone Zimmerl
- Institute of Biotechnology in Plant Production, Department of Agrobiotechnology, University of Natural Resources and Life Sciences Vienna, Tulln, Austria
- Hermann Buerstmayr
- Institute of Biotechnology in Plant Production, Department of Agrobiotechnology, University of Natural Resources and Life Sciences Vienna, Tulln, Austria
32
Cuevas J, Reslow F, Crossa J, Ortiz R. Modeling genotype × environment interaction for single and multitrait genomic prediction in potato (Solanum tuberosum L.). G3 (BETHESDA, MD.) 2022; 13:6883526. [PMID: 36477309 PMCID: PMC9911059 DOI: 10.1093/g3journal/jkac322] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/09/2022] [Revised: 11/01/2022] [Accepted: 11/28/2022] [Indexed: 12/13/2022]
Abstract
In this study, we extend research on genomic prediction (GP) to polysomic polyploid plant species with the main objective to investigate single-trait (ST) and multitrait (MT) multienvironment (ME) models using field trial data from 3 locations in Sweden [Helgegården (HEL), Mosslunda (MOS), Umeå (UM)] over 2 years (2020, 2021) of 253 potato cultivars and breeding clones for 5 tuber weight traits and 2 tuber flesh quality characteristics. This research investigated the GP of 4 genome-based prediction models with genotype × environment interactions (GEs): (1) ST reaction norm model (M1), (2) ST model considering covariances between environments (M2), (3) ST M2 extended to include a random vector that utilizes the environmental covariances (M3), and (4) MT model with GE (M4). Several prediction problems were analyzed to assess the GP accuracy of each of the 4 models. Results of the prediction of traits in HEL, the high yield potential testing site in 2021, show that the best-predicted traits were tuber flesh starch (%), weight of tuber above 60 or below 40 mm in size, and the total tuber weight. In terms of GP accuracy, model M4 gave the best prediction accuracy in 3 traits, namely tuber weight of 40-50 or above 60 mm in size, and total tuber weight, and very similar accuracy in the starch trait. For MOS in 2021, the best-predicted traits were starch, weight of tubers above 60, 50-60, or below 40 mm in size, and the total tuber weight. MT model M4 was the best GP model based on its accuracy when some cultivars are observed in some traits. For the GP accuracy of traits in UM in 2021, the best-predicted traits were the weight of tubers above 60, 50-60, or below 40 mm in size, and the best model was MT M4, followed by models ST M3 and M2.
Affiliation(s)
- Jaime Cuevas
- Departamento de Energía, Universidad Autónoma del Estado de Quintana Roo, Chetumal, Quintana Roo 77019, México
- Fredrik Reslow
- Department of Plant Breeding, Swedish University of Agricultural Sciences (SLU), P.O. Box 190, Lomma SE 23436, Sweden
- Jose Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km. 45, El Batán, Texcoco 56237, Edo. de Mexico, Mexico; Colegio de Postgraduados, Montecillos, Edo. de México 56230, México
- Rodomiro Ortiz
- Corresponding author: Sveriges Lantbruksuniversitet, Inst. för Växtförädling, Box 190, SE 23 422 Lomma, Sweden.
33
Montesinos-López OA, Carter AH, Bernal-Sandoval DA, Cano-Paez B, Montesinos-López A, Crossa J. A Comparison between Three Tuning Strategies for Gaussian Kernels in the Context of Univariate Genomic Prediction. Genes (Basel) 2022; 13:genes13122282. [PMID: 36553547 PMCID: PMC9778581 DOI: 10.3390/genes13122282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Revised: 11/15/2022] [Accepted: 11/29/2022] [Indexed: 12/07/2022] Open
Abstract
Genomic prediction is revolutionizing plant breeding since candidate genotypes can be selected without the need to measure their trait in the field. When a reference population contains both phenotypic and genotypic information, it is used to train a statistical machine learning method that subsequently predicts the breeding or phenotypic values of candidate genotypes that were only genotyped. Nevertheless, the successful implementation of the genomic selection (GS) methodology depends on many factors. One key factor is the type of statistical machine learning method used, since some are unable to capture nonlinear patterns available in the data. While kernel methods are powerful statistical machine learning algorithms that capture complex nonlinear patterns in the data, their successful implementation strongly depends on careful tuning of the hyperparameters involved. As such, in this paper we compare three methods of tuning (manual tuning, grid search, and Bayesian optimization) for the Gaussian kernel under a Bayesian best linear unbiased predictor model. We used six real datasets of wheat (Triticum aestivum L.) to compare the three strategies of tuning. We found that careful tuning is very important for obtaining the major benefits of Gaussian kernels. The best prediction performance was observed when the tuning process was performed with grid search and Bayesian optimization. However, we did not observe relevant differences between the grid search and Bayesian optimization approaches. The observed gains in terms of prediction performance were between 2.1% and 27.8% across the six datasets under study.
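As a rough illustration of the grid-search strategy mentioned in this abstract, the following minimal sketch tunes the bandwidth of a Gaussian kernel by k-fold cross-validation. It is not the authors' implementation: it swaps the Bayesian BLUP model for plain kernel ridge regression, and the function names, the median-distance scaling of the kernel, the fixed ridge penalty `lam`, the bandwidth grid, and the simulated marker data are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(X, h):
    """K[i, j] = exp(-h * d_ij^2 / median(d^2)); d_ij is the Euclidean distance."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(d2, 0.0)
    # Scaling by the median off-diagonal distance is an assumed convention.
    scale = np.median(d2[np.triu_indices_from(d2, k=1)])
    return np.exp(-h * d2 / scale)

def cv_mse(K, y, lam, folds):
    """Mean squared prediction error of kernel ridge regression over the folds."""
    idx = np.arange(len(y))
    errors = []
    for test in folds:
        train = np.setdiff1d(idx, test)
        mu = y[train].mean()
        A = K[np.ix_(train, train)] + lam * np.eye(len(train))
        alpha = np.linalg.solve(A, y[train] - mu)
        pred = mu + K[np.ix_(test, train)] @ alpha
        errors.append(np.mean((y[test] - pred) ** 2))
    return float(np.mean(errors))

def grid_search_bandwidth(X, y, grid, lam=1.0, n_folds=5, seed=0):
    """Return the bandwidth with the lowest cross-validated MSE and all scores."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    scores = {h: cv_mse(gaussian_kernel(X, h), y, lam, folds) for h in grid}
    return min(scores, key=scores.get), scores

# Simulated stand-in data (hypothetical): 80 lines, 200 markers coded 0/1/2.
rng = np.random.default_rng(1)
X = rng.integers(0, 3, size=(80, 200)).astype(float)
y = np.sin(X[:, :5].sum(axis=1)) + rng.normal(scale=0.3, size=80)
grid = [0.01, 0.1, 0.5, 1.0, 2.0, 5.0]
best_h, scores = grid_search_bandwidth(X, y, grid)
print(best_h, scores[best_h])
```

Manual tuning would correspond to fixing a single value of `h` in advance, and Bayesian optimization would replace the exhaustive loop over `grid` with a model-guided search; neither is shown here.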
Affiliation(s)
- Arron H. Carter
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA 99164, USA
- Bernabe Cano-Paez
- Facultad de Ciencias, Universidad Nacional Autónoma de México (UNAM), Mexico City 04510, Mexico
- Abelardo Montesinos-López
- Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara 44430, Mexico
- Correspondence: (A.M.-L.); (J.C.)
- José Crossa
- International Maize and Wheat Improvement Center (CIMMYT), El Batan, Texcoco 56237, Mexico
- Hidrociencias, Colegio de Postgraduados, Campus Montecillos, Carretera México-Texcoco Km. 36.5, Montecillo 56230, Mexico
- Correspondence: (A.M.-L.); (J.C.)
34
Kismiantini, Montesinos-López A, Cano-Páez B, Montesinos-López JC, Chavira-Flores M, Montesinos-López OA, Crossa J. A Multi-Trait Gaussian Kernel Genomic Prediction Model under Three Tunning Strategies. Genes (Basel) 2022; 13:2279. [PMID: 36553548 PMCID: PMC9778253 DOI: 10.3390/genes13122279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2022] [Revised: 11/27/2022] [Accepted: 12/01/2022] [Indexed: 12/12/2022] Open
Abstract
While genomic selection (GS) began revolutionizing plant breeding when it was proposed around 20 years ago, its practical implementation is still challenging as many factors affect its accuracy. One such factor is the choice of the statistical machine learning method. For this reason, we explore the tuning process under a multi-trait framework using the Gaussian kernel with a multi-trait Bayesian Best Linear Unbiased Predictor (GBLUP) model. We explored three methods of tuning (manual, grid search and Bayesian optimization) using 5 real datasets from breeding programs. We found that grid search and Bayesian optimization improve prediction accuracy by between 1.9 and 6.8% relative to manual tuning. While the improvement in prediction accuracy can be marginal in some cases, it is very important to carry out the tuning process carefully to improve the accuracy of the GS methodology, even though this entails greater computational resources.
Affiliation(s)
- Kismiantini
- Statistics Study Program, Universitas Negeri Yogyakarta, Yogyakarta 55281, Indonesia
- Abelardo Montesinos-López
- Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara 44430, Jalisco, Mexico
- Bernabe Cano-Páez
- Facultad de Ciencias, Universidad Nacional Autónoma de México (UNAM), México City 04510, Mexico
- Moisés Chavira-Flores
- Instituto de Investigaciones en Matemáticas Aplicadas y Sistemas (IIMAS), Universidad Nacional Autónoma de México (UNAM), México City 04510, Mexico
- José Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico, Veracruz 52640, Edo. de México, Mexico
- Colegio de Postgraduados, Montecillos 56230, Edo. de México, Mexico
35
Seyum EG, Bille NH, Abtew WG, Munyengwa N, Bell JM, Cros D. Genomic selection in tropical perennial crops and plantation trees: a review. MOLECULAR BREEDING : NEW STRATEGIES IN PLANT IMPROVEMENT 2022; 42:58. [PMID: 37313015 PMCID: PMC10248687 DOI: 10.1007/s11032-022-01326-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2022] [Accepted: 09/06/2022] [Indexed: 06/15/2023]
Abstract
To overcome the multiple challenges currently faced by agriculture, such as climate change and soil deterioration, more efficient plant breeding strategies are required. Genomic selection (GS) is crucial for the genetic improvement of quantitative traits, as it can increase selection intensity, shorten the generation interval, and improve selection accuracy for traits that are difficult to phenotype. Tropical perennial crops and plantation trees are of major economic importance and have consequently been the subject of many GS articles. In this review, we discuss the factors that affect GS accuracy (statistical models, linkage disequilibrium, information concerning markers, relatedness between training and target populations, the size of the training population, and trait heritability) and the genetic gain expected in these species. The impact of GS will be particularly strong in tropical perennial crops and plantation trees as they have long breeding cycles and constrained selection intensity. Future GS prospects are also discussed. High-throughput phenotyping will allow constructing of large training populations and implementing of phenomic selection. Optimized modeling is needed for longitudinal traits and multi-environment trials. The use of multi-omics, haploblocks, and structural variants will enable going beyond single-locus genotype data. Innovative statistical approaches, like artificial neural networks, are expected to efficiently handle the increasing amounts of heterogeneous multi-scale data. Targeted recombinations on sites identified from profiles of marker effects have the potential to further increase genetic gain. GS can also aid re-domestication and introgression breeding. Finally, GS consortia will play an important role in making the best of these opportunities. Supplementary Information The online version contains supplementary material available at 10.1007/s11032-022-01326-4.
Affiliation(s)
- Essubalew Getachew Seyum
- Department of Plant Biology and Physiology, Faculty of Sciences, University of Yaoundé I, Yaoundé, Cameroon
- Department of Horticulture and Plant Sciences, College of Agriculture and Veterinary Medicine, Jimma University, P.O. Box 307, Jimma, Ethiopia
- Ngalle Hermine Bille
- Department of Plant Biology and Physiology, Faculty of Sciences, University of Yaoundé I, Yaoundé, Cameroon
- Wosene Gebreselassie Abtew
- Department of Horticulture and Plant Sciences, College of Agriculture and Veterinary Medicine, Jimma University, P.O. Box 307, Jimma, Ethiopia
- Norman Munyengwa
- Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane, QLD 4072 Australia
- Joseph Martin Bell
- Department of Plant Biology and Physiology, Faculty of Sciences, University of Yaoundé I, Yaoundé, Cameroon
- David Cros
- CIRAD, UMR AGAP Institut, 34398 Montpellier, France
- UMR AGAP Institut, CIRAD, INRAE, Univ. Montpellier, Institut Agro, 34398 Montpellier, France
36
Kim GW, Hong JP, Lee HY, Kwon JK, Kim DA, Kang BC. Genomic selection with fixed-effect markers improves the prediction accuracy for Capsaicinoid contents in Capsicum annuum. HORTICULTURE RESEARCH 2022; 9:uhac204. [PMID: 36467271 PMCID: PMC9714256 DOI: 10.1093/hr/uhac204] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/04/2022] [Accepted: 08/05/2022] [Indexed: 06/17/2023]
Abstract
Capsaicinoids provide chili peppers (Capsicum spp.) with their characteristic pungency. Several structural and transcription factor genes are known to control capsaicinoid contents in pepper. However, many other genes also regulating capsaicinoid contents remain unknown, making it difficult to develop pepper cultivars with different levels of capsaicinoids. Genomic selection (GS) uses genome-wide random markers (including many in undiscovered genes) for a trait to improve selection efficiency. In this study, we predicted the capsaicinoid contents of pepper breeding lines using several GS models trained with genotypic and phenotypic data from a training population. We used a core collection of 351 Capsicum accessions and 96 breeding lines as training and testing populations, respectively. To obtain the optimal number of single nucleotide polymorphism (SNP) markers for GS, we tested various numbers of genome-wide SNP markers based on linkage disequilibrium. We obtained the highest mean prediction accuracy (0.550) for different models using 3294 SNP markers. Using this marker set, we conducted GWAS and selected 25 markers that were associated with capsaicinoid biosynthesis genes and quantitative trait loci for capsaicinoid contents. Finally, to develop more accurate prediction models, we obtained SNP markers from GWAS as fixed-effect markers for GS, where 3294 genome-wide SNPs were employed. When four to five fixed-effect markers from GWAS were used as fixed effects, the RKHS and RR-BLUP models showed accuracies of 0.696 and 0.689, respectively. Our results lay the foundation for developing pepper cultivars with various capsaicinoid levels using GS for capsaicinoid contents.
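The idea of treating a few GWAS-selected markers as fixed effects inside a ridge-type (RR-BLUP-like) model can be sketched as a penalized least-squares problem in which only the genome-wide marker effects are shrunk. The sketch below is an assumption-laden illustration, not the study's pipeline: the function name, the choice to drop the fixed-effect markers from the random set, the fixed penalty `lam` (which a mixed-model fit would normally derive from estimated variance components), and the simulated genotypes are all hypothetical.

```python
import numpy as np

def ridge_with_fixed_effects(X, Z, y, lam):
    """Solve min ||y - X b - Z u||^2 + lam * ||u||^2.

    X holds unpenalized columns (intercept plus fixed-effect markers);
    Z holds the genome-wide markers whose effects u are shrunk.
    """
    p, m = X.shape[1], Z.shape[1]
    W = np.hstack([X, Z])
    penalty = np.zeros((p + m, p + m))
    penalty[p:, p:] = lam * np.eye(m)  # penalize only the random marker effects
    sol = np.linalg.solve(W.T @ W + penalty, W.T @ y)
    return sol[:p], sol[p:]

# Simulated stand-in data (hypothetical): 60 lines, 150 SNPs coded 0/1/2.
rng = np.random.default_rng(0)
n, m = 60, 150
markers = rng.integers(0, 3, size=(n, m)).astype(float)
fixed_idx = [0, 1, 2, 3]  # pretend these four SNPs came out of a GWAS
X = np.column_stack([np.ones(n), markers[:, fixed_idx]])
Z = np.delete(markers, fixed_idx, axis=1)
Z = Z - Z.mean(axis=0)  # center the remaining markers
y = (2.0 + markers[:, 0] - 0.8 * markers[:, 1]
     + Z @ rng.normal(scale=0.05, size=Z.shape[1])
     + rng.normal(scale=0.5, size=n))

lam = 10.0
b, u = ridge_with_fixed_effects(X, Z, y, lam)
resid = y - X @ b - Z @ u
print(b.round(2))
```

Because the fixed-effect columns carry no penalty, large-effect loci are estimated without shrinkage while the remaining markers still capture the polygenic background.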
Affiliation(s)
- Geon Woo Kim
- Department of Agriculture, Forestry and Bioresources, Research Institute of Agriculture and Life Sciences, Plant Genomics Breeding Institute, College of Agriculture and Life Sciences, Seoul National University, Seoul 08826, Republic of Korea
- Ju-Pyo Hong
- Department of Agriculture, Forestry and Bioresources, Research Institute of Agriculture and Life Sciences, Plant Genomics Breeding Institute, College of Agriculture and Life Sciences, Seoul National University, Seoul 08826, Republic of Korea
- Hea-Young Lee
- Department of Agriculture, Forestry and Bioresources, Research Institute of Agriculture and Life Sciences, Plant Genomics Breeding Institute, College of Agriculture and Life Sciences, Seoul National University, Seoul 08826, Republic of Korea
- Jin-Kyung Kwon
- Department of Agriculture, Forestry and Bioresources, Research Institute of Agriculture and Life Sciences, Plant Genomics Breeding Institute, College of Agriculture and Life Sciences, Seoul National University, Seoul 08826, Republic of Korea
- Dong-Am Kim
- R&D Center, Hana Seed Co., Ltd., Anseong 17601, Republic of Korea
37
A joint learning approach for genomic prediction in polyploid grasses. Sci Rep 2022; 12:12499. [PMID: 35864135 PMCID: PMC9304331 DOI: 10.1038/s41598-022-16417-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/20/2022] [Accepted: 07/11/2022] [Indexed: 12/20/2022] Open
Abstract
Poaceae, among the most abundant plant families, includes many economically important polyploid species, such as forage grasses and sugarcane (Saccharum spp.). These species have elevated genomic complexities and limited genetic resources, hindering the application of marker-assisted selection strategies. Currently, the most promising approach for increasing genetic gains in plant breeding is genomic selection. However, due to the polyploid nature of these species, more accurate models for incorporating genomic selection into breeding schemes are needed. This study aims to develop a machine learning method by using a joint learning approach to predict complex traits from genotypic data. Biparental populations of sugarcane and two species of forage grasses (Urochloa decumbens, Megathyrsus maximus) were genotyped, and several quantitative traits were measured. High-quality markers were used to predict several traits in different cross-validation scenarios. By combining classification and regression strategies, we developed a predictive system with promising results. Compared with traditional genomic prediction methods, the proposed strategy achieved accuracy improvements exceeding 50%. Our results suggest that the developed methodology could be implemented in breeding programs, helping reduce breeding cycles and increase genetic gains.
38
Genome-Enabled Prediction Methods Based on Machine Learning. METHODS IN MOLECULAR BIOLOGY (CLIFTON, N.J.) 2022; 2467:189-218. [PMID: 35451777 DOI: 10.1007/978-1-0716-2205-6_7] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 10/18/2022]
Abstract
Growth of artificial intelligence and machine learning (ML) methodology has been explosive in recent years. In this class of procedures, computers get knowledge from sets of experiences and provide forecasts or classification. In genome-wide based prediction (GWP), many ML studies have been carried out. This chapter provides a description of main semiparametric and nonparametric algorithms used in GWP in animals and plants. Thirty-four ML comparative studies conducted in the last decade were used to develop a meta-analysis through a Thurstonian model, to evaluate algorithms with the best predictive qualities. It was found that some kernel, Bayesian, and ensemble methods displayed greater robustness and predictive ability. However, the type of study and data distribution must be considered in order to choose the most appropriate model for a given problem.
39
Ogawa S, Matsuda H, Taniguchi Y, Watanabe T, Sugimoto Y, Iwaisaki H. Estimation of the autosomal contribution to total additive genetic variability of carcass traits in Japanese Black cattle. Anim Sci J 2022; 93:e13710. [PMID: 35416392 DOI: 10.1111/asj.13710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2022] [Revised: 02/18/2022] [Accepted: 03/18/2022] [Indexed: 11/29/2022]
Abstract
We attempted to estimate the additive genetic variance explained by each autosome, using genotype data of 33,657 single nucleotide polymorphism (SNP) markers in 2271 Japanese Black fattened steers. Traits were cold carcass weight, ribeye area, rib thickness, subcutaneous fat thickness, estimated yield percentage, and marbling score. Two mixed linear models were used: model 1 incorporated a genomic relationship matrix (G matrix) constructed using all available SNPs, and model 2 incorporated two G matrices, one constructed using the SNPs on one autosome and the other using those on the remaining autosomes. Genomic heritabilities estimated using model 1 were moderate to high. The sums of the proportions of the additive genetic variance explained by each autosome to the total genetic variance estimated using model 2 were >90%. For carcass weight, the proportions explained by Bos taurus autosomes 6, 8, and 14 were higher than those explained by the remaining autosomes. In some cases, the estimated proportion was close to 0. The results obtained from model 2 could provide a novel insight into the genetic architecture, such as heritability per chromosome, of carcass traits in Japanese Black cattle, although further careful investigation would be required.
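The matrix construction behind the two models can be illustrated with a short sketch that builds one G matrix from all SNPs (model 1) and a pair of G matrices from one autosome and from the rest (model 2). The abstract does not say how G was computed, so the VanRaden-style formula used here is an assumption, as are the function name, the simulated genotypes, and the chromosome labels; the variance-component estimation itself (for example by REML) is not shown.

```python
import numpy as np

def vanraden_g(M):
    """G = Z Z' / (2 * sum(p * (1 - p))) for genotypes M coded 0/1/2.

    Returns G together with the scaling constant so that matrices built
    from different SNP subsets can be compared.
    """
    p = M.mean(axis=0) / 2.0          # allele frequency per SNP
    Z = M - 2.0 * p                   # center each SNP column
    denom = 2.0 * np.sum(p * (1.0 - p))
    return Z @ Z.T / denom, denom

# Simulated stand-in data (hypothetical): 50 animals, 300 SNPs on 5 autosomes.
rng = np.random.default_rng(0)
n_animals, n_snps = 50, 300
freqs = rng.uniform(0.1, 0.9, size=n_snps)
M = rng.binomial(2, freqs, size=(n_animals, n_snps)).astype(float)
chrom = rng.integers(1, 6, size=n_snps)  # autosome label of each SNP

# Model 1: a single G matrix from all SNPs.
G_all, c_all = vanraden_g(M)

# Model 2: one G matrix for the target autosome, one for all the others.
target = 1
G_one, c_one = vanraden_g(M[:, chrom == target])
G_rest, c_rest = vanraden_g(M[:, chrom != target])
print(G_all.shape, G_one.shape, G_rest.shape)
```

Under this construction the unscaled cross-products add up, so `c_all * G_all` equals `c_one * G_one + c_rest * G_rest`; fitting both matrices as separate random effects is what lets the share of additive variance attributable to the target autosome be estimated.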
Affiliation(s)
- Yukio Taniguchi
- Graduate School of Agriculture, Kyoto University, Kyoto, Japan
- Yoshikazu Sugimoto
- Shirakawa Institute of Animal Genetics, Japan Livestock Technology Association, Tokyo, Japan
40
Montesinos-López OA, Montesinos-López JC, Montesinos-López A, Ramírez-Alcaraz JM, Poland J, Singh R, Dreisigacker S, Crespo L, Mondal S, Govidan V, Juliana P, Espino JH, Shrestha S, Varshney RK, Crossa J. Bayesian multitrait kernel methods improve multienvironment genome-based prediction. G3 (BETHESDA, MD.) 2022; 12:6446035. [PMID: 34849802 PMCID: PMC9210316 DOI: 10.1093/g3journal/jkab406] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/03/2021] [Accepted: 11/18/2021] [Indexed: 11/14/2022]
Abstract
When multitrait data are available, the preferred models are those that are able to account for correlations between phenotypic traits because when the degree of correlation is moderate or large, this increases the genomic prediction accuracy. For this reason, in this article, we explore Bayesian multitrait kernel methods for genomic prediction and we illustrate the power of these models with three real datasets. The kernels under study were the linear, Gaussian, polynomial, and sigmoid kernels; they were compared with the conventional Ridge regression and GBLUP multitrait models. The results show that, in general, the Gaussian kernel method outperformed conventional Bayesian Ridge and GBLUP multitrait linear models by 2.2–17.45% (datasets 1–3) in terms of prediction performance based on the mean square error of prediction. This improvement in terms of prediction performance of the Bayesian multitrait kernel method can be attributed to the fact that the proposed model is able to capture nonlinear patterns more efficiently than linear multitrait models. However, not all kernels perform well in the datasets used for evaluation, which is why more than one kernel should be evaluated to be able to choose the best kernel.
Affiliation(s)
- Abelardo Montesinos-López
- Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Guadalajara 44430, Mexico
- Corresponding author: Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara, Jalisco 44430, Mexico. (A.M.-L.); International Maize and Wheat Improvement Center (CIMMYT). Km 45 Carretera Mexico-Veracruz, CP 52640, Texcoco, Edo de Mexico, Mexico. (J.C.)
- Jesse Poland
- Department of Agronomy, Kansas State University, 2004 Throckmorton Plant Science Center, Manhattan, KS 66506, USA
- Ravi Singh
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico-Veracruz, CP 52640, Texcoco, Edo. de Mexico, Mexico
- Susanne Dreisigacker
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico-Veracruz, CP 52640, Texcoco, Edo. de Mexico, Mexico
- Leonardo Crespo
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico-Veracruz, CP 52640, Texcoco, Edo. de Mexico, Mexico
- Sushismita Mondal
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico-Veracruz, CP 52640, Texcoco, Edo. de Mexico, Mexico
- Velu Govidan
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico-Veracruz, CP 52640, Texcoco, Edo. de Mexico, Mexico
- Philomin Juliana
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico-Veracruz, CP 52640, Texcoco, Edo. de Mexico, Mexico
- Julio Huerta Espino
- Campo Experimental Valle de Mexico, Instituto Nacional de Investigaciones Forestales, Agricolas y Pecuarias (INIFAP), Universidad Autónoma de Chapingo, Texcoco 56235, Mexico
- Sandesh Shrestha
- Department of Agronomy, Kansas State University, 2004 Throckmorton Plant Science Center, Manhattan, KS 66506, USA
- Rajeev K Varshney
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad 502324, India
- State Agricultural Biotechnology Centre, Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch 6150, Australia
- José Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico-Veracruz, CP 52640, Texcoco, Edo. de Mexico, Mexico
- Colegio de Postgraduados, Montecillos, Edo. de México 56230, Mexico
- Corresponding author: Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara, Jalisco 44430, Mexico. (A.M.-L.); International Maize and Wheat Improvement Center (CIMMYT). Km 45 Carretera Mexico-Veracruz, CP 52640, Texcoco, Edo de Mexico, Mexico. (J.C.)
41
Bonnett D, Li Y, Crossa J, Dreisigacker S, Basnet B, Pérez-Rodríguez P, Alvarado G, Jannink JL, Poland J, Sorrells M. Response to Early Generation Genomic Selection for Yield in Wheat. FRONTIERS IN PLANT SCIENCE 2022; 12:718611. [PMID: 35087542 PMCID: PMC8787636 DOI: 10.3389/fpls.2021.718611] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 06/01/2021] [Accepted: 10/22/2021] [Indexed: 06/14/2023]
Abstract
We investigated increasing genetic gain for grain yield using early generation genomic selection (GS). A training set of 1,334 elite wheat breeding lines tested over three field seasons was used to generate Genomic Estimated Breeding Values (GEBVs) for grain yield under irrigated conditions applying markers and three different prediction methods: (1) Genomic Best Linear Unbiased Predictor (GBLUP), (2) GBLUP with the imputation of missing genotypic data by Ridge Regression BLUP (rrGBLUP_imp), and (3) Reproducing Kernel Hilbert Space (RKHS) a.k.a. Gaussian Kernel (GK). F2 GEBVs were generated for 1,924 individuals from 38 biparental cross populations between 21 parents selected from the training set. Results showed that F2 GEBVs from the different methods were not correlated. Experiment 1 consisted of selecting F2s with the highest average GEBVs and advancing them to form genomically selected bulks and make intercross populations aiming to combine favorable alleles for yield. F4:6 lines were derived from genomically selected bulks, intercrosses, and conventional breeding methods with similar numbers from each. Results of field-testing for Experiment 1 did not find any difference in yield with genomic compared to conventional selection. Experiment 2 compared the predictive ability of the different GEBV calculation methods in F2 using a set of single plant-derived F2:4 lines from randomly selected F2 plants. Grain yield results from Experiment 2 showed a significant positive correlation between observed yields of F2:4 lines and predicted yield GEBVs of F2 single plants from GK (the predictive ability of 0.248, P < 0.001) and GBLUP (0.195, P < 0.01) but no correlation with rrGBLUP_imp. Results demonstrate the potential for the application of GS in early generations of wheat breeding and the importance of using the appropriate statistical model for GEBV calculation, which may not be the same as the best model for inbreds.
Affiliation(s)
- David Bonnett
- International Maize and Wheat Improvement Center, Texcoco, Mexico
- BASF Wheat Breeding, Sabin, MN, United States
- Yongle Li
- School of Agriculture, Food and Wine, Faculty of Sciences, The University of Adelaide, Adelaide, SA, Australia
- Jose Crossa
- International Maize and Wheat Improvement Center, Texcoco, Mexico
- Colegio de Postgraduados, Texcoco, Mexico
- Bhoja Basnet
- International Maize and Wheat Improvement Center, Texcoco, Mexico
- G. Alvarado
- International Maize and Wheat Improvement Center, Texcoco, Mexico
- J. L. Jannink
- USDA-ARS, Robert W. Holley Center for Agriculture and Health, Ithaca, NY, United States
- Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY, United States
- Jesse Poland
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
- Mark Sorrells
- Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY, United States
42
|
Hamazaki K, Iwata H. Bayesian optimization of multivariate genomic prediction models based on secondary traits for improved accuracy gains and phenotyping costs. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2022; 135:35-50. [PMID: 34609531 DOI: 10.1007/s00122-021-03949-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2020] [Accepted: 09/14/2021] [Indexed: 06/13/2023]
Abstract
We propose a novel approach to the Bayesian optimization of multivariate genomic prediction models based on secondary traits to improve accuracy gains and phenotyping costs via efficient Pareto frontier estimation. Multivariate genomic prediction based on secondary traits, such as data from various omics technologies including high-throughput phenotyping (e.g., unmanned aerial vehicle-based remote sensing), has attracted much attention because it offers improved accuracy gains compared with genomic prediction based only on marker genotypes. Although there is a trade-off between accuracy gains and phenotyping costs of secondary traits, no attempt has been made to optimize these trade-offs. In this study, we propose a novel approach to optimize multivariate genomic prediction models for secondary traits measurable at early growth stages for improved accuracy gains and phenotyping costs. The proposed approach employs Bayesian optimization for efficient Pareto frontier estimation, representing the maximum accuracy at a given cost. The proposed approach successfully estimated the optimal secondary trait combinations across a range of costs while providing genomic predictions for only about [Formula: see text] of all possible combinations. The simulation results reflecting the characteristics of each scenario of the simulated target traits showed that the obtained optimal combinations were reasonable. Analysis of real-time target trait data showed that the proposed multivariate genomic prediction model had significantly superior accuracy compared to the univariate genomic prediction model.
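The Pareto frontier described in this abstract is the set of secondary-trait combinations for which no combination of equal or lower cost is more accurate. A minimal Python sketch of that filtering step follows, assuming the accuracy of every candidate is already known (the paper instead estimates the frontier with Bayesian optimization, which is not shown here). The trait names, costs, and accuracies are hypothetical illustration values, not results from the study.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    traits: tuple      # hypothetical secondary-trait combination
    cost: float        # phenotyping cost, arbitrary units (assumed)
    accuracy: float    # prediction accuracy for the target trait (assumed)


def pareto_frontier(candidates):
    """Keep candidates that no cheaper-or-equal candidate beats on accuracy."""
    frontier = []
    best_accuracy = float("-inf")
    # Walk from cheapest to most expensive; at equal cost, most accurate first.
    for cand in sorted(candidates, key=lambda c: (c.cost, -c.accuracy)):
        if cand.accuracy > best_accuracy:
            frontier.append(cand)
            best_accuracy = cand.accuracy
    return frontier


# Hypothetical candidates: the empty tuple is the marker-only model.
candidates = [
    Candidate((), 0.0, 0.40),
    Candidate(("ndvi_early",), 1.0, 0.48),
    Candidate(("height_early",), 1.5, 0.45),
    Candidate(("ndvi_early", "height_early"), 2.5, 0.55),
    Candidate(("canopy_temp",), 3.0, 0.50),
]
frontier = pareto_frontier(candidates)
for cand in frontier:
    print(cand.cost, cand.accuracy, cand.traits)
```

In this toy input, the combinations costing 1.5 and 3.0 are dropped because a cheaper combination is at least as accurate.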
Affiliation(s)
- Kosuke Hamazaki
- Department of Agricultural and Environmental Biology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo, 113-8657, Japan
- JSPS Research Fellow, Tokyo, Japan
- Hiroyoshi Iwata
- Department of Agricultural and Environmental Biology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo, 113-8657, Japan.
43
Howard R, Jarquin D, Crossa J. Overview of Genomic Prediction Methods and the Associated Assumptions on the Variance of Marker Effect, and on the Architecture of the Target Trait. Methods Mol Biol 2022; 2467:139-156. [PMID: 35451775 DOI: 10.1007/978-1-0716-2205-6_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Genomic selection (GS) is a methodology that revolutionized the process of breeding improved genetic materials in plant and animal breeding programs. It uses predicted genomic values of the potential of untested/unobserved genotypes as surrogates of phenotypes during the selection process. The predicted genomic values are obtained using exclusively the marker profiles of the untested genotypes, and breeders can potentially use them to screen the genotypes to be advanced in the breeding pipeline, to identify potential parents for the next improvement cycles, or to find optimal crosses for targeting genotypes, among others. Conceptually, GS initially requires a set of genotypes with both molecular marker information and phenotypic data for model calibration, and then the performance of untested genotypes is predicted using their marker profiles only. Hence, it is expected that breeders would look at these values in order to conduct selections. Even though the concept of GS seems trivial, the high-dimensional nature of the data delivered by modern sequencing technologies, where the number of molecular markers (p) far exceeds the number of data points available for model fitting (n; p ≫ n), meant that a completely renovated set of prediction models was needed to cope with this challenge. In this chapter, we provide a conceptual framework for comparing statistical models to overcome the "large p, small n problem." Given the very large diversity of GS models, only the most popular are presented here; we focus mainly on linear regression-based models and nonparametric models that predict the genomic estimated breeding values (GEBV) in a single environment considering a single trait only, mainly in the context of plant breeding.
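One standard way to handle p ≫ n is ridge regression, whose marker-effect estimate can be written as β = X'(XX' + λI)⁻¹y, so only an n × n system has to be solved. The sketch below illustrates that identity in plain Python; it is not the chapter's own code. The marker coding (-1, 0, 1), the simulated effects, the absence of an intercept, and the penalty `lam` are all assumptions made for the example.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ridge_marker_effects(X, y, lam):
    """Ridge estimates via the dual form: beta = X' (X X' + lam I)^-1 y."""
    n, p = len(X), len(X[0])
    # n x n system, cheap when n is much smaller than p.
    K = [[sum(X[i][k] * X[j][k] for k in range(p)) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, y)
    return [sum(X[i][k] * alpha[i] for i in range(n)) for k in range(p)]


def predict(X_new, beta):
    """Genomic predictions from marker profiles only."""
    return [sum(x * b for x, b in zip(row, beta)) for row in X_new]


# Simulated toy data: 20 phenotyped lines, 200 markers (p >> n).
random.seed(1)
n, p, lam = 20, 200, 10.0
X = [[random.choice([-1.0, 0.0, 1.0]) for _ in range(p)] for _ in range(n)]
true_beta = [random.gauss(0.0, 0.1) for _ in range(p)]
y = [sum(x * b for x, b in zip(row, true_beta)) + random.gauss(0.0, 0.5)
     for row in X]

beta_hat = ridge_marker_effects(X, y, lam)
gebv = predict(X, beta_hat)
```

The same `beta_hat` satisfies the usual p × p normal equations (X'X + λI)β = X'y, which is why the dual form is preferred when markers far outnumber lines.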
Affiliation(s)
- Réka Howard
- University of Nebraska-Lincoln, Lincoln, NE, USA.
- José Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
44
Momen M, Kranis A, Rosa GJM, Muir P, Gianola D. Predictive assessment of single-step BLUP with linear and non-linear similarity RKHS kernels: A case study in chickens. J Anim Breed Genet 2021; 139:247-258. [PMID: 34931377 DOI: 10.1111/jbg.12665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2021] [Revised: 08/27/2021] [Accepted: 12/05/2021] [Indexed: 11/26/2022]
Abstract
Single-step GBLUP (ssGBLUP) to obtain genomic prediction was proposed in 2009. Many studies have investigated ssGBLUP in genomic selection in animals and plants using a standard linear kernel (similarity matrix) called genomic relationship matrix (G). More general kernels should allow capturing non-additive effects as well, whereas GBLUP is based on additive gene action. In this study, we generalized ssBLUP to accommodate two non-linear kernels, the averaged Gaussian kernel (AK) and the recently developed arc-cosine deep kernel (DK). We evaluated the methodology using body weight (BW) and hen-housing production (HHP) traits, recorded on a sample of phenotyped and genotyped commercial broiler chickens. There were, thus, different ssGBLUP models corresponding to G, AK and DK. We used random replication of training (TRN) and testing (TST) layouts at different genotyping rates (20%, 40%, 60% and 80% of all birds) in three selective genotyping scenarios. The selections were genotyping the youngest individuals in the pedigree (YS), random genotyping (RS) and genotyping based on parent average (PA). Predictive abilities were measured using rank correlations between the observed and the predicted phenotypic values in TST for each random partition. Prediction accuracy was influenced by the type of kernel when a large proportion of birds was genotyped. An advantage of non-linear kernels (AK and DK) was more apparent when 60 and 80% of birds had been genotyped. For BW, the lowest rank correlations were obtained with G (0.093 ± 0.015 using RS with 20% genotyped individuals) and the highest values with DK (0.320 ± 0.016 in the PA setting with 80% genotyped individuals). For HHP, the lowest and highest rank correlations were obtained by AK with 20% and 80% genotyped individuals, 0.071 ± 0.016 (in RS) and 0.23 ± 0.016 (in PA) respectively. Our results indicated that AK and DK are more effective than G when a large proportion of the target population is genotyped. Our expectation is that ssGBLUP with AK or DK models can perform even better than G when non-additive genetic effects influence the underlying variability of complex traits.
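To make the contrast between a linear kernel and a Gaussian kernel concrete, the following Python sketch builds both from a small marker matrix. The abstract does not say how the averaged Gaussian kernel is constructed; averaging Gaussian kernels over a grid of bandwidths is one plausible reading and is an assumption here, as are the bandwidth values, the scaling of distances by their mean, and the toy marker codes. The single-step blending with pedigree information and the arc-cosine deep kernel are not shown.

```python
import math


def linear_kernel(X):
    """Linear similarity X X' / p (resembles G when marker codes are centered)."""
    p = len(X[0])
    return [[sum(a * b for a, b in zip(xi, xj)) / p for xj in X] for xi in X]


def gaussian_kernel(X, h):
    """K_ij = exp(-h * d_ij^2 / mean(d^2)), d_ij = Euclidean marker distance."""
    n = len(X)
    d2 = [[sum((a - b) ** 2 for a, b in zip(xi, xj)) for xj in X] for xi in X]
    off_diag = [d2[i][j] for i in range(n) for j in range(n) if i != j]
    scale = sum(off_diag) / len(off_diag)  # assumed scaling choice
    return [[math.exp(-h * d / scale) for d in row] for row in d2]


def averaged_kernel(X, bandwidths):
    """Element-wise mean of Gaussian kernels over several bandwidths (assumed)."""
    kernels = [gaussian_kernel(X, h) for h in bandwidths]
    n = len(X)
    return [[sum(K[i][j] for K in kernels) / len(kernels) for j in range(n)]
            for i in range(n)]


# Toy marker matrix: 3 individuals, 4 markers coded -1/0/1 (hypothetical).
X = [
    [-1.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, -1.0],
    [1.0, -1.0, 0.0, 0.0],
]
G = linear_kernel(X)
AK = averaged_kernel(X, bandwidths=[0.25, 1.0, 4.0])
```

Unlike `G`, the entries of `AK` depend non-linearly on marker differences, which is the property the abstract credits for capturing non-additive effects.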
Affiliation(s)
- Mehdi Momen
- Department of Surgical Sciences, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Andreas Kranis
- Roslin Institute, University of Edinburgh, Edinburgh, UK
- Guilherme J M Rosa
- Department of Animal and Dairy Sciences, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Peter Muir
- Department of Surgical Sciences, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Daniel Gianola
- Department of Animal and Dairy Sciences, University of Wisconsin-Madison, Madison, Wisconsin, USA
45
Baertschi C, Cao TV, Bartholomé J, Ospina Y, Quintero C, Frouin J, Bouvet JM, Grenier C. Impact of early genomic prediction for recurrent selection in an upland rice synthetic population. G3 (BETHESDA, MD.) 2021; 11:jkab320. [PMID: 34498036 PMCID: PMC8664429 DOI: 10.1093/g3journal/jkab320] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/02/2021] [Accepted: 08/16/2021] [Indexed: 11/14/2022]
Abstract
Population breeding through recurrent selection is based on the repetition of evaluation and recombination among best-selected individuals. In this type of breeding strategy, early evaluation of selection candidates combined with genomic prediction could substantially shorten the breeding cycle length, thus increasing the rate of genetic gain. The objective of this study was to optimize early genomic prediction in an upland rice (Oryza sativa L.) synthetic population improved through recurrent selection via shuttle breeding in two sites. To this end, we used genomic prediction on 334 S0 genotypes evaluated with early generation progeny testing (S0:2 and S0:3) across two sites. Four traits were measured (plant height, days to flowering, grain yield, and grain zinc concentration) and the predictive ability was assessed for the target site. For days to flowering and plant height, which correlate well among sites (0.51-0.62), an increase of up to 0.4 in predictive ability was observed when the model was trained using the two sites. For grain zinc concentration, adding the phenotype of the predicted lines in the nontarget site to the model improved the predictive ability (0.51 with two-site and 0.31 with single-site model), whereas for grain yield the gain was less (0.42 with two-site and 0.35 with single-site calibration). Through these results, we found a good opportunity to optimize the genomic recurrent selection scheme and maximize the use of resources by performing early progeny testing in two sites for traits with best expression and/or relevance in each specific environment.
Affiliation(s)
- Cédric Baertschi
- CIRAD, UMR AGAP Institut, F-34398 Montpellier, France
- UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, F-34398 Montpellier, France
- Tuong-Vi Cao
- CIRAD, UMR AGAP Institut, F-34398 Montpellier, France
- UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, F-34398 Montpellier, France
- Jérôme Bartholomé
- CIRAD, UMR AGAP Institut, F-34398 Montpellier, France
- UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, F-34398 Montpellier, France
- Rice Breeding Platform, International Rice Research Institute, Metro Manila, Philippines
- Yolima Ospina
- Alliance Bioversity-CIAT, Recta Palmira Cali, Colombia
- Julien Frouin
- CIRAD, UMR AGAP Institut, F-34398 Montpellier, France
- UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, F-34398 Montpellier, France
- Jean-Marc Bouvet
- CIRAD, UMR AGAP Institut, F-34398 Montpellier, France
- UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, F-34398 Montpellier, France
- CIRAD, Dispositif de Recherche et d’Enseignement en Partenariat “Forêts et Biodiversité à Madagascar”, Antananarivo, Madagascar
- Cécile Grenier
- CIRAD, UMR AGAP Institut, F-34398 Montpellier, France
- UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, F-34398 Montpellier, France
- Alliance Bioversity-CIAT, Recta Palmira Cali, Colombia
46
Wilson S, Malosetti M, Maliepaard C, Mulder HA, Visser RGF, van Eeuwijk F. Training Set Construction for Genomic Prediction in Auto-Tetraploids: An Example in Potato. FRONTIERS IN PLANT SCIENCE 2021; 12:771075. [PMID: 34899794 PMCID: PMC8651708 DOI: 10.3389/fpls.2021.771075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2021] [Accepted: 10/20/2021] [Indexed: 06/14/2023]
Abstract
Training set construction is an important prerequisite to Genomic Prediction (GP), and while this has been studied in diploids, polyploids have not received the same attention. Polyploidy is a common feature in many crop plants, for example banana and blueberry, but also potato, which is the third most important crop in the world in terms of food consumption, after rice and wheat. The aim of this study was to investigate the impact of different training set construction methods using a publicly available diversity panel of tetraploid potatoes. Four methods of training set construction were compared: simple random sampling, stratified random sampling, genetic distance sampling and sampling based on the coefficient of determination (CDmean). For stratified random sampling, population structure analyses were carried out in order to define sub-populations, but since sub-populations accounted for only 16.6% of genetic variation, there were negligible differences between stratified and simple random sampling. For genetic distance sampling, four genetic distance measures were compared and though they performed similarly, Euclidean distance was the most consistent. In the majority of cases the CDmean method was the best sampling method, and compared to simple random sampling gave improvements of 4-14% in cross-validation scenarios, and 2-8% in scenarios with an independent test set, while genetic distance sampling gave improvements of 5.5-10.5% and 0.4-4.5%. No interaction was found between sampling method and the statistical model for the traits analyzed.
Affiliation(s)
- Stefan Wilson
- Biometris, Wageningen University & Research, Wageningen, Netherlands
- Marcos Malosetti
- Biometris, Wageningen University & Research, Wageningen, Netherlands
- Chris Maliepaard
- Plant Breeding, Wageningen University & Research, Wageningen, Netherlands
- Han A. Mulder
- Wageningen University & Research, Animal Breeding and Genomics, Wageningen, Netherlands
- Fred van Eeuwijk
- Biometris, Wageningen University & Research, Wageningen, Netherlands
47
Rio S, Gallego-Sánchez L, Montilla-Bascón G, Canales FJ, Isidro Y Sánchez J, Prats E. Genomic prediction and training set optimization in a structured Mediterranean oat population. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2021; 134:3595-3609. [PMID: 34341832 DOI: 10.1007/s00122-021-03916-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/08/2021] [Accepted: 07/13/2021] [Indexed: 05/22/2023]
Abstract
The strong genetic structure observed in Mediterranean oats affects the predictive ability of genomic prediction as well as the performance of training set optimization methods. In this study, we investigated the efficiency of genomic prediction and training set optimization in a highly structured population of cultivars and landraces of cultivated oat (Avena sativa) from the Mediterranean basin, including white (subsp. sativa) and red (subsp. byzantina) oats, genotyped using genotyping-by-sequencing markers and evaluated for agronomic traits in Southern Spain. For most traits, the predictive abilities were moderate to high with little difference between models, except for biomass, for which Bayes-B showed a substantial gain compared to other models. The consistency between the structure of the training population and the population to be predicted was key to the predictive ability of genomic predictions. The predictive ability of inter-subspecies predictions was indeed much lower than that of intra-subspecies predictions for all traits. Regarding training set optimization, the linear mixed model optimization criteria (prediction error variance (PEVmean) and coefficient of determination (CDmean)) performed better than the heuristic approach "partitioning around medoids," even under high population structure. The superiority of CDmean and PEVmean could be explained by their ability to adapt the representation of each genetic group according to those represented in the population to be predicted. These results represent an important step towards the implementation of genomic prediction in oat breeding programs and address important issues faced by the genomic prediction community regarding population structure and training set optimization.
Affiliation(s)
- Simon Rio
- Centro de Biotecnologia y Genómica de Plantas (CBGP, UPM-INIA), Instituto Nacional de Investigación y Tecnologia Agraria y Alimentaria (INIA), Universidad Politécnica de Madrid (UPM), Campus de Montegancedo-UPM, 28223, Pozuelo de Alarcón, Madrid, Spain.
- Luis Gallego-Sánchez
- Institute for Sustainable Agriculture, Spanish Research Council (CSIC), Córdoba, Spain
- Francisco J Canales
- Institute for Sustainable Agriculture, Spanish Research Council (CSIC), Córdoba, Spain
- Julio Isidro Y Sánchez
- Centro de Biotecnologia y Genómica de Plantas (CBGP, UPM-INIA), Instituto Nacional de Investigación y Tecnologia Agraria y Alimentaria (INIA), Universidad Politécnica de Madrid (UPM), Campus de Montegancedo-UPM, 28223, Pozuelo de Alarcón, Madrid, Spain
- Elena Prats
- Institute for Sustainable Agriculture, Spanish Research Council (CSIC), Córdoba, Spain
48
Islam MS, McCord PH, Olatoye MO, Qin L, Sood S, Lipka AE, Todd JR. Experimental evaluation of genomic selection prediction for rust resistance in sugarcane. THE PLANT GENOME 2021; 14:e20148. [PMID: 34510803 DOI: 10.1002/tpg2.20148] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/20/2021] [Accepted: 07/22/2021] [Indexed: 06/13/2023]
Abstract
The total sugarcane (Saccharum L.) production has increased worldwide; however, the rate of growth is lower compared with other major crops, mainly due to a plateauing of genetic gain. Genomic selection (GS) has proven to substantially increase the rate of genetic gain in many crops. To investigate the utility of GS in future sugarcane breeding, a field trial of 432 sugarcane clones was conducted using an augmented design with two replications. Two major diseases in sugarcane, brown and orange rust (BR and OR), were screened artificially using the whorl inoculation method in the field over two crop cycles. The genotypic data were generated through target enrichment sequencing technologies. After filtering, a set of 8,825 single nucleotide polymorphism markers was used to assess the prediction accuracy of multiple GS models. Using fivefold cross-validation, we observed GS prediction accuracies for BR and OR that ranged from 0.28 to 0.43 and 0.13 to 0.29, respectively, across two crop cycles and combined cycles. The prediction ability further improved by including a known major gene for resistance to BR as a fixed effect in the GS model. It also substantially reduced the minimum number of markers and training population size required for GS. The nonparametric GS models outperformed the parametric GS models, suggesting that nonadditive genetic effects could contribute to the genomic sources underlying BR and OR. This study demonstrated that GS could potentially predict the genomic estimated breeding value for selecting the desired germplasm for sugarcane breeding for disease resistance.
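The fivefold cross-validation mentioned above can be sketched as follows in Python. This is an illustration under stated assumptions, not the study's pipeline: accuracy is taken to be the Pearson correlation between predicted and observed values in each held-out fold (the abstract does not define it), the data are simulated, and `marker_covariance_model` is a hypothetical toy stand-in for a real GS model.

```python
import random


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def k_fold_accuracy(X, y, fit_predict, k=5, seed=0):
    """Per-fold correlation between predictions and held-out observations."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint test sets
    accuracies = []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        preds = fit_predict([X[i] for i in train], [y[i] for i in train],
                            [X[i] for i in test])
        accuracies.append(pearson(preds, [y[i] for i in test]))
    return accuracies


def marker_covariance_model(X_train, y_train, X_test):
    """Toy model (assumed): weight each marker by its covariance with y."""
    n, p = len(y_train), len(X_train[0])
    mean_y = sum(y_train) / n
    effects = [sum(X_train[i][j] * (y_train[i] - mean_y) for i in range(n)) / n
               for j in range(p)]
    return [sum(x * e for x, e in zip(row, effects)) for row in X_test]


# Simulated data: 100 clones, 30 markers, first 5 markers affect the trait.
rng = random.Random(42)
X = [[rng.choice([-1.0, 0.0, 1.0]) for _ in range(30)] for _ in range(100)]
y = [sum(row[:5]) + rng.gauss(0.0, 0.5) for row in X]

accuracies = k_fold_accuracy(X, y, marker_covariance_model, k=5)
mean_accuracy = sum(accuracies) / len(accuracies)
print(accuracies, mean_accuracy)
```

Any model with the same `fit_predict(X_train, y_train, X_test)` shape can be swapped in, which is how parametric and nonparametric models would be compared on identical folds.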
Affiliation(s)
- Md S Islam
- Sugarcane Production Research Unit, USDA-ARS, Canal Point, FL, USA
- Per H McCord
- Sugarcane Production Research Unit, USDA-ARS, Canal Point, FL, USA
- Current address: Irrigated Agriculture Research and Extension Center, WA State Univ., Prosser, WA, USA
- Marcus O Olatoye
- Dep. of Crop Sciences, Univ. of Illinois, Urbana-Champaign, IL, USA
- Lifang Qin
- Sugarcane Production Research Unit, USDA-ARS, Canal Point, FL, USA
- Current address: Guangxi Univ., Nanning, Guangxi, China
- Sushma Sood
- Sugarcane Production Research Unit, USDA-ARS, Canal Point, FL, USA
49
Cruz M, Arbelaez JD, Loaiza K, Cuasquer J, Rosas J, Graterol E. Genetic and phenotypic characterization of rice grain quality traits to define research strategies for improving rice milling, appearance, and cooking qualities in Latin America and the Caribbean. THE PLANT GENOME 2021; 14:e20134. [PMID: 34510797 DOI: 10.1002/tpg2.20134] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/24/2021] [Accepted: 06/23/2021] [Indexed: 06/13/2023]
Abstract
Rice (Oryza sativa L.) grain quality is a set of complex interrelated traits that include grain milling, appearance, cooking, and edible properties. As consumer preferences in Latin America and the Caribbean evolve, determining what traits best capture regional grain quality preferences is fundamental for breeding and cultivar release. In this study, a genome-wide association study (GWAS), marker-assisted selection (MAS), and genomic selection (GS) were evaluated to help guide the development of new breeding strategies for rice grain quality improvement. For this purpose, 284 rice lines representing over 20 yr of breeding in Latin America and the Caribbean were genotyped and phenotyped for 10 different traits including grain milling, appearance, cooking, and edible quality traits. Genetic correlations among the 10 traits ranged from -0.83 to 0.85. A GWAS identified 19 significant marker/trait combinations associated with eight grain quality traits. Four functional markers, three located in the Waxy and one in the starch synthase IIa genes, were significantly associated with six grain-quality traits. These markers individually explained 51-75% of the phenotypic variance depending on the trait, clearly indicating their potential utility for MAS. Cross-validation studies to evaluate predictive abilities of four different GS models for each of the 10 quality traits were conducted and predictive abilities ranged from 0.3 to 0.72. Overall, the machine learning model random forest had the highest predictive abilities and was especially effective for traits where large effect quantitative trait loci were identified. This study provides the foundation for deploying effective molecular breeding strategies for grain quality in Latin American rice breeding programs.
Affiliation(s)
- Maribel Cruz
- FLAR (Fondo Latinoamericano para Arroz de Riego), CIAT (International Center for Tropical Agriculture), Kilómetro 17 c, CP, Cali, Valle del Cauca, 763537, Colombia
- Juan David Arbelaez
- Dep. of Crop Sciences, Univ. of Illinois, Urbana-Champaign, Turner Hall N-211|1102 S. Goodwin Ave. | 046, Urbana, IL, 61801, USA
- Katherine Loaiza
- FLAR (Fondo Latinoamericano para Arroz de Riego), CIAT (International Center for Tropical Agriculture), Kilómetro 17 c, CP, Cali, Valle del Cauca, 763537, Colombia
- Juan Cuasquer
- CIAT (International Center for Tropical Agriculture), Kilómetro 17 Recta Cali, Palmira, CP, Cali, Valle del Cauca, 763537, Colombia
- Juan Rosas
- INIA (Instituto Nacional de Investigación Agropecuaria), Ruta 8 Km. 281/33000, Treinta y Tres, Uruguay
- Eduardo Graterol
- FLAR (Fondo Latinoamericano para Arroz de Riego), CIAT (International Center for Tropical Agriculture), Kilómetro 17 c, CP, Cali, Valle del Cauca, 763537, Colombia
50
Tomar V, Dhillon GS, Singh D, Singh RP, Poland J, Chaudhary AA, Bhati PK, Joshi AK, Kumar U. Evaluations of Genomic Prediction and Identification of New Loci for Resistance to Stripe Rust Disease in Wheat (Triticum aestivum L.). Front Genet 2021; 12:710485. [PMID: 34650592 PMCID: PMC8505882 DOI: 10.3389/fgene.2021.710485] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/16/2021] [Accepted: 08/24/2021] [Indexed: 01/08/2023]
Abstract
Stripe rust is one of the most destructive diseases of wheat (Triticum aestivum L.), caused by Puccinia striiformis f. sp. tritici (Pst), and responsible for significant yield losses worldwide. Single-nucleotide polymorphism (SNP) diagnostic markers were used to identify new sources of resistance at adult plant stage to wheat stripe rust (YR) in 141 CIMMYT advanced bread wheat lines over 3 years in replicated trials at Borlaug Institute for South Asia (BISA), Ludhiana. We performed a genome-wide association study and genomic prediction to aid the genetic gain by accumulating disease resistance alleles. The responses to YR in 141 advanced wheat breeding lines at adult plant stage were used to generate G × E (genotype × environment)-dependent rust scores for prediction and genome-wide association study (GWAS), eliminating variation due to climate and disease pressure changes. The lowest mean prediction accuracies were 0.59 for genomic best linear unbiased prediction (GBLUP) and ridge-regression BLUP (RRBLUP), while the highest mean was 0.63 for extended GBLUP (EGBLUP) and random forest (RF), using 14,563 SNPs and the G × E rust score results. RF and EGBLUP predicted higher accuracies (∼3%) than did GBLUP and RRBLUP. Promising genomic prediction demonstrates the viability and efficacy of improving quantitative rust tolerance. The resistance to YR in these lines was attributed to eight quantitative trait loci (QTLs) using the FarmCPU algorithm. Four (Q.Yr.bisa-2A.1, Q.Yr.bisa-2D, Q.Yr.bisa-5B.2, and Q.Yr.bisa-7A) of eight QTLs linked to the diagnostic markers were mapped at unique loci (previously unidentified for Pst resistance) and possibly new loci. The statistical evidence of effectiveness and distribution of the new diagnostic markers for the resistance loci would help to develop new stripe rust resistance sources. These diagnostic markers along with previously established markers would be used to create novel DNA biosensor-based microarrays for rapid detection of the resistance loci on large panels upon functional validation of the candidate genes identified in the present study to aid in rapid genetic gain in the future breeding programs.
Affiliation(s)
- Vipin Tomar
- Borlaug Institute for South Asia, Ludhiana, India; International Maize and Wheat Improvement Center, New Delhi, India; Global Wheat Program, International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
- Guriqbal Singh Dhillon
- Department of Biotechnology, Thapar Institute of Engineering and Technology, Patiala, India
- Daljit Singh
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
- Ravi Prakash Singh
- Global Wheat Program, International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
- Jesse Poland
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
- Anis Ahmad Chaudhary
- Department of Biology, College of Science, Imam Mohammad Ibn Saud Islamic University, Riyadh, Saudi Arabia
- Arun Kumar Joshi
- Borlaug Institute for South Asia, Ludhiana, India; International Maize and Wheat Improvement Center, New Delhi, India; Global Wheat Program, International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
- Uttam Kumar
- Borlaug Institute for South Asia, Ludhiana, India; International Maize and Wheat Improvement Center, New Delhi, India; Global Wheat Program, International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico