1
Ahadi P, Balasundaram B, Borrero JS, Chen C. Development and optimization of expected cross value for mate selection problems. Heredity (Edinb) 2024; 133:113-125. [PMID: 38956397 PMCID: PMC11286873 DOI: 10.1038/s41437-024-00697-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Revised: 06/11/2024] [Accepted: 06/12/2024] [Indexed: 07/04/2024]
Abstract
In this study, we address the mate selection problem in the hybridization stage of a breeding pipeline, which constitutes the multi-objective breeding goal key to the performance of a variety development program. The solution framework we formulate seeks to ensure that individuals with the most desirable genomic characteristics are selected to cross in order to maximize the likelihood of the inheritance of desirable genetic materials to the progeny. Unlike approaches that use phenotypic values for parental selection and evaluate individuals separately, we use a criterion that relies on the genetic architecture of traits and evaluates combinations of genomic information of the pairs of individuals. We introduce the expected cross value (ECV) criterion that measures the expected number of desirable alleles for gametes produced by pairs of individuals sampled from a population of potential parents. We use the ECV criterion to develop an integer linear programming formulation for the parental selection problem. The formulation is capable of controlling the inbreeding level between selected mates. We evaluate the approach on two applications: (i) improving multiple target traits simultaneously, and (ii) finding a multi-parental solution to design crossing blocks. We evaluate the performance of the ECV criterion using a simulation study. Finally, we discuss how the ECV criterion and the proposed integer linear programming techniques can be applied to improve breeding efficiency while maintaining genetic diversity in a breeding program.
Affiliation(s)
- Pouya Ahadi
  - H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA, USA
- Juan S Borrero
  - School of Industrial Engineering and Management, Oklahoma State University, Stillwater, OK, USA
- Charles Chen
  - Department of Biochemistry and Molecular Biology, Oklahoma State University, Stillwater, OK, USA
2
Niehoff TAM, Ten Napel J, Bijma P, Pook T, Wientjes YCJ, Hegedűs B, Calus MPL. Improving selection decisions with mating information by accounting for Mendelian sampling variances looking two generations ahead. Genet Sel Evol 2024; 56:41. [PMID: 38773363 PMCID: PMC11107025 DOI: 10.1186/s12711-024-00899-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Accepted: 04/03/2024] [Indexed: 05/23/2024]
Abstract
BACKGROUND Breeding programs are judged by the genetic level of animals that are used to disseminate genetic progress. These animals are typically the best ones of the population. To maximise the genetic level of very good animals in the next generation, parents that are more likely to produce top performing offspring need to be selected. The ability of individuals to produce high-performing progeny differs because of differences in their breeding values and gametic variances. Differences in gametic variances among individuals are caused by differences in heterozygosity and linkage. The use of the gametic Mendelian sampling variance has been proposed before, for use in the usefulness criterion or Index5, and in this work, we extend existing approaches by not only considering the gametic Mendelian sampling variance of individuals, but also of their potential offspring. Thus, the criteria developed in this study plan one additional generation ahead. For simplicity, we assumed that the true quantitative trait loci (QTL) effects, genetic map and the haplotypes of all animals are known.
RESULTS In this study, we propose a new selection criterion, ExpBVSelGrOff, which describes the genetic level of selected grand-offspring that are produced by selected offspring of a particular mating. We compare our criterion with other published criteria in a stochastic simulation of an ongoing breeding program for 21 generations for proof of concept. ExpBVSelGrOff performed better than all other tested criteria, like the usefulness criterion or Index5 which have been proposed in the literature, without compromising short-term gains. After only five generations, when selection is strong (1%), selection based on ExpBVSelGrOff achieved 5.8% more commercial genetic gain and retained 25% more genetic variance without compromising inbreeding rate compared to selection based only on breeding values.
CONCLUSIONS Our proposed selection criterion offers a new tool to accelerate genetic progress for contemporary genomic breeding programs. It retains more genetic variance than previously published criteria that plan less far ahead. Considering future gametic Mendelian sampling variances in the selection process also seems promising for maintaining more genetic variance.
Grants
- This study was financially supported by the Dutch Ministry of Economic Affairs (TKI Agri & Food Project LWV20054) and the Breed4Food partners Cobb Europe (Colchester, Essex, United Kingdom), CRV (Arnhem, the Netherlands), Hendrix Genetics (Boxmeer, the Net
Affiliation(s)
- Tobias A M Niehoff
  - Animal Breeding and Genomics, Wageningen University and Research, Droevendaalsesteeg 1, 6700AH, Wageningen, The Netherlands
- Jan Ten Napel
  - Animal Breeding and Genomics, Wageningen University and Research, Droevendaalsesteeg 1, 6700AH, Wageningen, The Netherlands
- Piter Bijma
  - Animal Breeding and Genomics, Wageningen University and Research, Droevendaalsesteeg 1, 6700AH, Wageningen, The Netherlands
- Torsten Pook
  - Animal Breeding and Genomics, Wageningen University and Research, Droevendaalsesteeg 1, 6700AH, Wageningen, The Netherlands
- Yvonne C J Wientjes
  - Animal Breeding and Genomics, Wageningen University and Research, Droevendaalsesteeg 1, 6700AH, Wageningen, The Netherlands
- Bernadett Hegedűs
  - Animal Breeding and Genomics, Wageningen University and Research, Droevendaalsesteeg 1, 6700AH, Wageningen, The Netherlands
- Mario P L Calus
  - Animal Breeding and Genomics, Wageningen University and Research, Droevendaalsesteeg 1, 6700AH, Wageningen, The Netherlands
3
Hamazaki K, Iwata H. AI-assisted selection of mating pairs through simulation-based optimized progeny allocation strategies in plant breeding. Frontiers in Plant Science 2024; 15:1361894. [PMID: 38817943 PMCID: PMC11138345 DOI: 10.3389/fpls.2024.1361894] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Accepted: 03/06/2024] [Indexed: 06/01/2024]
Abstract
Emerging technologies such as genomic selection have been applied to modern plant and animal breeding to increase the speed and efficiency of variety release. However, breeding requires decisions regarding parent selection and mating pairs, which significantly impact the ultimate genetic gain of a breeding scheme. The selection of appropriate parents and mating pairs to increase genetic gain while maintaining genetic diversity is still an urgent need that breeders are facing. This study aimed to determine the best progeny allocation strategies by combining future-oriented simulations and numerical black-box optimization for an improved selection of parents and mating pairs. In this study, we focused on optimizing the allocation of progenies, and the breeding process was regarded as a black-box function whose input is a set of parameters related to the progeny allocation strategies and whose output is the ultimate genetic gain of breeding schemes. The allocation of progenies to each mating pair was parameterized according to a softmax function, whose input is a weighted sum of multiple features for the allocation, including expected genetic variance of progenies and selection criteria such as different types of breeding values, to balance genetic gains and genetic diversity optimally. The weighting parameters were then optimized by the black-box optimization algorithm called StoSOO via future-oriented breeding simulations. Simulation studies to evaluate the potential of our novel method revealed that the breeding strategy based on optimized weights attained almost 10% higher genetic gain than that with an equal allocation of progenies to all mating pairs within just four generations. Among the optimized strategies, those considering the expected genetic variance of progenies could maintain the genetic diversity throughout the breeding process, leading to a higher ultimate genetic gain than those without considering it. 
These results suggest that our novel method can significantly improve the speed and efficiency of variety development through optimized decisions regarding the selection of parents and mating pairs. In addition, by changing simulation settings, our future-oriented optimization framework for progeny allocation strategies can be easily implemented into general breeding schemes, contributing to accelerated plant and animal breeding with high efficiency.
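As a rough illustration of the allocation rule described in this abstract, the sketch below passes a weighted sum of per-pair features through a softmax to obtain progeny shares. The function name `allocate_progeny`, the example features, and the largest-remainder rounding step are assumptions made for illustration and are not taken from the paper; the weights are fixed by hand here rather than tuned by StoSOO.

```python
import math


def allocate_progeny(features, weights, total_progeny):
    """Split total_progeny across mating pairs with a softmax over weighted features.

    features: one feature vector per mating pair (hypothetical values, e.g.
              a breeding-value criterion and an expected progeny variance).
    weights:  one weight per feature (in the paper these would be optimized).
    Returns (shares, counts): softmax shares and integer progeny counts.
    """
    # Weighted sum of features for each mating pair.
    scores = [sum(w * f for w, f in zip(weights, feats)) for feats in features]

    # Numerically stable softmax: subtract the maximum score before exponentiating.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    shares = [e / norm for e in exps]

    # Largest-remainder rounding so the integer counts sum exactly to total_progeny.
    raw = [s * total_progeny for s in shares]
    counts = [math.floor(r) for r in raw]
    leftover = total_progeny - sum(counts)
    by_remainder = sorted(range(len(raw)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return shares, counts


if __name__ == "__main__":
    pair_features = [[1.0, 0.2], [0.5, 0.9], [0.1, 0.1]]  # illustrative only
    shares, counts = allocate_progeny(pair_features, [1.0, 0.5], 100)
    print(shares, counts)
```

Setting all weights to zero recovers an equal allocation across pairs, which is the baseline the abstract compares against.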
Affiliation(s)
- Hiroyoshi Iwata
  - Laboratory of Biometry and Bioinformatics, Department of Agricultural and Environmental Biology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan
4
Hassanpour A, Geibel J, Simianer H, Pook T. Optimization of breeding program design through stochastic simulation with kernel regression. G3 (Bethesda, Md.) 2023; 13:jkad217. [PMID: 37742059 PMCID: PMC10700053 DOI: 10.1093/g3journal/jkad217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2023] [Revised: 07/29/2023] [Accepted: 09/02/2023] [Indexed: 09/25/2023]
Abstract
In recent years, breeding programs have increased significantly in size and complexity, with various highly interdependent parameters and many contrasting breeding goals. As a result, resource allocation in these programs has become more complex, and deriving an optimal breeding strategy has become increasingly challenging. To address this, a common practice is to reduce the optimization problem to a set of scenarios that differ only in a few parameters and can therefore be analyzed in detail. The goal of this article is to provide a framework for the numerical optimization of breeding programs that goes beyond the simple comparison of scenarios. For this, we first determine the space of potential breeding programs only limited by basic constraints like the budget and housing capacities. Subsequently, the goal is to identify the optimal breeding program by finding the parametrization that maximizes the target function by combining different breeding goals. To assess the value of the target function for a parametrization, we propose using stochastic simulations and the subsequent use of a kernel regression method to cope with the stochasticity of simulation outcomes. This procedure is performed iteratively to narrow down the most promising areas of the search space and perform more and more simulations in these areas of interest. In a simplified example applied to a dairy cattle program, our proposed framework has shown its ability to identify an optimal breeding strategy that aligns with a target function aiming at genetic gain and genetic diversity conservation limited by budget constraints.
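To make the smoothing step concrete, the sketch below averages noisy simulation outcomes with a Nadaraya-Watson estimator and a Gaussian kernel. The abstract does not say which kernel regression variant is used, so the estimator, the kernel, the one-dimensional parameter, and the name `kernel_regression` are all assumptions for illustration.

```python
import math


def kernel_regression(x_query, xs, ys, bandwidth):
    """Nadaraya-Watson estimate of the target function at x_query.

    xs: parameter values at which simulations were run (hypothetical, 1-D here).
    ys: noisy simulated values of the target function at those points.
    bandwidth: width of the Gaussian kernel; larger values smooth more.
    """
    # Gaussian weights: simulations near x_query count more than distant ones.
    weights = [math.exp(-0.5 * ((x_query - x) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: the query is too far from every simulated point.
        raise ValueError("query point is too far from all simulated points")
    return sum(w * y for w, y in zip(weights, ys)) / total


if __name__ == "__main__":
    simulated_x = [0.0, 1.0, 2.0, 3.0]
    simulated_y = [1.0, 2.1, 1.9, 0.8]  # illustrative noisy outcomes
    print(kernel_regression(1.5, simulated_x, simulated_y, bandwidth=0.5))
```

In an iterative search, an estimate like this could be evaluated on candidate parametrizations to decide where to run the next batch of simulations.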
Affiliation(s)
- Azadeh Hassanpour
  - Department of Animal Sciences, Center for Integrated Breeding Research, Animal Breeding and Genetics Group, University of Goettingen, 37075 Goettingen, Germany
- Johannes Geibel
  - Department of Animal Sciences, Center for Integrated Breeding Research, Animal Breeding and Genetics Group, University of Goettingen, 37075 Goettingen, Germany
  - Institute of Farm Animal Genetics, Friedrich-Loeffler-Institut, 31535 Neustadt, Germany
- Henner Simianer
  - Department of Animal Sciences, Center for Integrated Breeding Research, Animal Breeding and Genetics Group, University of Goettingen, 37075 Goettingen, Germany
- Torsten Pook
  - Department of Animal Sciences, Center for Integrated Breeding Research, Animal Breeding and Genetics Group, University of Goettingen, 37075 Goettingen, Germany
  - Wageningen University & Research, Animal Breeding and Genomics, 6700 AH Wageningen, Netherlands
5
Kusmec A, Attigala L, Dai X, Srinivasan S, Yeh CTE, Schnable PS. A genetic tradeoff for tolerance to moderate and severe heat stress in US hybrid maize. PLoS Genet 2023; 19:e1010799. [PMID: 37410701 DOI: 10.1371/journal.pgen.1010799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2022] [Accepted: 05/26/2023] [Indexed: 07/08/2023]
Abstract
Global climate change is increasing both average temperatures and the frequencies of extreme high temperatures. Past studies have documented a strong negative effect of exposures to temperatures >30°C on hybrid maize yields. However, these studies could not disentangle genetic adaptation via artificial selection from changes in agronomic practices. Because most of the earliest maize hybrids are no longer available, side-by-side comparisons with modern hybrids under current field conditions are generally impossible. Here, we report on the collection and curation of 81 years of public yield trial records covering 4,730 maize hybrids, which enabled us to model genetic variation for temperature responses among maize hybrids. We show that selection may have indirectly and inconsistently contributed to the genetic adaptation of maize to moderate heat stress over this time period while preserving genetic variance for continued adaptation. However, our results reveal the existence of a genetic tradeoff for tolerance to moderate and severe heat stress, leading to a decrease in tolerance to severe heat stress over the same time period. Both trends are particularly conspicuous since the mid-1970s. Such a tradeoff poses challenges to the continued adaptation of maize to warming climates due to a projected increase in the frequency of extreme heat events. Nevertheless, given recent advances in phenomics, enviromics, and physiological modeling, our results offer a degree of optimism for the capacity of plant breeders to adapt maize to warming climates, assuming appropriate levels of R&D investment.
Affiliation(s)
- Aaron Kusmec
  - Department of Agronomy, Iowa State University; Ames, Iowa, United States of America
- Lakshmi Attigala
  - Department of Agronomy, Iowa State University; Ames, Iowa, United States of America
- Xiongtao Dai
  - Department of Statistics, Iowa State University; Ames, Iowa, United States of America
- Srikant Srinivasan
  - Plant Sciences Institute, Iowa State University; Ames, Iowa, United States of America
- Cheng-Ting Eddy Yeh
  - Plant Sciences Institute, Iowa State University; Ames, Iowa, United States of America
- Patrick S Schnable
  - Department of Agronomy, Iowa State University; Ames, Iowa, United States of America
  - Plant Sciences Institute, Iowa State University; Ames, Iowa, United States of America
6
Zhang Z, Wang L. A simulation framework for reciprocal recurrent selection-based hybrid breeding under transparent and opaque simulators. Frontiers in Plant Science 2023; 14:1174168. [PMID: 37441181 PMCID: PMC10333587 DOI: 10.3389/fpls.2023.1174168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2023] [Accepted: 06/02/2023] [Indexed: 07/15/2023]
Abstract
Hybrid breeding is an established and effective process for improving offspring performance, but its recurrent nature makes it resource-intensive and time-consuming in practice. To enable breeders and researchers to evaluate the effectiveness of competing decision-making strategies, we present a modular simulation framework for reciprocal recurrent selection-based hybrid breeding. Consisting of multiple modules such as heterotic separation, genomic prediction, and genomic selection, this simulation framework allows breeders to efficiently simulate the hybrid breeding process with multiple options of simulators and decision-making strategies. We also integrate the recently proposed concepts of transparent and opaque simulators into the framework in order to reflect the breeding process more realistically. Simulation results compare the performance of different breeding strategies under the two simulators.
Affiliation(s)
- Zerui Zhang
  - Program of Bioinformatics and Computational Biology, Iowa State University, Ames, IA, United States
  - Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, United States
  - Department of Statistics, Iowa State University, Ames, IA, United States
- Lizhi Wang
  - Program of Bioinformatics and Computational Biology, Iowa State University, Ames, IA, United States
  - Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, United States
7
Labroo MR, Endelman JB, Gemenet DC, Werner CR, Gaynor RC, Covarrubias-Pazaran GE. Clonal diploid and autopolyploid breeding strategies to harness heterosis: insights from stochastic simulation. TAG. Theoretical and Applied Genetics. Theoretische und Angewandte Genetik 2023; 136:147. [PMID: 37291402 DOI: 10.1007/s00122-023-04377-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2022] [Accepted: 05/05/2023] [Indexed: 06/10/2023]
Abstract
KEY MESSAGE Reciprocal recurrent selection sometimes increases genetic gain per unit cost in clonal diploids with heterosis due to dominance, but it typically does not benefit autopolyploids.
Breeding can change the dominance as well as additive genetic value of populations, thus utilizing heterosis. A common hybrid breeding strategy is reciprocal recurrent selection (RRS), in which parents of hybrids are typically recycled within pools based on general combining ability. However, the relative performances of RRS and other breeding strategies have not been thoroughly compared. RRS can have relatively increased costs and longer cycle lengths, but these are sometimes outweighed by its ability to harness heterosis due to dominance. Here, we used stochastic simulation to compare genetic gain per unit cost of RRS, terminal crossing, recurrent selection on breeding value, and recurrent selection on cross performance considering different amounts of population heterosis due to dominance, relative cycle lengths, time horizons, estimation methods, selection intensities, and ploidy levels. In diploids with phenotypic selection at high intensity, whether RRS was the optimal breeding strategy depended on the initial population heterosis. However, in diploids with rapid-cycling genomic selection at high intensity, RRS was the optimal breeding strategy after 50 years over almost all amounts of initial population heterosis under the study assumptions. Diploid RRS required more population heterosis to outperform other strategies as its relative cycle length increased and as selection intensity and time horizon decreased. The optimal strategy depended on selection intensity, a proxy for inbreeding rate. Use of diploid fully inbred parents vs. outbred parents with RRS typically did not affect genetic gain. In autopolyploids, RRS typically did not outperform one-pool strategies regardless of the initial population heterosis.
Affiliation(s)
- Marlee R Labroo
  - Excellence in Breeding Platform, Consultative Group of International Agricultural Research, Texcoco, Mexico
  - International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
- Jeffrey B Endelman
  - Department of Horticulture, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Dorcus C Gemenet
  - Excellence in Breeding Platform, Consultative Group of International Agricultural Research, Texcoco, Mexico
  - International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
- Christian R Werner
  - Excellence in Breeding Platform, Consultative Group of International Agricultural Research, Texcoco, Mexico
  - International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
- Giovanny E Covarrubias-Pazaran
  - Excellence in Breeding Platform, Consultative Group of International Agricultural Research, Texcoco, Mexico
  - International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
8
Bandillo NB, Jarquin D, Posadas LG, Lorenz AJ, Graef GL. Genomic selection performs as effectively as phenotypic selection for increasing seed yield in soybean. The Plant Genome 2023; 16:e20285. [PMID: 36447395 DOI: 10.1002/tpg2.20285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Accepted: 10/08/2022] [Indexed: 05/10/2023]
Abstract
Increasing the rate of genetic gain for seed yield remains the primary breeding objective in both public and private soybean [Glycine max (L.) Merr.] breeding programs. Genomic selection (GS) has the potential to accelerate the rate of genetic gain for soybean seed yield. Limited studies to date have validated GS accuracy and directly compared GS with phenotypic selection (PS), and none have been reported in soybean. This study conducted the first empirical validation of GS for increasing seed yield using over 1,500 lines and over 7 yr (2010-2016) of replicated experiments in the University of Nebraska-Lincoln soybean breeding program. The study was designed to capture the varying genetic relatedness of the training population to three validation sets: two large biparental populations (TBP-1 and TBP-2) and a large validation set comprised of 457 preselected advanced lines derived from 45 biparental populations (TMP). We found that prediction accuracy (.54) realized in our validation experiments was comparable with what we obtained from a series of cross-validation experiments (.64). Both GS and PS were more effective for increasing the population mean performance compared with random selection (RS). We found a selection advantage of GS over PS, where higher genetic gain and identification of top-performing lines was maximized at 10-20% selected proportion. Genomic selection led to small increases in genetic similarity when compared with PS and RS presumably because of a significant shift on allelic frequencies toward the extremes, suggesting that it could erode genetic diversity more quickly. Overall, we found that GS can perform as effectively as PS but that measures should be considered to protect against loss of genetic variance when using GS.
Affiliation(s)
- Nonoy B Bandillo
  - Dep. of Agronomy and Horticulture, Univ. of Nebraska, 363 Keim Hall, Lincoln, NE, 68583, USA
  - Dep. of Plant Sciences, North Dakota State Univ., NDSU Dep. 7670, P.O. Box 6050, Fargo, ND, 58108-6050, USA
- Diego Jarquin
  - Dep. of Agronomy and Horticulture, Univ. of Nebraska, 363 Keim Hall, Lincoln, NE, 68583, USA
  - Agronomy Dep., Univ. of Florida, 2089 McCarthy Hall B, Gainesville, FL, 32611, USA
- Luis G Posadas
  - Dep. of Agronomy and Horticulture, Univ. of Nebraska, 363 Keim Hall, Lincoln, NE, 68583, USA
- Aaron J Lorenz
  - Dep. of Agronomy and Horticulture, Univ. of Nebraska, 363 Keim Hall, Lincoln, NE, 68583, USA
  - Dep. of Agronomy and Plant Genetics, Univ. of Minnesota, St. Paul, MN, 55108-6026, USA
- George L Graef
  - Dep. of Agronomy and Horticulture, Univ. of Nebraska, 363 Keim Hall, Lincoln, NE, 68583, USA
9
Diot J, Iwata H. Bayesian optimisation for breeding schemes. Frontiers in Plant Science 2023; 13:1050198. [PMID: 36714776 PMCID: PMC9875003 DOI: 10.3389/fpls.2022.1050198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2022] [Accepted: 11/14/2022] [Indexed: 06/18/2023]
Abstract
INTRODUCTION Advances in genotyping technologies have provided breeders with access to the genotypic values of several thousand genetic markers in their breeding materials. Combined with phenotypic data, this information facilitates genomic selection. Although genomic selection can benefit breeders, it does not guarantee efficient genetic improvement. Indeed, multiple components of breeding schemes may affect the efficiency of genetic improvement and controlling all components may not be possible. In this study, we propose a new application of Bayesian optimisation for optimizing breeding schemes under specific constraints using computer simulation.
METHODS Breeding schemes are simulated according to nine different parameters. Five of those parameters are considered constraints, and four can be optimised. Two optimisation methods are used to optimise those parameters, Bayesian optimisation and random optimisation.
RESULTS The results show that Bayesian optimisation indeed finds breeding scheme parametrisations that provide good breeding improvement with regard to the entire parameter space and outperforms random optimisation. Moreover, the results also show that the optimised parameter distributions differ according to breeder constraints.
DISCUSSION This study is one of the first to apply Bayesian optimisation to the design of breeding schemes while considering constraints. The presented approach has some limitations and should be considered as a first proof of concept that demonstrates the potential of Bayesian optimisation when applied to breeding schemes. Determining a general "rule of thumb" for breeding optimisation may be difficult and considering the specific constraints of each breeding campaign is important for finding an optimal breeding scheme.
10
Muvunyi BP, Zou W, Zhan J, He S, Ye G. Multi-Trait Genomic Prediction Models Enhance the Predictive Ability of Grain Trace Elements in Rice. Front Genet 2022; 13:883853. [PMID: 35812754 PMCID: PMC9257107 DOI: 10.3389/fgene.2022.883853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2022] [Accepted: 05/05/2022] [Indexed: 11/13/2022]
Abstract
Multi-trait (MT) genomic prediction models enable breeders to save phenotyping resources and increase the prediction accuracy of unobserved target traits by exploiting available information from non-target or auxiliary traits. Our study evaluated different MT models using 250 rice accessions from Asian countries genotyped and phenotyped for grain content of zinc (Zn), iron (Fe), copper (Cu), manganese (Mn), and cadmium (Cd). The predictive performance of MT models compared to a traditional single trait (ST) model was assessed by 1) applying different cross-validation strategies (CV1, CV2, and CV3) inferring varied phenotyping patterns and budgets; 2) accounting for local epistatic effects along with the main additive effect in MT models; and 3) using a selective marker panel composed of trait-associated SNPs in MT models. MT models were not statistically significantly (p < 0.05) superior to the ST model under CV1, where no phenotypic information was available for the accessions in the test set. After including phenotypes from auxiliary traits in both training and test sets (MT-CV2) or simply in the test set (MT-CV3), MT models significantly (p < 0.05) outperformed the ST model for all the traits. The highest increases in the predictive ability of MT models relative to ST models were 11.1% (Mn), 11.5% (Cd), 33.3% (Fe), 95.2% (Cu) and 126% (Zn). Accounting for the local epistatic effects using a haplotype-based model further improved the predictive ability of MT models by 4.6% (Cu), 3.8% (Zn), and 3.5% (Cd) relative to MT models with only additive effects. The predictive ability of the haplotype-based model was not improved after optimizing the marker panel by only considering the markers associated with the traits.
This study first assessed the local epistatic effects and marker optimization strategies in the MT genomic prediction framework and then illustrated the power of the MT model in predicting trace element traits in rice for the effective use of genetic resources to improve the nutritional quality of rice grain.
Affiliation(s)
- Blaise Pascal Muvunyi
  - CAAS-IRRI Joint Laboratory for Genomics-Assisted Germplasm Enhancement, Agricultural Genomics Institute in Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Wenli Zou
  - CAAS-IRRI Joint Laboratory for Genomics-Assisted Germplasm Enhancement, Agricultural Genomics Institute in Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Junhui Zhan
  - CAAS-IRRI Joint Laboratory for Genomics-Assisted Germplasm Enhancement, Agricultural Genomics Institute in Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Sang He
  - CAAS-IRRI Joint Laboratory for Genomics-Assisted Germplasm Enhancement, Agricultural Genomics Institute in Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
  - *Correspondence: Sang He; Guoyou Ye
- Guoyou Ye
  - CAAS-IRRI Joint Laboratory for Genomics-Assisted Germplasm Enhancement, Agricultural Genomics Institute in Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
  - Rice Breeding Innovations Platform, International Rice Research Institute, Los Baños, Philippines
  - *Correspondence: Sang He; Guoyou Ye
11
Zhang Z, Wang L. A look-ahead approach to maximizing present value of genetic gains in genomic selection. G3 (Bethesda, Md.) 2022; 12:6598801. [PMID: 35652749 PMCID: PMC9339320 DOI: 10.1093/g3journal/jkac136] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/23/2021] [Accepted: 05/16/2022] [Indexed: 11/17/2022]
Abstract
Look-ahead selection is a sophisticated yet effective algorithm for genomic selection, which optimizes not only the selection of breeding parents but also mating strategy and resource allocation by anticipating the implications of crosses in a prespecified future target generation. Simulation results using maize datasets have suggested that look-ahead selection is able to significantly accelerate genetic gain in the target generation while maintaining genetic diversity. In this paper, we propose a new algorithm to address the limitations of look-ahead selection, including the difficulty in specifying a meaningful deadline in a continuous breeding process and slow growth of genetic gain in early generations. This new algorithm uses the present value of genetic gains as the breeding objective, converting genetic gains realized in different generations to the current generation using a discount rate, similar to using the interest rate to measure the time value of cash flows incurred at different time points. By using the look-ahead techniques to anticipate the future gametes and thus present value of future genetic gains, this algorithm yields a better trade-off between short-term and long-term benefits. Results from simulation experiments showed that the new algorithm can achieve higher genetic gains in early generations and a continuously growing trajectory as opposed to the look-ahead selection algorithm, which features a slow progress in early generations and a growth spike right before the deadline.
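The discounting idea in this abstract can be sketched with the standard present-value formula from finance, in which a gain realized t generations ahead is divided by (1 + r)^t. The exact objective used in the paper is not given in the abstract, so this formula, the function name `present_value_of_gains`, and the example trajectories are assumptions for illustration.

```python
def present_value_of_gains(gains, discount_rate):
    """Discount per-generation genetic gains back to the current generation.

    gains: gain realized in generation 1, 2, ... (hypothetical values).
    discount_rate: r >= 0; a gain t generations ahead is worth gain / (1 + r) ** t now.
    """
    return sum(g / (1.0 + discount_rate) ** t for t, g in enumerate(gains, start=1))


if __name__ == "__main__":
    steady = [1.0, 1.0, 1.0, 1.0]      # continuous growth every generation
    late_spike = [0.0, 0.0, 0.0, 4.2]  # larger total gain, but all at the deadline
    rate = 0.1
    print(present_value_of_gains(steady, rate))
    print(present_value_of_gains(late_spike, rate))
```

With a positive discount rate, the steady trajectory scores higher than the late spike even though its undiscounted total is smaller, which is the kind of trade-off between short-term and long-term benefit the abstract describes.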
Affiliation(s)
- Zerui Zhang
- Program of Bioinformatics and Computational Biology, Iowa State University, Ames, IA 50011, USA; Department of Statistics, Iowa State University, Ames, IA 50011, USA
- Lizhi Wang
- Corresponding author: Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA 50011, USA.
|
12
|
Moeinizade S, Pham H, Han Y, Dobbels A, Hu G. An applied deep learning approach for estimating soybean relative maturity from UAV imagery to aid plant breeding decisions. Machine Learning with Applications 2022. [DOI: 10.1016/j.mlwa.2021.100233] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/26/2022] Open
|
13
|
Srivastava AK, Safaei N, Khaki S, Lopez G, Zeng W, Ewert F, Gaiser T, Rahimi J. Winter wheat yield prediction using convolutional neural networks from environmental and phenological data. Sci Rep 2022; 12:3215. [PMID: 35217689 PMCID: PMC8881605 DOI: 10.1038/s41598-022-06249-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/07/2021] [Accepted: 12/15/2021] [Indexed: 11/10/2022] Open
Abstract
Crop yield forecasting depends on many interactive factors, including crop genotype, weather, soil, and management practices. This study analyzes the performance of machine learning and deep learning methods for winter wheat yield prediction using an extensive dataset of weather, soil, and crop phenology variables in 271 counties across Germany from 1999 to 2019. We proposed a Convolutional Neural Network (CNN) model, which uses a 1-dimensional convolution operation to capture the time dependencies of environmental variables. We used eight supervised machine learning models as baselines and evaluated their predictive performance using RMSE, MAE, and correlation coefficient metrics to benchmark the yield prediction results. Our findings suggested that nonlinear models such as the proposed CNN, Deep Neural Network (DNN), and XGBoost were more effective in understanding the relationship between the crop yield and input data compared to the linear models. Our proposed CNN model outperformed all other baseline models used for winter wheat yield prediction (7 to 14% lower RMSE, 3 to 15% lower MAE, and 4 to 50% higher correlation coefficient than the best performing baseline across test data). We aggregated soil moisture and meteorological features at the weekly resolution to address the seasonality of the data. We also moved beyond prediction and interpreted the outputs of our proposed CNN model using SHAP and force plots which provided key insights in explaining the yield prediction results (importance of variables by time). We found DUL, wind speed at week ten, and radiation amount at week seven as the most critical features in winter wheat yield prediction.
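The three evaluation metrics named in this abstract (RMSE, MAE, and the correlation coefficient) can be sketched in plain Python. This assumes the usual textbook definitions and Pearson's correlation; the function names and the toy yield values are hypothetical and are not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (assumed here; undefined for constant input)."""
    mean_t = sum(y_true) / len(y_true)
    mean_p = sum(y_pred) / len(y_pred)
    cov = sum((a - mean_t) * (b - mean_p) for a, b in zip(y_true, y_pred))
    var_t = sum((a - mean_t) ** 2 for a in y_true)
    var_p = sum((b - mean_p) ** 2 for b in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Made-up observed and predicted yields, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 4.0, 4.0]
print(rmse(observed, predicted), mae(observed, predicted), pearson_r(observed, predicted))
```

Lower RMSE and MAE and a higher correlation indicate a better fit, which is the sense in which the abstract compares the CNN with its baselines.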
Affiliation(s)
- Amit Kumar Srivastava
- Institute of Crop Science and Resource Conservation, University of Bonn, Bonn, 53111, Germany.
- Nima Safaei
- Department of Business Analytics, Tippie College of Business, University of Iowa, Iowa, USA.
- Saeed Khaki
- Industrial and Manufacturing Systems Engineering Department, Iowa State University, Ames, USA.
- Gina Lopez
- Institute of Crop Science and Resource Conservation, University of Bonn, Bonn, 53111, Germany
- Wenzhi Zeng
- State Key Laboratory of Water Resources and Hydropower Engineering Science, Wuhan University, Wuhan, 430072, China
- Frank Ewert
- Institute of Crop Science and Resource Conservation, University of Bonn, Bonn, 53111, Germany
- Thomas Gaiser
- Institute of Crop Science and Resource Conservation, University of Bonn, Bonn, 53111, Germany
- Jaber Rahimi
- Karlsruhe Institute of Technology (KIT), Institute of Meteorology and Climate Research, Atmospheric Environmental Research (IMK-IFU), Karlsruhe, Germany
|
14
|
Martins Oliveira IC, Bernardeli A, Soler Guilhen JH, Pastina MM. Genomic Prediction of Complex Traits in an Allogamous Annual Crop: The Case of Maize Single-Cross Hybrids. Methods Mol Biol 2022; 2467:543-567. [PMID: 35451790 DOI: 10.1007/978-1-0716-2205-6_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
For many plant and animal species, commercial products are hybrids between individuals from different genetic groups. For allogamous plant species such as maize, the breeding objective is to produce single-cross hybrid varieties from two inbred lines each selected in complementary groups. Efficient hybrid breeding requires methods that (1) quickly generate homozygous and homogeneous parental lines with high combining abilities, (2) efficiently choose among the large number of available parental lines the most promising ones, and (3) predict the performances of sets of non-phenotyped single-cross hybrids, or hybrids phenotyped in a limited number of environments, based on their relationship with another set of hybrids with known performances. The maize breeding community has been developing model-based prediction of hybrid performances well before the genomic era. This chapter (1) provides a reminder of the maize breeding scheme before the genomic era; (2) describes how genomic data were incorporated in the prediction models involved in different steps of genomic-based single-cross maize hybrid breeding; and (3) reviews factors affecting the accuracy of genomic prediction, approaches for optimizing GP-based single-cross maize hybrid breeding schemes, and ensuring the long-term sustainability of genomic selection.
Affiliation(s)
- Arthur Bernardeli
- Department of Agronomy, Universidade Federal de Viçosa, Viçosa-MG, Brazil
|
15
|
Cheng H, Xu K, Li J, Abraham KJ. Optimizing Sequencing Resources in Genotyped Livestock Populations Using Linear Programming. Front Genet 2021; 12:740340. [PMID: 34745214 PMCID: PMC8570094 DOI: 10.3389/fgene.2021.740340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2021] [Accepted: 09/20/2021] [Indexed: 11/13/2022] Open
Abstract
Low-cost genome-wide single-nucleotide polymorphisms (SNPs) are routinely used in animal breeding programs. Compared to SNP arrays, the use of whole-genome sequence data generated by the next-generation sequencing technologies (NGS) has great potential in livestock populations. However, sequencing a large number of animals to exploit the full potential of whole-genome sequence data is not feasible. Thus, novel strategies are required for the allocation of sequencing resources in genotyped livestock populations such that the entire population can be imputed, maximizing the efficiency of whole genome sequencing budgets. We present two applications of linear programming for the efficient allocation of sequencing resources. The first application is to identify the minimum number of animals for sequencing subject to the criterion that each haplotype in the population is contained in at least one of the animals selected for sequencing. The second application is the selection of animals whose haplotypes include the largest possible proportion of common haplotypes present in the population, assuming a limited sequencing budget. Both applications are available in an open source program LPChoose. In both applications, LPChoose has similar or better performance than some other methods suggesting that linear programming methods offer great potential for the efficient allocation of sequencing resources. The utility of these methods can be increased through the development of improved heuristics.
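The first application in this abstract is a covering problem: pick the fewest animals such that every haplotype appears in at least one selected animal. The sketch below illustrates that objective with an exhaustive search over subsets, which is only practical for tiny inputs. It is not the linear programming formulation used by LPChoose, and the function name and toy data are hypothetical.

```python
from itertools import combinations


def min_sequencing_set(haplotypes_by_animal):
    """Return a smallest list of animals whose haplotypes cover all haplotypes.

    Exhaustive search by increasing subset size (a stand-in for an LP/ILP
    solver); exponential in the number of animals, so toy inputs only.
    """
    all_haplotypes = set().union(*haplotypes_by_animal.values())
    animals = sorted(haplotypes_by_animal)
    for size in range(len(animals) + 1):
        for subset in combinations(animals, size):
            covered = set().union(*(haplotypes_by_animal[a] for a in subset))
            if covered == all_haplotypes:
                return list(subset)
    return animals  # unreachable: the full set always covers everything


# Made-up haplotype carriers, for illustration only.
toy = {
    "A": {"h1", "h2"},
    "B": {"h2", "h3"},
    "C": {"h3", "h4"},
    "D": {"h1", "h4"},
}
chosen = min_sequencing_set(toy)
print(chosen)
```

In this toy case no single animal carries all four haplotypes, so the minimum is two animals. The budget-limited second application would instead fix the subset size and maximize coverage.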
Affiliation(s)
- Hao Cheng
- Department of Animal Science, University of California, Davis, Davis, CA, United States
- Keyu Xu
- Department of Animal Science, University of California, Davis, Davis, CA, United States
- Jinghui Li
- Department of Animal Science, University of California, Davis, Davis, CA, United States
- Kuruvilla Joseph Abraham
- Department of Economics, FEARP, University of São Paulo, Ribeirão Preto, Brazil; Department of Computer Science-ICMC, University of São Paulo, São Carlos, Brazil
|
16
|
Michel S, Löschenberger F, Ametz C, Bürstmayr H. Genomic selection of parents and crosses beyond the native gene pool of a breeding program. The Plant Genome 2021; 14:e20153. [PMID: 34651462 DOI: 10.1002/tpg2.20153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2021] [Accepted: 08/03/2021] [Indexed: 06/13/2023]
Abstract
Genomic selection has become a valuable tool for selecting cultivar candidates in many plant breeding programs. Genomic selection of elite parents and crossing combinations with germplasm developed outside a breeding program has, however, hardly been explored until now. The aim of this study was to assess the potential of this method for commonly ranking and selecting elite germplasm developed within and beyond a given breeding program. A winter wheat (Triticum aestivum L.) population consisting of 611 in-house and 87 externally developed lines was used to compare training population compositions and statistical models for genomically predicting baking quality in this framework. Augmenting training populations with lines from other breeding programs had a larger influence on the prediction ability than adding in-house generated lines when aiming to commonly rank both germplasm sets. Exploiting preexisting information of secondary correlated traits resulted likewise in more accurate predictions both in empirical analyses and simulations. Genotyping germplasm developed beyond a given breeding program is moreover a convenient way to clarify its relationships with a breeder's own germplasm because pedigree information is oftentimes not available for this purpose. Genomic predictions can thus support a more informed diversity management, especially when integrating simply to phenotype correlated traits to partly circumvent resource reallocations for a costly phenotyping of germplasm from other programs.
Affiliation(s)
- Sebastian Michel
- Dep. of Agrobiotechnology, IFA-Tulln, Univ. of Natural Resources and Life Sciences Vienna, Konrad-Lorenz-Str. 20, 3430 Tulln, Austria
- Christian Ametz
- Saatzucht Donau GesmbH & CoKG, Saatzuchtstrasse 11, 2301 Probstdorf, Austria
- Hermann Bürstmayr
- Dep. of Agrobiotechnology, IFA-Tulln, Univ. of Natural Resources and Life Sciences Vienna, Konrad-Lorenz-Str. 20, 3430 Tulln, Austria
|
17
|
Cappetta E, Andolfo G, Guadagno A, Di Matteo A, Barone A, Frusciante L, Ercolano MR. Tomato genomic prediction for good performance under high-temperature and identification of loci involved in thermotolerance response. Horticulture Research 2021; 8:212. [PMID: 34593775 PMCID: PMC8484564 DOI: 10.1038/s41438-021-00647-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 02/27/2021] [Revised: 07/05/2021] [Accepted: 07/14/2021] [Indexed: 06/13/2023]
Abstract
Many studies have shown that a few degrees above the optimum growth temperature threshold of tomato can lead to serious loss in production. Therefore, the development of innovative strategies to obtain tomato cultivars with improved yield under high temperature conditions is a main goal both for basic genetic studies and breeding activities. In this paper, an F4 segregating population was phenotypically evaluated for quantitative and qualitative traits under heat stress conditions. Moreover, a genotyping by sequencing (GBS) approach was employed for building up genomic selection (GS) models both for yield and soluble solid content (SCC). Several parameters, including training population size, composition and marker quality, were tested to predict genotype performance under heat stress conditions. A good prediction accuracy for the two analyzed traits (0.729 for yield production and 0.715 for SCC) was obtained. The predicted models improved the genetic gain of selection in the next breeding cycles, suggesting that the GS approach is a promising strategy to accelerate breeding for heat tolerance in tomato. Finally, the annotation of SNPs located in gene body regions combined with QTL analysis allowed the identification of five candidates putatively involved in the high-temperature response, and the building up of a GS model based on a calibrated panel of SNP markers.
Affiliation(s)
- Elisa Cappetta
- Department of Agricultural Sciences, University of Naples Federico II, Via Università 100, 80055, Portici, Naples, Italy
- Institute of Bioscience and BioResources, National Research Council, Via Università 100, 80055, Portici, Italy
- Giuseppe Andolfo
- Department of Agricultural Sciences, University of Naples Federico II, Via Università 100, 80055, Portici, Naples, Italy
- Anna Guadagno
- Department of Agricultural Sciences, University of Naples Federico II, Via Università 100, 80055, Portici, Naples, Italy
- Antonio Di Matteo
- Department of Agricultural Sciences, University of Naples Federico II, Via Università 100, 80055, Portici, Naples, Italy
- Amalia Barone
- Department of Agricultural Sciences, University of Naples Federico II, Via Università 100, 80055, Portici, Naples, Italy
- Luigi Frusciante
- Department of Agricultural Sciences, University of Naples Federico II, Via Università 100, 80055, Portici, Naples, Italy
- Maria Raffaella Ercolano
- Department of Agricultural Sciences, University of Naples Federico II, Via Università 100, 80055, Portici, Naples, Italy.
|
18
|
Atefi A, Ge Y, Pitla S, Schnable J. Robotic Technologies for High-Throughput Plant Phenotyping: Contemporary Reviews and Future Perspectives. Frontiers in Plant Science 2021; 12:611940. [PMID: 34249028 PMCID: PMC8267384 DOI: 10.3389/fpls.2021.611940] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 09/29/2020] [Accepted: 05/14/2021] [Indexed: 05/18/2023]
Abstract
Phenotyping plants is an essential component of any effort to develop new crop varieties. As plant breeders seek to increase crop productivity and produce more food for the future, the amount of phenotype information they require will also increase. Traditional plant phenotyping relying on manual measurement is laborious, time-consuming, error-prone, and costly. Plant phenotyping robots have emerged as a high-throughput technology to measure morphological, chemical and physiological properties of large numbers of plants. Several robotic systems have been developed to fulfill different phenotyping missions. In particular, robotic phenotyping has the potential to enable efficient monitoring of changes in plant traits over time in both controlled environments and in the field. The operation of these robots can be challenging as a result of the dynamic nature of plants and the agricultural environments. Here we discuss developments in phenotyping robots, the challenges that have been overcome, and others that remain outstanding. In addition, some prospective applications of the phenotyping robots are also presented. We optimistically anticipate that autonomous and robotic systems will make great leaps forward in the next 10 years to advance plant phenotyping research into a new era.
Affiliation(s)
- Abbas Atefi
- Department of Biological Systems Engineering, University of Nebraska–Lincoln, Lincoln, NE, United States
- Yufeng Ge
- Department of Biological Systems Engineering, University of Nebraska–Lincoln, Lincoln, NE, United States
- Santosh Pitla
- Department of Biological Systems Engineering, University of Nebraska–Lincoln, Lincoln, NE, United States
- James Schnable
- Department of Agronomy and Horticulture, University of Nebraska–Lincoln, Lincoln, NE, United States
|
19
|
Han Y, Cameron JN, Wang L, Pham H, Beavis WD. Dynamic Programming for Resource Allocation in Multi-Allelic Trait Introgression. Frontiers in Plant Science 2021; 12:544854. [PMID: 34220873 PMCID: PMC8253225 DOI: 10.3389/fpls.2021.544854] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/23/2020] [Accepted: 05/21/2021] [Indexed: 06/13/2023]
Abstract
Trait introgression is a complex process that plant breeders use to introduce desirable alleles from one variety or species to another. Two of the major types of decisions that must be made during this sophisticated and uncertain workflow are: parental selection and resource allocation. We formulated the trait introgression problem as an engineering process and proposed a Markov Decision Processes (MDP) model to optimize the resource allocation procedure. The efficiency of the MDP model was compared with static resource allocation strategies and their trade-offs among budget, deadline, and probability of success are demonstrated. Simulation results suggest that dynamic resource allocation strategies from the MDP model significantly improve the efficiency of the trait introgression by allocating the right amount of resources according to the genetic outcome of previous generations.
Affiliation(s)
- Ye Han
- Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, United States
- John N. Cameron
- Department of Agronomy, Iowa State University, Ames, IA, United States
- Lizhi Wang
- Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, United States
- Hieu Pham
- Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, United States
- William D. Beavis
- Department of Agronomy, Iowa State University, Ames, IA, United States
|
20
|
Marsh JI, Hu H, Gill M, Batley J, Edwards D. Crop breeding for a changing climate: integrating phenomics and genomics with bioinformatics. TAG. Theoretical and Applied Genetics. Theoretische und Angewandte Genetik 2021; 134:1677-1690. [PMID: 33852055 DOI: 10.1007/s00122-021-03820-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/22/2020] [Accepted: 03/18/2021] [Indexed: 05/05/2023]
Abstract
Safeguarding crop yields in a changing climate requires bioinformatics advances in harnessing data from vast phenomics and genomics datasets to translate research findings into climate smart crops in the field. Climate change and an additional 3 billion mouths to feed by 2050 raise serious concerns over global food security. Crop breeding and land management strategies will need to evolve to maximize the utilization of finite resources in coming years. High-throughput phenotyping and genomics technologies are providing researchers with the information required to guide and inform the breeding of climate smart crops adapted to the environment. Bioinformatics has a fundamental role to play in integrating and exploiting this fast accumulating wealth of data, through association studies to detect genomic targets underlying key adaptive climate-resilient traits. These data provide tools for breeders to tailor crops to their environment and can be introduced using advanced selection or genome editing methods. To effectively translate research into the field, genomic and phenomic information will need to be integrated into comprehensive clade-specific databases and platforms alongside accessible tools that can be used by breeders to inform the selection of climate adaptive traits. Here we discuss the role of bioinformatics in extracting, analysing, integrating and managing genomic and phenomic data to improve climate resilience in crops, including current, emerging and potential approaches, applications and bottlenecks in the research and breeding pipeline.
Affiliation(s)
- Jacob I Marsh
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Perth, 6009, Australia
- Haifei Hu
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Perth, 6009, Australia
- Mitchell Gill
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Perth, 6009, Australia
- Jacqueline Batley
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Perth, 6009, Australia
- David Edwards
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Perth, 6009, Australia.
|
21
|
Akhavizadegan F, Ansarifar J, Wang L, Huber I, Archontoulis SV. A time-dependent parameter estimation framework for crop modeling. Sci Rep 2021; 11:11437. [PMID: 34075079 PMCID: PMC8169860 DOI: 10.1038/s41598-021-90835-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/03/2020] [Accepted: 05/17/2021] [Indexed: 02/06/2023] Open
Abstract
The performance of crop models in simulating various aspects of the cropping system is sensitive to parameter calibration. Parameter estimation is challenging, especially for time-dependent parameters such as cultivar parameters with 2-3 years of lifespan. Manual calibration of the parameters is time-consuming, requires expertise, and is prone to error. This research develops a new automated framework to estimate time-dependent parameters for crop models using a parallel Bayesian optimization algorithm. This approach integrates the power of optimization and machine learning with prior agronomic knowledge. To test the proposed time-dependent parameter estimation method, we simulated historical yield increase (from 1985 to 2018) in 25 environments in the US Corn Belt with APSIM. Then we compared yield simulation results and nine parameter estimates from our proposed parallel Bayesian framework, with Bayesian optimization and manual calibration. Results indicated that parameters calibrated using the proposed framework achieved an 11.6% reduction in the prediction error over Bayesian optimization and a 52.1% reduction over manual calibration. We also trained nine machine learning models for yield prediction and found that none of them was able to outperform the proposed method in terms of root mean square error and R2. The most significant contribution of the new automated framework for time-dependent parameter estimation is its capability to find close-to-optimal parameters for the crop model. The proposed approach also produced explainable insight into cultivar traits' trends over 34 years (1985-2018).
Affiliation(s)
- Faezeh Akhavizadegan
- Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, 50011, USA.
- Javad Ansarifar
- Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, 50011, USA
- Lizhi Wang
- Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, 50011, USA
- Isaiah Huber
- Department of Agronomy, Iowa State University, Ames, IA, 50011, USA
|
22
|
Labroo MR, Studer AJ, Rutkoski JE. Heterosis and Hybrid Crop Breeding: A Multidisciplinary Review. Front Genet 2021; 12:643761. [PMID: 33719351 PMCID: PMC7943638 DOI: 10.3389/fgene.2021.643761] [Citation(s) in RCA: 62] [Impact Index Per Article: 20.7] [Received: 12/18/2020] [Accepted: 02/08/2021] [Indexed: 11/24/2022] Open
Abstract
Although hybrid crop varieties are among the most popular agricultural innovations, the rationale for hybrid crop breeding is sometimes misunderstood. Hybrid breeding is slower and more resource-intensive than inbred breeding, but it allows systematic improvement of a population by recurrent selection and exploitation of heterosis simultaneously. Inbred parental lines can identically reproduce both themselves and their F1 progeny indefinitely, whereas outbred lines cannot, so uniform outbred lines must be bred indirectly through their inbred parents to harness heterosis. Heterosis is an expected consequence of whole-genome non-additive effects at the population level over evolutionary time. Understanding heterosis from the perspective of molecular genetic mechanisms alone may be elusive, because heterosis is likely an emergent property of populations. Hybrid breeding is a process of recurrent population improvement to maximize hybrid performance. Hybrid breeding is not maximization of heterosis per se, nor testing random combinations of individuals to find an exceptional hybrid, nor using heterosis in place of population improvement. Though there are methods to harness heterosis other than hybrid breeding, such as use of open-pollinated varieties or clonal propagation, they are not currently suitable for all crops or production environments. The use of genomic selection can decrease cycle time and costs in hybrid breeding, particularly by rapidly establishing heterotic pools, reducing testcrossing, and limiting the loss of genetic variance. Open questions in optimal use of genomic selection in hybrid crop breeding programs remain, such as how to choose founders of heterotic pools, the importance of dominance effects in genomic prediction, the necessary frequency of updating the training set with phenotypic information, and how to maintain genetic variance and prevent fixation of deleterious alleles.
Affiliation(s)
- Jessica E. Rutkoski
- Department of Crop Sciences, University of Illinois at Urbana–Champaign, Urbana, IL, United States
|
23
|
Amini F, Franco FR, Hu G, Wang L. The look ahead trace back optimizer for genomic selection under transparent and opaque simulators. Sci Rep 2021; 11:4124. [PMID: 33602979 PMCID: PMC7893003 DOI: 10.1038/s41598-021-83567-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/01/2020] [Accepted: 02/02/2021] [Indexed: 11/29/2022] Open
Abstract
Recent advances in genomic selection (GS) have demonstrated the importance of not only the accuracy of genomic prediction but also the intelligence of selection strategies. The look ahead selection algorithm, for example, has been found to significantly outperform the widely used truncation selection approach in terms of genetic gain, thanks to its strategy of selecting breeding parents that may not necessarily be elite themselves but have the best chance of producing elite progeny in the future. This paper presents the look ahead trace back algorithm as a new variant of the look ahead approach, which introduces several improvements to further accelerate genetic gain, especially under imperfect genomic prediction. Perhaps an even more significant contribution of this paper is the design of opaque simulators for evaluating the performance of GS algorithms. These simulators are partially observable, explicitly capture both additive and non-additive genetic effects, and simulate uncertain recombination events more realistically. In contrast, most existing GS simulation settings are transparent, either explicitly or implicitly allowing the GS algorithm to exploit certain critical information that may not be possible in actual breeding programs. Comprehensive computational experiments were carried out using a maize data set to compare a variety of GS algorithms under four simulators with different levels of opacity. These results reveal how differently the same GS algorithm would interact with different simulators, suggesting the need for continued research in the design of more realistic simulators. As long as GS algorithms continue to be trained in silico rather than in planta, the best way to avoid a disappointing discrepancy between their simulated and actual performances may be to make the simulator as akin to the complex and opaque nature as possible.
Affiliation(s)
- Fatemeh Amini
- Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, 50011, USA
- Felipe Restrepo Franco
- Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, 50011, USA
- Guiping Hu
- Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, 50011, USA
- Lizhi Wang
- Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, 50011, USA.
|
24
|
A look-ahead Monte Carlo simulation method for improving parental selection in trait introgression. Sci Rep 2021; 11:3918. [PMID: 33594238 PMCID: PMC7887201 DOI: 10.1038/s41598-021-83634-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/02/2020] [Accepted: 01/14/2021] [Indexed: 11/08/2022] Open
Abstract
Multiple trait introgression is the process by which multiple desirable traits are converted from a donor to a recipient cultivar through backcrossing and selfing. The goal of this procedure is to recover all the attributes of the recipient cultivar, with the addition of the specified desirable traits. A crucial step in this process is the selection of parents to form new crosses. In this study, we propose a new selection approach that estimates the genetic distribution of the progeny of backcrosses after multiple generations using information of recombination events. Our objective is to select the most promising individuals for further backcrossing or selfing. To demonstrate the effectiveness of the proposed method, a case study has been conducted using maize data where our method is compared with state-of-the-art approaches. Simulation results suggest that the proposed method, look-ahead Monte Carlo, achieves higher probability of success than existing approaches. Our proposed selection method can assist breeders to efficiently design trait introgression projects.
|
25
|
How Prediction Accuracy Can Affect the Decision-Making Process in Pavement Management System. Infrastructures 2021. [DOI: 10.3390/infrastructures6020028] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 11/16/2022]
Abstract
One of the most important components of pavement management systems is predicting the deterioration of the network through performance models. The accuracy of the prediction model is important for prioritizing maintenance action. This paper describes how the accuracy of prediction models can affect the decision-making process in terms of the cost of maintenance and rehabilitation activities. The process simulates the propagation of the error between the actual and predicted values of pavement performance indicators. Different rates of error (10%, 30%, 50%, 70%, and 90%) were added to the results of the prediction models. The results showed a strong correlation between the prediction models’ accuracy and the cost of maintenance and rehabilitation activities. The cost of treatment (in millions of dollars) over 20 years across the five scenarios increased from $54.07 to $92.95, from $53.89 to $155.48, and from $74.41 to $107.77 for asphalt, composite, and concrete pavement types, respectively. Increasing the rate of error also contributed to the prediction model, resulting in a higher benefit reduction rate.
|
26
|
Shahhosseini M, Hu G, Huber I, Archontoulis SV. Coupling machine learning and crop modeling improves crop yield prediction in the US Corn Belt. Sci Rep 2021; 11:1606. [PMID: 33452349 PMCID: PMC7810832 DOI: 10.1038/s41598-020-80820-1] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 07/28/2020] [Accepted: 12/16/2020] [Indexed: 01/29/2023] Open
Abstract
This study investigates whether coupling crop modeling and machine learning (ML) improves corn yield predictions in the US Corn Belt. The main objectives are to explore whether a hybrid approach (crop modeling + ML) would result in better predictions, investigate which combinations of hybrid models provide the most accurate predictions, and determine which features from the crop modeling are most effective to integrate with ML for corn yield prediction. Five ML models (linear regression, LASSO, LightGBM, random forest, and XGBoost) and six ensemble models have been designed to address the research question. The results suggest that adding simulation crop model variables (APSIM) as input features to ML models can decrease yield prediction root mean squared error (RMSE) by 7 to 20%. Furthermore, we investigated partial inclusion of APSIM features in the ML prediction models and found that soil-moisture-related APSIM variables are the most influential on the ML predictions, followed by crop-related and phenology-related variables. Finally, based on a feature importance measure, simulated APSIM average drought stress and average water table depth during the growing season were observed to be the most important APSIM inputs to ML. This result indicates that weather information alone is not sufficient and that ML models need more hydrological inputs to make improved yield predictions.
Affiliation(s)
- Mohsen Shahhosseini
  - Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, USA
- Guiping Hu
  - Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, USA
- Isaiah Huber
  - Department of Agronomy, Iowa State University, Ames, IA, USA
|
27
|
Cappetta E, Andolfo G, Di Matteo A, Barone A, Frusciante L, Ercolano MR. Accelerating Tomato Breeding by Exploiting Genomic Selection Approaches. PLANTS (BASEL, SWITZERLAND) 2020; 9:E1236. [PMID: 32962095 PMCID: PMC7569914 DOI: 10.3390/plants9091236] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 04/15/2020] [Revised: 05/13/2020] [Accepted: 09/15/2020] [Indexed: 01/16/2023]
Abstract
Genomic selection (GS) is a predictive approach that was developed to increase the rate of genetic gain per unit of time and reduce the generation interval by utilizing genome-wide markers in breeding programs. It has emerged as a valuable method for improving complex traits that are controlled by many genes with small effects. GS enables the prediction of the breeding value of candidate genotypes for selection. In this work, we address important issues related to GS and its implementation in the plant context, with special emphasis on tomato breeding. Genomic constraints and critical parameters affecting the accuracy of prediction, such as the number of markers, statistical model, phenotyping and complexity of trait, and training population size and composition, should be carefully evaluated. The comparison of GS approaches for facilitating the selection of superior tomato genotypes during breeding programs is also discussed. GS applied to tomato breeding has already been shown to be feasible. We illustrate how GS can improve the rate of gain in elite line selection, and descendent and backcross schemes. The GS schemes have begun to be delineated, and computer science can provide support for future selection strategies. A promising new breeding framework is beginning to emerge for optimizing tomato improvement procedures.
Affiliation(s)
- Maria Raffaella Ercolano
  - Department of Agricultural Sciences, University of Naples Federico II, Via Università 100, 80055 Naples, Italy; (E.C.); (G.A.); (A.D.M.); (A.B.); (L.F.)
|
28
|
Multi-trait Genomic Selection Methods for Crop Improvement. Genetics 2020; 215:931-945. [PMID: 32482640 DOI: 10.1534/genetics.120.303305] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 05/01/2020] [Accepted: 05/26/2020] [Indexed: 11/18/2022] Open
Abstract
Plant breeders make selection decisions based on multiple traits, such as yield, plant height, flowering time, and disease resistance. A commonly used approach in multi-trait genomic selection is index selection, which assigns weights to different traits relative to their economic importance. However, classical index selection only optimizes genetic gain in the next generation, requires some experimentation to find weights that lead to desired outcomes, and has difficulty optimizing nonlinear breeding objectives. Multi-objective optimization has also been used to identify the Pareto frontier of selection decisions, which represents different trade-offs across multiple traits. We propose a new approach, which maximizes certain traits while keeping others within desirable ranges. Optimal selection decisions are made using a new version of the look-ahead selection (LAS) algorithm, which was recently proposed for single-trait genomic selection, and achieved superior performance with respect to other state-of-the-art selection methods. To demonstrate the effectiveness of the new method, a case study is developed using a realistic data set where our method is compared with conventional index selection. Results suggest that the multi-trait LAS is more effective at balancing multiple traits compared with index selection.
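The classical index selection mentioned in this abstract can be illustrated with a minimal sketch. The function name `index_select`, the candidate values, and the weights below are all hypothetical and are not taken from the cited paper; the sketch only shows how a weighted sum of per-trait breeding values ranks candidates, and how the chosen weights change which candidates are selected.

```python
def index_select(breeding_values, weights, k):
    """Rank candidates by a weighted sum of trait values and keep the top k.

    breeding_values: list of rows, one row of per-trait values per candidate.
    weights: one weight per trait (relative importance, chosen by the breeder).
    Returns (indices of the k best candidates, list of index scores).
    """
    scores = [sum(w * v for w, v in zip(weights, row)) for row in breeding_values]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k], scores


# Hypothetical estimated breeding values: columns are (yield, disease resistance).
candidates = [
    [2.0, 0.5],
    [1.0, 2.0],
    [0.5, 0.5],
    [1.5, 1.5],
]

# Weighting yield more heavily favors candidates 0 and 3.
yield_focused, _ = index_select(candidates, [0.7, 0.3], k=2)

# Weighting disease resistance more heavily favors candidates 1 and 3.
resistance_focused, _ = index_select(candidates, [0.3, 0.7], k=2)
```

The two calls select different parents from the same data, which reflects the abstract's point that index selection needs some experimentation with weights to reach a desired outcome.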
|
29
|
Xu Y, Liu X, Fu J, Wang H, Wang J, Huang C, Prasanna BM, Olsen MS, Wang G, Zhang A. Enhancing Genetic Gain through Genomic Selection: From Livestock to Plants. PLANT COMMUNICATIONS 2020; 1:100005. [PMID: 33404534 PMCID: PMC7747995 DOI: 10.1016/j.xplc.2019.100005] [Citation(s) in RCA: 88] [Impact Index Per Article: 22.0] [Indexed: 12/28/2022]
Abstract
Although long-term genetic gain has been achieved through increasing use of modern breeding methods and technologies, the rate of genetic gain needs to be accelerated to meet humanity's demand for agricultural products. In this regard, genomic selection (GS) has been considered most promising for genetic improvement of the complex traits controlled by many genes each with minor effects. Livestock scientists pioneered GS application largely due to livestock's significantly higher individual values and the greater reduction in generation interval that can be achieved in GS. Large-scale application of GS in plants can be achieved by refining field management to improve heritability estimation and prediction accuracy and developing optimum GS models with the consideration of genotype-by-environment interaction and non-additive effects, along with significant cost reduction. Moreover, it would be more effective to integrate GS with other breeding tools and platforms for accelerating the breeding process and thereby further enhancing genetic gain. In addition, establishing an open-source breeding network and developing transdisciplinary approaches would be essential in enhancing breeding efficiency for small- and medium-sized enterprises and agricultural research systems in developing countries. New strategies centered on GS for enhancing genetic gain need to be developed.
Affiliation(s)
- Yunbi Xu
  - Institute of Crop Science/CIMMYT-China, Chinese Academy of Agricultural Sciences, Beijing 100081, China
  - CIMMYT-China Tropical Maize Research Center, Foshan University, Foshan 528231, China
  - CIMMYT-China Specialty Maize Research Center, Shanghai Academy of Agricultural Sciences, Shanghai 201400, China
- Xiaogang Liu
  - Institute of Crop Science/CIMMYT-China, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Junjie Fu
  - Institute of Crop Science/CIMMYT-China, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Hongwu Wang
  - Institute of Crop Science/CIMMYT-China, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Jiankang Wang
  - Institute of Crop Science/CIMMYT-China, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Changling Huang
  - Institute of Crop Science/CIMMYT-China, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Boddupalli M. Prasanna
  - CIMMYT (International Maize and Wheat Improvement Center), ICRAF Campus, United Nations Avenue, Nairobi, Kenya
- Michael S. Olsen
  - CIMMYT (International Maize and Wheat Improvement Center), ICRAF Campus, United Nations Avenue, Nairobi, Kenya
- Guoying Wang
  - Institute of Crop Science/CIMMYT-China, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Aimin Zhang
  - Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
|
30
|
Shahhosseini M, Hu G, Archontoulis SV. Forecasting Corn Yield With Machine Learning Ensembles. FRONTIERS IN PLANT SCIENCE 2020; 11:1120. [PMID: 32849688 PMCID: PMC7411227 DOI: 10.3389/fpls.2020.01120] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 01/17/2020] [Accepted: 07/07/2020] [Indexed: 05/03/2023]
Abstract
The emergence of new technologies to synthesize and analyze big data with high-performance computing has increased our capacity to more accurately predict crop yields. Recent research has shown that machine learning (ML) can provide reasonable predictions faster and with higher flexibility compared to simulation crop modeling. However, a single machine learning model can be outperformed by a "committee" of models (machine learning ensembles) that can reduce prediction bias, variance, or both and is able to better capture the underlying distribution of the data. Yet, there are many aspects to be investigated with regard to prediction accuracy, time of the prediction, and scale. The earlier the prediction during the growing season the better, but this has not been thoroughly investigated, as previous studies considered all data available to predict yields. This paper provides a machine learning based framework to forecast corn yields in three US Corn Belt states (Illinois, Indiana, and Iowa) considering complete and partial in-season weather knowledge. Several ensemble models are designed using a blocked sequential procedure to generate out-of-bag predictions. The forecasts are made at the county level and aggregated to the agricultural district and state levels. Results show that the proposed optimized weighted ensemble and the average ensemble are the most precise models, with an RRMSE of 9.5%. Stacked LASSO makes the least biased predictions (MBE of 53 kg/ha), while other ensemble models also outperformed the base learners in terms of bias. On the contrary, although random k-fold cross-validation is replaced by a blocked sequential procedure, it is shown that stacked ensembles do not perform as well as weighted ensemble models for time series data sets, as they require the data to be non-IID to perform favorably. Comparing our proposed model forecasts with the literature demonstrates the acceptable performance of forecasts made by our proposed ensemble model.
Results from the scenario with partial in-season weather knowledge reveal that decent yield forecasts with an RRMSE of 9.2% can be made as early as June 1st. Moreover, it was shown that the proposed model performed better than individual models and benchmark ensembles at the agricultural district and state levels as well as at the county level. To find the marginal effect of each input feature on the forecasts made by the proposed ensemble model, a methodology is suggested that is the basis for finding feature importance for the ensemble model. The findings suggest that weather features corresponding to weather in weeks 18-24 (May 1st to June 1st) are the most important input features.
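The error metrics and ensemble averaging named in this abstract can be sketched as follows. This is only an illustration under stated assumptions: RRMSE is taken here as RMSE divided by the mean of the observations (in percent), MBE as the mean of predicted minus observed, and the data and the fixed equal weights are hypothetical. The cited paper optimizes the ensemble weights, which this sketch does not attempt.

```python
import math


def rrmse(observed, predicted):
    """Relative RMSE in percent: RMSE divided by the mean observation (assumed definition)."""
    n = len(observed)
    rmse = math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100.0 * rmse / (sum(observed) / n)


def mbe(observed, predicted):
    """Mean bias error: average of (predicted - observed); the sign convention is an assumption."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


def weighted_ensemble(predictions, weights):
    """Combine base-learner predictions with weights that sum to one."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n_points = len(predictions[0])
    return [sum(w * p[i] for w, p in zip(weights, predictions)) for i in range(n_points)]


# Hypothetical yields (arbitrary units) and two base learners with opposite bias.
observed = [10.0, 12.0, 8.0]
model_a = [11.0, 13.0, 9.0]   # over-predicts by 1 everywhere
model_b = [9.0, 11.0, 7.0]    # under-predicts by 1 everywhere

# Equal weights give the simple average ensemble.
average = weighted_ensemble([model_a, model_b], [0.5, 0.5])

error_a = rrmse(observed, model_a)
error_avg = rrmse(observed, average)
bias_a = mbe(observed, model_a)
```

In this constructed case the two biases cancel exactly, so the average ensemble has lower RRMSE than either base learner, which is the kind of bias reduction the abstract attributes to ensembles.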
Affiliation(s)
- Mohsen Shahhosseini
  - Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, United States
- Guiping Hu
  - Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, United States
  - *Correspondence: Guiping Hu,
|