1
|
Fernandes IK, Vieira CC, Dias KOG, Fernandes SB. Using machine learning to combine genetic and environmental data for maize grain yield predictions across multi-environment trials. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2024; 137:189. [PMID: 39044035 PMCID: PMC11266441 DOI: 10.1007/s00122-024-04687-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Accepted: 06/29/2024] [Indexed: 07/25/2024]
Abstract
KEY MESSAGE Incorporating feature-engineered environmental data into machine learning-based genomic prediction models is an efficient approach to indirectly model genotype-by-environment interactions. Complementing phenotypic traits and molecular markers with high-dimensional data such as climate and soil information is becoming a common practice in breeding programs. This study explored new ways to combine non-genetic information in genomic prediction models using machine learning. Using the multi-environment trial data from the Genomes To Fields initiative, different models to predict maize grain yield were adjusted using various inputs: genetic, environmental, or a combination of both, either in an additive (genetic-and-environmental; G+E) or a multiplicative (genotype-by-environment interaction; GEI) manner. When including environmental data, the mean prediction accuracy of machine learning genomic prediction models increased up to 7% over the well-established Factor Analytic Multiplicative Mixed Model among the three cross-validation scenarios evaluated. Moreover, using the G+E model was more advantageous than the GEI model given the superior, or at least comparable, prediction accuracy, the lower usage of computational memory and time, and the flexibility of accounting for interactions by construction. Our results illustrate the flexibility provided by the ML framework, particularly with feature engineering. We show that the feature engineering stage offers a viable option for envirotyping and generates valuable information for machine learning-based genomic prediction models. Furthermore, we verified that the genotype-by-environment interactions may be considered using tree-based approaches without explicitly including interactions in the model. These findings support the growing interest in merging high-dimensional genotypic and environmental data into predictive modeling.
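The additive (G+E) and multiplicative (GEI) input layouts contrasted in the abstract can be illustrated with a small sketch. The function names, the toy values, and the use of pairwise products for the interaction terms are assumptions made here for illustration; the abstract does not say how the authors built their GEI inputs.

```python
def additive_features(genetic, environmental):
    """G+E layout: genetic and environmental features placed side by side."""
    return list(genetic) + list(environmental)


def interaction_features(genetic, environmental):
    """GEI layout (assumed form): one product per genetic-environmental pair."""
    return [g * e for g in genetic for e in environmental]


# Hypothetical observation: 3 marker scores and 2 environmental covariates.
markers = [1, 0, 2]
covariates = [0.5, 2.0]

g_plus_e = additive_features(markers, covariates)  # 3 + 2 = 5 columns
gei = interaction_features(markers, covariates)    # 3 * 2 = 6 columns
print(len(g_plus_e), len(gei))
```

With p genetic and q environmental features, the additive layout has p + q columns while the product layout has p × q, which is in line with the lower memory use the abstract reports for G+E. A tree-based learner can still split first on a marker and then on a covariate, so interactions remain representable without explicit product columns.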
Affiliation(s)
- Igor K Fernandes
  - Department of Crop, Soil, and Environmental Sciences, Center for Agricultural Data Analytics, University of Arkansas, Fayetteville, AR, USA
- Caio C Vieira
  - Department of Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, AR, USA
- Kaio O G Dias
  - Department of General Biology, Federal University of Viçosa, Viçosa, Brazil
- Samuel B Fernandes
  - Department of Crop, Soil, and Environmental Sciences, Center for Agricultural Data Analytics, University of Arkansas, Fayetteville, AR, USA
|
2
|
Montesinos-López OA, Herr AW, Crossa J, Montesinos-López A, Carter AH. Enhancing winter wheat prediction with genomics, phenomics and environmental data. BMC Genomics 2024; 25:544. [PMID: 38822262 PMCID: PMC11143639 DOI: 10.1186/s12864-024-10438-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 05/21/2024] [Indexed: 06/02/2024] Open
Abstract
In the realm of multi-environment prediction, when the goal is to predict a complete environment using the others as a training set, the efficiency of genomic selection (GS) falls short of expectations. Genotype by environment interaction poses a challenge in achieving high prediction accuracies. Consequently, current efforts are focused on enhancing efficiency by integrating various types of inputs, such as phenomics data, environmental information, and other omics data. In this study, we sought to evaluate the impact of incorporating environmental information into the modeling process, in addition to genomic and phenomics information. Our evaluation encompassed five data sets of soft white winter wheat, and the results revealed a significant improvement in prediction accuracy, as measured by the normalized root mean square error (NRMSE), through the integration of environmental information. Notably, there was an average gain in prediction accuracy of 49.19% in terms of NRMSE across the data sets. Moreover, the observed prediction accuracy ranged from 5.68% (data set 3) to 60.36% (data set 4), underscoring the substantial effect of integrating environmental information. By including genomic, phenomic, and environmental data in prediction models, plant breeding programs can improve selection efficiency across locations.
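The NRMSE metric and the percentage gains quoted in the abstract can be made concrete with a short sketch. The abstract does not define the normalization or the gain formula, so both choices below are assumptions: RMSE is divided by the mean of the observed values, and two candidate gain formulas are shown side by side. All function names are hypothetical.

```python
import math


def nrmse(observed, predicted):
    """RMSE divided by the mean of the observed values (assumed normalization)."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return math.sqrt(mse) / (sum(observed) / len(observed))


def percent_reduction(baseline_error, new_error):
    """Gain as the share of the baseline error removed (at most 100%)."""
    return 100.0 * (baseline_error - new_error) / baseline_error


def ratio_gain(baseline_error, new_error):
    """Gain as a ratio of errors minus one (can exceed 100%)."""
    return 100.0 * (baseline_error / new_error - 1.0)


# Hypothetical yields and predictions, for illustration only.
observed = [4.0, 5.0, 6.0]
predicted = [4.5, 5.0, 5.5]
print(nrmse(observed, predicted))
print(percent_reduction(4.0, 2.0), ratio_gain(4.0, 2.0))
```

The two gain formulas agree in sign but not in size: halving the error is a 50% reduction under the first and a 100% gain under the second, so a reported percentage is only interpretable once the formula is known.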
Affiliation(s)
- Andrew W Herr
  - Department of Crop and Soil Sciences, Washington State University, Pullman, WA, 99164, USA
- José Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, Edo. de México, CP 52640, México
  - Universidad de Guadalajara, Montecillos, Edo. de México, CP 56230, México
- Arron H Carter
  - Department of Crop and Soil Sciences, Washington State University, Pullman, WA, 99164, USA
|
3
|
Montesinos-López OA, Crespo-Herrera L, Pierre CS, Cano-Paez B, Huerta-Prado GI, Mosqueda-González BA, Ramos-Pulido S, Gerard G, Alnowibet K, Fritsche-Neto R, Montesinos-López A, Crossa J. Feature engineering of environmental covariates improves plant genomic-enabled prediction. FRONTIERS IN PLANT SCIENCE 2024; 15:1349569. [PMID: 38812738 PMCID: PMC11135473 DOI: 10.3389/fpls.2024.1349569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Accepted: 04/11/2024] [Indexed: 05/31/2024]
Abstract
Introduction: Because genomic selection (GS) is a predictive methodology, it needs to guarantee high prediction accuracy for practical implementation. However, since many factors affect its prediction performance, its practical implementation still needs to be improved in many breeding programs. For this reason, many strategies have been explored to improve its prediction performance. Methods: When environmental covariates are incorporated as inputs in genomic prediction models, this information only sometimes helps increase prediction performance. For this reason, this investigation explores the use of feature engineering on the environmental covariates to enhance the prediction performance of genomic prediction models. Results and discussion: We found that, across data sets, feature engineering reduced prediction error by 761.625% across predictors relative to including the environmental covariates without feature engineering. These results are very promising regarding the potential of feature engineering to enhance prediction accuracy. However, since a significant gain in prediction accuracy was observed in only some data sets, further research is required to guarantee a robust feature engineering strategy for incorporating the environmental covariates.
Affiliation(s)
- Carolina Saint Pierre
  - International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Edo. de Mexico, Mexico
- Bernabe Cano-Paez
  - Facultad de Ciencias, Universidad Nacional Autónoma de México (UNAM), México City, Mexico
- Sofia Ramos-Pulido
  - Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara, Jalisco, Mexico
- Guillermo Gerard
  - International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Edo. de Mexico, Mexico
- Khalid Alnowibet
  - Department of Statistics and Operations Research, King Saud University, Riyadh, Saudi Arabia
- Abelardo Montesinos-López
  - Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara, Jalisco, Mexico
- José Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Edo. de Mexico, Mexico
  - Louisiana State University, Baton Rouge, LA, United States
  - Distinguished Scientist Fellowship Program, King Saud University, Riyadh, Saudi Arabia
  - Instituto de Socioeconomia, Estadistica e Informatica, Colegio de Postgraduados, Montecillos, Edo. de México, Texcoco, Mexico
|
4
|
Araújo MS, Chaves SFS, Dias LAS, Ferreira FM, Pereira GR, Bezerra ARG, Alves RS, Heinemann AB, Breseghello F, Carneiro PCS, Krause MD, Costa-Neto G, Dias KOG. GIS-FA: an approach to integrating thematic maps, factor-analytic, and envirotyping for cultivar targeting. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2024; 137:80. [PMID: 38472532 DOI: 10.1007/s00122-024-04579-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2023] [Accepted: 02/06/2024] [Indexed: 03/14/2024]
Abstract
KEY MESSAGE We propose an "enviromics" prediction model for recommending cultivars based on thematic maps aimed at decision-makers. Parsimonious methods that capture genotype-by-environment interaction (GEI) in multi-environment trials (MET) are important in breeding programs. Understanding the causes and factors of GEI allows the utilization of genotype adaptations in the target population of environments through environmental features and factor-analytic (FA) models. Here, we present a novel predictive breeding approach called GIS-FA, which integrates geographic information systems (GIS) techniques, FA models, partial least squares (PLS) regression, and enviromics to predict phenotypic performance in untested environments. The GIS-FA approach enables: (i) the prediction of the phenotypic performance of tested genotypes in untested environments, (ii) the selection of the best-ranking genotypes based on their overall performance and stability using the FA selection tools, and (iii) the creation of thematic maps showing overall or pairwise performance and stability for decision-making. We exemplify the usage of the GIS-FA approach using two datasets of rice [Oryza sativa (L.)] and soybean [Glycine max (L.) Merr.] in MET spread over tropical areas. In summary, our novel predictive method allows the identification of new breeding scenarios by pinpointing groups of environments where genotypes demonstrate superior predicted performance. It also facilitates and optimizes cultivar recommendations by utilizing thematic maps.
Affiliation(s)
- Maurício S Araújo
  - Department of Agronomy, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Saulo F S Chaves
  - Department of Agronomy, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Luiz A S Dias
  - Department of Agronomy, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Filipe M Ferreira
  - Department of Crop Science - College of Agricultural Sciences, São Paulo State University, Botucatu, São Paulo, Brazil
- Guilherme R Pereira
  - Department of Agronomy, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Rodrigo S Alves
  - Department of General Biology, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Alexandre B Heinemann
  - Brazilian Agricultural Research Corporation (Embrapa Rice and Beans), Santo Antônio de Goiás, Goiás, Brazil
- Flávio Breseghello
  - Brazilian Agricultural Research Corporation (Embrapa Rice and Beans), Santo Antônio de Goiás, Goiás, Brazil
- Pedro C S Carneiro
  - Department of General Biology, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Kaio O G Dias
  - Department of General Biology, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
|
5
|
Hoque A, Anderson JV, Rahman M. Genomic prediction for agronomic traits in a diverse Flax (Linum usitatissimum L.) germplasm collection. Sci Rep 2024; 14:3196. [PMID: 38326469 PMCID: PMC10850546 DOI: 10.1038/s41598-024-53462-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Accepted: 01/31/2024] [Indexed: 02/09/2024] Open
Abstract
Breeding programs require exhaustive phenotyping of germplasm, which is time-consuming and expensive. Genomic prediction helps breeders harness the diversity of any collection to bypass phenotyping. Here, we examined the potential of genomic prediction for seed yield and nine agronomic traits using 26,171 single nucleotide polymorphism (SNP) markers in a set of 337 flax (Linum usitatissimum L.) germplasm accessions, phenotyped in five environments. We evaluated 14 prediction models and several factors affecting predictive ability based on cross-validation schemes. Predictive ability varied significantly among models across traits for the whole marker set. The ridge regression (RR) model covering additive gene action yielded better predictive ability for most of the traits, whereas models capturing epistatic gene action yielded higher predictive ability for low-heritability traits. Marker subsets based on linkage disequilibrium decay distance gave significantly higher predictive abilities than the whole marker set, but for randomly selected markers, predictive ability reached a plateau above 3000 markers. Markers having significant association with traits improved predictive abilities compared to the whole marker set when marker selection was made on the whole population instead of the training set, indicating clear overfitting. The correction for population structure did not increase predictive abilities compared to the whole collection. However, stratified sampling by picking representative genotypes from each cluster improved predictive abilities. The indirect predictive ability for a trait was proportional to its correlation with other traits. These results will help breeders to select the best models, optimum marker set, and suitable genotype set to perform indirect selection for quantitative traits in this diverse flax germplasm collection.
Affiliation(s)
- Ahasanul Hoque
  - Department of Plant Sciences, North Dakota State University, Fargo, ND, USA
  - Department of Genetics and Plant Breeding, Bangladesh Agricultural University, Mymensingh, 2202, Bangladesh
- James V Anderson
  - USDA-ARS, Edward T. Schafer Agricultural Research Center, Fargo, ND, USA
- Mukhlesur Rahman
  - Department of Plant Sciences, North Dakota State University, Fargo, ND, USA
|
6
|
Zhou G, Gao J, Zuo D, Li J, Li R. MSXFGP: combining improved sparrow search algorithm with XGBoost for enhanced genomic prediction. BMC Bioinformatics 2023; 24:384. [PMID: 37817077 PMCID: PMC10566073 DOI: 10.1186/s12859-023-05514-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Accepted: 10/02/2023] [Indexed: 10/12/2023] Open
Abstract
BACKGROUND: With the significant reduction in the cost of high-throughput sequencing technology, genomic selection technology has developed rapidly in the field of plant breeding. Although researchers have proposed numerous genomic selection methods, existing methods still suffer from poor prediction accuracy in practical applications. RESULTS: This paper proposes a genomic prediction method, MSXFGP, based on a multi-strategy improved sparrow search algorithm (SSA) that optimizes XGBoost parameters and feature selection. First, logistic chaos mapping, elite learning, adaptive parameter adjustment, Levy flight, and an early stop strategy are incorporated into the SSA. This integration enhances the global and local search capabilities of the algorithm, thereby improving its convergence accuracy and stability. Subsequently, the improved SSA is used to concurrently optimize XGBoost parameters and feature selection, leading to the establishment of a new genomic selection method, MSXFGP. Using both the coefficient of determination R² and the Pearson correlation coefficient as evaluation metrics, MSXFGP was evaluated against six existing genomic selection models across six datasets. The findings reveal that MSXFGP prediction accuracy is comparable to or better than that of existing widely used genomic selection methods, and that it exhibits better accuracy when R² is used as the assessment metric. Additionally, this research provides a user-friendly Python utility designed to aid breeders in the effective application of this method. MSXFGP is accessible at https://github.com/DIBreeding/MSXFGP. CONCLUSIONS: The experimental results show that the prediction accuracy of MSXFGP is comparable to or better than that of existing genomic selection methods, providing a new approach for plant genomic selection.
Affiliation(s)
- Ganghui Zhou
  - College of Computer and Information Engineering, Inner Mongolia Agricultural University, Erdos East Street No. 29, Hohhot, 010011, China
  - Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application for Agriculture and Animal Husbandry, Zhaowuda Road No. 306, Hohhot, 010018, China
- Jing Gao
  - College of Computer and Information Engineering, Inner Mongolia Agricultural University, Erdos East Street No. 29, Hohhot, 010011, China
  - Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application for Agriculture and Animal Husbandry, Zhaowuda Road No. 306, Hohhot, 010018, China
  - Inner Mongolia Autonomous Region Big Data Center, Chilechuan Street No. 1, Hohhot, 010091, China
- Dongshi Zuo
  - College of Computer and Information Engineering, Inner Mongolia Agricultural University, Erdos East Street No. 29, Hohhot, 010011, China
  - Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application for Agriculture and Animal Husbandry, Zhaowuda Road No. 306, Hohhot, 010018, China
- Jin Li
  - College of Computer and Information Engineering, Inner Mongolia Agricultural University, Erdos East Street No. 29, Hohhot, 010011, China
  - Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application for Agriculture and Animal Husbandry, Zhaowuda Road No. 306, Hohhot, 010018, China
- Rui Li
  - College of Computer and Information Engineering, Inner Mongolia Agricultural University, Erdos East Street No. 29, Hohhot, 010011, China
  - Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application for Agriculture and Animal Husbandry, Zhaowuda Road No. 306, Hohhot, 010018, China
|
7
|
Tolley SA, Brito LF, Wang DR, Tuinstra MR. Genomic prediction and association mapping of maize grain yield in multi-environment trials based on reaction norm models. Front Genet 2023; 14:1221751. [PMID: 37719703 PMCID: PMC10501150 DOI: 10.3389/fgene.2023.1221751] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Accepted: 08/15/2023] [Indexed: 09/19/2023] Open
Abstract
Genotype-by-environment interaction (GEI) is among the greatest challenges for maize breeding programs. Strong GEI limits both the prediction of genotype performance across variable environmental conditions and the identification of genomic regions associated with grain yield. Incorporating GEI into yield prediction models has been shown to improve prediction accuracy of yield; nevertheless, more work is needed to further understand this complex interaction across populations and environments. The main objectives of this study were to: 1) assess GEI in maize grain yield based on reaction norm models and predict hybrid performance across a gradient of environmental (EG) conditions and 2) perform a genome-wide association study (GWAS) and post-GWAS analyses for maize grain yield using data from 2014 to 2017 of the Genomes to Fields initiative hybrid trial. After quality control, 2,126 hybrids with genotypic and phenotypic data were assessed across 86 environments representing combinations of locations and years, although not all hybrids were evaluated in all environments. Heritability was greater in higher-yielding environments due to an increase in genetic variability in these environments in comparison to the low-yielding environments. GWAS was carried out for yield and five single nucleotide polymorphisms (SNPs) with the highest magnitude of effect were selected in each environment for follow-up analyses. Many candidate genes in proximity of selected SNPs have been previously reported with roles in stress response. Genomic prediction was performed to assess prediction accuracy of previously tested or untested hybrids in environments from a new growing season. Prediction accuracy was 0.34 for cross validation across years (CV0-Predicted EG) and 0.21 for cross validation across years with only untested hybrids (CV00-Predicted EG) when compared to Best Linear Unbiased Prediction (BLUPs) that did not utilize genotypic or environmental relationships. Prediction accuracy improved to 0.80 (CV0-Predicted EG) and 0.60 (CV00-Predicted EG) when compared to the whole-dataset model that used the genomic relationships and the environmental gradient of all environments in the study. These results identify regions of the genome for future selection to improve yield and a methodology to increase the number of hybrids evaluated across locations of a multi-environment trial through genomic prediction.
Affiliation(s)
- Seth A. Tolley
  - Department of Agronomy, Purdue University, West Lafayette, IN, United States
- Luiz F. Brito
  - Department of Animal Sciences, Purdue University, West Lafayette, IN, United States
- Diane R. Wang
  - Department of Agronomy, Purdue University, West Lafayette, IN, United States
|
8
|
Li R, Huang Y, Yang X, Su M, Xiong H, Dai Y, Wu W, Pei X, Yuan Q. Genetic Diversity and Relationship of Shanlan Upland Rice Were Revealed Based on 214 Upland Rice SSR Markers. PLANTS (BASEL, SWITZERLAND) 2023; 12:2876. [PMID: 37571029 PMCID: PMC10421310 DOI: 10.3390/plants12152876] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2023] [Revised: 07/15/2023] [Accepted: 07/31/2023] [Indexed: 08/13/2023]
Abstract
Shanlan upland rice (Oryza sativa L.) is a unique upland rice variety long cultivated by the Li nationality; it has good drought resistance and high value for drought-resistance breeding. To explore the origin of Shanlan upland rice and its genetic relationship with upland rice from other geographical sources, 214 upland rice cultivars from Southeast Asia and five provinces (regions) in southern China were used to study genetic diversity with SSR markers. Twelve SSR primers were screened and 164 alleles (Na) were detected, ranging from 8 to 23 with an average of 13.667. The analysis of genetic diversity and the analysis of molecular variance (AMOVA) showed that the differences among the materials mainly came from variation among individuals. The results of gene flow and genetic differentiation revealed the relationships among the upland rice populations: Hainan Shanlan upland rice presumably originated from upland rice in Guangdong province, and some of it was genetically differentiated from Hunan upland rice. This indirectly supports the view that the Li nationality in Hainan descends from the ancient Baiyue ethnic group, provides circumstantial evidence for the migration history of the Li nationality in Hainan, and provides basic data for the protection of Shanlan upland rice and the innovative utilization of germplasm resources.
Affiliation(s)
- Rongju Li
  - College of Tropical Agriculture and Forestry, Hainan University, Haikou 570228, China; (R.L.); (Y.H.); (X.Y.); (M.S.); (W.W.)
- Yinling Huang
  - College of Tropical Agriculture and Forestry, Hainan University, Haikou 570228, China; (R.L.); (Y.H.); (X.Y.); (M.S.); (W.W.)
- Xinsen Yang
  - College of Tropical Agriculture and Forestry, Hainan University, Haikou 570228, China; (R.L.); (Y.H.); (X.Y.); (M.S.); (W.W.)
- Meng Su
  - College of Tropical Agriculture and Forestry, Hainan University, Haikou 570228, China; (R.L.); (Y.H.); (X.Y.); (M.S.); (W.W.)
- Huaiyang Xiong
  - Hainan Guangling High-Tech Industrial Co., Ltd., Lingshui 572400, China; (H.X.); (Y.D.)
- Yang Dai
  - Hainan Guangling High-Tech Industrial Co., Ltd., Lingshui 572400, China; (H.X.); (Y.D.)
- Wei Wu
  - College of Tropical Agriculture and Forestry, Hainan University, Haikou 570228, China; (R.L.); (Y.H.); (X.Y.); (M.S.); (W.W.)
- Xinwu Pei
  - Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Qianhua Yuan
  - College of Tropical Agriculture and Forestry, Hainan University, Haikou 570228, China; (R.L.); (Y.H.); (X.Y.); (M.S.); (W.W.)
|
9
|
Montesinos-López OA, Ramos-Pulido S, Hernández-Suárez CM, Mosqueda González BA, Valladares-Anguiano FA, Vitale P, Montesinos-López A, Crossa J. A novel method for genomic-enabled prediction of cultivars in new environments. FRONTIERS IN PLANT SCIENCE 2023; 14:1218151. [PMID: 37564390 PMCID: PMC10411573 DOI: 10.3389/fpls.2023.1218151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2023] [Accepted: 07/03/2023] [Indexed: 08/12/2023]
Abstract
Introduction: Genomic selection (GS) has gained global importance due to its potential to accelerate genetic progress and improve the efficiency of breeding programs. Objectives of the research: In this research we propose a method to improve the prediction accuracy of tested lines in new (untested) environments. Method: The new method trains the model with a modified response variable (a difference of response variables) that reduces the mismatch in distribution (non-stationarity) between the training and testing sets and improves prediction accuracy. Comparison of the new and conventional methods: We compared the prediction accuracy of the conventional genomic best linear unbiased prediction (GBLUP) model (M1), with or without genotype × environment interaction (GE) (M1_GE; M1_NO_GE), against the proposed method (M2) on several data sets. Results and discussion: The gain in prediction accuracy of M2 over M1_GE and M1_NO_GE was at least 4.3% in terms of Pearson's correlation, at least 19.5% in terms of the percentage of top-yielding lines captured when the top 10% (Best10) and top 20% (Best20) of lines were selected, and at least 42.29% in terms of normalized root mean squared error (NRMSE).
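The Best10 and Best20 metrics mentioned in the abstract count how many truly top-yielding lines also rank at the top by prediction. A minimal sketch follows; the function name, the rounding rule for the number of selected lines, and the toy values are assumptions, since the abstract does not spell out the exact computation.

```python
def top_fraction_captured(observed, predicted, fraction):
    """Percentage of the truly best lines that also rank best by prediction."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    k = max(1, round(fraction * n))  # number of lines selected (assumed rounding)
    by_observed = sorted(range(n), key=lambda i: observed[i], reverse=True)[:k]
    by_predicted = sorted(range(n), key=lambda i: predicted[i], reverse=True)[:k]
    return 100.0 * len(set(by_observed) & set(by_predicted)) / k


# Hypothetical observed and predicted yields for ten lines.
observed = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
predicted = [9.5, 3.0, 9.9, 7.0, 6.0, 5.0, 4.0, 8.0, 2.0, 1.0]

print(top_fraction_captured(observed, predicted, 0.10))  # Best10
print(top_fraction_captured(observed, predicted, 0.20))  # Best20
```

Unlike a correlation, this metric only looks at the head of the ranking, which is the part a breeder acts on when advancing lines.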
Affiliation(s)
- Sofia Ramos-Pulido
  - Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara, Jalisco, Mexico
- Paolo Vitale
  - International Maize and Wheat Improvement Center (CIMMYT), El Batan, Edo. de México, Mexico
- Abelardo Montesinos-López
  - Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara, Jalisco, Mexico
- José Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), El Batan, Edo. de México, Mexico
  - Colegio de Postgraduados, Montecillos, Edo. de México, Mexico
  - Centre for Crop & Food Innovation, Food Futures Institute, Murdoch University, Perth, WA, Australia
|
10
|
Montesinos-López OA, Crespo-Herrera L, Saint Pierre C, Bentley AR, de la Rosa-Santamaria R, Ascencio-Laguna JA, Agbona A, Gerard GS, Montesinos-López A, Crossa J. Do feature selection methods for selecting environmental covariables enhance genomic prediction accuracy? Front Genet 2023; 14:1209275. [PMID: 37554404 PMCID: PMC10405933 DOI: 10.3389/fgene.2023.1209275] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 04/20/2023] [Accepted: 07/03/2023] [Indexed: 08/10/2023] Open
Abstract
Genomic selection (GS) is transforming plant and animal breeding, but its practical implementation for complex traits and multi-environment trials remains challenging. To address this issue, this study investigates the integration of environmental information with genotypic information in GS. The study proposes the use of two feature selection methods (Pearson's correlation and Boruta) for the integration of environmental information. Results indicate that the simple incorporation of environmental covariates may increase or decrease prediction accuracy depending on the case. However, optimal incorporation of environmental covariates using feature selection significantly improves prediction accuracy in four out of six datasets, by between 14.25% and 218.71% in terms of normalized root mean squared error under a leave-one-environment-out cross-validation scenario, but no relevant gain was observed in terms of Pearson's correlation. In two datasets where environmental covariates are unrelated to the response variable, feature selection is unable to enhance prediction accuracy. Therefore, the study provides empirical evidence supporting the use of feature selection to improve the prediction power of GS.
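A leave-one-environment-out split combined with a correlation filter on environmental covariates can be sketched as follows. This is only an illustration of the general idea, not the authors' pipeline: the row layout, the column names, the 0.5 threshold, and the toy data are all assumptions, and the Boruta method named in the abstract is not shown. The filter is fitted on the training environments only, so the held-out environment does not influence which covariates are kept.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def leave_one_environment_out(rows):
    """Yield (held-out environment, training rows, testing rows)."""
    for env in sorted({row["env"] for row in rows}):
        train = [row for row in rows if row["env"] != env]
        test = [row for row in rows if row["env"] == env]
        yield env, train, test


def select_covariates(train, names, threshold=0.5):
    """Keep covariates whose |correlation| with yield meets the threshold."""
    target = [row["yield"] for row in train]
    return [name for name in names
            if abs(pearson([row[name] for row in train], target)) >= threshold]


# Hypothetical trial data: "rain" tracks yield, "noise" does not.
rows = [
    {"env": "A", "rain": 1.0, "noise": 3.0, "yield": 2.0},
    {"env": "A", "rain": 1.0, "noise": 3.0, "yield": 2.2},
    {"env": "B", "rain": 2.0, "noise": 1.0, "yield": 4.0},
    {"env": "B", "rain": 2.0, "noise": 1.0, "yield": 4.1},
    {"env": "C", "rain": 3.0, "noise": 3.0, "yield": 6.0},
    {"env": "C", "rain": 3.0, "noise": 3.0, "yield": 6.3},
    {"env": "D", "rain": 4.0, "noise": 1.0, "yield": 8.0},
]

for env, train, test in leave_one_environment_out(rows):
    print(env, select_covariates(train, ["rain", "noise"]))
```

A covariate that is unrelated to the response in the training environments is dropped by the filter, which matches the abstract's observation that selection cannot help when the covariates carry no signal.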
Affiliation(s)
- Alison R. Bentley
  - International Maize and Wheat Improvement Center (CIMMYT), El Batan, Mexico
- Afolabi Agbona
  - International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria
  - Molecular & Environmental Plant Sciences, Texas A&M University, College Station, TX, United States
- Guillermo S. Gerard
  - International Maize and Wheat Improvement Center (CIMMYT), El Batan, Mexico
- Abelardo Montesinos-López
  - Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara, JA, Mexico
- José Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), El Batan, Mexico
  - Colegio de Postgraduados, Campus Montecillos, Montecillos, Mexico
|
11
|
Li Z, Gutierrez L. Editorial: Statistical methods for analyzing multiple environmental quantitative genomic data. Front Genet 2023; 14:1212804. [PMID: 37404327 PMCID: PMC10316013 DOI: 10.3389/fgene.2023.1212804] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2023] [Accepted: 06/09/2023] [Indexed: 07/06/2023] Open
Affiliation(s)
- Zitong Li
  - CSIRO Agriculture and Food, Canberra, ACT, Australia
- Lucia Gutierrez
  - Department of Agronomy, University of Wisconsin-Madison, Madison, WI, United States
|
12
|
Montesinos-López OA, Bentley AR, Pierre CS, Crespo-Herrera L, Rebollar-Ruellas L, Valladares-Celis PE, Lillemo M, Montesinos-López A, Crossa J. Efficacy of plant breeding using genomic information. THE PLANT GENOME 2023:e20346. [PMID: 37139645 DOI: 10.1002/tpg2.20346] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2022] [Revised: 04/05/2023] [Accepted: 04/07/2023] [Indexed: 05/05/2023]
Abstract
Genomic selection (GS), proposed by Meuwissen et al. more than 20 years ago, is revolutionizing plant and animal breeding. Although GS has been widely accepted and applied to plant and animal breeding, many factors affect its efficacy. We studied 14 real datasets to answer the practical question of whether prediction accuracy increases when genomic information is used compared with when it is not. We found, across traits, environments, datasets, and metrics, that the average gain in prediction accuracy when genomic information is considered was 26.31%; in terms of Pearson's correlation alone the gain was 46.1%, and in terms of normalized root mean squared error alone it was 6.6%. If the quality of the markers and the relatedness of the individuals increase, major gains in prediction accuracy can be obtained, but if these two factors decrease, a smaller gain is possible. Finally, our findings reinforce that genomic information is vital for improving prediction accuracy and, therefore, the realized genetic gain in genomic-assisted plant breeding programs.
Affiliation(s)
- Alison R Bentley
  - International Maize and Wheat Improvement Center (CIMMYT), Estado de México, México
- Morten Lillemo
  - Department of Plant Sciences, Norwegian University of Life Sciences, Ås, Norway
- Abelardo Montesinos-López
  - Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara, Jalisco, México
- José Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Estado de México, México
  - Colegio de Postgraduados, Montecillos, Estado de México, México
13
Montesinos-López OA, Herr AW, Crossa J, Carter AH. Genomics combined with UAS data enhances prediction of grain yield in winter wheat. Front Genet 2023; 14:1124218. [PMID: 37065497 PMCID: PMC10090417 DOI: 10.3389/fgene.2023.1124218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Accepted: 03/17/2023] [Indexed: 03/31/2023] Open
Abstract
With the human population continuing to increase worldwide, there is pressure to employ novel technologies to increase genetic gain in plant breeding programs that contribute to nutrition and food security. Genomic selection (GS) has the potential to increase genetic gain because it can accelerate the breeding cycle, increase the accuracy of estimated breeding values, and improve selection accuracy. However, recent advances in high-throughput phenotyping in plant breeding programs present an opportunity to integrate genomic and phenotypic data to increase prediction accuracy. In this paper, we applied GS to winter wheat data integrating two types of inputs: genomic and phenotypic. We observed the best accuracy for grain yield when combining both genomic and phenotypic inputs, while using only genomic information fared poorly. In general, predictions with only phenotypic information were very competitive with those using both sources of information, and in many cases using only phenotypic information provided the best accuracy. Our results are encouraging because they show that we can enhance the prediction accuracy of GS by integrating high-quality phenotypic inputs in the models.
Affiliation(s)
- Andrew W. Herr
  - Department of Crop and Soil Sciences, Washington State University, Pullman, WA, United States
- José Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Edo. de México, México
  - Colegio de Postgraduados, Montecillos, Edo. de México, México
- Arron H. Carter
  - Department of Crop and Soil Sciences, Washington State University, Pullman, WA, United States
  - *Correspondence: Arron H. Carter
14
Montesinos-López OA, Mosqueda-González BA, Salinas-Ruiz J, Montesinos-López A, Crossa J. Sparse multi-trait genomic prediction under balanced incomplete block design. THE PLANT GENOME 2023:e20305. [PMID: 36815225 DOI: 10.1002/tpg2.20305] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/18/2022] [Accepted: 01/05/2023] [Indexed: 06/18/2023]
Abstract
Sparse testing is essential to increase the efficiency of the genomic selection methodology, as the same efficiency (in this case prediction power) can be obtained while evaluating fewer genotypes in the field. For this reason, it is important to evaluate the existing methods for allocating lines to environments. With this goal, four methods (M1-M4) to allocate lines to environments were evaluated in the context of a multi-trait genomic prediction problem: M1 denotes the allocation of a fraction (subset) of lines in all locations, M2 denotes the allocation of a fraction of lines with some shared lines in locations but not arranged based on the balanced incomplete block design (BIBD) principle, M3 denotes the random allocation of a subset of lines to locations, and M4 denotes the allocation of a subset of lines to locations using the BIBD principle. The evaluation was done using seven real multi-environment data sets common in plant breeding programs. We found that the best method was M4 and the worst was M1, while no important differences were found between M3 and M4. We concluded that M4 and M3 are efficient in the context of sparse testing for multi-trait prediction.
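To make the difference between two of these allocation schemes concrete, here is a minimal Python sketch of one possible reading of M1 (the same subset of lines tested in every location) and M3 (an independent random subset per location). The function names, the `fraction` parameter, and the toy line and location labels are hypothetical; the BIBD-based M4 and the partially overlapping M2 are not attempted here.

```python
import random


def allocate_m1(lines, locations, fraction, seed=0):
    """M1-style: draw one subset of lines and test it in every location."""
    rng = random.Random(seed)
    k = round(len(lines) * fraction)
    subset = sorted(rng.sample(lines, k))
    return {loc: list(subset) for loc in locations}


def allocate_m3(lines, locations, fraction, seed=0):
    """M3-style: draw an independent random subset of lines for each location."""
    rng = random.Random(seed)
    k = round(len(lines) * fraction)
    return {loc: sorted(rng.sample(lines, k)) for loc in locations}


# Toy example: 20 lines, 3 locations, half of the lines tested per location.
lines = [f"line{i:02d}" for i in range(20)]
locations = ["loc_A", "loc_B", "loc_C"]
m1 = allocate_m1(lines, locations, fraction=0.5)
m3 = allocate_m3(lines, locations, fraction=0.5)

# Lines observed in at least one location under each scheme.
covered_m1 = set().union(*m1.values())
covered_m3 = set().union(*m3.values())
```

Under this reading, M1 never observes lines outside its single subset, whereas M3 spreads phenotyping effort over more lines at the cost of less overlap between locations.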
Affiliation(s)
- Abelardo Montesinos-López
  - Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara, Mexico
- José Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Edo. de México, Mexico
  - Colegio de Postgraduados, Montecillos, Mexico
15
Nguyen VH, Morantte RIZ, Lopena V, Verdeprado H, Murori R, Ndayiragije A, Katiyar SK, Islam MR, Juma RU, Flandez-Galvez H, Glaszmann JC, Cobb JN, Bartholomé J. Multi-environment Genomic Selection in Rice Elite Breeding Lines. RICE (NEW YORK, N.Y.) 2023; 16:7. [PMID: 36752880 PMCID: PMC9908796 DOI: 10.1186/s12284-023-00623-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/04/2022] [Accepted: 01/31/2023] [Indexed: 06/18/2023]
Abstract
BACKGROUND: Assessing the performance of elite lines in target environments is essential for breeding programs to select the most relevant genotypes. One of the main complexities in this task resides in accounting for genotype-by-environment interactions. Genomic prediction models that integrate information from multi-environment trials and environmental covariates can be efficient tools in this context. The objective of this study was to assess the predictive ability of different genomic prediction models to optimize the use of multi-environment information. We used 111 elite breeding lines representing the diversity of the International Rice Research Institute breeding program for irrigated ecosystems. The lines were evaluated for three traits (days to flowering, plant height, and grain yield) in 15 environments in Asia and Africa and genotyped with 882 SNP markers. We evaluated the efficiency of genomic prediction to predict untested environments using seven multi-environment models and three cross-validation scenarios.
RESULTS: The elite lines were found to belong to the indica group and, more specifically, the indica-1B subgroup, which gathered improved material originating from the Green Revolution. Phenotypic correlations between environments were high for days to flowering and plant height (33% and 54% of pairwise correlations greater than 0.5) but low for grain yield (lower than 0.2 in most cases). Clustering analyses based on environmental covariates separated Asia's and Africa's environments into different clusters or subclusters. The predictive abilities ranged from 0.06 to 0.79 for days to flowering, from 0.25 to 0.88 for plant height, and from -0.29 to 0.62 for grain yield. We found that models integrating genotype-by-environment interaction effects did not perform significantly better than models integrating only main effects (genotypes and environment or environmental covariates). The different cross-validation scenarios showed that, in most cases, the use of all available environments gave better results than a subset.
CONCLUSION: Multi-environment genomic prediction models with main effects were sufficient for accurate phenotypic prediction of elite lines in targeted environments. These results will help refine the testing strategy to update the genomic prediction models to improve predictive ability.
Affiliation(s)
- Van Hieu Nguyen
  - CIRAD, UMR AGAP Institut, 34398, Montpellier, France
  - UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
  - Rice Breeding Innovation Platform, International Rice Research Institute, DAPO, Box7777, Metro Manila, Philippines
  - Institute of Crop Science, College of Agriculture and Food Science, University of the Philippines, Los Baños, Laguna, Philippines
- Rose Imee Zhella Morantte
  - Rice Breeding Innovation Platform, International Rice Research Institute, DAPO, Box7777, Metro Manila, Philippines
- Vitaliano Lopena
  - Rice Breeding Innovation Platform, International Rice Research Institute, DAPO, Box7777, Metro Manila, Philippines
- Holden Verdeprado
  - Rice Breeding Innovation Platform, International Rice Research Institute, DAPO, Box7777, Metro Manila, Philippines
- Rosemary Murori
  - Rice Breeding Innovation Platform, International Rice Research Institute, DAPO, Box7777, Metro Manila, Philippines
- Alexis Ndayiragije
  - Rice Breeding Innovation Platform, International Rice Research Institute, DAPO, Box7777, Metro Manila, Philippines
- Sanjay Kumar Katiyar
  - Rice Breeding Innovation Platform, International Rice Research Institute, DAPO, Box7777, Metro Manila, Philippines
- Md Rafiqul Islam
  - Rice Breeding Innovation Platform, International Rice Research Institute, DAPO, Box7777, Metro Manila, Philippines
- Roselyne Uside Juma
  - Rice Breeding Innovation Platform, International Rice Research Institute, DAPO, Box7777, Metro Manila, Philippines
- Hayde Flandez-Galvez
  - Institute of Crop Science, College of Agriculture and Food Science, University of the Philippines, Los Baños, Laguna, Philippines
- Jean-Christophe Glaszmann
  - CIRAD, UMR AGAP Institut, 34398, Montpellier, France
  - UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
- Joshua N Cobb
  - Rice Breeding Innovation Platform, International Rice Research Institute, DAPO, Box7777, Metro Manila, Philippines
  - RiceTec. Inc, PO Box 1305, Alvin, TX, 77512, USA
- Jérôme Bartholomé
  - UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
  - CIRAD, UMR AGAP Institut, Cali, Colombia
  - Alliance Bioversity-CIAT, Cali, Colombia
16
Costa-Neto G, Crespo-Herrera L, Fradgley N, Gardner K, Bentley AR, Dreisigacker S, Fritsche-Neto R, Montesinos-López OA, Crossa J. Envirome-wide associations enhance multi-year genome-based prediction of historical wheat breeding data. G3 (BETHESDA, MD.) 2022; 13:6861853. [PMID: 36454213 PMCID: PMC9911085 DOI: 10.1093/g3journal/jkac313] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 10/10/2022] [Revised: 11/02/2022] [Accepted: 11/03/2022] [Indexed: 12/03/2022]
Abstract
Linking high-throughput environmental data (enviromics) to genomic prediction (GP) is a cost-effective strategy for increasing selection intensity under genotype-by-environment interactions (G × E). This study developed a data-driven approach based on Environment-Phenotype Association (EPA) aimed at recycling important G × E information from historical breeding data. EPA was developed in two applications: (1) scanning a secondary source of genetic variation, weighted from the shared reaction-norms of past-evaluated genotypes and (2) pinpointing weights of the similarity among trial-sites (locations), given the historical impact of each envirotyping data variable for a given site. These results were then used as a dimensionality reduction strategy, integrating historical data to feed multi-environment GP models, which led to the development of four new G × E kernels considering genomics, enviromics, and EPA outcomes. The wheat trial data used included 36 locations, 8 years, and three target populations of environments (TPEs) in India. Four prediction scenarios and six kernel models within/across TPEs were tested. Our results suggest that the conventional GBLUP, without enviromic data or when omitting EPA, is inefficient in predicting the performance of wheat lines in future years. Nevertheless, when EPA was introduced as an intermediary learning step to reduce the dimensionality of the G × E kernels while connecting phenotypic and environmental-wide variation, a significant enhancement of G × E prediction accuracy was evident. EPA revealed that the effect of seasonality makes strategies such as "covariable selection" unfeasible because G × E is year-germplasm specific. We propose that the EPA effectively serves as a "reinforcement learner" algorithm capable of uncovering the effect of seasonality over the reaction-norms, with the benefits of better forecasting the similarities between past and future trialing sites. 
EPA combines the benefits of dimensionality reduction while reducing the uncertainty of genotype-by-year predictions and increasing the resolution of GP for the genotype-specific level.
Affiliation(s)
- Germano Costa-Neto
  - Institute for Genomics Diversity, Cornell University, Ithaca, NY 14853, USA
- Leonardo Crespo-Herrera
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45 Carretera México-Veracruz, El Batan, Edo. de México 5623, Mexico
- Nick Fradgley
  - NIAB, 93 Lawrence Weaver Road, Cambridge CB3 0LE, UK
- Keith Gardner
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45 Carretera México-Veracruz, El Batan, Edo. de México 5623, Mexico
- Alison R Bentley
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45 Carretera México-Veracruz, El Batan, Edo. de México 5623, Mexico
- Susanne Dreisigacker
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45 Carretera México-Veracruz, El Batan, Edo. de México 5623, Mexico
- Osval A Montesinos-López
  - Corresponding authors: Facultad de Telematica, Universidad de Colima, Mexico; and International Maize and Wheat Improvement Center (CIMMYT) and Colegio de Post-Graduados, Mexico.
- Jose Crossa
  - Corresponding authors: Facultad de Telematica, Universidad de Colima, Mexico; and International Maize and Wheat Improvement Center (CIMMYT) and Colegio de Post-Graduados, Mexico.
17
Sandro P, Kucek LK, Sorrells ME, Dawson JC, Gutierrez L. Developing high-quality value-added cereals for organic systems in the US Upper Midwest: hard red winter wheat (Triticum aestivum L.) breeding. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2022; 135:4005-4027. [PMID: 35633380 PMCID: PMC9142347 DOI: 10.1007/s00122-022-04112-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2021] [Accepted: 04/19/2022] [Indexed: 06/15/2023]
Abstract
There is an increased demand for food-grade grains grown sustainably. Hard red winter wheat has comparative advantages for organic farm rotations due to fall soil cover, weed competition, and grain yields. However, limitations of currently available cultivars, such as poor disease resistance, winter hardiness, and baking quality, challenge its adoption and use. Our goal was to develop a participatory hard red winter wheat breeding program for the US Upper Midwest involving farmers, millers, and bakers. Specifically, our goals include (1) an evaluation of genotype-by-environment interaction (GEI) and genotypic stability for both agronomic and quality traits, and (2) the development of on-farm trials as well as baking and sensory evaluations of genotypes to include farmers', millers', and bakers' perspectives in the breeding process. Selection in early generations for diseases and protein content was followed by multi-environment evaluations for agronomic, disease, and quality traits in three locations during five years, on-farm evaluations, baking trials, and sensory evaluations. GEI was substantial for most traits, but no repeatable environmental conditions were significant contributors to GEI, making stability a critical trait for selection. Breeding lines had similar performance in on-station and on-farm trials compared to commercial checks, but some breeding lines were more stable than the checks for agronomic traits, quality traits, and baking performance. These results suggest that stable lines can be developed using a participatory breeding approach under organic management. Crop improvement that explicitly targets sustainable agriculture practices for selection, with farm-to-table participatory perspectives, is critical to achieve long-term sustainable crop production.
KEY MESSAGE: We describe a hard red winter wheat breeding program focused on developing genotypes adapted to organic systems in the US Upper Midwest for high-end artisan baking quality using participatory approaches.
Affiliation(s)
- Pablo Sandro
  - Department of Agronomy, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Mark E Sorrells
  - Plant Breeding and Genetics Section, School of Integrative Plant Sciences, Cornell University, Ithaca, NY, 14853, USA
- Julie C Dawson
  - Department of Horticulture, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Lucia Gutierrez
  - Department of Agronomy, University of Wisconsin-Madison, Madison, WI, 53706, USA
18
Westhues CC, Simianer H, Beissinger TM. learnMET: an R package to apply machine learning methods for genomic prediction using multi-environment trial data. G3 GENES|GENOMES|GENETICS 2022; 12:6705235. [PMID: 36124944 PMCID: PMC9635651 DOI: 10.1093/g3journal/jkac226] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/26/2022] [Accepted: 07/29/2022] [Indexed: 12/04/2022]
Abstract
We introduce the R package learnMET, developed as a flexible framework to enable a collection of analyses on multi-environment trial breeding data with machine learning-based models. learnMET allows the combination of genomic information with environmental data such as climate and/or soil characteristics. Notably, the package offers the possibility of incorporating weather data from field weather stations, or of retrieving global meteorological datasets from a NASA database. Daily weather data can be aggregated over specific periods of time based on naive (for instance, nonoverlapping 10-day windows) or phenological approaches. Different machine learning methods for genomic prediction are implemented, including gradient-boosted decision trees, random forests, stacked ensemble models, and multilayer perceptrons. These prediction models can be evaluated in a user-friendly way via a collection of cross-validation schemes that mimic typical scenarios encountered by plant breeders working with multi-environment trial experimental data. The package is published under an MIT license and accessible on GitHub.
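The "naive" aggregation mentioned in this abstract can be sketched in a few lines. The Python example below is not learnMET code (learnMET is an R package and its API is not reproduced here); it is a minimal illustration, with a hypothetical function name, of averaging a daily weather series over nonoverlapping 10-day windows. Dropping a trailing partial window is an assumption of this sketch.

```python
from statistics import mean


def aggregate_windows(daily_values, window=10):
    """Average a daily series over consecutive nonoverlapping windows.

    A trailing window with fewer than `window` days is dropped
    (an assumption of this sketch, not a documented learnMET behavior).
    """
    n_full = len(daily_values) // window
    return [
        mean(daily_values[i * window:(i + 1) * window])
        for i in range(n_full)
    ]


# Toy series: 35 daily temperatures 1.0, 2.0, ..., 35.0 for one environment.
daily_temp = [float(day) for day in range(1, 36)]
features = aggregate_windows(daily_temp)
# → [5.5, 15.5, 25.5]
```

Each window mean then becomes one environmental covariate for that environment, so a 35-day series yields three features here.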
Affiliation(s)
- Cathy C Westhues
  - Division of Plant Breeding Methodology, Department of Crop Sciences, University of Goettingen, 37075 Goettingen, Germany
  - Center for Integrated Breeding Research, University of Goettingen, 37075 Goettingen, Germany
- Henner Simianer
  - Center for Integrated Breeding Research, University of Goettingen, 37075 Goettingen, Germany
  - Animal Breeding and Genetics Group, Department of Animal Sciences, University of Goettingen, 37075 Goettingen, Germany
- Timothy M Beissinger
  - Division of Plant Breeding Methodology, Department of Crop Sciences, University of Goettingen, 37075 Goettingen, Germany
  - Center for Integrated Breeding Research, University of Goettingen, 37075 Goettingen, Germany
19
Montesinos-López OA, Montesinos-López A, Bernal Sandoval DA, Mosqueda-Gonzalez BA, Valenzo-Jiménez MA, Crossa J. Multi-trait genome prediction of new environments with partial least squares. Front Genet 2022; 13:966775. [PMID: 36134027 PMCID: PMC9483856 DOI: 10.3389/fgene.2022.966775] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/11/2022] [Accepted: 07/18/2022] [Indexed: 11/18/2022] Open
Abstract
The genomic selection (GS) methodology proposed over 20 years ago by Meuwissen et al. (Genetics, 2001) has revolutionized plant breeding. A predictive methodology that trains statistical machine learning algorithms with phenotypic and genotypic data of a reference population and makes predictions for genotyped candidate lines, GS saves significant resources in the selection of candidate individuals. However, its practical implementation is still challenging when the plant breeder is interested in the prediction of future seasons or new locations and/or environments, which is called the "leave one environment out" issue. Furthermore, because the distributions of the training and testing set do not match, most statistical machine learning methods struggle to produce moderate or reasonable prediction accuracies. For this reason, the main objective of this study was to explore the use of the multi-trait partial least square (MT-PLS) regression methodology for this specific task, benchmarking its performance with the Bayesian Multi-trait Genomic Best Linear Unbiased Predictor (MT-GBLUP) method. The benchmarking process was performed with five actual data sets. We found that in all data sets the MT-PLS method outperformed the popular MT-GBLUP method by 349.8% (under predictor E + G), 484.4% (under predictor E + G + GE; where E denotes environments, G genotypes and GE the genotype by environment interaction) and 15.9% (under predictor G + GE) across traits. Our results provide empirical evidence of the power of the MT-PLS methodology for the prediction of future seasons or new environments. Furthermore, the comparison between single univariate-trait (UT) versus MT for GBLUP and PLS gave an increase in prediction accuracy of MT-GBLUP versus UT-GBLUP, but not for MT-PLS versus UT-PLS.
Affiliation(s)
- Abelardo Montesinos-López
  - Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara, Mexico
- Marco Alberto Valenzo-Jiménez
  - Universidad Michoacana de San Nicolas de Hidalgo (UMSNH), Avenida Francisco J. Mujica S/N Ciudad Universitaria, Morelia, MC, Mexico
- José Crossa
  - International Maize and Wheat Improvement Center, Texcoco, Edo. de Mexico, Mexico
  - Colegio de Postgraduados, Montecillos, Edo. de Mexico, Mexico
20
Montesinos-López OA, Montesinos-López A, Cano-Paez B, Hernández-Suárez CM, Santana-Mancilla PC, Crossa J. A Comparison of Three Machine Learning Methods for Multivariate Genomic Prediction Using the Sparse Kernels Method (SKM) Library. Genes (Basel) 2022; 13:1494. [PMID: 36011405 PMCID: PMC9407886 DOI: 10.3390/genes13081494] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2022] [Revised: 08/10/2022] [Accepted: 08/19/2022] [Indexed: 11/30/2022] Open
Abstract
Genomic selection (GS) changed the way plant breeders select genotypes. GS takes advantage of phenotypic and genotypic information to train a statistical machine learning model, which is used to predict phenotypic (or breeding) values of new lines for which only genotypic information is available. Therefore, many statistical machine learning methods have been proposed for this task. Multi-trait (MT) genomic prediction models take advantage of correlated traits to improve prediction accuracy. Therefore, some multivariate statistical machine learning methods are popular for GS. In this paper, we compare the prediction performance of three MT methods: the MT genomic best linear unbiased predictor (GBLUP), the MT partial least squares (PLS) and the MT random forest (RF) methods. Benchmarking was performed with six real datasets. We found that the three investigated methods produce similar results, but under predictors with genotype (G) and environment (E), that is, E + G, the MT GBLUP achieved superior performance, whereas under predictors E + G + genotype × environment (GE) and G + GE, random forest achieved the best results. We also found that the best predictions were achieved under the predictors E + G and E + G + GE. Here, we also provide the R code for the implementation of these three statistical machine learning methods in the sparse kernel method (SKM) library, which offers not only options for single-trait prediction with various statistical machine learning methods but also some options for MT predictions that can help to better capture the complex patterns in datasets that are common in genomic selection.
Affiliation(s)
- Abelardo Montesinos-López
  - Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara 44100, Mexico
- Bernabe Cano-Paez
  - Facultad de Ciencias, Universidad Nacional Autónoma de México (UNAM), México City 04510, Mexico
- Carlos Moisés Hernández-Suárez
  - Instituto de Ciencias Tecnología e Innovación, Universidad Francisco Gavidia, El Progreso St., No. 2748, Colonia Flor Blanca, San Salvador CP 1101, El Salvador
- José Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Texcoco 56237, Mexico
  - Colegio de Postgraduados, Montecillo 56230, Mexico
21
Montesinos-López OA, Montesinos-López A, Kismiantini, Roman-Gallardo A, Gardner K, Lillemo M, Fritsche-Neto R, Crossa J. Partial Least Squares Enhances Genomic Prediction of New Environments. Front Genet 2022; 13:920689. [PMID: 36313422 PMCID: PMC9608852 DOI: 10.3389/fgene.2022.920689] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/14/2022] [Accepted: 05/19/2022] [Indexed: 12/01/2022] Open
Abstract
In plant breeding, improving the prediction of future seasons or new locations and/or environments, also denoted as “leave one environment out,” is of paramount importance to increase the genetic gain in breeding programs and contribute to food and nutrition security worldwide. Genomic selection (GS) has the potential to increase prediction accuracy for future seasons or new locations because it is a predictive methodology. However, most statistical machine learning methods used for the task of predicting a new environment or season struggle to produce moderate or high prediction accuracies. For this reason, in this study we explore the use of the partial least squares (PLS) regression methodology for this specific task, and we benchmark its performance against the Bayesian Genomic Best Linear Unbiased Predictor (GBLUP) method. The benchmarking process was done with 14 real datasets. We found that in all datasets the PLS method outperformed the popular GBLUP method by margins between 0% (in the Indica data) and 228.28% (in the Disease data) across traits, environments, and types of predictors. Our results provide strong empirical evidence of the power of the PLS methodology for the prediction of future seasons or new environments.
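The "leave one environment out" scheme referred to in this abstract can be illustrated with a short Python sketch. It only shows how records would be split so that each environment is predicted from all the others; it does not fit PLS or GBLUP, and the function name and toy labels are hypothetical.

```python
def leave_one_environment_out(environments):
    """Yield (held_out_env, train_idx, test_idx) for each distinct environment.

    `environments` holds one environment label per phenotypic record.
    """
    for env in sorted(set(environments)):
        train_idx = [i for i, e in enumerate(environments) if e != env]
        test_idx = [i for i, e in enumerate(environments) if e == env]
        yield env, train_idx, test_idx


# Toy example: five records observed in three environments.
envs = ["E1", "E1", "E2", "E2", "E3"]
folds = list(leave_one_environment_out(envs))
# folds[0] → ("E1", [2, 3, 4], [0, 1])
```

Because the held-out environment contributes no records to training, a model evaluated this way has to extrapolate to conditions it has not seen, which is why the task is described as difficult.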
22
Atanda SA, Govindan V, Singh R, Robbins KR, Crossa J, Bentley AR. Sparse testing using genomic prediction improves selection for breeding targets in elite spring wheat. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2022; 135:1939-1950. [PMID: 35348821 PMCID: PMC9205816 DOI: 10.1007/s00122-022-04085-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/10/2022] [Accepted: 03/16/2022] [Indexed: 06/08/2023]
Abstract
Sparse testing using genomic prediction can be efficiently used to increase the number of testing environments while maintaining selection intensity in the early yield testing stage without increasing the breeding budget. Sparse testing using genomic prediction enables expanded use of selection environments in early-stage yield testing without increasing phenotyping cost. We evaluated different sparse testing strategies in the yield testing stage of a CIMMYT spring wheat breeding pipeline characterized by multiple populations each with small family sizes of 1-9 individuals. Our results indicated that a substantial overlap between lines across environments should be used to achieve optimal prediction accuracy. As sparse testing leverages information generated within and across environments, the genetic correlations between environments and genomic relationships of lines across environments were the main drivers of prediction accuracy in multi-environment yield trials. Including information from previous evaluation years did not consistently improve the prediction performance. Genomic best linear unbiased prediction was found to be the best predictor of true breeding value, and therefore, we propose that it should be used as a selection decision metric in the early yield testing stages. We also propose it as a proxy for assessing prediction performance to mirror breeder's advancement decisions in a breeding program so that it can be readily applied for advancement decisions by breeding programs.
Affiliation(s)
- Velu Govindan
  - International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
- Ravi Singh
  - International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
- Kelly R Robbins
  - Section of Plant Breeding and Genetics, School of Integrative Plant Sciences, Cornell University, Ithaca, NY, USA
- Jose Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
- Alison R Bentley
  - International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
23
Zhang B, Ma L, Wu B, Xing Y, Qiu X. Introgression Lines: Valuable Resources for Functional Genomics Research and Breeding in Rice (Oryza sativa L.). FRONTIERS IN PLANT SCIENCE 2022; 13:863789. [PMID: 35557720 PMCID: PMC9087921 DOI: 10.3389/fpls.2022.863789] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/27/2022] [Accepted: 04/01/2022] [Indexed: 05/14/2023]
Abstract
The narrow base of genetic diversity of modern rice varieties is mainly attributed to the overuse of the common backbone parents that leads to the lack of varied favorable alleles in the process of breeding new varieties. Introgression lines (ILs) developed by a backcross strategy combined with marker-assisted selection (MAS) are powerful prebreeding tools for broadening the genetic base of existing cultivars. They have high power for mapping quantitative trait loci (QTLs) either with major or minor effects, and are used for precisely evaluating the genetic effects of QTLs and detecting the gene-by-gene or gene-by-environment interactions due to their low genetic background noise. ILs developed from multiple donors in a fixed background can be used as an IL platform to identify the best alleles or allele combinations for breeding by design. In the present paper, we reviewed the recent achievements from ILs in rice functional genomics research and breeding, including the genetic dissection of complex traits, identification of elite alleles and background-independent and epistatic QTLs, analysis of genetic interaction, and genetic improvement of single and multiple target traits. We also discussed how to develop ILs for further identification of new elite alleles, and how to utilize IL platforms for rice genetic improvement.
Affiliation(s)
- Bo Zhang
  - National Key Laboratory of Crop Genetic Improvement and National Center of Plant Gene Research, Huazhong Agricultural University, Wuhan, China
- Ling Ma
  - National Key Laboratory of Crop Genetic Improvement and National Center of Plant Gene Research, Huazhong Agricultural University, Wuhan, China
- Bi Wu
  - National Key Laboratory of Crop Genetic Improvement and National Center of Plant Gene Research, Huazhong Agricultural University, Wuhan, China
- Yongzhong Xing
  - National Key Laboratory of Crop Genetic Improvement and National Center of Plant Gene Research, Huazhong Agricultural University, Wuhan, China
- Xianjin Qiu
  - College of Agriculture, Yangtze University, Jingzhou, China
24
Sandhu KS, Patil SS, Aoun M, Carter AH. Multi-Trait Multi-Environment Genomic Prediction for End-Use Quality Traits in Winter Wheat. Front Genet 2022; 13:831020. [PMID: 35173770 PMCID: PMC8841657 DOI: 10.3389/fgene.2022.831020] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 12/07/2021] [Accepted: 01/06/2022] [Indexed: 11/13/2022]
Abstract
Soft white wheat is a wheat class used in foreign and domestic markets to make various end products requiring specific quality attributes. Due to associated cost, time, and amount of seed needed, phenotyping for the end-use quality trait is delayed until later generations. Previously, we explored the potential of using genomic selection (GS) for selecting superior genotypes earlier in the breeding program. Breeders typically measure multiple traits across various locations, and it opens up the avenue for exploring multi-trait-based GS models. This study's main objective was to explore the potential of using multi-trait GS models for predicting seven different end-use quality traits using cross-validation, independent prediction, and across-location predictions in a wheat breeding program. The population used consisted of 666 soft white wheat genotypes planted for 5 years at two locations in Washington, United States. We optimized and compared the performances of four uni-trait- and multi-trait-based GS models, namely, Bayes B, genomic best linear unbiased prediction (GBLUP), multilayer perceptron (MLP), and random forests. The prediction accuracies for multi-trait GS models were 5.5 and 7.9% superior to uni-trait models for the within-environment and across-location predictions. Multi-trait machine and deep learning models performed superior to GBLUP and Bayes B for across-location predictions, but their advantages diminished when the genotype by environment component was included in the model. The highest improvement in prediction accuracy, that is, 35% was obtained for flour protein content with the multi-trait MLP model. This study showed the potential of using multi-trait-based GS models to enhance prediction accuracy by using information from previously phenotyped traits. It would assist in speeding up the breeding cycle time in a cost-friendly manner.
Affiliation(s)
- Karansher S. Sandhu
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA, United States
- Shruti Sunil Patil
- School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA, United States
- Meriem Aoun
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA, United States
- Arron H. Carter
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA, United States
25
Sandhu KS, Merrick LF, Sankaran S, Zhang Z, Carter AH. Prospectus of Genomic Selection and Phenomics in Cereal, Legume and Oilseed Breeding Programs. Front Genet 2022. [PMCID: PMC8814369 DOI: 10.3389/fgene.2021.829131] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Indexed: 11/13/2022]
Abstract
The last decade witnessed an unprecedented increase in the adoption of genomic selection (GS) and phenomics tools in plant breeding programs, especially in major cereal crops. GS has demonstrated the potential for selecting superior genotypes with high precision and accelerating the breeding cycle. Phenomics is a rapidly advancing domain to alleviate phenotyping bottlenecks and explores new large-scale phenotyping and data acquisition methods. In this review, we discuss the lesson learned from GS and phenomics in six self-pollinated crops, primarily focusing on rice, wheat, soybean, common bean, chickpea, and groundnut, and their implementation schemes are discussed after assessing their impact in the breeding programs. Here, the status of the adoption of genomics and phenomics is provided for those crops, with a complete GS overview. GS’s progress until 2020 is discussed in detail, and relevant information and links to the source codes are provided for implementing this technology into plant breeding programs, with most of the examples from wheat breeding programs. Detailed information about various phenotyping tools is provided to strengthen the field of phenomics for a plant breeder in the coming years. Finally, we highlight the benefits of merging genomic selection, phenomics, and machine and deep learning that have resulted in extraordinary results during recent years in wheat, rice, and soybean. Hence, there is a potential for adopting these technologies into crops like the common bean, chickpea, and groundnut. The adoption of phenomics and GS into different breeding programs will accelerate genetic gain that would create an impact on food security, realizing the need to feed an ever-growing population.
Affiliation(s)
- Karansher S. Sandhu
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA, United States
- *Correspondence: Karansher S. Sandhu,
- Lance F. Merrick
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA, United States
- Sindhuja Sankaran
- Department of Biological System Engineering, Washington State University, Pullman, WA, United States
- Zhiwu Zhang
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA, United States
- Arron H. Carter
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA, United States
26
Coulibaly M, Bodjrenou G, Akohoue F, Agoyi EE, Merinosy Francisco FM, Agossou COA, Sawadogo M, Achigan-Dako EG. Profiling Cultivars Development in Kersting's Groundnut [Macrotyloma geocarpum (Harms) Maréchal and Baudet] for Improved Yield, Higher Nutrient Content, and Adaptation to Current and Future Climates. FRONTIERS IN SUSTAINABLE FOOD SYSTEMS 2022. [DOI: 10.3389/fsufs.2021.759575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
Kersting's groundnut [Macrotyloma geocarpum (Harms.) Maréchal and Baudet], Fabaceae, is an important source of protein and essential amino acids. As a grain legume species, it also contributes to improving soil fertility through symbiotic nitrogen fixation. However, the crop is characterized by a relatively low yield (≤500 kg/ha), and limited progress has been made so far, toward the development of high-yielding cultivars that can enhance and sustain its productivity. Recently, there was an increased interest in alleviating the burdens related to Kersting's groundnut (KG) cultivation through the development of improved varieties. Preliminary investigations assembled germplasms from various producing countries. In-depth ethnobotanical studies and insightful investigation on the reproductive biology of the species were undertaken alongside morphological, biochemical, and molecular characterizations. Those studies revealed a narrow genetic base for KG. In addition, the self-pollinating nature of its flowers prevents cross-hybridization and represents a major barrier limiting the broadening of the genetic basis. Therefore, the development of a research pipeline to address the bottlenecks specific to KG is a prerequisite for the successful expansion of the crop. 
In this paper, we offer an overview of the current state of research on KG and pinpoint the knowledge gaps. We define and discuss the main steps of breeding for KG cultivar development, which include (i) developing an integrated genebank, inclusive germplasm, and seed system management; (ii) assessing end-users' preferences and the possibility for industrial exploitation of the crop; (iii) identifying biotic and abiotic stressors and the genetic control of traits responsive to those factors; (iv) overcoming the cross-pollination challenges in KG to propel the development of hybrids; (v) developing new approaches to create variability and setting adequate cultivars and breeding approaches; (vi) karyotyping and draft genome analysis to accelerate cultivar development and increase genetic gains; and (vii) evaluating the adaptability and stability of cultivars across various ecological regions.
27
Bartholomé J, Prakash PT, Cobb JN. Genomic Prediction: Progress and Perspectives for Rice Improvement. Methods Mol Biol 2022; 2467:569-617. [PMID: 35451791 DOI: 10.1007/978-1-0716-2205-6_21] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 06/14/2023]
Abstract
Genomic prediction can be a powerful tool to achieve greater rates of genetic gain for quantitative traits if thoroughly integrated into a breeding strategy. In rice as in other crops, the interest in genomic prediction is very strong with a number of studies addressing multiple aspects of its use, ranging from the more conceptual to the more practical. In this chapter, we review the literature on rice (Oryza sativa) and summarize important considerations for the integration of genomic prediction in breeding programs. The irrigated breeding program at the International Rice Research Institute is used as a concrete example on which we provide data and R scripts to reproduce the analysis but also to highlight practical challenges regarding the use of predictions. The adage "To someone with a hammer, everything looks like a nail" describes a common psychological pitfall that sometimes plagues the integration and application of new technologies to a discipline. We have designed this chapter to help rice breeders avoid that pitfall and appreciate the benefits and limitations of applying genomic prediction, as it is not always the best approach nor the first step to increasing the rate of genetic gain in every context.
Affiliation(s)
- Jérôme Bartholomé
- CIRAD, UMR AGAP Institut, Montpellier, France.
- AGAP Institut, Univ Montpellier, CIRAD, INRAE, Montpellier SupAgro, Montpellier, France.
- Rice Breeding Platform, International Rice Research Institute, Manila, Philippines.
28
Vandermeulen MD, Cullen PJ. Gene by Environment Interactions reveal new regulatory aspects of signaling network plasticity. PLoS Genet 2022; 18:e1009988. [PMID: 34982769 PMCID: PMC8759647 DOI: 10.1371/journal.pgen.1009988] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2021] [Revised: 01/14/2022] [Accepted: 12/09/2021] [Indexed: 11/18/2022]
Abstract
Phenotypes can change during exposure to different environments through the regulation of signaling pathways that operate in integrated networks. How signaling networks produce different phenotypes in different settings is not fully understood. Here, Gene by Environment Interactions (GEIs) were used to explore the regulatory network that controls filamentous/invasive growth in the yeast Saccharomyces cerevisiae. GEI analysis revealed that the regulation of invasive growth is decentralized and varies extensively across environments. Different regulatory pathways were critical or dispensable depending on the environment, microenvironment, or time point tested, and the pathway that made the strongest contribution changed depending on the environment. Some regulators even showed conditional role reversals. Ranking pathways' roles across environments revealed an under-appreciated pathway (OPI1) as the single strongest regulator among the major pathways tested (RAS, RIM101, and MAPK). One mechanism that may explain the high degree of regulatory plasticity observed was conditional pathway interactions, such as conditional redundancy and conditional cross-pathway regulation. Another mechanism was that different pathways conditionally and differentially regulated gene expression, such as target genes that control separate cell adhesion mechanisms (FLO11 and SFG1). An exception to decentralized regulation of invasive growth was that morphogenetic changes (cell elongation and budding pattern) were primarily regulated by one pathway (MAPK). GEI analysis also uncovered a round-cell invasion phenotype. Our work suggests that GEI analysis is a simple and powerful approach to define the regulatory basis of complex phenotypes and may be applicable to many systems.
Affiliation(s)
- Matthew D. Vandermeulen
- Department of Biological Sciences, University at Buffalo, Buffalo, New York, United States of America
- Paul J. Cullen
- Department of Biological Sciences, University at Buffalo, Buffalo, New York, United States of America
29
Rogers AR, Holland JB. Environment-specific genomic prediction ability in maize using environmental covariates depends on environmental similarity to training data. G3 (BETHESDA, MD.) 2021; 12:6486423. [PMID: 35100364 PMCID: PMC9245610 DOI: 10.1093/g3journal/jkab440] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 11/09/2021] [Accepted: 12/06/2021] [Indexed: 12/30/2022]
Abstract
Technology advances have made possible the collection of a wealth of genomic, environmental, and phenotypic data for use in plant breeding. Incorporation of environmental data into environment-specific genomic prediction is hindered in part because of inherently high data dimensionality. Computationally efficient approaches to combining genomic and environmental information may facilitate extension of genomic prediction models to new environments and germplasm, and better understanding of genotype-by-environment (G × E) interactions. Using genomic, yield trial, and environmental data on 1,918 unique hybrids evaluated in 59 environments from the maize Genomes to Fields project, we determined that a set of 10,153 SNP dominance coefficients and a 5-day temporal window size for summarizing environmental variables were optimal for genomic prediction using only genetic and environmental main effects. Adding marker-by-environment variable interactions required dimension reduction, and we found that reducing dimensionality of the genetic data while keeping the full set of environmental covariates was best for environment-specific genomic prediction of grain yield, leading to an increase in prediction ability of 2.7% to achieve a prediction ability of 80% across environments when data were masked at random. We then measured how prediction ability within environments was affected under stratified training-testing sets to approximate scenarios commonly encountered by plant breeders, finding that incorporation of marker-by-environment effects improved prediction ability in cases where training and test sets shared environments, but did not improve prediction in new untested environments. The environmental similarity between training and testing sets had a greater impact on the efficacy of prediction than genetic similarity between training and test sets.
Affiliation(s)
- Anna R Rogers
- Program in Genetics, North Carolina State University, Raleigh, NC 27695, USA
- James B Holland
- Program in Genetics, North Carolina State University, Raleigh, NC 27695, USA; USDA-ARS Plant Science Research Unit, North Carolina State University, Raleigh, NC 27695, USA; Department of Crop and Soil Sciences, North Carolina State University, Raleigh, NC 27695, USA; Corresponding author: Department of Agriculture-Agriculture Research Service, Box 7620 North Carolina State University, Raleigh, NC 27695-7620, USA.
30
Westhues CC, Mahone GS, da Silva S, Thorwarth P, Schmidt M, Richter JC, Simianer H, Beissinger TM. Prediction of Maize Phenotypic Traits With Genomic and Environmental Predictors Using Gradient Boosting Frameworks. FRONTIERS IN PLANT SCIENCE 2021; 12:699589. [PMID: 34880880 PMCID: PMC8647909 DOI: 10.3389/fpls.2021.699589] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 04/23/2021] [Accepted: 10/15/2021] [Indexed: 05/26/2023]
Abstract
The development of crop varieties with stable performance in future environmental conditions represents a critical challenge in the context of climate change. Environmental data collected at the field level, such as soil and climatic information, can be relevant to improve predictive ability in genomic prediction models by describing more precisely genotype-by-environment interactions, which represent a key component of the phenotypic response for complex crop agronomic traits. Modern predictive modeling approaches can efficiently handle various data types and are able to capture complex nonlinear relationships in large datasets. In particular, machine learning techniques have gained substantial interest in recent years. Here we examined the predictive ability of machine learning-based models for two phenotypic traits in maize using data collected by the Maize Genomes to Fields (G2F) Initiative. The data we analyzed consisted of multi-environment trials (METs) dispersed across the United States and Canada from 2014 to 2017. An assortment of soil- and weather-related variables was derived and used in prediction models alongside genotypic data. Linear random effects models were compared to a linear regularized regression method (elastic net) and to two nonlinear gradient boosting methods based on decision tree algorithms (XGBoost, LightGBM). These models were evaluated under four prediction problems: (1) tested and new genotypes in a new year; (2) only unobserved genotypes in a new year; (3) tested and new genotypes in a new site; (4) only unobserved genotypes in a new site. Accuracy in forecasting grain yield performance of new genotypes in a new year was improved by up to 20% over the baseline model by including environmental predictors with gradient boosting methods. For plant height, an enhancement of predictive ability could neither be observed by using machine learning-based methods nor by using detailed environmental information. 
An investigation of key environmental factors using gradient boosting frameworks also revealed that temperature at flowering stage, frequency and amount of water received during the vegetative and grain filling stage, and soil organic matter content appeared as important predictors for grain yield in our panel of environments.
Affiliation(s)
- Cathy C. Westhues
- Division of Plant Breeding Methodology, Department of Crop Sciences, University of Goettingen, Goettingen, Germany
- Center for Integrated Breeding Research, University of Goettingen, Goettingen, Germany
- Sofia da Silva
- Kleinwanzlebener Saatzucht (KWS) SAAT SE, Einbeck, Germany
- Malthe Schmidt
- Kleinwanzlebener Saatzucht (KWS) SAAT SE, Einbeck, Germany
- Henner Simianer
- Center for Integrated Breeding Research, University of Goettingen, Goettingen, Germany
- Animal Breeding and Genetics Group, Department of Animal Sciences, University of Goettingen, Goettingen, Germany
- Timothy M. Beissinger
- Division of Plant Breeding Methodology, Department of Crop Sciences, University of Goettingen, Goettingen, Germany
- Center for Integrated Breeding Research, University of Goettingen, Goettingen, Germany
31
Mahadevaiah C, Appunu C, Aitken K, Suresha GS, Vignesh P, Mahadeva Swamy HK, Valarmathi R, Hemaprabha G, Alagarasan G, Ram B. Genomic Selection in Sugarcane: Current Status and Future Prospects. FRONTIERS IN PLANT SCIENCE 2021; 12:708233. [PMID: 34646284 PMCID: PMC8502939 DOI: 10.3389/fpls.2021.708233] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 05/11/2021] [Accepted: 08/24/2021] [Indexed: 05/18/2023]
Abstract
Sugarcane is a C4 and agro-industry-based crop with a high potential for biomass production. It serves as raw material for the production of sugar, ethanol, and electricity. Modern sugarcane varieties are derived from the interspecific and intergeneric hybridization between Saccharum officinarum, Saccharum spontaneum, and other wild relatives. Sugarcane breeding programmes are broadly categorized into germplasm collection and characterization, pre-breeding and genetic base-broadening, and varietal development programmes. The varietal identification through the classic breeding programme requires a minimum of 12-14 years. The precise phenotyping in sugarcane is extremely tedious due to the high propensity of lodging and suckering owing to the influence of environmental factors and crop management practices. This kind of phenotyping requires data from both plant crop and ratoon experiments conducted over locations and seasons. In this review, we explored the feasibility of genomic selection schemes for various breeding programmes in sugarcane. The genetic diversity analysis using genome-wide markers helps in the formation of core set germplasm representing the total genomic diversity present in the Saccharum gene bank. The genome-wide association studies and genomic prediction in the Saccharum gene bank are helpful to identify the complete genomic resources for cane yield, commercial cane sugar, tolerances to biotic and abiotic stresses, and other agronomic traits. The implementation of genomic selection in pre-breeding, genetic base-broadening programmes assist in precise introgression of specific genes and recurrent selection schemes enhance the higher frequency of favorable alleles in the population with a considerable reduction in breeding cycles and population size. The integration of environmental covariates and genomic prediction in multi-environment trials assists in the prediction of varietal performance for different agro-climatic zones. 
This review also directed its focus on enhancing the genetic gain over time, cost, and resource allocation at various stages of breeding programmes.
Affiliation(s)
- Chinnaswamy Appunu
- Division of Crop Improvement, ICAR-Sugarcane Breeding Institute, Coimbatore, India
- Karen Aitken
- CSIRO (Commonwealth Scientific and Industrial Research Organization), St. Lucia, QLD, Australia
- Palanisamy Vignesh
- Division of Crop Improvement, ICAR-Sugarcane Breeding Institute, Coimbatore, India
- Govind Hemaprabha
- Division of Crop Improvement, ICAR-Sugarcane Breeding Institute, Coimbatore, India
- Ganesh Alagarasan
- Division of Crop Improvement, ICAR-Sugarcane Breeding Institute, Coimbatore, India
- Bakshi Ram
- Division of Crop Improvement, ICAR-Sugarcane Breeding Institute, Coimbatore, India
32
Sandhu KS, Aoun M, Morris CF, Carter AH. Genomic Selection for End-Use Quality and Processing Traits in Soft White Winter Wheat Breeding Program with Machine and Deep Learning Models. BIOLOGY 2021; 10:689. [PMID: 34356544 PMCID: PMC8301459 DOI: 10.3390/biology10070689] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 05/30/2021] [Revised: 07/13/2021] [Accepted: 07/17/2021] [Indexed: 01/12/2023]
Abstract
Breeding for grain yield, biotic and abiotic stress resistance, and end-use quality are important goals of wheat breeding programs. Screening for end-use quality traits is usually secondary to grain yield due to high labor needs, cost of testing, and large seed requirements for phenotyping. Genomic selection provides an alternative to predict performance using genome-wide markers under forward and across-location predictions, where a previous year's dataset can be used to build the models. Due to large datasets in breeding programs, we explored the potential of machine and deep learning models to predict fourteen end-use quality traits in a winter wheat breeding program. The population used consisted of 666 wheat genotypes screened for five years (2015-19) at two locations (Pullman and Lind, WA, USA). Nine different models, including two machine learning (random forest and support vector machine) and two deep learning models (convolutional neural network and multilayer perceptron), were explored for cross-validation, forward, and across-location predictions. The prediction accuracies for different traits varied from 0.45-0.81, 0.29-0.55, and 0.27-0.50 under cross-validation, forward, and across-location predictions, respectively. In general, forward prediction accuracies kept increasing over time as the training data size grew, a trend that was more evident for machine and deep learning models. Deep learning models were superior to the traditional ridge regression best linear unbiased prediction (RRBLUP) and Bayesian models under all prediction scenarios. The high accuracy observed for end-use quality traits in this study supports predicting them in early generations, leading to the advancement of superior genotypes to more extensive grain yield trials. Furthermore, the superior performance of machine and deep learning models strengthens the idea of including them in large-scale breeding programs for predicting complex traits.
Affiliation(s)
- Karansher Singh Sandhu
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA 99164, USA
- Meriem Aoun
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA 99164, USA
- Craig F. Morris
- USDA-ARS Western Wheat Quality Laboratory, E-202 Food Quality Building, Washington State University, Pullman, WA 99164, USA
- Arron H. Carter
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA 99164, USA
33
Rogers AR, Dunne JC, Romay C, Bohn M, Buckler ES, Ciampitti IA, Edwards J, Ertl D, Flint-Garcia S, Gore MA, Graham C, Hirsch CN, Hood E, Hooker DC, Knoll J, Lee EC, Lorenz A, Lynch JP, McKay J, Moose SP, Murray SC, Nelson R, Rocheford T, Schnable JC, Schnable PS, Sekhon R, Singh M, Smith M, Springer N, Thelen K, Thomison P, Thompson A, Tuinstra M, Wallace J, Wisser RJ, Xu W, Gilmour AR, Kaeppler SM, De Leon N, Holland JB. The importance of dominance and genotype-by-environment interactions on grain yield variation in a large-scale public cooperative maize experiment. G3-GENES GENOMES GENETICS 2021; 11:6062399. [PMID: 33585867 DOI: 10.1093/g3journal/jkaa050] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 07/10/2020] [Accepted: 11/07/2020] [Indexed: 11/12/2022]
Abstract
High-dimensional and high-throughput genomic, field performance, and environmental data are becoming increasingly available to crop breeding programs, and their integration can facilitate genomic prediction within and across environments and provide insights into the genetic architecture of complex traits and the nature of genotype-by-environment interactions. To partition trait variation into additive and dominance (main effect) genetic and corresponding genetic-by-environment variances, and to identify specific environmental factors that influence genotype-by-environment interactions, we curated and analyzed genotypic and phenotypic data on 1918 maize (Zea mays L.) hybrids and environmental data from 65 testing environments. For grain yield, dominance variance was similar in magnitude to additive variance, and genetic-by-environment variances were more important than genetic main effect variances. Models involving both additive and dominance relationships best fit the data and modeling unique genetic covariances among all environments provided the best characterization of the genotype-by-environment interaction patterns. Similarity of relative hybrid performance among environments was modeled as a function of underlying weather variables, permitting identification of weather covariates driving correlations of genetic effects across environments. The resulting models can be used for genomic prediction of mean hybrid performance across populations of environments tested or for environment-specific predictions. These results can also guide efforts to incorporate high-throughput environmental data into genomic prediction models and predict values in new environments characterized with the same environmental characteristics.
Affiliation(s)
- Anna R Rogers
- Program in Genetics, North Carolina State University, Raleigh, NC 27695, USA
| | - Jeffrey C Dunne
- Department of Crop and Soil Sciences, North Carolina State University, Raleigh, NC 27695, USA
| | - Cinta Romay
- Institute for Genomic Diversity, Cornell University, Ithaca, NY 14853, USA
| | - Martin Bohn
- Department of Crop Sciences, University of Illinois at Urban-Champaign, Urbana, IL 61801, USA
| | - Edward S Buckler
- Institute for Genomic Diversity, Cornell University, Ithaca, NY 14853, USA.,USDA-ARS Plant, Soil, and Nutrition Research Unit, Cornell University, Ithaca, NY 14853, USA
| | | | - Jode Edwards
- Department of Agronomy, Iowa State University, Ames, IA 50011, USA.,USDA-ARS Corn Insects and Crop Genetics Research Unit, Iowa State University, Ames, IA 50011, USA
| | - David Ertl
- Iowa Corn Promotion Board, Johnston, IA 50131, USA
| | - Sherry Flint-Garcia
- USDA-ARS Plant Genetics Research Unit, University of Missouri, Columbia, MO 65211, USA
| | - Michael A Gore
- Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
| | - Christopher Graham
- Plant Science Department, West River Agricultural Center, South Dakota State University, Rapid City, SD 57769, USA
| | - Candice N Hirsch
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108, USA
| | - Elizabeth Hood
- College of Agriculture, Arkansas State University, Jonesboro, AR 72467, USA
| | - David C Hooker
- Department of Plant Agriculture, Ridgetown Campus, University of Guelph, Ridgetown, ON N0P 2C0, Canada
| | - Joseph Knoll
- USDA-ARS Crop Genetics and Breeding Research Unit, Tifton, GA 31793, USA
| | - Elizabeth C Lee
- Department of Plant Agriculture, University of Guelph, Guelph N1G 2W1, Canada
| | - Aaron Lorenz
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108, USA
| | - Jonathan P Lynch
- Department of Plant Science, Penn State University, University Park, PA 16802, USA
| | - John McKay
- Department of Bioagricultural Sciences and Pest Management, Colorado State University, Fort Collins, CO 80523, USA
| | - Stephen P Moose
- Department of Crop Sciences, University of Illinois at Urban-Champaign, Urbana, IL 61801, USA
| | - Seth C Murray
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA
| | - Rebecca Nelson
- Plant Pathology and Plant-Microbe Biology Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
| | - Torbert Rocheford
- Department of Agronomy, Purdue University, West Lafayette, IN 47907, USA
| | - James C Schnable
- Department of Agronomy and Horticulture, University of Nebraska, Lincoln, NE 68583, USA
| | - Patrick S Schnable
- Department of Agronomy, Iowa State University, Ames, IA 50011, USA.,Plant Sciences Institute, Iowa State University, Ames, IA 50011, USA
| | - Rajandeep Sekhon
- Department of Genetics and Biochemistry, Clemson University, Clemson, SC 29634, USA
| | - Maninder Singh
- Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI 48824, USA
| | - Margaret Smith
- Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
| | - Nathan Springer
- Department of Agronomy and Horticulture, University of Nebraska, Lincoln, NE 68583, USA
| | - Kurt Thelen
- Department of Plant and Microbial Biology, University of Minnesota, St. Paul, MN 55108, USA
| | - Peter Thomison
- Department of Horticulture and Crop Science, The Ohio State University, Columbus, OH 43210, USA
| | - Addie Thompson
- Department of Plant and Microbial Biology, University of Minnesota, St. Paul, MN 55108, USA
| | - Mitch Tuinstra
- Department of Agronomy, Purdue University, West Lafayette, IN 47907, USA
| | - Jason Wallace
- Department of Crop and Soil Sciences, University of Georgia, Athens, GA 30602, USA
| | - Randall J Wisser
- Department of Plant and Soil Sciences, University of Delaware, Newark, DE 19716, USA
| | - Wenwei Xu
- Texas A&M AgriLife Research, Texas A&M University, Lubbock, TX 79403, USA
| | | | - Shawn M Kaeppler
- Department of Agronomy, University of Wisconsin, Madison, WI 53706, USA
| | - Natalia De Leon
- Department of Agronomy, University of Wisconsin, Madison, WI 53706, USA
| | - James B Holland
- Program in Genetics, North Carolina State University, Raleigh, NC 27695, USA.,Department of Crop and Soil Sciences, North Carolina State University, Raleigh, NC 27695, USA.,USDA-ARS Plant Science Research Unit, North Carolina State University, Raleigh, NC 27695-7620, USA
| |
|
34
|
Costa-Neto G, Crossa J, Fritsche-Neto R. Enviromic Assembly Increases Accuracy and Reduces Costs of the Genomic Prediction for Yield Plasticity in Maize. FRONTIERS IN PLANT SCIENCE 2021; 12:717552. [PMID: 34691099 PMCID: PMC8529011 DOI: 10.3389/fpls.2021.717552] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 05/31/2021] [Accepted: 09/03/2021] [Indexed: 05/21/2023]
Abstract
Quantitative genetics states that phenotypic variation is a consequence of the interaction between genetic and environmental factors. Predictive breeding is based on this statement, and because of this, ways of modeling genetic effects are still evolving. At the same time, the same refinement must be used for processing environmental information. Here, we present an "enviromic assembly approach," which includes using ecophysiology knowledge in shaping environmental relatedness into whole-genome predictions (GP) for plant breeding (referred to as enviromic-aided genomic prediction, E-GP). We propose that the quality of an environment is defined by the core of environmental typologies and their frequencies, which describe different zones of plant adaptation. From this, we derived markers of environmental similarity cost-effectively. Combined with the traditional additive and non-additive effects, this approach may better represent the putative phenotypic variation observed across diverse growing conditions (i.e., phenotypic plasticity). Then, we designed optimized multi-environment trials coupling genetic algorithms, enviromic assembly, and genomic kinships capable of providing in-silico realization of the genotype-environment combinations that must be phenotyped in the field. As proof of concept, we highlighted two E-GP applications: (1) managing the lack of phenotypic information in training accurate GP models across diverse environments and (2) guiding an early screening for yield plasticity exerting optimized phenotyping efforts. Our approach was tested using two tropical maize sets, two types of enviromics assembly, six experimental network sizes, and two types of optimized training set across environments. We observed that E-GP outperforms benchmark GP in all scenarios, especially when considering smaller training sets. The representativeness of genotype-environment combinations is more critical than the size of multi-environment trials (METs). 
The conventional genomic best-unbiased prediction (GBLUP) is inefficient in predicting the quality of a yet-to-be-seen environment, while enviromic assembly enabled it by increasing the accuracy of yield plasticity predictions. Furthermore, we discussed theoretical backgrounds underlying how intrinsic envirotype-phenotype covariances within the phenotypic records can impact the accuracy of GP. The E-GP is an efficient approach to better use environmental databases to deliver climate-smart solutions, reduce field costs, and anticipate future scenarios.
Affiliation(s)
- Germano Costa-Neto
- Department of Genetics, “Luiz de Queiroz” Agriculture College, University of São Paulo (ESALQ/USP), Piracicaba, Brazil
- Institute for Genomic Diversity, Cornell University, Ithaca, NY, United States
- *Correspondence: Germano Costa-Neto
| | - Jose Crossa
- Biometrics and Statistics Unit, International Maize and Wheat Improvement Center (CIMMYT), Mexico City, Mexico
- Colegio de Postgraduados, Mexico City, Mexico
| | - Roberto Fritsche-Neto
- Department of Genetics, “Luiz de Queiroz” Agriculture College, University of São Paulo (ESALQ/USP), Piracicaba, Brazil
- Breeding Analytics and Data Management Unit, International Rice Research Institute (IRRI), Los Baños, Philippines
| |
|
35
|
Crossa J, Fritsche-Neto R, Montesinos-Lopez OA, Costa-Neto G, Dreisigacker S, Montesinos-Lopez A, Bentley AR. The Modern Plant Breeding Triangle: Optimizing the Use of Genomics, Phenomics, and Enviromics Data. FRONTIERS IN PLANT SCIENCE 2021; 12:651480. [PMID: 33936136 PMCID: PMC8085545 DOI: 10.3389/fpls.2021.651480] [Citation(s) in RCA: 61] [Impact Index Per Article: 15.3] [Received: 01/09/2021] [Accepted: 02/11/2021] [Indexed: 05/04/2023]
Affiliation(s)
- Jose Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, de Mexico, Mexico
- Colegio de Postgraduados, Montecillo, Edo. de Mexico, Mexico
| | - Roberto Fritsche-Neto
- Department of Genetics, “Luiz de Queiroz” Agriculture College, University of São Paulo, São Paulo, Brazil
| | | | - Germano Costa-Neto
- Department of Genetics, “Luiz de Queiroz” Agriculture College, University of São Paulo, São Paulo, Brazil
| | - Susanne Dreisigacker
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, de Mexico, Mexico
| | - Abelardo Montesinos-Lopez
- Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara, Mexico
| | - Alison R. Bentley
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, de Mexico, Mexico
- *Correspondence: Alison R. Bentley
| |
|
36
|
Denney DA, Jameel MI, Bemmels JB, Rochford ME, Anderson JT. Small spaces, big impacts: contributions of micro-environmental variation to population persistence under climate change. AOB PLANTS 2020; 12:plaa005. [PMID: 32211145 PMCID: PMC7082537 DOI: 10.1093/aobpla/plaa005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 09/28/2019] [Accepted: 02/06/2020] [Indexed: 05/05/2023]
Abstract
Individuals within natural populations can experience very different abiotic and biotic conditions across small spatial scales owing to microtopography and other micro-environmental gradients. Ecological and evolutionary studies often ignore the effects of micro-environment on plant population and community dynamics. Here, we explore the extent to which fine-grained variation in abiotic and biotic conditions contributes to within-population variation in trait expression and genetic diversity in natural plant populations. Furthermore, we consider whether benign microhabitats could buffer local populations of some plant species from abiotic stresses imposed by rapid anthropogenic climate change. If microrefugia sustain local populations and communities in the short term, other eco-evolutionary processes, such as gene flow and adaptation, could enhance population stability in the longer term. We caution, however, that local populations may still decline in size as they contract into rare microhabitats and microrefugia. We encourage future research that explicitly examines the role of the micro-environment in maintaining genetic variation within local populations, favouring the evolution of phenotypic plasticity at local scales and enhancing population persistence under global change.
Affiliation(s)
- Derek A Denney
- Department of Plant Biology, University of Georgia, Athens, GA, USA
| | - M Inam Jameel
- Department of Genetics, University of Georgia, Athens, GA, USA
| | - Jordan B Bemmels
- Department of Genetics, University of Georgia, Athens, GA, USA
- Department of Biological Sciences, University of Toronto Scarborough, Toronto, ON, Canada
| | - Mia E Rochford
- Department of Plant Biology, University of Georgia, Athens, GA, USA
| | - Jill T Anderson
- Department of Genetics, University of Georgia, Athens, GA, USA
| |
|
37
|
Calayugan MIC, Formantes AK, Amparado A, Descalsota-Empleo GI, Nha CT, Inabangan-Asilo MA, Swe ZM, Hernandez JE, Borromeo TH, Lalusin AG, Mendioro MS, Diaz MGQ, Viña CBD, Reinke R, Swamy BPM. Genetic Analysis of Agronomic Traits and Grain Iron and Zinc Concentrations in a Doubled Haploid Population of Rice (Oryza sativa L.). Sci Rep 2020; 10:2283. [PMID: 32042046 PMCID: PMC7010768 DOI: 10.1038/s41598-020-59184-z] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 07/29/2019] [Accepted: 01/24/2020] [Indexed: 12/28/2022]
Abstract
The development of micronutrient dense rice varieties with good agronomic traits is one of the sustainable and cost-effective approaches for reducing malnutrition. Identification of QTLs for high grain Fe and Zn, yield and yield components helps in precise and faster development of high Fe and Zn rice. We carried out a three-season evaluation using IR05F102 x IR69428 derived doubled-haploid population at IRRI. Inclusive composite interval mapping was carried out using SNP markers and Best Linear Unbiased Estimates of the phenotypic traits. A total of 23 QTLs were identified for eight agronomic traits and grain Fe and Zn concentration that explained 7.2 to 22.0% PV. A QTL by environment interaction analysis confirmed the stability of nine QTLs, including two QTLs for Zn on chromosomes 5 and 12. One epistatic interaction for plant height was significant with 28.4% PVE. Moreover, five QTLs were identified for Fe and Zn that harbor several candidate genes, e.g. OsZIP6 on QTL qZn5.1. A number of QTLs were associated with a combination of greater yield and increased grain Zn levels. These results are useful for development of new rice varieties with good agronomic traits and high grain Zn using MAS, and identification of genetic resources with the novel QTLs for grain Zn.
Affiliation(s)
- Mark Ian C Calayugan
- International Rice Research Institute (IRRI), DAPO Box 7777, Metro Manila, Philippines.,University of the Philippines Los Baños, Laguna, 4031, Philippines
| | - Andrea Kariza Formantes
- International Rice Research Institute (IRRI), DAPO Box 7777, Metro Manila, Philippines.,University of the Philippines Los Baños, Laguna, 4031, Philippines
| | - Amery Amparado
- International Rice Research Institute (IRRI), DAPO Box 7777, Metro Manila, Philippines
| | - Gwen Iris Descalsota-Empleo
- International Rice Research Institute (IRRI), DAPO Box 7777, Metro Manila, Philippines.,University of the Philippines Los Baños, Laguna, 4031, Philippines.,University of Southern Mindanao, Kabacan, Cotabato, 9407, Philippines
| | - Chau Thanh Nha
- International Rice Research Institute (IRRI), DAPO Box 7777, Metro Manila, Philippines.,Cuu Long Delta Rice Research Institute (CLRRI), Cần Thơ, Vietnam
| | | | - Zin Mar Swe
- International Rice Research Institute (IRRI), DAPO Box 7777, Metro Manila, Philippines.,Department of Agriculture, Yezin, Myanmar
| | - Jose E Hernandez
- University of the Philippines Los Baños, Laguna, 4031, Philippines
| | | | | | | | | | | | - Russell Reinke
- International Rice Research Institute (IRRI), DAPO Box 7777, Metro Manila, Philippines
| | | |
|