1
Tadese D, Piepho HP, Hartung J. Accuracy of prediction from multi-environment trials for new locations using pedigree information and environmental covariates: the case of sorghum (Sorghum bicolor (L.) Moench) breeding. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2024; 137:181. [PMID: 38985188 PMCID: PMC11236881 DOI: 10.1007/s00122-024-04684-z] [Received: 12/14/2023] [Accepted: 06/25/2024] [Indexed: 07/11/2024]
Abstract
KEY MESSAGES We investigate a method of extracting and fitting synthetic environmental covariates and pedigree information in multilocation trial data analysis to predict genotype performances in untested locations. Plant breeding trials are usually conducted across multiple testing locations to predict genotype performances in the targeted population of environments. The predictive accuracy can be increased by the use of adequate statistical models. We compared linear mixed models with and without synthetic covariates (SCs) and pedigree information under the identity, the diagonal and the factor-analytic variance-covariance structures of the genotype-by-location interactions. The accuracy of the models in predicting genotype performances in untested locations was evaluated using the mean squared error of predicted differences (MSEPD) and the Spearman rank correlation between predicted and adjusted means. A multi-environment trial (MET) dataset on yield performance from the dry lowland sorghum (Sorghum bicolor (L.) Moench) breeding program of Ethiopia was used. To validate our models, we followed a leave-one-location-out cross-validation strategy. A total of 65 environmental covariates (ECs) obtained from the sorghum test locations were considered. The SCs were extracted from the ECs using multivariate partial least squares analysis and subsequently fitted in the linear mixed model. Then, the model was extended to account for pedigree information. According to the MSEPD, models accounting for SCs improved the predictive accuracy of genotype performances under all three variance-covariance structures compared with models without SCs. The rank correlation was also higher for the models with SCs. When the SCs were fitted, the rank correlation was 0.58 for the factor-analytic, 0.51 for the diagonal and 0.46 for the identity variance-covariance structure.
Our approach indicates improvement in predictive accuracy with SCs in the context of genotype-by-location interactions in a sorghum breeding program in Ethiopia.
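The evaluation scheme described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the MSEPD formula, so the pairwise-difference definition below is an assumption, and the names `msepd`, `spearman`, `leave_one_location_out` and the placeholder predictor `mean_over_locations` are hypothetical. The placeholder simply averages adjusted means over the training locations and stands in for the linear mixed models compared in the paper.

```python
from itertools import combinations


def msepd(predicted, observed):
    """Mean squared error of predicted pairwise differences between genotypes.

    Assumed definition: average over all genotype pairs of the squared gap
    between the predicted difference and the observed difference.
    """
    pairs = list(combinations(range(len(predicted)), 2))
    total = 0.0
    for i, j in pairs:
        pred_diff = predicted[i] - predicted[j]
        obs_diff = observed[i] - observed[j]
        total += (pred_diff - obs_diff) ** 2
    return total / len(pairs)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def mean_over_locations(train, genotypes):
    """Placeholder predictor: average each genotype over the training locations."""
    predictions = []
    for g in genotypes:
        values = [means[g] for means in train.values() if g in means]
        predictions.append(sum(values) / len(values))
    return predictions


def leave_one_location_out(data, fit_predict):
    """Hold out each location in turn and score predictions for it.

    data: {location: {genotype: adjusted_mean}}
    fit_predict(train, genotypes) -> list of predictions in genotype order
    Returns {location: (MSEPD, Spearman rank correlation)}.
    """
    results = {}
    for held_out in data:
        train = {loc: means for loc, means in data.items() if loc != held_out}
        genotypes = sorted(data[held_out])
        observed = [data[held_out][g] for g in genotypes]
        predicted = fit_predict(train, genotypes)
        results[held_out] = (msepd(predicted, observed), spearman(predicted, observed))
    return results
```

Because the MSEPD as assumed here works on pairwise differences, a constant shift in all predictions for a location does not change it, which suits comparisons where genotype rankings and contrasts matter more than the location mean.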
Affiliation(s)
- Diriba Tadese
  - Biostatistics Unit, Institute of Crop Science, University of Hohenheim, Fruwirthstraße 23, 70599, Stuttgart, Germany
- Hans-Peter Piepho
  - Biostatistics Unit, Institute of Crop Science, University of Hohenheim, Fruwirthstraße 23, 70599, Stuttgart, Germany
- Jens Hartung
  - Biostatistics Unit, Institute of Crop Science, University of Hohenheim, Fruwirthstraße 23, 70599, Stuttgart, Germany
2
Gilbert C, Martin N. Using agro-ecological zones to improve the representation of a multi-environment trial of soybean varieties. FRONTIERS IN PLANT SCIENCE 2024; 15:1310461. [PMID: 38590744 PMCID: PMC10999551 DOI: 10.3389/fpls.2024.1310461] [Received: 10/10/2023] [Accepted: 02/27/2024] [Indexed: 04/10/2024]
Abstract
This research introduces a novel framework for enhancing soybean cultivation in North America by categorizing growing environments into distinct ecological and maturity-based zones. Using an integrated analysis of long-term climatic data and records of soybean varietal trials, this research generates a zonal environmental characterization that captures the major components of the growing environment that affect the range of adaptation of soybean varieties. These findings have immediate applications for optimizing multi-environment soybean trials. This characterization allows breeders to assess the environmental representation of a multi-environment trial of soybean varieties, and to strategize the distribution of testing and the placement of test sites accordingly. This application is demonstrated with a historical scenario of a soybean multi-environment trial, using two resource allocation models: one targeted towards improving the general adaptation of soybean varieties, which focuses on widely cultivated areas, and one targeted towards specific adaptation, which captures diverse environmental conditions. Ultimately, the study aims to improve the efficiency and impact of soybean breeding programs, leading to the development of cultivars resilient to variable and changing climates.
Affiliation(s)
- Catherine Gilbert
  - University of Illinois at Urbana-Champaign, Department of Crop Sciences, Urbana, IL, United States
- Nicolas Martin
  - University of Illinois at Urbana-Champaign, Department of Crop Sciences, Urbana, IL, United States
3
Araújo MS, Chaves SFS, Dias LAS, Ferreira FM, Pereira GR, Bezerra ARG, Alves RS, Heinemann AB, Breseghello F, Carneiro PCS, Krause MD, Costa-Neto G, Dias KOG. GIS-FA: an approach to integrating thematic maps, factor-analytic, and envirotyping for cultivar targeting. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2024; 137:80. [PMID: 38472532 DOI: 10.1007/s00122-024-04579-z] [Received: 07/15/2023] [Accepted: 02/06/2024] [Indexed: 03/14/2024]
Abstract
KEY MESSAGE We propose an "enviromics" prediction model for recommending cultivars based on thematic maps aimed at decision-makers. Parsimonious methods that capture genotype-by-environment interaction (GEI) in multi-environment trials (MET) are important in breeding programs. Understanding the causes and factors of GEI allows the utilization of genotype adaptations in the target population of environments through environmental features and factor-analytic (FA) models. Here, we present a novel predictive breeding approach called GIS-FA, which integrates geographic information systems (GIS) techniques, FA models, partial least squares (PLS) regression, and enviromics to predict phenotypic performance in untested environments. The GIS-FA approach enables: (i) the prediction of the phenotypic performance of tested genotypes in untested environments, (ii) the selection of the best-ranking genotypes based on their overall performance and stability using the FA selection tools, and (iii) the creation of thematic maps showing overall or pairwise performance and stability for decision-making. We exemplify the usage of the GIS-FA approach using two datasets of rice [Oryza sativa (L.)] and soybean [Glycine max (L.) Merr.] in MET spread over tropical areas. In summary, our novel predictive method allows the identification of new breeding scenarios by pinpointing groups of environments where genotypes demonstrate superior predicted performance. It also facilitates and optimizes cultivar recommendations by utilizing thematic maps.
Affiliation(s)
- Maurício S Araújo
  - Department of Agronomy, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Saulo F S Chaves
  - Department of Agronomy, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Luiz A S Dias
  - Department of Agronomy, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Filipe M Ferreira
  - Department of Crop Science - College of Agricultural Sciences, São Paulo State University, Botucatu, São Paulo, Brazil
- Guilherme R Pereira
  - Department of Agronomy, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Rodrigo S Alves
  - Department of General Biology, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Alexandre B Heinemann
  - Brazilian Agricultural Research Corporation (Embrapa Rice and Beans), Santo Antônio de Goiás, Goiás, Brazil
- Flávio Breseghello
  - Brazilian Agricultural Research Corporation (Embrapa Rice and Beans), Santo Antônio de Goiás, Goiás, Brazil
- Pedro C S Carneiro
  - Department of General Biology, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Kaio O G Dias
  - Department of General Biology, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
4
Montesinos-López OA, Crespo-Herrera L, Saint Pierre C, Bentley AR, de la Rosa-Santamaria R, Ascencio-Laguna JA, Agbona A, Gerard GS, Montesinos-López A, Crossa J. Do feature selection methods for selecting environmental covariables enhance genomic prediction accuracy? Front Genet 2023; 14:1209275. [PMID: 37554404 PMCID: PMC10405933 DOI: 10.3389/fgene.2023.1209275] [Received: 04/20/2023] [Accepted: 07/03/2023] [Indexed: 08/10/2023]
Abstract
Genomic selection (GS) is transforming plant and animal breeding, but its practical implementation for complex traits and multi-environment trials remains challenging. To address this issue, this study investigates the integration of environmental information with genotypic information in GS. The study proposes the use of two feature selection methods (Pearson's correlation and Boruta) for the integration of environmental information. Results indicate that the simple incorporation of environmental covariates may increase or decrease prediction accuracy depending on the case. However, optimal incorporation of environmental covariates using feature selection significantly improves prediction accuracy in four out of six datasets, by between 14.25% and 218.71% in terms of Normalized Root Mean Squared Error under a leave-one-environment-out cross-validation scenario, but no relevant gain was observed in terms of Pearson's correlation. In two datasets where environmental covariates are unrelated to the response variable, feature selection is unable to enhance prediction accuracy. Therefore, the study provides empirical evidence supporting the use of feature selection to improve the prediction power of GS.
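As a rough illustration of the correlation-based filter and the error metric named in this abstract, the Python sketch below keeps environmental covariates whose absolute Pearson correlation with the response reaches a cutoff, and computes a normalized RMSE. The function names, the cutoff of 0.5, and the choice to normalize by the mean of the observed values are all assumptions made for the example; the abstract specifies none of them, and Boruta is not shown.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x)
    sy = sum((b - my) ** 2 for b in y)
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy) ** 0.5


def select_covariates(covariates, response, threshold=0.5):
    """Keep covariates with |r| >= threshold against the response.

    covariates: {name: [one value per environment]}
    response:   [one value per environment]
    threshold:  hypothetical cutoff chosen for illustration only
    """
    return [
        name
        for name, values in covariates.items()
        if abs(pearson(values, response)) >= threshold
    ]


def nrmse(predicted, observed):
    """RMSE divided by the mean of the observed values (one common convention)."""
    n = len(observed)
    rmse = (sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n) ** 0.5
    return rmse / (sum(observed) / n)
```

In a leave-one-environment-out setting, a filter like `select_covariates` would normally be run on the training environments only, so that the held-out environment does not influence which covariates are kept.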
Affiliation(s)
- Alison R. Bentley
  - International Maize and Wheat Improvement Center (CIMMYT), El Battan, Mexico
- Afolabi Agbona
  - International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria
  - Molecular & Environmental Plant Sciences, Texas A&M University, College Station, TX, United States
- Guillermo S. Gerard
  - International Maize and Wheat Improvement Center (CIMMYT), El Battan, Mexico
- Abelardo Montesinos-López
  - Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara, JA, Mexico
- José Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), El Battan, Mexico
  - Colegio de Postgraduados, Campus Montecillos, Montecillos, Mexico
5
Li Z, Gutierrez L. Editorial: Statistical methods for analyzing multiple environmental quantitative genomic data. Front Genet 2023; 14:1212804. [PMID: 37404327 PMCID: PMC10316013 DOI: 10.3389/fgene.2023.1212804] [Received: 04/26/2023] [Accepted: 06/09/2023] [Indexed: 07/06/2023]
Affiliation(s)
- Zitong Li
  - CSIRO Agriculture and Food, Canberra, ACT, Australia
- Lucia Gutierrez
  - Department of Agronomy, University of Wisconsin-Madison, Madison, WI, United States
6
Tolhurst DJ, Gaynor RC, Gardunia B, Hickey JM, Gorjanc G. Genomic selection using random regressions on known and latent environmental covariates. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2022; 135:3393-3415. [PMID: 36066596 PMCID: PMC9519718 DOI: 10.1007/s00122-022-04186-w] [Received: 10/05/2021] [Accepted: 06/28/2022] [Indexed: 05/26/2023]
Abstract
The integration of known and latent environmental covariates within a single-stage genomic selection approach provides breeders with an informative and practical framework to utilise genotype by environment interaction for prediction into current and future environments. This paper develops a single-stage genomic selection approach which integrates known and latent environmental covariates within a special factor analytic framework. The factor analytic linear mixed model of Smith et al. (2001) is an effective method for analysing multi-environment trial (MET) datasets, but has limited practicality since the underlying factors are latent so the modelled genotype by environment interaction (GEI) is observable, rather than predictable. The advantage of using random regressions on known environmental covariates, such as soil moisture and daily temperature, is that the modelled GEI becomes predictable. The integrated factor analytic linear mixed model (IFA-LMM) developed in this paper includes a model for predictable and observable GEI in terms of a joint set of known and latent environmental covariates. The IFA-LMM is demonstrated on a late-stage cotton breeding MET dataset from Bayer CropScience. The results show that the known covariates predominately capture crossover GEI and explain 34.4% of the overall genetic variance. The most notable covariates are maximum downward solar radiation (10.1%), average cloud cover (4.5%) and maximum temperature (4.0%). The latent covariates predominately capture non-crossover GEI and explain 40.5% of the overall genetic variance. The results also show that the average prediction accuracy of the IFA-LMM is [Formula: see text] higher than conventional random regression models for current environments and [Formula: see text] higher for future environments. 
The IFA-LMM is therefore an effective method for analysing MET datasets which also utilises crossover and non-crossover GEI for genomic prediction into current and future environments. This is becoming increasingly important with the emergence of rapidly changing environments and climate change.
Affiliation(s)
- Daniel J Tolhurst
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, United Kingdom
- John M Hickey
  - Corn Product Design, Bayer CropScience, Barcelona, Spain
- Gregor Gorjanc
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, United Kingdom
7
Malik WA, Buntaran H, Przystalski M, Lenartowicz T, Piepho HP. Assessing the between-country genetic correlation in maize yield using German and Polish official variety trials. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2022; 135:3025-3038. [PMID: 35831460 PMCID: PMC9482609 DOI: 10.1007/s00122-022-04164-2] [Received: 04/26/2022] [Accepted: 06/22/2022] [Indexed: 06/15/2023]
Abstract
We assess the genetic gain and genetic correlation in maize yield using German and Polish official variety trials. Random coefficient models were fitted to assess the genetic correlation. Official variety testing is performed in many countries by statutory agencies in order to identify the best candidates and make decisions on the addition to the national list. Neighbouring countries can have similarities in agroecological conditions, so it is worthwhile to consider a joint analysis of data from national list trials to assess the similarity in performance of those varieties tested in both countries. Here, maize yield data from official German and Polish variety trials for cultivation and use (VCU) were analysed for the period from 1987 to 2017. Several statistical models that incorporate environmental covariates were fitted. The best fitting model was used to compute estimates of genotype main effects for each country. It is demonstrated that a model with random genotype-by-country effects can be used to borrow strength across countries. The genetic correlation between cultivars from the two countries equalled 0.89. The analysis based on agroecological zones showed high correlation between zones in the two countries. The results also showed that 22 agroecological zones in Germany can be merged into five zones, whereas the six zones in Poland were very highly correlated and can be considered as a single zone for maize. The 43 common varieties which were tested in both countries performed equally in both countries. The mean performances of these common varieties in both countries were highly correlated.
Affiliation(s)
- Waqas Ahmed Malik
  - Biostatistics Unit, Institute of Crop Science, University of Hohenheim, Fruwirthstrasse 23, 70599, Stuttgart, Germany
- Harimurti Buntaran
  - Biostatistics Unit, Institute of Crop Science, University of Hohenheim, Fruwirthstrasse 23, 70599, Stuttgart, Germany
- Marcin Przystalski
  - Research Centre for Cultivar Testing, Słupia Wielka 34, 63-022, Słupia Wielka, Poland
- Tomasz Lenartowicz
  - Research Centre for Cultivar Testing, Słupia Wielka 34, 63-022, Słupia Wielka, Poland
- Hans-Peter Piepho
  - Biostatistics Unit, Institute of Crop Science, University of Hohenheim, Fruwirthstrasse 23, 70599, Stuttgart, Germany
8
Dias KOG, Dos Santos JPR, Krause MD, Piepho HP, Guimarães LJM, Pastina MM, Garcia AAF. Leveraging probability concepts for cultivar recommendation in multi-environment trials. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2022; 135:1385-1399. [PMID: 35192008 DOI: 10.1007/s00122-022-04041-y] [Received: 04/22/2021] [Accepted: 01/07/2022] [Indexed: 06/14/2023]
Abstract
We propose using probability concepts from Bayesian models to leverage a more informed decision-making process toward cultivar recommendation in multi-environment trials. Statistical models that capture the phenotypic plasticity of a genotype across environments are crucial in plant breeding programs to potentially identify parents, generate offspring, and obtain highly productive genotypes for target environments. In this study, our aim is to leverage concepts of Bayesian models and probability methods of stability analysis to untangle genotype-by-environment interaction (GEI). The proposed method employs the posterior distribution obtained with the No-U-Turn sampler algorithm to get Hamiltonian Monte Carlo estimates of adaptation and stability probabilities. We applied the proposed models to two empirical tropical datasets. Our findings provide a basis to enhance our ability to consider the uncertainty of cultivar recommendation for global or specific adaptation. We further demonstrate that probability methods of stability analysis in a Bayesian framework are a powerful tool for unraveling GEI given a defined intensity of selection, resulting in a more informed decision-making process toward cultivar recommendation in multi-environment trials.
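One way to read "probabilities given a defined intensity of selection" is as the share of posterior draws in which a genotype falls within the selected top fraction. The Python sketch below illustrates that reading only; it is an assumption rather than the authors' estimator, the function name is hypothetical, and the posterior draws are taken as given (for example, exported from a sampler run) rather than produced here.

```python
def probability_of_superior_performance(draws, selection_intensity):
    """Share of posterior draws in which each genotype is among the selected top.

    draws: list of posterior samples, each a {genotype: sampled value} mapping
    selection_intensity: fraction of genotypes selected, in (0, 1]
    """
    if not 0 < selection_intensity <= 1:
        raise ValueError("selection_intensity must be in (0, 1]")
    genotypes = sorted(draws[0])
    # Number of genotypes selected per draw, at least one.
    n_selected = max(1, round(selection_intensity * len(genotypes)))
    counts = dict.fromkeys(genotypes, 0)
    for draw in draws:
        top = sorted(genotypes, key=lambda g: draw[g], reverse=True)[:n_selected]
        for g in top:
            counts[g] += 1
    return {g: counts[g] / len(draws) for g in genotypes}
```

A genotype with a high posterior mean but wide uncertainty can end up with a lower probability than a slightly lower-yielding but more consistent one, which is the kind of trade-off a probability-based recommendation makes visible.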
Affiliation(s)
- Kaio O G Dias
  - Department of Genetics, Luiz de Queiroz College of Agriculture, University of São Paulo, Piracicaba, SP, Brazil
  - Department of General Biology, Federal University of Viçosa, Viçosa, Brazil
- Jhonathan P R Dos Santos
  - Department of Genetics, Luiz de Queiroz College of Agriculture, University of São Paulo, Piracicaba, SP, Brazil
- Antonio A F Garcia
  - Department of Genetics, Luiz de Queiroz College of Agriculture, University of São Paulo, Piracicaba, SP, Brazil