1. Montesinos-López A, Rivera C, Pinto F, Piñera F, Gonzalez D, Reynolds M, Pérez-Rodríguez P, Li H, Montesinos-López OA, Crossa J. Multimodal deep learning methods enhance genomic prediction of wheat breeding. G3 (BETHESDA, MD.) 2023; 13:jkad045. [PMID: 36869747 PMCID: PMC10151399 DOI: 10.1093/g3journal/jkad045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Revised: 02/21/2023] [Accepted: 02/22/2023] [Indexed: 03/05/2023]
Abstract
While several statistical machine learning methods have been developed and studied for assessing the genomic prediction (GP) accuracy of unobserved phenotypes in plant breeding research, few methods have linked genomics and phenomics (imaging). Deep learning (DL) neural networks have been developed to increase the GP accuracy of unobserved phenotypes while simultaneously accounting for the complexity of genotype-environment interaction (GE); however, unlike conventional GP models, DL has not been investigated for cases in which genomics is linked with phenomics. In this study we used 2 wheat data sets (DS1 and DS2) to compare a novel DL method with conventional GP models. Models fitted for DS1 were GBLUP, gradient boosting machine (GBM), support vector regression (SVR) and the DL method. Results indicated that for 1 year, DL provided better GP accuracy than the other models. However, GP accuracy obtained for other years indicated that the GBLUP model was slightly superior to the DL method. DS2 comprises only genomic data from wheat lines tested for 3 years, 2 environments (drought and irrigated) and 2-4 traits. DS2 results showed that when predicting the irrigated environment with the drought environment, DL had higher accuracy than the GBLUP model in all analyzed traits and years. When predicting the drought environment with information on the irrigated environment, the DL model and GBLUP model had similar accuracy. The DL method used in this study is novel and presents a strong degree of generalization, as several modules can potentially be incorporated and concatenated to produce an output for a multi-input data structure.
Affiliation(s)
- Abelardo Montesinos-López
  - Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, 44430, Guadalajara, Jalisco, Mexico
- Carolina Rivera
  - International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km. 45, El Batán, CP 56237, Texcoco, Edo. de México, Mexico
- Francisco Pinto
  - International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km. 45, El Batán, CP 56237, Texcoco, Edo. de México, Mexico
- Francisco Piñera
  - International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km. 45, El Batán, CP 56237, Texcoco, Edo. de México, Mexico
- David Gonzalez
  - International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km. 45, El Batán, CP 56237, Texcoco, Edo. de México, Mexico
- Mathew Reynolds
  - International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km. 45, El Batán, CP 56237, Texcoco, Edo. de México, Mexico
- Huihui Li
  - Institute of Crop Sciences, The National Key Facility for Crop Gene Resources and Genetic Improvement and CIMMYT China office, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
- Jose Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km. 45, El Batán, CP 56237, Texcoco, Edo. de México, Mexico
  - Colegio de Postgraduados, Montecillos, Edo. de México, CP 56230, Mexico
2. Campos JC, Manrique-Silupú J, Dorneanu B, Ipanaqué W, Arellano-García H. A smart decision framework for the prediction of thrips incidence in organic banana crops. Ecol Modell 2022. [DOI: 10.1016/j.ecolmodel.2022.110147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
3. Morota G, Jarquin D, Campbell MT, Iwata H. Statistical Methods for the Quantitative Genetic Analysis of High-Throughput Phenotyping Data. Methods Mol Biol 2022; 2539:269-296. [PMID: 35895210 DOI: 10.1007/978-1-0716-2537-8_21] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
The advent of plant phenomics, coupled with the wealth of genotypic data generated by next-generation sequencing technologies, provides exciting new resources for investigations into and improvement of complex traits. However, these new technologies also bring new challenges in quantitative genetics, namely, a need for the development of robust frameworks that can accommodate these high-dimensional data. In this chapter, we describe methods for the statistical analysis of high-throughput phenotyping (HTP) data with the goal of enhancing the prediction accuracy of genomic selection (GS). Following the Introduction in Sec. 1, Sec. 2 discusses field-based HTP, including the use of unoccupied aerial vehicles and light detection and ranging, as well as how we can achieve increased genetic gain by utilizing image data derived from HTP. Section 3 considers extending commonly used GS models to integrate HTP data as covariates associated with the principal trait response, such as yield. Particular focus is placed on single-trait, multi-trait, and genotype by environment interaction models. One unique aspect of HTP data is that phenomics platforms often produce large-scale data with high spatial and temporal resolution for capturing dynamic growth, development, and stress responses. Section 4 discusses the utility of a random regression model for performing longitudinal modeling. The chapter concludes with a discussion of some standing issues.
Affiliation(s)
- Gota Morota
  - Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
- Diego Jarquin
  - Agronomy Department, University of Florida, Gainesville, FL, USA
- Malachy T Campbell
  - Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
- Hiroyoshi Iwata
  - Department of Agricultural and Environmental Biology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan
4. Kismiantini, Montesinos-López OA, Crossa J, Setiawan EP, Wutsqa DU. Prediction of count phenotypes using high-resolution images and genomic data. G3-GENES GENOMES GENETICS 2021; 11:jkab035. [PMID: 33847694 PMCID: PMC8022939 DOI: 10.1093/g3journal/jkab035] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/12/2020] [Accepted: 01/24/2021] [Indexed: 12/04/2022]
Abstract
Genomic selection (GS) is revolutionizing plant breeding since the selection process is done with the help of statistical machine learning methods. A model is trained with a reference population and then it is used for predicting the candidate individuals available in the testing set. However, given that breeding phenotypic values are very noisy, new models must be able to integrate not only genotypic and environmental data but also high-resolution images that have been collected by breeders with advanced image technology. For this reason, this paper explores the use of generalized Poisson regression (GPR) for genome-enabled prediction of count phenotypes using genomic and hyperspectral images. The GPR model allows integrating input information of many sources like environments, genomic data, high resolution data, and interaction terms between these three sources. We found that the best prediction performance was obtained when the three sources of information were taken into account in the predictor, and those measures of high-resolution images close to the harvest day provided the best prediction performance.
Affiliation(s)
- Kismiantini
  - Department of Statistics, Universitas Negeri Yogyakarta, Yogyakarta, 55281, Indonesia
- José Crossa
  - Biometrics and Statistics Unit, International Maize and Wheat Improvement Center (CIMMYT), Km 45 Carretera México-Veracruz, CP 52640, México; Colegio de Postgraduados, Montecillos, Edo. de México CP 56230, México
5. Lyra DH, Virlet N, Sadeghi-Tehran P, Hassall KL, Wingen LU, Orford S, Griffiths S, Hawkesford MJ, Slavov GT. Functional QTL mapping and genomic prediction of canopy height in wheat measured using a robotic field phenotyping platform. JOURNAL OF EXPERIMENTAL BOTANY 2020; 71:1885-1898. [PMID: 32097472 PMCID: PMC7094083 DOI: 10.1093/jxb/erz545] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 05/07/2019] [Accepted: 02/19/2020] [Indexed: 05/08/2023]
Abstract
Genetic studies increasingly rely on high-throughput phenotyping, but the resulting longitudinal data pose analytical challenges. We used canopy height data from an automated field phenotyping platform to compare several approaches to scanning for quantitative trait loci (QTLs) and performing genomic prediction in a wheat recombinant inbred line mapping population based on up to 26 sampled time points (TPs). We detected four persistent QTLs (i.e. expressed for most of the growing season), with both empirical and simulation analyses demonstrating superior statistical power of detecting such QTLs through functional mapping approaches compared with conventional individual TP analyses. In contrast, even very simple individual TP approaches (e.g. interval mapping) had superior detection power for transient QTLs (i.e. expressed during very short periods). Using spline-smoothed phenotypic data resulted in improved genomic predictive abilities (5-8% higher than individual TP prediction), while the effect of including significant QTLs in prediction models was relatively minor (<1-4% improvement). Finally, although QTL detection power and predictive ability generally increased with the number of TPs analysed, gains beyond five or 10 TPs chosen based on phenological information had little practical significance. These results will inform the development of an integrated, semi-automated analytical pipeline, which will be more broadly applicable to similar data sets in wheat and other crops.
Affiliation(s)
- Danilo H Lyra
  - Department of Computational & Analytical Sciences, Rothamsted Research, Harpenden, UK
- Nicolas Virlet
  - Department of Plant Sciences, Rothamsted Research, Harpenden, UK
- Kirsty L Hassall
  - Department of Computational & Analytical Sciences, Rothamsted Research, Harpenden, UK
- Luzie U Wingen
  - John Innes Centre, Norwich Research Park, Colney Lane, Norwich, UK
- Simon Orford
  - John Innes Centre, Norwich Research Park, Colney Lane, Norwich, UK
- Simon Griffiths
  - John Innes Centre, Norwich Research Park, Colney Lane, Norwich, UK
- Gancho T Slavov
  - Department of Computational & Analytical Sciences, Rothamsted Research, Harpenden, UK
  - Scion, Rotorua, New Zealand
6. Lyra DH, Virlet N, Sadeghi-Tehran P, Hassall KL, Wingen LU, Orford S, Griffiths S, Hawkesford MJ, Slavov GT. Functional QTL mapping and genomic prediction of canopy height in wheat measured using a robotic field phenotyping platform. JOURNAL OF EXPERIMENTAL BOTANY 2020. [PMID: 32097472 DOI: 10.17632/pkxpkw6j43.2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 05/16/2023]
7. Sun J, Poland JA, Mondal S, Crossa J, Juliana P, Singh RP, Rutkoski JE, Jannink JL, Crespo-Herrera L, Velu G, Huerta-Espino J, Sorrells ME. High-throughput phenotyping platforms enhance genomic selection for wheat grain yield across populations and cycles in early stage. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2019; 132:1705-1720. [PMID: 30778634 DOI: 10.1007/s00122-019-03309-0] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 11/07/2018] [Accepted: 02/06/2019] [Indexed: 05/18/2023]
Abstract
Genomic selection (GS) models have been validated for many quantitative traits in wheat (Triticum aestivum L.) breeding. However, those models are mostly constrained within the same growing cycle, and extending GS across cycles has been a challenge, mainly due to the low predictive accuracy resulting from two factors: reduced genetic relationships between different families and augmented environmental variances between cycles. Using the data collected from diverse field conditions at the International Maize and Wheat Improvement Center, we evaluated GS for grain yield in three elite yield trials across three wheat growing cycles. The objective of this project was to employ two secondary traits measured with high-throughput phenotyping platforms, canopy temperature and green normalized difference vegetation index, which are closely associated with grain yield, to improve prediction accuracy for grain yield. The ability to predict grain yield was evaluated reciprocally across three cycles with or without secondary traits. Our results indicate that prediction accuracy increased by an average of 146% for grain yield across cycles with secondary traits. In addition, our results suggest that secondary traits phenotyped during wheat heading and early grain filling stages were optimal for enhancing the prediction accuracy for grain yield.
Affiliation(s)
- Jin Sun
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY, 14853, USA
- Jesse A Poland
  - Department of Plant Pathology and Department of Agronomy, Kansas State University, Manhattan, KS, 66506, USA
- Suchismita Mondal
  - International Maize and Wheat Improvement Center (CIMMYT), Km. 45, Carretera México-Veracruz, El Batán, 56237, Texcoco, CP, Mexico
- José Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Km. 45, Carretera México-Veracruz, El Batán, 56237, Texcoco, CP, Mexico
- Philomin Juliana
  - International Maize and Wheat Improvement Center (CIMMYT), Km. 45, Carretera México-Veracruz, El Batán, 56237, Texcoco, CP, Mexico
- Ravi P Singh
  - International Maize and Wheat Improvement Center (CIMMYT), Km. 45, Carretera México-Veracruz, El Batán, 56237, Texcoco, CP, Mexico
- Jessica E Rutkoski
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY, 14853, USA
  - International Rice Research Institute, 4030, Los Baños, Philippines
- Jean-Luc Jannink
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY, 14853, USA
  - USDA-ARS R.W. Holley Center for Agriculture and Health, Ithaca, NY, 14853, USA
- Leonardo Crespo-Herrera
  - International Maize and Wheat Improvement Center (CIMMYT), Km. 45, Carretera México-Veracruz, El Batán, 56237, Texcoco, CP, Mexico
- Govindan Velu
  - International Maize and Wheat Improvement Center (CIMMYT), Km. 45, Carretera México-Veracruz, El Batán, 56237, Texcoco, CP, Mexico
- Julio Huerta-Espino
  - Campo Experimental Valle de México INIFAP, Apdo. Postal 10, 56230, Chapingo, Edo. de México, Mexico
- Mark E Sorrells
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY, 14853, USA
8. Montesinos-López A, Montesinos-López OA, de los Campos G, Crossa J, Burgueño J, Luna-Vazquez FJ. Correction to: Bayesian functional regression as an alternative statistical analysis of high-throughput phenotyping data of modern agriculture. PLANT METHODS 2018; 14:57. [PMID: 30002724 PMCID: PMC6036691 DOI: 10.1186/s13007-018-0321-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
This corrects the article DOI: 10.1186/s13007-018-0314-7.
Affiliation(s)
- Abelardo Montesinos-López
  - Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, 44430 Guadalajara, Jalisco, Mexico
- Gustavo de los Campos
  - Epidemiology and Biostatistics and Statistics and Probability Departments, Michigan State University, 909 Fee Road, East Lansing, MI 48824, USA
- José Crossa
  - Biometrics and Statistics Unit, International Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641, 06600 Mexico City, Mexico
- Juan Burgueño
  - Biometrics and Statistics Unit, International Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641, 06600 Mexico City, Mexico