1
Bose S, Banerjee S, Kumar S, Saha A, Nandy D, Hazra S. Review of applications of artificial intelligence (AI) methods in crop research. J Appl Genet 2024; 65:225-240. [PMID: 38216788 DOI: 10.1007/s13353-023-00826-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2023] [Revised: 12/23/2023] [Accepted: 12/26/2023] [Indexed: 01/14/2024]
Abstract
Modern crop improvement techniques can help close the gap in feeding an ever-increasing population. Artificial intelligence (AI) refers to the simulation of human intelligence in machines through computational algorithms, machine learning (ML) and deep learning (DL) techniques. These methods generalise patterns and relationships from historical data using various mathematical optimisation techniques, producing prediction models that facilitate the selection of superior genotypes. They are less resource intensive and can address such problems through the analysis of large-scale phenotypic datasets. ML for genomic selection (GS) uses high-throughput genotyping technologies to gather genetic information on a large number of markers across the genome. GS model predictions are based on the mathematical relationship between genotypic and phenotypic data from the training population. ML techniques have also emerged as powerful tools for genome editing by analysing large-scale genomic data and facilitating the development of accurate prediction models. Precise phenotyping is a prerequisite for advancing crop breeding and solving agricultural production-related issues. ML algorithms can address this need by generating predictive models based on the analysis of large-scale phenotypic datasets, and DL models also show potential for reliable, precise phenotyping. This review provides a comprehensive overview of various ML and DL models, their applications, and their potential to enhance the efficiency, specificity and safety of advanced crop improvement protocols such as genomic selection, genome editing and phenotypic prediction to promote accelerated breeding.
Affiliation(s)
- Suvojit Bose
- Department of Vegetables and Spice Crops, Uttar Banga Krishi Viswavidyalaya, Pundibari, Cooch Behar, 736165, West Bengal, India
- Soumya Kumar
- School of Agricultural Sciences, JIS University, Kolkata, 700109, West Bengal, India
- Akash Saha
- School of Agricultural Sciences, JIS University, Kolkata, 700109, West Bengal, India
- Debalina Nandy
- School of Agricultural Sciences, JIS University, Kolkata, 700109, West Bengal, India
- Soham Hazra
- Department of Agriculture, Brainware University, Barasat, 700125, West Bengal, India
2
Ruperao P, Rangan P, Shah T, Thakur V, Kalia S, Mayes S, Rathore A. The Progression in Developing Genomic Resources for Crop Improvement. Life (Basel) 2023; 13:1668. [PMID: 37629524 PMCID: PMC10455509 DOI: 10.3390/life13081668] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 07/21/2023] [Accepted: 07/25/2023] [Indexed: 08/27/2023] Open
Abstract
Sequencing technologies have rapidly evolved over the past two decades, and new technologies are being continually developed and commercialized. The emerging sequencing technologies target generating more data with fewer inputs and at lower costs. This has also translated to an increase in the number and type of corresponding applications in genomics besides enhanced computational capacities (both hardware and software). Alongside the evolving DNA sequencing landscape, bioinformatics research teams have also evolved to accommodate the increasingly demanding techniques used to combine and interpret data, leading to many researchers moving from the lab to the computer. The rich history of DNA sequencing has paved the way for new insights and the development of new analysis methods. Understanding and learning from past technologies can help with the progress of future applications. This review focuses on the evolution of sequencing technologies, their significant enabling role in generating plant genome assemblies and downstream applications, and the parallel development of bioinformatics tools and skills, filling the gap in data analysis techniques.
Affiliation(s)
- Pradeep Ruperao
- Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad 502324, India
- Parimalan Rangan
- ICAR-National Bureau of Plant Genetic Resources, PUSA Campus, New Delhi 110012, India
- Trushar Shah
- International Institute of Tropical Agriculture (IITA), Nairobi 30709-00100, Kenya
- Vivek Thakur
- Department of Systems & Computational Biology, School of Life Sciences, University of Hyderabad, Hyderabad 500046, India
- Sanjay Kalia
- Department of Biotechnology, Ministry of Science and Technology, Government of India, New Delhi 110003, India
- Sean Mayes
- Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad 502324, India
- Abhishek Rathore
- Excellence in Breeding, International Maize and Wheat Improvement Center (CIMMYT), Hyderabad 502324, India
3
Ortiz R, Reslow F, Montesinos-López A, Huicho J, Pérez-Rodríguez P, Montesinos-López OA, Crossa J. Partial least squares enhance multi-trait genomic prediction of potato cultivars in new environments. Sci Rep 2023; 13:9947. [PMID: 37336933 DOI: 10.1038/s41598-023-37169-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2022] [Accepted: 06/17/2023] [Indexed: 06/21/2023] Open
Abstract
It is of paramount importance in plant breeding to have methods dealing with large numbers of predictor variables and few sample observations, as well as efficient methods for dealing with high correlation in predictors and measured traits. This paper explores, in terms of prediction performance, the partial least squares (PLS) method under single-trait (ST) and multi-trait (MT) prediction of potato traits. The first prediction was for tested lines in tested environments under a five-fold cross-validation (5FCV) strategy, and the second prediction was for tested lines in untested environments (herein denoted as leave-one-environment-out cross-validation, LOEO). Predictions performed well (accuracy mostly > 0.5 in terms of Pearson's correlation), and the accuracy under 5FCV was better than under LOEO. Hence, we have empirical evidence that the ST and MT PLS framework is a very valuable tool for prediction in the context of potato breeding data.
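The two validation schemes named in this abstract differ only in how records are assigned to the test set. The sketch below is a minimal illustration, not the authors' code: the function names, the toy environment labels and the random seed are all assumptions introduced here. It shows random five-fold splits next to splits that hold out one whole environment at a time.

```python
import random
from collections import defaultdict


def five_fold_splits(n_records, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs for random k-fold cross-validation."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds groups of near-equal size.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, test


def leave_one_environment_out(environments):
    """Yield (env, train_idx, test_idx), holding out one environment per split."""
    by_env = defaultdict(list)
    for idx, env in enumerate(environments):
        by_env[env].append(idx)
    for env in sorted(by_env):
        test = by_env[env]
        train = [i for i, e in enumerate(environments) if e != env]
        yield env, train, test


# Hypothetical toy data: six records observed in three environments.
envs = ["E1", "E1", "E2", "E2", "E3", "E3"]
loeo = list(leave_one_environment_out(envs))
cv5 = list(five_fold_splits(10))
```

Under LOEO the model never sees any record from the held-out environment during training, which is one plausible reason a lower accuracy than under 5FCV would be expected.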
Affiliation(s)
- Rodomiro Ortiz
- Department of Plant Breeding, Swedish University of Agricultural Sciences (SLU), P.O. Box 190, SE 23436, Lomma, Sweden
- Fredrik Reslow
- Department of Plant Breeding, Swedish University of Agricultural Sciences (SLU), P.O. Box 190, SE 23436, Lomma, Sweden
- Abelardo Montesinos-López
- Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), 44430, Guadalajara, México
- José Huicho
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km. 45, El Batán, 56237, Texcoco, Edo. de México, México
- José Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km. 45, El Batán, 56237, Texcoco, Edo. de México, México
- Colegio de Postgraduados (COLPOS), 56230, Montecillos, Edo. de México, México
- Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch, Australia
4
Endelman JB. Fully efficient, two-stage analysis of multi-environment trials with directional dominance and multi-trait genomic selection. Theor Appl Genet 2023; 136:65. [PMID: 36949348 PMCID: PMC10033618 DOI: 10.1007/s00122-023-04298-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/28/2022] [Accepted: 01/02/2023] [Indexed: 06/18/2023]
Abstract
R/StageWise enables fully efficient, two-stage analysis of multi-environment, multi-trait datasets for genomic selection, including support for dominance heterosis and polyploidy. Plant breeders interested in genomic selection often face challenges to fully utilizing multi-trait, multi-environment datasets. R package StageWise was developed to go beyond the capabilities of most specialized software for genomic prediction, without requiring the programming skills needed for more general-purpose software for mixed models. As the name suggests, one of the core features is a fully efficient, two-stage analysis for multiple environments, in which the full variance-covariance matrix of the Stage 1 genotype means is used in Stage 2. Another feature is directional dominance, including for polyploids, to account for inbreeding depression in outbred crops. StageWise enables selection with multi-trait indices, including restricted indices with one or more traits constrained to have zero response. For a potato dataset with 943 genotypes evaluated over 6 years, including the Stage 1 errors in Stage 2 reduced the Akaike Information Criterion (AIC) by 29, 67, and 104 for maturity, yield, and fry color, respectively. The proportion of variation explained by heterosis was largest for yield but still only 0.03, likely because of limited variation for the genomic inbreeding coefficient. Due to the large additive genetic correlation (0.57) between yield and maturity, naïve selection on an index combining yield and fry color led to an undesirable response for later maturity. The restricted index coefficients to maximize genetic merit without delaying maturity were identified. The software and three vignettes are available at https://github.com/jendelman/StageWise .
Affiliation(s)
- Jeffrey B Endelman
- Department of Horticulture, University of Wisconsin-Madison, Madison, WI, 53706, USA
5
Westhues CC, Simianer H, Beissinger TM. learnMET: an R package to apply machine learning methods for genomic prediction using multi-environment trial data. G3 (Bethesda) 2022; 12:6705235. [PMID: 36124944 PMCID: PMC9635651 DOI: 10.1093/g3journal/jkac226] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/26/2022] [Accepted: 07/29/2022] [Indexed: 12/04/2022]
Abstract
We introduce the R-package learnMET, developed as a flexible framework to enable a collection of analyses on multi-environment trial breeding data with machine learning-based models. learnMET allows the combination of genomic information with environmental data such as climate and/or soil characteristics. Notably, the package offers the possibility of incorporating weather data from field weather stations, or to retrieve global meteorological datasets from a NASA database. Daily weather data can be aggregated over specific periods of time based on naive (for instance, nonoverlapping 10-day windows) or phenological approaches. Different machine learning methods for genomic prediction are implemented, including gradient-boosted decision trees, random forests, stacked ensemble models, and multilayer perceptrons. These prediction models can be evaluated via a collection of cross-validation schemes that mimic typical scenarios encountered by plant breeders working with multi-environment trial experimental data in a user-friendly way. The package is published under an MIT license and accessible on GitHub.
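The "naive" aggregation mentioned in this abstract, summarising daily weather over nonoverlapping 10-day windows, can be sketched in a few lines. This is an illustrative Python sketch only and is not the learnMET API, which is an R package. The function name, the choice of the mean as the summary statistic, the handling of a shorter final window and the toy temperature series are all assumptions.

```python
def aggregate_windows(daily_values, window=10):
    """Average daily values over consecutive, non-overlapping windows.

    A shorter final window is averaged over the days it contains.
    """
    means = []
    for start in range(0, len(daily_values), window):
        chunk = daily_values[start:start + window]
        means.append(sum(chunk) / len(chunk))
    return means


# Hypothetical series: 25 days of daily mean temperature.
daily_temp = [float(day) for day in range(1, 26)]
window_means = aggregate_windows(daily_temp)  # one covariate per window
```

Each window mean can then serve as one environmental covariate per trial. A phenological approach would replace the fixed window boundaries with crop growth stage boundaries.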
Affiliation(s)
- Cathy C Westhues
- Division of Plant Breeding Methodology, Department of Crop Sciences, University of Goettingen, 37075 Goettingen, Germany
- Center for Integrated Breeding Research, University of Goettingen, 37075 Goettingen, Germany
- Henner Simianer
- Center for Integrated Breeding Research, University of Goettingen, 37075 Goettingen, Germany
- Animal Breeding and Genetics Group, Department of Animal Sciences, University of Goettingen, 37075 Goettingen, Germany
- Timothy M Beissinger
- Division of Plant Breeding Methodology, Department of Crop Sciences, University of Goettingen, 37075 Goettingen, Germany
- Center for Integrated Breeding Research, University of Goettingen, 37075 Goettingen, Germany
6
Montesinos-López OA, Montesinos-López A, Bernal Sandoval DA, Mosqueda-Gonzalez BA, Valenzo-Jiménez MA, Crossa J. Multi-trait genome prediction of new environments with partial least squares. Front Genet 2022; 13:966775. [PMID: 36134027 PMCID: PMC9483856 DOI: 10.3389/fgene.2022.966775] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/11/2022] [Accepted: 07/18/2022] [Indexed: 11/18/2022] Open
Abstract
The genomic selection (GS) methodology proposed over 20 years ago by Meuwissen et al. (Genetics, 2001) has revolutionized plant breeding. A predictive methodology that trains statistical machine learning algorithms with phenotypic and genotypic data of a reference population and makes predictions for genotyped candidate lines, GS saves significant resources in the selection of candidate individuals. However, its practical implementation is still challenging when the plant breeder is interested in the prediction of future seasons or new locations and/or environments, which is called the “leave one environment out” issue. Furthermore, because the distributions of the training and testing set do not match, most statistical machine learning methods struggle to produce moderate or reasonable prediction accuracies. For this reason, the main objective of this study was to explore the use of the multi-trait partial least square (MT-PLS) regression methodology for this specific task, benchmarking its performance with the Bayesian Multi-trait Genomic Best Linear Unbiased Predictor (MT-GBLUP) method. The benchmarking process was performed with five actual data sets. We found that in all data sets the MT-PLS method outperformed the popular MT-GBLUP method by 349.8% (under predictor E + G), 484.4% (under predictor E + G + GE; where E denotes environments, G genotypes and GE the genotype by environment interaction) and 15.9% (under predictor G + GE) across traits. Our results provide empirical evidence of the power of the MT-PLS methodology for the prediction of future seasons or new environments. Furthermore, the comparison between single univariate-trait (UT) versus MT for GBLUP and PLS gave an increase in prediction accuracy of MT-GBLUP versus UT-GBLUP, but not for MT-PLS versus UT-PLS.
Affiliation(s)
- Abelardo Montesinos-López
- Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara, Mexico
- *Correspondence: Abelardo Montesinos-López, José Crossa
- Marco Alberto Valenzo-Jiménez
- Universidad Michoacana de San Nicolas de Hidalgo (UMSNH), Avenida Francisco J. Mujica S/N Ciudad Universitaria, Morelia, MC, Mexico
- José Crossa
- International Maize and Wheat Improvement Center, Texcoco, Edo. de Mexico, Mexico
- Colegio de Postgraduados, Montecillos, Edo. de Mexico, Mexico
- *Correspondence: Abelardo Montesinos-López, José Crossa
7
Montesinos-López OA, Montesinos-López A, Cano-Paez B, Hernández-Suárez CM, Santana-Mancilla PC, Crossa J. A Comparison of Three Machine Learning Methods for Multivariate Genomic Prediction Using the Sparse Kernels Method (SKM) Library. Genes (Basel) 2022; 13:1494. [PMID: 36011405 PMCID: PMC9407886 DOI: 10.3390/genes13081494] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2022] [Revised: 08/10/2022] [Accepted: 08/19/2022] [Indexed: 11/30/2022] Open
Abstract
Genomic selection (GS) changed the way plant breeders select genotypes. GS takes advantage of phenotypic and genotypic information to train a statistical machine learning model, which is used to predict phenotypic (or breeding) values of new lines for which only genotypic information is available. Therefore, many statistical machine learning methods have been proposed for this task. Multi-trait (MT) genomic prediction models take advantage of correlated traits to improve prediction accuracy. Therefore, some multivariate statistical machine learning methods are popular for GS. In this paper, we compare the prediction performance of three MT methods: the MT genomic best linear unbiased predictor (GBLUP), the MT partial least squares (PLS) and the multi-trait random forest (RF) methods. Benchmarking was performed with six real datasets. We found that the three investigated methods produce similar results, but under predictors with genotype (G) and environment (E), that is, E + G, the MT GBLUP achieved superior performance, whereas under predictors E + G + genotype × environment (GE) and G + GE, random forest achieved the best results. We also found that the best predictions were achieved under the predictors E + G and E + G + GE. Here, we also provide the R code for the implementation of these three statistical machine learning methods in the sparse kernel method (SKM) library, which offers not only options for single-trait prediction with various statistical machine learning methods but also some options for MT predictions that can help to better capture the complex patterns that are common in genomic selection datasets.
Affiliation(s)
- Abelardo Montesinos-López
- Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara 44100, Mexico
- Correspondence: A.M.-L.; J.C.
- Bernabe Cano-Paez
- Facultad de Ciencias, Universidad Nacional Autónoma de México (UNAM), México City 04510, Mexico
- Carlos Moisés Hernández-Suárez
- Instituto de Ciencias Tecnología e Innovación, Universidad Francisco Gavidia, El Progreso St., No. 2748, Colonia Flor Blanca, San Salvador CP 1101, El Salvador
- José Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Texcoco 56237, Mexico
- Colegio de Postgraduados, Montecillo 56230, Mexico
- Correspondence: A.M.-L.; J.C.
8
Pérez-Rodríguez P, de Los Campos G. Multi-trait Bayesian Shrinkage and Variable Selection Models with the BGLR R-package. Genetics 2022; 222:6655691. [PMID: 35924977 PMCID: PMC9434216 DOI: 10.1093/genetics/iyac112] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/11/2022] [Accepted: 07/14/2022] [Indexed: 12/02/2022] Open
Abstract
The BGLR R package implements various types of single-trait shrinkage/variable selection Bayesian regressions. The package was first released in 2014 and has since become widely used software in genomic studies. We recently developed functionality for multitrait models. The implementation allows users to include an arbitrary number of random-effects terms. For each set of predictors, users can choose diffuse, Gaussian, and Gaussian–spike–slab multivariate priors. Unlike other software packages for multitrait genomic regressions, BGLR offers many specifications for (co)variance parameters (unstructured, diagonal, factor analytic, and recursive). Samples from the posterior distribution of the models implemented in the multitrait function are generated using a Gibbs sampler, which is implemented by combining code written in the R and C programming languages. In this article, we provide an overview of the models and methods implemented in BGLR’s multitrait function, present examples that illustrate the use of the package, and benchmark the performance of the software.
Affiliation(s)
- Paulino Pérez-Rodríguez
- Colegio de Postgraduados, CP 56230, Montecillos, Estado de México, México
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI 48824, USA
- Gustavo de Los Campos
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI 48824, USA
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
- Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
9
Gaire R, de Arruda MP, Mohammadi M, Brown-Guedira G, Kolb FL, Rutkoski J. Multi-trait genomic selection can increase selection accuracy for deoxynivalenol accumulation resulting from fusarium head blight in wheat. Plant Genome 2022; 15:e20188. [PMID: 35043582 DOI: 10.1002/tpg2.20188] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 07/13/2021] [Accepted: 11/18/2021] [Indexed: 06/14/2023]
Abstract
Multi-trait genomic prediction (MTGP) can improve selection accuracy for economically valuable 'primary' traits by incorporating data on correlated secondary traits. Resistance to Fusarium head blight (FHB), a fungal disease of wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.), is evaluated using four genetically correlated traits: incidence (INC), severity (SEV), Fusarium damaged kernels (FDK), and deoxynivalenol content (DON). Both FDK and DON are primary traits; DON evaluation is expensive and usually requires several months for wheat breeders to get results from service laboratories performing the evaluations. We evaluated MTGP for DON using three soft red winter wheat breeding datasets: two diversity panels from the University of Illinois (IL) and Purdue University (PU) and a dataset consisting of 2019-2020 University of Illinois breeding cohorts. For DON, relative to single-trait (ST) genomic prediction, MTGP including phenotypic data for secondary traits on both validation and training sets, resulted in 23.4 and 10.6% higher predictive abilities in IL and PU panels, respectively. The MTGP models were advantageous only when secondary traits were included in both training and validation sets. In addition, MTGP models were more accurate than ST models only when FDK was included, and once FDK was included in the model, adding additional traits hardly improved accuracy. Evaluation of MTGP models across testing cohorts indicated that MTGP could increase accuracy by more than twofold in the early stages. Overall, we show that MTGP can increase selection accuracy for resistance to DON accumulation in wheat provided FDK is evaluated on the selection candidates.
Affiliation(s)
- Rupesh Gaire
- Crop Sciences, Univ. of Illinois at Urbana-Champaign, 1102 S. Goodwin Avenue, Urbana, IL, 61801, USA
- Mohsen Mohammadi
- Agronomy Dep., Purdue Univ., 915 W State St, West Lafayette, IN, 47907, USA
- Gina Brown-Guedira
- USDA-ARS Plant Science Research & Crop and Soil Sciences, North Carolina State University, Williams Hall 4114A, Raleigh, NC, 27695, USA
- Frederic L Kolb
- Crop Sciences, Univ. of Illinois at Urbana-Champaign, 1102 S. Goodwin Avenue, Urbana, IL, 61801, USA
- Jessica Rutkoski
- Crop Sciences, Univ. of Illinois at Urbana-Champaign, 1102 S. Goodwin Avenue, Urbana, IL, 61801, USA
10
Sandhu KS, Patil SS, Aoun M, Carter AH. Multi-Trait Multi-Environment Genomic Prediction for End-Use Quality Traits in Winter Wheat. Front Genet 2022; 13:831020. [PMID: 35173770 PMCID: PMC8841657 DOI: 10.3389/fgene.2022.831020] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 12/07/2021] [Accepted: 01/06/2022] [Indexed: 11/13/2022] Open
Abstract
Soft white wheat is a wheat class used in foreign and domestic markets to make various end products requiring specific quality attributes. Due to associated cost, time, and amount of seed needed, phenotyping for the end-use quality trait is delayed until later generations. Previously, we explored the potential of using genomic selection (GS) for selecting superior genotypes earlier in the breeding program. Breeders typically measure multiple traits across various locations, and it opens up the avenue for exploring multi-trait-based GS models. This study's main objective was to explore the potential of using multi-trait GS models for predicting seven different end-use quality traits using cross-validation, independent prediction, and across-location predictions in a wheat breeding program. The population used consisted of 666 soft white wheat genotypes planted for 5 years at two locations in Washington, United States. We optimized and compared the performances of four uni-trait- and multi-trait-based GS models, namely, Bayes B, genomic best linear unbiased prediction (GBLUP), multilayer perceptron (MLP), and random forests. The prediction accuracies for multi-trait GS models were 5.5 and 7.9% superior to uni-trait models for the within-environment and across-location predictions. Multi-trait machine and deep learning models performed superior to GBLUP and Bayes B for across-location predictions, but their advantages diminished when the genotype by environment component was included in the model. The highest improvement in prediction accuracy, that is, 35% was obtained for flour protein content with the multi-trait MLP model. This study showed the potential of using multi-trait-based GS models to enhance prediction accuracy by using information from previously phenotyped traits. It would assist in speeding up the breeding cycle time in a cost-friendly manner.
Affiliation(s)
- Karansher S. Sandhu
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA, United States
- Shruti Sunil Patil
- School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA, United States
- Meriem Aoun
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA, United States
- Arron H. Carter
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA, United States
11
Covarrubias-Pazaran G. Overview of Major Computer Packages for Genomic Prediction of Complex Traits. Methods Mol Biol 2022; 2467:157-187. [PMID: 35451776 DOI: 10.1007/978-1-0716-2205-6_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Genomic prediction models are showing their power to increase the rate of genetic gain by boosting all the elements of the breeder's equation. Insight into the factors associated with the successful implementation of this prediction model is increasing with time but the technology has reached a stage of acceptance. Most genomic prediction models require specialized computer packages based mainly on linear models and related methods. The number of computer packages has exploded in recent years given the interest in this technology. In this chapter, we explore the main computer packages available to fit these models; we also review the special features, strengths, and weaknesses of the methods behind the most popular computer packages.
Affiliation(s)
- Giovanny Covarrubias-Pazaran
- Centro Internacional de Mejoramiento de Maiz y Trigo (CIMMYT), Texcoco, Mexico
- Excellence in Breeding Platform (EiB), Texcoco, Mexico
12
Crossa J, Montesinos-López OA, Pérez-Rodríguez P, Costa-Neto G, Fritsche-Neto R, Ortiz R, Martini JWR, Lillemo M, Montesinos-López A, Jarquin D, Breseghello F, Cuevas J, Rincent R. Genome and Environment Based Prediction Models and Methods of Complex Traits Incorporating Genotype × Environment Interaction. Methods Mol Biol 2022; 2467:245-283. [PMID: 35451779 DOI: 10.1007/978-1-0716-2205-6_9] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 06/14/2023]
Abstract
Genomic-enabled prediction models are of paramount importance for the successful implementation of genomic selection (GS) based on breeding values. As opposed to animal breeding, plant breeding includes extensive multienvironment and multiyear field trial data. Hence, genomic-enabled prediction models should include genotype × environment (G × E) interaction, which most of the time increases prediction performance when the responses of lines differ from environment to environment. In this chapter, we describe a historical timeline since 2012 of advances in GS models that take G × E interaction into account. We describe theoretical and practical aspects of those GS models, including the gains in prediction performance when including G × E structures for both complex continuous and categorical scale traits. We then detail and explain the main G × E genomic prediction models for complex traits measured on continuous and noncontinuous (categorical) scales. In relation to G × E interaction models, this review also examines the analysis of information generated from high-throughput phenotype data (phenomics) and the joint analysis of multitrait and multienvironment field trial data, which is also employed in the general assessment of multitrait G × E interaction. The inclusion of nongenomic data to increase the accuracy and biological reliability of the G × E approach is also outlined. We show the recent advances in large-scale envirotyping (enviromics), and how mechanistic computational modeling can derive the crop growth and development aspects useful for predicting phenotypes and explaining G × E.
Affiliation(s)
- José Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, Mexico
- Colegio de Postgraduados, Montecillos, Mexico
- Germano Costa-Neto
- Departamento de Genética, Escola Superior de Agricultura "Luiz de Queiroz" (ESALQ/USP), São Paulo, Brazil
- Roberto Fritsche-Neto
- Departamento de Genética, Escola Superior de Agricultura "Luiz de Queiroz" (ESALQ/USP), São Paulo, Brazil
- Rodomiro Ortiz
- Department of Plant Breeding, Swedish University of Agricultural Sciences (SLU), Alnarp, Sweden
- Johannes W R Martini
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, Mexico
- Morten Lillemo
- Department of Plant Sciences, Norwegian University of Life Sciences, IHA/CIGENE, Ås, Norway
- Abelardo Montesinos-López
- Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara, Jalisco, Mexico
- Jaime Cuevas
- Universidad de Quintana Roo, Chetumal, Quintana Roo, Mexico
- Renaud Rincent
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, Génétique Quantitative et Evolution - Le Moulon, Gif-sur-Yvette, France
13
Montesinos-López A, Runcie DE, Ibba MI, Pérez-Rodríguez P, Montesinos-López OA, Crespo LA, Bentley AR, Crossa J. Multi-trait genomic-enabled prediction enhances accuracy in multi-year wheat breeding trials. G3 (Bethesda) 2021; 11:6332007. [PMID: 34568924 PMCID: PMC8496321 DOI: 10.1093/g3journal/jkab270] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/01/2021] [Accepted: 07/25/2021] [Indexed: 11/14/2022]
Abstract
Implementing genomic-based prediction models in genomic selection requires an understanding of the measures for evaluating prediction accuracy from different models and methods using multi-trait data. In this study, we compared prediction accuracy using six large multi-trait wheat data sets (quality and grain yield). The data were used to predict one year (testing) from the previous year (training) to assess prediction accuracy using four different prediction models. The results indicated that the conventional Pearson’s correlation between observed and predicted values underestimated the true correlation value, whereas the corrected Pearson’s correlation calculated by fitting a bivariate model was higher than the Pearson’s correlation divided by the square root of the heritability across traits, by 2.53–11.46%. Across the datasets, the corrected Pearson’s correlation was higher than the uncorrected one by 5.80–14.01%. Overall, we found that for grain yield the prediction performance was highest using a multi-trait model rather than a single-trait model. The higher the absolute genetic correlation between traits, the greater the benefit of multi-trait models for increasing the genomic-enabled prediction accuracy of traits.
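The simpler of the two corrections named in this abstract, dividing Pearson's correlation by the square root of the heritability, can be sketched as follows. This is only an illustration: the function names and the numbers in the usage note are hypothetical, and the bivariate-model correction that the study reports as higher is not reproduced here.

```python
import numpy as np


def pearson(observed, predicted):
    """Conventional Pearson correlation between observed and predicted values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.corrcoef(observed, predicted)[0, 1])


def heritability_corrected(r, h2):
    """Divide a correlation by sqrt(h2), the heritability-based correction.

    h2 is assumed to be a heritability estimate in the interval (0, 1].
    """
    if not 0.0 < h2 <= 1.0:
        raise ValueError("h2 must lie in (0, 1]")
    return float(r / np.sqrt(h2))
```

For example, `heritability_corrected(0.4, 0.64)` divides 0.4 by 0.8. Because the divisor is at most 1, the corrected value is never smaller in magnitude than the raw correlation.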
Affiliation(s)
- Abelardo Montesinos-López
  - Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara 44430, Mexico
- Daniel E Runcie
  - Department of Plant Sciences, College of Agricultural & Environmental Sciences, University of California Davis, Davis CA 95616, USA
- Maria Itria Ibba
  - International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, México
- Leonardo A Crespo
  - International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, México
- Alison R Bentley
  - International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, México
- José Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, México
  - Colegio de Postgraduados (COLPOS), Montecillos, Edo. de México, México
14
Gill HS, Halder J, Zhang J, Brar NK, Rai TS, Hall C, Bernardo A, Amand PS, Bai G, Olson E, Ali S, Turnipseed B, Sehgal SK. Multi-Trait Multi-Environment Genomic Prediction of Agronomic Traits in Advanced Breeding Lines of Winter Wheat. FRONTIERS IN PLANT SCIENCE 2021; 12:709545. [PMID: 34490011 PMCID: PMC8416538 DOI: 10.3389/fpls.2021.709545] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 05/14/2021] [Accepted: 07/12/2021] [Indexed: 06/13/2023]
Abstract
Genomic prediction is a promising approach for accelerating the genetic gain of complex traits in wheat breeding. However, increasing the prediction accuracy (PA) of genomic prediction (GP) models remains a challenge in the successful implementation of this approach. Multivariate models have shown promise when evaluated using diverse panels of unrelated accessions; however, limited information is available on their performance in advanced breeding trials. Here, we used multivariate GP models to predict multiple agronomic traits using 314 advanced and elite breeding lines of winter wheat evaluated in 10 site-year environments. We evaluated a multi-trait (MT) model with two cross-validation schemes representing different breeding scenarios (CV1, prediction of completely unphenotyped lines; and CV2, prediction of partially phenotyped lines for correlated traits). Moreover, extensive data from multi-environment trials (METs) were used to cross-validate a Bayesian multi-trait multi-environment (MTME) model that integrates the analysis of multiple-traits, such as G × E interaction. The MT-CV2 model outperformed all the other models for predicting grain yield with significant improvement in PA over the single-trait (ST-CV1) model. The MTME model performed better for all traits, with average improvement over the ST-CV1 reaching up to 19, 71, 17, 48, and 51% for grain yield, grain protein content, test weight, plant height, and days to heading, respectively. Overall, the empirical analyses elucidate the potential of both the MT-CV2 and MTME models when advanced breeding lines are used as a training population to predict related preliminary breeding lines. Further, we evaluated the practical application of the MTME model in the breeding program to reduce phenotyping cost using a sparse testing design. This showed that complementing METs with GP can substantially enhance resource efficiency. Our results demonstrate that multivariate GS models have a great potential in implementing GS in breeding programs.
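The difference between the CV1 and CV2 schemes described in this abstract can be sketched as two masks over a lines × traits phenotype matrix. The sketch below is an assumption-laden illustration rather than the authors' procedure: the function name `cv_masks`, the 20% test fraction and the random sampling of test lines are all hypothetical choices.

```python
import numpy as np


def cv_masks(n_lines, n_traits, target_trait, test_fraction=0.2, seed=0):
    """Return test-line indices plus CV1 and CV2 masks (True = record hidden).

    CV1: test lines are completely unphenotyped, so every trait is hidden.
    CV2: test lines are partially phenotyped, so only the target trait is
         hidden and the correlated traits stay available to the model.
    """
    rng = np.random.default_rng(seed)
    n_test = max(1, int(round(test_fraction * n_lines)))
    test_lines = rng.choice(n_lines, size=n_test, replace=False)

    cv1 = np.zeros((n_lines, n_traits), dtype=bool)
    cv1[test_lines, :] = True  # hide all traits of the test lines

    cv2 = np.zeros((n_lines, n_traits), dtype=bool)
    cv2[test_lines, target_trait] = True  # hide only the target trait

    return test_lines, cv1, cv2
```

In both schemes the same records of the target trait are predicted; what changes is how much information on the other traits of the test lines a multi-trait model may use.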
Affiliation(s)
- Harsimardeep S. Gill
  - Department of Agronomy, Horticulture & Plant Science, South Dakota State University, Brookings, SD, United States
- Jyotirmoy Halder
  - Department of Agronomy, Horticulture & Plant Science, South Dakota State University, Brookings, SD, United States
- Jinfeng Zhang
  - Department of Agronomy, Horticulture & Plant Science, South Dakota State University, Brookings, SD, United States
- Navreet K. Brar
  - Department of Agronomy, Horticulture & Plant Science, South Dakota State University, Brookings, SD, United States
- Teerath S. Rai
  - Department of Agronomy, Horticulture & Plant Science, South Dakota State University, Brookings, SD, United States
- Cody Hall
  - Department of Agronomy, Horticulture & Plant Science, South Dakota State University, Brookings, SD, United States
- Amy Bernardo
  - Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
- Paul St Amand
  - United States Department of Agriculture - Agricultural Research Services, Hard Winter Wheat Genetic Research Unit, Manhattan, KS, United States
- Guihua Bai
  - United States Department of Agriculture - Agricultural Research Services, Hard Winter Wheat Genetic Research Unit, Manhattan, KS, United States
- Eric Olson
  - Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI, United States
- Shaukat Ali
  - Department of Agronomy, Horticulture & Plant Science, South Dakota State University, Brookings, SD, United States
- Brent Turnipseed
  - Department of Agronomy, Horticulture & Plant Science, South Dakota State University, Brookings, SD, United States
- Sunish K. Sehgal
  - Department of Agronomy, Horticulture & Plant Science, South Dakota State University, Brookings, SD, United States
15
Zhang W, Boyle K, Brule-Babel A, Fedak G, Gao P, Djama ZR, Polley B, Cuthbert R, Randhawa H, Graf R, Jiang F, Eudes F, Fobert PR. Evaluation of Genomic Prediction for Fusarium Head Blight Resistance with a Multi-Parental Population. BIOLOGY 2021; 10:biology10080756. [PMID: 34439988 PMCID: PMC8389552 DOI: 10.3390/biology10080756] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/30/2021] [Revised: 08/01/2021] [Accepted: 08/02/2021] [Indexed: 12/12/2022]
Abstract
Simple Summary: Genomic selection is a promising approach to select superior wheat lines with better resistance to Fusarium head blight. The accuracy of genomic selection is determined by many factors. In this study, we found that a large training population, genomic selection models incorporating biological information, and multi-environment modelling led to considerably better predictabilities. A training population designed by the coefficient of determination (CDmean) could increase the accuracy of prediction. Relatedness between the training population (TP) and the testing population is key to the accuracy of genomic selection across populations.

Abstract: Fusarium head blight (FHB) resistance is quantitatively inherited, controlled by multiple minor effect genes, and highly affected by the interaction of genotype and environment. This makes genomic selection (GS), which uses genome-wide molecular marker data to predict the genetic breeding value, a promising approach to select superior lines with better resistance. However, various factors can affect the accuracy of GS, and a better understanding of how these factors affect GS accuracy could ensure the success of applying GS to improve FHB resistance in wheat. In this study, we performed a comprehensive evaluation of factors that affect GS accuracy with a multi-parental population designed for FHB resistance. We found that larger sample sizes gave better accuracies. Training populations designed by CDmean-based optimization algorithms gave significantly higher accuracies than a random sampling approach, while the mean of prediction error variance (PEVmean) had the poorest performance. Different genomic selection models performed similarly in terms of accuracy. Including previously known large-effect quantitative trait loci (QTL) as fixed effects in the GS model considerably improved the predictability. Multi-trait models had almost no effect, while the multi-environment model outperformed the single-environment model for prediction across different environments. By comparing within- and across-family prediction, better accuracies were obtained when the training population was more closely related to the testing population. However, achieving good accuracy for GS prediction across populations remains a challenge for GS application.
Affiliation(s)
- Wentao Zhang
  - Aquatic and Crop Resources Development, National Research Council of Canada, Saskatoon, SK S7N 0W9, Canada; (K.B.); (P.G.); (B.P.)
  - Correspondence: (W.Z.); (P.R.F.)
- Kerry Boyle
  - Aquatic and Crop Resources Development, National Research Council of Canada, Saskatoon, SK S7N 0W9, Canada; (K.B.); (P.G.); (B.P.)
- Anita Brule-Babel
  - Department of Plant Science, Agriculture Building, University of Manitoba, Winnipeg, MB R3T 2N2, Canada;
- George Fedak
  - Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada; (G.F.); (Z.R.D.)
- Peng Gao
  - Aquatic and Crop Resources Development, National Research Council of Canada, Saskatoon, SK S7N 0W9, Canada; (K.B.); (P.G.); (B.P.)
- Zeinab Robleh Djama
  - Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada; (G.F.); (Z.R.D.)
- Brittany Polley
  - Aquatic and Crop Resources Development, National Research Council of Canada, Saskatoon, SK S7N 0W9, Canada; (K.B.); (P.G.); (B.P.)
- Richard Cuthbert
  - Swift Current Research and Development Centre, Agriculture and Agri-Food Canada, Swift Current, SK S9H 3X2, Canada;
- Harpinder Randhawa
  - Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB T1J 4B1, Canada; (H.R.); (R.G.); (F.J.); (F.E.)
- Robert Graf
  - Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB T1J 4B1, Canada; (H.R.); (R.G.); (F.J.); (F.E.)
- Fengying Jiang
  - Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB T1J 4B1, Canada; (H.R.); (R.G.); (F.J.); (F.E.)
- Francois Eudes
  - Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, AB T1J 4B1, Canada; (H.R.); (R.G.); (F.J.); (F.E.)
- Pierre R. Fobert
  - Aquatic and Crop Resources Development, National Research Council of Canada, Ottawa, ON K1A 0R6, Canada
  - Correspondence: (W.Z.); (P.R.F.)
16
Crossa J, Fritsche-Neto R, Montesinos-Lopez OA, Costa-Neto G, Dreisigacker S, Montesinos-Lopez A, Bentley AR. The Modern Plant Breeding Triangle: Optimizing the Use of Genomics, Phenomics, and Enviromics Data. FRONTIERS IN PLANT SCIENCE 2021; 12:651480. [PMID: 33936136 PMCID: PMC8085545 DOI: 10.3389/fpls.2021.651480] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Received: 01/09/2021] [Accepted: 02/11/2021] [Indexed: 05/04/2023]
Affiliation(s)
- Jose Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, de Mexico, Mexico
  - Colegio de Postgraduados, Montecillo, Edo. de Mexico, Mexico
- Roberto Fritsche-Neto
  - Department of Genetics, “Luiz de Queiroz” Agriculture College, University of São Paulo, São Paulo, Brazil
- Germano Costa-Neto
  - Department of Genetics, “Luiz de Queiroz” Agriculture College, University of São Paulo, São Paulo, Brazil
- Susanne Dreisigacker
  - International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, de Mexico, Mexico
- Abelardo Montesinos-Lopez
  - Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara, Mexico
- Alison R. Bentley
  - International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, de Mexico, Mexico
  - *Correspondence: Alison R. Bentley
17
Ibba MI, Crossa J, Montesinos-López OA, Montesinos-López A, Juliana P, Guzman C, Delorean E, Dreisigacker S, Poland J. Genome-based prediction of multiple wheat quality traits in multiple years. THE PLANT GENOME 2020; 13:e20034. [PMID: 33217204 DOI: 10.1002/tpg2.20034] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 02/19/2020] [Accepted: 05/26/2020] [Indexed: 05/20/2023]
Abstract
Wheat quality improvement is an important objective in all wheat breeding programs. However, due to the cost, time and quantity of seed required, wheat quality is typically analyzed only in the last stages of the breeding cycle on a limited number of samples. The use of genomic prediction could greatly help to select for wheat quality more efficiently by reducing the cost and time required for this analysis. Here we evaluated the prediction performance for 13 wheat quality traits under two multi-trait models (Bayesian multi-trait multi-environment [BMTME] and multi-trait ridge regression [MTR]) using five data sets of wheat lines evaluated in the field during two consecutive years. Lines in the second year (testing) were predicted using the quality information obtained in the first year (training). Moderate to high prediction accuracies were found for most quality traits, suggesting that the use of genomic selection could be feasible. The best predictions were obtained with the BMTME model in all traits and the worst with the MTR model. The best predictions with the BMTME model under the mean arctangent absolute percentage error (MAAPE) were for test weight across the five data sets, whereas the worst predictions were for the alveograph trait ALVPL. In contrast, under Pearson's correlation, the best predictions depended on the data set. The results obtained suggest that the BMTME model should be preferred for multi-trait prediction analyses. This model makes it possible to obtain not only the correlation among traits, but also the correlation among environments, helping to increase the prediction accuracy.
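The MAAPE criterion mentioned in this abstract can be sketched in a few lines, assuming its usual definition as the mean of arctan(|(y − ŷ)/y|) over all records. The function name `maape` and the handling of zero observations are illustrative assumptions, not taken from the study.

```python
import numpy as np


def maape(observed, predicted):
    """Mean arctangent absolute percentage error (lower is better).

    Each absolute percentage error is passed through arctan, so every term
    lies in [0, pi/2] and a zero observation with a non-zero prediction
    contributes pi/2 instead of an infinite error. A pair where observation
    and prediction are both zero gives 0/0 and therefore NaN in this sketch.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        ape = np.abs((observed - predicted) / observed)
    return float(np.mean(np.arctan(ape)))
```

Unlike Pearson's correlation, MAAPE measures the size of the prediction errors rather than the agreement in ranking, which may be one reason the two criteria can single out different best-predicted traits.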
Affiliation(s)
- Maria Itria Ibba
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45 Carretera, Mexico-Veracruz, CP, 52640, Mexico
- Jose Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45 Carretera, Mexico-Veracruz, CP, 52640, Mexico
  - Colegio de Postgraduados (COLPOS), Montecillos, Edo. de México, CP, 56230, México
- Abelardo Montesinos-López
  - Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara, Jalisco, 44430, México
- Philomin Juliana
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45 Carretera, Mexico-Veracruz, CP, 52640, Mexico
- Carlos Guzman
  - Departamento de Genética, Escuela Técnica Superior de Ingeniería Agronómica y de Montes, Campus de Rabanales, Universidad de Córdoba, Córdoba, Spain
- Emily Delorean
  - Department of Agronomy, Kansas State University, 2004 Throckmorton Plant Science Center, Manhattan, KS, 66506, USA
- Susanne Dreisigacker
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45 Carretera, Mexico-Veracruz, CP, 52640, Mexico
- Jesse Poland
  - Department of Agronomy, Kansas State University, 2004 Throckmorton Plant Science Center, Manhattan, KS, 66506, USA
18
Guo J, Khan J, Pradhan S, Shahi D, Khan N, Avci M, Mcbreen J, Harrison S, Brown-Guedira G, Murphy JP, Johnson J, Mergoum M, Esten Mason R, Ibrahim AMH, Sutton R, Griffey C, Babar MA. Multi-Trait Genomic Prediction of Yield-Related Traits in US Soft Wheat under Variable Water Regimes. Genes (Basel) 2020; 11:genes11111270. [PMID: 33126620 PMCID: PMC7716228 DOI: 10.3390/genes11111270] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 09/08/2020] [Revised: 10/23/2020] [Accepted: 10/26/2020] [Indexed: 11/16/2022] Open
Abstract
The performance of genomic prediction (GP) on genetically correlated traits can be improved through an interdependence multi-trait model under a multi-environment context. In this study, a panel of 237 soft facultative wheat (Triticum aestivum L.) lines was evaluated to compare single- and multi-trait models for predicting grain yield (GY), harvest index (HI), spike fertility (SF), and thousand grain weight (TGW). The panel was phenotyped in two locations and two years in Florida under drought and moderate drought stress conditions, while the genotyping was performed using 27,957 genotyping-by-sequencing (GBS) single nucleotide polymorphism (SNP) markers. Five predictive models were compared: Multi-environment Genomic Best Linear Unbiased Predictor (MGBLUP), Bayesian Multi-trait Multi-environment (BMTME), Bayesian Multi-output Regressor Stacking (BMORS), Single-trait Multi-environment Deep Learning (SMDL), and Multi-trait Multi-environment Deep Learning (MMDL). Across environments, the multi-trait statistical model (BMTME) was superior to the multi-trait DL model for prediction accuracy in most scenarios, but the DL models were comparable to the statistical models for response to selection. The multi-trait model also showed 5 to 22% more genetic gain than the single-trait model across environments, as reflected by the response to selection. Overall, these results suggest that multi-trait genomic prediction can be an efficient strategy for economically important yield-component-related traits in soft wheat.
Affiliation(s)
- Jia Guo
  - Department of Agronomy, University of Florida, Gainesville, FL 32611, USA; (J.G.); (J.K.); (S.P.); (D.S.); (N.K.); (M.A.); (J.M.)
- Jahangir Khan
  - Department of Agronomy, University of Florida, Gainesville, FL 32611, USA; (J.G.); (J.K.); (S.P.); (D.S.); (N.K.); (M.A.); (J.M.)
- Sumit Pradhan
  - Department of Agronomy, University of Florida, Gainesville, FL 32611, USA; (J.G.); (J.K.); (S.P.); (D.S.); (N.K.); (M.A.); (J.M.)
- Dipendra Shahi
  - Department of Agronomy, University of Florida, Gainesville, FL 32611, USA; (J.G.); (J.K.); (S.P.); (D.S.); (N.K.); (M.A.); (J.M.)
- Naeem Khan
  - Department of Agronomy, University of Florida, Gainesville, FL 32611, USA; (J.G.); (J.K.); (S.P.); (D.S.); (N.K.); (M.A.); (J.M.)
- Muhsin Avci
  - Department of Agronomy, University of Florida, Gainesville, FL 32611, USA; (J.G.); (J.K.); (S.P.); (D.S.); (N.K.); (M.A.); (J.M.)
- Jordan Mcbreen
  - Department of Agronomy, University of Florida, Gainesville, FL 32611, USA; (J.G.); (J.K.); (S.P.); (D.S.); (N.K.); (M.A.); (J.M.)
- Stephen Harrison
  - School of Plant Environment and Soil Sciences, Louisiana State University, Baton Rouge, LA 70803, USA;
- Joseph Paul Murphy
  - Department of Crop and Soil Sciences, North Carolina State University, Raleigh, NC 27607, USA;
- Jerry Johnson
  - Department of Crop and Soil Sciences, University of Georgia, Griffin, GA 32223, USA; (J.J.); (M.M.)
- Mohamed Mergoum
  - Department of Crop and Soil Sciences, University of Georgia, Griffin, GA 32223, USA; (J.J.); (M.M.)
- Richard Esten Mason
  - Department of Crop Soil and Environmental Sciences, University of Arkansas, Fayetteville, AR 72701, USA;
- Amir M. H. Ibrahim
  - Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA; (A.M.H.I.); (R.S.)
- Russel Sutton
  - Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA; (A.M.H.I.); (R.S.)
- Carl Griffey
  - School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA 24061, USA;
- Md Ali Babar
  - Department of Agronomy, University of Florida, Gainesville, FL 32611, USA; (J.G.); (J.K.); (S.P.); (D.S.); (N.K.); (M.A.); (J.M.)
  - Correspondence:
19
Habyarimana E, Lopez-Cruz M, Baloch FS. Genomic Selection for Optimum Index with Dry Biomass Yield, Dry Mass Fraction of Fresh Material, and Plant Height in Biomass Sorghum. Genes (Basel) 2020; 11:genes11010061. [PMID: 31948110 PMCID: PMC7017155 DOI: 10.3390/genes11010061] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/02/2019] [Revised: 12/19/2019] [Accepted: 12/19/2019] [Indexed: 12/30/2022] Open
Abstract
Sorghum is one of the world’s major crops, expresses traits for resilience to climate change, and can be used for several purposes including food and clean fuels. Multiple-trait genomic prediction and selection models were implemented using genotyping-by-sequencing single nucleotide polymorphism markers and phenotypic data. We demonstrated for the first time the efficiency of genomic selection modelling of index selection including biofuel traits such as aboveground biomass yield, plant height, and dry mass fraction of the fresh material. This work also sheds light, for the first time, on the promising potential of using information from populations grown from seed to predict the performance of populations regrown from the rhizomes, even two winter seasons after the original trial was sown. Genomic selection modelling of the optimum index selection including the three traits of interest (plant height, aboveground dry biomass yield, and dry mass fraction of fresh mass material) was the most promising. Since the plant characteristics evaluated herein are routinely measured in cereal and other plant species of agricultural interest, it can be inferred that the findings can be transferred to other major crops.
Affiliation(s)
- Ephrem Habyarimana
  - CREA Research Center for Cereals and Industrial Crops, via di Corticella 133, 40128 Bologna, Italy
  - Correspondence:
- Marco Lopez-Cruz
  - Crop, Soil, and Microbial Sciences Department, Michigan State University, 1066 Bogue St, East Lansing, MI 42824, USA
- Faheem S. Baloch
  - Department of Field Crops, Faculty of Agricultural and Natural Sciences, Abant Izzet Baysal University, 14030 Bolu, Turkey
20
Xavier A, Muir WM, Rainey KM. bWGR: Bayesian Whole-Genome Regression. Bioinformatics 2019; 36:btz794. [PMID: 31647543 DOI: 10.1093/bioinformatics/btz794] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 06/13/2019] [Revised: 10/14/2019] [Accepted: 10/15/2019] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Whole-genome regression methods represent a key framework for genome-wide prediction, cross-validation studies, and association analysis. The bWGR package offers a compendium of Bayesian methods with various priors available, allowing users to predict complex traits with different genetic architectures.
RESULTS: Here we introduce bWGR, an R package that enables users to efficiently fit and cross-validate Bayesian and likelihood whole-genome regression methods. It implements a series of methods referred to as the Bayesian alphabet under traditional Gibbs sampling and optimized Expectation-Maximization. The package also enables fitting efficient multivariate models and complex hierarchical models. The package is user-friendly and computationally efficient.
AVAILABILITY AND IMPLEMENTATION: bWGR is an R package available in the CRAN repository. It can be installed in R by typing: install.packages("bWGR").
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Alencar Xavier
  - Corteva Agrisciences, 8305 NW 62nd Ave, Johnston IA
  - Purdue University, 915 W State St, West Lafayette IN
- Katy M Rainey
  - Purdue University, 915 W State St, West Lafayette IN
21
A Bayesian Genomic Multi-output Regressor Stacking Model for Predicting Multi-trait Multi-environment Plant Breeding Data. G3-GENES GENOMES GENETICS 2019; 9:3381-3393. [PMID: 31427455 PMCID: PMC6778812 DOI: 10.1534/g3.119.400336] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 11/18/2022]
Abstract
In this paper we propose a Bayesian multi-output regressor stacking (BMORS) model that is a generalization of the multi-trait regressor stacking method. The proposed BMORS model consists of two stages: in the first stage, a univariate genomic best linear unbiased prediction (GBLUP including genotype × environment interaction GE) model is implemented for each of the L traits under study; then the predictions of all traits are included as covariates in the second stage, by implementing a Ridge regression model. The main objectives of this research were to study alternative models to the existing multi-trait multi-environment (BMTME) model with respect to (1) genomic-enabled prediction accuracy, and (2) potential advantages in terms of computing resources and implementation. We compared the predictions of the BMORS model to those of the univariate GBLUP model using 7 maize and wheat datasets. We found that the proposed BMORS produced similar predictions to the univariate GBLUP model and to the BMTME model in terms of prediction accuracy; however, the best predictions were obtained under the BMTME model. In terms of computing resources, we found that the BMORS is at least 9 times faster than the BMTME method. Based on our empirical findings, the proposed BMORS model is an alternative for predicting multi-trait and multi-environment data, which are very common in genomic-enabled prediction in plant and animal breeding programs.
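The two-stage structure described in this abstract can be sketched as follows. This is a simplified stand-in, not the BMORS implementation: stage 1 uses a plain ridge regression on markers for each trait instead of a GBLUP model with G × E interaction, nothing here is Bayesian, and the names `ridge_fit`, `stacked_predict`, `lam1` and `lam2` are hypothetical.

```python
import numpy as np


def ridge_fit(X, Y, lam):
    """Closed-form ridge coefficients: (X'X + lam*I)^-1 X'Y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ Y)


def stacked_predict(X_train, Y_train, X_test, lam1=1.0, lam2=1.0):
    """Two-stage multi-trait stacking on marker matrix X and trait matrix Y.

    Stage 1: one single-trait ridge model per trait (the columns of Y are
             fitted independently even though they are solved together).
    Stage 2: each trait is regressed on the stage-1 predictions of all
             traits, so information is shared across traits.
    """
    x_mean = X_train.mean(axis=0)
    y_mean = Y_train.mean(axis=0)
    Xc, Xt, Yc = X_train - x_mean, X_test - x_mean, Y_train - y_mean

    # Stage 1: per-trait marker effects and the resulting predictions.
    B1 = ridge_fit(Xc, Yc, lam1)           # shape (markers, traits)
    P_train, P_test = Xc @ B1, Xt @ B1     # shape (lines, traits)

    # Stage 2: ridge regression with stage-1 predictions as covariates.
    B2 = ridge_fit(P_train, Yc, lam2)      # shape (traits, traits)
    return P_test @ B2 + y_mean
```

The second stage only solves a small traits × traits system, which is consistent with the abstract's point that stacking is computationally cheaper than a full multi-trait model. Note that this sketch reuses in-sample stage-1 predictions, a shortcut that a careful implementation might replace with out-of-fold predictions.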