1
Alemu A, Åstrand J, Montesinos-López OA, Isidro Y Sánchez J, Fernández-Gónzalez J, Tadesse W, Vetukuri RR, Carlsson AS, Ceplitis A, Crossa J, Ortiz R, Chawade A. Genomic selection in plant breeding: Key factors shaping two decades of progress. MOLECULAR PLANT 2024; 17:552-578. [PMID: 38475993 DOI: 10.1016/j.molp.2024.03.007] [Received: 11/03/2023] [Revised: 01/22/2024] [Accepted: 03/08/2024] [Indexed: 03/14/2024]
Abstract
Genomic selection, the application of genomic prediction (GP) models to select candidate individuals, has significantly advanced in the past two decades, effectively accelerating genetic gains in plant breeding. This article provides a holistic overview of key factors that have influenced GP in plant breeding during this period. We delved into the pivotal roles of training population size and genetic diversity, and their relationship with the breeding population, in determining GP accuracy. Special emphasis was placed on optimizing training population size. We explored its benefits and the associated diminishing returns beyond an optimum size. This was done while considering the balance between resource allocation and maximizing prediction accuracy through current optimization algorithms. The density and distribution of single-nucleotide polymorphisms, level of linkage disequilibrium, genetic complexity, trait heritability, statistical machine-learning methods, and non-additive effects are the other vital factors. Using wheat, maize, and potato as examples, we summarize the effect of these factors on the accuracy of GP for various traits. The search for high accuracy in GP (theoretically reaching one when using Pearson's correlation as a metric) is an active research area, as yet far from optimal for various traits. We hypothesize that with ultra-high sizes of genotypic and phenotypic datasets, effective training population optimization methods and support from other omics approaches (transcriptomics, metabolomics and proteomics) coupled with deep-learning algorithms could overcome the boundaries of current limitations to achieve the highest possible prediction accuracy, making genomic selection an effective tool in plant breeding.
Affiliation(s)
- Admas Alemu
  - Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
- Johanna Åstrand
  - Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden; Lantmännen Lantbruk, Svalöv, Sweden
- Julio Isidro Y Sánchez
  - Centro de Biotecnología y Genómica de Plantas (CBGP, UPM-INIA), Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Campus de Montegancedo-UPM, 28223 Madrid, Spain
- Javier Fernández-Gónzalez
  - Centro de Biotecnología y Genómica de Plantas (CBGP, UPM-INIA), Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Campus de Montegancedo-UPM, 28223 Madrid, Spain
- Wuletaw Tadesse
  - International Center for Agricultural Research in the Dry Areas (ICARDA), Rabat, Morocco
- Ramesh R Vetukuri
  - Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
- Anders S Carlsson
  - Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
- José Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, Texcoco, México 52640, Mexico
- Rodomiro Ortiz
  - Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
- Aakash Chawade
  - Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
2
Maginga TJ, Masabo E, Bakunzibake P, Kim KS, Nsenga J. Using wavelet transform and hybrid CNN - LSTM models on VOC & ultrasound IoT sensor data for non-visual maize disease detection. Heliyon 2024; 10:e26647. [PMID: 38420424 PMCID: PMC10901083 DOI: 10.1016/j.heliyon.2024.e26647] [Received: 06/20/2023] [Revised: 02/16/2024] [Accepted: 02/16/2024] [Indexed: 03/02/2024]
Abstract
Early detection of plant diseases is crucial for safeguarding crop yield, especially in regions vulnerable to food insecurity, such as Sub-Saharan Africa. One of the significant contributors to maize crop yield loss is the Northern Leaf Blight (NLB), which traditionally takes 14-21 days to visually manifest on maize. This study introduces a novel approach for detecting NLB as early as 4-5 days using Internet of Things (IoT) sensors, which can identify the disease before any visual symptoms appear. Utilizing Convolutional Neural Networks (CNN) and Long Short Term Memory (LSTM) models, nonvisual measurements of Total Volatile Organic Compounds (VOCs) and ultrasound emissions from maize plants were captured and analyzed. A controlled experiment was conducted on four maize varieties, and the data obtained were used to develop and validate a hybrid CNN-LSTM model for VOC classification and an LSTM model for ultrasound anomaly detection. The hybrid CNN-LSTM model, enhanced with wavelet data preprocessing, achieved an F1 score of 0.96 and an Area under the ROC Curve (AUC) of 1.00. In contrast, the LSTM model exhibited an impressive 99.98% accuracy in identifying anomalies in ultrasound emissions. Our findings underscore the potential of IoT sensors in early disease detection, paving the way for innovative disease prevention strategies in agriculture. Future work will focus on optimizing the models for IoT device deployment, incorporating chatbot technology, and adding more sensor data to improve accuracy and to evaluate the models in a field environment.
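The abstract mentions wavelet preprocessing of the sensor data but does not say which wavelet or thresholding rule was used. As an illustration only, the following minimal sketch assumes a single-level Haar transform with soft thresholding of the detail coefficients; the function names (`haar_dwt`, `haar_idwt`, `soft_threshold`, `denoise`) and the threshold value are hypothetical and are not taken from the study.

```python
import math


def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform (assumed wavelet)."""
    x = [float(v) for v in signal]
    if len(x) % 2:
        x.append(x[-1])  # pad odd-length input by repeating the last sample
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one level of the Haar transform."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) / s, (a - d) / s))
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient towards zero by t (soft thresholding)."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def denoise(signal, t):
    """Smooth a sensor trace by thresholding its Haar detail coefficients."""
    approx, detail = haar_dwt(signal)
    return haar_idwt(approx, soft_threshold(detail, t))[:len(signal)]


# Hypothetical VOC readings; the values are made up for illustration.
readings = [4.0, 6.0, 10.0, 12.0]
smoothed = denoise(readings, t=0.5)
```

With a threshold of zero the signal is reconstructed unchanged; a very large threshold replaces each pair of samples by its mean. In a pipeline like the one described, the smoothed trace (or the coefficients themselves) would then be passed to the classifier.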
Affiliation(s)
- Emmanuel Masabo
  - African Centre of Excellence in Internet of Things (ACEIoT) - University of Rwanda (UR), Rwanda
- Pierre Bakunzibake
  - African Centre of Excellence in Internet of Things (ACEIoT) - University of Rwanda (UR), Rwanda
- Kwang Soo Kim
  - Global Research and Development Business Centre (GRC-SNU) - Seoul National University (SNU), South Korea
- Jimmy Nsenga
  - African Centre of Excellence in Internet of Things (ACEIoT) - University of Rwanda (UR), Rwanda
3
Meher PK, Gupta A, Rustgi S, Mir RR, Kumar A, Kumar J, Balyan HS, Gupta PK. Evaluation of eight Bayesian genomic prediction models for three micronutrient traits in bread wheat (Triticum aestivum L.). THE PLANT GENOME 2023; 16:e20332. [PMID: 37122189 DOI: 10.1002/tpg2.20332] [Received: 10/21/2022] [Revised: 02/21/2023] [Accepted: 03/13/2023] [Indexed: 06/19/2023]
Abstract
In wheat, genomic prediction accuracy (GPA) was assessed for three micronutrient traits (grain iron, grain zinc, and β-carotenoid concentrations) using eight Bayesian regression models. For this purpose, data on 246 accessions, each genotyped with 17,937 DArT markers, were utilized. The phenotypic data on traits were available for 2013-2014 from Powerkheda (Madhya Pradesh) and for 2014-2015 from Meerut (Uttar Pradesh), India. The accuracy of the models was measured in terms of reliability, which was computed following a repeated cross-validation approach. The predictions were obtained independently for each of the two environments after adjusting for the local effects and across environments after adjusting for the environmental effects. The Bayes ridge regression (BayesRR) model outperformed the other seven models, whereas BayesLASSO (BayesL) was the least efficient. The GPA increased with an increase in the size of the training set as well as with an increase in marker density. The GPA values differed for the three traits and were higher for the best linear unbiased estimate (BLUE) (obtained after adjusting for the environmental effects) relative to those for the two environments. The GPA also remained unaffected after accounting for the population structure. The results of the present study suggest that only the best model should be used for the estimations of genomic estimated breeding values (GEBVs) before their use for genomic selection to improve the grain micronutrient contents.
Affiliation(s)
- Prabina Kumar Meher
  - Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Ajit Gupta
  - Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Sachin Rustgi
  - Department of Plant and Environmental Sciences, Pee Dee Research and Education Centre, Clemson University, Florence, South Carolina, USA
- Reyazul Rouf Mir
  - Division of Genetics and Plant Breeding, SKUAST-Kashmir, Kashmir, India
- Anuj Kumar
  - Department of Microbiology and Immunology, Dalhousie University, Halifax, Nova Scotia, Canada
  - Laboratory of Immunity, Shantou University Medical College, Shantou, People's Republic of China
- Jitendra Kumar
  - National Agri-Food Biotechnology Institute (NABI), Ajitgarh, India
- Harindra Singh Balyan
  - Department of Genetics and Plant Breeding, Chaudhary Charan Singh University, Meerut, India
- Pushpendra Kumar Gupta
  - Department of Genetics and Plant Breeding, Chaudhary Charan Singh University, Meerut, India
4
Joshi B, Singh S, Tiwari GJ, Kumar H, Boopathi NM, Jaiswal S, Adhikari D, Kumar D, Sawant SV, Iquebal MA, Jena SN. Genome-wide association study of fiber yield-related traits uncovers the novel genomic regions and candidate genes in Indian upland cotton (Gossypium hirsutum L.). FRONTIERS IN PLANT SCIENCE 2023; 14:1252746. [PMID: 37941674 PMCID: PMC10630025 DOI: 10.3389/fpls.2023.1252746] [Received: 07/04/2023] [Accepted: 09/11/2023] [Indexed: 11/10/2023]
Abstract
Upland cotton (Gossypium hirsutum L.) is a major fiber crop that is cultivated worldwide and has significant economic importance. India harbors the largest area for cotton cultivation, but its fiber yield is still compromised and ranks 22nd in terms of productivity. Genetic improvement of cotton fiber yield traits is one of the major goals of cotton breeding, but the understanding of the genetic architecture underlying cotton fiber yield traits remains limited and unclear. To better decipher the genetic variation associated with fiber yield traits, we conducted a comprehensive genome-wide association mapping study using 117 Indian cotton germplasm for six yield-related traits. To accomplish this, we generated 241,086 high-quality single nucleotide polymorphism (SNP) markers using genotyping-by-sequencing (GBS) methods. Population structure, PCA, kinship, and phylogenetic analyses divided the germplasm into two sub-populations, showing weak relatedness among the germplasms. Through association analysis, 205 SNPs and 134 QTLs were identified to be significantly associated with the six fiber yield traits. In total, 39 novel QTLs were identified in the current study, whereas 95 QTLs overlapped with existing public domain data in a comparative analysis. Eight QTLs, qGhBN_SCY_D6-1, qGhBN_SCY_D6-2, qGhBN_SCY_D6-3, qGhSI_LI_A5, qGhLI_SI_A13, qGhLI_SI_D9, qGhBW_SCY_A10, and qGhLP_BN_A8 were identified. Gene annotation of these fiber yield QTLs revealed 2,509 unique genes. These genes were predominantly enriched for different biological processes, such as plant cell wall synthesis, nutrient metabolism, and vegetative growth development in the gene ontology (GO) enrichment study. Furthermore, gene expression analysis using RNAseq data from 12 diverse cotton tissues identified 40 candidate genes (23 stable and 17 novel genes) to be transcriptionally active in different stages of fiber, ovule, and seed development. These findings have revealed a rich tapestry of genetic elements, including SNPs, QTLs, and candidate genes, and may have a high potential for improving fiber yield in future breeding programs for Indian cotton.
Affiliation(s)
- Babita Joshi
  - Plant Genetic Resources and Improvement, CSIR-National Botanical Research Institute, Lucknow, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Sanjay Singh
  - Division of Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Gopal Ji Tiwari
  - Plant Genetic Resources and Improvement, CSIR-National Botanical Research Institute, Lucknow, India
- Harish Kumar
  - Department of Plant Breeding and Genetics, Punjab Agricultural University, Regional Research Station, Faridkot, Punjab, India
- Narayanan Manikanda Boopathi
  - Department of Plant Biotechnology, Centre for Plant Molecular Biology and Biotechnology, Tamil Nadu Agricultural University, Coimbatore, Tamil Nadu, India
- Sarika Jaiswal
  - Division of Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Dibyendu Adhikari
  - Plant Ecology and Climate Change Science, CSIR-National Botanical Research Institute, Lucknow, India
- Dinesh Kumar
  - Division of Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Samir V. Sawant
  - Molecular Biology & Biotechnology, CSIR-National Botanical Research Institute, Lucknow, India
- Mir Asif Iquebal
  - Division of Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Satya Narayan Jena
  - Plant Genetic Resources and Improvement, CSIR-National Botanical Research Institute, Lucknow, India
5
Ballén-Taborda C, Lyerly J, Smith J, Howell K, Brown-Guedira G, Babar MA, Harrison SA, Mason RE, Mergoum M, Murphy JP, Sutton R, Griffey CA, Boyles RE. Utilizing genomics and historical data to optimize gene pools for new breeding programs: A case study in winter wheat. Front Genet 2022; 13:964684. [PMID: 36276956 PMCID: PMC9585219 DOI: 10.3389/fgene.2022.964684] [Received: 06/08/2022] [Accepted: 08/05/2022] [Indexed: 11/13/2022]
Abstract
With the rapid generation and preservation of both genomic and phenotypic information for many genotypes within crops and across locations, emerging breeding programs have a valuable opportunity to leverage these resources to 1) establish the most appropriate genetic foundation at program inception and 2) implement robust genomic prediction platforms that can effectively select future breeding lines. Integrating genomics-enabled breeding into cultivar development can save costs and allow resources to be reallocated towards advanced (i.e., later) stages of field evaluation, which can facilitate an increased number of testing locations and replicates within locations. In this context, a reestablished winter wheat breeding program was used as a case study to understand best practices to leverage and tailor existing genomic and phenotypic resources to determine optimal genetics for a specific target population of environments. First, historical multi-environment phenotype data, representing 1,285 advanced breeding lines, were compiled from multi-institutional testing as part of the SunGrains cooperative and used to produce GGE biplots and PCA for yield. Locations were clustered based on highly correlated line performance among the target population of environments into 22 subsets. For each of the subsets generated, EMMs and BLUPs were calculated using linear models with the ‘lme4’ R package. Second, for each subset, TPs representative of the new SC breeding lines were determined based on genetic relatedness using the ‘STPGA’ R package. Third, for each TP, phenotypic values and SNP data were incorporated into the ‘rrBLUP’ mixed models for generation of GEBVs of YLD, TW, HD and PH. Using a five-fold cross-validation strategy, an average accuracy of r = 0.42 was obtained for yield between all TPs. The validation performed with 58 SC elite breeding lines resulted in an accuracy of r = 0.62 when the TP included complete historical data. Lastly, QTL-by-environment interaction for 18 major effect genes across three geographic regions was examined. Lines harboring major QTL in the absence of disease could potentially underperform (e.g., Fhb1 R-gene), whereas it is advantageous to express a major QTL under biotic pressure (e.g., stripe rust R-gene). This study highlights the importance of genomics-enabled breeding and multi-institutional partnerships to accelerate cultivar development.
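The five-fold cross-validation accuracy reported above (r between predicted and observed values) can be sketched as follows. This is not the authors' pipeline: they used the ‘rrBLUP’ R package, whereas this stand-in fits plain ridge regression on marker codes with a fixed, arbitrarily chosen penalty `lam`. The data are simulated, and every function name (`ridge_fit`, `pearson`, `kfold_accuracy`, `simulate`) is hypothetical.

```python
import random
import statistics


def ridge_fit(X, y, lam):
    """Solve (X'X + lam*I) b = X'y by Gauss-Jordan elimination."""
    p = len(X[0])
    # Augmented system [X'X + lam*I | X'y].
    A = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
          for j in range(p)] + [sum(r[i] * v for r, v in zip(X, y))]
         for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def pearson(a, b):
    """Pearson correlation, the accuracy metric quoted as r."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def kfold_accuracy(X, y, k=5, lam=1.0, seed=0):
    """Mean held-out correlation over k folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    accs = []
    for f in range(k):
        test = idx[f::k]
        held = set(test)
        train = [i for i in idx if i not in held]
        mu = statistics.fmean([y[i] for i in train])  # training-set mean
        beta = ridge_fit([X[i] for i in train], [y[i] - mu for i in train], lam)
        preds = [mu + sum(x * b for x, b in zip(X[i], beta)) for i in test]
        accs.append(pearson(preds, [y[i] for i in test]))
    return statistics.fmean(accs)


def simulate(n=100, p=20, seed=1):
    """Simulated lines with markers coded -1/0/1 and additive effects."""
    rng = random.Random(seed)
    effects = [rng.gauss(0, 1) for _ in range(p)]
    X = [[rng.choice((-1, 0, 1)) for _ in range(p)] for _ in range(n)]
    y = [sum(x * e for x, e in zip(r, effects)) + rng.gauss(0, 1) for r in X]
    return X, y


X, y = simulate()
accuracy = kfold_accuracy(X, y)
```

The simulated trait is highly heritable, so the accuracy here will be much higher than the r = 0.42 reported for yield; the sketch only shows how the metric is assembled from held-out folds.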
Affiliation(s)
- Carolina Ballén-Taborda
  - Department of Plant and Environmental Sciences, Clemson University, Clemson, SC, United States
  - Pee Dee Research and Education Center, Clemson University, Florence, SC, United States
- Jeanette Lyerly
  - Crop and Soil Sciences Department, North Carolina State University, Raleigh, NC, United States
- Jared Smith
  - U.S. Department of Agriculture-Agricultural Research Service (USDA-ARS), Raleigh, NC, United States
- Kimberly Howell
  - U.S. Department of Agriculture-Agricultural Research Service (USDA-ARS), Raleigh, NC, United States
- Gina Brown-Guedira
  - Crop and Soil Sciences Department, North Carolina State University, Raleigh, NC, United States
  - U.S. Department of Agriculture-Agricultural Research Service (USDA-ARS), Raleigh, NC, United States
- Md. Ali Babar
  - Agronomy Department, University of Florida, Gainesville, FL, United States
- Stephen A. Harrison
  - School of Plant, Environmental and Soil Sciences, Louisiana State University, Baton Rouge, LA, United States
- Richard E. Mason
  - College of Agricultural Sciences, Colorado State University, Fort Collins, CO, United States
- Mohamed Mergoum
  - Department of Crop and Soil Sciences, University of Georgia, Griffin, GA, United States
- J. Paul Murphy
  - Crop and Soil Sciences Department, North Carolina State University, Raleigh, NC, United States
- Russell Sutton
  - Department of Soil and Crop Sciences, Texas A&M University, Commerce, TX, United States
- Carl A. Griffey
  - School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA, United States
- Richard E. Boyles
  - Department of Plant and Environmental Sciences, Clemson University, Clemson, SC, United States
  - Pee Dee Research and Education Center, Clemson University, Florence, SC, United States
  - *Correspondence: Richard E. Boyles
6
Semagn K, Crossa J, Cuevas J, Iqbal M, Ciechanowska I, Henriquez MA, Randhawa H, Beres BL, Aboukhaddour R, McCallum BD, Brûlé-Babel AL, N'Diaye A, Pozniak C, Spaner D. Comparison of single-trait and multi-trait genomic predictions on agronomic and disease resistance traits in spring wheat. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2022; 135:2747-2767. [PMID: 35737008 DOI: 10.1007/s00122-022-04147-3] [Received: 02/01/2022] [Accepted: 05/28/2022] [Indexed: 06/15/2023]
Abstract
This study performed comprehensive analyses on the predictive abilities of single-trait and two multi-trait models in three populations. Our results demonstrated the superiority of multi-traits over single-trait models across seven agronomic and four to seven disease resistance traits of different genetic architecture. The predictive ability of multi-trait and single-trait prediction models has not been investigated on diverse traits evaluated under organic and conventional management systems. Here, we compared the predictive abilities of 25% of a testing set that has not been evaluated for a single trait (ST), not evaluated for multi-traits (MT1), and evaluated for some traits but not others (MT2) in three spring wheat populations genotyped either with the wheat 90K single nucleotide polymorphisms array or DArTseq. Analyses were performed on seven agronomic traits evaluated under conventional and organic management systems, four to seven disease resistance traits, and all agronomic and disease resistance traits simultaneously. The average prediction accuracies of the ST, MT1, and MT2 models varied from 0.03 to 0.78 (mean 0.41), from 0.05 to 0.82 (mean 0.47), and from 0.05 to 0.92 (mean 0.67), respectively. The predictive ability of the MT2 model was significantly greater than the ST model in all traits and populations except common bunt with the MT1 model being intermediate between them. The MT2 model increased prediction accuracies over the ST and MT1 models in all traits by 9.0-82.4% (mean 37.3%) and 2.9-82.5% (mean 25.7%), respectively, except common bunt that showed up to 7.7% smaller accuracies in two populations. A joint analysis of all agronomic and disease resistance traits further improved accuracies within the MT1 and MT2 models on average by 21.4% and 17.4%, respectively, as compared to either the agronomic or disease resistance traits, demonstrating the high potential of the multi-traits models in improving prediction accuracies.
Affiliation(s)
- Kassa Semagn
  - Department of Agricultural, Food, and Nutritional Science, 4-10 Agriculture-Forestry Centre, University of Alberta, Edmonton, AB, T6G 2P5, Canada
- José Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641, 06600, Mexico, DF, Mexico
- Muhammad Iqbal
  - Department of Agricultural, Food, and Nutritional Science, 4-10 Agriculture-Forestry Centre, University of Alberta, Edmonton, AB, T6G 2P5, Canada
- Izabela Ciechanowska
  - Department of Agricultural, Food, and Nutritional Science, 4-10 Agriculture-Forestry Centre, University of Alberta, Edmonton, AB, T6G 2P5, Canada
- Maria Antonia Henriquez
  - Morden Research and Development Centre, Agriculture and Agri-Food Canada, Morden, MB, R6M 1Y5, Canada
- Harpinder Randhawa
  - Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, 5403-1st Avenue South, Lethbridge, AB, T1J 4B1, Canada
- Brian L Beres
  - Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, 5403-1st Avenue South, Lethbridge, AB, T1J 4B1, Canada
- Reem Aboukhaddour
  - Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, 5403-1st Avenue South, Lethbridge, AB, T1J 4B1, Canada
- Brent D McCallum
  - Morden Research and Development Centre, Agriculture and Agri-Food Canada, Morden, MB, R6M 1Y5, Canada
- Anita L Brûlé-Babel
  - Department of Plant Science, University of Manitoba, 66 Dafoe Road, Winnipeg, MB, R3T 2N2, Canada
- Amidou N'Diaye
  - Crop Development Centre and Department of Plant Sciences, University of Saskatchewan, 51 Campus Drive, Saskatoon, SK, S7N 5A8, Canada
- Curtis Pozniak
  - Crop Development Centre and Department of Plant Sciences, University of Saskatchewan, 51 Campus Drive, Saskatoon, SK, S7N 5A8, Canada
- Dean Spaner
  - Department of Agricultural, Food, and Nutritional Science, 4-10 Agriculture-Forestry Centre, University of Alberta, Edmonton, AB, T6G 2P5, Canada