1
Fallah Ziarani M, Tohidfar M, Hesami M. Choosing an appropriate somatic embryogenesis medium of carrot (Daucus carota L.) by data mining technology. BMC Biotechnol 2024; 24:68. [PMID: 39334143 PMCID: PMC11428924 DOI: 10.1186/s12896-024-00898-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Accepted: 09/13/2024] [Indexed: 09/30/2024]
Abstract
INTRODUCTION Somatic embryogenesis is one of the main steps in successful in vitro propagation and genetic transformation of carrot. However, it is influenced by intrinsic factors (genetics, genotype, and explant) and extrinsic factors (e.g., plant growth regulators (PGRs), medium composition, and gelling agent), which makes developing a somatic embryogenesis protocol challenging. Optimizing somatic embryogenesis is therefore a tedious, time-consuming, and costly process. Data mining approaches that combine artificial neural networks (ANNs) with optimization algorithms can facilitate modeling and optimizing in vitro culture processes and thereby reduce the number of experimental treatments and combinations required. Carrot is a model plant for genetic engineering and recombinant drug production, and is therefore important in research. In this study, somatic embryogenesis in carrot (Daucus carota L.) was examined and analyzed for the first time using a genetic algorithm (GA) and data mining technology.
MATERIALS AND METHODS Two well-known ANNs, multilayer perceptron (MLP) and radial basis function (RBF), were employed to model and predict embryogenic callus production in carrot based on eight input variables: carrot cultivar, agar, magnesium sulfate (MgSO4), calcium dichloride (CaCl2), manganese(II) sulfate (MnSO4), 2,4-dichlorophenoxyacetic acid (2,4-D), 6-benzylaminopurine (BAP), and kinetin (KIN). To confirm the reliability and accuracy of the developed model, the result obtained from the RBF-GA model was tested in the laboratory.
RESULTS RBF had better prediction efficiency than MLP. The RBF model was then linked to a GA to optimize the system, and the RBF-GA result was tested in the lab as a validation experiment. There was no significant difference between the predicted optimized result and the experimental result.
CONCLUSIONS The results of this study suggest that data mining through RBF-GA can be considered a robust approach, alongside experimental methods, for modeling and optimizing in vitro culture systems. According to the RBF-GA result, the highest somatic embryogenesis rate (62.5%) can be obtained from the Nantes improved cultivar cultured on medium containing 195.23 mg/l MgSO4, 330.07 mg/l CaCl2, 18.3 mg/l MnSO4, 0.46 mg/l 2,4-D, 0.03 mg/l BAP, and 0.88 mg/l KIN. These results were also confirmed in the laboratory.
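The optimization step in this abstract (a trained model handed to a genetic algorithm) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the concentration bounds, the `surrogate` function (a placeholder standing in for a fitted RBF network), and the GA settings are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical search bounds (mg/l) for the six medium components named above.
BOUNDS = {
    "MgSO4": (0.0, 400.0),
    "CaCl2": (0.0, 500.0),
    "MnSO4": (0.0, 30.0),
    "2,4-D": (0.0, 2.0),
    "BAP": (0.0, 1.0),
    "KIN": (0.0, 2.0),
}
NAMES = list(BOUNDS)


def surrogate(x):
    """Placeholder for a trained model; a real workflow would call the fitted network."""
    score = 0.0
    for name, value in zip(NAMES, x):
        lo, hi = BOUNDS[name]
        z = (value - lo) / (hi - lo)  # scale each input to [0, 1]
        score -= (z - 0.5) ** 2       # arbitrary peak at mid-range
    return score


def tournament(pop, fits, rng, k=3):
    """Return the fittest of k randomly drawn individuals."""
    picks = rng.sample(range(len(pop)), k)
    return pop[max(picks, key=lambda i: fits[i])]


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return [ai if rng.random() < 0.5 else bi for ai, bi in zip(a, b)]


def mutate(x, rng, rate=0.2, scale=0.1):
    """Gaussian mutation, clipped back into the allowed range."""
    child = []
    for name, value in zip(NAMES, x):
        lo, hi = BOUNDS[name]
        if rng.random() < rate:
            value += rng.gauss(0.0, scale * (hi - lo))
        child.append(min(hi, max(lo, value)))
    return child


def best_of(pop, fits):
    return pop[max(range(len(pop)), key=lambda i: fits[i])]


def run_ga(fitness, pop_size=40, generations=60, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(*BOUNDS[n]) for n in NAMES] for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        fits = [fitness(ind) for ind in pop]
        best = best_of(pop, fits)
        history.append(fitness(best))
        nxt = [best[:]]  # elitism: the best individual survives unchanged
        while len(nxt) < pop_size:
            p1, p2 = tournament(pop, fits, rng), tournament(pop, fits, rng)
            nxt.append(mutate(crossover(p1, p2, rng), rng))
        pop = nxt
    fits = [fitness(ind) for ind in pop]
    best = best_of(pop, fits)
    history.append(fitness(best))
    return dict(zip(NAMES, best)), history


best_medium, history = run_ga(surrogate)
```

Because the best individual is always carried over, the best fitness cannot decrease between generations. As in the abstract, a candidate optimum found this way is only a prediction until it is checked in a validation experiment.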
Affiliation(s)
- Masoumeh Fallah Ziarani
- Department of Cell & Molecular Biology, Faculty of Life Sciences & Biotechnology, Shahid Beheshti University, Tehran, 19839-69411, Iran
- Masoud Tohidfar
- Department of Cell & Molecular Biology, Faculty of Life Sciences & Biotechnology, Shahid Beheshti University, Tehran, 19839-69411, Iran
- Mohsen Hesami
- Department of Plant Agriculture, University of Guelph, Guelph, ON, Canada
2
Zarbakhsh S, Shahsavar AR, Soltani M. Optimizing PGRs for in vitro shoot proliferation of pomegranate with Bayesian-tuned ensemble stacking regression and NSGA-II: a comparative evaluation of machine learning models. Plant Methods 2024; 20:82. [PMID: 38822411 PMCID: PMC11143642 DOI: 10.1186/s13007-024-01211-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Accepted: 05/17/2024] [Indexed: 06/03/2024]
Abstract
BACKGROUND Optimizing in vitro shoot proliferation is a complicated task because it is influenced by genotype and by interactions among many factors. This study investigated the role of various concentrations of plant growth regulators (zeatin and gibberellic acid) in the in vitro shoot proliferation of three Punica granatum cultivars ('Faroogh', 'Atabaki', and 'Shirineshahvar'). The utility of five machine learning (ML) algorithms, namely Support Vector Regression (SVR), Random Forest (RF), Extreme Gradient Boosting (XGB), Ensemble Stacking Regression (ESR), and Elastic Net Multivariate Linear Regression (ENMLR), as modeling tools was also evaluated for in vitro multiplication of pomegranate. A new automatic hyperparameter optimization method named Adaptive Tree Parzen Estimator (ATPE) was developed to tune the hyperparameters. Model performance was evaluated and compared using statistical indicators (MAE, RMSE, RRMSE, MAPE, R, and R2), and a Global Performance Indicator (GPI) was introduced to rank the models on a single parameter. Moreover, the Non-dominated Sorting Genetic Algorithm-II (NSGA-II) was employed to optimize the selected prediction model.
RESULTS The ESR algorithm exhibited higher predictive accuracy than the other ML algorithms, and the ESR model was subsequently passed to NSGA-II for optimization. ESR-NSGA-II revealed that the highest proliferation rate (3.47, 3.84, and 3.22), shoot length (2.74, 3.32, and 1.86 cm), leaf number (18.18, 19.76, and 18.77), and explant survival (84.21%, 85.49%, and 56.39%) could be achieved with a medium containing 0.750, 0.654, and 0.705 mg/L zeatin and 0.50, 0.329, and 0.347 mg/L gibberellic acid in the 'Atabaki', 'Faroogh', and 'Shirineshahvar' cultivars, respectively.
CONCLUSIONS The 'Shirineshahvar' cultivar exhibited lower shoot proliferation success than the other cultivars. ESR-NSGA-II performed well in modeling and optimizing in vitro propagation and can be applied as an up-to-date and reliable computational tool for future studies in plant in vitro culture.
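The statistical indicators listed in this abstract can be computed as in the following sketch. The definitions of RRMSE (RMSE as a percentage of the observed mean) and MAPE are common conventions assumed here, since the abstract does not define them; the GPI is paper-specific and is not reproduced. The observed and predicted values are made-up toy numbers.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, RRMSE, MAPE, R, and R2 for paired observations."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    rrmse = 100.0 * rmse / mean_t  # assumed definition: RMSE as % of observed mean
    # MAPE needs non-zero observations, otherwise this divides by zero.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    ss_pred = sum((p - mean_p) ** 2 for p in y_pred)
    r2 = 1.0 - ss_res / ss_tot  # coefficient of determination
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    r = cov / math.sqrt(ss_tot * ss_pred)  # Pearson correlation

    return {"MAE": mae, "RMSE": rmse, "RRMSE": rrmse, "MAPE": mape, "R": r, "R2": r2}


# Toy data, not taken from the paper.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
metrics = regression_metrics(observed, predicted)
```

Reporting several indicators together is useful because they penalize different things: MAE and RMSE are in the units of the response, RRMSE and MAPE are scale-free, and R can be high even when predictions are systematically biased.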
Affiliation(s)
- Saeedeh Zarbakhsh
- Department of Horticultural Science, College of Agriculture, Faculty of Agriculture, Shiraz University, Shiraz, Iran
- Ali Reza Shahsavar
- Department of Horticultural Science, College of Agriculture, Faculty of Agriculture, Shiraz University, Shiraz, Iran
3
Jafari M, Daneshvar MH. Machine learning-mediated Passiflora caerulea callogenesis optimization. PLoS One 2024; 19:e0292359. [PMID: 38266002 PMCID: PMC10807783 DOI: 10.1371/journal.pone.0292359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Accepted: 09/19/2023] [Indexed: 01/26/2024]
Abstract
Callogenesis is one of the most powerful biotechnological approaches for in vitro secondary metabolite production and indirect organogenesis in Passiflora caerulea. Comprehensive knowledge of callogenesis and an optimized protocol can be obtained by combining machine learning (ML) with optimization algorithms. In the present investigation, the callogenesis responses (i.e., callogenesis rate and callus fresh weight) of P. caerulea were predicted from different types and concentrations of plant growth regulators (PGRs) (i.e., 2,4-dichlorophenoxyacetic acid (2,4-D), 6-benzylaminopurine (BAP), 1-naphthaleneacetic acid (NAA), and indole-3-butyric acid (IBA)) as well as explant types (i.e., leaf, node, and internode) using a multilayer perceptron (MLP). Moreover, the developed models were integrated into a genetic algorithm (GA) to optimize the concentration of PGRs and the explant type for maximizing callogenesis responses. Furthermore, sensitivity analysis was conducted to assess the importance of each input variable for the callogenesis responses. The results showed that MLP had high predictive accuracy (R2 > 0.81) in both training and testing sets for all studied parameters. Based on the results of the optimization process, the highest callogenesis rate (100%) would be obtained from the leaf explant cultured in medium supplemented with 0.52 mg/L IBA plus 0.43 mg/L NAA plus 1.4 mg/L 2,4-D plus 0.2 mg/L BAP. The sensitivity analysis showed that the impact of exogenous PGRs on callogenesis depended on the explant. Overall, the results showed that a combination of MLP and GA can serve as a forward-thinking aid for predicting and optimizing in vitro culture systems and can consequently help to address several challenges currently faced in Passiflora tissue culture.
Affiliation(s)
- Marziyeh Jafari
- Department of Horticultural Science, College of Agriculture, Shiraz University, Shiraz, Iran
- Department of Horticultural Sciences, Agricultural Sciences and Natural Resources University of Khuzestan, Mollasani, Iran
- Mohammad Hosein Daneshvar
- Department of Horticultural Sciences, Agricultural Sciences and Natural Resources University of Khuzestan, Mollasani, Iran
4
Jafari M, Daneshvar MH. Prediction and optimization of indirect shoot regeneration of Passiflora caerulea using machine learning and optimization algorithms. BMC Biotechnol 2023; 23:27. [PMID: 37528396 PMCID: PMC10394921 DOI: 10.1186/s12896-023-00796-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/22/2023] [Accepted: 07/21/2023] [Indexed: 08/03/2023]
Abstract
BACKGROUND Optimization of indirect shoot regeneration protocols is one of the key prerequisites for the development of Agrobacterium-mediated genetic transformation and/or genome editing in Passiflora caerulea. Comprehensive knowledge of indirect shoot regeneration and an optimized protocol can be obtained by combining machine learning (ML) with optimization algorithms.
MATERIALS AND METHODS In the present investigation, the indirect shoot regeneration responses (i.e., de novo shoot regeneration rate, number of de novo shoots, and length of de novo shoots) of P. caerulea were predicted from different types and concentrations of plant growth regulators (PGRs) (i.e., TDZ, BAP, PUT, KIN, and IBA) as well as callus types (i.e., callus derived from different explants including leaf, node, and internode) using a generalized regression neural network (GRNN) and random forest (RF). The developed models were then integrated into a genetic algorithm (GA) to optimize the concentration of PGRs and the callus type for maximizing indirect shoot regeneration responses. In addition, sensitivity analysis was conducted to assess the importance of each input variable for the studied parameters.
RESULTS Both algorithms (RF and GRNN) had high predictive accuracy (R2 > 0.86) in both training and testing sets for all studied parameters. Based on the results of the optimization process, the highest de novo shoot regeneration rate (100%) would be obtained from callus derived from nodal segments cultured in medium supplemented with 0.77 mg/L BAP plus 2.41 mg/L PUT plus 0.06 mg/L IBA. The sensitivity analysis showed that the impact of exogenous PGRs on indirect de novo shoot regeneration depended on the explant.
CONCLUSIONS A combination of ML (GRNN and RF) and GA can serve as a forward-thinking aid for predicting and optimizing in vitro culture systems and can consequently help to address several challenges currently faced in Passiflora tissue culture.
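For readers unfamiliar with the GRNN named in this abstract, the sketch below shows its prediction rule: a Gaussian-kernel weighted average of the training targets, with a single smoothing parameter. The function name `grnn_predict`, the value of `sigma`, and the toy training data (inputs already scaled to [0, 1]) are illustrative assumptions, not the authors' implementation or data.

```python
import math


def grnn_predict(x, train_X, train_y, sigma=0.5):
    """Predict y at x as a kernel-weighted mean of the training targets."""
    numerator = 0.0
    denominator = 0.0
    for xi, yi in zip(train_X, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))  # squared Euclidean distance
        weight = math.exp(-d2 / (2.0 * sigma ** 2))    # Gaussian pattern-layer output
        numerator += weight * yi
        denominator += weight
    # If sigma is tiny and x is far from all samples, every weight can
    # underflow to zero and this division fails; inputs should be scaled.
    return numerator / denominator


# Toy data: two scaled inputs and a made-up response (not from the paper).
train_X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
train_y = [10.0, 40.0, 30.0, 80.0]

at_sample = grnn_predict([1.0, 1.0], train_X, train_y, sigma=0.1)
at_center = grnn_predict([0.5, 0.5], train_X, train_y, sigma=0.5)
```

With a small `sigma` the prediction at a training point reproduces that point's target almost exactly, while at a location equidistant from all samples it returns their plain average. A GRNN has no iterative training, so the choice of `sigma` is the main thing to tune.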
Affiliation(s)
- Marziyeh Jafari
- Department of Horticultural Science, College of Agriculture, Shiraz University, Shiraz, 7144113131, Iran.
- Department of Horticultural Sciences, Agricultural Sciences and Natural Resources University of Khuzestan, Mollasani, 6341773637, Iran.
- Mohammad Hosein Daneshvar
- Department of Horticultural Sciences, Agricultural Sciences and Natural Resources University of Khuzestan, Mollasani, 6341773637, Iran
5
Yoosefzadeh-Najafabadi M, Torabi S, Tulpan D, Rajcan I, Eskandari M. Application of SVR-Mediated GWAS for Identification of Durable Genetic Regions Associated with Soybean Seed Quality Traits. Plants (Basel) 2023; 12:2659. [PMID: 37514272 PMCID: PMC10383196 DOI: 10.3390/plants12142659] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/12/2023] [Revised: 07/12/2023] [Accepted: 07/14/2023] [Indexed: 07/30/2023]
Abstract
Soybean (Glycine max L.) is an important food-grade strategic crop worldwide because of its high seed protein and oil contents. Due to the negative correlation between seed protein and oil percentage, there is a dire need to detect reliable quantitative trait loci (QTL) underlying these traits in order to be used in marker-assisted selection (MAS) programs. Genome-wide association study (GWAS) is one of the most common genetic approaches that is regularly used for detecting QTL associated with quantitative traits. However, the current approaches are mainly focused on estimating the main effects of QTL, and, therefore, a substantial statistical improvement in GWAS is required to detect associated QTL considering their interactions with other QTL as well. This study aimed to compare the support vector regression (SVR) algorithm as a common machine learning method to fixed and random model circulating probability unification (FarmCPU), a common conventional GWAS method in detecting relevant QTL associated with soybean seed quality traits such as protein, oil, and 100-seed weight using 227 soybean genotypes. The results showed a significant negative correlation between soybean seed protein and oil concentrations, with heritability values of 0.69 and 0.67, respectively. In addition, SVR-mediated GWAS was able to identify more relevant QTL underlying the target traits than the FarmCPU method. Our findings demonstrate the potential use of machine learning algorithms in GWAS to detect durable QTL associated with soybean seed quality traits suitable for genomic-based breeding approaches. This study provides new insights into improving the accuracy and efficiency of GWAS and highlights the significance of using advanced computational methods in crop breeding research.
Affiliation(s)
- Sepideh Torabi
- Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada
- Dan Tulpan
- Department of Animal Biosciences, University of Guelph, Guelph, ON N1G 2W1, Canada
- Istvan Rajcan
- Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada
- Milad Eskandari
- Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada
6
Yoosefzadeh Najafabadi M, Hesami M, Eskandari M. Machine Learning-Assisted Approaches in Modernized Plant Breeding Programs. Genes (Basel) 2023; 14:777. [PMID: 37107535 PMCID: PMC10137951 DOI: 10.3390/genes14040777] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 03/02/2023] [Revised: 03/11/2023] [Accepted: 03/21/2023] [Indexed: 04/29/2023]
Abstract
In the face of a growing global population, plant breeding is being used as a sustainable tool for increasing food security. A wide range of high-throughput omics technologies have been developed and used in plant breeding to accelerate crop improvement and develop new varieties with higher yield performance and greater resilience to climate change, pests, and diseases. These technologies have generated large amounts of data on the genetic architecture of plants, which can be exploited to manipulate the key characteristics that are important for crop improvement. Plant breeders have therefore relied on high-performance computing, bioinformatics tools, and artificial intelligence (AI), such as machine learning (ML) methods, to analyze this vast amount of complex data efficiently. The use of big data coupled with ML in plant breeding has the potential to revolutionize the field and increase food security. In this review, some of the challenges of this approach, along with some of the opportunities it can create, are discussed. In particular, we provide information about the basis of big data, AI, ML, and their related sub-groups. We also discuss the bases and functions of some learning algorithms that are commonly used in plant breeding, three common data integration strategies for better integrating different breeding datasets using appropriate learning algorithms, and future prospects for the application of novel algorithms in plant breeding. The use of ML algorithms will equip breeders with efficient and effective tools to accelerate the development of new plant varieties and improve the efficiency of the breeding process, both of which are important for tackling some of the challenges facing agriculture in the era of climate change.
Affiliation(s)
- Mohsen Hesami
- Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada
- Milad Eskandari
- Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada
7
Yoosefzadeh-Najafabadi M, Rajcan I, Eskandari M. Optimizing genomic selection in soybean: An important improvement in agricultural genomics. Heliyon 2022; 8:e11873. [PMID: 36468106 PMCID: PMC9713349 DOI: 10.1016/j.heliyon.2022.e11873] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2022] [Revised: 09/26/2022] [Accepted: 11/17/2022] [Indexed: 11/27/2022]
Abstract
Fast-paced yield improvement in strategic crops such as soybean is pivotal for achieving sustainable global food security. Precise genomic selection (GS), as one of the most effective genomic tools for recognizing superior genotypes, can increase the efficiency of breeding programs by shortening the breeding cycle, resulting in significant increases in annual yield improvement. In this study, we investigated the possible use of haplotype-based GS to increase the prediction accuracy of soybean yield and its component traits by augmenting the models with sophisticated machine learning algorithms and optimized genetic information. The results demonstrated up to a 7% increase in prediction accuracy when using haplotype-based GS over the full single nucleotide polymorphism-based GS methods. In addition, we discovered a promising haplotype block on chromosome 19 with significant impacts on yield and its components, which can be used for screening climate-resilient soybean genotypes with improved yield in large breeding populations.
Affiliation(s)
- Istvan Rajcan
- Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada
- Milad Eskandari
- Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada
8
Zarbakhsh S, Shahsavar AR. Artificial neural network-based model to predict the effect of γ-aminobutyric acid on salinity and drought responsive morphological traits in pomegranate. Sci Rep 2022; 12:16662. [PMID: 36198905 PMCID: PMC9534893 DOI: 10.1038/s41598-022-21129-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2022] [Accepted: 09/22/2022] [Indexed: 11/20/2022]
Abstract
Recently, γ-aminobutyric acid (GABA) has been introduced as a treatment with high physiological activity that enhances plant tolerance to drought and salinity stress, both of which reduce plant growth. Because changes in morphological traits under drought and salinity stress are influenced by multiple factors, advanced computational analysis has great potential for handling such nonlinear, multivariate data. In this work, four input variables (GABA concentration, pomegranate cultivar, days of treatment, and drought and salinity stress) were evaluated to model and predict morphological traits using artificial neural network (ANN) models, namely multilayer perceptron (MLP) and radial basis function (RBF). An image processing technique was used to measure the leaf length index (LLI), leaf width index (LWI), and leaf area index (LAI). Among the ANNs applied, MLP was chosen as the best model because it had the highest accuracy. Furthermore, to estimate the optimal values of the input variables for achieving the best morphological parameters, the MLP model was linked to the non-dominated sorting genetic algorithm-II (NSGA-II). Based on the results of MLP-NSGA-II, the best values of crown diameter (18.42 cm), plant height (151.82 cm), LLI (5.67 cm), LWI (1.76 cm), and LAI (13.82 cm) could be achieved by applying 10.57 mM GABA to the 'Atabaki' cultivar under control (non-stress) conditions after 20.8 days. The results of modeling and optimization can be helpful for predicting morphological responses to drought and salinity conditions.
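NSGA-II handles several response traits at once by ranking candidate solutions into non-dominated (Pareto) fronts. The sketch below shows that ranking idea with a simple front-peeling loop; it is not the bookkeeping-based fast non-dominated sort of the published algorithm, and it omits crowding distance and the genetic operators. All objectives are assumed to be maximized, and the candidate points are made-up two-objective examples.

```python
def dominates(a, b):
    """True if a is at least as good as b in every objective and better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_sort(points):
    """Return lists of point indices, best front first."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        # A point belongs to the current front if nothing left dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical (objective 1, objective 2) scores for six candidate treatments.
candidates = [(3, 1), (2, 2), (1, 3), (1, 1), (2, 1), (0, 0)]
fronts = non_dominated_sort(candidates)
```

Here the first front contains the three candidates that trade one objective against the other, and each later front contains points dominated only by earlier ones. Reporting a single "best" treatment, as the abstract does, implies an additional choice of one solution from the first front.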
Affiliation(s)
- Saeedeh Zarbakhsh
- Department of Horticultural Science, College of Agriculture, Shiraz University, Shiraz, Iran
- Ali Reza Shahsavar
- Department of Horticultural Science, College of Agriculture, Shiraz University, Shiraz, Iran
9
Hesami M, Alizadeh M, Jones AMP, Torkamaneh D. Machine learning: its challenges and opportunities in plant system biology. Appl Microbiol Biotechnol 2022; 106:3507-3530. [PMID: 35575915 DOI: 10.1007/s00253-022-11963-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 03/14/2022] [Revised: 03/14/2022] [Accepted: 05/07/2022] [Indexed: 12/25/2022]
Abstract
Sequencing technologies are evolving at a rapid pace, enabling the generation of massive amounts of data in multiple dimensions (e.g., genomics, epigenomics, transcriptomics, metabolomics, proteomics, and single-cell omics) in plants. To provide comprehensive insights into the complexity of plant biological systems, it is important to integrate different omics datasets. Although recent advances in computational analytical pipelines have enabled efficient and high-quality exploration and exploitation of single omics data, the integration of multidimensional, heterogeneous, and large datasets (i.e., multi-omics) remains a challenge. In this regard, machine learning (ML) offers promising approaches to integrate large datasets and to recognize fine-grained patterns and relationships. Nevertheless, ML methods require rigorous optimization to process multi-omics-derived datasets. In this review, we discuss the main concepts of machine learning as well as the key challenges and solutions related to the big data derived from plant system biology. We also provide in-depth insight into the principles of data integration using ML, as well as challenges and opportunities in different contexts including multi-omics, single-cell omics, protein function, and protein-protein interaction. KEY POINTS: • The key challenges and solutions related to the big data derived from plant system biology have been highlighted. • Different methods of data integration have been discussed. • Challenges and opportunities of the application of machine learning in plant system biology have been highlighted and discussed.
Affiliation(s)
- Mohsen Hesami
- Department of Plant Agriculture, University of Guelph, Guelph, ON, N1G 2W1, Canada
- Milad Alizadeh
- Department of Botany, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Davoud Torkamaneh
- Département de Phytologie, Université Laval, Québec City, QC, G1V 0A6, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec City, QC, G1V 0A6, Canada
10
Yoosefzadeh-Najafabadi M, Eskandari M, Torabi S, Torkamaneh D, Tulpan D, Rajcan I. Machine-Learning-Based Genome-Wide Association Studies for Uncovering QTL Underlying Soybean Yield and Its Components. Int J Mol Sci 2022; 23:5538. [PMID: 35628351 PMCID: PMC9141736 DOI: 10.3390/ijms23105538] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 03/05/2022] [Revised: 05/11/2022] [Accepted: 05/13/2022] [Indexed: 12/14/2022]
Abstract
A genome-wide association study (GWAS) is currently one of the most recommended approaches for discovering marker-trait associations (MTAs) for complex traits in plant species. However, conventional GWAS methods suffer from insufficient statistical power, especially in species with a narrow genetic base. Using sophisticated mathematical methods such as machine learning (ML) algorithms may address this issue and advance the implementation of this valuable genetic method in applied plant-breeding programs. In this study, we evaluated the potential use of two ML algorithms, support vector regression (SVR) and random forest (RF), in a GWAS and compared them with two conventional methods, mixed linear models (MLM) and fixed and random model circulating probability unification (FarmCPU), for identifying MTAs for soybean-yield components. Important soybean-yield component traits, including the number of reproductive nodes (RNP), non-reproductive nodes (NRNP), total nodes (NP), and total pods (PP) per plant, along with yield and maturity, were assessed using a panel of 227 soybean genotypes evaluated at two locations over two years (four environments). Using the SVR-mediated GWAS method, we were able to discover MTAs colocalized with previously reported quantitative trait loci (QTL) with potential causal effects on the target traits, supported by functional annotation in candidate gene analyses. This study demonstrated the potential benefit of using sophisticated mathematical approaches, such as SVR, in a GWAS to complement conventional GWAS methods for identifying MTAs that can improve the efficiency of genomic-based soybean-breeding programs.
Affiliation(s)
- Milad Eskandari
- Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada (M.Y.-N., S.T., I.R.)
- Sepideh Torabi
- Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada (M.Y.-N., S.T., I.R.)
- Davoud Torkamaneh
- Département de Phytologie, Université Laval, Québec City, QC G1V 0A6, Canada
- Dan Tulpan
- Department of Animal Biosciences, University of Guelph, Guelph, ON N1G 2W1, Canada
- Istvan Rajcan
- Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada (M.Y.-N., S.T., I.R.)
11
Zhang P, Li D. YOLO-VOLO-LS: A Novel Method for Variety Identification of Early Lettuce Seedlings. Front Plant Sci 2022; 13:806878. [PMID: 35283870 PMCID: PMC8909383 DOI: 10.3389/fpls.2022.806878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2021] [Accepted: 02/04/2022] [Indexed: 06/14/2023]
Abstract
Accurate identification of crop varieties is an important aspect of smart agriculture: it is essential for managing differences between crops at later stages and has a significant effect on unmanned operations in planting scenarios such as facility greenhouses. In this study, five kinds of lettuce grown under greenhouse conditions were used as the research object, and a classification model of lettuce varieties covering multiple growth stages was established. First, we used the state-of-the-art method VOLO-D1 to establish a variety classification model for the 7 growth stages of the entire growth process. The results showed that the performance of the lettuce variety classification model in the SP stage needs to be improved, whereas the classification accuracy of the model at the other stages is close to 100%. Second, based on the challenges of the SP stage dataset, we combined the advantages of the target detection mechanism and the target classification mechanism and proposed a new method of variety identification for the SP stage, called YOLO-VOLO-LS. Finally, we used this method to model and analyze the classification of lettuce varieties in the SP stage. The method achieved 95.961, 93.452, 96.059, 96.014, and 96.039 in Val-acc, Test-acc, Recall, Precision, and F1-score, respectively. Therefore, the method proposed in this study has reference value for the accurate identification of varieties in the early growth stage of crops.
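The Recall, Precision, and F1-score figures quoted in this abstract are standard classification metrics. The sketch below computes them for a multi-class problem using macro-averaging; the averaging scheme is an assumption, since the abstract does not state which one was used, and the labels are a made-up toy example rather than lettuce data.

```python
def macro_precision_recall_f1(y_true, y_pred):
    """Macro-averaged precision, recall, and F1 over all observed classes."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy labels for two hypothetical varieties "a" and "b".
true_labels = ["a", "a", "b", "b"]
predicted_labels = ["a", "b", "b", "b"]
precision, recall, f1 = macro_precision_recall_f1(true_labels, predicted_labels)
```

Macro-averaging weights every class equally, which matters when some varieties have fewer images than others; a micro-average or accuracy would instead be dominated by the larger classes.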
Affiliation(s)
- Pan Zhang
- National Innovation Center for Digital Fishery, China Agricultural University, Beijing, China
- Beijing Engineering and Technology Research Centre for Internet of Things in Agriculture, China Agricultural University, Beijing, China
- China-EU Center for Information and Communication Technologies in Agriculture, China Agricultural University, Beijing, China
- Key Laboratory of Agricultural Information Acquisition Technology, Ministry of Agriculture, China Agricultural University, Beijing, China
- College of Information and Electrical Engineering, China Agricultural University, Beijing, China
- Daoliang Li
- National Innovation Center for Digital Fishery, China Agricultural University, Beijing, China
- Beijing Engineering and Technology Research Centre for Internet of Things in Agriculture, China Agricultural University, Beijing, China
- China-EU Center for Information and Communication Technologies in Agriculture, China Agricultural University, Beijing, China
- Key Laboratory of Agricultural Information Acquisition Technology, Ministry of Agriculture, China Agricultural University, Beijing, China
- College of Information and Electrical Engineering, China Agricultural University, Beijing, China
12
Ramezanpour MR, Farajpour M. Application of artificial neural networks and genetic algorithm to predict and optimize greenhouse banana fruit yield through nitrogen, potassium and magnesium. PLoS One 2022; 17:e0264040. [PMID: 35157736 PMCID: PMC8843134 DOI: 10.1371/journal.pone.0264040] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 11/12/2021] [Accepted: 01/31/2022] [Indexed: 12/18/2022]
Abstract
Excessive use of chemical fertilizers not only causes environmental pollution but also has other harmful effects, including contributing to global warming and altering soil microbial diversity. In conventional research, chemical fertilizers and their concentrations are selected based on the knowledge of the experts involved in the project, so these choices are usually subjective. Therefore, the present study aimed to determine the optimal amounts of three macro elements, nitrogen (0, 100, and 200 g), potassium (0, 100, 200, and 300 g), and magnesium (0, 50, and 100 g), for fruit yield (FY), fruit length (FL), and number of rows per spike (NRPS) of greenhouse banana using analysis of variance (ANOVA) followed by a post hoc LSD test and two well-known artificial neural networks (ANNs), multilayer perceptron (MLP) and generalized regression neural network (GRNN). According to the ANOVA results, the highest mean FY was obtained with 200 g of N, 300 g of K, and 50 g of Mg. Both ANN models had high predictive accuracy (R2 = 0.66-0.99) on both the training and testing data for FY, FL, and NRPS. However, the GRNN model performed better than the MLP model in modeling and predicting the three traits of greenhouse banana. Therefore, a genetic algorithm (GA) was applied to the GRNN model to find the optimal amounts of N, K, and Mg for achieving high FY, FL, and NRPS. The GRNN-GA hybrid model indicated that high yield could be achieved while reducing nitrogen, potassium, and magnesium fertilizers by 65, 44, and 62%, respectively, compared with the traditional method.
Affiliation(s)
- Mahmoud Reza Ramezanpour
- Soil and Water Research Department, Mazandaran Agricultural and Natural Resources Research and Education Center, AREEO, Sari, Iran
- Mostafa Farajpour
- Crop and Horticultural Science Research Department, Mazandaran Agricultural and Natural Resources Research and Education Center, AREEO, Sari, Iran
|
13
|
Yoosefzadeh-Najafabadi M, Eskandari M, Belzile F, Torkamaneh D. Genome-Wide Association Study Statistical Models: A Review. Methods Mol Biol 2022; 2481:43-62. [PMID: 35641758 DOI: 10.1007/978-1-0716-2237-7_4] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 06/15/2023]
Abstract
Statistical models are at the core of the genome-wide association study (GWAS). In this chapter, we provide an overview of single- and multilocus statistical models, Bayesian, and machine learning approaches for association studies in plants. These models are discussed in terms of their basic methodology, the cofactor adjustments they account for, statistical power, and computational efficiency. New statistical models and machine learning algorithms are both showing improved performance in detecting missed signals and rare mutations and in prioritizing causal genetic variants; nevertheless, further optimization and validation studies are required to maximize the power of GWAS.
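As a rough illustration of what a single-locus model does, the sketch below regresses a phenotype on one marker at a time and ranks markers by the absolute t-statistic of the slope. It is deliberately simplified: it assumes additive 0/1/2 allele dosages and omits the population-structure and kinship covariates that practical GWAS models include. The genotype matrix and phenotype values are made-up toy numbers, and the function names are hypothetical.

```python
import math


def single_marker_test(dosages, phenotypes):
    """Simple linear regression of phenotype on allele dosage.

    Returns (beta, t) for the slope; a monomorphic marker returns (0.0, 0.0).
    """
    n = len(dosages)
    mx = sum(dosages) / n
    my = sum(phenotypes) / n
    sxx = sum((x - mx) ** 2 for x in dosages)
    if sxx == 0.0:
        return 0.0, 0.0  # no genotype variation, nothing to test
    sxy = sum((x - mx) * (y - my) for x, y in zip(dosages, phenotypes))
    beta = sxy / sxx
    ss_res = sum(
        (y - my - beta * (x - mx)) ** 2 for x, y in zip(dosages, phenotypes)
    )
    se = math.sqrt(ss_res / (n - 2) / sxx)  # standard error of the slope
    t = beta / se if se > 0.0 else math.inf
    return beta, t


def scan(markers, phenotypes):
    """Test every marker independently: the defining trait of single-locus models."""
    return [single_marker_test(m, phenotypes) for m in markers]


# TOY DATA (hypothetical): three markers scored on eight plants.
markers = [
    [0, 0, 1, 1, 2, 2, 0, 2],  # constructed to track the phenotype
    [1, 0, 2, 0, 1, 2, 1, 0],  # unrelated marker
    [0, 0, 0, 0, 0, 0, 0, 0],  # monomorphic marker
]
phenotypes = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2, 0.9, 2.9]

results = scan(markers, phenotypes)
top = max(range(len(results)), key=lambda i: abs(results[i][1]))
print("strongest marker index:", top)
```

Multilocus models differ in that they fit several markers jointly, so the effect attributed to one marker is adjusted for the others rather than estimated in isolation.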
Affiliation(s)
- Milad Eskandari
- Department of Plant Agriculture, University of Guelph, Guelph, ON, Canada
- François Belzile
- Département de Phytologie, Université Laval, Quebec City, QC, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Quebec City, QC, Canada
- Davoud Torkamaneh
- Département de Phytologie, Université Laval, Quebec City, QC, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Quebec City, QC, Canada
|
14
|
Yoosefzadeh-Najafabadi M, Torabi S, Tulpan D, Rajcan I, Eskandari M. Genome-Wide Association Studies of Soybean Yield-Related Hyperspectral Reflectance Bands Using Machine Learning-Mediated Data Integration Methods. FRONTIERS IN PLANT SCIENCE 2021; 12:777028. [PMID: 34880894 PMCID: PMC8647880 DOI: 10.3389/fpls.2021.777028] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 09/14/2021] [Accepted: 10/18/2021] [Indexed: 05/12/2023]
Abstract
In conjunction with big data analysis methods, plant omics technologies have provided scientists with cost-effective and promising tools for discovering genetic architectures of complex agronomic traits using large breeding populations. In recent years, there has been significant progress in plant phenomics and genomics approaches for generating reliable large datasets. However, selecting an appropriate data integration and analysis method to improve the efficiency of phenome-phenome and phenome-genome association studies is still a bottleneck. This study proposes a hyperspectral wide association study (HypWAS) approach as a phenome-phenome association analysis through a hierarchical data integration strategy to estimate the prediction power of hyperspectral reflectance bands in predicting soybean seed yield. Using HypWAS, five important hyperspectral reflectance bands in the visible, red-edge, and near-infrared regions were identified as significantly associated with seed yield. The phenome-genome association analysis of each tested hyperspectral reflectance band was performed using two conventional genome-wide association study (GWAS) methods and a machine learning-mediated GWAS based on the support vector regression (SVR) method. Using SVR-mediated GWAS, more QTL relevant to the physiological background of the tested hyperspectral reflectance bands were detected, supported by the functional annotation of candidate genes. The results of this study indicate the advantages of using a hierarchical data integration strategy and advanced mathematical methods coupled with phenome-phenome and phenome-genome association analyses for a better understanding of the biology and genetic backgrounds of hyperspectral reflectance bands affecting soybean yield formation. The yield-related hyperspectral reflectance bands identified using HypWAS can be used as indirect selection criteria for selecting superior genotypes with improved yield genetic gains in large breeding populations.
Affiliation(s)
- Sepideh Torabi
- Department of Plant Agriculture, University of Guelph, Guelph, ON, Canada
- Dan Tulpan
- Department of Animal Biosciences, University of Guelph, Guelph, ON, Canada
- Istvan Rajcan
- Department of Plant Agriculture, University of Guelph, Guelph, ON, Canada
- Milad Eskandari
- Department of Plant Agriculture, University of Guelph, Guelph, ON, Canada
|
15
|
Vogel JT, Liu W, Olhoft P, Crafts-Brandner SJ, Pennycooke JC, Christiansen N. Soybean Yield Formation Physiology - A Foundation for Precision Breeding Based Improvement. FRONTIERS IN PLANT SCIENCE 2021; 12:719706. [PMID: 34868106 PMCID: PMC8634342 DOI: 10.3389/fpls.2021.719706] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 06/02/2021] [Accepted: 10/11/2021] [Indexed: 05/25/2023]
Abstract
The continued improvement of crop yield is a fundamental driver in agriculture and is the goal of both plant breeders and researchers. Plant breeders have been remarkably successful in improving crop yield, as demonstrated by the continued release of varieties with improved yield potential. This has largely been accomplished through performance-based selection, without specific knowledge of the molecular mechanisms underpinning these improvements. Insight into molecular mechanisms has been provided by plant molecular, genetic, and biochemical research through elucidation of the function of genes and pathways that underlie many of the physiological processes that contribute to yield potential. Despite this knowledge, the impact of most genes and pathways on yield components has not been tested in key crops or in a field environment for yield assessment. This gap is difficult to bridge, but field-based physiological knowledge offers a starting point for leveraging molecular targets to successfully apply precision breeding technologies such as genome editing. A better understanding of both the molecular mechanisms underlying crop yield physiology and yield-limiting processes under field conditions is essential for elucidating which combinations of favorable alleles are required for yield improvement. Consequently, one goal in plant biology should be to more fully integrate crop physiology, breeding, genetics, and molecular knowledge to identify impactful precision breeding targets for relevant yield traits. The foundation for this is an understanding of yield formation physiology. Here, using soybean as an example, we provide a top-down review of yield physiology, starting with the fact that yield is derived from a population of plants growing together in a community. We review yield and yield-related components to provide a basic overview of yield physiology, synthesizing these concepts to highlight how such knowledge can be leveraged for soybean improvement. Using genome editing as an example, we discuss why multiple disciplines must be brought together to fully realize the promise of precision breeding-based crop improvement.
|
16
|
Pepe M, Hesami M, Jones AMP. Machine Learning-Mediated Development and Optimization of Disinfection Protocol and Scarification Method for Improved In Vitro Germination of Cannabis Seeds. PLANTS (BASEL, SWITZERLAND) 2021; 10:2397. [PMID: 34834760 PMCID: PMC8619272 DOI: 10.3390/plants10112397] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 10/19/2021] [Revised: 11/01/2021] [Accepted: 11/05/2021] [Indexed: 05/22/2023]
Abstract
In vitro seed germination is a useful tool for developing a variety of biotechnologies, but cannabis has presented some challenges in uniformity and germination time, presumably due to the disinfection procedure. Disinfection and subsequent growth are influenced by many factors, such as media pH, temperature, and the types and levels of contaminants and disinfectants, which contribute independently and dynamically to system complexity and nonlinearity. Hence, artificial intelligence models are well suited to model and optimize this dynamic system. The current study aimed to evaluate the effect of different types and concentrations of disinfectants (sodium hypochlorite, hydrogen peroxide) and immersion times on contamination frequency using the generalized regression neural network (GRNN), a powerful artificial neural network (ANN). The GRNN model had high prediction performance (R² > 0.91) in both training and testing. Moreover, a genetic algorithm (GA) was applied to the GRNN to find the optimal type and level of disinfectants and immersion time for reducing contamination. According to the optimization process, 4.6% sodium hypochlorite along with 0.008% hydrogen peroxide for 16.81 min would result in the best outcomes. A validation experiment demonstrated that this protocol resulted in 0% contamination as predicted, but germination rates were low and sporadic. However, using this sterilization protocol in combination with scarification of in vitro cannabis seed (seed tip removal) resulted in 0% contamination and 100% seed germination within one week.
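The R² figure reported for training and testing can be made concrete with a short sketch. It assumes the common definition R² = 1 - SS_res / SS_tot; the abstract does not say which definition the authors used (the squared correlation coefficient is another convention), and the contamination percentages below are hypothetical illustration values.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, defined here as 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# HYPOTHETICAL contamination frequencies (%), for illustration only.
observed_test = [80.0, 55.0, 30.0, 10.0, 0.0]
predicted_test = [76.0, 58.0, 27.0, 12.0, 3.0]
print("test R^2:", r_squared(observed_test, predicted_test))
```

Reporting the metric separately for the training and testing subsets, as the abstract does, is what allows a reader to judge whether the model generalizes or merely memorizes the training treatments.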
|
17
|
Pepe M, Hesami M, Small F, Jones AMP. Comparative Analysis of Machine Learning and Evolutionary Optimization Algorithms for Precision Micropropagation of Cannabis sativa: Prediction and Validation of in vitro Shoot Growth and Development Based on the Optimization of Light and Carbohydrate Sources. FRONTIERS IN PLANT SCIENCE 2021; 12:757869. [PMID: 34745189 PMCID: PMC8566924 DOI: 10.3389/fpls.2021.757869] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 08/12/2021] [Accepted: 09/30/2021] [Indexed: 05/03/2023]
Abstract
Micropropagation techniques offer the opportunity to proliferate, maintain, and study dynamic plant responses in highly controlled environments without confounding external influences, forming the basis for many biotechnological applications. With medicinal and recreational interest in Cannabis sativa L. growing, research related to the optimization of in vitro practices is needed to improve current methods while boosting our understanding of the underlying physiological processes. Unfortunately, due to the exorbitantly large array of factors influencing tissue culture, existing approaches to optimize in vitro methods are tedious and time-consuming. Therefore, there is great potential to use new computational methodologies for analyzing data to develop improved protocols more efficiently. Here, we first tested the effects of light qualities using assorted combinations of Red, Blue, Far Red, and White spanning 0-100 μmol/m²/s in combination with sucrose concentrations ranging from 1 to 6% (w/v), totaling 66 treatments, on in vitro shoot growth, root development, number of nodes, shoot emergence, and canopy surface area. Collected data were then assessed using multilayer perceptron (MLP), generalized regression neural network (GRNN), and adaptive neuro-fuzzy inference system (ANFIS) to model and predict in vitro Cannabis growth and development. Based on the results, GRNN had better performance than MLP or ANFIS and was consequently selected to link different optimization algorithms [genetic algorithm (GA), biogeography-based optimization (BBO), interior search algorithm (ISA), and symbiotic organisms search (SOS)] for prediction of optimal light levels (quality/intensity) and sucrose concentration for various applications. Predictions of in vitro conditions to refine growth responses were subsequently tested in a validation experiment, and the data showed no significant differences between predicted optimized values and observed data. Thus, this study demonstrates the potential of machine learning and optimization algorithms to predict the most favorable light combinations and sucrose levels to elicit specific developmental responses. Based on these, recommendations of light and carbohydrate levels to promote specific developmental outcomes for in vitro Cannabis are suggested. Ultimately, this work showcases the importance of light quality and carbohydrate supply in directing plant development as well as the power of machine learning approaches to investigate complex interactions in plant tissue culture.
Affiliation(s)
- Marco Pepe
- Department of Plant Agriculture, Gosling Research Institute for Plant Preservation, University of Guelph, Guelph, ON, Canada
- Mohsen Hesami
- Department of Plant Agriculture, Gosling Research Institute for Plant Preservation, University of Guelph, Guelph, ON, Canada
- Finlay Small
- Department of Research and Development, Entourage Health Corp., Guelph, ON, Canada
- Andrew Maxwell Phineas Jones
- Department of Plant Agriculture, Gosling Research Institute for Plant Preservation, University of Guelph, Guelph, ON, Canada
|
18
|
Using Hybrid Artificial Intelligence and Evolutionary Optimization Algorithms for Estimating Soybean Yield and Fresh Biomass Using Hyperspectral Vegetation Indices. REMOTE SENSING 2021. [DOI: 10.3390/rs13132555] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Indexed: 12/21/2022]
Abstract
Recent advances in high-throughput field phenotyping, combined with sophisticated big data analysis methods, have provided plant breeders with unprecedented tools for better prediction of important agronomic traits, such as yield and fresh biomass (FBIO), at early growth stages. This study aimed to demonstrate the potential use of 35 selected hyperspectral vegetation indices (HVI), collected at the R5 growth stage, for predicting soybean seed yield and FBIO. Two artificial intelligence algorithms, ensemble-bagging (EB) and deep neural network (DNN), were used to predict soybean seed yield and FBIO using HVI. Considering HVI as input variables, coefficients of determination (R²) of 0.76 and 0.77 for yield and 0.91 and 0.89 for FBIO were obtained using DNN and EB, respectively. In this study, we also used hybrid DNN-SPEA2 to estimate the optimum HVI values in soybeans with maximized yield and FBIO production. In addition, to identify the most informative HVI in predicting yield and FBIO, the recursive feature elimination wrapper method was used, and the top-ranking HVI were determined to be associated with the red (670 nm) and near-infrared (800 nm) regions. Overall, this study introduced hybrid DNN-SPEA2 as a robust mathematical tool for optimizing and using informative HVI for estimating soybean seed yield and FBIO at early growth stages, which can be employed by soybean breeders for discriminating superior genotypes in large breeding populations.
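Maximizing yield and FBIO at the same time is a two-objective problem, so there is generally a set of trade-off solutions rather than a single optimum. The sketch below shows only the Pareto-dominance filter that multi-objective evolutionary algorithms such as SPEA2 build on; it is not an implementation of SPEA2 itself, and the candidate (yield, FBIO) pairs are hypothetical values, not predictions from the study.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and better in at least one.

    Both objectives are treated as quantities to maximize.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q != p)
    ]


# HYPOTHETICAL (yield, FBIO) predictions for five candidate HVI settings.
candidates = [(1.0, 5.0), (2.0, 4.0), (3.0, 3.0), (2.0, 2.0), (1.0, 1.0)]
front = pareto_front(candidates)
print(front)
```

In this toy set the last two candidates are dropped because (3.0, 3.0) is at least as good as each of them on both objectives; the three survivors each trade some yield for FBIO or the reverse.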
|