1
Silva JCF, Ferreira MA, Carvalho TFM, Silva FF, de A. Silveira S, Brommonschenkel SH, Fontes EPB. RLPredictiOme, a Machine Learning-Derived Method for High-Throughput Prediction of Plant Receptor-like Proteins, Reveals Novel Classes of Transmembrane Receptors. Int J Mol Sci 2022; 23:ijms232012176. [PMID: 36293031 PMCID: PMC9603095 DOI: 10.3390/ijms232012176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2022] [Revised: 10/08/2022] [Accepted: 10/09/2022] [Indexed: 11/16/2022] [Open Access]
Abstract
Cell surface receptors play essential roles in perceiving and processing external and internal signals at the cell surface of plants and animals. The receptor-like protein kinases (RLKs) and receptor-like proteins (RLPs), two major classes of proteins with membrane receptor configuration, play a crucial role in plant development and disease defense. Although RLPs and RLKs share a similar single-pass transmembrane configuration, RLPs harbor short divergent C-terminal regions instead of the conserved kinase domain of RLKs. This RLP structural design precludes the use of sequence comparison algorithms for high-throughput predictions of the RLP family in plant genomes, as has been extensively performed for RLK superfamily predictions. Here, we developed RLPredictiOme, implemented with machine learning (ML) models in combination with Bayesian inference, capable of predicting RLP subfamilies in plant genomes. The ML models were simultaneously trained using six types of features, in three stages designed to distinguish RLPs from non-RLPs (NRLPs), distinguish RLPs from RLKs, and classify new subfamilies of RLPs in plants. The ML models achieved high accuracy, precision, sensitivity, and specificity for predicting RLPs, with relatively high probabilities ranging from 0.79 to 0.99. The method was assessed with three datasets, two of which contained leucine-rich repeat (LRR)-RLPs from Arabidopsis and rice, while the third consisted of the complete set of previously described Arabidopsis RLPs. In these validation tests, more than 90% of known RLPs were correctly predicted by RLPredictiOme. In addition to predicting previously characterized RLPs, RLPredictiOme uncovered new RLP subfamilies in the Arabidopsis genome. These include the probable lipid transfer (PLT)-RLP, plastocyanin-like-RLP, ring finger-RLP, glycosyl-hydrolase-RLP, and glycerophosphoryldiester phosphodiesterase (GDPD, GDPDL)-RLP subfamilies, which are yet to be characterized.
Molecular evolution studies, in comparison with the only Arabidopsis GDPDL-RLK, indicated that the ectodomain of GDPDL-RLPs might have undergone purifying selection with a predominance of synonymous substitutions. Expression analyses revealed that predicted GDPDL-RLPs display a basal expression level and respond to developmental and biotic signals. The results of these biological assays indicate that these subfamily members have maintained functional domains during evolution and may play relevant roles in development and plant defense. Therefore, RLPredictiOme provides a framework for genome-wide surveys of the RLP superfamily as a foundation to rationalize functional studies of surface receptors and their relationships with different biological processes.
Affiliation(s)
- Jose Cleydson F. Silva
  - National Institute of Science and Technology in Plant-Pest Interactions, Bioagro, Viçosa 36570-900, Brazil
- Marco Aurélio Ferreira
  - Department of Biochemistry and Molecular Biology, Universidade Federal de Viçosa, Viçosa 36570-900, Brazil
- Thales F. M. Carvalho
  - Institute of Engineering, Science and Technology, Universidade Federal dos Vales do Jequitinhonha e Mucuri, Janaúba 39447-814, Brazil
- Fabyano F. Silva
  - Department of Animal Science, Universidade Federal de Viçosa, Viçosa 36570-900, Brazil
- Sabrina de A. Silveira
  - Department of Computer Science, Universidade Federal de Viçosa, Viçosa 36570-900, Brazil
- Elizabeth P. B. Fontes
  - Department of Biochemistry and Molecular Biology, Universidade Federal de Viçosa, Viçosa 36570-900, Brazil
2
Santana TEZ, Silva JCF, Silva LOCD, Alvarenga AB, Menezes GRDO, Torres RA, Duarte MDS, Silva FFE. Genome-enabled classification of stayability in Nellore cattle under a machine learning framework. Livest Sci 2022. [DOI: 10.1016/j.livsci.2022.104935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
3
Martins LGC, Raimundo GAS, Ribeiro NGA, Silva JCF, Euclydes NC, Loriato VAP, Duarte CEM, Fontes EPB. A Begomovirus Nuclear Shuttle Protein-Interacting Immune Hub: Hijacking Host Transport Activities and Suppressing Incompatible Functions. Front Plant Sci 2020; 11:398. [PMID: 32322262 PMCID: PMC7156597 DOI: 10.3389/fpls.2020.00398] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 01/14/2020] [Accepted: 03/19/2020] [Indexed: 05/21/2023]
Abstract
Begomoviruses (Geminiviridae family) represent a severe constraint to agriculture worldwide. Because begomoviruses are ssDNA viruses that replicate in the nuclei of infected cells, the nascent viral DNA has to move to the cytoplasm and then to the adjacent cell to cause disease. The begomovirus nuclear shuttle protein (NSP) assists the intracellular transport of viral DNA from the nucleus to the cytoplasm and cooperates with the movement protein (MP) for the cell-to-cell translocation of viral DNA to uninfected cells. As a facilitator of intra- and intercellular transport of viral DNA, NSP is predicted to associate with host proteins from the nuclear export machinery, the intracytoplasmic active transport system, and the cell-to-cell transport complex. Furthermore, NSP functions as a virulence factor that suppresses antiviral immunity against begomoviruses. In this review, we focus on the protein-protein network that converges on NSP with a high degree of centrality and forms an immune hub against begomoviruses. We also describe the compatible host functions hijacked by NSP to promote the nucleocytoplasmic and intracytoplasmic movement of viral DNA. Finally, we discuss the NSP virulence function as a suppressor of the recently described NSP-interacting kinase 1 (NIK1)-mediated antiviral immunity. Understanding the NSP-host protein-protein interaction (PPI) network will probably pave the way for strategies to generate more durable resistance against begomoviruses.
4
Silva JCF, Teixeira RM, Silva FF, Brommonschenkel SH, Fontes EPB. Machine learning approaches and their current application in plant molecular biology: A systematic review. Plant Sci 2019; 284:37-47. [PMID: 31084877 DOI: 10.1016/j.plantsci.2019.03.020] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 12/18/2018] [Revised: 02/28/2019] [Accepted: 03/26/2019] [Indexed: 05/19/2023]
Abstract
Machine learning (ML) is a field of artificial intelligence that has rapidly emerged in molecular biology, allowing the exploitation of Big Data concepts in plant genomics. In this context, the main challenges are how to analyze massive datasets and extract new knowledge at all levels of cellular systems research. In summary, ML techniques allow complex interactions to be inferred in several biological systems. Despite its potential, ML has been underused due to its complex computational algorithms and terminology. Therefore, a systematic review to disentangle ML approaches is relevant for plant scientists and has been considered in this study. We presented the main steps for ML development (from data selection to evaluation of classification/prediction models), with a corresponding discussion of functional genomics, mainly in terms of pathogen effector genes in plant immunity. We also considered how to access public source databases under an ML framework towards advancing plant molecular biology and introduced novel powerful tools, such as deep learning.
Affiliation(s)
- Jose Cleydson F Silva
  - National Institute of Science and Technology in Plant-Pest Interactions, Bioagro, Universidade Federal de Viçosa, Av. PH Rolfs s/n, Centro, Viçosa, MG, 36570-000, Brazil; Department of Biochemistry and Molecular Biology/Bioagro, Universidade Federal de Viçosa, Viçosa, MG, Brazil
- Ruan M Teixeira
  - National Institute of Science and Technology in Plant-Pest Interactions, Bioagro, Universidade Federal de Viçosa, Av. PH Rolfs s/n, Centro, Viçosa, MG, 36570-000, Brazil; Department of Biochemistry and Molecular Biology/Bioagro, Universidade Federal de Viçosa, Viçosa, MG, Brazil
- Fabyano F Silva
  - Department of Animal Science, Universidade Federal de Viçosa, Viçosa, MG, Brazil
- Sergio H Brommonschenkel
  - National Institute of Science and Technology in Plant-Pest Interactions, Bioagro, Universidade Federal de Viçosa, Av. PH Rolfs s/n, Centro, Viçosa, MG, 36570-000, Brazil; Plant Pathology Department/Bioagro, Universidade Federal de Viçosa, Viçosa, MG, Brazil
- Elizabeth P B Fontes
  - National Institute of Science and Technology in Plant-Pest Interactions, Bioagro, Universidade Federal de Viçosa, Av. PH Rolfs s/n, Centro, Viçosa, MG, 36570-000, Brazil; Department of Biochemistry and Molecular Biology/Bioagro, Universidade Federal de Viçosa, Viçosa, MG, Brazil
5
Silva JCF, Carvalho TFM, Basso MF, Deguchi M, Pereira WA, Sobrinho RR, Vidigal PMP, Brustolini OJB, Silva FF, Dal-Bianco M, Fontes RLF, Santos AA, Zerbini FM, Cerqueira FR, Fontes EPB. Geminivirus data warehouse: a database enriched with machine learning approaches. BMC Bioinformatics 2017; 18:240. [PMID: 28476106 PMCID: PMC5420152 DOI: 10.1186/s12859-017-1646-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 12/23/2016] [Accepted: 04/25/2017] [Indexed: 03/28/2023] [Open Access]
Abstract
BACKGROUND: The Geminiviridae family encompasses a group of single-stranded DNA viruses with twinned and quasi-isometric virions, which infect a wide range of dicotyledonous and monocotyledonous plants and are responsible for significant economic losses worldwide. Geminiviruses are divided into nine genera, according to their insect vector, host range, genome organization, and phylogeny reconstruction. Using rolling-circle amplification approaches along with high-throughput sequencing technologies, thousands of full-length geminivirus and satellite genome sequences were amplified and have become available in public databases. As a consequence, many important challenges have emerged, namely, how to classify, store, and analyze massive datasets as well as how to extract information or new knowledge. Data mining approaches, mainly supported by machine learning (ML) techniques, are a natural means for high-throughput data analysis in the context of genomics, transcriptomics, proteomics, and metabolomics.
RESULTS: Here, we describe the development of a data warehouse enriched with ML approaches, designated geminivirus.org. We implemented search modules, bioinformatics tools, and ML methods to retrieve high precision information, demarcate species, and create classifiers for genera and open reading frames (ORFs) of geminivirus genomes.
CONCLUSIONS: The use of data mining techniques such as ETL (Extract, Transform, Load) to feed our database, as well as algorithms based on machine learning for knowledge extraction, allowed us to obtain a database with quality data and suitable tools for bioinformatics analysis. The Geminivirus Data Warehouse (geminivirus.org) offers a simple and user-friendly environment for information retrieval and knowledge discovery related to geminiviruses.
Affiliation(s)
- Jose Cleydson F Silva
  - Departamento de Informática, Universidade Federal de Viçosa, Viçosa, Brazil; National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil
- Marcos F Basso
  - National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil
- Michihito Deguchi
  - National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil
- Welison A Pereira
  - National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil
- Roberto R Sobrinho
  - National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil
- Pedro M P Vidigal
  - Núcleo de Biomoléculas, Universidade Federal de Viçosa, Viçosa, MG, Brazil
- Otávio J B Brustolini
  - National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil
- Fabyano F Silva
  - Departamento de Zootecnia, Universidade Federal de Viçosa, Viçosa, Brazil
- Maximiller Dal-Bianco
  - National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil
- Anésia A Santos
  - National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil; Departamento de Biologia Geral, Universidade Federal de Viçosa, Viçosa, Brazil
- Francisco Murilo Zerbini
  - National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil; Departamento de Fitopatologia, Universidade Federal de Viçosa, Viçosa, MG, Brazil
- Fabio R Cerqueira
  - Departamento de Informática, Universidade Federal de Viçosa, Viçosa, Brazil; Departamento de Engenharia de Produção, Universidade Federal Fluminense, Petrópolis, Rio de Janeiro, Brazil
- Elizabeth P B Fontes
  - National Institute of Science and Technology in Plant-Pest Interactions/BIOAGRO, Universidade Federal de Viçosa, Viçosa, Brazil; Departamento de Bioquímica e Biologia Molecular, Universidade Federal de Viçosa, Viçosa, Brazil