1
Taveira IC, Carraro CB, Nogueira KMV, Pereira LMS, Bueno JGR, Fiamenghi MB, dos Santos LV, Silva RN. Structural and biochemical insights of xylose MFS and SWEET transporters in microbial cell factories: challenges to lignocellulosic hydrolysates fermentation. Front Microbiol 2024; 15:1452240. [PMID: 39397797 PMCID: PMC11466781 DOI: 10.3389/fmicb.2024.1452240] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2024] [Accepted: 09/16/2024] [Indexed: 10/15/2024] [Open Access]
Abstract
The production of bioethanol from lignocellulosic biomass requires the efficient conversion of glucose and xylose to ethanol, a process that depends on the ability of microorganisms to internalize these sugars. Although glucose transporters exist in several species, xylose transporters are less common. Several types of transporters have been identified in diverse microorganisms, including members of the Major Facilitator Superfamily (MFS) and Sugars Will Eventually be Exported Transporter (SWEET) families. Considering that Saccharomyces cerevisiae lacks an effective xylose transport system, engineered yeast strains capable of efficiently consuming this sugar are critical for obtaining high ethanol yields. This article reviews the structure-function relationship of sugar transporters from the MFS and SWEET families. It provides information on several tools and approaches used to identify and characterize them to optimize xylose consumption and, consequently, second-generation ethanol production.
Affiliation(s)
- Iasmin Cartaxo Taveira
  - Molecular Biotechnology Laboratory, Department of Biochemistry and Immunology, Ribeirao Preto Medical School (FMRP), University of São Paulo, São Paulo, Brazil
- Cláudia Batista Carraro
  - Molecular Biotechnology Laboratory, Department of Biochemistry and Immunology, Ribeirao Preto Medical School (FMRP), University of São Paulo, São Paulo, Brazil
- Karoline Maria Vieira Nogueira
  - Molecular Biotechnology Laboratory, Department of Biochemistry and Immunology, Ribeirao Preto Medical School (FMRP), University of São Paulo, São Paulo, Brazil
- Lucas Matheus Soares Pereira
  - Molecular Biotechnology Laboratory, Department of Biochemistry and Immunology, Ribeirao Preto Medical School (FMRP), University of São Paulo, São Paulo, Brazil
- João Gabriel Ribeiro Bueno
  - Genetics and Molecular Biology Graduate Program, Institute of Biology, University of Campinas (UNICAMP), Campinas, Brazil
- Mateus Bernabe Fiamenghi
  - Genetics and Molecular Biology Graduate Program, Institute of Biology, University of Campinas (UNICAMP), Campinas, Brazil
- Leandro Vieira dos Santos
  - Genetics and Molecular Biology Graduate Program, Institute of Biology, University of Campinas (UNICAMP), Campinas, Brazil
  - Manchester Institute of Biotechnology, University of Manchester, Manchester, United Kingdom
- Roberto N. Silva
  - Molecular Biotechnology Laboratory, Department of Biochemistry and Immunology, Ribeirao Preto Medical School (FMRP), University of São Paulo, São Paulo, Brazil

2
Ghazikhani H, Butler G. Exploiting protein language models for the precise classification of ion channels and ion transporters. Proteins 2024; 92:998-1055. [PMID: 38656743 DOI: 10.1002/prot.26694] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 03/26/2024] [Accepted: 04/08/2024] [Indexed: 04/26/2024]
Abstract
This study introduces TooT-PLM-ionCT, a comprehensive framework that consolidates three distinct systems, each meticulously tailored for one of the following tasks: distinguishing ion channels (ICs) from membrane proteins (MPs), segregating ion transporters (ITs) from MPs, and differentiating ICs from ITs. Drawing upon the strengths of six Protein Language Models (PLMs), including ProtBERT, ProtBERT-BFD, ESM-1b, ESM-2 (650M parameters), and ESM-2 (15B parameters), TooT-PLM-ionCT employs a combination of traditional classifiers and deep learning models for nuanced protein classification. Originally validated on an existing dataset by previous researchers, our systems demonstrated superior performance in identifying ITs from MPs and distinguishing ICs from ITs, with the IC-MP discrimination achieving state-of-the-art results. In light of recommendations for additional validation, we introduced a new dataset, significantly enhancing the robustness and generalization of our models across bioinformatics challenges. This new evaluation underscored the effectiveness of TooT-PLM-ionCT in adapting to novel data while maintaining high classification accuracy. Furthermore, this study explores critical factors affecting classification accuracy, such as dataset balancing, the impact of using frozen versus fine-tuned PLM representations, and the variance between half and full precision in floating-point computations. To facilitate broader application and accessibility, a web server (https://tootsuite.encs.concordia.ca/service/TooT-PLM-ionCT) has been developed, allowing users to evaluate unknown protein sequences through our specialized systems for IC-MP, IT-MP, and IC-IT classification tasks.
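The abstract above lists half versus full precision as one factor that can affect classification accuracy. As a rough illustration only (this is not the authors' code, and the helper names `to_half` and `to_single` are hypothetical), the sketch below round-trips a Python float through IEEE 754 binary16 and binary32 using the standard-library `struct` module, to show how much more rounding error a half-precision value carries. It assumes that "full precision" means 32-bit floats, as is common in deep learning.

```python
import struct


def to_half(x: float) -> float:
    """Round-trip x through IEEE 754 binary16 (half precision)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_single(x: float) -> float:
    """Round-trip x through IEEE 754 binary32 (single precision)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def rounding_errors(values):
    """Return (max half-precision error, max single-precision error)."""
    half_err = max(abs(to_half(v) - v) for v in values)
    single_err = max(abs(to_single(v) - v) for v in values)
    return half_err, single_err


if __name__ == "__main__":
    # Hypothetical embedding-like values; not taken from any model.
    sample = [0.1, -0.731, 3.14159, 12.5]
    half_err, single_err = rounding_errors(sample)
    print(f"max error, half precision:   {half_err:.3e}")
    print(f"max error, single precision: {single_err:.3e}")
```

Whether rounding at this scale changes a classifier's decision depends on the model and data, which is the question the study examines empirically.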
Affiliation(s)
- Hamed Ghazikhani
  - Department of Computer Science and Software Engineering, Concordia University, Montréal, Québec, Canada
- Gregory Butler
  - Centre for Structural and Functional Genomics, Concordia University, Montréal, Québec, Canada

3
Anteghini M, Santos VAMD, Saccenti E. PortPred: Exploiting deep learning embeddings of amino acid sequences for the identification of transporter proteins and their substrates. J Cell Biochem 2023; 124:1803-1824. [PMID: 37877557 DOI: 10.1002/jcb.30490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 09/29/2023] [Accepted: 10/03/2023] [Indexed: 10/26/2023]
Abstract
The physiology of every living cell is regulated at some level by transporter proteins which constitute a relevant portion of membrane-bound proteins and are involved in the movement of ions, small and macromolecules across bio-membranes. The importance of transporter proteins is unquestionable. The prediction and study of previously unknown transporters can lead to the discovery of new biological pathways, drugs and treatments. Here we present PortPred, a tool to accurately identify transporter proteins and their substrate starting from the protein amino acid sequence. PortPred successfully combines pre-trained deep learning-based protein embeddings and machine learning classification approaches and outperforms other state-of-the-art methods. In addition, we present a comparison of the most promising protein sequence embeddings (Unirep, SeqVec, ProteinBERT, ESM-1b) and their performances for this specific task.
Affiliation(s)
- Marco Anteghini
  - LifeGlimmer GmbH, Berlin, Germany
  - Department of Systems and Synthetic Biology, Wageningen University & Research, Wageningen WE, The Netherlands
  - Department of Visual and Data-Centric Computing, Zuse Institute Berlin, Berlin, Germany
- Vitor Ap Martins Dos Santos
  - LifeGlimmer GmbH, Berlin, Germany
  - Department of Bioprocess Engineering, Wageningen University & Research, Wageningen WE, The Netherlands
- Edoardo Saccenti
  - Department of Systems and Synthetic Biology, Wageningen University & Research, Wageningen WE, The Netherlands

4
Ghazikhani H, Butler G. Enhanced identification of membrane transport proteins: a hybrid approach combining ProtBERT-BFD and convolutional neural networks. J Integr Bioinform 2023; 0:jib-2022-0055. [PMID: 37497772 PMCID: PMC10389051 DOI: 10.1515/jib-2022-0055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2022] [Accepted: 06/21/2023] [Indexed: 07/28/2023] [Open Access]
Abstract
Transmembrane transport proteins (transporters) play a crucial role in the fundamental cellular processes of all organisms by facilitating the transport of hydrophilic substrates across hydrophobic membranes. Despite the availability of numerous membrane protein sequences, their structures and functions remain largely elusive. Recently, natural language processing (NLP) techniques have shown promise in the analysis of protein sequences. Bidirectional Encoder Representations from Transformers (BERT) is an NLP technique adapted for proteins to learn contextual embeddings of individual amino acids within a protein sequence. Our previous strategy, TooT-BERT-T, differentiated transporters from non-transporters by employing a logistic regression classifier with fine-tuned representations from ProtBERT-BFD. In this study, we expand upon this approach by utilizing representations from ProtBERT, ProtBERT-BFD, and MembraneBERT in combination with classical classifiers. Additionally, we introduce TooT-BERT-CNN-T, a novel method that fine-tunes ProtBERT-BFD and discriminates transporters using a Convolutional Neural Network (CNN). Our experimental results reveal that CNN surpasses traditional classifiers in discriminating transporters from non-transporters, achieving an MCC of 0.89 and an accuracy of 95.1 % on the independent test set. This represents an improvement of 0.03 and 1.11 percentage points compared to TooT-BERT-T, respectively.
Affiliation(s)
- Hamed Ghazikhani
  - Department of Computer Science and Software Engineering, Concordia University, Montreal, Canada
- Gregory Butler
  - Department of Computer Science and Software Engineering, Concordia University, Montreal, Canada

5
Wang Q, Xu T, Xu K, Lu Z, Ying J. Prediction of transport proteins from sequence information with the deep learning approach. Comput Biol Med 2023; 160:106974. [PMID: 37167658 DOI: 10.1016/j.compbiomed.2023.106974] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2022] [Revised: 04/17/2023] [Accepted: 04/22/2023] [Indexed: 05/13/2023]
Abstract
Transport proteins (TPs) are vital to the growth and life of all living things, especially in fields of microbial pathogenesis and drug resistance of tumor cells. Accurately identifying potential TPs remains an important challenge for the advancement of functional genomics. This study aimed to develop a tool for predicting TPs using the deep learning approach. Here, we proposed DeepTP, a convolutional neural network model that uses parallel subnetworks to extract features from protein sequences and uses fully connected layers for TP classification. To train and evaluate the performance of the developed model, datasets were collected from the UniProtKB/Swiss-Prot database. The test results revealed that the proposed model could successfully identify TPs with the AUCROC, accuracy, F-value, and Matthews correlation coefficient of 0.9719, 0.9513, 0.8982, and 0.8679, respectively. By further comparison, DeepTP achieved better performance than other commonly used methods. Analysis of the gradients of prediction score concerning input suggested that DeepTP makes predictions by recognizing the functional domains of TPs. We anticipate that DeepTP will serve as a useful tool for predicting TPs in large-scale genome projects, which will facilitate the discovery of novel TPs.
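For readers less familiar with the metrics quoted in the abstract above, the following minimal sketch shows how accuracy, F-value (taken here to mean F1, which is an assumption) and the Matthews correlation coefficient are conventionally derived from a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical; they are not taken from the DeepTP evaluation.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, precision, recall, F1 and MCC from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # MCC uses all four cells, so it stays informative on imbalanced data.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts: 110 transporters, 890 non-transporters.
    for name, value in binary_metrics(tp=90, fp=10, tn=880, fn=20).items():
        print(f"{name}: {value:.4f}")
```

With imbalanced classes such as transporters versus all other proteins, accuracy can remain high while F1 and MCC are noticeably lower, which is one reason studies of this kind report several metrics together.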
Affiliation(s)
- Qian Wang
  - Department of Clinical Laboratory, Wenzhou People's Hospital, The Third Affiliated Hospital of Shanghai University, The Third Clinical Institute Affiliated to Wenzhou Medical University, Wenzhou, China
- Teng Xu
  - Institute of Translational Medicine, Baotou Central Hospital, Baotou, China
- Kai Xu
  - Department of Clinical Laboratory, Wenzhou People's Hospital, The Third Affiliated Hospital of Shanghai University, The Third Clinical Institute Affiliated to Wenzhou Medical University, Wenzhou, China
- Zhongqiu Lu
  - Wenzhou Key Laboratory of Emergency, Critical Care, and Disaster Medicine, Department of Emergency, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Jianchao Ying
  - Central Laboratory, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
  - Wenzhou Key Laboratory of Emergency, Critical Care, and Disaster Medicine, Department of Emergency, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China