1
Fiamenghi MB, Bueno JGR, Camargo AP, Borelli G, Carazzolle MF, Pereira GAG, dos Santos LV, José J. Machine learning and comparative genomics approaches for the discovery of xylose transporters in yeast. Biotechnology for Biofuels and Bioproducts 2022; 15:57. [PMID: 35596177 PMCID: PMC9123741 DOI: 10.1186/s13068-022-02153-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/23/2022] [Accepted: 05/05/2022] [Indexed: 11/15/2022]
Abstract
Background The need to mitigate and replace the use of fossil fuels as the main energy matrix has led to the study and development of biofuels as an alternative. Second-generation (2G) ethanol is a biofuel with great potential because it does not compromise food security and can be produced from economically interesting crops such as energy-cane. One of the main challenges of 2G ethanol is the inefficient uptake of pentose sugars by the industrial yeast Saccharomyces cerevisiae, the main organism used for ethanol production. Understanding the main drivers of xylose assimilation and identifying novel and efficient transporters are key steps towards making the 2G process economically viable. Results By implementing a strategy of searching for present motifs that may be responsible for xylose transport and past adaptations of sugar transporters in xylose-fermenting species, we obtained a classification model that was successfully used to select four candidate transporters for evaluation in the S. cerevisiae hxt-null strain EBY.VW4000 harbouring the xylose consumption pathway. Yeast cells expressing the transporters SpX, SpH and SpG showed superior xylose uptake compared with the traditional literature control Gxf1. Conclusions Modelling xylose transport with the small amount of data available for yeast and bacteria proved a challenge that was overcome through different statistical strategies. Through this strategy, we present four novel xylose transporters, which expand the repertoire of candidates for yeast genetic engineering for industrial fermentation. Repeated use of the model to characterize new transporters will be useful both for finding the best candidates for industrial use and for increasing the model's predictive capabilities.
Supplementary Information The online version contains supplementary material available at 10.1186/s13068-022-02153-7.
2
Effects of Polyethylene Microplastics and Phenanthrene on Soil Properties, Enzyme Activities and Bacterial Communities. Processes (Basel) 2022. [DOI: 10.3390/pr10102128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022] Open
Abstract
Pollution by microplastics (MPs) or polycyclic aromatic hydrocarbons (PAHs) has received increasing attention because of their ubiquitous distribution and potential risks in soils. However, nothing is known about the influence of combined PAHs-MPs pollution on soil ecosystems. To address this knowledge gap, a 1-year soil microcosm experiment was conducted to systematically investigate the single and combined effects of polyethylene (PE) and phenanthrene (PHE) on soil chemical properties, enzymatic activities and bacterial communities (i.e., diversity, composition and function). Results showed that PE and PHE-PE significantly decreased soil pH. The available phosphorus (AP) and neutral phosphatase activity were not considerably changed by PHE, PE or PHE-PE. The significant enhancement of dehydrogenase activity in the PHE-PE amended system might be due to the degradation of PHE by indigenous bacteria (i.e., Sphingomonas, Sphingobium), and PE could enhance this stimulative effect. PHE and PHE-PE led to a slight increase in soil organic matter (SOM) and fluorescein diacetate hydrolase (FDAse) activity but a decrease in available nitrogen (AN) and urease activity. PE significantly enhanced the functions of the nitrogen cycle and metabolism, reducing SOM/AN contents but increasing urease/FDAse activities. The impacts on overall community diversity and composition in treated samples were insignificant, although some bacterial genera were significantly stimulated or attenuated by the treatments. In conclusion, the addition of PHE and PE influenced the soil chemical properties, enzymatic activities and bacterial community diversity/composition to some extent. The significantly positive effect of PE on the nitrogen cycle and on metabolic function might lead to the conspicuous alterations in SOM/AN contents and urease/FDAse activities. This study may provide new basic information for understanding the ecological risk of combined PAHs-MPs pollution in soils.
3
Alkhadrawi AM, Wang Y, Li C. In-silico screening of potential target transporters for glycyrrhetinic acid (GA) via deep learning prediction of drug-target interactions. Biochem Eng J 2022. [DOI: 10.1016/j.bej.2022.108375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
4
Ali Shah SM, Taju SW, Ho QT, Nguyen TTD, Ou YY. GT-Finder: Classify the family of glucose transporters with pre-trained BERT language models. Comput Biol Med 2021; 131:104259. [PMID: 33581474 DOI: 10.1016/j.compbiomed.2021.104259] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/04/2020] [Revised: 02/04/2021] [Accepted: 02/04/2021] [Indexed: 12/14/2022]
Abstract
Recently, language representation models have drawn a lot of attention in the field of natural language processing (NLP) due to their remarkable results. Among them, BERT (Bidirectional Encoder Representations from Transformers) has proven to be a simple, yet powerful language model that has achieved novel state-of-the-art performance. BERT adopted the concept of contextualized word embeddings to capture the semantics and context in which words appear. We utilized pre-trained BERT models to extract features from protein sequences for discriminating three families of glucose transporters: the major facilitator superfamily of glucose transporters (GLUTs), the sodium-glucose linked transporters (SGLTs), and the sugars will eventually be exported transporters (SWEETs). We treated protein sequences as sentences and transformed them into fixed-length meaningful vectors where a 768- or 1024-dimensional vector represents each amino acid. We observed that BERT-Base and BERT-Large models improved the performance by more than 4% in terms of average sensitivity and Matthews correlation coefficient (MCC), indicating the efficiency of this approach. We also developed a bidirectional transformer-based protein model (TransportersBERT) for comparison with existing pre-trained BERT models.
Affiliation(s)
- Syed Muazzam Ali Shah
  - Department of Computer Science & Engineering, Yuan Ze University, Chungli, 32003, Taiwan
- Semmy Wellem Taju
  - Department of Computer Science & Engineering, Yuan Ze University, Chungli, 32003, Taiwan
- Quang-Thai Ho
  - Department of Computer Science & Engineering, Yuan Ze University, Chungli, 32003, Taiwan
- Yu-Yen Ou
  - Department of Computer Science & Engineering, Yuan Ze University, Chungli, 32003, Taiwan
5
Alballa M, Aplop F, Butler G. TranCEP: Predicting the substrate class of transmembrane transport proteins using compositional, evolutionary, and positional information. PLoS One 2020; 15:e0227683. [PMID: 31935244 PMCID: PMC6959595 DOI: 10.1371/journal.pone.0227683] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 04/12/2019] [Accepted: 12/26/2019] [Indexed: 11/24/2022] Open
Abstract
Transporters mediate the movement of compounds across the membranes that separate the cell from its environment and across the inner membranes surrounding cellular compartments. It is estimated that one third of a proteome consists of membrane proteins, and many of these are transport proteins. Given the increase in the number of genomes being sequenced, there is a need for computational tools that predict the substrates that are transported by the transmembrane transport proteins. In this paper, we present TranCEP, a predictor of the type of substrate transported by a transmembrane transport protein. TranCEP combines the traditional use of the amino acid composition of the protein, with evolutionary information captured in a multiple sequence alignment (MSA), and restriction to important positions of the alignment that play a role in determining the specificity of the protein. Our experimental results show that TranCEP significantly outperforms the state-of-the-art predictors. The results quantify the contribution made by each type of information used.
Affiliation(s)
- Munira Alballa
  - Department of Computer Science and Software Engineering, Concordia University, Montréal, Québec, Canada
  - College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
- Faizah Aplop
  - School of Informatics and Applied Mathematics, Universiti Malaysia Terengganu, Malaysia
- Gregory Butler
  - Department of Computer Science and Software Engineering, Concordia University, Montréal, Québec, Canada
  - Centre for Structural and Functional Genomics, Concordia University, Montréal, Québec, Canada
6
Assessing Herb–Drug Interactions of Herbal Products With Therapeutic Agents for Metabolic Diseases: Analytical and Regulatory Perspectives. Studies in Natural Products Chemistry 2018. [DOI: 10.1016/b978-0-444-64179-3.00009-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 02/07/2023]
7
Dai X, Li J, Liu T, Zhao PX. HRGRN: A Graph Search-Empowered Integrative Database of Arabidopsis Signaling Transduction, Metabolism and Gene Regulation Networks. Plant & Cell Physiology 2016; 57:e12. [PMID: 26657893 PMCID: PMC4722177 DOI: 10.1093/pcp/pcv200] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 09/01/2015] [Accepted: 12/07/2015] [Indexed: 05/10/2023]
Abstract
The biological networks controlling plant signal transduction, metabolism and gene regulation are composed of not only tens of thousands of genes, compounds, proteins and RNAs but also the complicated interactions and co-ordination among them. These networks play critical roles in many fundamental mechanisms, such as plant growth, development and environmental response. Although much is known about these complex interactions, the knowledge and data are currently scattered throughout the published literature, publicly available high-throughput data sets and third-party databases. Many 'unknown' yet important interactions among genes need to be mined and established through extensive computational analysis. However, exploring these complex biological interactions at the network level from existing heterogeneous resources remains challenging and time-consuming for biologists. Here, we introduce HRGRN, a graph search-empowered integrative database of Arabidopsis signal transduction, metabolism and gene regulatory networks. HRGRN utilizes Neo4j, which is a highly scalable graph database management system, to host large-scale biological interactions among genes, proteins, compounds and small RNAs that were either validated experimentally or predicted computationally. The associated biological pathway information was also specially marked for the interactions that are involved in the pathway to facilitate the investigation of cross-talk between pathways. Furthermore, HRGRN integrates a series of graph path search algorithms to discover novel relationships among genes, compounds, RNAs and even pathways from heterogeneous biological interaction data that could be missed by traditional SQL database search methods. Users can also build subnetworks based on known interactions. The outcomes are visualized with rich text, figures and interactive network graphs on web pages. The HRGRN database is freely available at http://plantgrn.noble.org/hrgrn/.
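The graph path search idea in this abstract can be illustrated with a minimal sketch. The node names and the edge list are hypothetical and are not taken from HRGRN, and plain breadth-first search stands in for whichever path search algorithms the database actually integrates.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Return one shortest path between two nodes of an undirected graph, or None."""
    # Build an adjacency map from the (hypothetical) interaction pairs.
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # Sorting makes the traversal order deterministic.
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Hypothetical toy network mixing genes, a small RNA and a compound.
EDGES = [
    ("geneA", "miR1"),
    ("miR1", "geneB"),
    ("geneB", "compoundX"),
    ("geneA", "geneC"),
]

print(shortest_path(EDGES, "geneA", "compoundX"))
```

A path query of this kind links two entities through intermediate nodes of different types, which a lookup restricted to direct pairwise interactions would not return.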
Affiliation(s)
- Xinbin Dai
  - Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK 73401, USA
- Jun Li
  - Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK 73401, USA
- Tingsong Liu
  - Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK 73401, USA
- Patrick Xuechun Zhao
  - Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK 73401, USA
8
Gromiha MM, Anoosha P, Velmurugan D, Fukui K. Mutational studies to understand the structure–function relationship in multidrug efflux transporters: Applications for distinguishing mutants with high specificity. Int J Biol Macromol 2015; 75:218-24. [DOI: 10.1016/j.ijbiomac.2015.01.028] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/26/2014] [Revised: 01/14/2015] [Accepted: 01/16/2015] [Indexed: 12/21/2022]
9
Zuo YC, Su WX, Zhang SH, Wang SS, Wu CY, Yang L, Li GP. Discrimination of membrane transporter protein types using K-nearest neighbor method derived from the similarity distance of total diversity measure. Molecular BioSystems 2015; 11:950-7. [PMID: 25607774 DOI: 10.1039/c4mb00681j] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Indexed: 01/08/2023]
Abstract
Membrane transporters play crucial roles in the fundamental cellular processes of living organisms, and computational techniques are needed to annotate transporter functions. In this study, a multi-class K-nearest neighbor classifier based on the increment of diversity (KNN-ID) was developed to discriminate membrane transporter types, with the increment of diversity (ID) introduced as a novel similarity distance. Comparisons with multiple recently published methods showed that the proposed KNN-ID method outperformed the other methods, improving overall accuracy by more than 20%. The overall prediction accuracy reached 83.1% when K was set to 2. The prediction sensitivity was 76.7%, 89.1% and 80.1% for channels/pores, electrochemical potential-driven transporters and primary active transporters, respectively. Discrimination and comparison between any two classes of transporters further demonstrated that the proposed method is a promising classifier that can play a complementary role in facilitating the functional assignment of transporters.
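A nearest-neighbour classifier that uses the increment of diversity as its distance might look like the sketch below. It assumes the commonly used definition D(X) = N ln N - sum(n_i ln n_i) with ID(X, Y) = D(X + Y) - D(X) - D(Y), and it assumes plain amino acid counts as features; the abstract does not state the paper's actual features, and the training sequences here are invented toy strings.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def counts(seq):
    """Amino acid count vector (assumed feature set, not the paper's)."""
    c = Counter(seq)
    return [c.get(a, 0) for a in AMINO_ACIDS]


def diversity(v):
    """Diversity measure D(X) = N ln N - sum(n_i ln n_i)."""
    n = sum(v)
    if n == 0:
        return 0.0
    return n * math.log(n) - sum(x * math.log(x) for x in v if x > 0)


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y); smaller means more similar."""
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)


def knn_id_predict(query, training, k=2):
    """Majority vote among the k training sequences with the smallest ID."""
    q = counts(query)
    ranked = sorted(training, key=lambda item: increment_of_diversity(q, counts(item[0])))
    votes = Counter(label for _, label in ranked[:k])
    # On a tied vote the label seen first, i.e. the nearer neighbour, wins.
    return votes.most_common(1)[0][0]


# Invented toy data; the labels reuse two class names from the abstract.
TRAINING = [
    ("LLLLVVVVIIII", "channels/pores"),
    ("KKKKDDDDEEEE", "primary active transporters"),
]

print(knn_id_predict("LLVVII", TRAINING, k=1))
```

Two count vectors with identical proportions have an ID of zero, so the measure compares composition profiles rather than sequence lengths.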
Affiliation(s)
- Yong-Chun Zuo
  - The Key Laboratory of Mammalian Reproductive Biology and Biotechnology of the Ministry of Education, College of Life Sciences, Inner Mongolia University, Hohhot, 010021, China
10
Hu Y, Guo Y, Shi Y, Li M, Pu X. A consensus subunit-specific model for annotation of substrate specificity for ABC transporters. RSC Adv 2015. [DOI: 10.1039/c5ra05304h] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 11/21/2022] Open
Abstract
A consensus classification model was built by considering three subunit proteins individually to predict the substrate specificity of ABC transporters.
Affiliation(s)
- Yayun Hu
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
- Yanzhi Guo
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
- Yinan Shi
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
- Menglong Li
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
- Xuemei Pu
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
11
Mishra NK, Chang J, Zhao PX. Prediction of membrane transport proteins and their substrate specificities using primary sequence information. PLoS One 2014; 9:e100278. [PMID: 24968309 PMCID: PMC4072671 DOI: 10.1371/journal.pone.0100278] [Citation(s) in RCA: 74] [Impact Index Per Article: 7.4] [Received: 03/03/2014] [Accepted: 05/23/2014] [Indexed: 11/18/2022] Open
Abstract
Background Membrane transport proteins (transporters) move hydrophilic substrates across hydrophobic membranes and play vital roles in most cellular functions. Transporters represent a diverse group of proteins that differ in topology, energy coupling mechanism, and substrate specificity as well as sequence similarity. Among the functional annotations of transporters, information about their transporting substrates is especially important. The experimental identification and characterization of transporters is currently costly and time-consuming. The development of robust bioinformatics-based methods for the prediction of membrane transport proteins and their substrate specificities is therefore an important and urgent task. Results Support vector machine (SVM)-based computational models, which comprehensively utilize integrative protein sequence features such as amino acid composition, dipeptide composition, physico-chemical composition, biochemical composition, and position-specific scoring matrices (PSSM), were developed to predict the substrate specificity of seven transporter classes: amino acid, anion, cation, electron, protein/mRNA, sugar, and other transporters. An additional model to differentiate transporters from non-transporters was also developed. Among the developed models, the biochemical composition and PSSM hybrid model outperformed other models and achieved an overall average prediction accuracy of 76.69% with a Matthews correlation coefficient (MCC) of 0.49 and a receiver operating characteristic area under the curve (AUC) of 0.833 on our main dataset. This model also achieved an overall average prediction accuracy of 78.88% and MCC of 0.41 on an independent dataset. Conclusions Our analyses suggest that evolutionary information (i.e., the PSSM) and the AAIndex are key features for the substrate specificity prediction of transport proteins.
In comparison, similarity-based methods such as BLAST, PSI-BLAST, and hidden Markov models do not provide accurate predictions for the substrate specificity of membrane transport proteins. TrSSP: The Transporter Substrate Specificity Prediction Server, a web server that implements the SVM models developed in this paper, is freely available at http://bioinfo.noble.org/TrSSP.
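The abstract reports accuracy and MCC side by side. A minimal sketch of the two metrics for a single binary (one-versus-rest) confusion matrix follows; the counts in the example are hypothetical and are not the paper's data.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when a margin is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical imbalanced example: few positives, many negatives.
print(accuracy(tp=5, tn=90, fp=0, fn=5))
print(mcc(tp=5, tn=90, fp=0, fn=5))
```

Because MCC uses all four cells of the confusion matrix, it stays informative when one class dominates, which is one reason the two figures can differ substantially for the same model.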
Affiliation(s)
- Nitish K. Mishra
  - Plant Biology Division, The Samuel Roberts Noble Foundation, Ardmore, Oklahoma, United States of America
- Junil Chang
  - Plant Biology Division, The Samuel Roberts Noble Foundation, Ardmore, Oklahoma, United States of America
- Patrick X. Zhao
  - Plant Biology Division, The Samuel Roberts Noble Foundation, Ardmore, Oklahoma, United States of America
12
Viereck M, Gaulton A, Digles D, Ecker GF. Transporter taxonomy - a comparison of different transport protein classification schemes. Drug Discovery Today. Technologies 2014; 12:e37-e46. [PMID: 25027374 DOI: 10.1016/j.ddtec.2014.03.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 06/03/2023]
Abstract
Currently, there are more than 800 well characterized human membrane transport proteins (including channels and transporters) and there are estimates that about 10% (approx. 2000) of all human genes are related to transport. Membrane transport proteins are of interest as potential drug targets, for drug delivery, and as a cause of side effects and drug–drug interactions. In light of the development of Open PHACTS, which provides an open pharmacological space, we analyzed selected membrane transport protein classification schemes (Transporter Classification Database, ChEMBL, IUPHAR/BPS Guide to Pharmacology, and Gene Ontology) for their ability to serve as a basis for pharmacology driven protein classification. A comparison of these membrane transport protein classification schemes by using a set of clinically relevant transporters as use-case reveals the strengths and weaknesses of the different taxonomy approaches.
Affiliation(s)
- Michael Viereck
  - University of Vienna, Department of Pharmaceutical Chemistry, Althanstrasse 14, 1090 Vienna, Austria
- Anna Gaulton
  - European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniela Digles
  - University of Vienna, Department of Pharmaceutical Chemistry, Althanstrasse 14, 1090 Vienna, Austria
- Gerhard F Ecker
  - University of Vienna, Department of Pharmaceutical Chemistry, Althanstrasse 14, 1090 Vienna, Austria
13
Probabilistic local reconstruction for k-NN regression and its application to virtual metrology in semiconductor manufacturing. Neurocomputing 2014. [DOI: 10.1016/j.neucom.2013.10.001] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Indexed: 11/21/2022]
14
Barghash A, Helms V. Transferring functional annotations of membrane transporters on the basis of sequence similarity and sequence motifs. BMC Bioinformatics 2013; 14:343. [PMID: 24283849 PMCID: PMC4219331 DOI: 10.1186/1471-2105-14-343] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 07/29/2013] [Accepted: 11/19/2013] [Indexed: 11/30/2022] Open
Abstract
Background Membrane transporters catalyze the transport of small solute molecules across biological barriers such as lipid bilayer membranes. Experimental identification of the transported substrates is very tedious. Once a particular transport mechanism has been identified in one organism, it is thus highly desirable to transfer this information to related transporter sequences in different organisms based on bioinformatics evidence. Results We present a thorough benchmark at which level of sequence identity membrane transporters from Escherichia coli, Saccharomyces cerevisiae, and Arabidopsis thaliana belong to the same families of the Transporter Classification (TC) system, and at what level these membrane transporters mediate the transport of the same substrate. We found that two membrane transporter sequences from different organisms that are aligned with normalized BLAST expectation value better than E-value 1e-8 are highly likely to belong to the same TC family (F-measure around 90%). Enriched sequence motifs identified by MEME at thresholds below 1e-12 support accurate classification into TC families for about two thirds of the sequences (F-measure 80% and higher). For the comparison of transported substrates, we focused on the four largest substrate classes of amino acids, sugars, metal ions, and phosphate. At similar identity thresholds, the nature of the transported substrates was more divergent (F-measure 40 - 75% at the same thresholds) than the TC family membership. Conclusions We suggest an acceptable threshold of 1e-8 for BLAST and HMMER where at least three quarters of the sequences are classified according to the TC system with a reasonably high accuracy. Researchers who wish to apply these thresholds in their studies should multiply these thresholds by the size of the database they search against. Our findings should be useful to those who wish to transfer transporter functional annotations across species.
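The closing advice of this abstract, multiplying the normalized threshold by the size of the searched database, can be written out as a short sketch together with the F-measure used to report the results. The function names are illustrative, and the unit of "database size" is an assumption here because the abstract does not specify it.

```python
def scaled_evalue_threshold(normalized_threshold, database_size):
    """Scale a database-size-normalized E-value cutoff to a concrete search.

    database_size is assumed to be in whatever unit the normalization used;
    the abstract does not say which.
    """
    return normalized_threshold * database_size


def passes(hit_evalue, database_size, normalized_threshold=1e-8):
    """True if a hit is better (lower) than the scaled cutoff."""
    return hit_evalue < scaled_evalue_threshold(normalized_threshold, database_size)


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical search against a database of size 1000.
print(scaled_evalue_threshold(1e-8, 1000))
print(passes(1e-7, 1000), passes(1e-4, 1000))
```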
Affiliation(s)
- Ahmad Barghash
  - Center for Bioinformatics, Saarland University, Postfach 15 11 50, 66041 Saarbrücken, Germany
15
Ou YY, Chen SA, Chang YM, Velmurugan D, Fukui K, Michael Gromiha M. Identification of efflux proteins using efficient radial basis function networks with position-specific scoring matrices and biochemical properties. Proteins 2013; 81:1634-43. [DOI: 10.1002/prot.24322] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 03/05/2013] [Revised: 04/11/2013] [Accepted: 04/19/2013] [Indexed: 11/11/2022]
Affiliation(s)
- Yu-Yen Ou
  - Department of Computer Science and Engineering; Yuan Ze University; Chung-Li Taiwan
- Shu-An Chen
  - Department of Computer Science and Engineering; Yuan Ze University; Chung-Li Taiwan
- Yun-Min Chang
  - Department of Computer Science and Engineering; Yuan Ze University; Chung-Li Taiwan
- Devadasan Velmurugan
  - Department of Crystallography and Biophysics; University of Madras; Chennai 600025 Tamilnadu India
- Kazuhiko Fukui
  - Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST); 2-43 Aomi Koto-ku Tokyo 135-0064 Japan
- M. Michael Gromiha
  - Department of Biotechnology, Indian Institute of Technology (IIT) Madras; Chennai 600036 Tamilnadu India
16
Gromiha MM, Ou YY. Bioinformatics approaches for functional annotation of membrane proteins. Brief Bioinform 2013; 15:155-68. [DOI: 10.1093/bib/bbt015] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Indexed: 11/14/2022] Open
17
Wu J, Liu B, Cheng F, Ramchiary N, Choi SR, Lim YP, Wang XW. Sequencing of chloroplast genome using whole cellular DNA and solexa sequencing technology. Frontiers in Plant Science 2012; 3:243. [PMID: 23162558 PMCID: PMC3492724 DOI: 10.3389/fpls.2012.00243] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 07/16/2012] [Accepted: 10/12/2012] [Indexed: 05/25/2023]
Abstract
Sequencing of the chloroplast (cp) genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the cp genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246, 362, and 361 Mb of sequence data were generated for the three accessions Chiifu-401-42, Z16, and FT, respectively. Micro-reads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7-99.8 or 95.5-99.7% of the B. rapa cp genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of the cp genome.
Affiliation(s)
- Jian Wu
  - Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing, China
- Bo Liu
  - Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing, China
- Feng Cheng
  - Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing, China
- Nirala Ramchiary
  - Department of Horticulture, Plant Genome Research Institute, Chungnam National University, Daejeon, South Korea
- Su Ryun Choi
  - Department of Horticulture, Plant Genome Research Institute, Chungnam National University, Daejeon, South Korea
- Yong Pyo Lim
  - Department of Horticulture, Plant Genome Research Institute, Chungnam National University, Daejeon, South Korea
- Xiao-Wu Wang
  - Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing, China
18
Schaadt NS, Helms V. Functional classification of membrane transporters and channels based on filtered TM/non-TM amino acid composition. Biopolymers 2012; 97:558-67. [PMID: 22492257 DOI: 10.1002/bip.22043] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 11/10/2022]
Abstract
Membrane transporters catalyze the transport of small solute molecules across biological barriers such as lipid bilayer membranes. As the experimental annotation of which proteins transport which substrates is incomplete it is highly desirable to develop computational methods that can assist in the classification and substrate annotation of putative membrane transport proteins. Here, we determined the similarity of membrane transporter sequences annotated in the Transport Classification Database (Saier et al., Nucleic Acids Res 2006, 34, D181-D186) and Arabidopsis thaliana membrane transporters annotated in the database Aramemnon (Schwacke et al., Plant Physiol 2003, 131, 16-26). The similarity measure was based on the amino acid composition either considering the full sequences or separately in the transmembrane (TM) and external parts of the sequences. We considered four different substrate sets and three different subfamilies and tried to classify the given proteins into these classes. Family or substrate prediction based on the simple amino acid frequency had an average accuracy of 76%. The differentiation between TM and non-TM regions led to an improved accuracy of 80% on average.
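Computing amino acid composition separately for transmembrane and non-transmembrane regions, as this abstract describes, can be sketched as follows. The segment boundaries are assumed to come from an external topology annotation, the function names are illustrative, and the filtering step mentioned in the paper's title is not reproduced.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Relative frequency of each of the 20 standard amino acids."""
    total = sum(seq.count(a) for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [seq.count(a) / total for a in AMINO_ACIDS]


def split_composition(seq, tm_segments):
    """40-dimensional vector: TM composition followed by non-TM composition.

    tm_segments is a list of 0-based, half-open (start, end) pairs taken
    from some topology annotation (an assumption of this sketch).
    """
    in_tm = [False] * len(seq)
    for start, end in tm_segments:
        for i in range(max(start, 0), min(end, len(seq))):
            in_tm[i] = True
    tm = "".join(c for c, flag in zip(seq, in_tm) if flag)
    non_tm = "".join(c for c, flag in zip(seq, in_tm) if not flag)
    return composition(tm) + composition(non_tm)


# Toy sequence: a hydrophobic stretch followed by a charged stretch.
vector = split_composition("LLLLKKKK", [(0, 4)])
print(len(vector))
```

Keeping the two regions apart preserves the contrast between hydrophobic membrane-spanning stretches and the remaining residues, which a single whole-sequence composition averages away.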
Affiliation(s)
- N S Schaadt
  - Department of Natural Sciences and Technology III, Center for Bioinformatics, Saarland University, Im Stadtwald, 66123 Saarbrücken, Germany
19
Novel family of carbohydrate-binding modules revealed by the genome sequence of Spirochaeta thermophila DSM 6192. Appl Environ Microbiol 2011; 77:5483-9. [PMID: 21685171 DOI: 10.1128/aem.00523-11] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/20/2022] Open
Abstract
Spirochaeta thermophila is a thermophilic, free-living, and cellulolytic anaerobe. The genome sequence data for this organism have revealed a high density of genes encoding enzymes from more than 30 glycoside hydrolase (GH) families and a noncellulosomal enzyme system for (hemi)cellulose degradation. Functional screening of a fosmid library whose inserts were mapped on the S. thermophila genome sequence allowed the functional annotation of numerous GH open reading frames (ORFs). Seven different GH ORFs from the S. thermophila DSM 6192 genome, all putative β-glycanase ORFs according to sequence similarity analysis, contained a highly conserved novel GH-associated module of unknown function at their C terminus. Four of these GH enzymes were experimentally verified as xylanase, β-glucanase, β-glucanase/carboxymethylcellulase (CMCase), and CMCase. Binding experiments performed with the recombinantly expressed and purified GH-associated module showed that it represents a new carbohydrate-binding module (CBM) that binds to microcrystalline cellulose and is highly specific for this substrate. In the course of this work, the new CBM type was only detected in Spirochaeta, but recently we found sequences with detectable similarity to the module in the draft genomes of Cytophaga fermentans and Mahella australiensis, both of which are phylogenetically very distant from S. thermophila and noncellulolytic, yet inhabit similar environments. This suggests a possibly broad distribution of the module in nature.
20
Chen SA, Ou YY, Lee TY, Gromiha MM. Prediction of transporter targets using efficient RBF networks with PSSM profiles and biochemical properties. Bioinformatics 2011; 27:2062-7. [PMID: 21653515 DOI: 10.1093/bioinformatics/btr340] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Indexed: 11/13/2022]
Abstract
SUMMARY Transporters are proteins that are involved in the movement of ions or molecules across biological membranes. Currently, our knowledge about the functions of transporters is limited due to the paucity of their 3D structures. Hence, computational techniques are necessary to annotate the functions of transporters. In this work, we focused on an important functional aspect of transporters, namely annotation of targets for transport proteins. We have systematically analyzed four major classes of transporters with different transporter targets: (i) electron, (ii) protein/mRNA, (iii) ion and (iv) others, using amino acid properties. We have developed a radial basis function network-based method for predicting transport targets with amino acid properties and position specific scoring matrix profiles. Our method showed a 10-fold cross-validation accuracy of 90.1, 80.1, 70.3 and 82.3% for electron transporters, protein/mRNA transporters, ion transporters and others, respectively, in a dataset of 543 transporters. We have also evaluated the performance of the method with an independent dataset of 108 proteins and we obtained similar accuracy. We suggest that our method could be an effective tool for functional annotation of transport proteins. AVAILABILITY http://rbf.bioinfo.tw/~sachen/ttrbf.html
Affiliation(s)
- Shu-An Chen
- Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, Taiwan
21
YTPdb: A wiki database of yeast membrane transporters. BIOCHIMICA ET BIOPHYSICA ACTA-BIOMEMBRANES 2010; 1798:1908-12. [DOI: 10.1016/j.bbamem.2010.06.008] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Received: 02/19/2010] [Revised: 05/17/2010] [Accepted: 06/07/2010] [Indexed: 02/04/2023]
22
Benedito VA, Li H, Dai X, Wandrey M, He J, Kaundal R, Torres-Jerez I, Gomez SK, Harrison MJ, Tang Y, Zhao PX, Udvardi MK. Genomic inventory and transcriptional analysis of Medicago truncatula transporters. PLANT PHYSIOLOGY 2010; 152:1716-30. [PMID: 20023147 PMCID: PMC2832251 DOI: 10.1104/pp.109.148684] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Received: 10/01/2009] [Accepted: 12/15/2009] [Indexed: 05/20/2023]
Abstract
Transporters move hydrophilic substrates across hydrophobic biological membranes and play key roles in plant nutrition, metabolism, and signaling and, consequently, in plant growth, development, and responses to the environment. To initiate and support systematic characterization of transporters in the model legume Medicago truncatula, we identified 3,830 transporters and classified 2,673 of these into 113 families and 146 subfamilies. Analysis of gene expression data for 2,611 of these transporters identified 129 that are expressed in an organ-specific manner, including 50 that are nodule specific and 36 specific to mycorrhizal roots. Further analysis uncovered 196 transporters that are induced at least 5-fold during nodule development and 44 in roots during arbuscular mycorrhizal symbiosis. Among the nodule- and mycorrhiza-induced transporter genes are many candidates for known transport activities in these beneficial symbioses. The data presented here are a unique resource for the selection and functional characterization of legume transporters.
23
Ou YY, Chen SA, Gromiha MM. Classification of transporters using efficient radial basis function networks with position-specific scoring matrices and biochemical properties. Proteins 2010; 78:1789-97. [DOI: 10.1002/prot.22694] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.8] [Indexed: 11/12/2022]
24
Li H, Benedito VA, Udvardi MK, Zhao PX. TransportTP: a two-phase classification approach for membrane transporter prediction and characterization. BMC Bioinformatics 2009; 10:418. [PMID: 20003433 PMCID: PMC3087344 DOI: 10.1186/1471-2105-10-418] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.3] [Received: 07/08/2009] [Accepted: 12/14/2009] [Indexed: 11/10/2022]
Abstract
BACKGROUND Membrane transporters play crucial roles in living cells. Experimental characterization of transporters is costly and time-consuming. Current computational methods for transporter characterization still require extensive curation efforts, especially for eukaryotic organisms. We developed a novel genome-scale transporter prediction and characterization system called TransportTP that combined homology-based and machine learning methods in a two-phase classification approach. First, traditional homology methods were employed to predict novel transporters based on sequence similarity to known classified proteins in the Transporter Classification Database (TCDB). Second, machine learning methods were used to integrate a variety of features to refine the initial predictions. A set of rules based on transporter features was developed by machine learning using well-curated proteomes as guides. RESULTS In a cross-validation using the yeast proteome for training and the proteomes of ten other organisms for testing, TransportTP achieved an equivalent recall and precision of 81.8%, based on TransportDB, a manually annotated transporter database. In an independent test using the Arabidopsis proteome for training and four recently sequenced plant proteomes for testing, it achieved a recall of 74.6% and a precision of 73.4%, according to our manual curation. CONCLUSIONS TransportTP is the most effective tool for eukaryotic transporter characterization to date.
Affiliation(s)
- Haiquan Li
- Plant Biology Division, The Samuel Roberts Noble Foundation, Inc, Ardmore, OK 73401, USA.
25
Using auto covariance method for functional discrimination of membrane proteins based on evolution information. Amino Acids 2009; 38:1497-503. [DOI: 10.1007/s00726-009-0362-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Received: 04/24/2009] [Accepted: 09/24/2009] [Indexed: 11/29/2022]
26
Gromiha MM, Yabuki Y, Suresh MX, Thangakani AM, Suwa M, Fukui K. TMFunction: database for functional residues in membrane proteins. Nucleic Acids Res 2008; 37:D201-4. [PMID: 18842639 PMCID: PMC2686444 DOI: 10.1093/nar/gkn672] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Indexed: 11/17/2022]
Abstract
We have developed the database TMFunction, which is a collection of more than 2900 experimentally observed functional residues in membrane proteins. Each entry includes the numerical values for the parameters IC50 (measure of the effectiveness of a compound in inhibiting biological function), Vmax (maximal velocity of transport), relative activity of mutants with respect to wild-type protein, binding affinity, dissociation constant, etc., which are important for understanding the sequence–structure–function relationship of membrane proteins. In addition, we have provided information about name and source of the protein, Uniprot and Protein Data Bank codes, mutational and literature information. Furthermore, TMFunction is linked to related databases and other resources. We have set up a web interface with different search and display options so that users have the ability to get the data in several ways. TMFunction is freely available at http://tmbeta-genome.cbrc.jp/TMFunction/.
Affiliation(s)
- M Michael Gromiha
- Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, AIST Tokyo Waterfront Bio-IT Research Building, 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan.