1
Konovalova A. Components Subcellular Localization: Cell Surface Exposure. Methods Mol Biol 2024; 2715:99-110. [PMID: 37930524 DOI: 10.1007/978-1-0716-3445-5_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2023]
Abstract
Surface-exposed proteins of Gram-negative bacteria are represented by integral outer membrane β-barrel proteins and lipoproteins. There are no computational methods to predict surface-exposed lipoproteins, and therefore lipoprotein topology must be experimentally tested. This chapter describes several distinct but complementary methods for detection of surface-exposed proteins: cell surface protein labeling, accessibility to extracellular protease or antibodies, and the SpyTag/SpyCatcher system.
Affiliation(s)
- Anna Konovalova
- Department of Microbiology and Molecular Genetics, McGovern Medical School, The University of Texas Health Science Center at Houston (UTHealth), Houston, TX, USA.
2
Zuo R, Xie M, Gao F, Liu J, Tang M, Cheng X, Liu Y, Bai Z, Liu S. Genome-wide identification and functional exploration of the legume lectin genes in Brassica napus and their roles in Sclerotinia disease resistance. Front Plant Sci 2022; 13:963263. [PMID: 35968144 PMCID: PMC9374194 DOI: 10.3389/fpls.2022.963263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Accepted: 06/28/2022] [Indexed: 06/15/2023]
Abstract
As one of the largest classes of lectins, legume lectins have a variety of desirable features such as antibacterial and insecticidal activities as well as anti-abiotic stress ability. The Sclerotinia disease (SD) caused by the soil-borne fungus Sclerotinia sclerotiorum is a devastating disease affecting most oil crops such as Brassica napus. Here, we identified 130 legume lectin (LegLu) genes in B. napus, which could be phylogenetically classified into seven clusters. The BnLegLu gene family has been significantly expanded since the whole-genome duplication (WGD) or segmental duplication. Gene structure and conserved motif analysis suggested that the BnLegLu genes were well conserved in each cluster. Moreover, relative to those genes only containing the legume lectin domain in cluster VI-VII, the genes in cluster I-V harbored a transmembrane domain and a kinase domain linked to the legume lectin domain in the C terminus. The expression of most BnLegLu genes was relatively low in various tissues. Thirty-five BnLegLu genes were responsive to abiotic stress, and 40 BnLegLu genes were strongly induced by S. sclerotiorum, with a most significant up-regulation of 715-fold, indicating their functional roles in SD resistance. Four BnLegLu genes were located in the candidate regions of genome-wide association analysis (GWAS) results which resulted from a worldwide rapeseed population consisting of 324 accessions associated with SD. Among them, the positive role of BnLegLus-16 in SD resistance was validated by transient expression in tobacco leaves. This study provides important information on BnLegLu genes, particularly about their roles in SD resistance, which may help targeted functional research and genetic improvement in the breeding of B. napus.
Affiliation(s)
- Rong Zuo
- The Key Laboratory of Biology and Genetic Improvement of Oil Crops, The Ministry of Agriculture and Rural Affairs of PRC, Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan, China
- Meili Xie
- The Key Laboratory of Biology and Genetic Improvement of Oil Crops, The Ministry of Agriculture and Rural Affairs of PRC, Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan, China
- Feng Gao
- The Key Laboratory of Biology and Genetic Improvement of Oil Crops, The Ministry of Agriculture and Rural Affairs of PRC, Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan, China
- Jie Liu
- The Key Laboratory of Biology and Genetic Improvement of Oil Crops, The Ministry of Agriculture and Rural Affairs of PRC, Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan, China
- Xiaohui Cheng
- The Key Laboratory of Biology and Genetic Improvement of Oil Crops, The Ministry of Agriculture and Rural Affairs of PRC, Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan, China
- Yueying Liu
- The Key Laboratory of Biology and Genetic Improvement of Oil Crops, The Ministry of Agriculture and Rural Affairs of PRC, Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan, China
- Zetao Bai
- The Key Laboratory of Biology and Genetic Improvement of Oil Crops, The Ministry of Agriculture and Rural Affairs of PRC, Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan, China
- Shengyi Liu
- The Key Laboratory of Biology and Genetic Improvement of Oil Crops, The Ministry of Agriculture and Rural Affairs of PRC, Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan, China
3
Evaluating the Performance of PPE44, HSPX, ESAT-6 and CFP-10 Factors in Tuberculosis Subunit Vaccines. Curr Microbiol 2022; 79:260. [PMID: 35852636 PMCID: PMC9295111 DOI: 10.1007/s00284-022-02949-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/22/2021] [Accepted: 06/23/2022] [Indexed: 11/26/2022]
Abstract
Mycobacterium tuberculosis (M. tuberculosis) is an intracellular pathogen causing long-term infection in humans that mainly attacks macrophages and can escape from the immune system through various mechanisms. The only FDA-approved vaccine against M. tuberculosis (MTB) is Mycobacterium bovis bacillus Calmette-Guérin (BCG). The protection of this vaccine typically lasts 10–15 years. Due to the increasing number of people becoming ill with MTB each year worldwide, the need to develop a new effective treatment against the disease has increased. During the past two decades, the research budget for TB vaccines has quadrupled to over half a billion dollars. Most of these research projects were based on amplifying and stimulating the response of T-cells and developing subunit vaccines. Additionally, these studies have demonstrated that secretory and immunogenic proteins of MTB play a key role in the pathogenesis of the bacteria. Therefore, these proteins were used to develop new subunit vaccines. In this review, based on the use of these proteins in successful new subunit vaccines, the PPE44, HSPX, CFP-10 and ESAT-6 antigens were selected, and the role of these antigens in designing and developing new subunit vaccines against TB and for the prevention of TB was investigated.
4
Alballa M, Butler G. Integrative approach for detecting membrane proteins. BMC Bioinformatics 2020; 21:575. [PMID: 33349234 PMCID: PMC7751106 DOI: 10.1186/s12859-020-03891-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/16/2020] [Accepted: 11/18/2020] [Indexed: 11/16/2022]
Abstract
Background: Membrane proteins are key gates that control various vital cellular functions. Membrane proteins are often detected using transmembrane topology prediction tools. While transmembrane topology prediction tools can detect integral membrane proteins, they do not address surface-bound proteins. In this study, we focused on finding the best techniques for distinguishing all types of membrane proteins. Results: This research first demonstrates the shortcomings of merely using transmembrane topology prediction tools to detect all types of membrane proteins. Then, the performance of various feature extraction techniques in combination with different machine learning algorithms was explored. The experimental results obtained by cross-validation and independent testing suggest that applying an integrative approach that combines the results of transmembrane topology prediction and position-specific scoring matrix (Pse-PSSM) optimized evidence-theoretic k nearest neighbor (OET-KNN) predictors yields the best performance. Conclusion: The integrative approach outperforms the state-of-the-art methods in terms of accuracy and MCC, with an accuracy of 92.51% in independent testing, compared to the 89.53% and 79.42% accuracies achieved by the state-of-the-art methods.
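The abstract above compares methods by accuracy and MCC (Matthews correlation coefficient). As a minimal sketch of how these two metrics are computed from a binary confusion matrix, the Python below uses the standard definitions; the function names and the example counts are hypothetical and are not taken from the paper.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical counts for a membrane / non-membrane classifier (illustration only).
tp, tn, fp, fn = 90, 95, 5, 10
print(f"accuracy = {accuracy(tp, tn, fp, fn):.4f}")
print(f"MCC      = {mcc(tp, tn, fp, fn):.4f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the membrane and non-membrane classes differ greatly in size.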
Affiliation(s)
- Munira Alballa
- Department of Computer Science and Software Engineering, Concordia University, Montreal, QC, Canada; College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia.
- Gregory Butler
- Department of Computer Science and Software Engineering, Concordia University, Montreal, QC, Canada; Centre for Structural and Functional Genomics, Concordia University, Montreal, QC, 24105, Canada
5
Abstract
Surface-exposed proteins of Gram-negative bacteria are represented by integral outer membrane beta-barrel proteins and lipoproteins. No computational methods exist for predicting surface-exposed lipoproteins, and therefore lipoprotein topology must be experimentally tested. This chapter describes three distinct but complementary methods for the detection of surface-exposed proteins: cell surface protein labeling, accessibility to extracellular protease and antibodies.
6
Venko K, Roy Choudhury A, Novič M. Computational Approaches for Revealing the Structure of Membrane Transporters: Case Study on Bilitranslocase. Comput Struct Biotechnol J 2017; 15:232-242. [PMID: 28228927 PMCID: PMC5312651 DOI: 10.1016/j.csbj.2017.01.008] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 11/02/2016] [Revised: 01/19/2017] [Accepted: 01/20/2017] [Indexed: 11/23/2022]
Abstract
The structural and functional details of transmembrane proteins are vastly underexplored, mostly due to experimental difficulties regarding their solubility and stability. Currently, the majority of transmembrane protein structures are still unknown, and this presents a huge experimental and computational challenge. Nowadays, thanks to X-ray crystallography or NMR spectroscopy, over 3000 structures of membrane proteins have been solved, among them only a few hundred unique ones. Due to the vast biological and pharmaceutical interest in the elucidation of the structure and the functional mechanisms of transmembrane proteins, several computational methods have been developed to overcome the experimental gap. If combined with experimental data, the computational information enables rapid, low-cost and successful predictions of the molecular structure of unsolved proteins. The reliability of the predictions depends on the availability and accuracy of experimental data associated with structural information. In this review, the following methods are proposed for in silico structure elucidation: sequence-dependent predictions of transmembrane regions, predictions of transmembrane helix–helix interactions, helix arrangements in membrane models, and testing their stability with molecular dynamics simulations. We also demonstrate the usage of the computational methods listed above by proposing a model for the molecular structure of the transmembrane protein bilitranslocase. Bilitranslocase is a bilirubin membrane transporter, which shares similar tissue distribution and functional properties with some of the members of the Organic Anion Transporter family and is the only member classified in the Bilirubin Transporter Family. Regarding its unique properties, bilitranslocase is a potentially interesting drug target.
Affiliation(s)
- Katja Venko
- Department of Cheminformatics, National Institute of Chemistry, Ljubljana, Slovenia
- A Roy Choudhury
- Department of Cheminformatics, National Institute of Chemistry, Ljubljana, Slovenia
- Marjana Novič
- Department of Cheminformatics, National Institute of Chemistry, Ljubljana, Slovenia
7
Yin X, Xu YY, Shen HB. Enhancing the Prediction of Transmembrane β-Barrel Segments with Chain Learning and Feature Sparse Representation. IEEE/ACM Trans Comput Biol Bioinform 2016; 13:1016-1026. [PMID: 26887010 DOI: 10.1109/tcbb.2016.2528000] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 05/14/2023]
Abstract
Transmembrane β-barrels (TMBs) are one important class of membrane proteins that play crucial functions in the cell. Membrane proteins are difficult wet-lab targets of structural biology, which call for accurate computational prediction approaches. Here, we developed a novel method named MemBrain-TMB to predict the spanning segments of transmembrane β-barrel from amino acid sequence. MemBrain-TMB is a statistical machine learning-based model, which is constructed using a new chain learning algorithm with input features encoded by the image sparse representation approach. We considered the relative status information between neighboring residues for enhancing the performance, and the matrix of features was translated into feature image by sparse coding algorithm for noise and dimension reduction. To deal with the diverse loop length problem, we applied a dynamic threshold method, which is particularly useful for enhancing the recognition of short loops and tight turns. Our experiments demonstrate that the new protocol designed in MemBrain-TMB effectively helps improve prediction performance.
8
Zhang L, Wang H, Yan L, Su L, Xu D. OMPcontact: An Outer Membrane Protein Inter-Barrel Residue Contact Prediction Method. J Comput Biol 2016; 24:217-228. [PMID: 27513917 DOI: 10.1089/cmb.2015.0236] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 11/12/2022]
Abstract
As one of the two transmembrane protein types, outer membrane proteins (OMPs) perform diverse important biochemical functions, including substrate transport and passive nutrient uptake and intake. Hence their 3D structures are expected to reveal these functions. Because experimental structures are scarce, predicted 3D structures are better suited to OMP research, and the inter-barrel residue contact is becoming one of the most remarkable features, improving prediction accuracy by describing the structural information of OMPs. To predict OMP structures accurately, we explored an OMP inter-barrel residue contact prediction method: OMPcontact. Multiple OMP-specific features were integrated in the method, including residue evolutionary covariation, topology-based transmembrane segment relative residue position, OMP lipid layer accessibility, and residue evolution conservation. These features describe the properties of a residue pair in different respects: sequential, structural, evolutionary, and biochemical. Within a 3-residue sliding window, a Support Vector Machine (SVM) could accurately determine the inter-barrel contact residue pairs using the above features. A 5-fold cross-validation process was applied in testing the OMPcontact performance against a non-redundant OMP set of 75 samples. The tests compared four evolutionary covariation methods and screened for the ones suited to inter-barrel contact prediction. The results showed that our method not only made the prediction efficiently, but also scored the likelihood of contact for residue pairs reliably. This is expected to improve OMP tertiary structure prediction. Therefore, OMPcontact will be helpful in compiling a structural census of outer membrane proteins.
Affiliation(s)
- Li Zhang
- School of Computer Science and Technology, Jilin University, Changchun, China; School of Computer Science and Engineering, Changchun University of Technology, Changchun, China
- Han Wang
- School of Computer Science and Information Technology, Northeast Normal University, Changchun, China
- Lun Yan
- School of Computer Science and Technology, Jilin University, Changchun, China
- Lingtao Su
- School of Computer Science and Technology, Jilin University, Changchun, China
- Dong Xu
- Department of Computer Science, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri, U.S.A
9
All-atom 3D structure prediction of transmembrane β-barrel proteins from sequences. Proc Natl Acad Sci U S A 2015; 112:5413-8. [PMID: 25858953 DOI: 10.1073/pnas.1419956112] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.8] [Indexed: 12/14/2022]
Abstract
Transmembrane β-barrels (TMBs) carry out major functions in substrate transport and protein biogenesis but experimental determination of their 3D structure is challenging. Encouraged by successful de novo 3D structure prediction of globular and α-helical membrane proteins from sequence alignments alone, we developed an approach to predict the 3D structure of TMBs. The approach combines the maximum-entropy evolutionary coupling method for predicting residue contacts (EVfold) with a machine-learning approach (boctopus2) for predicting β-strands in the barrel. In a blinded test for 19 TMB proteins of known structure that have a sufficient number of diverse homologous sequences available, this combined method (EVfold_bb) predicts hydrogen-bonded residue pairs between adjacent β-strands at an accuracy of ∼70%. This accuracy is sufficient for the generation of all-atom 3D models. In the transmembrane barrel region, the average 3D structure accuracy [template-modeling (TM) score] of top-ranked models is 0.54 (ranging from 0.36 to 0.85), with a higher (44%) number of residue pairs in correct strand-strand registration than in earlier methods (18%). Although the nonbarrel regions are predicted less accurately overall, the evolutionary couplings identify some highly constrained loop residues and, for FecA protein, the barrel including the structure of a plug domain can be accurately modeled (TM score = 0.68). Lower prediction accuracy tends to be associated with insufficient sequence information and we therefore expect increasing numbers of β-barrel families to become accessible to accurate 3D structure prediction as the number of available sequences increases.
10
Leman JK, Ulmschneider MB, Gray JJ. Computational modeling of membrane proteins. Proteins 2015; 83:1-24. [PMID: 25355688 PMCID: PMC4270820 DOI: 10.1002/prot.24703] [Citation(s) in RCA: 81] [Impact Index Per Article: 9.0] [Received: 06/17/2014] [Revised: 10/01/2014] [Accepted: 10/18/2014] [Indexed: 02/06/2023]
Abstract
The determination of membrane protein (MP) structures has always trailed that of soluble proteins due to difficulties in their overexpression, reconstitution into membrane mimetics, and subsequent structure determination. The percentage of MP structures in the protein databank (PDB) has been at a constant 1-2% for the last decade. In contrast, over half of all drugs target MPs, only highlighting how little we understand about drug-specific effects in the human body. To reduce this gap, researchers have attempted to predict structural features of MPs even before the first structure was experimentally elucidated. In this review, we present current computational methods to predict MP structure, starting with secondary structure prediction, prediction of trans-membrane spans, and topology. Even though these methods generate reliable predictions, challenges such as predicting kinks or precise beginnings and ends of secondary structure elements are still waiting to be addressed. We describe recent developments in the prediction of 3D structures of both α-helical MPs as well as β-barrels using comparative modeling techniques, de novo methods, and molecular dynamics (MD) simulations. The increase of MP structures has (1) facilitated comparative modeling due to availability of more and better templates, and (2) improved the statistics for knowledge-based scoring functions. Moreover, de novo methods have benefited from the use of correlated mutations as restraints. Finally, we outline current advances that will likely shape the field in the forthcoming decade.
Affiliation(s)
- Julia Koehler Leman
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Martin B. Ulmschneider
- Department of Materials Science and Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Jeffrey J. Gray
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
11
Leman JK, Mueller R, Karakas M, Woetzel N, Meiler J. Simultaneous prediction of protein secondary structure and transmembrane spans. Proteins 2013; 81:1127-40. [PMID: 23349002 DOI: 10.1002/prot.24258] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Received: 05/29/2013] [Revised: 01/03/2013] [Accepted: 01/09/2013] [Indexed: 11/06/2022]
Abstract
Prediction of transmembrane spans and secondary structure from the protein sequence is generally the first step in the structural characterization of (membrane) proteins. Preference of a stretch of amino acids in a protein to form secondary structure and being placed in the membrane are correlated. Nevertheless, current methods predict either secondary structure or individual transmembrane states. We introduce a method that simultaneously predicts the secondary structure and transmembrane spans from the protein sequence. This approach not only eliminates the necessity to create a consensus prediction from possibly contradicting outputs of several predictors but bears the potential to predict conformational switches, i.e., sequence regions that have a high probability to change for example from a coil conformation in solution to an α-helical transmembrane state. An artificial neural network was trained on databases of 177 membrane proteins and 6048 soluble proteins. The output is a 3 × 3 dimensional probability matrix for each residue in the sequence that combines three secondary structure types (helix, strand, coil) and three environment types (membrane core, interface, solution). The prediction accuracies are 70.3% for nine possible states, 73.2% for three-state secondary structure prediction, and 94.8% for three-state transmembrane span prediction. These accuracies are comparable to state-of-the-art predictors of secondary structure (e.g., Psipred) or transmembrane placement (e.g., OCTOPUS). The method is available as web server and for download at www.meilerlab.org.
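The 3 × 3 per-residue probability matrix described above can be read as a joint distribution over secondary structure and environment. The Python sketch below shows one way such a matrix could be reduced to three-state calls by summing rows and columns. This marginalization is an assumption made for illustration, since the abstract does not state how the three-state predictions are derived, and the example values are made up.

```python
SS_TYPES = ("helix", "strand", "coil")
ENV_TYPES = ("membrane core", "interface", "solution")

def marginals(joint):
    """joint[i][j] is the probability of SS_TYPES[i] together with ENV_TYPES[j]."""
    ss = [sum(row) for row in joint]                               # sum over environments
    env = [sum(joint[i][j] for i in range(3)) for j in range(3)]   # sum over structure types
    return ss, env

def call_states(joint):
    """Pick the most probable secondary structure and environment for one residue."""
    ss, env = marginals(joint)
    best_ss = max(range(3), key=lambda k: ss[k])
    best_env = max(range(3), key=lambda k: env[k])
    return SS_TYPES[best_ss], ENV_TYPES[best_env]

# Hypothetical matrix for a single residue (rows: helix, strand, coil).
example = [
    [0.50, 0.10, 0.05],
    [0.02, 0.03, 0.05],
    [0.05, 0.10, 0.10],
]
print(call_states(example))
```

Keeping the full matrix rather than only the two calls preserves cases such as a residue that is coil in solution but helical in the membrane, which is the kind of conformational switch the abstract mentions.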
Affiliation(s)
- Julia Koehler Leman
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee; Center for Structural Biology, Vanderbilt University, Nashville, Tennessee, USA
12
Hayat S, Elofsson A. Ranking models of transmembrane β-barrel proteins using Z-coordinate predictions. Bioinformatics 2013; 28:i90-6. [PMID: 22689784 PMCID: PMC3371865 DOI: 10.1093/bioinformatics/bts233] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 11/23/2022]
Abstract
Motivation: Transmembrane β-barrels exist in the outer membrane of gram-negative bacteria as well as in chloroplasts and mitochondria. They are often involved in transport processes and are promising antimicrobial drug targets. Structures of only a few β-barrel protein families are known. Therefore, a method that could automatically generate such models would be valuable. The symmetrical arrangement of the barrels suggests that an approach based on idealized geometries may be successful. Results: Here, we present tobmodel, a method for generating 3D models of β-barrel transmembrane proteins. First, alternative topologies are obtained from the BOCTOPUS topology predictor. Thereafter, several 3D models are constructed by using different angles of the β-sheets. Finally, the best model is selected based on agreement with a novel predictor, ZPRED3, which predicts the distance from the center of the membrane for each residue, i.e. the Z-coordinate. The Z-coordinate prediction has an average error of 1.61 Å. Tobmodel predicts the correct topology for 75% of the proteins in the dataset, which is a slight improvement over BOCTOPUS alone. More importantly, however, tobmodel provides a Cα template with an average RMSD of 7.24 Å from the native structure. Availability: Tobmodel is freely available as a web server at: http://tobmodel.cbr.su.se/. The datasets used for training and evaluations are also available from this site. Contact: arne@bioinfo.se
Affiliation(s)
- Sikander Hayat
- Center for Biomembrane Research, Department of Biochemistry and Biophysics, Stockholm Bioinformatics Center, Science for Life Laboratory, Swedish E-science Research Center, Stockholm University, SE-10691 Stockholm, Sweden
13
Motomura K, Fujita T, Tsutsumi M, Kikuzato S, Nakamura M, Otaki JM. Word decoding of protein amino acid sequences with availability analysis: a linguistic approach. PLoS One 2012; 7:e50039. [PMID: 23185527 PMCID: PMC3503725 DOI: 10.1371/journal.pone.0050039] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 07/02/2012] [Accepted: 10/15/2012] [Indexed: 11/19/2022]
Abstract
The amino acid sequences of proteins determine their three-dimensional structures and functions. However, how sequence information is related to structures and functions is still enigmatic. In this study, we show that at least a part of the sequence information can be extracted by treating amino acid sequences of proteins as a collection of English words, based on a working hypothesis that amino acid sequences of proteins are composed of short constituent amino acid sequences (SCSs) or "words". We first confirmed that the English language highly likely follows Zipf's law, a special case of power law. We found that the rank-frequency plot of SCSs in proteins exhibits a similar distribution when low-rank tails are excluded. In comparison with natural English and "compressed" English without spaces between words, amino acid sequences of proteins show larger linear ranges and smaller exponents with heavier low-rank tails, demonstrating that the SCS distribution in proteins is largely scale-free. A distribution pattern of SCSs in proteins is similar among species, but species-specific features are also present. Based on the availability scores of SCSs, we found that sequence motifs are enriched in high-availability sites (i.e., "key words") and vice versa. In fact, the highest availability peak within a given protein sequence often directly corresponds to a sequence motif. The amino acid composition of high-availability sites within motifs is different from that of entire motifs and all protein sequences, suggesting the possible functional importance of specific SCSs and their compositional amino acids within motifs. We anticipate that our availability-based word decoding approach is complementary to sequence alignment approaches in predicting functionally important sites of unknown proteins from their amino acid sequences.
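The rank-frequency analysis mentioned above can be illustrated with a short Python sketch. It assumes that "words" are fixed-length overlapping k-mers and estimates a Zipf-like exponent as the least-squares slope in log-log space. Both choices are simplifications for illustration and are not the authors' SCS or availability-score procedure; all function names are hypothetical.

```python
from collections import Counter
import math

def kmer_counts(sequences, k):
    """Count overlapping k-mers across a collection of sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts

def rank_frequency(counts):
    """Return (rank, frequency) pairs, with rank 1 for the most frequent k-mer."""
    freqs = sorted(counts.values(), reverse=True)
    return list(enumerate(freqs, start=1))

def loglog_slope(pairs):
    """Least-squares slope of log(frequency) against log(rank); needs two or more distinct ranks."""
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(freq) for _, freq in pairs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Toy input; real analyses would use a full proteome.
pairs = rank_frequency(kmer_counts(["MKKLLAAKKLLMKK", "AKKLLAAMKK"], 2))
print(pairs[:3], loglog_slope(pairs))
```

A slope close to -1 over a wide range of ranks corresponds to the classic Zipf distribution; the abstract reports smaller exponents for protein sequences than for English text.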
Affiliation(s)
- Kenta Motomura
- The BCPH Unit of Molecular Physiology, Department of Chemistry, Biology and Marine Science, University of the Ryukyus, Nishihara, Okinawa, Japan
- Department of Information Science, University of the Ryukyus, Nishihara, Okinawa, Japan
- Tomohiro Fujita
- The BCPH Unit of Molecular Physiology, Department of Chemistry, Biology and Marine Science, University of the Ryukyus, Nishihara, Okinawa, Japan
- Motosuke Tsutsumi
- The BCPH Unit of Molecular Physiology, Department of Chemistry, Biology and Marine Science, University of the Ryukyus, Nishihara, Okinawa, Japan
- Satsuki Kikuzato
- The BCPH Unit of Molecular Physiology, Department of Chemistry, Biology and Marine Science, University of the Ryukyus, Nishihara, Okinawa, Japan
- Morikazu Nakamura
- Department of Information Science, University of the Ryukyus, Nishihara, Okinawa, Japan
- Joji M. Otaki
- The BCPH Unit of Molecular Physiology, Department of Chemistry, Biology and Marine Science, University of the Ryukyus, Nishihara, Okinawa, Japan
14
Procópio L, Macrae A, van Elsas JD, Seldin L. The putative α/β-hydrolases of Dietzia cinnamea P4 strain as potential enzymes for biocatalytic applications. Antonie van Leeuwenhoek 2012; 103:635-46. [PMID: 23142860 DOI: 10.1007/s10482-012-9847-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 09/12/2012] [Accepted: 10/31/2012] [Indexed: 10/27/2022]
Abstract
The draft genome of the soil actinomycete Dietzia cinnamea P4 reveals a versatile group of α/β-hydrolase fold enzymes. Phylogenetic and comparative sequence analyses were used to classify the α/β-hydrolases of strain P4 into six different groups: (i) lipases, (ii) esterases, (iii) epoxide hydrolases, (iv) haloacid dehalogenases, (v) C-C breaking enzymes and (vi) serine peptidases. The high number of lipases/esterases (41) and epoxide hydrolase enzymes (14) present in the relatively small (3.6 Mb) P4 genome is unusual; it is likely to be linked to the survival of strain P4 in its natural environment. Strain P4 is thus equipped with a large number of genes which would appear to confer survivability in harsh hot tropical soil. As such, this highly resilient soil bacterial strain provides an interesting genome for enzyme mining for applications in the field of biotransformations of polymeric compounds.
Affiliation(s)
- Luciano Procópio
- Laboratório de Genética Microbiana, Departamento de Microbiologia Geral, Instituto de Microbiologia Prof. Paulo de Góes, Centro de Ciências da Saúde, Universidade Federal do Rio de Janeiro, Bloco I, Ilha do Fundão, Rio de Janeiro, RJ, CEP 21941.590, Brazil
15
Nugent T, Jones DT. Membrane protein structural bioinformatics. J Struct Biol 2012; 179:327-37. [DOI: 10.1016/j.jsb.2011.10.008] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 06/13/2011] [Accepted: 10/25/2011] [Indexed: 10/15/2022]
16
Abstract
Motivation: We previously reported the development of a highly accurate statistical algorithm for identifying β-barrel outer membrane proteins, or transmembrane β-barrels (TMBBs), from genomic sequence data of Gram-negative bacteria (Freeman, T.C. and Wimley, W.C. (2010) Bioinformatics, 26, 1965-1974). We have now applied this identification algorithm to all available Gram-negative bacterial genomes (over 600 chromosomes) and have constructed a publicly available, searchable, up-to-date database of all proteins in these genomes. Results: For each protein in the database, there is information on (i) β-barrel membrane protein probability for identification of β-barrels, (ii) β-strand and β-hairpin propensity for structure and topology prediction, (iii) signal sequence score, because most TMBBs are secreted through the inner membrane translocon and thus have a signal sequence, and (iv) transmembrane α-helix predictions, for reducing false positive predictions. This information is sufficient for the accurate identification of most β-barrel membrane proteins in these genomes. In the database there are nearly 50 000 predicted TMBBs (out of 1.9 million total putative proteins). Of those, more than 15 000 are 'hypothetical' or 'putative' proteins, not previously identified as TMBBs. This wealth of genomic information is not available anywhere else. Availability: The TMBB genomic database is available at http://beta-barrel.tulane.edu/. Contact: wwimley@tulane.edu.
Affiliation(s)
- Thomas C Freeman
- Department of Biochemistry, Tulane University, New Orleans, LA 70112, USA
17
Hayat S, Elofsson A. BOCTOPUS: improved topology prediction of transmembrane β barrel proteins. Bioinformatics 2012; 28:516-22. [DOI: 10.1093/bioinformatics/btr710] [Citation(s) in RCA: 61] [Impact Index Per Article: 5.1] [Indexed: 11/12/2022]
18
Computational studies of membrane proteins: models and predictions for biological understanding. Biochim Biophys Acta Biomembr 2011; 1818:927-41. [PMID: 22051023 DOI: 10.1016/j.bbamem.2011.09.026] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Received: 06/27/2011] [Revised: 09/22/2011] [Accepted: 09/26/2011] [Indexed: 01/26/2023]
Abstract
We discuss recent progress in computational studies of membrane proteins based on physical models with parameters derived from bioinformatics analysis. We describe computational identification of membrane proteins and prediction of their topology from sequence, discovery of sequence and spatial motifs, and implications of these discoveries. The detection of evolutionary signal for understanding the substitution pattern of residues in the TM segments and for sequence alignment is also discussed. We further discuss empirical potential functions for energetics of inserting residues in the TM domain, for interactions between TM helices or strands, and their applications in predicting lipid-facing surfaces of the TM domain. Recent progress in structure prediction of membrane proteins is also reviewed, with further discussion of the calculation of ensemble properties such as melting temperature based on a simplified state space model. Additional topics include prediction of oligomerization state of membrane proteins, identification of the interfaces for protein-protein interactions, and design of membrane proteins. This article is part of a Special Issue entitled: Protein Folding in Membranes.