1
Dubey A, Baxter M, Hendargo KJ, Medrano-Soto A, Saier MH. The Pentameric Ligand-Gated Ion Channel Family: A New Member of the Voltage Gated Ion Channel Superfamily? Int J Mol Sci 2024; 25:5005. [PMID: 38732224 PMCID: PMC11084639 DOI: 10.3390/ijms25095005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 04/28/2024] [Accepted: 04/29/2024] [Indexed: 05/13/2024] Open
Abstract
In this report we present seven lines of bioinformatic evidence supporting the conclusion that the Pentameric Ligand-gated Ion Channel (pLIC) Family is a member of the Voltage-gated Ion Channel (VIC) Superfamily. In our approach, we used the Transporter Classification Database (TCDB) as a reference and applied a series of bioinformatic methods to search for similarities between the pLIC family and members of the VIC superfamily. These include: (1) sequence similarity, (2) compatibility of topology and hydropathy profiles, (3) shared domains, (4) conserved motifs, (5) similarity of Hidden Markov Model profiles between families, (6) common 3D structural folds, and (7) clustering analysis of all families. Furthermore, sequence and structural comparisons, as well as the identification of a 3-TMS repeat unit in the VIC superfamily, suggest that the sixth transmembrane segment evolved into a re-entrant loop. This evidence suggests that the voltage-sensor domain and the channel domain have a common origin. The classification of the pLIC family within the VIC superfamily sheds light onto the topological origins of this family and its evolution, which will facilitate experimental verification and further research into this superfamily by the scientific community.
Affiliation(s)
- Arturo Medrano-Soto
- Department of Molecular Biology, School of Biological Sciences, University of California San Diego, La Jolla, CA 92093-0116, USA; (A.D.); (M.B.); (K.J.H.)
- Milton H. Saier
- Department of Molecular Biology, School of Biological Sciences, University of California San Diego, La Jolla, CA 92093-0116, USA; (A.D.); (M.B.); (K.J.H.)
2
Hendargo KJ, Patel AO, Chukwudozie OS, Moreno-Hagelsieb G, Christen JA, Medrano-Soto A, Saier MH. Sequence Similarity among Structural Repeats in the Piezo Family of Mechanosensitive Ion Channels. Microb Physiol 2023; 33:49-62. [PMID: 37321192 PMCID: PMC11283329 DOI: 10.1159/000531468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2022] [Accepted: 06/05/2023] [Indexed: 06/17/2023]
Abstract
Members of the Piezo family of mechanically activated cation channels are involved in multiple physiological processes in higher eukaryotes, including vascular development, cell differentiation, touch perception, hearing, and more, but they are also common in single-celled eukaryotic microorganisms. Mutations in these proteins in humans are associated with a variety of diseases, such as colorectal adenomatous polyposis, dehydrated hereditary stomatocytosis, and hereditary xerocytosis. Available 3D structures for Piezo proteins show nine regions of four transmembrane segments each that have the same fold. Despite the remarkable similarity among the nine characteristic structural repeats in the family, no significant sequence similarity among them has been reported. Using bioinformatics approaches and the Transporter Classification Database (TCDB) as reference, we reliably identified sequence similarity among repeats based on four lines of evidence: (1) hidden Markov model-profile similarities across repeats at the family level, (2) pairwise sequence similarities between different repeats across Piezo homologs, (3) Piezo-specific conserved sequence signatures that consistently identify the same regions across repeats, and (4) conserved residues that maintain the same orientation and location in 3D space.
Affiliation(s)
- Kevin J. Hendargo
- Department of Molecular Biology, School of Biological Sciences, University of California, San Diego, CA, USA
- Ashay O. Patel
- Department of Molecular Biology, School of Biological Sciences, University of California, San Diego, CA, USA
- Onyeka S. Chukwudozie
- Department of Molecular Biology, School of Biological Sciences, University of California, San Diego, CA, USA
- J. Andrés Christen
- Departamento de Probabilidad y Estadística, Centro de Investigación en Matemáticas, CIMAT, Guanajuato, Mexico
- Arturo Medrano-Soto
- Department of Molecular Biology, School of Biological Sciences, University of California, San Diego, CA, USA
- Milton H. Saier
- Department of Molecular Biology, School of Biological Sciences, University of California, San Diego, CA, USA
3
Tyler D, Hendargo KJ, Medrano-Soto A, Saier MH. Discovery and Characterization of the Phospholemman/SIMP/Viroporin Superfamily. Microb Physiol 2022; 32:83-94. [PMID: 35152214 PMCID: PMC9355910 DOI: 10.1159/000521947] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2021] [Accepted: 01/11/2022] [Indexed: 11/19/2022]
Abstract
Using bioinformatic approaches, we present evidence of distant relatedness among the Ephemerovirus Viroporin family, the Rhabdoviridae Putative Viroporin U5 family, the Phospholemman family, and the Small Integral Membrane Protein family. Our approach is based on the transitivity property of homology complemented with five validation criteria: (1) significant sequence similarity and alignment coverage, (2) compatibility of topology of transmembrane segments, (3) overlap of hydropathy profiles, (4) conservation of protein domains, and (5) conservation of sequence motifs. Our results indicate that Pfam protein domains PF02038 and PF15831 can be found in or projected onto members of all four families. In addition, we identified a 26-residue motif conserved across the superfamily. This motif is characterized by hydrophobic residues that help anchor the protein to the membrane and charged residues that constitute phosphorylation sites. In addition, all members of the four families with annotated function are either responsible for or affect the transport of ions into and/or out of the cell. Taken together, these results justify the creation of the novel Phospholemman/SIMP/Viroporin superfamily. Given that transport proteins can be found not just in cells, but also in viruses, the ability to relate viroporin protein families with their eukaryotic and bacterial counterparts is an important development in this superfamily.
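The transitivity argument in the abstract above can be sketched in a few lines of Python. This is only a minimal illustration and not the authors' pipeline: the protein names, the E-values, and the 1e-5 cutoff are all invented for the example, and a real analysis would additionally apply the five validation criteria listed in the abstract to every link in the chain.

```python
from collections import deque

# Hypothetical pairwise comparison results: (protein_a, protein_b, e_value).
# Names and E-values are invented for illustration only.
PAIRWISE_HITS = [
    ("famA_1", "bridge_1", 1e-12),
    ("bridge_1", "famB_1", 1e-9),
    ("famB_1", "famC_1", 0.5),  # too weak to count as evidence
]

E_VALUE_CUTOFF = 1e-5  # assumed threshold, not taken from the paper


def build_graph(hits, cutoff):
    """Keep only significant hits and store them as an undirected graph."""
    graph = {}
    for a, b, e_value in hits:
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if e_value <= cutoff:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def homology_path(graph, start, goal):
    """Breadth-first search for a chain of significant hits from start to goal."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in sorted(graph[path[-1]]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(PAIRWISE_HITS, E_VALUE_CUTOFF)
print(homology_path(graph, "famA_1", "famB_1"))  # chain through bridge_1
print(homology_path(graph, "famA_1", "famC_1"))  # None: no significant chain
```

A returned path names the intermediate sequences that link two families; `None` means no chain of significant hits was found under the assumed cutoff.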
Affiliation(s)
- Arturo Medrano-Soto
- Corresponding Authors: Milton H. Saier, Jr. & Arturo Medrano-Soto, Department of Molecular Biology, Division of Biological Sciences, University of California, San Diego, 9500 Gilman Drive #0116, La Jolla, California 92093-0116, Tel: 858-534-4084
- Milton H. Saier
- Corresponding Authors: Milton H. Saier, Jr. & Arturo Medrano-Soto, Department of Molecular Biology, Division of Biological Sciences, University of California, San Diego, 9500 Gilman Drive #0116, La Jolla, California 92093-0116, Tel: 858-534-4084
4
Khan MKA, Akhtar S. Novel drug design and bioinformatics: an introduction. Physical Sciences Reviews 2021. [DOI: 10.1515/psr-2018-0158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
In the current era of high-throughput technology, enormous amounts of biological data are generated daily by various sequencing projects, and a staggering number of biological targets have been deciphered as a result. The discovery of new chemical entities and bioisosteres of relatively low molecular weight has been gaining momentum in the pharmacopoeia, alongside traditional combinatorial design, in which a chemical structure is used as an initial template for enhancing efficacy, pharmacokinetic, and selectivity properties. Once a compound is identified, it undergoes ADMET filtration to determine whether it has toxic or mutagenic properties. A compound with neither toxicity nor mutagenicity is considered a potential lead molecule. Understanding the mechanism of lead molecules with various biological targets is imperative to advance related functions for drug discovery and development. Drug discovery is a tedious and costly process, taking around 10–15 years and costing around $4 billion. The cascaded approaches of bioinformatics and computational biology, namely structure-based drug design (SBDD) and ligand-based drug design (LBDD), which rely on the availability of 3D structures of the target biomacromolecule and of known ligands, respectively, have made this process easier and more approachable. SBDD encompasses homology modelling, ligand docking, fragment-based drug design and molecular dynamics, while LBDD deals with pharmacophore mapping, QSAR, and similarity search. All the computational methods discussed herein, whether for target identification or novel ligand discovery, continuously evolve and facilitate cost-effective and reliable outcomes in an era of overwhelming data.
Affiliation(s)
- Mohammad Kalim Ahmad Khan
- Department of Bioengineering, Faculty of Engineering, Integral University, Lucknow, Uttar Pradesh, 226026, India
- Salman Akhtar
- Department of Bioengineering, Faculty of Engineering, Integral University, Lucknow, Uttar Pradesh, 226026, India
5
Eisenhaber B, Sinha S, Jadalanki CK, Shitov VA, Tan QW, Sirota FL, Eisenhaber F. Conserved sequence motifs in human TMTC1, TMTC2, TMTC3, and TMTC4, new O-mannosyltransferases from the GT-C/PMT clan, are rationalized as ligand binding sites. Biol Direct 2021; 16:4. [PMID: 33436046 PMCID: PMC7801869 DOI: 10.1186/s13062-021-00291-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/11/2020] [Accepted: 01/04/2021] [Indexed: 12/05/2022] Open
Abstract
BACKGROUND The human proteins TMTC1, TMTC2, TMTC3 and TMTC4 have been experimentally shown to be components of a new O-mannosylation pathway. Their own mannosyl-transferase activity has been suspected but their actual enzymatic potential has not been demonstrated yet. So far, sequence analysis of TMTCs has been compromised by evolutionary sequence divergence within their membrane-embedded N-terminal region, sequence inaccuracies in the protein databases and the difficulty of interpreting the large functional variety of known homologous proteins (mostly sugar transferases and some with known 3D structure). RESULTS Evolutionarily conserved molecular function among TMTCs is only possible with conserved membrane topology within their membrane-embedded N-terminal regions leading to the placement of homologous long intermittent loops at the same membrane side. Using this criterion, we demonstrate that all TMTCs have 11 transmembrane regions. The sequence segment homologous to Pfam model DUF1736 is actually just a loop between TM7 and TM8 that is located in the ER lumen and that contains a small hydrophobic, but not membrane-embedded helix. The membrane-embedded N-terminal regions of TMTCs share a common fold and 3D structural similarity with subgroups of GT-C sugar transferases. Moreover, the conservation of residues critical for catalysis, for binding of a divalent metal ion and of the phosphate group of a lipid-linked sugar moiety throughout enzymatically and structurally well-studied GT-Cs and sequences of TMTCs indicates that TMTCs are actually sugar-transferring enzymes. We present credible 3D structural models of all four TMTCs (derived from their closest known homologues 5ezm/5f15) and find observed conserved sequence motifs rationalized as binding sites for a metal ion and for a dolichyl-phosphate-mannose moiety.
CONCLUSIONS With the results from both careful sequence analysis and structural modelling, we can conclusively say that the TMTCs are enzymatically active sugar transferases belonging to the GT-C/PMT superfamily. The DUF1736 segment, the loop between TM7 and TM8, is critical for catalysis and lipid-linked sugar moiety binding. Together with the available indirect experimental data, we conclude that the TMTCs are not only part of an O-mannosylation pathway in the endoplasmic reticulum of upper eukaryotes but, actually, they are the sought mannosyl-transferases.
Affiliation(s)
- Birgit Eisenhaber
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01 Matrix, Singapore, 138671, Republic of Singapore
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Singapore, 138672, Republic of Singapore
- Swati Sinha
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01 Matrix, Singapore, 138671, Republic of Singapore
- Chaitanya K Jadalanki
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01 Matrix, Singapore, 138671, Republic of Singapore
- Vladimir A Shitov
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01 Matrix, Singapore, 138671, Republic of Singapore
- Siberian State Medical University, Moskovskiy Trakt, 2, Tomsk, Tomsk Oblast, 634050, Russia
- Qiao Wen Tan
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01 Matrix, Singapore, 138671, Republic of Singapore
- School of Biological Science (SBS), Nanyang Technological University (NTU), 60 Nanyang Drive, Singapore, 637551, Republic of Singapore
- Fernanda L Sirota
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01 Matrix, Singapore, 138671, Republic of Singapore
- Frank Eisenhaber
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01 Matrix, Singapore, 138671, Republic of Singapore
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Singapore, 138672, Republic of Singapore
- School of Biological Science (SBS), Nanyang Technological University (NTU), 60 Nanyang Drive, Singapore, 637551, Republic of Singapore
6
Medrano-Soto A, Ghazi F, Hendargo KJ, Moreno-Hagelsieb G, Myers S, Saier MH. Expansion of the Transporter-Opsin-G protein-coupled receptor superfamily with five new protein families. PLoS One 2020; 15:e0231085. [PMID: 32320418 PMCID: PMC7176098 DOI: 10.1371/journal.pone.0231085] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 02/13/2019] [Accepted: 03/17/2020] [Indexed: 02/06/2023] Open
Abstract
Here we provide bioinformatic evidence that the Organo-Arsenical Exporter (ArsP), Endoplasmic Reticulum Retention Receptor (KDELR), Mitochondrial Pyruvate Carrier (MPC), L-Alanine Exporter (AlaE), and the Lipid-linked Sugar Translocase (LST) protein families are members of the Transporter-Opsin-G Protein-coupled Receptor (TOG) Superfamily. These families share domains homologous to well-established TOG superfamily members, and their topologies of transmembranal segments (TMSs) are compatible with the basic 4-TMS repeat unit characteristic of this Superfamily. These repeat units tend to occur twice in proteins as a result of intragenic duplication events, often with subsequent gain/loss of TMSs in many superfamily members. Transporters within the ArsP family allow microbial pathogens to expel toxic arsenic compounds from the cell. Members of the KDELR family are involved in the selective retrieval of proteins that reside in the endoplasmic reticulum. Proteins of the MPC family are involved in the transport of pyruvate into mitochondria, providing the organelle with a major oxidative fuel. Members of family AlaE excrete L-alanine from the cell. Members of the LST family are involved in the translocation of lipid-linked glucose across the membrane. These five families substantially expand the range of substrates of transport carriers in the superfamily, although KDEL receptors have no known transport function. Clustering of protein sequences reveals the relationships among families, and the resulting tree correlates well with the degrees of sequence similarity documented between families. The analyses and programs developed to detect distant relatedness provide insights into the structural, functional, and evolutionary relationships that exist between families of the TOG superfamily, and should be of value to many other investigators.
Affiliation(s)
- Arturo Medrano-Soto
- Department of Molecular Biology, Division of Biological Sciences, University of California, San Diego, La Jolla, California, United States of America
- Faezeh Ghazi
- Department of Molecular Biology, Division of Biological Sciences, University of California, San Diego, La Jolla, California, United States of America
- Kevin J. Hendargo
- Department of Molecular Biology, Division of Biological Sciences, University of California, San Diego, La Jolla, California, United States of America
- Scott Myers
- Department of Molecular Biology, Division of Biological Sciences, University of California, San Diego, La Jolla, California, United States of America
- Milton H. Saier
- Department of Molecular Biology, Division of Biological Sciences, University of California, San Diego, La Jolla, California, United States of America
7
Wang SC, Davejan P, Hendargo KJ, Javadi-Razaz I, Chou A, Yee DC, Ghazi F, Lam KJK, Conn AM, Madrigal A, Medrano-Soto A, Saier MH. Expansion of the Major Facilitator Superfamily (MFS) to include novel transporters as well as transmembrane-acting enzymes. Biochimica et Biophysica Acta - Biomembranes 2020; 1862:183277. [PMID: 32205149 DOI: 10.1016/j.bbamem.2020.183277] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 12/11/2019] [Revised: 03/14/2020] [Accepted: 03/17/2020] [Indexed: 12/14/2022]
Abstract
The Major Facilitator Superfamily (MFS) is currently the largest characterized superfamily of transmembrane secondary transport proteins. Its diverse members are found in essentially all organisms in the biosphere and function by uniport, symport, and/or antiport mechanisms. In 1993 we first named and described the MFS which then consisted of 5 previously known families that had not been known to be related, and by 2012 we had identified a total of 74 families, classified phylogenetically within the MFS, all of which included only transport proteins. This superfamily has since expanded to 89 families, all included under TC# 2.A.1, and a few transporter families outside of TC# 2.A.1 were identified as members of the MFS. In this study, we assign nine previously unclassified protein families in the Transporter Classification Database (TCDB; http://www.tcdb.org) to the MFS based on multiple criteria and bioinformatic methodologies. In addition, we find integral membrane domains distantly related to partial or full-length MFS permeases in Lysyl tRNA Synthases (TC# 9.B.111), Lysylphosphatidyl Glycerol Synthases (TC# 4.H.1), and cytochrome b561 transmembrane electron carriers (TC# 5.B.2). Sequence alignments, overlap of hydropathy plots, compatibility of repeat units, similarity of complexity profiles of transmembrane segments, shared protein domains and 3D structural similarities between transport proteins were analyzed to assist in inferring homology. The MFS now includes 105 families.
Affiliation(s)
- Steven C Wang
- Department of Molecular Biology, Division of Biological Sciences, University of California at San Diego, La Jolla, CA 92093-0116, United States of America
- Pauldeen Davejan
- Department of Molecular Biology, Division of Biological Sciences, University of California at San Diego, La Jolla, CA 92093-0116, United States of America
- Kevin J Hendargo
- Department of Molecular Biology, Division of Biological Sciences, University of California at San Diego, La Jolla, CA 92093-0116, United States of America
- Ida Javadi-Razaz
- Department of Molecular Biology, Division of Biological Sciences, University of California at San Diego, La Jolla, CA 92093-0116, United States of America
- Amy Chou
- Department of Molecular Biology, Division of Biological Sciences, University of California at San Diego, La Jolla, CA 92093-0116, United States of America
- Daniel C Yee
- Department of Molecular Biology, Division of Biological Sciences, University of California at San Diego, La Jolla, CA 92093-0116, United States of America
- Faezeh Ghazi
- Department of Molecular Biology, Division of Biological Sciences, University of California at San Diego, La Jolla, CA 92093-0116, United States of America
- Katie Jing Kay Lam
- Department of Molecular Biology, Division of Biological Sciences, University of California at San Diego, La Jolla, CA 92093-0116, United States of America
- Adam M Conn
- Department of Molecular Biology, Division of Biological Sciences, University of California at San Diego, La Jolla, CA 92093-0116, United States of America
- Assael Madrigal
- Department of Molecular Biology, Division of Biological Sciences, University of California at San Diego, La Jolla, CA 92093-0116, United States of America
- Arturo Medrano-Soto
- Department of Molecular Biology, Division of Biological Sciences, University of California at San Diego, La Jolla, CA 92093-0116, United States of America
- Milton H Saier
- Department of Molecular Biology, Division of Biological Sciences, University of California at San Diego, La Jolla, CA 92093-0116, United States of America
8
Subaraja M, Kulandaisamy A, Shanmugam NRS, Vanisree AJ. Homology modeling identified for purported drug targets to the neuroprotective effects of levodopa and asiaticoside-D in degenerated cerebral ganglions of Lumbricus terrestris. Indian J Pharmacol 2019; 51:31-39. [PMID: 31031465 PMCID: PMC6444839 DOI: 10.4103/ijp.ijp_600_18] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/21/2022] Open
Abstract
CONTEXT: Homology modeling plays a role in determining therapeutic targets for dreadful conditions such as neurodegenerative diseases (NDD), which are challenging to manage effectively. The structures of the serotonin transporter (SERT), aquaporin (AQP), and tropomyosin receptor kinase (TrkA), which are implicated in NDD pathology, are still unknown for Lumbricus terrestris, but the three-dimensional (3D) structures of the human counterparts are available for modeling. AIM: This study aims to generate and evaluate the 3D structures of the TrkA, SERT, and AQP proteins and their interactions with the ligands asiaticoside-D (AD) and levodopa (L-DOPA), both anti-NDD agents. SUBJECTS AND METHODS: Homology models of the SERT, AQP, and TrkA proteins of Lumbricus terrestris were built using the SWISS-MODEL server, and the modeled structures were validated using the Rampage server. Wet-lab analysis of the corresponding mRNA levels was also performed to validate the in silico data. RESULTS: TrkA had moderately high homology (67%) to its human counterpart, while SERT and AQP exhibited 58% and 42%, respectively. The reliability of the models was assessed by Ramachandran plot analysis. Docking of AD with SERT, AQP-4, and TrkA gave binding energies of −9.93, 8.88, and −7.58 kcal/mol, respectively, while L-DOPA gave −3.93, −5.13, and −6.0 kcal/mol, respectively. The levels of SERT, TrkA, and AQP-4 were significantly reduced (P < 0.001) in ROT-induced worms compared with control worms. In the ROT + AD supplementation group (III), mRNA levels were significantly increased (P < 0.05) compared with ROT-induced worms (group II). CONCLUSION: Our pioneering docking data propose possible targets that may prove useful for therapeutic investigations toward the still-unconquered betterment of NDD.
Affiliation(s)
- Mamangam Subaraja
- Department of Biochemistry, Guindy Campus, University of Madras, Chennai, Tamil Nadu, India
- A Kulandaisamy
- Department of Biotechnology, Indian Institute of Technology, Chennai, Tamil Nadu, India
- N R Siva Shanmugam
- Department of Biotechnology, Indian Institute of Technology, Chennai, Tamil Nadu, India
9
Eisenhaber B, Sinha S, Wong WC, Eisenhaber F. Function of a membrane-embedded domain evolutionarily multiplied in the GPI lipid anchor pathway proteins PIG-B, PIG-M, PIG-U, PIG-W, PIG-V, and PIG-Z. Cell Cycle 2018; 17:874-880. [PMID: 29764287 PMCID: PMC6056205 DOI: 10.1080/15384101.2018.1456294] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 11/17/2022] Open
Abstract
Distant homology relationships among proteins with many transmembrane regions (TMs) are difficult to detect as they are clouded by the TMs’ hydrophobic compositional bias and mutational divergence in connecting loops. In the case of several GPI lipid anchor biosynthesis pathway components, the hidden evolutionary signal can be revealed with dissectHMMER, a sequence similarity search tool focusing on fold-critical, high complexity sequence segments. We find that a sequence module with 10 TMs in PIG-W, described as acyl transferase, is homologous to PIG-U, a transamidase subunit without characterized molecular function, and to mannosyltransferases PIG-B, PIG-M, PIG-V and PIG-Z. We conclude that this new, membrane-embedded domain named BindGPILA functions as the unit for recognizing, binding and stabilizing the GPI lipid anchor in a modification-competent form as this appears the only functional aspect shared among all proteins. Thus, PIG-U's likely molecular function is shuttling/presenting the anchor in a productive conformation to the transamidase complex.
Affiliation(s)
- Birgit Eisenhaber
- Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01 Matrix, Singapore 138671, Republic of Singapore
- Swati Sinha
- Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01 Matrix, Singapore 138671, Republic of Singapore
- Wing-Cheong Wong
- Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01 Matrix, Singapore 138671, Republic of Singapore
- Frank Eisenhaber
- Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01 Matrix, Singapore 138671, Republic of Singapore; School of Computer Engineering, Nanyang Technological University (NTU), 50 Nanyang Drive, Singapore 637553, Republic of Singapore
10
Baker JA, Wong WC, Eisenhaber B, Warwicker J, Eisenhaber F. Charged residues next to transmembrane regions revisited: "Positive-inside rule" is complemented by the "negative inside depletion/outside enrichment rule". BMC Biol 2017; 15:66. [PMID: 28738801 PMCID: PMC5525207 DOI: 10.1186/s12915-017-0404-4] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Received: 07/03/2017] [Accepted: 07/07/2017] [Indexed: 11/25/2022] Open
Abstract
Background Transmembrane helices (TMHs) frequently occur amongst protein architectures as means for proteins to attach to or embed into biological membranes. Physical constraints such as the membrane’s hydrophobicity and electrostatic potential apply uniform requirements to TMHs and their flanking regions; consequently, they are mirrored in their sequence patterns (in addition to TMHs being a span of generally hydrophobic residues) on top of variations enforced by the specific protein’s biological functions. Results With statistics derived from a large body of protein sequences, we demonstrate that, in addition to the positive charge preference at the cytoplasmic inside (positive-inside rule), negatively charged residues preferentially occur or are even enriched at the non-cytoplasmic flank or, at least, they are suppressed at the cytoplasmic flank (negative-not-inside/negative-outside (NNI/NO) rule). As negative residues are generally rare within or near TMHs, the statistical significance is sensitive with regard to details of TMH alignment and residue frequency normalisation and also to dataset size; therefore, this trend was obscured in previous work. We observe variations amongst taxa as well as for organelles along the secretory pathway. The effect is most pronounced for TMHs from single-pass transmembrane (bitopic) proteins compared to those with multiple TMHs (polytopic proteins) and especially for the class of simple TMHs that evolved for the sole role as membrane anchors. Conclusions The charged-residue flank bias is only one of the TMH sequence features with a role in the anchorage mechanisms, others apparently being the leucine intra-helix propensity skew towards the cytoplasmic side, tryptophan flanking as well as the cysteine and tyrosine inside preference. These observations will stimulate new prediction methods for TMHs and protein topology from a sequence as well as new engineering designs for artificial membrane proteins. 
Electronic supplementary material The online version of this article (doi:10.1186/s12915-017-0404-4) contains supplementary material, which is available to authorized users.
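The kind of flank statistic discussed in this abstract can be illustrated with a small Python sketch that counts charged residues on either side of a single transmembrane helix. It is a toy under stated assumptions, not the authors' method: the sequence, the helix boundaries, and the 5-residue flank width are invented, and which flank faces the cytoplasm must be known from annotation or prediction.

```python
POSITIVE = set("KR")  # lysine, arginine
NEGATIVE = set("DE")  # aspartate, glutamate


def flank_charges(sequence, tm_start, tm_end, flank=5):
    """Count charged residues in the flanks of one transmembrane helix.

    tm_start and tm_end are 0-based, end-exclusive indices of the helix.
    Returns {"before": (n_pos, n_neg), "after": (n_pos, n_neg)}.
    """
    before = sequence[max(0, tm_start - flank):tm_start]
    after = sequence[tm_end:tm_end + flank]

    def count(segment):
        return (sum(r in POSITIVE for r in segment),
                sum(r in NEGATIVE for r in segment))

    return {"before": count(before), "after": count(after)}


# Invented toy sequence: flank, 20-residue hydrophobic stretch, flank.
# Assume, for the example only, that the N-terminal flank is cytoplasmic.
toy = "MKRRS" + "LLIVALLAVLLGIVLALLIA" + "DSEDT"
print(flank_charges(toy, 5, 25))
# {'before': (3, 0), 'after': (0, 3)}
```

In this toy case the assumed cytoplasmic flank carries only positive charges and the opposite flank only negative ones, which is the pattern the two rules describe. Real data are far noisier, and the abstract stresses that the result depends on helix alignment, normalisation, and dataset size.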
Affiliation(s)
- James Alexander Baker
- Bioinformatics Institute, Agency for Science Technology and Research (A*STAR), 30 Biopolis Street #07-01, Matrix, Singapore, 138671, Singapore; School of Chemistry, Manchester Institute of Biotechnology, 131 Princess Street, Manchester, M1 7DN, UK
- Wing-Cheong Wong
- Bioinformatics Institute, Agency for Science Technology and Research (A*STAR), 30 Biopolis Street #07-01, Matrix, Singapore, 138671, Singapore
- Birgit Eisenhaber
- Bioinformatics Institute, Agency for Science Technology and Research (A*STAR), 30 Biopolis Street #07-01, Matrix, Singapore, 138671, Singapore
- Jim Warwicker
- School of Chemistry, Manchester Institute of Biotechnology, 131 Princess Street, Manchester, M1 7DN, UK
- Frank Eisenhaber
- Bioinformatics Institute, Agency for Science Technology and Research (A*STAR), 30 Biopolis Street #07-01, Matrix, Singapore, 138671, Singapore; School of Computer Engineering (SCE), Nanyang Technological University (NTU), 50 Nanyang Drive, Singapore, 637553, Singapore
11
Saidijam M, Azizpour S, Patching SG. Comprehensive analysis of the numbers, lengths and amino acid compositions of transmembrane helices in prokaryotic, eukaryotic and viral integral membrane proteins of high-resolution structure. J Biomol Struct Dyn 2017; 36:443-464. [PMID: 28150531 DOI: 10.1080/07391102.2017.1285725] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 12/21/2022]
Abstract
We report a comprehensive analysis of the numbers, lengths and amino acid compositions of transmembrane helices in 235 high-resolution structures of integral membrane proteins. The properties of 1551 transmembrane helices in the structures were compared with those obtained by analysis of the same amino acid sequences using topology prediction tools. Explanations for the 81 (5.2%) missing or additional transmembrane helices in the prediction results were identified. Main reasons for missing transmembrane helices were mis-identification of N-terminal signal peptides, breaks in α-helix conformation or charged residues in the middle of transmembrane helices and transmembrane helices with unusual amino acid composition. The main reason for additional transmembrane helices was mis-identification of amphipathic helices, extramembrane helices or hairpin re-entrant loops. Transmembrane helix length had an overall median of 24 residues and an average of 24.9 ± 7.0 residues and the most common length was 23 residues. The overall content of residues in transmembrane helices as a percentage of the full proteins had a median of 56.8% and an average of 55.7 ± 16.0%. Amino acid composition was analysed for the full proteins, transmembrane helices and extramembrane regions. Individual proteins or types of proteins with transmembrane helices containing extremes in contents of individual amino acids or combinations of amino acids with similar physicochemical properties were identified and linked to structure and/or function. In addition to overall median and average values, all results were analysed for proteins originating from different types of organism (prokaryotic, eukaryotic, viral) and for subgroups of receptors, channels, transporters and others.
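The summary statistics quoted in this abstract (median, mean ± standard deviation, most common length) can be computed as in the sketch below. The helix lengths used here are invented for illustration and are not data from the study; `summarize_helix_lengths` is a hypothetical helper name.

```python
from statistics import mean, median, multimode, stdev


def summarize_helix_lengths(lengths):
    """Return median, mean, sample standard deviation and mode(s) of helix lengths."""
    return {
        "median": median(lengths),
        "mean": mean(lengths),
        "sd": stdev(lengths),          # sample standard deviation
        "modes": multimode(lengths),   # list, since ties are possible
    }


# Hypothetical helix lengths in residues (not taken from the paper).
example_lengths = [19, 21, 23, 23, 23, 24, 25, 27, 31, 34]
summary = summarize_helix_lengths(example_lengths)
```

Reporting both median and mean, as the abstract does, is useful because helix length distributions tend to have a long upper tail that pulls the mean above the median.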
Affiliation(s)
- Massoud Saidijam
- Department of Molecular Medicine and Genetics, Research Centre for Molecular Medicine, School of Medicine, Hamadan University of Medical Sciences, Hamadan, Iran
- Sonia Azizpour
- Department of Molecular Medicine and Genetics, Research Centre for Molecular Medicine, School of Medicine, Hamadan University of Medical Sciences, Hamadan, Iran
- Simon G Patching
- School of BioMedical Sciences and the Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds, UK
|
12
|
Yap CK, Eisenhaber B, Eisenhaber F, Wong WC. xHMMER3x2: Utilizing HMMER3's speed and HMMER2's sensitivity and specificity in the glocal alignment mode for improved large-scale protein domain annotation. Biol Direct 2016; 11:63. [PMID: 27894340 PMCID: PMC5126834 DOI: 10.1186/s13062-016-0163-0] [Received: 07/16/2016] [Accepted: 10/24/2016] [Indexed: 01/27/2023]
Abstract
BACKGROUND While the local-mode HMMER3 is notable for its massive speed improvement, the slower glocal-mode HMMER2 is more exact for domain annotation by enforcing full domain-to-sequence alignments. Since a unit of domain necessarily implies a unit of function, local-mode HMMER3 alone remains insufficient for precise function annotation tasks. In addition, the incomparable E-values for the same domain model by different HMMER builds create difficulty when checking for domain annotation consistency on a large-scale basis. RESULTS In this work, both the speed of HMMER3 and glocal-mode alignment of HMMER2 are combined within the xHMMER3x2 framework for tackling the large-scale domain annotation task. Briefly, HMMER3 is utilized for initial domain detection so that HMMER2 can subsequently perform the glocal-mode, sequence-to-full-domain alignments for the detected HMMER3 hits. An E-value calibration procedure is required to ensure that the search space by HMMER2 is sufficiently replicated by HMMER3. We find that the latter is straightforwardly possible for ~80% of the models in the Pfam domain library (release 29). However in the case of the remaining ~20% of HMMER3 domain models, the respective HMMER2 counterparts are more sensitive. Thus, HMMER3 searches alone are insufficient to ensure sensitivity and a HMMER2-based search needs to be initiated. When tested on the set of UniProt human sequences, xHMMER3x2 can be configured to be between 7× and 201× faster than HMMER2, but with descending domain detection sensitivity from 99.8 to 95.7% with respect to HMMER2 alone; HMMER3's sensitivity was 95.7%. At extremes, xHMMER3x2 is either the slow glocal-mode HMMER2 or the fast HMMER3 with glocal-mode. Finally, the E-values to false-positive rates (FPR) mapping by xHMMER3x2 allows E-values of different model builds to be compared, so that any annotation discrepancies in a large-scale annotation exercise can be flagged for further examination by dissectHMMER. 
CONCLUSION The xHMMER3x2 workflow allows large-scale domain annotation speed to be drastically improved over HMMER2 without compromising for domain-detection with regard to sensitivity and sequence-to-domain alignment incompleteness. The xHMMER3x2 code and its webserver (for Pfam release 27, 28 and 29) are freely available at http://xhmmer3x2.bii.a-star.edu.sg/ . REVIEWERS Reviewed by Thomas Dandekar, L. Aravind, Oliviero Carugo and Shamil Sunyaev. For the full reviews, please go to the Reviewers' comments section.
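The two-stage idea in this abstract, a fast local-mode pass that selects candidates for a slower full-domain (glocal) check, can be sketched generically as follows. This is not the xHMMER3x2 code and does not call HMMER. The predicates are hypothetical stand-ins based on substring matching, and the `fast_stage_is_sufficient` flag is an assumed simplification of the per-model E-value calibration the authors describe.

```python
from typing import Callable, Iterable, List


def two_stage_annotation(
    sequences: Iterable[str],
    fast_local_hit: Callable[[str], bool],
    slow_glocal_hit: Callable[[str], bool],
    fast_stage_is_sufficient: bool = True,
) -> List[str]:
    """Keep sequences confirmed by the slow check, pre-filtering with the fast one.

    When the fast stage is not trusted for a given model, every sequence is
    sent straight to the slow check.
    """
    confirmed: List[str] = []
    for seq in sequences:
        if fast_stage_is_sufficient and not fast_local_hit(seq):
            continue  # rejected cheaply, the slow check is never run
        if slow_glocal_hit(seq):
            confirmed.append(seq)
    return confirmed


# Toy stand-ins: a "domain" is the string ACDEFG; a local hit needs only its core.
TOY_DOMAIN = "ACDEFG"


def toy_fast_local_hit(seq: str) -> bool:
    return "CDE" in seq


def toy_slow_glocal_hit(seq: str) -> bool:
    return TOY_DOMAIN in seq


toy_sequences = ["xxACDEFGxx", "xxCDExx", "yyyy"]
toy_hits = two_stage_annotation(toy_sequences, toy_fast_local_hit, toy_slow_glocal_hit)
```

In this toy run the second sequence passes the local filter but fails the full-domain check, which mirrors the abstract's point that local-mode hits alone do not guarantee complete domain alignments.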
Affiliation(s)
- Choon-Kong Yap
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore, 138671, Singapore
- Birgit Eisenhaber
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore, 138671, Singapore
- Frank Eisenhaber
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore, 138671, Singapore; School of Computer Engineering (SCE), Nanyang Technological University (NTU), 50 Nanyang Drive, Singapore, 637553, Singapore
- Wing-Cheong Wong
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore, 138671, Singapore
|
13
|
Pannuzzo G, Graziano ACE, Pannuzzo M, Masman MF, Avola R, Cardile V. Zoledronate derivatives as potential inhibitors of uridine diphosphate-galactose ceramide galactosyltransferase 8: A combined molecular docking and dynamic study. J Neurosci Res 2016; 94:1318-1326. [DOI: 10.1002/jnr.23761] [Indexed: 08/29/2023]
Affiliation(s)
- Giovanna Pannuzzo
- Department of Biomedical and Biotechnological Sciences, Section of Physiology; University of Catania; Catania, Italy
- Martina Pannuzzo
- Department of Computational Biology; Universität Erlangen-Nürnberg; Erlangen, Germany
- Marcelo Fabricio Masman
- Department of Biocatalysis and Biotransformation, Groningen Biomolecular Sciences and Biotechnology Institute; University of Groningen; Groningen, The Netherlands
- Rosanna Avola
- Department of Biomedical and Biotechnological Sciences, Section of Physiology; University of Catania; Catania, Italy
- Venera Cardile
- Department of Biomedical and Biotechnological Sciences, Section of Physiology; University of Catania; Catania, Italy
|
14
|
The Recipe for Protein Sequence-Based Function Prediction and Its Implementation in the ANNOTATOR Software Environment. Methods Mol Biol 2016; 1415:477-506. [PMID: 27115649 DOI: 10.1007/978-1-4939-3572-7_25] [Indexed: 03/02/2023]
|
15
|
The C-terminal region of the non-structural protein 2B from Hepatitis A Virus demonstrates lipid-specific viroporin-like activity. Sci Rep 2015; 5:15884. [PMID: 26515753 PMCID: PMC4626808 DOI: 10.1038/srep15884] [Received: 06/28/2015] [Accepted: 10/05/2015] [Indexed: 12/20/2022]
Abstract
Viroporins are virally encoded, membrane-active proteins, which enhance viral replication and assist in egress of viruses from host cells. The 2B proteins in the picornaviridae family are known to have viroporin-like properties, and play critical roles during virus replication. The 2B protein of Hepatitis A Virus (2B), an unusual picornavirus, is somewhat dissimilar from its analogues in several respects. HAV 2B is approximately 2.5 times the length of other 2B proteins, and does not disrupt calcium homeostasis or glycoprotein trafficking. Additionally, its membrane penetrating properties are not yet clearly established. Here we show that the membrane interacting activity of HAV 2B is localized in its C-terminal region, which contains an alpha-helical hairpin motif. We show that this region is capable of forming small pores in membranes and demonstrates lipid specific activity, which partially rationalizes the intracellular localization of full-length 2B. Using a combination of biochemical assays and molecular dynamics simulation studies, we also show that HAV 2B demonstrates a marked propensity to dimerize in a crowded environment, and probably interacts with membranes in a multimeric form, a hallmark of other picornavirus viroporins. In sum, our study clearly establishes HAV 2B as a bona fide viroporin in the picornaviridae family.
|
16
|
Wong WC, Yap CK, Eisenhaber B, Eisenhaber F. dissectHMMER: a HMMER-based score dissection framework that statistically evaluates fold-critical sequence segments for domain fold similarity. Biol Direct 2015; 10:39. [PMID: 26228544 PMCID: PMC4521371 DOI: 10.1186/s13062-015-0068-3] [Received: 03/26/2015] [Accepted: 07/20/2015] [Indexed: 11/10/2022]
Abstract
Background Annotation transfer for function and structure within the sequence homology concept essentially requires protein sequence similarity for the secondary structural blocks forming the fold of a protein. A simplistic similarity approach in the case of non-globular segments (coiled coils, low complexity regions, transmembrane regions, long loops, etc.) is not justified and is a pertinent source of mistaken homologies. The latter is due either to positional sequence conservation as a result of a very simple, physically induced pattern or to integral sequence properties that are critical for function. Furthermore, because the number of well-studied proteins continues to grow only slowly, a search methodology is needed that dives deeper into the sequence similarity space to connect unknown sequences to well-studied, albeit more distant, ones for biological function postulations. Results Based on our previous work of dissecting the hidden Markov model (HMMER) based similarity score into fold-critical and non-globular contributions to improve homology inference, we propose a framework, dissectHMMER, that identifies more fold-related domain hits from standard HMMER searches. Subsequent statistical stratification of the fold-related hits into cohorts of functionally-related domains allows for the function postulation of the query sequence. Briefly, the technical problems of how to recognize non-globular parts in the domain model, resolve contradictory HMMER2/HMMER3 results and evaluate fold-related domain hits for homology are addressed in this work. The framework is benchmarked against a set of SCOP-to-Pfam domain models. Despite being a sequence-to-profile method, dissectHMMER performs favorably against a profile-to-profile based method, HHsuite/HHsearch.
Examples of function annotation using dissectHMMER, including the function discovery of an uncharacterized membrane protein Q9K8K1_BACHD (WP_010899149.1) as a lactose/H+ symporter, are presented. Finally, the dissectHMMER webserver is made publicly available at http://dissecthmmer.bii.a-star.edu.sg. Conclusions The proposed framework, dissectHMMER, is faithful to the original inception of the sequence homology concept while improving upon the existing HMMER search tool through the rescue of statistically evaluated false-negative yet fold-related domain hits to the query sequence. Overall, this translates into an opportunity for any novel protein sequence to be functionally characterized. Reviewers This article was reviewed by Masanori Arita, Shamil Sunyaev and L. Aravind. Electronic supplementary material The online version of this article (doi:10.1186/s13062-015-0068-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Wing-Cheong Wong
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore, 138671, Singapore
- Choon-Kong Yap
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore, 138671, Singapore
- Birgit Eisenhaber
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore, 138671, Singapore
- Frank Eisenhaber
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore, 138671, Singapore; Department of Biological Sciences (DBS), National University of Singapore (NUS), 8 Medical Drive, Singapore, 117597, Singapore; School of Computer Engineering (SCE), Nanyang Technological University (NTU), 50 Nanyang Drive, Singapore, 637553, Singapore
|
17
|
Mudgal R, Sandhya S, Chandra N, Srinivasan N. De-DUFing the DUFs: Deciphering distant evolutionary relationships of Domains of Unknown Function using sensitive homology detection methods. Biol Direct 2015; 10:38. [PMID: 26228684 PMCID: PMC4520260 DOI: 10.1186/s13062-015-0069-2] [Received: 03/10/2015] [Accepted: 07/20/2015] [Indexed: 12/23/2022]
Abstract
Background In the post-genomic era where sequences are being determined at a rapid rate, we are highly reliant on computational methods for their tentative biochemical characterization. The Pfam database currently contains 3,786 families corresponding to “Domains of Unknown Function” (DUF) or “Uncharacterized Protein Family” (UPF), of which 3,087 families have no reported three-dimensional structure, constituting almost one-fourth of the known protein families in search for both structure and function. Results We applied a ‘computational structural genomics’ approach using five state-of-the-art remote similarity detection methods to detect the relationship between uncharacterized DUFs and domain families of known structures. The association with a structural domain family could serve as a start point in elucidating the function of a DUF. Amongst these five methods, searches in SCOP-NrichD database have been applied for the first time. Predictions were classified into high, medium and low confidence based on the consensus of results from various approaches and also annotated with enzyme and Gene Ontology terms. 614 uncharacterized DUFs could be associated with a known structural domain, of which high confidence predictions, involving at least four methods, were made for 54 families. These structure-function relationships for the 614 DUF families can be accessed on-line at http://proline.biochem.iisc.ernet.in/RHD_DUFS/. For potential enzymes in this set, we assessed their compatibility with the associated fold and performed detailed structural and functional annotation by examining alignments and extent of conservation of functional residues. Detailed discussion is provided for interesting assignments for DUF3050, DUF1636, DUF1572, DUF2092 and DUF659. Conclusions This study provides insights into the structure and potential function for nearly 20% of the DUFs.
Use of different computational approaches enables us to reliably recognize distant relationships, especially when they converge to a common assignment because the methods are often complementary. We observe that while pointers to the structural domain can offer the right clues to the function of a protein, recognition of its precise functional role is still ‘non-trivial’ with many DUF domains conserving only some of the critical residues. It is not clear whether these are functional vestiges or instances involving alternate substrates and interacting partners. Reviewers This article was reviewed by Drs Eugene Koonin, Frank Eisenhaber and Srikrishna Subramanian. Electronic supplementary material The online version of this article (doi:10.1186/s13062-015-0069-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Richa Mudgal
- IISc Mathematics Initiative, Indian Institute of Science, Bangalore, 560 012, India
- Sankaran Sandhya
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, 560 012, India
- Nagasuma Chandra
- Department of Biochemistry, Indian Institute of Science, Bangalore, 560 012, India
|
18
|
|
19
|
Walther TH, Ulrich AS. Transmembrane helix assembly and the role of salt bridges. Curr Opin Struct Biol 2014; 27:63-8. [DOI: 10.1016/j.sbi.2014.05.003] [Received: 02/27/2014] [Revised: 05/08/2014] [Accepted: 05/09/2014] [Indexed: 10/25/2022]
|
20
|
Wong WC, Maurer-Stroh S, Eisenhaber B, Eisenhaber F. On the necessity of dissecting sequence similarity scores into segment-specific contributions for inferring protein homology, function prediction and annotation. BMC Bioinformatics 2014; 15:166. [PMID: 24890864 PMCID: PMC4061105 DOI: 10.1186/1471-2105-15-166] [Received: 10/25/2013] [Accepted: 05/27/2014] [Indexed: 02/01/2023]
Abstract
Background Protein sequence similarities to any types of non-globular segments (coiled coils, low complexity regions, transmembrane regions, long loops, etc. where either positional sequence conservation is the result of a very simple, physically induced pattern or rather integral sequence properties are critical) are pertinent sources for mistaken homologies. Regretfully, these considerations regularly escape attention in large-scale annotation studies since, often, there is no substitute to manual handling of these cases. Quantitative criteria are required to suppress events of function annotation transfer as a result of false homology assignments. Results The sequence homology concept is based on the similarity comparison between the structural elements, the basic building blocks for conferring the overall fold of a protein. We propose to dissect the total similarity score into fold-critical and other, remaining contributions and suggest that, for a valid homology statement, the fold-relevant score contribution should at least be significant on its own. As part of the article, we provide the DissectHMMER software program for dissecting HMMER2/3 scores into segment-specific contributions. We show that DissectHMMER reproduces HMMER2/3 scores with sufficient accuracy and that it is useful in automated decisions about homology for instructive sequence examples. To generalize the dissection concept for cases without 3D structural information, we find that a dissection based on alignment quality is an appropriate surrogate. The approach was applied to a large-scale study of SMART and PFAM domains in the space of seed sequences and in the space of UniProt/SwissProt. 
Conclusions Sequence similarity core dissection with regard to fold-critical and other contributions systematically suppresses false hits and, additionally, recovers previously obscured homology relationships such as the one between aquaporins and formate/nitrite transporters that, so far, was only supported by structure comparison.
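The dissection principle stated in this abstract, that the fold-critical part of a similarity score should be significant on its own, can be illustrated with a small additive sketch. It is not the DissectHMMER program: the per-position scores, the boolean mask marking fold-critical positions, and the plain score threshold (used here in place of a proper E-value evaluation) are all assumptions made for illustration.

```python
def dissect_score(position_scores, fold_critical_mask):
    """Split an additive alignment score into fold-critical and remaining parts."""
    fold = sum(s for s, is_fold in zip(position_scores, fold_critical_mask) if is_fold)
    other = sum(s for s, is_fold in zip(position_scores, fold_critical_mask) if not is_fold)
    return fold, other


def accept_homology(position_scores, fold_critical_mask, threshold):
    """Accept only if the fold-critical contribution alone reaches the threshold."""
    fold, _ = dissect_score(position_scores, fold_critical_mask)
    return fold >= threshold


# Hypothetical hit whose score is dominated by a non-globular segment.
toy_scores = [2.0, 1.5, 6.0, 5.5, 0.5]
toy_mask = [True, True, False, False, True]   # True marks fold-critical positions
toy_fold, toy_other = dissect_score(toy_scores, toy_mask)
toy_total_passes = (toy_fold + toy_other) >= 10.0
toy_accepted = accept_homology(toy_scores, toy_mask, threshold=10.0)
```

Here the total score clears the hypothetical threshold while the fold-critical part does not, so the hit would be rejected. This is the kind of false homology, driven by transmembrane or low-complexity segments, that the abstract says dissection is meant to suppress.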
Affiliation(s)
- Wing-Cheong Wong
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore 138671, Singapore
|
21
|
Eisenhaber B, Eisenhaber S, Kwang TY, Grüber G, Eisenhaber F. Transamidase subunit GAA1/GPAA1 is a M28 family metallo-peptide-synthetase that catalyzes the peptide bond formation between the substrate protein's omega-site and the GPI lipid anchor's phosphoethanolamine. Cell Cycle 2014; 13:1912-7. [PMID: 24743167 PMCID: PMC4111754 DOI: 10.4161/cc.28761] [Indexed: 11/19/2022]
Abstract
The transamidase subunit GAA1/GPAA1 is predicted to be the enzyme that catalyzes the attachment of the glycosylphosphatidylinositol (GPI) lipid anchor to the carbonyl intermediate of the substrate protein at the ω-site. Its ~300-amino acid residue lumenal domain is a M28 family metallo-peptide-synthetase with an α/β hydrolase fold, including a central 8-strand β-sheet and a single metal (most likely zinc) ion coordinated by 3 conserved polar residues. Phosphoethanolamine is used as an adaptor to make the non-peptide GPI lipid anchor look chemically similar to the N terminus of a peptide.
Affiliation(s)
- Birgit Eisenhaber
- Bioinformatics Institute (BII); A*STAR; Singapore, Republic of Singapore
- Stephan Eisenhaber
- Department of Physical Chemistry; University of Vienna; Wien/Vienna, Republic of Austria
- Toh Yew Kwang
- Bioinformatics Institute (BII); A*STAR; Singapore, Republic of Singapore
- Gerhard Grüber
- Bioinformatics Institute (BII); A*STAR; Singapore, Republic of Singapore; Nanyang Technological University; School of Biological Sciences; Singapore, Republic of Singapore
- Frank Eisenhaber
- Bioinformatics Institute (BII); A*STAR; Singapore, Republic of Singapore; Department of Biological Sciences (DBS); National University of Singapore (NUS); Singapore, Republic of Singapore; School of Computer Engineering (SCE); Nanyang Technological University (NTU); Singapore, Republic of Singapore
|
22
|
Powerful sequence similarity search methods and in-depth manual analyses can identify remote homologs in many apparently "orphan" viral proteins. J Virol 2013; 88:10-20. [PMID: 24155369 PMCID: PMC3911697 DOI: 10.1128/jvi.02595-13] [Indexed: 12/20/2022]
Abstract
The genome sequences of new viruses often contain many "orphan" or "taxon-specific" proteins apparently lacking homologs. However, because viral proteins evolve very fast, commonly used sequence similarity detection methods such as BLAST may overlook homologs. We analyzed a data set of proteins from RNA viruses characterized as "genus specific" by BLAST. More powerful methods developed recently, such as HHblits or HHpred (available through web-based, user-friendly interfaces), could detect distant homologs of a quarter of these proteins, suggesting that these methods should be used to annotate viral genomes. In-depth manual analyses of a subset of the remaining sequences, guided by contextual information such as taxonomy, gene order, or domain cooccurrence, identified distant homologs of another third. Thus, a combination of powerful automated methods and manual analyses can uncover distant homologs of many proteins thought to be orphans. We expect these methodological results to be also applicable to cellular organisms, since they generally evolve much more slowly than RNA viruses. As an application, we reanalyzed the genome of a bee pathogen, Chronic bee paralysis virus (CBPV). We could identify homologs of most of its proteins thought to be orphans; in each case, identifying homologs provided functional clues. We discovered that CBPV encodes a domain homologous to the Alphavirus methyltransferase-guanylyltransferase; a putative membrane protein, SP24, with homologs in unrelated insect viruses and insect-transmitted plant viruses having different morphologies (cileviruses, higreviruses, blunerviruses, negeviruses); and a putative virion glycoprotein, ORF2, also found in negeviruses. SP24 and ORF2 are probably major structural components of the virions.
|
23
|
Vyas VK, Ukawala RD, Ghate M, Chintha C. Homology modeling a fast tool for drug discovery: current perspectives. Indian J Pharm Sci 2012. [PMID: 23204616 PMCID: PMC3507339 DOI: 10.4103/0250-474x.102537] [Indexed: 11/22/2022]
Abstract
A major goal of structural biology involves the formation of protein-ligand complexes, in which the protein molecules act energetically in the course of binding. Therefore, an understanding of protein-ligand interactions is very important for structure-based drug design. Lack of knowledge of 3D structures has hindered efforts to understand the binding specificities of ligands with proteins. With advances in modeling software and the growing number of known protein structures, homology modeling is rapidly becoming the method of choice for obtaining 3D coordinates of proteins. Homology modeling is a representation of the similarity of environmental residues at topologically corresponding positions in the reference proteins. In the absence of experimental data, model building on the basis of a known 3D structure of a homologous protein is at present the only reliable method to obtain structural information. Knowledge of the 3D structures of proteins provides invaluable insights into the molecular basis of their functions. The recent advances in homology modeling, particularly in detecting and aligning sequences with template structures, distant homologues, modeling of loops and side chains as well as detecting errors in a model, have contributed to consistent prediction of protein structure, which was not possible even several years ago. This review focuses on the features and role of homology modeling in predicting protein structure and describes current developments in this field, with successful applications at different stages of drug design and discovery.
Affiliation(s)
- V K Vyas
- Department of Pharmaceutical Chemistry, Institute of Pharmacy, Nirma University, Ahmedabad-382 481, India
|
24
|
Rekapalli B, Wuichet K, Peterson GD, Zhulin IB. Dynamics of domain coverage of the protein sequence universe. BMC Genomics 2012; 13:634. [PMID: 23157439 PMCID: PMC3557196 DOI: 10.1186/1471-2164-13-634] [Received: 04/29/2012] [Accepted: 11/11/2012] [Indexed: 01/14/2023]
Abstract
BACKGROUND The currently known protein sequence space consists of millions of sequences in public databases and is rapidly expanding. Assigning sequences to families leads to a better understanding of protein function and the nature of the protein universe. However, a large portion of the current protein space remains unassigned and is referred to as its "dark matter". RESULTS Here we suggest that the true size of "dark matter" is much larger than stated by current definitions. We propose an approach to reducing the size of "dark matter" by identifying and subtracting regions in protein sequences that are not likely to contain any domain. CONCLUSIONS Recent improvements in computational domain modeling result in a decrease, albeit a slow one, in the relative size of "dark matter"; however, its absolute size increases substantially with the growth of sequence data.
Affiliation(s)
- Bhanu Rekapalli
- Joint Institute for Computational Sciences, Oak Ridge National Laboratory - University of Tennessee, Oak Ridge, TN 37831, USA
|
25
|
Bañó-Polo M, Baeza-Delgado C, Orzáez M, Marti-Renom MA, Abad C, Mingarro I. Polar/Ionizable residues in transmembrane segments: effects on helix-helix packing. PLoS One 2012; 7:e44263. [PMID: 22984481 PMCID: PMC3440369 DOI: 10.1371/journal.pone.0044263] [Received: 04/25/2012] [Accepted: 07/31/2012] [Indexed: 01/14/2023]
Abstract
The vast majority of membrane proteins are anchored to biological membranes through hydrophobic α-helices. Sequence analysis of high-resolution membrane protein structures shows that ionizable amino acid residues are present in transmembrane (TM) helices, often with a functional and/or structural role. Here, using as a scaffold the hydrophobic TM domain of the model membrane protein glycophorin A (GpA), we address the consequences of replacing specific residues with ionizable amino acids on TM helix insertion and packing, both in detergent micelles and in biological membranes. Our findings demonstrate that ionizable residues are stably inserted in hydrophobic environments, and tolerated in the dimerization process when oriented toward the lipid face, emphasizing the complexity of protein-lipid interactions in biological membranes.
Affiliation(s)
- Manuel Bañó-Polo
- Departament de Bioquímica i Biologia Molecular, Universitat de València, Burjassot, Spain
- Carlos Baeza-Delgado
- Departament de Bioquímica i Biologia Molecular, Universitat de València, Burjassot, Spain
- Mar Orzáez
- Centro de Investigación Príncipe Felipe, Valencia, Spain
- Marc A. Marti-Renom
- Genome Biology Group, Structural Genomics Team, Centre Nacional d'Anàlisi Genòmic, Barcelona, Spain
- Structural Genomics Group, Center for Genomic Regulation, Barcelona, Spain
- Concepción Abad
- Departament de Bioquímica i Biologia Molecular, Universitat de València, Burjassot, Spain
- Ismael Mingarro
- Departament de Bioquímica i Biologia Molecular, Universitat de València, Burjassot, Spain
|
26
|
Eisenhaber F. A decade after the first full human genome sequencing: when will we understand our own genome? J Bioinform Comput Biol 2012; 10:1271001. [DOI: 10.1142/s0219720012710011] [Indexed: 01/03/2023]
Abstract
The contrast between the pomp of celebrating the first full human genome sequencing in 2000 and the cautious tone of recollections a decade thereafter could hardly be greater. The promises with regard to medical cures and biotechnology applications have not been realized anywhere near expectations. Understanding the human genome means knowing the genes' and proteins' functions and their interconnectedness via biomolecular mechanisms. This article estimates how long it will take to achieve this goal if we extrapolate from the previous decade (indeed, a century!) and considers the possible disruptive trends in science, technology and society that may accelerate the pace of progress dramatically.
Affiliation(s)
- Frank Eisenhaber
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore 138671, Singapore
- Department of Biological Sciences (DBS), National University of Singapore (NUS), 8 Medical Drive, Singapore 117597, Singapore
- School of Computer Engineering (SCE), Nanyang Technological University (NTU), 50 Nanyang Drive, Singapore 637553, Singapore
|
27
|
Abstract
Transmembrane helical segments (TMs) can be classified into two groups, so-called ‘simple’ and ‘complex’ TMs. Whereas the first group represents mere hydrophobic anchors with an overrepresentation of aliphatic hydrophobic residues that is likely attributable to convergent evolution in many cases, the complex ones embody ancestral information and tend to have structural and functional roles beyond just membrane immersion. Hence, the sequence homology concept does not apply to simple TMs. In practice, these simple TMs can attract statistically significant but evolutionarily unrelated hits during similarity searches (whether through BLAST- or HMM-based approaches). This is especially problematic for membrane proteins that contain both globular segments and TMs. We have therefore developed the transmembrane helix: simple or complex (TMSOC) webserver for the identification of simple and complex TMs. Masking simple TM segments in seed sequences prior to sequence similarity searches decreases the false-discovery rate without sacrificing sensitivity. TMSOC is thus a novel and necessary sequence analytic tool for both experimentalists and the computational biology community working on membrane proteins. It is freely accessible at http://tmsoc.bii.a-star.edu.sg or available for download.
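The masking step described in this abstract can be illustrated with a minimal sketch. It does not reproduce how TMSOC classifies segments; it assumes the coordinates of the simple TM segments are already known (for example, from a classifier's output) and only shows replacing those residues with a mask character before a similarity search. The function name `mask_segments`, the 1-based inclusive coordinate convention, and the example sequence are all hypothetical.

```python
from typing import Iterable, Tuple


def mask_segments(
    sequence: str,
    segments: Iterable[Tuple[int, int]],
    mask_char: str = "X",
) -> str:
    """Return `sequence` with every residue inside the given segments masked.

    `segments` holds 1-based, inclusive (start, end) pairs. This coordinate
    convention is an assumption made for this sketch, not taken from TMSOC.
    """
    residues = list(sequence)
    for start, end in segments:
        if not (1 <= start <= end <= len(residues)):
            raise ValueError(f"segment ({start}, {end}) is outside the sequence")
        for i in range(start - 1, end):
            residues[i] = mask_char
    return "".join(residues)


# Hypothetical seed sequence: a poly-leucine stretch stands in for a
# 'simple' TM segment flanked by two short soluble regions.
n_term = "MKTAYIAKQR"
simple_tm = "L" * 20
c_term = "GSDEKR"
seed = n_term + simple_tm + c_term

# Mask residues 11..30 (the assumed simple TM) before running a search.
masked_seed = mask_segments(seed, [(11, 30)])
print(masked_seed)
```

The masked sequence keeps its original length, so hit coordinates reported by a downstream search still map directly onto the unmasked seed.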
Affiliation(s)
- Wing-Cheong Wong
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore 138671, School of Biological Sciences (SBS), Nanyang Technological University (NTU), 60 Nanyang Drive, Singapore 637551, Department of Biological Sciences (DBS), National University of Singapore (NUS), 8 Medical Drive, Singapore 117597 and School of Computer Engineering (SCE), Nanyang Technological University (NTU), 50 Nanyang Drive, Singapore 637553
- *To whom correspondence should be addressed. Tel: +65 64788305; Fax: +65 64789047
- Sebastian Maurer-Stroh
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore 138671, School of Biological Sciences (SBS), Nanyang Technological University (NTU), 60 Nanyang Drive, Singapore 637551, Department of Biological Sciences (DBS), National University of Singapore (NUS), 8 Medical Drive, Singapore 117597 and School of Computer Engineering (SCE), Nanyang Technological University (NTU), 50 Nanyang Drive, Singapore 637553
- Georg Schneider
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore 138671, School of Biological Sciences (SBS), Nanyang Technological University (NTU), 60 Nanyang Drive, Singapore 637551, Department of Biological Sciences (DBS), National University of Singapore (NUS), 8 Medical Drive, Singapore 117597 and School of Computer Engineering (SCE), Nanyang Technological University (NTU), 50 Nanyang Drive, Singapore 637553
- Frank Eisenhaber
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore 138671, School of Biological Sciences (SBS), Nanyang Technological University (NTU), 60 Nanyang Drive, Singapore 637551, Department of Biological Sciences (DBS), National University of Singapore (NUS), 8 Medical Drive, Singapore 117597 and School of Computer Engineering (SCE), Nanyang Technological University (NTU), 50 Nanyang Drive, Singapore 637553
- *To whom correspondence should be addressed. Tel: +65 64788305; Fax: +65 64789047
|