1
|
Romero MF, Krall JB, Nichols PJ, Vantreeck J, Henen MA, Dejardin E, Schulz F, Vicens Q, Vögeli B, Diallo MA. Novel Z-DNA binding domains in giant viruses. J Biol Chem 2024:107504. [PMID: 38944123 DOI: 10.1016/j.jbc.2024.107504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Revised: 06/15/2024] [Accepted: 06/18/2024] [Indexed: 07/01/2024] Open
Abstract
Z-nucleic acid structures play vital roles in cellular processes and have implications in innate immunity due to their recognition by Zα domain-containing proteins (Z-DNA/Z-RNA binding proteins, ZBPs). Although Zα domains have been identified in six proteins, including viral E3L, ORF112, and I73R, as well as cellular ADAR1, ZBP1, and PKZ, their prevalence across living organisms remains largely unexplored. In this study, we introduce a computational approach to predict Zα domains, leading to the revelation of previously unidentified Zα domain-containing proteins in eukaryotic organisms, including non-metazoan species. Our findings encompass the discovery of new ZBPs in previously unexplored giant viruses, members of the Nucleocytoviricota phylum. Through experimental validation, we confirm the Zα functionality of select proteins, establishing their capability to induce the B-to-Z conversion. Additionally, we identify Zα-like domains within bacterial proteins. While these domains share certain features with Zα domains, they lack the ability to bind to Z-nucleic acids or facilitate the B-to-Z DNA conversion. Our findings significantly expand the ZBP family across a wide spectrum of organisms and raise intriguing questions about the evolutionary origins of Zα-containing proteins. Moreover, our study offers fresh perspectives on the functional significance of Zα domains in virus sensing and innate immunity and opens avenues for exploring hitherto undiscovered functions of ZBPs.
Affiliation(s)
- Miguel F Romero
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Jeffrey B Krall
- Department of Biochemistry and Molecular Genetics, University of Colorado at Denver
- Parker J Nichols
- Department of Biochemistry and Molecular Genetics, University of Colorado at Denver
- Jillian Vantreeck
- Department of Biochemistry and Molecular Genetics, University of Colorado at Denver
- Morkos A Henen
- Department of Biochemistry and Molecular Genetics, University of Colorado at Denver
- Emmanuel Dejardin
- GIGA I3 - Molecular Immunology and Signal Transduction, University of Liège, Liège, B-4000, Belgium
- Frederik Schulz
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Quentin Vicens
- Department of Biology and Biochemistry, Center for Nuclear Receptors and Cell Signaling, University of Houston, Houston, Texas 77204
- Beat Vögeli
- Department of Biochemistry and Molecular Genetics, University of Colorado at Denver
- Mamadou Amadou Diallo
- GIGA I3 - Molecular Immunology and Signal Transduction, University of Liège, Liège, B-4000, Belgium
|
2
|
Liu W, Wang Z, You R, Xie C, Wei H, Xiong Y, Yang J, Zhu S. PLMSearch: Protein language model powers accurate and fast sequence search for remote homology. Nat Commun 2024; 15:2775. [PMID: 38555371 PMCID: PMC10981738 DOI: 10.1038/s41467-024-46808-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2023] [Accepted: 03/08/2024] [Indexed: 04/02/2024] Open
Abstract
Homologous protein search is one of the most commonly used methods for protein annotation and analysis. Compared to structure search, detecting distant evolutionary relationships from sequences alone remains challenging. Here we propose PLMSearch (Protein Language Model), a homologous protein search method with only sequences as input. PLMSearch uses deep representations from a pre-trained protein language model and trains the similarity prediction model with a large number of real structure similarities. This enables PLMSearch to capture the remote homology information concealed behind the sequences. Extensive experimental results show that PLMSearch can search millions of query-target protein pairs in seconds like MMseqs2 while increasing the sensitivity by more than threefold, and is comparable to state-of-the-art structure search methods. In particular, unlike traditional sequence search methods, PLMSearch can recall most remote homology pairs with dissimilar sequences but similar structures. PLMSearch is freely available at https://dmiip.sjtu.edu.cn/PLMSearch.
Affiliation(s)
- Wei Liu
- Institute of Science and Technology for Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, 200433, Shanghai, China
- Ziye Wang
- Institute of Science and Technology for Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, 200433, Shanghai, China
- Ronghui You
- Institute of Science and Technology for Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, 200433, Shanghai, China
- Chenghan Xie
- School of Mathematical Sciences, Fudan University, 200433, Shanghai, China
- Hong Wei
- School of Mathematical Sciences, Nankai University, 300071, Tianjin, China
- Yi Xiong
- Department of Bioinformatics and Biostatistics, Shanghai Jiao Tong University, 200240, Shanghai, China
- Jianyi Yang
- Ministry of Education Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Science, Shandong University, 266237, Qingdao, China.
- Shanfeng Zhu
- Institute of Science and Technology for Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, 200433, Shanghai, China.
- Shanghai Qi Zhi Institute, Shanghai, China.
- Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai, China.
- Shanghai Key Lab of Intelligent Information Processing and Shanghai Institute of Artificial Intelligence Algorithm, Fudan University, Shanghai, China.
- Zhangjiang Fudan International Innovation Center, Shanghai, China.
|
3
|
Guo Z, Wang Y, Ou G. Utilizing the scale-invariant feature transform algorithm to align distance matrices facilitates systematic protein structure comparison. Bioinformatics 2024; 40:btae064. [PMID: 38318777 PMCID: PMC10924749 DOI: 10.1093/bioinformatics/btae064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 01/08/2024] [Accepted: 02/02/2024] [Indexed: 02/07/2024] Open
Abstract
MOTIVATION: Protein structure comparison is pivotal for deriving homological relationships, elucidating protein functions, and understanding evolutionary developments. The burgeoning field of in-silico protein structure prediction now yields billions of models with near-experimental accuracy, necessitating sophisticated tools for discerning structural similarities among proteins, particularly when sequence similarity is limited. RESULTS: In this article, we have developed the align distance matrix with scale (ADAMS) pipeline, which synergizes the distance matrix alignment method with the scale-invariant feature transform algorithm, streamlining protein structure comparison on a proteomic scale. Utilizing a computer vision-centric strategy for contrasting disparate distance matrices, ADAMS adeptly alleviates challenges associated with proteins characterized by a high degree of structural flexibility. Our findings indicate that ADAMS achieves a level of performance and accuracy on par with Foldseek, while maintaining similar speed. Crucially, ADAMS overcomes certain limitations of Foldseek in handling structurally flexible proteins, establishing it as an efficacious tool for in-depth protein structure analysis with heightened accuracy. AVAILABILITY: ADAMS can be downloaded and used as a Python package from the Python Package Index (PyPI): adams · PyPI. Source code and other materials are available from young55775/ADAMS-developing (github.com). An online server is available: Bseek Search Server (cryonet.ai).
Affiliation(s)
- Zhengyang Guo
- Tsinghua-Peking Center for Life Sciences, Beijing Frontier Research Center for Biological Structure, McGovern Institute for Brain Research, State Key Laboratory of Membrane Biology, School of Life Sciences and MOE Key Laboratory for Protein Science, Tsinghua University, Beijing 100084, China
- Yang Wang
- Tsinghua-Peking Center for Life Sciences, Beijing Frontier Research Center for Biological Structure, McGovern Institute for Brain Research, State Key Laboratory of Membrane Biology, School of Life Sciences and MOE Key Laboratory for Protein Science, Tsinghua University, Beijing 100084, China
- Guangshuo Ou
- Tsinghua-Peking Center for Life Sciences, Beijing Frontier Research Center for Biological Structure, McGovern Institute for Brain Research, State Key Laboratory of Membrane Biology, School of Life Sciences and MOE Key Laboratory for Protein Science, Tsinghua University, Beijing 100084, China
|
4
|
van Kempen M, Kim SS, Tumescheit C, Mirdita M, Lee J, Gilchrist CLM, Söding J, Steinegger M. Fast and accurate protein structure search with Foldseek. Nat Biotechnol 2024; 42:243-246. [PMID: 37156916 PMCID: PMC10869269 DOI: 10.1038/s41587-023-01773-0] [Citation(s) in RCA: 240] [Impact Index Per Article: 240.0] [Received: 02/17/2022] [Accepted: 03/30/2023] [Indexed: 05/10/2023]
Abstract
As structure prediction methods are generating millions of publicly available protein structures, searching these databases is becoming a bottleneck. Foldseek aligns the structure of a query protein against a database by describing tertiary amino acid interactions within proteins as sequences over a structural alphabet. Foldseek decreases computation times by four to five orders of magnitude with 86%, 88% and 133% of the sensitivities of Dali, TM-align and CE, respectively.
Affiliation(s)
- Michel van Kempen
- Quantitative and Computational Biology Group, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Stephanie S Kim
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Milot Mirdita
- Quantitative and Computational Biology Group, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Jeongjae Lee
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Johannes Söding
- Quantitative and Computational Biology Group, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany.
- Campus Institute Data Science (CIDAS), Göttingen, Germany.
- Martin Steinegger
- School of Biological Sciences, Seoul National University, Seoul, South Korea.
- Artificial Intelligence Institute, Seoul National University, Seoul, South Korea.
- Institute of Molecular Biology and Genetics, Seoul National University, Seoul, South Korea.
|
5
|
Petrovskiy DV, Nikolsky KS, Rudnev VR, Kulikova LI, Butkova TV, Malsagova KA, Kopylov AT, Kaysheva AL. SAFoldNet: A Novel Tool for Discovering and Aligning Three-Dimensional Protein Structures Based on a Neural Network. Int J Mol Sci 2023; 24:14439. [PMID: 37833886 PMCID: PMC10572457 DOI: 10.3390/ijms241914439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Revised: 09/15/2023] [Accepted: 09/19/2023] [Indexed: 10/15/2023] Open
Abstract
The development and improvement of methods for comparing and searching for three-dimensional protein structures remain urgent tasks in modern structural biology. To solve this problem, we developed a new tool, SAFoldNet, which allows for searching, aligning, superimposing, and determining the exact coordinates of fragments of protein structures. The proposed search and alignment tool was built using neural networking. Specifically, we implemented the integrative synergy of neural network predictions and the well-known BLAST algorithm for searching and aligning sequences. The proposed method involves multistage processing, comprising a stage for converting the geometry of protein structures into sequences of a structural alphabet using a neural network, a search stage for forming a set of candidate structures, and a refinement stage for calculating the structural alignment and overlap and evaluating the similarity with the starting structure of the search. The effectiveness and practical applicability of the proposed tool were compared with those of several widely used services for searching and aligning protein structures. The results of the comparisons confirmed that the proposed method is effective and competitive relative to the available modern services. Furthermore, using the proposed approach, a service with a user-friendly web interface was developed, which allows for searching, aligning, and superimposing protein structures; determining the location of protein fragments; mapping onto a protein molecule chain; and providing structural similarity metrics (expected value and root mean square deviation).
Affiliation(s)
- Kristina A. Malsagova
- Institute of Biomedical Chemistry, 119121 Moscow, Russia; (D.V.P.); (K.S.N.); (V.R.R.); (L.I.K.); (T.V.B.); (A.T.K.); (A.L.K.)
|
6
|
Bordin N, Dallago C, Heinzinger M, Kim S, Littmann M, Rauer C, Steinegger M, Rost B, Orengo C. Novel machine learning approaches revolutionize protein knowledge. Trends Biochem Sci 2023; 48:345-359. [PMID: 36504138 PMCID: PMC10570143 DOI: 10.1016/j.tibs.2022.11.001] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 07/14/2022] [Revised: 10/24/2022] [Accepted: 11/17/2022] [Indexed: 12/10/2022]
Abstract
Breakthrough methods in machine learning (ML), protein structure prediction, and novel ultrafast structural aligners are revolutionizing structural biology. Obtaining accurate models of proteins and annotating their functions on a large scale is no longer limited by time and resources. The most recent method to be top ranked by the Critical Assessment of Structure Prediction (CASP) assessment, AlphaFold 2 (AF2), is capable of building structural models with an accuracy comparable to that of experimental structures. Annotations of 3D models are keeping pace with the deposition of the structures due to advancements in protein language models (pLMs) and structural aligners that help validate these transferred annotations. In this review we describe how recent developments in ML for protein science are making large-scale structural bioinformatics available to the general scientific community.
Affiliation(s)
- Nicola Bordin
- Institute of Structural and Molecular Biology, University College London, Gower St, WC1E 6BT London, UK
- Christian Dallago
- Technical University of Munich (TUM) Department of Informatics, Bioinformatics and Computational Biology - i12, Boltzmannstr. 3, 85748 Garching/Munich, Germany; VantAI, 151 W 42nd Street, New York, NY 10036, USA
- Michael Heinzinger
- Technical University of Munich (TUM) Department of Informatics, Bioinformatics and Computational Biology - i12, Boltzmannstr. 3, 85748 Garching/Munich, Germany; TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748 Garching, Germany
- Stephanie Kim
- School of Biological Sciences, Seoul National University, Seoul, South Korea; Artificial Intelligence Institute, Seoul National University, Seoul, South Korea
- Maria Littmann
- Technical University of Munich (TUM) Department of Informatics, Bioinformatics and Computational Biology - i12, Boltzmannstr. 3, 85748 Garching/Munich, Germany
- Clemens Rauer
- Institute of Structural and Molecular Biology, University College London, Gower St, WC1E 6BT London, UK
- Martin Steinegger
- School of Biological Sciences, Seoul National University, Seoul, South Korea; Artificial Intelligence Institute, Seoul National University, Seoul, South Korea
- Burkhard Rost
- Technical University of Munich (TUM) Department of Informatics, Bioinformatics and Computational Biology - i12, Boltzmannstr. 3, 85748 Garching/Munich, Germany; Institute for Advanced Study (TUM-IAS), Lichtenbergstr. 2a, 85748 Garching/Munich, Germany; TUM School of Life Sciences Weihenstephan (TUM-WZW), Alte Akademie 8, Freising, Germany
- Christine Orengo
- Institute of Structural and Molecular Biology, University College London, Gower St, WC1E 6BT London, UK.
|
7
|
Structural basis for matriglycan synthesis by the LARGE1 dual glycosyltransferase. PLoS One 2022; 17:e0278713. [PMID: 36512577 PMCID: PMC9746966 DOI: 10.1371/journal.pone.0278713] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/10/2022] [Accepted: 11/21/2022] [Indexed: 12/15/2022] Open
Abstract
LARGE1 is a bifunctional glycosyltransferase responsible for generating a long linear polysaccharide termed matriglycan that links the cytoskeleton and the extracellular matrix and is required for proper muscle function. This matriglycan polymer is made with an alternating pattern of xylose and glucuronic acid monomers. Mutations in the LARGE1 gene have been shown to cause life-threatening dystroglycanopathies through the inhibition of matriglycan synthesis. Despite its major role in muscle maintenance, the structure of the LARGE1 enzyme and how it assembles in the Golgi are unknown. Here we present the structure of LARGE1, obtained by a combination of X-ray crystallography and single-particle cryo-EM. We found that LARGE1 homo-dimerizes in a configuration that is dictated by its coiled-coil stem domain. The structure shows that this enzyme has two canonical GT-A folds within each of its catalytic domains. In the context of its dimeric structure, the two types of catalytic domains are brought into close proximity from opposing monomers to allow efficient shuttling of the substrates between the two domains. Together, with putative retention of matriglycan by electrostatic interactions, this dimeric organization offers a possible mechanism for the ability of LARGE1 to synthesize long matriglycan chains. The structural information further reveals the mechanisms in which disease-causing mutations disrupt the activity of LARGE1. Collectively, these data shed light on how matriglycan is synthesized alongside the functional significance of glycosyltransferase oligomerization.
|
8
|
Kim JS, Born A, Till JKA, Liu L, Kant S, Henen MA, Vögeli B, Vázquez-Torres A. Promiscuity of response regulators for thioredoxin steers bacterial virulence. Nat Commun 2022; 13:6210. [PMID: 36266276 PMCID: PMC9584953 DOI: 10.1038/s41467-022-33983-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/12/2022] [Accepted: 10/11/2022] [Indexed: 12/24/2022] Open
Abstract
The exquisite specificity between a sensor kinase and its cognate response regulator ensures faithful partner selectivity within two-component pairs concurrently firing in a single bacterium, minimizing crosstalk with other members of this conserved family of paralogous proteins. We show that conserved hydrophobic and charged residues on the surface of thioredoxin serve as a docking station for structurally diverse response regulators. Using the OmpR protein, we identify residues in the flexible linker and the C-terminal β-hairpin that enable associations of this archetypical response regulator with thioredoxin, but are dispensable for interactions of this transcription factor to its cognate sensor kinase EnvZ, DNA or RNA polymerase. Here we show that the promiscuous interactions of response regulators with thioredoxin foster the flow of information through otherwise highly dedicated two-component signaling systems, thereby enabling both the transcription of Salmonella pathogenicity island-2 genes as well as growth of this intracellular bacterium in macrophages and mice.
Affiliation(s)
- Ju-Sim Kim
- University of Colorado School of Medicine, Department of Immunology & Microbiology, Aurora, Colorado USA
- Alexandra Born
- University of Colorado School of Medicine, Department of Biochemistry & Molecular Genetics, Aurora, Colorado USA
- James Karl A. Till
- University of Colorado School of Medicine, Department of Immunology & Microbiology, Aurora, Colorado USA
- Lin Liu
- University of Colorado School of Medicine, Department of Immunology & Microbiology, Aurora, Colorado USA
- Sashi Kant
- University of Colorado School of Medicine, Department of Immunology & Microbiology, Aurora, Colorado USA
- Morkos A. Henen
- University of Colorado School of Medicine, Department of Biochemistry & Molecular Genetics, Aurora, Colorado USA
- Faculty of Pharmacy, Mansoura University, Mansoura, 35516 Egypt
- Beat Vögeli
- University of Colorado School of Medicine, Department of Biochemistry & Molecular Genetics, Aurora, Colorado USA
- Andrés Vázquez-Torres
- University of Colorado School of Medicine, Department of Immunology & Microbiology, Aurora, Colorado, USA.
- Veterans Affairs Eastern Colorado Health Care System, Denver, Colorado, USA.
|
9
|
Chen TR, Juan SH, Huang YW, Lin YC, Lo WC. A secondary structure-based position-specific scoring matrix applied to the improvement in protein secondary structure prediction. PLoS One 2021; 16:e0255076. [PMID: 34320027 PMCID: PMC8318245 DOI: 10.1371/journal.pone.0255076] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/21/2020] [Accepted: 07/11/2021] [Indexed: 11/18/2022] Open
Abstract
Protein secondary structure prediction (SSP) has a variety of applications; however, there has been relatively limited improvement in accuracy for years. With a vision of moving forward all related fields, we aimed to make a fundamental advance in SSP. There have been many admirable efforts made to improve the machine learning algorithm for SSP. This work thus took a step back by manipulating the input features. A secondary structure element-based position-specific scoring matrix (SSE-PSSM) is proposed, based on which a new set of machine learning features can be established. The feasibility of this new PSSM was evaluated by rigid independent tests with training and testing datasets sharing <25% sequence identities. In all experiments, the proposed PSSM outperformed the traditional amino acid PSSM. This new PSSM can be easily combined with the amino acid PSSM, and the improvement in accuracy was remarkable. Preliminary tests made by combining the SSE-PSSM and well-known SSP methods showed 2.0% and 5.2% average improvements in three- and eight-state SSP accuracies, respectively. If this PSSM can be integrated into state-of-the-art SSP methods, the overall accuracy of SSP may break the current restriction and eventually bring benefit to all research and applications where secondary structure prediction plays a vital role during development. To facilitate the application and integration of the SSE-PSSM with modern SSP methods, we have established a web server and standalone programs for generating SSE-PSSM available at http://10.life.nctu.edu.tw/SSE-PSSM.
Affiliation(s)
- Teng-Ruei Chen
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Sheng-Hung Juan
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Yu-Wei Huang
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Yen-Cheng Lin
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan
- Department of Biological Science and Technology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Wei-Cheng Lo
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan
- Department of Biological Science and Technology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- The Center for Bioinformatics Research, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
|
10
|
Lange A, Patel PH, Heames B, Damry AM, Saenger T, Jackson CJ, Findlay GD, Bornberg-Bauer E. Structural and functional characterization of a putative de novo gene in Drosophila. Nat Commun 2021; 12:1667. [PMID: 33712569 PMCID: PMC7954818 DOI: 10.1038/s41467-021-21667-6] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 01/09/2020] [Accepted: 02/03/2021] [Indexed: 11/26/2022] Open
Abstract
Comparative genomic studies have repeatedly shown that new protein-coding genes can emerge de novo from noncoding DNA. Still unknown is how and when the structures of encoded de novo proteins emerge and evolve. Combining biochemical, genetic and evolutionary analyses, we elucidate the function and structure of goddard, a gene which appears to have evolved de novo at least 50 million years ago within the Drosophila genus. Previous studies found that goddard is required for male fertility. Here, we show that Goddard protein localizes to elongating sperm axonemes and that in its absence, elongated spermatids fail to undergo individualization. Combining modelling, NMR and circular dichroism (CD) data, we show that Goddard protein contains a large central α-helix, but is otherwise partially disordered. We find similar results for Goddard's orthologs from divergent fly species and their reconstructed ancestral sequences. Accordingly, Goddard's structure appears to have been maintained with only minor changes over millions of years.
Affiliation(s)
- Andreas Lange
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Prajal H Patel
- Department of Biology, College of the Holy Cross, Worcester, MA, USA
- Brennen Heames
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Adam M Damry
- Research School of Chemistry, ANU College of Science, Canberra, Australia
- Thorsten Saenger
- Department of Pediatric Kidney, Liver and Metabolic Diseases, Hannover Medical School, Hannover, Germany
- Colin J Jackson
- Research School of Chemistry, ANU College of Science, Canberra, Australia
- Erich Bornberg-Bauer
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany.
|
11
|
Thongkawphueak T, Winter AJ, Williams C, Maple HJ, Soontaranon S, Kaewhan C, Campopiano DJ, Crump MP, Wattana-Amorn P. Solution Structure and Conformational Dynamics of a Doublet Acyl Carrier Protein from Prodigiosin Biosynthesis. Biochemistry 2021; 60:219-230. [PMID: 33416314 DOI: 10.1021/acs.biochem.0c00830] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 02/06/2023]
Abstract
The acyl carrier protein (ACP) is an indispensable component of both fatty acid and polyketide synthases and is primarily responsible for delivering acyl intermediates to enzymatic partners. At present, increasing numbers of multidomain ACPs have been discovered with roles in molecular recognition of trans-acting enzymatic partners as well as increasing metabolic flux. Further structural information is required to provide insight into their function, yet to date, the only high-resolution structure of this class to be determined is that of the doublet ACP (two continuous ACP domains) from mupirocin synthase. Here we report the solution nuclear magnetic resonance (NMR) structure of the doublet ACP domains from PigH (PigH ACP1-ACP2), which is an enzyme that catalyzes the formation of the bipyrrolic intermediate of prodigiosin, a potent anticancer compound with a variety of biological activities. The PigH ACP1-ACP2 structure shows each ACP domain consists of three conserved helices connected by a linker that is partially restricted by interactions with the ACP1 domain. Analysis of the holo (4'-phosphopantetheine, 4'-PP) form of PigH ACP1-ACP2 by NMR revealed conformational exchange found predominantly in the ACP2 domain reflecting the inherent plasticity of this ACP. Furthermore, ensemble models obtained from SAXS data reveal two distinct conformers, bent and extended, of both apo (unmodified) and holo PigH ACP1-ACP2 mediated by the central linker. The bent conformer appears to be a result of linker-ACP1 interactions detected by NMR and might be important for intradomain communication during the biosynthesis. These results provide new insights into the behavior of the interdomain linker of multiple ACP domains that may modulate protein-protein interactions. This is likely to become an increasingly important consideration for metabolic engineering in prodigiosin and other related biosynthetic pathways.
Affiliation(s)
- Thitapa Thongkawphueak
- Department of Chemistry, Special Research Unit for Advanced Magnetic Resonance and Center of Excellence for Innovation in Chemistry, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
| | - Ashley J Winter
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, U.K
| | - Christopher Williams
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, U.K.,BrisSynBio, Centre for Synthetic Biology Research, Life Sciences Building, Tyndall Avenue, University of Bristol, Bristol BS8 1TQ, U.K
| | - Hannah J Maple
- School of Social and Community Medicine, University of Bristol, Oakfield House, Bristol BS8 2BN, U.K
| | - Siriwat Soontaranon
- Synchrotron Light Research Institute (Public Organization), Nakhon Ratchasima 30000, Thailand
| | - Chonthicha Kaewhan
- Synchrotron Light Research Institute (Public Organization), Nakhon Ratchasima 30000, Thailand
| | - Dominic J Campopiano
- School of Chemistry, University of Edinburgh, Joseph Black Building, David Brewster Road, Edinburgh EH9 3FJ, U.K
| | - Matthew P Crump
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, U.K.,BrisSynBio, Centre for Synthetic Biology Research, Life Sciences Building, Tyndall Avenue, University of Bristol, Bristol BS8 1TQ, U.K
| | - Pakorn Wattana-Amorn
- Department of Chemistry, Special Research Unit for Advanced Magnetic Resonance and Center of Excellence for Innovation in Chemistry, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
|
12
|
Reddy PK, Pullepu D, Dhabalia D, Udaya Prakash SM, Kabir MA. CSU57 encodes a novel repressor of sorbose utilization in opportunistic human fungal pathogen Candida albicans. Yeast 2020; 38:222-238. [PMID: 33179314 DOI: 10.1002/yea.3537] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/15/2020] [Revised: 11/04/2020] [Accepted: 11/05/2020] [Indexed: 11/11/2022] Open
Abstract
Human fungal pathogen Candida albicans cannot utilize L-sorbose as a sole carbon source. However, chromosome 5 monosomic strains can grow on sorbose as repressors present on this chromosome get diminished, allowing the expression of the sorbose utilization gene (SOU1) located on chromosome 4. Functional identification of these repressors has been a difficult task as they are scattered on a large portion of the right arm of chromosome 5. Herein, we have applied the telomere-mediated chromosomal truncation approach to identify a novel repressor for sorbose utilization in this pathogen. Multiple systematic chromosomal truncations were performed on the right arm of Chr5 in the background of csu51∆/CSU51 to minimize the functional region to a 6-kb chromosomal stretch. Further, a truncation that removes part of Orf19.3942 strongly suggested its role in sorbose utilization. However, compelling evidence comes from the observation that truncation at the 1,044.288-kb position of Chr5 in the strain csu51∆/CSU51 orf19.3942∆/Orf.19.3942 produced a Sou+ phenotype; otherwise, the strain remains Sou-. This confirms the role of Orf.19.3942 in the regulation of sorbose utilization, and the gene is designated CSU57. Comparison of SOU1 gene expression of Sou+ strains with wild type suggested its role at the transcriptional level. A strain carrying a double disruption of CSU57 remains Sou-. Co-overexpression of SOU1 and CSU57 together does not make the recipient strain Sou-; however, multiple tandem copies of CSU57 produced diminished growth compared with control, suggesting that it is a weak repressor. Taken together, we report that CSU57 encodes a novel repressor of L-sorbose utilization in this pathogen. TAKE AWAY: CSU57 encodes a repressor for L-sorbose utilization in Candida albicans. Csu57p acts in combination with Csu51p and other regulators. Csu57p exerts its repressing effect at the transcriptional level of the SOU1 gene. Utilization of sorbose positively correlates with the expression of the SOU1 gene. Multiple copies of CSU57 can partially suppress the Sou+ phenotype.
Affiliation(s)
- Praveen Kumar Reddy
- Molecular Genetics Laboratory, School of Biotechnology, National Institute of Technology Calicut, Calicut, India
| | - Dileep Pullepu
- Molecular Genetics Laboratory, School of Biotechnology, National Institute of Technology Calicut, Calicut, India
| | - Darshan Dhabalia
- Molecular Genetics Laboratory, School of Biotechnology, National Institute of Technology Calicut, Calicut, India
| | | | - Mohammad Anaul Kabir
- Molecular Genetics Laboratory, School of Biotechnology, National Institute of Technology Calicut, Calicut, India
|
13
|
Matsuyama K, Kondo T, Igarashi K, Sakamoto T, Ishimaru M. Substrate-recognition mechanism of tomato β-galactosidase 4 using X-ray crystallography and docking simulation. PLANTA 2020; 252:72. [PMID: 33011862 DOI: 10.1007/s00425-020-03481-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/08/2020] [Accepted: 09/22/2020] [Indexed: 06/11/2023]
Abstract
TBG4 recognizes substrates with multiple linkage types because it has a spatially wide subsite +1. This feature allows the degradation of AGI, AGII, and AGP, leading to fruit ripening. β-galactosidase (EC 3.2.1.23) catalyzes the hydrolysis of β-galactan and the release of D-galactose. Tomato has at least 17 β-galactosidases (TBGs), of which TBG4 is responsible for fruit ripening. TBG4 hydrolyzes not only β-1,4-bound galactans, but also β-1,3- and β-1,6-galactans. In this study, we compared each enzyme-substrate complex using X-ray crystallography, ensemble refinement, and docking simulation to understand the broad substrate specificity of TBG4. In subsite -1, most interactions were conserved across each linkage type of galactobiose; however, some differences were seen in subsite +1, owing to the large volume of the catalytic pocket. In addition, docking simulation indicated that TBG4 may have more positive subsites to recognize and hydrolyze longer galactans. Taken together, our results indicate that during tomato fruit ripening, TBG4 plays an important role by degrading arabinogalactan I (AGI), arabinogalactan II (AGII), and the carbohydrate moiety of arabinogalactan protein (AGP).
Affiliation(s)
- Kaori Matsuyama
- Department of Biomaterial Sciences, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Bunkyo-ku, Tokyo, 113-8657, Japan
- Faculty of Biology-Oriented Science and Technology, Kindai University, 930 Nishimitani, Kinokawa, Wakayama, 649-6493, Japan
| | - Tatsuya Kondo
- Division of Applied Life Sciences, Graduate School of Life and Environmental Sciences, Osaka Prefecture University, 1-1 Gakuencho, Naka-ku, Sakai, Osaka, 599-8531, Japan
| | - Kiyohiko Igarashi
- Department of Biomaterial Sciences, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Bunkyo-ku, Tokyo, 113-8657, Japan
| | - Tatsuji Sakamoto
- Division of Applied Life Sciences, Graduate School of Life and Environmental Sciences, Osaka Prefecture University, 1-1 Gakuencho, Naka-ku, Sakai, Osaka, 599-8531, Japan
| | - Megumi Ishimaru
- Faculty of Biology-Oriented Science and Technology, Kindai University, 930 Nishimitani, Kinokawa, Wakayama, 649-6493, Japan.
|
14
|
Nakayama M, Miyagawa H, Kuranami Y, Tsunooka-Ota M, Yamaguchi Y, Kojima-Aikawa K. Annexin A4 inhibits sulfatide-induced activation of coagulation factor XII. J Thromb Haemost 2020; 18:1357-1369. [PMID: 32145147 DOI: 10.1111/jth.14789] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/03/2019] [Revised: 02/24/2020] [Accepted: 03/02/2020] [Indexed: 12/28/2022]
Abstract
BACKGROUND: Factor XII (FXII) is a plasma serine protease that initiates the intrinsic pathway of blood coagulation upon contact with anionic substances, such as the sulfated glycolipid sulfatide. Annexins (ANXs) have been implicated in the regulation of the blood coagulation reaction by binding to anionic surfaces composed of phospholipids and sulfated glycoconjugates, but their physiological importance is only partially understood. OBJECTIVE: To test the hypothesis that ANXs are involved in suppressing the intrinsic pathway initiated by sulfatide, we examined the effect of eight recombinant ANX proteins on the intrinsic coagulation reaction and their sulfatide binding activities. METHODS: Recombinant ANXs were prepared in Escherichia coli expression systems and their anticoagulant effects on the intrinsic pathway initiated by sulfatide were examined using plasma clotting assay and chromogenic assay. ANXA4 active sites were identified by alanine scanning and fold deletion in the core domain. RESULTS AND CONCLUSIONS: We found that ANXA3, ANXA4, and ANXA5 strongly inhibited sulfatide-induced plasma coagulation. Wild-type and mutated ANXA4 were used to clarify the molecular mechanism involved in inhibition. ANXA4 inhibited sulfatide-induced auto-activation of FXII to FXIIa and the conversion of its natural substrate FXI to FXIa but showed no effect on the protease activity of FXIIa or FXIa. Alanine scanning showed that substitution of the Ca2+-binding amino acid residue in the fourth fold of the core domain of ANXA4 reduced anticoagulant activity, and deletion of the entire fourth fold of the core domain resulted in complete loss of anticoagulant activity.
Affiliation(s)
- Moeka Nakayama
- Division of Advanced Sciences, Graduate School of Humanities and Sciences, Ochanomizu University, Tokyo, Japan
- Program for Leading Graduate Schools, Ochanomizu University, Tokyo, Japan
| | - Hitomi Miyagawa
- Division of Advanced Sciences, Graduate School of Humanities and Sciences, Ochanomizu University, Tokyo, Japan
| | - Yumiko Kuranami
- Division of Advanced Sciences, Graduate School of Humanities and Sciences, Ochanomizu University, Tokyo, Japan
| | - Miyuki Tsunooka-Ota
- Division of Advanced Sciences, Graduate School of Humanities and Sciences, Ochanomizu University, Tokyo, Japan
| | - Yoshiki Yamaguchi
- Synthetic Cellular Chemistry Laboratory, RIKEN, Saitama, Japan
- Laboratory of Pharmaceutical Physical Chemistry, Tohoku Medical and Pharmaceutical University, Miyagi, Japan
| | - Kyoko Kojima-Aikawa
- Natural Science Division, Faculty of Core Research, Ochanomizu University, Tokyo, Japan
- Institute for Human Life Innovation, Ochanomizu University, Tokyo, Japan
|
15
|
Tenorio CA, Longo LM, Parker JB, Lee J, Blaber M. Ab initio folding of a trefoil-fold motif reveals structural similarity with a β-propeller blade motif. Protein Sci 2020; 29:1172-1185. [PMID: 32142181 DOI: 10.1002/pro.3850] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 01/04/2020] [Revised: 03/01/2020] [Accepted: 03/03/2020] [Indexed: 01/05/2023]
Abstract
Many protein architectures exhibit evidence of internal rotational symmetry postulated to be the result of gene duplication/fusion events involving a primordial polypeptide motif. A common feature of such structures is a domain-swapped arrangement at the interface of the N- and C-termini motifs and postulated to provide cooperative interactions that promote folding and stability. De novo designed symmetric protein architectures have demonstrated an ability to accommodate circular permutation of the N- and C-termini in the overall architecture; however, the folding requirement of the primordial motif is poorly understood, and tolerance to circular permutation is essentially unknown. The β-trefoil protein fold is a threefold-symmetric architecture where the repeating ~42-mer "trefoil-fold" motif assembles via a domain-swapped arrangement. The trefoil-fold structure in isolation exposes considerable hydrophobic area that is otherwise buried in the intact β-trefoil trimeric assembly. The trefoil-fold sequence is not predicted to adopt the trefoil-fold architecture in ab initio folding studies; rather, the predicted fold is closely related to a compact "blade" motif from the β-propeller architecture. Expression of a trefoil-fold sequence and circular permutants shows that only the wild-type N-terminal motif definition yields an intact β-trefoil trimeric assembly, while permutants yield monomers. The results elucidate the folding requirements of the primordial trefoil-fold motif, and also suggest that this motif may sample a compact conformation that limits hydrophobic residue exposure, contains key trefoil-fold structural features, but is more structurally homologous to a β-propeller blade motif.
Affiliation(s)
- Connie A Tenorio
- Department of Biomedical Sciences, Florida State University, Tallahassee, Florida, USA
| | - Liam M Longo
- Department of Biomedical Sciences, Florida State University, Tallahassee, Florida, USA
| | - Joseph B Parker
- Department of Biomedical Sciences, Florida State University, Tallahassee, Florida, USA
| | - Jihun Lee
- Department of Biomedical Sciences, Florida State University, Tallahassee, Florida, USA
|
16
|
Gao S, Liu H, de Crécy-Lagard V, Zhu W, Richards NGJ, Naismith JH. PMP-diketopiperazine adducts form at the active site of a PLP dependent enzyme involved in formycin biosynthesis. Chem Commun (Camb) 2019; 55:14502-14505. [PMID: 31730149 PMCID: PMC6927412 DOI: 10.1039/c9cc06975e] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 09/06/2019] [Accepted: 10/16/2019] [Indexed: 01/04/2023]
Abstract
ForI is a PLP-dependent enzyme from the biosynthetic pathway of the C-nucleoside antibiotic formycin. Cycloserine is thought to inhibit PLP-dependent enzymes by irreversibly forming a PMP-isoxazole. We now report that ForI forms novel PMP-diketopiperazine derivatives following incubation with both d and l cycloserine. This unexpected result suggests chemical diversity in the chemistry of cycloserine inhibition.
Affiliation(s)
- Sisi Gao
- Research Complex at Harwell, Didcot, OX11 0FA, UK
- BSRC, University of St Andrews, St Andrews, KY16 9ST, UK
| | - Huanting Liu
- BSRC, University of St Andrews, St Andrews, KY16 9ST, UK
| | | | - Wen Zhu
- Department of Chemistry and California Institute for Quantitative Biosciences, University of California, Berkeley, CA 94720, USA
| | - Nigel G. J. Richards
- School of Chemistry, Cardiff University, Park Place, Cardiff, CF10 3AT, UK
- Foundation for Applied Molecular Evolution, Alachua, FL 32415, USA
| | - James H. Naismith
- Division of Structural Biology, University of Oxford, Oxford, OX3 7BN, UK
- The Rosalind Franklin Institute, Didcot, OX11 0FA, UK
- State Key Laboratory of Biotherapy, University of Sichuan, China
|
17
|
Faure G, Joseph AP, Craveur P, Narwani TJ, Srinivasan N, Gelly JC, Rebehmed J, de Brevern AG. iPBAvizu: a PyMOL plugin for an efficient 3D protein structure superimposition approach. SOURCE CODE FOR BIOLOGY AND MEDICINE 2019; 14:5. [PMID: 31700529 PMCID: PMC6825713 DOI: 10.1186/s13029-019-0075-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 07/26/2018] [Accepted: 10/14/2019] [Indexed: 11/10/2022]
Abstract
Background: Protein 3D structure is the support of its function. Comparison of 3D protein structures provides insight into their evolution and their functional specificities and can be done efficiently via protein structure superimposition analysis. Multiple approaches have been developed to perform this task; they are often based on a structural superimposition deduced from a sequence alignment, which does not take structural features into account. Our methodology is based on the use of a Structural Alphabet (SA), i.e. a library of 3D local protein prototypes able to approximate the protein backbone. The interest of an SA is that it translates 3D structures into 1D sequences. Results: We used Protein Blocks (PB), a widely used SA consisting of 16 prototypes, each representing a conformation of the pentapeptide skeleton defined in terms of dihedral angles. Proteins are described using PB, from which we have previously developed a sequence alignment procedure based on dynamic programming with a dedicated PB Substitution Matrix. We improved the procedure with a specific two-step search: (i) very similar regions are selected using very high weights and aligned, and (ii) the alignment is completed (if possible) with less stringent parameters. Our approach, iPBA, has been shown to perform better than other available tools in benchmark tests. To facilitate the usage of iPBA, we designed and implemented iPBAvizu, a plugin for PyMOL that allows users to run iPBA in an easy way and analyse protein superimpositions. Conclusions: iPBAvizu is an implementation of iPBA within the well-known and widely used PyMOL software. iPBAvizu enables users to generate iPBA alignments, create and interactively explore structural superimpositions, and assess the quality of the protein alignments.
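The dynamic-programming step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each protein has already been encoded as a string of Protein Block letters (a to p) and performs a single-pass global alignment. The function names, the toy scoring function (+2 for identical blocks, -1 otherwise) and the gap penalty are hypothetical placeholders; they are not the published PB Substitution Matrix, and the sketch does not reproduce iPBA's two-step search.

```python
from typing import Callable, Tuple


def toy_pb_score(a: str, b: str) -> int:
    """Hypothetical stand-in for a PB substitution matrix."""
    return 2 if a == b else -1


def align_pb(
    seq1: str,
    seq2: str,
    score: Callable[[str, str], int] = toy_pb_score,
    gap: int = -2,
) -> Tuple[int, str, str]:
    """Global (Needleman-Wunsch) alignment of two Protein Block strings."""
    n, m = len(seq1), len(seq2)
    # dp[i][j] = best score aligning seq1[:i] with seq2[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(seq1[i - 1], seq2[j - 1]),
                dp[i - 1][j] + gap,   # gap in seq2
                dp[i][j - 1] + gap,   # gap in seq1
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out1, out2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(seq1[i - 1], seq2[j - 1]):
            out1.append(seq1[i - 1])
            out2.append(seq2[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + gap:
            out1.append(seq1[i - 1])
            out2.append("-")
            i -= 1
        else:
            out1.append("-")
            out2.append(seq2[j - 1])
            j -= 1
    return dp[n][m], "".join(reversed(out1)), "".join(reversed(out2))


if __name__ == "__main__":
    print(align_pb("dfklm", "dklm"))
```

In a real setting the `score` argument would be replaced by a lookup into an actual 16 x 16 substitution matrix, and the aligned positions would then drive the 3D superimposition.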
Affiliation(s)
- Guilhem Faure
- INSERM, U 1134, DSIMB, Univ Paris, Univ de la Réunion, Univ des Antilles, F-75739 Paris, France
| | - Agnel Praveen Joseph
- INSERM, U 1134, DSIMB, Univ Paris, Univ de la Réunion, Univ des Antilles, F-75739 Paris, France.,INSERM UMR_S 1134, DSIMB, Université de Paris, Institut National de la Transfusion Sanguine (INTS), 6, rue Alexandre Cabanel, F-75739, Paris cedex 15, France.,Laboratoire d'Excellence GR-Ex, F-75739 Paris, France.,Birkbeck College, University of London, London, UK
| | - Pierrick Craveur
- INSERM, U 1134, DSIMB, Univ Paris, Univ de la Réunion, Univ des Antilles, F-75739 Paris, France.,INSERM UMR_S 1134, DSIMB, Université de Paris, Institut National de la Transfusion Sanguine (INTS), 6, rue Alexandre Cabanel, F-75739, Paris cedex 15, France.,Laboratoire d'Excellence GR-Ex, F-75739 Paris, France.,Molecular Graphics Laboratory, Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037 USA
| | - Tarun J Narwani
- INSERM, U 1134, DSIMB, Univ Paris, Univ de la Réunion, Univ des Antilles, F-75739 Paris, France.,INSERM UMR_S 1134, DSIMB, Université de Paris, Institut National de la Transfusion Sanguine (INTS), 6, rue Alexandre Cabanel, F-75739, Paris cedex 15, France.,Laboratoire d'Excellence GR-Ex, F-75739 Paris, France
| | | | - Jean-Christophe Gelly
- INSERM, U 1134, DSIMB, Univ Paris, Univ de la Réunion, Univ des Antilles, F-75739 Paris, France.,INSERM UMR_S 1134, DSIMB, Université de Paris, Institut National de la Transfusion Sanguine (INTS), 6, rue Alexandre Cabanel, F-75739, Paris cedex 15, France.,Laboratoire d'Excellence GR-Ex, F-75739 Paris, France
| | - Joseph Rebehmed
- INSERM, U 1134, DSIMB, Univ Paris, Univ de la Réunion, Univ des Antilles, F-75739 Paris, France.,INSERM UMR_S 1134, DSIMB, Université de Paris, Institut National de la Transfusion Sanguine (INTS), 6, rue Alexandre Cabanel, F-75739, Paris cedex 15, France.,Laboratoire d'Excellence GR-Ex, F-75739 Paris, France.,Department of Computer Science and Mathematics, Lebanese American University, Beirut, Lebanon
| | - Alexandre G de Brevern
- INSERM, U 1134, DSIMB, Univ Paris, Univ de la Réunion, Univ des Antilles, F-75739 Paris, France.,INSERM UMR_S 1134, DSIMB, Université de Paris, Institut National de la Transfusion Sanguine (INTS), 6, rue Alexandre Cabanel, F-75739, Paris cedex 15, France.,Laboratoire d'Excellence GR-Ex, F-75739 Paris, France
|
18
|
Acetyl-CoA-mediated activation of Mycobacterium tuberculosis isocitrate lyase 2. Nat Commun 2019; 10:4639. [PMID: 31604954 PMCID: PMC6788997 DOI: 10.1038/s41467-019-12614-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 08/28/2018] [Accepted: 09/18/2019] [Indexed: 11/25/2022] Open
Abstract
Isocitrate lyase is important for lipid utilisation by Mycobacterium tuberculosis but its ICL2 isoform is poorly understood. Here we report that binding of the lipid metabolites acetyl-CoA or propionyl-CoA to ICL2 induces a striking structural rearrangement, substantially increasing isocitrate lyase and methylisocitrate lyase activities. Thus, ICL2 plays a pivotal role regulating carbon flux between the tricarboxylic acid (TCA) cycle, glyoxylate shunt and methylcitrate cycle at high lipid concentrations, a mechanism essential for bacterial growth and virulence. Isocitrate lyase (ICL) isoforms 1 and 2 are enzymes in the glyoxylate and methylcitrate cycles that enable Mycobacterium tuberculosis (Mtb) to use lipids as a carbon source. Here the authors present the ligand-free Mtb ICL2 and acetyl-CoA bound ICL2 crystal structures, which reveal a structural reorganisation upon acetyl-CoA binding that leads to an activation of its isocitrate lyase and methylcitrate lyase activities.
|
19
|
Dong R, Pan S, Peng Z, Zhang Y, Yang J. mTM-align: a server for fast protein structure database search and multiple protein structure alignment. Nucleic Acids Res 2018; 46:W380-W386. [PMID: 29788129 PMCID: PMC6030909 DOI: 10.1093/nar/gky430] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Received: 01/30/2018] [Accepted: 05/07/2018] [Indexed: 11/14/2022] Open
Abstract
With the rapid increase of the number of protein structures in the Protein Data Bank, it becomes urgent to develop algorithms for efficient protein structure comparisons. In this article, we present the mTM-align server, which consists of two closely related modules: one for structure database search and the other for multiple structure alignment. The database search is speeded up based on a heuristic algorithm and a hierarchical organization of the structures in the database. The multiple structure alignment is performed using the recently developed algorithm mTM-align. Benchmark tests demonstrate that our algorithms outperform other peering methods for both modules, in terms of speed and accuracy. One of the unique features for the server is the interplay between database search and multiple structure alignment. The server provides service not only for performing fast database search, but also for making accurate multiple structure alignment with the structures found by the search. For the database search, it takes about 2-5 min for a structure of a medium size (∼300 residues). For the multiple structure alignment, it takes a few seconds for ∼10 structures of medium sizes. The server is freely available at: http://yanglab.nankai.edu.cn/mTM-align/.
Affiliation(s)
- Runze Dong
- School of Mathematical Sciences, Nankai University, Tianjin 300071, China
| | - Shuo Pan
- School of Mathematical Sciences, Nankai University, Tianjin 300071, China
| | - Zhenling Peng
- Center for Applied Mathematics, Tianjin University, Tianjin 300072, China
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109-2218, USA
| | - Jianyi Yang
- School of Mathematical Sciences, Nankai University, Tianjin 300071, China
|
20
|
Hayward D, Kouznetsova VL, Pierson HE, Hasan NM, Guzman ER, Tsigelny IF, Lutsenko S. ANKRD9 is a metabolically-controlled regulator of IMPDH2 abundance and macro-assembly. J Biol Chem 2019; 294:14454-14466. [PMID: 31337707 DOI: 10.1074/jbc.ra119.008231] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 02/28/2019] [Revised: 07/10/2019] [Indexed: 12/17/2022] Open
Abstract
Members of a large family of Ankyrin Repeat Domain (ANKRD) proteins regulate numerous cellular processes by binding to specific protein targets and modulating their activity, stability, and other properties. The same ANKRD protein may interact with different targets and regulate distinct cellular pathways. The mechanisms responsible for switches in the ANKRDs' behavior are often unknown. We show that cells' metabolic state can markedly alter interactions of an ANKRD protein with its target and the functional outcomes of this interaction. ANKRD9 facilitates degradation of inosine monophosphate dehydrogenase 2 (IMPDH2), the rate-limiting enzyme in GTP biosynthesis. Under basal conditions ANKRD9 is largely segregated from the cytosolic IMPDH2 in vesicle-like structures. Upon nutrient limitation, ANKRD9 loses its vesicular pattern and assembles with IMPDH2 into rodlike filaments, in which IMPDH2 is stable. Inhibition of IMPDH2 activity with ribavirin favors ANKRD9 binding to IMPDH2 rods. The formation of ANKRD9/IMPDH2 rods is reversed by guanosine, which restores ANKRD9 associations with the vesicle-like structures. The conserved Cys109Cys110 motif in ANKRD9 is required for the vesicle-to-rods transition as well as binding and regulation of IMPDH2. Oppositely to overexpression, ANKRD9 knockdown increases IMPDH2 levels and prevents formation of IMPDH2 rods upon nutrient limitation. Taken together, the results suggest that a guanosine-dependent metabolic switch determines the mode of ANKRD9 action toward IMPDH2.
Affiliation(s)
- Dawn Hayward
- Department of Physiology, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205
| | - Valentina L Kouznetsova
- The Moores Cancer Center, University of California San Diego, La Jolla, California 92093.,San Diego Supercomputer Center University of California San Diego, La Jolla, California 92093
| | - Hannah E Pierson
- Department of Physiology, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205
| | - Nesrin M Hasan
- Department of Physiology, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205
| | - Estefany R Guzman
- Department of Physiology, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205
| | - Igor F Tsigelny
- Department of Physiology, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205.,San Diego Supercomputer Center University of California San Diego, La Jolla, California 92093.,Department of Neurosciences, University of California San Diego, La Jolla, California 92093
| | - Svetlana Lutsenko
- Department of Physiology, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205
|
21
|
Chitrala KN, Yang X, Nagarkatti P, Nagarkatti M. Comparative analysis of interactions between aryl hydrocarbon receptor ligand binding domain with its ligands: a computational study. BMC STRUCTURAL BIOLOGY 2018; 18:15. [PMID: 30522477 PMCID: PMC6282305 DOI: 10.1186/s12900-018-0095-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 08/12/2018] [Accepted: 11/07/2018] [Indexed: 12/22/2022]
Abstract
BACKGROUND: Aryl hydrocarbon receptor (AhR) ligands may act as potential carcinogens or anti-tumor agents. Understanding how some of the residues in the AhR ligand binding domain (AhRLBD) modulate their interactions with ligands would be useful in assessing their divergent roles, including toxic and beneficial effects. To this end, we have analysed the nature of AhRLBD interactions with 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), 6-formylindolo[3,2-b]carbazole (FICZ), indole-3-carbinol (I3C) and its degradation product, 3,3'-diindolylmethane (DIM), Resveratrol (RES) and its analogue, Piceatannol (PTL), using a molecular modeling approach followed by molecular dynamics simulations. RESULTS: Results showed that each of the AhR ligands, TCDD, FICZ, I3C, DIM, RES and PTL, affects the local and global conformations of AhRLBD. CONCLUSION: The data presented in this study provide a structural understanding of AhR with its ligands and set the basis for its functions in several pathways and their related diseases.
Affiliation(s)
- Kumaraswamy Naidu Chitrala
- Department of Pathology, Microbiology and Immunology, University of South Carolina, School of Medicine, Columbia, SC 29208 USA
| | - Xiaoming Yang
- Department of Pathology, Microbiology and Immunology, University of South Carolina, School of Medicine, Columbia, SC 29208 USA
| | - Prakash Nagarkatti
- Department of Pathology, Microbiology and Immunology, University of South Carolina, School of Medicine, Columbia, SC 29208 USA
| | - Mitzi Nagarkatti
- Department of Pathology, Microbiology and Immunology, University of South Carolina, School of Medicine, Columbia, SC 29208 USA
|
22
|
Sofos N, Winkler MBL, Brodersen DE. RRM domain of human RBM7: purification, crystallization and structure determination. Acta Crystallogr F Struct Biol Commun 2016; 72:397-402. [PMID: 27139832 PMCID: PMC4854568 DOI: 10.1107/s2053230x16006129] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/15/2016] [Accepted: 04/12/2016] [Indexed: 01/04/2023] Open
Abstract
RNA decay is an important process that is essential for controlling the abundance, quality and maturation of transcripts. In eukaryotes, RNA decay in the 3'-5' direction is carried out by the exosome, an RNA-degradation machine that is conserved from yeast to humans. A range of cofactors stimulate the enzymatic activity of the exosome and serve as adapters for the many RNA substrates. In human cells, the exosome associates with the heterotrimeric nuclear exosome targeting (NEXT) complex consisting of the DExH-box helicase hMTR4, the zinc-finger protein hZCCHC8 and the RRM-type protein hRBM7. Here, the 2.5 Å resolution crystal structure of the RRM domain of human RBM7 is reported. Molecular replacement using a previously determined solution structure of RBM7 was unsuccessful. Instead, RBM8 and CBP20 RRM-domain crystal structures were used to successfully determine the RBM7 structure by molecular replacement. The structure reveals a ring-shaped pentameric assembly, which is most likely a consequence of crystal packing.
Affiliation(s)
- Nicholas Sofos
- Centre for mRNP Biogenesis and Metabolism, Department of Molecular Biology and Genetics, Aarhus University, Gustav Wieds Vej 10c, DK-8000 Aarhus C, Denmark
| | - Mikael B. L. Winkler
- Centre for mRNP Biogenesis and Metabolism, Department of Molecular Biology and Genetics, Aarhus University, Gustav Wieds Vej 10c, DK-8000 Aarhus C, Denmark
| | - Ditlev E. Brodersen
- Centre for mRNP Biogenesis and Metabolism, Department of Molecular Biology and Genetics, Aarhus University, Gustav Wieds Vej 10c, DK-8000 Aarhus C, Denmark
|
23
|
Ruiz-Gómez G, Hawkins JC, Philipp J, Künze G, Wodtke R, Löser R, Fahmy K, Pisabarro MT. Rational Structure-Based Rescaffolding Approach to De Novo Design of Interleukin 10 (IL-10) Receptor-1 Mimetics. PLoS One 2016; 11:e0154046. [PMID: 27123592 PMCID: PMC4849758 DOI: 10.1371/journal.pone.0154046] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/08/2015] [Accepted: 04/07/2016] [Indexed: 12/25/2022] Open
Abstract
Tackling protein interfaces with small molecules capable of modulating protein-protein interactions remains a challenge in structure-based ligand design. Particularly arduous are cases in which the epitopes involved in molecular recognition have a non-structured and discontinuous nature. Here, the basic strategy of translating continuous binding epitopes into mimetic scaffolds cannot be applied, and other innovative approaches are therefore required. We present a structure-based rational approach that uses a regular expression syntax, inspired by the well-established PROSITE, to define minimal descriptors of geometric and functional constraints signifying relevant functionalities for recognition in protein interfaces of non-continuous and unstructured nature. These descriptors feed a search engine that explores the currently available three-dimensional chemical space of the Protein Data Bank (PDB) in order to identify, in a straightforward manner, regular architectures containing the desired functionalities, which could be used as templates to guide the rational design of small natural-like scaffolds mimicking the targeted recognition site. The application of this rescaffolding strategy to the discovery of natural scaffolds incorporating a selection of functionalities of interleukin-10 receptor-1 (IL-10R1) that are relevant for its interaction with interleukin-10 (IL-10) has resulted in the de novo design of a new class of potent IL-10 peptidomimetic ligands.
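For readers unfamiliar with the PROSITE pattern syntax that this abstract cites as its inspiration, the following is a minimal sketch of how a standard PROSITE sequence pattern can be translated into a Python regular expression. It covers only the basic sequence-level elements (`x`, `[...]`, `{...}`, repeat counts, and the `<` / `>` terminus anchors). The function name `prosite_to_regex` is hypothetical, and the sketch says nothing about the authors' geometric and functional descriptors, which are not specified in the abstract.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a basic PROSITE sequence pattern into a Python regex string."""
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Separate the residue specification from an optional (n) or (n,m) repeat.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        if m is None:
            raise ValueError(f"cannot parse pattern element: {element!r}")
        core, lo, hi = m.groups()
        if core == "x":                  # any residue
            body = "."
        elif core.startswith("["):       # one of the listed residues
            body = core
        elif core.startswith("{"):       # any residue except those listed
            body = "[^" + core[1:-1] + "]"
        else:                            # a literal residue
            body = re.escape(core)
        if lo and hi:
            body += "{" + lo + "," + hi + "}"
        elif lo:
            body += "{" + lo + "}"
        parts.append(prefix + body + suffix)
    return "".join(parts)


if __name__ == "__main__":
    # N-glycosylation site pattern from PROSITE (PS00001).
    rx = prosite_to_regex("N-{P}-[ST]-{P}")
    print(rx, bool(re.search(rx, "AANGSAA")))
```

The same idea, a compact textual descriptor compiled into a matcher, is what makes such patterns convenient to feed into a search engine.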
Affiliation(s)
- Gloria Ruiz-Gómez
  - Structural Bioinformatics, BIOTEC TU Dresden, Tatzberg, Dresden, Germany
  - * E-mail: (GRG); (MTB)
- John C. Hawkins
  - Structural Bioinformatics, BIOTEC TU Dresden, Tatzberg, Dresden, Germany
- Jenny Philipp
  - Helmholtz-Zentrum Dresden Rossendorf, Institute of Resource Ecology, Dresden, Germany
- Georg Künze
  - Institute of Medical Physics and Biophysics, University of Leipzig, Leipzig, Germany
- Robert Wodtke
  - Helmholtz-Zentrum Dresden Rossendorf, Institute of Radiopharmaceutical Cancer Research, Dresden, Germany
- Reik Löser
  - Helmholtz-Zentrum Dresden Rossendorf, Institute of Radiopharmaceutical Cancer Research, Dresden, Germany
- Karim Fahmy
  - Helmholtz-Zentrum Dresden Rossendorf, Institute of Resource Ecology, Dresden, Germany
- M. Teresa Pisabarro
  - Structural Bioinformatics, BIOTEC TU Dresden, Tatzberg, Dresden, Germany
  - * E-mail: (GRG); (MTB)
|
24
|
Fox NK, Brenner SE, Chandonia JM. The value of protein structure classification information-Surveying the scientific literature. Proteins 2015; 83:2025-38. [PMID: 26313554 PMCID: PMC4609302 DOI: 10.1002/prot.24915] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 05/08/2015] [Revised: 08/06/2015] [Accepted: 08/18/2015] [Indexed: 11/08/2022]
Abstract
The Structural Classification of Proteins (SCOP) and Class, Architecture, Topology, Homology (CATH) databases have been valuable resources for protein structure classification for over 20 years. Development of SCOP (version 1) concluded in June 2009 with SCOP 1.75. The SCOPe (SCOP-extended) database offers continued development of the classic SCOP hierarchy, adding over 33,000 structures. We have attempted to assess the impact of these two decade old resources and guide future development. To this end, we surveyed recent articles to learn how structure classification data are used. Of 571 articles published in 2012-2013 that cite SCOP, 439 actually use data from the resource. We found that the type of use was fairly evenly distributed among four top categories: A) study protein structure or evolution (27% of articles), B) train and/or benchmark algorithms (28% of articles), C) augment non-SCOP datasets with SCOP classification (21% of articles), and D) examine the classification of one protein/a small set of proteins (22% of articles). Most articles described computational research, although 11% described purely experimental research, and a further 9% included both. We examined how CATH and SCOP were used in 158 articles that cited both databases: while some studies used only one dataset, the majority used data from both resources. Protein structure classification remains highly relevant for a diverse range of problems and settings.
Affiliation(s)
- Naomi K Fox
  - Lawrence Berkeley National Laboratory, Physical Biosciences Division, Berkeley, California, 94720
- Steven E Brenner
  - Lawrence Berkeley National Laboratory, Physical Biosciences Division, Berkeley, California, 94720; Department of Plant and Microbial Biology, University of California, Berkeley, California, 94720
- John-Marc Chandonia
  - Lawrence Berkeley National Laboratory, Physical Biosciences Division, Berkeley, California, 94720
|
25
|
A Multi-Objective Approach for Protein Structure Prediction Based on an Energy Model and Backbone Angle Preferences. Int J Mol Sci 2015; 16:15136-49. [PMID: 26151847 PMCID: PMC4519891 DOI: 10.3390/ijms160715136] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/28/2015] [Revised: 06/25/2015] [Accepted: 06/25/2015] [Indexed: 11/17/2022] Open
Abstract
Protein structure prediction (PSP) is concerned with the prediction of protein tertiary structure from primary structure and is a challenging calculation problem. After decades of research effort, numerous solutions have been proposed for optimisation methods based on energy models. However, further investigation and improvement is still needed to increase the accuracy and similarity of structures. This study presents a novel backbone angle preference factor, which is one of the factors inducing protein folding. The proposed multiobjective optimisation approach simultaneously considers energy models and backbone angle preferences to solve the ab initio PSP. To prove the effectiveness of the multiobjective optimisation approach based on the energy models and backbone angle preferences, 75 amino acid sequences with lengths ranging from 22 to 88 amino acids were selected from the CB513 data set to be the benchmarks. The data sets were highly dissimilar, therefore indicating that they are meaningful. The experimental results showed that the root-mean-square deviation (RMSD) of the multiobjective optimization approach based on energy model and backbone angle preferences was superior to those of typical energy models, indicating that the proposed approach can facilitate the ab initio PSP.
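The abstract uses root-mean-square deviation (RMSD) as its accuracy measure. A minimal sketch of that measure follows, under the assumption that the two structures are already superimposed and that atoms are paired one-to-one in the same order; optimal superposition (for example by the Kabsch algorithm) is deliberately left out, and the function name and toy coordinates are illustrative only.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    Assumes the structures are already superimposed and atom i in
    coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: the second "structure" is the first shifted by 1 along x.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.5, 0.0)]
model = [(x + 1.0, y, z) for x, y, z in native]
print(rmsd(native, model))
```

A lower value indicates a predicted structure closer to the reference, which is the sense in which the abstract calls its RMSD results "superior".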
|
26
|
Craveur P, Joseph AP, Esque J, Narwani TJ, Noël F, Shinada N, Goguet M, Leonard S, Poulain P, Bertrand O, Faure G, Rebehmed J, Ghozlane A, Swapna LS, Bhaskara RM, Barnoud J, Téletchéa S, Jallu V, Cerny J, Schneider B, Etchebest C, Srinivasan N, Gelly JC, de Brevern AG. Protein flexibility in the light of structural alphabets. Front Mol Biosci 2015; 2:20. [PMID: 26075209 PMCID: PMC4445325 DOI: 10.3389/fmolb.2015.00020] [Citation(s) in RCA: 59] [Impact Index Per Article: 6.6] [Received: 02/28/2015] [Accepted: 04/30/2015] [Indexed: 01/01/2023] Open
Abstract
Protein structures are valuable tools to understand protein function. Nonetheless, proteins are often considered as rigid macromolecules while their structures exhibit specific flexibility, which is essential to complete their functions. Analyses of protein structures and dynamics are often performed with a simplified three-state description, i.e., the classical secondary structures. A more precise and complete description of protein backbone conformation can be obtained using libraries of small protein fragments that are able to approximate every part of protein structures. These libraries, called structural alphabets (SAs), have been widely used in the structure analysis field, from the definition of ligand binding sites to the superimposition of protein structures. SAs are also well suited to analyze the dynamics of protein structures. Here, we review innovative approaches that investigate protein flexibility based on SA descriptions. Coupled to various sources of experimental data (e.g., B-factors) and computational methodology (e.g., molecular dynamics simulation), SAs turn out to be powerful tools to analyze protein dynamics, e.g., to examine allosteric mechanisms in large sets of structures in complexes or to identify order/disorder transitions. SAs were also shown to be quite efficient at predicting protein flexibility from amino-acid sequence. Finally, in this review, we exemplify the interest of SAs for studying flexibility with different cases of proteins implicated in pathologies and diseases.
Affiliation(s)
- Pierrick Craveur
  - Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France; Institut National de la Transfusion Sanguine, DSIMB Paris, France; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
- Agnel P Joseph
  - Rutherford Appleton Laboratory, Science and Technology Facilities Council Didcot, UK
- Jeremy Esque
  - Institut National de la Santé et de la Recherche Médicale U964,7 UMR Centre National de la Recherche Scientifique 7104, IGBMC, Université de Strasbourg Illkirch, France
- Tarun J Narwani
  - Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France; Institut National de la Transfusion Sanguine, DSIMB Paris, France; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
- Floriane Noël
  - Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France; Institut National de la Transfusion Sanguine, DSIMB Paris, France; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
- Nicolas Shinada
  - Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France; Institut National de la Transfusion Sanguine, DSIMB Paris, France; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
- Matthieu Goguet
  - Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France; Institut National de la Transfusion Sanguine, DSIMB Paris, France; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
- Sylvain Leonard
  - Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France; Institut National de la Transfusion Sanguine, DSIMB Paris, France; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
- Pierre Poulain
  - Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France; Institut National de la Transfusion Sanguine, DSIMB Paris, France; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France; Ets Poulain Pointe-Noire, Congo
- Olivier Bertrand
  - Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France; Institut National de la Transfusion Sanguine, DSIMB Paris, France; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
- Guilhem Faure
  - National Library of Medicine, National Center for Biotechnology Information, National Institutes of Health Bethesda, MD, USA
- Joseph Rebehmed
  - Centre National de la Recherche Scientifique UMR7590, Sorbonne Universités, Université Pierre et Marie Curie - MNHN - IRD - IUC Paris, France
- Lakshmipuram S Swapna
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore Bangalore, India; Hospital for Sick Children, and Departments of Biochemistry and Molecular Genetics, University of Toronto Toronto, ON, Canada
- Ramachandra M Bhaskara
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore Bangalore, India; Department of Theoretical Biophysics, Max Planck Institute of Biophysics Frankfurt, Germany
- Jonathan Barnoud
  - Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France; Institut National de la Transfusion Sanguine, DSIMB Paris, France; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France; Laboratoire de Physique, École Normale Supérieure de Lyon, Université de Lyon, Centre National de la Recherche Scientifique UMR 5672 Lyon, France
- Stéphane Téletchéa
  - Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France; Institut National de la Transfusion Sanguine, DSIMB Paris, France; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France; Faculté des Sciences et Techniques, Université de Nantes, Unité Fonctionnalité et Ingénierie des Protéines, Centre National de la Recherche Scientifique UMR 6286, Université Nantes Nantes, France
- Vincent Jallu
  - Platelet Unit, Institut National de la Transfusion Sanguine Paris, France
- Jiri Cerny
  - Institute of Biotechnology, The Czech Academy of Sciences Prague, Czech Republic
- Bohdan Schneider
  - Institute of Biotechnology, The Czech Academy of Sciences Prague, Czech Republic
- Catherine Etchebest
  - Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France; Institut National de la Transfusion Sanguine, DSIMB Paris, France; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
- Jean-Christophe Gelly
  - Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France; Institut National de la Transfusion Sanguine, DSIMB Paris, France; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
- Alexandre G de Brevern
  - Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France; Institut National de la Transfusion Sanguine, DSIMB Paris, France; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
|
27
|
Zhou MB, Zhong H, Hu JL, Tang DQ. Ppmar1 and Ppmar2: the first two complete and intact full-length mariner-like elements isolated in Phyllostachys edulis. ACTA ACUST UNITED AC 2015. [DOI: 10.1080/12538078.2014.999117] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/12/2022]
|
28
|
Structure based annotation of Helicobacter pylori strain 26695 proteome. PLoS One 2014; 9:e115020. [PMID: 25549250 PMCID: PMC4280198 DOI: 10.1371/journal.pone.0115020] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/10/2014] [Accepted: 11/17/2014] [Indexed: 11/23/2022] Open
Abstract
The availability of the complete genome sequence of H. pylori 26695 has provided a wealth of information enabling us to carry out in silico studies to identify new molecular targets for pharmaceutical treatment. To construe the structural and functional information of the complete proteome, computational methods are particularly relevant, since they are reliable and offer an alternative to time-consuming and expensive experimental methods. Out of 1590 predicted protein coding genes in H. pylori, experimentally determined structures are available for only 145 proteins in the PDB. In the absence of experimental structures, computational studies on the three-dimensional (3D) structural organization would help in deciphering the protein fold, structure and active site. Functional annotation of each protein was carried out based on structural fold and binding site based ligand association. Most of these proteins are uncharacterized in this proteome and through our annotation pipeline we were able to annotate most of them. We could assign structural folds to 464 uncharacterized proteins from an initial list of 557 sequences. Of the 1195 known structural folds present in the SCOP database, 411 (34% of all known folds) are observed in the whole H. pylori 26695 proteome, with greater inclination for domains belonging to the α/β class (36.63%). Top folds include P-loop containing nucleoside triphosphate hydrolases (22.6%), TIM barrel (16.7%), transmembrane helix hairpin (16.05%), alpha-alpha superhelix (11.1%) and S-adenosyl-L-methionine-dependent methyltransferases (10.7%).
|
29
|
Zhou J, Grigoryan G. Rapid search for tertiary fragments reveals protein sequence-structure relationships. Protein Sci 2014; 24:508-24. [PMID: 25420575 DOI: 10.1002/pro.2610] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Received: 09/25/2014] [Accepted: 11/21/2014] [Indexed: 12/31/2022]
Abstract
Finding backbone substructures from the Protein Data Bank that match an arbitrary query structural motif, composed of multiple disjoint segments, is a problem of growing relevance in structure prediction and protein design. Although numerous protein structure search approaches have been proposed, methods that address this specific task without additional restrictions and on practical time scales are generally lacking. Here, we propose a solution, dubbed MASTER, that is both rapid, enabling searches over the Protein Data Bank in a matter of seconds, and provably correct, finding all matches below a user-specified root-mean-square deviation cutoff. We show that despite the potentially exponential time complexity of the problem, running times in practice are modest even for queries with many segments. The ability to explore naturally plausible structural and sequence variations around a given motif has the potential to synthesize its design principles in an automated manner; so we go on to illustrate the utility of MASTER to protein structural biology. We demonstrate its capacity to rapidly establish structure-sequence relationships, uncover the native designability landscapes of tertiary structural motifs, identify structural signatures of binding, and automatically rewire protein topologies. Given the broad utility of protein tertiary fragment searches, we hope that providing MASTER in an open-source format will enable novel advances in understanding, predicting, and designing protein structure.
Affiliation(s)
- Jianfu Zhou
  - Department of Computer Science, Dartmouth College, Hanover, New Hampshire, 03755
|
30
|
Chiu YY, Tseng JH, Liu KH, Lin CT, Hsu KC, Yang JM. Homopharma: a new concept for exploring the molecular binding mechanisms and drug repurposing. BMC Genomics 2014; 15 Suppl 9:S8. [PMID: 25521038 PMCID: PMC4290623 DOI: 10.1186/1471-2164-15-s9-s8] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 02/07/2023] Open
Abstract
Background: Drugs that simultaneously target multiple proteins often improve efficacy, particularly in the treatment of complex diseases such as cancers and central nervous system disorders. Many approaches have been proposed to identify the potential targets of a drug. Recently, we have introduced Space-Related Pharmamotif (SRPmotif) method to recognize the proteins that share similar binding environments. In addition, compounds with similar topology may bind to similar proteins and have similar protein-compound interactions. However, few studies have focused on exploring the relationships between binding environments and protein-compound interactions, which is important for understanding molecular binding mechanisms and helpful to be used in discovering drug repurposing.
Results: In this study, we propose a new concept of "Homopharma", combining similar binding environments and protein-compound interaction profiles, to explore the molecular binding mechanisms and drug repurposing. A Homopharma consists of a set of proteins which have the conserved binding environment and a set of compounds that share similar structures and functional groups. These proteins and compounds present conserved interactions and similar physicochemical properties. Therefore, these compounds are often able to inhibit the proteins in a Homopharma. Our experimental results show that the proteins and compounds in a Homopharma often have similar protein-compound interactions, comprising conserved specific residues and functional sites. Based on the Homopharma concept, we selected four flavonoid derivatives and 32 human protein kinases for enzymatic profiling. Among these 128 bioassays, the IC50 of 56 and 25 flavonoid-kinase inhibitions are less than 10 μM and 1 μM, respectively. Furthermore, these experimental results suggest that these flavonoids can be used as anticancer compounds, such as oral and colorectal cancer drugs.
Conclusions: The experimental results show that the Homopharma is useful for identifying key binding environments of proteins and compounds and discovering new inhibitory effects. We believe that the Homopharma concept can have the potential for understanding molecular binding mechanisms and providing new clues for drug development.
|
31
|
Kleinboelting S, van den Heuvel J, Steegborn C. Structural analysis of human soluble adenylyl cyclase and crystal structures of its nucleotide complexes-implications for cyclase catalysis and evolution. FEBS J 2014; 281:4151-64. [PMID: 25040695 DOI: 10.1111/febs.12913] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 04/01/2014] [Revised: 06/29/2014] [Accepted: 07/04/2014] [Indexed: 01/18/2023]
Abstract
The ubiquitous second messenger cAMP regulates a wide array of functions, from bacterial transcription to mammalian memory. It is synthesized by six evolutionarily distinct adenylyl cyclase (AC) families. In mammals, there are two AC types: nine transmembrane ACs (tmACs) and one soluble AC (sAC). Both AC types belong to the widespread cyclase class III, which has members in numerous organisms from archaeons to mammals. Class III also contains all known guanylyl cyclases (GCs), which synthesize the cAMP-related messenger cGMP in many eukaryotes and possibly some prokaryotes. Among mammalian ACs, sAC is uniquely regulated by bicarbonate, and has been proposed to be more closely related to a bacterial AC subfamily than to mammalian ACs, on the basis of sequence comparisons. Here, we used crystal structures of human sAC catalytic domains to analyze its relationships with other class III ACs and GCs, and to study its substrate selection mechanisms. Structural comparisons revealed a similarity within an sAC-like subfamily but no family-specific structure elements, and an unexpected sAC similarity to eukaryotic GCs and a potential bacterial GC. We further solved novel crystal structures of sAC catalytic domains in complex with a substrate analog, unprocessed ATP substrate, and product after soaking with ATP or GTP. The structures show a novel ATP-binding conformation, and suggest mechanisms for substrate association and recognition. Our results could explain the limited substrate specificity of sAC, suggest how specificity is increased in other cyclases, and indicate evolutionary relationships among class III enzymes, with sAC being close to a putative 'ancestor' cyclase.
Database: Coordinates and structure factors for the novel sAC-cat structures described have been deposited with the Worldwide PDB (www.pdb.org): ApCpp soak (entry 4usu), ATP + Ca²⁺ soak (entry 4usv), GTP + Mg²⁺ soak (entry 4ust), ATP soak (entry 4usw).
|
32
|
3D-SURFER 2.0: web platform for real-time search and characterization of protein surfaces. Methods Mol Biol 2014; 1137:105-17. [PMID: 24573477 DOI: 10.1007/978-1-4939-0366-5_8] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 12/13/2022]
Abstract
The increasing number of uncharacterized protein structures necessitates the development of computational approaches for function annotation using the protein tertiary structures. Protein structure database search is the basis of any structure-based functional elucidation of proteins. 3D-SURFER is a web platform for real-time protein surface comparison of a given protein structure against the entire PDB using 3D Zernike descriptors. It can smoothly navigate the protein structure space in real-time from one query structure to another. A major new feature of Release 2.0 is the ability to compare the protein surface of a single chain, a single domain, or a single complex against databases of protein chains, domains, complexes, or a combination of all three in the latest PDB. Additionally, two types of protein structures can now be compared: all-atom-surface and backbone-atom-surface. The server can also accept a batch job for a large number of database searches. Pockets in protein surfaces can be identified by VisGrid and LIGSITEcsc. The server is available at http://kiharalab.org/3d-surfer/.
|
33
|
Using hidden Markov models to predict DNA-binding proteins with sequence and structure information. Soft Comput 2013. [DOI: 10.1007/s00500-013-1210-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/25/2022]
|
34
|
Wylie T, Zhu B. Protein chain pair simplification under the discrete Fréchet distance. IEEE/ACM Trans Comput Biol Bioinform 2013; 10:1372-1383. [PMID: 24407296 DOI: 10.1109/tcbb.2013.17] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 06/03/2023]
Abstract
For protein structure alignment and comparison, a lot of work has been done using RMSD as the distance measure, which has drawbacks under certain circumstances. Thus, the discrete Fréchet distance was recently applied to the problem of protein (backbone) structure alignment and comparison with promising results. For this problem, visualization is also important because protein chain backbones can have as many as 500-600 α-carbon atoms, which constitute the vertices in the comparison. Even with an excellent alignment, the similarity of two polygonal chains can be difficult to visualize unless the chains are nearly identical. Thus, the chain pair simplification problem (CPS-3F) was proposed in 2008 to simultaneously simplify both chains with respect to each other under the discrete Fréchet distance. The complexity of CPS-3F is unknown, so heuristic methods have been developed. Here, we define a variation of CPS-3F, called the constrained CPS-3F problem (CPS-3F$^+$), and prove that it is polynomially solvable by presenting a dynamic programming solution, which we then prove is a factor-2 approximation for CPS-3F. We then compare the CPS-3F$^+$ solutions with previous empirical results, and further demonstrate some of the benefits of the simplified comparisons. Chain pair simplification based on the Hausdorff distance (CPS-2H) is known to be NP-complete, and here we prove that the constrained version (CPS-2H$^+$) is also NP-complete. Finally, we discuss future work and implications along with a software library implementation, named the Fréchet-based Protein Alignment & Comparison Toolkit (FPACT).
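For readers unfamiliar with the distance measure named in this abstract, the following is a minimal sketch of the standard dynamic-programming computation of the discrete Fréchet distance between two polygonal chains. It is not the CPS-3F or constrained chain pair simplification algorithm from the paper, and it is not taken from FPACT; the function name and the toy chains are assumptions for illustration.

```python
import math


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two chains of points.

    p and q are non-empty sequences of equal-dimension coordinate tuples
    (for proteins, typically alpha-carbon positions in chain order).
    """
    n, m = len(p), len(q)
    if n == 0 or m == 0:
        raise ValueError("both chains must contain at least one point")
    # ca[i][j] = discrete Fréchet distance between p[:i+1] and q[:j+1]
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]


# Two parallel toy chains one unit apart.
chain_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
chain_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
print(discrete_frechet(chain_a, chain_b))
```

The table fill takes time and memory proportional to the product of the two chain lengths, which is manageable for chains of a few hundred vertices such as those mentioned in the abstract.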
|
35
|
A computational prediction of structure and function of novel homologue of Arabidopsis thaliana Vps51/Vps67 subunit in Corchorus olitorius. Interdiscip Sci 2013; 4:256-67. [PMID: 23354814 DOI: 10.1007/s12539-012-0139-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2011] [Revised: 06/05/2012] [Accepted: 07/29/2012] [Indexed: 10/27/2022]
Abstract
Vps-mediated vesicular transport is important for transferring macromolecules trapped inside a vesicle. Although highly abundant, Vps shows tremendous sequence variation among a diverse array of species. However, this difference in sequence, which also seems to translate into substantial functional variation, is hardly characterized in Corchorus spp. Here, our computational study investigates structural and functional features of one of the Vps subunits, namely Vps51/Vps67, in C. olitorius. Broad-scale structural characterization revealed novel information about the overall Vps structure and binding sites. Moreover, functional analyses indicate interaction partners that were unexplored to date. Since membrane trafficking is essentially associated with nutrient uptake and chemical detoxification, characterization of the Vps subunit can provide better insight into important agronomic traits such as stress response, immune response and phytoremediation capacity.
|
36
|
Chiu YY, Lin CY, Lin CT, Hsu KC, Chang LZ, Yang JM. Space-related pharma-motifs for fast search of protein binding motifs and polypharmacological targets. BMC Genomics 2012; 13 Suppl 7:S21. [PMID: 23281852 PMCID: PMC3521469 DOI: 10.1186/1471-2164-13-s7-s21] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 01/26/2023] Open
Abstract
Background: Discovering a compound that inhibits multiple proteins (i.e., polypharmacological targets) is a new paradigm for complex diseases (e.g., cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. With the exponential growth in the number of protein structures, finding similar structural binding motifs (pharma-motifs) is an urgent task for drug discovery (e.g., side effects and new uses for old drugs) and for understanding protein functions. Results: We have developed a Space-Related Pharmamotifs (SRPmotif) method to recognize binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs, which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% of interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% of the polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. Conclusions: SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and for drug discovery.
Affiliation(s)
- Yi-Yuan Chiu
  - Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, 30050, Taiwan
|
37
|
Ritchie DW, Ghoorah AW, Mavridis L, Venkatraman V. Fast protein structure alignment using Gaussian overlap scoring of backbone peptide fragment similarity. Bioinformatics 2012; 28:3274-81. [DOI: 10.1093/bioinformatics/bts618] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Indexed: 11/12/2022] Open
|
38
|
Bonnel N, Marteau PF. LNA: fast protein structural comparison using a Laplacian characterization of tertiary structure. IEEE/ACM Trans Comput Biol Bioinform 2012; 9:1451-1458. [PMID: 22547433 DOI: 10.1109/tcbb.2012.64] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 05/31/2023]
Abstract
In the last two decades, many protein 3D shapes have been discovered, characterized, and made available thanks to the Protein Data Bank (PDB), which continues to grow very quickly. New scalable methods are thus urgently required to search through the PDB efficiently. This paper presents an approach entitled LNA (Laplacian Norm Alignment) that performs a structural comparison of two proteins with dynamic programming algorithms. This is achieved by characterizing each residue in the protein with scalar features. The feature values are calculated using a Laplacian operator applied on the graph corresponding to the adjacency matrix of the residues. The weighted Laplacian operator we use estimates, at various scales, local deformations of the topology where each residue is located. On some benchmarks, which are widely shared by the community, we obtain qualitatively similar results compared to other competing approaches, but with an algorithm one or two orders of magnitude faster. 180,000 protein comparisons can be done within 1 second with a single recent Graphics Processing Unit (GPU), which makes our algorithm very scalable and suitable for real-time database querying across the web.
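The idea summarized in this abstract (one scalar feature per residue derived from a graph Laplacian, followed by dynamic programming) can be illustrated with a minimal Python sketch. This is not the LNA implementation: the 8 Å contact cutoff, the single-scale feature, the gap penalty, the similarity function, and the names `laplacian_features` and `align_score` are all assumptions chosen for illustration only.

```python
import math


def laplacian_features(coords, cutoff=8.0):
    """Return one scalar per residue from a contact-graph Laplacian.

    Assumption: residues closer than `cutoff` are graph neighbours. The
    feature is the length of (position - mean neighbour position), i.e. the
    degree-normalised graph Laplacian applied to the coordinates.
    """
    features = []
    for i, point in enumerate(coords):
        neighbours = [other for j, other in enumerate(coords)
                      if j != i and math.dist(point, other) < cutoff]
        if not neighbours:
            features.append(0.0)
            continue
        centroid = [sum(axis) / len(neighbours) for axis in zip(*neighbours)]
        features.append(math.dist(point, centroid))
    return features


def align_score(feat_a, feat_b, gap=-0.5):
    """Global dynamic-programming alignment score over scalar features.

    Assumption: similarity of two residues is 1 - |feature difference|,
    and every gap costs the constant `gap`.
    """
    n, m = len(feat_a), len(feat_b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = 1.0 - abs(feat_a[i - 1] - feat_b[j - 1])
            dp[i][j] = max(dp[i - 1][j - 1] + match,
                           dp[i - 1][j] + gap,
                           dp[i][j - 1] + gap)
    return dp[n][m]
```

Because each feature depends only on distances between residues, it is unchanged when the structure is translated or rotated, which is what allows two proteins to be compared as plain one-dimensional sequences of numbers.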
Affiliation(s)
- Nicolas Bonnel
- IRISA, Université de Bretagne Sud, Campus de Tohannic, Vannes 56000, France.
|
39
|
Kubrycht J, Sigler K, Souček P. Virtual interactomics of proteins from biochemical standpoint. Mol Biol Int 2012; 2012:976385. [PMID: 22928109 PMCID: PMC3423939 DOI: 10.1155/2012/976385] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 03/27/2012] [Revised: 05/18/2012] [Accepted: 05/18/2012] [Indexed: 12/24/2022] Open
Abstract
Virtual interactomics represents a rapidly developing scientific area on the boundary of bioinformatics and interactomics. Protein-related virtual interactomics comprises instrumental tools for prediction, simulation, and networking of the majority of interactions important for structural and individual reproduction, differentiation, recognition, signaling, regulation, and metabolic pathways of cells and organisms. Here, we describe the main areas of virtual protein interactomics, that is, structurally based comparative analysis and prediction of functionally important interacting sites, mimotope-assisted and combined epitope prediction, molecular (protein) docking studies, and investigation of protein interaction networks. Detailed information about some interesting methodological approaches and online accessible programs or databases is displayed in our tables. A considerable part of the text deals with searches for common conserved or functionally convergent protein regions and subgraphs of conserved interaction networks, new outstanding trends, and clinically interesting results. In agreement with the presented data and relationships, virtual interactomic tools improve our scientific knowledge, help us to formulate working hypotheses, and frequently also mediate in silico simulations of varying importance.
Affiliation(s)
- Jaroslav Kubrycht
- Department of Physiology, Second Medical School, Charles University, 150 00 Prague, Czech Republic
- Karel Sigler
- Laboratory of Cell Biology, Institute of Microbiology, Academy of Sciences of the Czech Republic, 142 20 Prague, Czech Republic
- Pavel Souček
- Toxicogenomics Unit, National Institute of Public Health, 100 42 Prague, Czech Republic
|
40
|
Mirceva G, Cingovska I, Dimov Z, Davcev D. Efficient approaches for retrieving protein tertiary structures. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2012; 9:1166-1179. [PMID: 22025763 DOI: 10.1109/tcbb.2011.138] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 05/31/2023]
Abstract
The 3D conformation of a protein in space is the main factor that determines its function in living organisms. Due to the huge number of newly discovered proteins, there is a need for fast and accurate computational methods for retrieving protein structures. Their purpose is to speed up the process of understanding the structure-to-function relationship, which is crucial in the development of new drugs. There are many algorithms addressing the problem of protein structure retrieval. In this paper, we present several novel approaches for retrieving protein tertiary structures. We present our voxel-based descriptor. Then we present our protein ray-based descriptors, which are applied on the interpolated protein backbone. We introduce five novel wavelet descriptors which perform wavelet transforms on the protein distance matrix. We also propose an efficient algorithm for distance matrix alignment named Matrix Alignment by Sequence Alignment within Sliding Window (MASASW), which has been shown to be much faster than DALI, CE, and MatAlign. We compared our approaches among themselves and with several existing algorithms, and they generally prove to be fast and accurate. MASASW achieves the highest accuracy. The ray and wavelet-based descriptors as well as MASASW are more accurate than CE.
Affiliation(s)
- Georgina Mirceva
- Department of Computer Science and Computer Engineering, Faculty of Electrical Engineering and Information Technologies, Ss. Cyril and Methodius University in Skopje, PO Box 574, 1000 Skopje, Macedonia.
|
41
|
Lo WC, Wang LF, Liu YY, Dai T, Hwang JK, Lyu PC. CPred: a web server for predicting viable circular permutations in proteins. Nucleic Acids Res 2012; 40:W232-7. [PMID: 22693212 PMCID: PMC3394280 DOI: 10.1093/nar/gks529] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 01/03/2023] Open
Abstract
Circular permutation (CP) is a protein structural rearrangement phenomenon, through which nature allows structural homologs to have different locations of termini and thus varied activities, stabilities and functional properties. It can be applied in many fields of protein research and bioengineering. The limitation of applying CP lies in its technical complexity, high cost and uncertainty of the viability of the resulting protein variants. Not every position in a protein can be used to create a viable circular permutant, but there is still a lack of practical computational tools for evaluating the positional feasibility of CP before costly experiments are carried out. We have previously designed a comprehensive method for predicting viable CP cleavage sites in proteins. In this work, we implement that method in an efficient and user-friendly web server named CPred (CP site predictor), which is intended to promote fundamental research and biotechnological applications of CP. CPred is accessible at http://sarst.life.nthu.edu.tw/CPred.
Affiliation(s)
- Wei-Cheng Lo
- Institute of Bioinformatics and Structural Biology, National Tsing Hua University, Hsinchu 30013, Taiwan
|
42
|
Abstract
A computational pipeline, PocketAnnotate, for functional annotation of proteins at the level of binding sites is proposed in this study. The pipeline integrates three in-house algorithms for site-based function annotation: PocketDepth, for prediction of binding sites in protein structures; PocketMatch, for rapid comparison of binding sites; and PocketAlign, to obtain a detailed alignment between a pair of binding sites. A novel scheme has been developed to rapidly generate a database of non-redundant binding sites. For a given input protein structure, putative ligand-binding sites are identified and matched in real time against the database, and the query substructure is aligned with the promising hits to obtain a set of possible ligands that the given protein could bind to. The input can be either whole protein structures or merely the substructures corresponding to possible binding sites. Structure-based function annotation at the level of binding sites thus achieved could prove very useful for cases where no obvious functional inference can be obtained based purely on sequence or fold-level analyses. An attempt has also been made to analyse proteins of no known function from the Protein Data Bank. PocketAnnotate would be a valuable tool for the scientific community and contribute towards structure-based functional inference. The web server can be freely accessed at http://proline.biochem.iisc.ernet.in/pocketannotate/.
Affiliation(s)
- Praveen Anand
- Department of Biochemistry, Indian Institute of Science, Bangalore 560012, Karnataka, India
|
43
|
Shealy P, Valafar H. Multiple structure alignment with msTALI. BMC Bioinformatics 2012; 13:105. [PMID: 22607234 PMCID: PMC3473313 DOI: 10.1186/1471-2105-13-105] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 08/13/2011] [Accepted: 04/18/2012] [Indexed: 11/10/2022] Open
Abstract
Background Multiple structure alignments have received increasing attention in recent years as an alternative to multiple sequence alignments. Although multiple structure alignment algorithms can potentially be applied to a number of problems, they have primarily been used for protein core identification. A method that is capable of solving a variety of problems using structure comparison is still absent. Here we introduce a program msTALI for aligning multiple protein structures. Our algorithm uses several informative features to guide its alignments: torsion angles, backbone Cα atom positions, secondary structure, residue type, surface accessibility, and properties of nearby atoms. The algorithm allows the user to weight the types of information used to generate the alignment, which expands its utility to a wide variety of problems. Results msTALI exhibits competitive results on 824 families from the Homstrad and SABmark databases when compared to Matt and Mustang. We also demonstrate success at building a database of protein cores using 341 randomly selected CATH domains and highlight the contribution of msTALI compared to the CATH classifications. Finally, we present an example applying msTALI to the problem of detecting hinges in a protein undergoing rigid-body motion. Conclusions msTALI is an effective algorithm for multiple structure alignment. In addition to its performance on standard comparison databases, it utilizes clear, informative features, allowing further customization for domain-specific applications. The C++ source code for msTALI is available for Linux on the web at http://ifestos.cse.sc.edu/mstali.
Affiliation(s)
- Paul Shealy
- Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208, USA
|
44
|
Pang B, Zhao N, Becchi M, Korkin D, Shyu CR. Accelerating large-scale protein structure alignments with graphics processing units. BMC Res Notes 2012; 5:116. [PMID: 22357132 PMCID: PMC3309952 DOI: 10.1186/1756-0500-5-116] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 09/01/2011] [Accepted: 02/22/2012] [Indexed: 11/24/2022] Open
Abstract
Background Large-scale protein structure alignment, an indispensable tool in structural bioinformatics, poses a tremendous challenge to computational resources. To ensure structure alignment accuracy and efficiency, efforts have been made to parallelize traditional alignment algorithms in grid environments. However, these solutions are costly and of limited accessibility. Others trade alignment quality for speedup by using high-level characteristics of structure fragments for structure comparisons. Findings We present ppsAlign, a parallel protein structure Alignment framework designed and optimized to exploit the parallelism of Graphics Processing Units (GPUs). As a general-purpose GPU platform, ppsAlign could incorporate many existing methods, such as TM-align and Fr-TM-align, into the parallelized algorithm design. We evaluated ppsAlign on an NVIDIA Tesla C2050 GPU card, and compared it with existing software solutions running on an AMD dual-core CPU. We observed a 36-fold speedup over TM-align, a 65-fold speedup over Fr-TM-align, and a 40-fold speedup over MAMMOTH. Conclusions ppsAlign is a high-performance protein structure alignment tool designed to tackle the computational complexity issues arising from protein structural data. The solution presented in this paper allows large-scale structure comparisons to be performed using the massively parallel computing power of GPUs.
Affiliation(s)
- Bin Pang
- Informatics Institute, University of Missouri, Columbia, MO, USA
|
45
|
Deciphering the preference and predicting the viability of circular permutations in proteins. PLoS One 2012; 7:e31791. [PMID: 22359629 PMCID: PMC3281007 DOI: 10.1371/journal.pone.0031791] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 05/17/2011] [Accepted: 01/19/2012] [Indexed: 01/21/2023] Open
Abstract
Circular permutation (CP) refers to situations in which the termini of a protein are relocated to other positions in the structure. CP occurs naturally and has been artificially created to study protein function, stability and folding. Recently, CP has been increasingly applied to engineer enzyme structure and function, and to create bifunctional fusion proteins unachievable by tandem fusion. CP is a complicated and expensive technique. An intrinsic difficulty in its application lies in the fact that not every position in a protein is amenable to creating a viable permutant. To examine the preferences of CP and develop CP viability prediction methods, we carried out comprehensive analyses of the sequence, structural, and dynamical properties of known CP sites using a variety of statistics and simulation methods, such as bootstrap aggregating, permutation tests and molecular dynamics simulations. CP particularly favors Gly, Pro, Asp and Asn. Positions preferred by CP lie within coils, loops, turns, and at residues that are exposed to solvent, weakly hydrogen-bonded, environmentally unpacked, or flexible. Disfavored positions include Cys, bulky hydrophobic residues, and residues located within helices or near the protein's core. These results fostered the development of an effective viable CP site prediction system, which combined four machine learning methods, i.e., artificial neural networks, the support vector machine, a random forest, and a hierarchical feature integration procedure developed in this work. As assessed by using the dihydrofolate reductase dataset as the independent evaluation dataset, this prediction system achieved an AUC of 0.9. Large-scale predictions have been performed for nine thousand representative protein structures; several new potential applications of CP were thus identified. Many unreported preferences of CP are revealed in this study. The developed system is the best CP viability prediction method currently available. This work will facilitate the application of CP in research and biotechnology.
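At the sequence level, the definition of CP given in this abstract amounts to cutting the chain at one position and rejoining the original termini. The short Python sketch below illustrates only that definition. It says nothing about whether a permutant is viable, and the function name `circular_permutant`, the 1-based `cut_site` convention, and the optional `linker` argument are assumptions made for illustration.

```python
def circular_permutant(sequence, cut_site, linker=""):
    """Return the sequence of a circular permutant.

    The chain is cut after residue `cut_site` (1-based), so residue
    `cut_site + 1` becomes the new N-terminus. The original C- and
    N-termini are joined, optionally through `linker`.
    """
    if not 1 <= cut_site < len(sequence):
        raise ValueError("cut_site must lie strictly inside the sequence")
    return sequence[cut_site:] + linker + sequence[:cut_site]


# Cutting a six-residue toy sequence after residue 2:
print(circular_permutant("ABCDEF", 2))        # CDEFAB
print(circular_permutant("ABCDEF", 2, "GG"))  # CDEFGGAB
```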
|
46
|
Abstract
Celiac sprue is an inflammatory disease of the small intestine caused by dietary gluten and treated by adherence to a life-long gluten-free diet. The recent identification of immunodominant gluten peptides, the discovery of their cogent properties, and the elucidation of the mechanisms by which they engender immunopathology in genetically susceptible individuals have advanced our understanding of the molecular pathogenesis of this complex disease, enabling the rational design of new therapeutic strategies. The most clinically advanced of these is oral enzyme therapy, in which enzymes capable of proteolyzing gluten (i.e., glutenases) are delivered to the alimentary tract of a celiac sprue patient to detoxify ingested gluten in situ. In this chapter, we discuss the key challenges for discovery and preclinical development of oral enzyme therapies for celiac sprue. Methods for lead identification, assay development, gram-scale production and formulation, and lead optimization for next-generation proteases are described and critically assessed.
Affiliation(s)
- Michael T Bethune
- Division of Biology, California Institute of Technology, Pasadena, California, USA
|
47
|
Yang Z, Yu Y, Yao L, Li G, Wang L, Hu Y, Wei H, Wang L, Hammami R, Razavi R, Zhong Y, Liang X. DetoxiProt: an integrated database for detoxification proteins. BMC Genomics 2011; 12 Suppl 3:S2. [PMID: 22369658 PMCID: PMC3333179 DOI: 10.1186/1471-2164-12-s3-s2] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 12/14/2022] Open
Abstract
Background Detoxification proteins are a class of proteins for degradation and/or elimination of endogenous and exogenous toxins or medicines, as well as reactive oxygen species (ROS) produced by these materials. Most of these proteins are generated as a response to the stimulation of toxins or medicines. They are essential for the clearance of harmful substances and for maintenance of physiological balance in organisms. Thus, it is important to collect and integrate information on detoxification proteins. Results To store, retrieve and analyze the information related to their features and functions, we developed DetoxiProt, a comprehensive database for annotation of these proteins. This database provides detailed introductions to the different classes of detoxification proteins. Extensive annotations of these proteins, including sequences, structures, features, inducers, inhibitors, substrates, chromosomal location, functional domains and physiological-biochemical properties, were generated. Furthermore, pre-computed BLAST results, multiple sequence alignments and evolutionary trees for detoxification proteins are also provided for evolutionary study of conserved function and pathways. The current version of DetoxiProt contains 5956 protein entries distributed in 628 organisms. An easy-to-use web interface was designed, so that annotations about each detoxification protein can be retrieved by browsing with a specific method or by searching with different criteria. Conclusions DetoxiProt provides an effective and efficient way of accessing the detoxification protein sequences and other high-quality information. This database would be a valuable source for toxicologists, pharmacologists and medicinal chemists. The DetoxiProt database is freely available at http://lifecenter.sgst.cn/detoxiprot/.
Affiliation(s)
- Zhen Yang
- School of Life Science, Fudan University, HanDan Road 220#, Shanghai, 200433, China
|
48
|
Penner RC, Knudsen M, Wiuf C, Andersen JE. An Algebro-topological description of protein domain structure. PLoS One 2011; 6:e19670. [PMID: 21629687 PMCID: PMC3101207 DOI: 10.1371/journal.pone.0019670] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 01/05/2011] [Accepted: 04/03/2011] [Indexed: 11/25/2022] Open
Abstract
The space of possible protein structures appears vast and continuous, and the relationship between primary, secondary and tertiary structure levels is complex. Protein structure comparison and classification is therefore a difficult but important task since structure is a determinant for molecular interaction and function. We introduce a novel mathematical abstraction based on geometric topology to describe protein domain structure. Using the locations of the backbone atoms and the hydrogen bonds, we build a combinatorial object, a so-called fatgraph. The description is discrete yet gives rise to a 2-dimensional mathematical surface. Thus, each protein domain corresponds to a particular mathematical surface with characteristic topological invariants, such as the genus (number of holes) and the number of boundary components. Both invariants are global fatgraph features reflecting the interconnectivity of the domain by hydrogen bonds. We introduce the notion of robust variables, that is, variables that are robust towards minor changes in the structure/fatgraph, and show that the genus and the number of boundary components are robust. Further, we investigate the distribution of different fatgraph variables and show how only four variables are capable of distinguishing different folds. We use local (secondary) and global (tertiary) fatgraph features to describe domain structures and illustrate that they are useful for classification of domains in CATH. In addition, we combine our method with two other methods, thereby using primary, secondary, and tertiary structure information, and show that we can identify a large percentage of new and unclassified structures in CATH.
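For readers unfamiliar with how the two invariants named in this abstract relate, the standard Euler-characteristic identity for an orientable surface built from a fatgraph is sketched below. The symbols $V$, $E$, $r$, and $g$ are notation introduced here, not taken from the abstract, and the sketch assumes the orientable case only; the paper's own definitions may differ, for example for non-orientable surfaces.

```latex
% Assumed notation: V = number of fatgraph vertices, E = number of edges,
% r = number of boundary components, g = genus of the associated surface.
% The surface deformation-retracts onto the graph, so both have the same
% Euler characteristic:
\chi = V - E = 2 - 2g - r
\quad\Longrightarrow\quad
g = \frac{2 - V + E - r}{2}.
```

As a check under these assumptions, a graph with one vertex and two loop edges whose surface has a single boundary component gives $g = (2 - 1 + 2 - 1)/2 = 1$, a torus with one hole.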
Affiliation(s)
- Robert Clark Penner
- Center for the Topology and Quantization of Moduli Spaces, Department of Mathematical Sciences, Aarhus University, Aarhus, Denmark
- Departments of Mathematics and Physics/Astronomy, University of Southern California, Los Angeles, California, United States of America
- Michael Knudsen
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- Carsten Wiuf
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- Centre for Membrane Pumps in Cells and Disease, Aarhus University, Aarhus, Denmark
- Jørgen Ellegaard Andersen
- Center for the Topology and Quantization of Moduli Spaces, Department of Mathematical Sciences, Aarhus University, Aarhus, Denmark
|
49
|
Shutov AD, Prak K, Fukuda T, Rudakov SV, Rudakova AS, Tandang-Silvas MR, Fujiwara K, Mikami B, Utsumi S, Maruyama N. Soybean basic 7S globulin: subunit heterogeneity and molecular evolution. Biosci Biotechnol Biochem 2010; 74:1631-4. [PMID: 20699573 DOI: 10.1271/bbb.100234] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/08/2022]
Abstract
Basic 7S globulin, a cysteine-rich protein from soybean seeds, consists of subunits containing 27 kD and 16 kD chains linked by disulfide bonding. Three differently sized subunits of the basic 7S globulin were detected and partially separated by SP Sepharose chromatography. The basic 7S globulin was characterized as a member of a superfamily of structurally related but functionally distinct proteins descended from a specific group of plant aspartic proteinases.
Affiliation(s)
- Andrei D Shutov
- Laboratory of Plant Biochemistry, State University of Moldova, Mateevicii str, Chişinău, Moldova
|
50
|
Shyu CR, Pang B, Chi PH, Zhao N, Korkin D, Xu D. ProteinDBS v2.0: a web server for global and local protein structure search. Nucleic Acids Res 2010; 38:W53-8. [PMID: 20538653 PMCID: PMC2896110 DOI: 10.1093/nar/gkq522] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/14/2022] Open
Abstract
ProteinDBS v2.0 is a web server designed for efficient and accurate comparisons and searches of structurally similar proteins from a large-scale database. It provides two comparison methods, global-to-global and local-to-local, to facilitate searches of protein structures or substructures. ProteinDBS v2.0 applies advanced feature extraction algorithms and scalable indexing techniques to achieve a high running speed while preserving reasonably high precision of structural comparison. The experimental results show that our system is able to return results of global comparisons in seconds from a complete Protein Data Bank (PDB) database of 152,959 protein chains and that it takes much less time to complete local comparisons from a non-redundant database of 3276 proteins than other accurate comparison methods. ProteinDBS v2.0 supports query by PDB protein ID and by new structures uploaded by users. To our knowledge, this is the only search engine that can simultaneously support global and local comparisons. ProteinDBS v2.0 is a useful tool to investigate functional or evolutionary relationships among proteins. Moreover, the common substructures identified by local comparison can potentially be used to assist the human curation process in discovering new domains or folds from the ever-growing protein structure databases. The system is hosted at http://ProteinDBS.rnet.missouri.edu.
Affiliation(s)
- Chi-Ren Shyu
- Informatics Institute, University of Missouri, Columbia, MO 65211, USA.
|