1
Sidhanta SPD, Sowdhamini R, Srinivasan N. Comparative analysis of permanent and transient domain-domain interactions in multi-domain proteins. Proteins 2023. [PMID: 37828826 DOI: 10.1002/prot.26581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Revised: 08/09/2023] [Accepted: 08/11/2023] [Indexed: 10/14/2023]
Abstract
Protein domains are structural, functional, and evolutionary units. These domains bring out the diversity of functionality by means of interactions with other co-existing domains and provide stability. Hence, it is important to study intra-protein inter-domain interactions from the perspective of types of interactions. Domains within a chain could interact over short timeframes or permanently, rather like protein-protein interactions (PPIs). However, no systematic comparison has been carried out between the two classes, namely permanent and transient domain-domain interactions. In this work, we studied 263 two-domain proteins belonging to either of these classes, and their interfaces, on the basis of several factors, such as interface area and details of interactions (number, strength, and types of interactions). We also characterized them based on residue conservation at the interface, correlation of residue motions across domains, their involvement in repeat formation, and their involvement in particular molecular processes. Finally, we analyzed the interactions arising from domains in two-domain monomeric proteins, and we observed significant differences between these two classes of domain interactions, along with a few similarities. This study will help to obtain a better understanding of structure-function and folding principles of multi-domain proteins.
Affiliation(s)
- Ramanathan Sowdhamini
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
- Computational Approaches to Protein Science, National Centre for Biological Sciences, Bangalore, India
- Computational Biology, Institute of Bioinformatics and Applied Biotechnology, Bangalore, India
2
Cretin G, Périn C, Zimmermann N, Galochkina T, Gelly JC. ICARUS: flexible protein structural alignment based on Protein Units. Bioinformatics 2023; 39:btad459. [PMID: 37498544 PMCID: PMC10400377 DOI: 10.1093/bioinformatics/btad459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2022] [Revised: 07/04/2023] [Accepted: 07/26/2023] [Indexed: 07/28/2023]
Abstract
MOTIVATION: Alignment of protein structures is a major problem in structural biology. The first approach commonly used is to consider proteins as rigid bodies. However, alignment of protein structures can be very complex due to conformational variability, or complex evolutionary relationships between proteins such as insertions, circular permutations or repetitions. In such cases, introducing flexibility becomes useful for two reasons: (i) it can help compare two protein chains that have adopted two different conformational states, for example due to protein-ligand interactions or post-translational modifications, and (ii) it aids in the identification of conserved regions in proteins that may have distant evolutionary relationships. RESULTS: We propose ICARUS, a new approach for flexible structural alignment based on identification of Protein Units, evolutionarily preserved structural descriptors of intermediate size, between secondary structures and domains. ICARUS significantly outperforms reference methods on a dataset of very difficult structural alignments. AVAILABILITY AND IMPLEMENTATION: Code is freely available online at https://github.com/DSIMB/ICARUS.
Affiliation(s)
- Gabriel Cretin
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, F-75015 Paris, France
- Laboratoire d’Excellence GR-Ex, 75015 Paris, France
- Charlotte Périn
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, F-75015 Paris, France
- Laboratoire d’Excellence GR-Ex, 75015 Paris, France
- TBI, Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
- Nicolas Zimmermann
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, F-75015 Paris, France
- Laboratoire d’Excellence GR-Ex, 75015 Paris, France
- Tatiana Galochkina
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, F-75015 Paris, France
- Laboratoire d’Excellence GR-Ex, 75015 Paris, France
- Jean-Christophe Gelly
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, F-75015 Paris, France
- Laboratoire d’Excellence GR-Ex, 75015 Paris, France
3
Zhu K, Su H, Peng Z, Yang J. A unified approach to protein domain parsing with inter-residue distance matrix. Bioinformatics 2023; 39:7025502. [PMID: 36734597 PMCID: PMC9919455 DOI: 10.1093/bioinformatics/btad070] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/15/2022] [Revised: 01/02/2023] [Accepted: 02/01/2023] [Indexed: 02/04/2023]
Abstract
MOTIVATION: It is fundamental to cut multi-domain proteins into individual domains for precise domain-based structural and functional studies. In the past, sequence-based and structure-based domain parsing was carried out independently with different methodologies. The recent progress in deep learning-based protein structure prediction provides the opportunity to unify sequence-based and structure-based domain parsing. RESULTS: Based on the inter-residue distance matrix, which can be either derived from the input structure or predicted by trRosettaX, we can decode the domain boundaries under a unified framework. We name the proposed method UniDoc. The principle of UniDoc is based on the well-accepted physical concept of maximizing intra-domain interaction while minimizing inter-domain interaction. Comprehensive tests on five benchmark datasets indicate that UniDoc outperforms other state-of-the-art methods in terms of both accuracy and speed, for both sequence-based and structure-based domain parsing. The major contribution of UniDoc is providing a unified framework for structure-based and sequence-based domain parsing. We hope that UniDoc will be a convenient tool for protein domain analysis. AVAILABILITY AND IMPLEMENTATION: https://yanglab.nankai.edu.cn/UniDoc/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
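The principle named in this abstract (dense intra-domain contacts, sparse inter-domain contacts) can be illustrated with a small sketch. It is not the UniDoc algorithm: the function names, the 8 Å contact cutoff, the `min_len` parameter, and the contact-density score are all assumptions made for this example. It scans single cut points in a toy distance matrix and keeps the one with the lowest inter-domain contact density.

```python
def contact_map(dist, cutoff=8.0):
    """Binary contact map from a square distance matrix (cutoff is an assumed value)."""
    n = len(dist)
    return [[1 if i != j and dist[i][j] <= cutoff else 0 for j in range(n)]
            for i in range(n)]


def best_single_cut(dist, cutoff=8.0, min_len=2):
    """Return (k, density) for the split [0, k) / [k, n) with the lowest
    inter-domain contact density. Hypothetical scoring, for illustration only."""
    contacts = contact_map(dist, cutoff)
    n = len(contacts)
    best_k, best_density = None, float("inf")
    for k in range(min_len, n - min_len + 1):
        # Count contacts that cross the candidate boundary.
        inter = sum(contacts[i][j] for i in range(k) for j in range(k, n))
        # Normalise by the number of possible cross-boundary pairs.
        density = inter / (k * (n - k))
        if density < best_density:
            best_k, best_density = k, density
    return best_k, best_density


# Toy input: two compact blocks of 4 residues (5 Å apart within a block,
# 20 Å apart across blocks).
toy = [[0.0 if i == j else (5.0 if (i < 4) == (j < 4) else 20.0)
        for j in range(8)] for i in range(8)]
print(best_single_cut(toy))  # -> (4, 0.0)
```

A real parser would also have to handle more than two domains and discontinuous domains, which a single linear cut cannot represent.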
Affiliation(s)
- Kun Zhu
- School of Mathematical Sciences, Nankai University, Tianjin 300071, China
- Hong Su
- School of Mathematical Sciences, Nankai University, Tianjin 300071, China
- Zhenling Peng
- Ministry of Education Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Jianyi Yang
- Ministry of Education Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
4
Swain SP, Gupta S, Das N, Franca TCC, Goncalves ADS, Ramalho TC, Subrahmanya S, Narsaria U, Deb D, Mishra N. Flavanones: A potential natural inhibitor of the ATP binding site of PknG of Mycobacterium tuberculosis. J Biomol Struct Dyn 2022; 40:11885-11899. [PMID: 34409917 DOI: 10.1080/07391102.2021.1965913] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/24/2022]
Abstract
Over the years, Mycobacterium tuberculosis has been one of the major causes of death worldwide. As several clinical isolates of the bacteria have developed drug resistance against the target sites of the current therapeutic agents, the development of a novel drug is a pressing priority. According to recent studies on Mycobacterium tuberculosis, ATP binding sites of Mycobacterium tuberculosis serine/threonine protein kinases (MTPKs) have been identified as promising new drug targets. Among the several protein kinases (PKs), Protein kinase G (PknG) was selected for the study because of its crucial role in modulating the bacterium's metabolism to survive in host macrophages. In this work, we have focused on the H37Rv strain of Mycobacterium tuberculosis. A list of 477 flavanones obtained from the PubChem database was docked one by one against the crystallized and refined structure of PknG by in-silico techniques. Initially, potential inhibitors were narrowed down by preliminary docking. Flavanones were then selected using binding energies ranging from -7.9 kcal·mol⁻¹ to -10.8 kcal·mol⁻¹. This was followed by drug-likeness prediction, redocking analysis, and molecular dynamics simulations. Here, we have used the experimentally confirmed drug AX20017 as a reference to determine candidate compounds that can act as potential inhibitors for PknG. PubChem165506, PubChem242065, PubChem688859, PubChem101367767, PubChem3534982, and PubChem42607933 were identified as possible target site inhibitors for PknG, with binding energies of -8.1, -8.3, -8.4, -8.8, -8.6 and -7.9 kcal·mol⁻¹, respectively. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Subhi Gupta
- Independent Researcher, Karnataka, Bangalore, India
- Nidhi Das
- Independent Researcher, Karnataka, Bangalore, India
- Tanos Celmar Costa Franca
- Laboratory of Molecular Modeling Applied to Chemical and Biological Defense (LMCBD), Military Institute of Engineering, Rio de Janeiro, RJ, Brazil
- Faculty of Science, Department of Chemistry, University of Hradec Kralove, Hradec Kralove, Czech Republic
- Arlan da Silva Goncalves
- Department of Chemistry, Federal Institute of Espirito Santo - Unit Vila Velha, Vila Velha, ES, Brazil
- PPGQUI (Graduate Program in Chemistry), Federal University of Espirito Santo, Vitoria, ES, Brazil
- Teodorico Castro Ramalho
- Faculty of Science, Department of Chemistry, University of Hradec Kralove, Hradec Kralove, Czech Republic
- Laboratory of Computational Chemistry, Department of Chemistry, UFLA, Lavras, MG, Brazil
- Shreya Subrahmanya
- Department of Botany, St. Joseph's College (autonomous), Bangalore, Karnataka, India
- Neelam Mishra
- Department of Botany, St. Joseph's College (autonomous), Bangalore, Karnataka, India
5
Cretin G, Galochkina T, Vander Meersche Y, de Brevern AG, Postic G, Gelly JC. SWORD2: hierarchical analysis of protein 3D structures. Nucleic Acids Res 2022; 50:W732-W738. [PMID: 35580056 PMCID: PMC9252838 DOI: 10.1093/nar/gkac370] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/25/2022] [Revised: 04/19/2022] [Accepted: 04/29/2022] [Indexed: 11/27/2022]
Abstract
Understanding the functions and origins of proteins requires splitting these macromolecules into fragments that could be independent in terms of folding, activity, or evolution. For that purpose, structural domains are the typical level of analysis, but shorter segments, such as subdomains and supersecondary structures, are insightful as well. Here, we propose SWORD2, a web server for exploring how an input protein structure may be decomposed into ‘Protein Units’ that can be hierarchically assembled to delimit structural domains. For each partitioning solution, the relevance of the identified substructures is estimated through different measures. This multilevel analysis is achieved by integrating our previous work on domain delineation, ‘protein peeling’ and model quality assessment. We hope that SWORD2 will be useful to biologists searching for key regions in their proteins of interest and to bioinformaticians building datasets of protein structures. The web server is freely available online: https://www.dsimb.inserm.fr/SWORD2.
Affiliation(s)
- Gabriel Cretin
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, F-75015 Paris, France
- Laboratoire d'Excellence GR-Ex, 75015 Paris, France
- Tatiana Galochkina
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, F-75015 Paris, France
- Laboratoire d'Excellence GR-Ex, 75015 Paris, France
- Yann Vander Meersche
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, F-75015 Paris, France
- Laboratoire d'Excellence GR-Ex, 75015 Paris, France
- Alexandre G de Brevern
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, F-75015 Paris, France
- Laboratoire d'Excellence GR-Ex, 75015 Paris, France
- Guillaume Postic
- Université Paris-Saclay, Univ Evry, IBISC, 91020 Evry-Courcouronnes, France
- Jean-Christophe Gelly
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, F-75015 Paris, France
- Laboratoire d'Excellence GR-Ex, 75015 Paris, France
6
Analysis of Integrin αIIb Subunit Dynamics Reveals Long-Range Effects of Missense Mutations on Calf Domains. Int J Mol Sci 2022; 23:ijms23020858. [PMID: 35055046 PMCID: PMC8776176 DOI: 10.3390/ijms23020858] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2021] [Revised: 12/23/2021] [Accepted: 12/30/2021] [Indexed: 11/17/2022]
Abstract
Integrin αIIbβ3, a glycoprotein complex expressed at the platelet surface, is involved in platelet aggregation and contributes to primary haemostasis. Several integrin αIIbβ3 polymorphisms prevent this aggregation, which causes haemorrhagic syndromes such as Glanzmann thrombasthenia (GT). Access to the 3D structure allows the structural effects of GT-related polymorphisms to be understood. In a previous analysis using Molecular Dynamics (MD) simulations of the αIIb Calf-1 domain structure, it was observed that GT-associated single amino acid variations affect distant loops, but not the mutated position. In this study, experiments are extended to the Calf-1, Thigh, and Calf-2 domains. Two loops in Calf-2 are unstructured and were therefore expertly modelled using biophysical restraints. Surprisingly, MD revealed the presence of rigid zones in these loops. Detailed analysis with a structural alphabet, the Protein Blocks (PBs), allowed local changes in highly flexible regions to be observed. The variant P741R, located at the C-terminus of Calf-1, revealed that the presence of Calf-2 did not affect the results obtained with the isolated Calf-1 domain. Simulations of the Calf-1 + Calf-2 and Thigh + Calf-1 variant systems were designed to assess the impact of five single amino acid variations in these domains. Distant conformational changes are observed, thus highlighting the potential role of allostery in the structural basis of GT.
7
Nekrasov AN, Kozmin YP, Kozyrev SV, Ziganshin RH, de Brevern AG, Anashkina AA. Hierarchical Structure of Protein Sequence. Int J Mol Sci 2021; 22:8339. [PMID: 34361104 PMCID: PMC8348890 DOI: 10.3390/ijms22158339] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/25/2021] [Revised: 07/22/2021] [Accepted: 07/27/2021] [Indexed: 11/28/2022]
Abstract
Most non-communicable diseases are associated with dysfunction of proteins or protein complexes. The relationship between sequence and structure has been analyzed for a long time, and the analysis of sequence organization into domains and motifs remains an active research area. Here, we propose a mathematical method for revealing the hierarchical organization of protein sequences. The method is based on the pentapeptide as a unit of protein sequences. Employing the frequency of occurrence of pentapeptides in sequences of natural proteins and a special mathematical approach, this method revealed a hierarchical structure in the protein sequence. The method was applied to 24,647 non-homologous protein sequences with sizes ranging from 50 to 400 residues from the NRDB90 database. Statistical analysis of the branching points of the graphs revealed 11 characteristic values of y (the width of the inscribed function), showing the relationship of these multiple fragments of the sequences. Several examples illustrate how fragments of the protein spatial structure correspond to the elements of the hierarchical structure of the protein sequence. This methodology provides a promising basis for a mathematically based classification of the elements of the spatial organization of proteins. Elements from different levels of the hierarchy can be used to solve biotechnological and medical problems.
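The first ingredient mentioned in this abstract, the frequency of occurrence of pentapeptides, can be sketched as a sliding-window count. The sketch covers only that counting step; the "special mathematical approach" that builds the hierarchy is not described here and is not reproduced. The function names and the toy sequences are assumptions for illustration.

```python
from collections import Counter


def pentapeptide_counts(sequences, k=5):
    """Count every overlapping window of length k (5 = pentapeptide)."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def pentapeptide_frequencies(sequences, k=5):
    """Relative frequency of each pentapeptide over all windows."""
    counts = pentapeptide_counts(sequences, k)
    total = sum(counts.values())
    return {pep: c / total for pep, c in counts.items()} if total else {}


# Hypothetical toy sequences, not taken from NRDB90.
toy_sequences = ["ACDEFG", "ACDEF"]
print(pentapeptide_counts(toy_sequences))
```

Sequences shorter than five residues contribute no windows, so they are silently skipped.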
Affiliation(s)
- Alexei N. Nekrasov
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, The Russian Academy of Sciences, Miklukho-Maklaya St. 16/10, 117997 Moscow, Russia
- Yuri P. Kozmin
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, The Russian Academy of Sciences, Miklukho-Maklaya St. 16/10, 117997 Moscow, Russia
- Sergey V. Kozyrev
- Steklov Mathematical Institute of Russian Academy of Sciences, 8 Gubkina St., 119991 Moscow, Russia
- Rustam H. Ziganshin
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, The Russian Academy of Sciences, Miklukho-Maklaya St. 16/10, 117997 Moscow, Russia
- Alexandre G. de Brevern
- INSERM UMR S-1134, DSIMB, Univ. Paris, INTS, Lab. of Excellence GR-Ex, 6 rue Alexandre Cabanel, CEDEX 15, 75739 Paris, France
- Anastasia A. Anashkina
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilov St. 32, 119991 Moscow, Russia
8
Abstract
Covid-19 is a pandemic infectious disease that has affected life across the world, resulting in over 188.65 million confirmed cases across 223 countries, territories and areas, with 4.06 million deaths. It is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The spike (S) protein of SARS-CoV-2, which plays a key role in receptor recognition and the cell membrane fusion process, is composed of two subunits, S1 and S2. The S1 subunit contains a receptor-binding domain (RBD) that recognizes and binds to the host receptor angiotensin-converting enzyme 2 (ACE2), while the S2 subunit mediates viral cell membrane fusion. Hence, it is a key target for developing neutralizing antibodies. Here, we have performed phylogenetic analysis and structural modeling of the SARS-CoV-2 spike glycoprotein, which was found to be highly conserved. The overall percent sequence identity among the SARS-CoV-2 spike protein sequences from the NCBI database was 99.68%. Analysis of the functional domains of the S protein reveals that the S1 subunit was more highly conserved (99.70%) than the S2 subunit (99.66%). Further, residues 319–541 (the RBD) within the S1 domain were 100% similar among the spike protein sequences. The 3D modeling of the SARS-CoV-2 spike glycoprotein indicated that the S protein has four domains with five protein units, and that the region of the S1 subunit spanning amino acids 1 to 289 of domain 1 is highly conserved, without any change in the ligand interaction site. This analysis clearly suggests that the S1 subunit (RBD 319–541) can be used as a target region for stable and safe vaccine development.
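Percent sequence identity figures such as those quoted in this abstract are typically computed from aligned sequences. The abstract does not say how its values were obtained, so the sketch below shows one common convention as an assumption: identity is the share of matching positions among aligned columns where neither sequence has a gap, averaged over all sequence pairs. The function names and toy sequences are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b, gap="-"):
    """Percent identity of two pre-aligned sequences, ignoring gapped columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        matches += (x == y)
    return 100.0 * matches / compared if compared else 0.0


def mean_pairwise_identity(aligned):
    """Average percent identity over all unordered pairs of aligned sequences."""
    pairs = list(combinations(aligned, 2))
    if not pairs:
        return 0.0
    return sum(percent_identity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment, not real spike sequences.
print(percent_identity("ACDE", "ACDF"))  # -> 75.0
```

Other conventions (for example, dividing by the full alignment length or by the shorter sequence length) give different numbers, so the denominator should always be stated alongside the result.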
9
Fogha J, Bayry J, Diharce J, de Brevern AG. Structural and evolutionary exploration of the IL-3 family and its alpha subunit receptors. Amino Acids 2021; 53:1211-1227. [PMID: 34196789 DOI: 10.1007/s00726-021-03026-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/15/2021] [Accepted: 06/21/2021] [Indexed: 12/14/2022]
Abstract
Interleukin-3 (IL-3) is a cytokine belonging to the common β (βc) family and is involved in various biological systems. Its activity is mediated by the interaction with its receptor (IL-3R), a heterodimer composed of two distinct subunits: IL-3Rα and βc. IL-3 and its receptor, especially IL-3Rα, play a crucial role in pathologies such as inflammatory diseases and are therefore interesting therapeutic targets. Here, we have performed an analysis of these proteins and their interaction based on structural and evolutionary information. We highlighted that the IL-3 and IL-3Rα structural architectures are conserved across evolution and shared with other proteins belonging to the same βc family: interleukin-5 (IL-5) and granulocyte-macrophage colony-stimulating factor (GM-CSF). The IL-3Rα/IL-3 interaction is mediated by a large interface in which most residues are surprisingly not conserved during evolution and across family members. In spite of this high variability, we suggest that small regions, made up of a few residues conserved during evolution in both proteins, could be important for the binding affinity.
Affiliation(s)
- Jade Fogha
- UMR_S 1134, DSIMB, Université de Paris, Inserm, Biologie Intégrée du Globule Rouge, 75739, Paris, France
- Institut National de La Transfusion Sanguine (INTS), 75739, Paris, France
- Laboratoire D'Excellence GR-Ex, 75739, Paris, France
- Jagadeesh Bayry
- Centre de Recherche Des Cordeliers, Institut National de La Santé Et de La Recherche Médicale, Sorbonne Université, Université de Paris, 75006, Paris, France
- Indian Institute of Technology Palakkad, Kozhippara, Palakkad, 678 557, India
- Julien Diharce
- UMR_S 1134, DSIMB, Université de Paris, Inserm, Biologie Intégrée du Globule Rouge, 75739, Paris, France.
- Institut National de La Transfusion Sanguine (INTS), 75739, Paris, France.
- Laboratoire D'Excellence GR-Ex, 75739, Paris, France.
- Alexandre G de Brevern
- UMR_S 1134, DSIMB, Université de Paris, Inserm, Biologie Intégrée du Globule Rouge, 75739, Paris, France.
- Institut National de La Transfusion Sanguine (INTS), 75739, Paris, France.
- Laboratoire D'Excellence GR-Ex, 75739, Paris, France.
- UMR_S 1134, DSIMB, Université de La Réunion, Inserm, Biologie Intégrée du Globule Rouge, La Réunion, 97744, Saint-Denis, France.
10
Arora V, Pal S, Kulshreshtha S, Verma IC. A Further Case of Larsen's Syndrome: Clinical and Genotypic Challenges in Diagnosis. J Pediatr Genet 2020; 11:298-303. [DOI: 10.1055/s-0040-1718540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2020] [Accepted: 09/06/2020] [Indexed: 10/23/2022]
Abstract
Larsen's syndrome is characterized by dislocation of multiple large joints, digital anomalies, craniofacial dysmorphism, and short stature. In this paper, we describe a case of a 5-month-old boy with a triad of cardinal features in association with other signs. The diagnosis was confirmed by exome sequencing, which led to the identification of a novel missense variant NM_001457.4:c.4928C>G (p.Ala1643Gly) in the FLNB gene. We describe the role of protein modelling for the establishment of pathogenicity of this variant. We also outline the challenges in genetic diagnosis due to variable expressivity of the variant and discuss the clinicogenetic profile of previously reported patients with Larsen's syndrome in India.
Affiliation(s)
- Veronica Arora
- Institute of Medical Genetics and Genomics, Sir Ganga Ram Hospital, New Delhi, India
- Swasti Pal
- Institute of Medical Genetics and Genomics, Sir Ganga Ram Hospital, New Delhi, India
- Samarth Kulshreshtha
- Institute of Medical Genetics and Genomics, Sir Ganga Ram Hospital, New Delhi, India
- Ishwar C. Verma
- Institute of Medical Genetics and Genomics, Sir Ganga Ram Hospital, New Delhi, India
11
Lai JI, Verma D, Bailey-Kellogg C, Ackerman ME. Towards conformational fidelity of a quaternary HIV-1 epitope: computational design and directed evolution of a minimal V1V2 antigen. Protein Eng Des Sel 2018; 31:121-133. [PMID: 29897567 PMCID: PMC6030936 DOI: 10.1093/protein/gzy010] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/26/2017] [Revised: 04/16/2018] [Accepted: 04/24/2018] [Indexed: 12/11/2022]
Abstract
Structure-based approaches to antigen design utilize insights from antibody (Ab):antigen interactions and a refined understanding of protective Ab responses to engineer novel antigens presenting epitopes with conformations relevant to eliciting or discovering protective humoral responses. For human immunodeficiency virus-1 (HIV-1), one model of protection is provided by broadly neutralizing Abs (bnAbs) against epitopes present in the closed prefusion trimeric conformation of HIV-1 envelope glycoprotein, such as the variable loops 1-2 (V1V2) apex. Here, computational design and directed evolution yielded a novel V1V2 sequence variant with potential utility for inclusion in an immunogen for eliciting bnAbs, or as an epitope probe for their detection. The computational design goal was to engineer a minimal single-chain antigen with three copies of the V1V2 loops to support maintenance of closed prefusion V1V2 trimeric conformation and presentation of bnAb epitopes. Via directed evolution of this computationally designed single-chain antigen, we isolated a V1V2 sequence variant that in monomeric form exhibited preferential recognition by quaternary-preferring and conformation-dependent mAbs. Structural context and transferability of this phenotype to V1V2 sequences from all strains of HIV-1 tested suggest a conformation-stabilizing effect. This example demonstrates the potential utility of computational design and directed evolution-based protein engineering strategies to develop minimal, conformation-stabilized epitope-specific antigens.
Affiliation(s)
- Jennifer I Lai
- Thayer School of Engineering, Dartmouth College, 14 Engineering Dr, Hanover NH, USA
- Deeptak Verma
- Department of Computer Science, Dartmouth College, 9 Maynard St, Hanover NH, USA
- Chris Bailey-Kellogg
- Department of Computer Science, Dartmouth College, 9 Maynard St, Hanover NH, USA
- Margaret E Ackerman
- Thayer School of Engineering, Dartmouth College, 14 Engineering Dr, Hanover NH, USA
- Department of Microbiology and Immunology, Geisel School of Medicine, Dartmouth College, 1 Medical Center Dr, Lebanon NH, USA
12
Postic G, Ghouzam Y, Chebrek R, Gelly JC. An ambiguity principle for assigning protein structural domains. SCIENCE ADVANCES 2017; 3:e1600552. [PMID: 28097215 PMCID: PMC5235333 DOI: 10.1126/sciadv.1600552] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 03/18/2016] [Accepted: 11/28/2016] [Indexed: 05/20/2023]
Abstract
Ambiguity is the quality of being open to several interpretations. For an image, it arises when the contained elements can be delimited in two or more distinct ways, which may cause confusion. We postulate that it also applies to the analysis of protein three-dimensional structure, which consists in dividing the molecule into subunits called domains. Because different definitions of what constitutes a domain can be used to partition a given structure, the same protein may have different but equally valid domain annotations. However, knowledge and experience generally displace our ability to accept more than one way to decompose the structure of an object (in this case, a protein). This human bias in structure analysis is particularly harmful because it leads to ignoring potential avenues of research. We present an automated method capable of producing multiple alternative decompositions of protein structure (web server and source code available at www.dsimb.inserm.fr/sword/). Our innovative algorithm assigns structural domains through the hierarchical merging of protein units, which are evolutionarily preserved substructures that describe protein architecture at an intermediate level, between domain and secondary structure. To validate the use of these protein units for decomposing protein structures into domains, we set up an extensive benchmark made of expert annotations of structural domains and including state-of-the-art domain parsing algorithms. The relevance of our "multipartitioning" approach is shown through numerous examples of applications covering protein function, evolution, folding, and structure prediction. Finally, we introduce a measure for the structural ambiguity of protein molecules.
Affiliation(s)
- Guillaume Postic
- INSERM U1134, Paris, France
- Université Paris Diderot, Sorbonne Paris Cité, UMR_S 1134, Paris, France
- Institut National de la Transfusion Sanguine, Paris, France
- Laboratory of Excellence GR-Ex, Paris, France
- Corresponding author
- Yassine Ghouzam
- INSERM U1134, Paris, France
- Université Paris Diderot, Sorbonne Paris Cité, UMR_S 1134, Paris, France
- Institut National de la Transfusion Sanguine, Paris, France
- Laboratory of Excellence GR-Ex, Paris, France
- Romain Chebrek
- INSERM U1134, Paris, France
- Université Paris Diderot, Sorbonne Paris Cité, UMR_S 1134, Paris, France
- Institut National de la Transfusion Sanguine, Paris, France
- Laboratory of Excellence GR-Ex, Paris, France
- Jean-Christophe Gelly
- INSERM U1134, Paris, France
- Université Paris Diderot, Sorbonne Paris Cité, UMR_S 1134, Paris, France
- Institut National de la Transfusion Sanguine, Paris, France
- Laboratory of Excellence GR-Ex, Paris, France
- Corresponding author
13
Multiple nucleophilic elbows leading to multiple active sites in a single module esterase from Sorangium cellulosum. J Struct Biol 2015; 190:314-27. [DOI: 10.1016/j.jsb.2015.04.009] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 09/05/2014] [Revised: 03/25/2015] [Accepted: 04/10/2015] [Indexed: 11/17/2022]
14
Skorupka K, Han SK, Nam HJ, Kim S, Faham S. Protein design by fusion: implications for protein structure prediction and evolution. ACTA CRYSTALLOGRAPHICA SECTION D: BIOLOGICAL CRYSTALLOGRAPHY 2013; 69:2451-60. [PMID: 24311586 DOI: 10.1107/s0907444913022701] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/05/2013] [Accepted: 08/12/2013] [Indexed: 01/21/2023]
Abstract
Domain fusion is a useful tool in protein design. Here, the structure of a fusion of the heterodimeric flagella-assembly proteins FliS and FliC is reported. Although the ability of the fusion protein to maintain the structure of the heterodimer may be apparent, threading-based structural predictions do not properly fuse the heterodimer. Additional examples of naturally occurring heterodimers that are homologous to full-length proteins were identified. These examples highlight that the designed protein was engineered by the same tools as used in the natural evolution of proteins and that heterodimeric structures contain a wealth of information, currently unused, that can improve structural predictions.
Affiliation(s)
- Katarzyna Skorupka
- Department of Molecular Physiology and Biological Physics, University of Virginia School of Medicine, Charlottesville, VA 22093, USA
|
15
|
Hleap JS, Susko E, Blouin C. Defining structural and evolutionary modules in proteins: a community detection approach to explore sub-domain architecture. BMC STRUCTURAL BIOLOGY 2013; 13:20. [PMID: 24131821 PMCID: PMC4016585 DOI: 10.1186/1472-6807-13-20] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/20/2013] [Accepted: 10/11/2013] [Indexed: 12/23/2022]
Abstract
Background: Assessing protein modularity is important to understand protein evolution. Still, the question of the existence of a sub-domain modular architecture remains. We propose a graph-theory approach with significance and power testing to identify modules in protein structures. In the first step, clusters are determined by optimizing the partition that maximizes the modularity score. Second, each cluster is tested for significance. Significant clusters are referred to as modules. Evolutionary modules are identified by analyzing homologous structures. Dynamic modules are inferred from sets of snapshots of molecular simulations. We present here a methodology to identify sub-domain architecture in a way that is robust, biologically meaningful, and statistically supported. Results: The robustness of this new method is tested using simulated data with known modularity. Modules are correctly identified even when there is a low correlation between landmarks within a module. We also analyzed the evolutionary modularity of a data set of α-amylase catalytic domain homologs, and the dynamic modularity of the Niemann-Pick C1 (NPC1) protein N-terminal domain. The α-amylase contains an (α/β)8 barrel (TIM barrel) with the polysaccharide cleavage site and a calcium-binding domain. In this data set we identified four robust evolutionary modules, one of which forms the minimal functional TIM barrel topology. The NPC1 protein is involved in intracellular lipid metabolism, coordinating sterol trafficking. The NPC1 N-terminus is the first luminal domain, which binds to cholesterol and its oxygenated derivatives. Our inferred dynamic modules in the protein NPC1 are also shown to match functional components of the protein related to the NPC1 disease. Conclusions: A domain compartmentalization can be found and described in correlation space. To our knowledge, there is no other method attempting to identify sub-domain architecture from the correlation among residues. Most attempts focus on sequence motifs of protein-protein interactions, binding sites, or sequence conservation. We were able to describe functional/structural sub-domain architecture related to key residues for starch cleavage, calcium, and chloride binding sites in the α-amylase, and sterol opening-defining modules and disease-related residues in the NPC1. We also described the evolutionary sub-domain architecture of the α-amylase catalytic domain, identifying the already reported minimum functional TIM barrel.
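The first step described in this abstract, scoring a partition by its modularity, can be illustrated with a minimal sketch. It assumes the standard Newman modularity on an unweighted, undirected graph; the abstract works in correlation space and does not give its exact score, so the function name `modularity`, the edge-list input, and the toy graph are illustrative assumptions rather than the authors' implementation.

```python
from collections import defaultdict


def modularity(edges, membership):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges      : list of (u, v) pairs, each undirected edge listed once
    membership : dict mapping every node to a cluster label
    (Both input formats are assumptions made for this sketch.)
    """
    m = len(edges)
    if m == 0:
        return 0.0
    degree = defaultdict(int)
    intra_edges = defaultdict(int)  # edges with both ends in the same cluster
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if membership[u] == membership[v]:
            intra_edges[membership[u]] += 1
    cluster_degree = defaultdict(int)  # summed degree of each cluster
    for node, d in degree.items():
        cluster_degree[membership[node]] += d
    # Q = sum over clusters of (fraction of intra-cluster edges)
    #     minus (expected fraction under random rewiring)
    return sum(
        intra_edges[c] / m - (cluster_degree[c] / (2 * m)) ** 2
        for c in cluster_degree
    )


# Toy example: two triangles joined by a single edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_modules = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_module = {node: "A" for node in range(6)}
print(modularity(toy_edges, two_modules), modularity(toy_edges, one_module))
```

A partition search would call such a score repeatedly and keep the best partition; the significance testing of each cluster mentioned in the abstract is a separate step that this sketch does not cover.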
Affiliation(s)
- Jose Sergio Hleap
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS, B3H 4R2, Canada.
|
16
|
Esque J, Léonard S, de Brevern AG, Oguey C. VLDP web server: a powerful geometric tool for analysing protein structures in their environment. Nucleic Acids Res 2013; 41:W373-8. [PMID: 23761450 PMCID: PMC3692094 DOI: 10.1093/nar/gkt509] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022] Open
Abstract
Protein structures are an ensemble of atoms determined experimentally mostly by X-ray crystallography or Nuclear Magnetic Resonance. Studying 3D protein structures is a key point for better understanding protein function at a molecular level. We propose a set of accurate tools, for analysing protein structures, based on the reliable method of Voronoi–Laguerre tessellations. The Voronoi Laguerre Delaunay Protein web server (VLDPws) computes the Laguerre tessellation on a whole given system first embedded in solvent. Through this fine description, VLDPws gives the following data: (i) Amino acid volumes evaluated with high precision, as confirmed by good correlations with experimental data. (ii) A novel definition of inter-residue contacts within the given protein. (iii) A measure of the residue exposure to solvent that significantly improves the standard notion of accessibility in some cases. At present, no equivalent web server is available. VLDPws provides output in two complementary forms: direct visualization of the Laguerre tessellation, mostly its polygonal molecular surfaces; and files of volumes, areas, contacts and similar data for each residue and each atom. These files are available for download for further analysis. VLDPws can be accessed at http://www.dsimb.inserm.fr/dsimb_tools/vldp.
Affiliation(s)
- Jérémy Esque
- LPTM, CNRS UMR 8089, Université Cergy-Pontoise, F-95302 Cergy-Pontoise, France
|
17
|
Rebehmed J, Alphand V, de Berardinis V, de Brevern AG. Evolution study of the Baeyer-Villiger monooxygenases enzyme family: functional importance of the highly conserved residues. Biochimie 2013; 95:1394-402. [PMID: 23523772 DOI: 10.1016/j.biochi.2013.03.005] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2013] [Accepted: 03/08/2013] [Indexed: 11/19/2022]
Abstract
Baeyer-Villiger monooxygenases (BVMOs) catalyze the transformation of linear and cyclic ketones into their corresponding esters and lactones by introducing an oxygen atom into a C-C bond. This bioreaction has numerous advantages compared to its chemical version; it does not require potentially harmful reagents (i.e., it follows green chemistry principles) and displays significantly better enantio- and regio-selectivity. New potential BVMOs were searched using sequence homology for type I BVMO proteins. 116 new sequences were identified as new putative BVMOs respecting the defined selection criteria. Multiple sequence alignments were carried out on the selected sequences to study the conservation of structurally and/or functionally important amino acids during evolution. The Type I BVMO signature motif was found to be conserved in 94.8% of the sequences. We also noticed the highly conserved - but previously unnoticed - Threonine 167 (93.1%), located in the signature motif; this position could be added to the pattern used to characterize specific Type I enzymes. Amino acids in the vicinity of the FAD and NADPH cofactors were also found to be highly conserved and the details of the interactions were emphasized. Interestingly, residues at the enzyme binding site were found to be less conserved in terms of sequence evolution, leading sometimes to some important amino acid changes. These behaviors could explain the enzyme selectivity and specificity for different ligands.
|
18
|
Gelly JC, Lin HY, de Brevern AG, Chuang TJ, Chen FC. Selective constraint on human pre-mRNA splicing by protein structural properties. Genome Biol Evol 2012; 4:966-75. [PMID: 22936073 PMCID: PMC3468958 DOI: 10.1093/gbe/evs071] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022] Open
Abstract
Alternative splicing (AS) is a major mechanism of increasing proteome diversity in complex organisms. Different AS transcript isoforms may be translated into peptide sequences of significantly different lengths and amino acid compositions. One important question, then, is how AS is constrained by protein structural requirements while peptide sequences may be significantly changed in AS events. Here, we address this issue by examining whether the intactness of three-dimensional protein structural units (compact units in protein structures, namely protein units [PUs]) tends to be preserved in AS events in human. We show that PUs tend to occur in constitutively spliced exons and to overlap constitutive exon boundaries. Furthermore, when PUs are located at the boundaries between two alternatively spliced exons (ASEs), these neighboring ASEs tend to co-occur in different transcript isoforms. In addition, such PU-spanned ASE pairs tend to have a higher frequency of being included in transcript isoforms. ASE regions that overlap with PUs also have lower nonsynonymous-to-synonymous substitution rate ratios than those that do not overlap with PUs, indicating stronger negative selection pressure in PU-overlapped ASE regions. Of note, we show that PUs have protein domain- and structural orderness-independent effects on messenger RNA (mRNA) splicing. Overall, our results suggest that fine-scale protein structural requirements have significant influences on the splicing patterns of human mRNAs.
Affiliation(s)
- Jean-Christophe Gelly
- INSERM, UMR-S 665, Dynamique des Structures et Interactions des Macromolécules Biologiques, Paris, France
|
19
|
Rorick M. Quantifying protein modularity and evolvability: a comparison of different techniques. Biosystems 2012; 110:22-33. [PMID: 22796584 DOI: 10.1016/j.biosystems.2012.06.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2011] [Revised: 06/20/2012] [Accepted: 06/27/2012] [Indexed: 10/28/2022]
Abstract
Modularity increases evolvability by reducing constraints on adaptation and by allowing preexisting parts to function in new contexts for novel uses. Protein evolution provides an excellent context to study the causes and consequences of biological modularity. In order to address such questions, however, an index for protein modularity is necessary. This paper proposes a simple index for protein modularity-"module density"-which is the number of evolutionarily independent modules that compose a protein divided by the number of amino acids in the protein. The decomposition of proteins into constituent modules can be accomplished by either of two classes of methods. The first class of methods relies on "suppositional" criteria to assign amino acids to modules, whereas the second class of methods relies on "coevolutionary" criteria for this task. One simple and practical method from the first class consists of approximating the number of modules in a protein as the number of regular secondary structure elements (i.e., helices and sheets). Methods based on coevolutionary criteria require more elaborate data, but they have the advantage of being able to specify modules without prior assumptions about why they exist. Given the increasing availability of datasets sampling protein mutational spectra (e.g., from comparative genomics, experimental evolution, and computational prediction), methods based on coevolutionary criteria will likely become more promising in the near future. The ability to meaningfully quantify protein modularity via simple indices has the potential to aid future efforts to understand protein evolutionary rate determinants, improve molecular evolution models and engineer novel proteins.
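The "module density" index defined in this abstract (number of modules divided by number of amino acids) is simple enough to sketch directly. The version below uses the abstract's practical approximation of counting regular secondary structure elements as modules. The three-state string encoding (`H` helix, `E` strand, `C` coil) and both function names are assumptions made for illustration, and counting strand runs stands in for the abstract's "helices and sheets".

```python
from itertools import groupby


def count_secondary_structure_elements(ss_string, regular_states=("H", "E")):
    """Count contiguous runs of regular secondary structure.

    ss_string is assumed to be a per-residue string such as "CCHHHHCCEEEE",
    where H = helix, E = strand, and any other letter = irregular.
    """
    return sum(1 for state, _ in groupby(ss_string) if state in regular_states)


def module_density(n_modules, n_residues):
    """Module density = number of modules / number of amino acids."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return n_modules / n_residues


# Hypothetical 19-residue assignment with two helices and one strand.
ss = "CCHHHHCCEEEECCHHHCC"
n_elements = count_secondary_structure_elements(ss)
print(n_elements, module_density(n_elements, len(ss)))
```

Methods from the second, "coevolutionary" class would replace only the module-counting step; the density calculation itself stays the same.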
Affiliation(s)
- Mary Rorick
- University of Michigan, Department of Ecology and Evolutionary Biology, Ann Arbor, MI 48109-1048, United States.
|
20
|
Esque J, Oguey C, de Brevern AG. Comparative Analysis of Threshold and Tessellation Methods for Determining Protein Contacts. J Chem Inf Model 2011; 51:493-507. [DOI: 10.1021/ci100195t] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Affiliation(s)
- Jeremy Esque
- LPTM, CNRS UMR 8089, Université de Cergy Pontoise, 2 av. Adolphe Chauvin, 95302 Cergy-Pontoise, France
- INSERM UMR-S 665, Dynamique des Structures et Interactions des Macromolécules Biologiques (DSIMB), Université Paris Diderot, Paris 7, INTS, 6, rue Alexandre Cabanel, 75739 Paris Cedex 15, France
| | - Christophe Oguey
- LPTM, CNRS UMR 8089, Université de Cergy Pontoise, 2 av. Adolphe Chauvin, 95302 Cergy-Pontoise, France
| | - Alexandre G. de Brevern
- INSERM UMR-S 665, Dynamique des Structures et Interactions des Macromolécules Biologiques (DSIMB), Université Paris Diderot, Paris 7, INTS, 6, rue Alexandre Cabanel, 75739 Paris Cedex 15, France
|
21
|
Gelly JC, de Brevern AG. Protein Peeling 3D: new tools for analyzing protein structures. Bioinformatics 2010; 27:132-3. [DOI: 10.1093/bioinformatics/btq610] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open
|
22
|
Chowriappa P, Dua S, Kanno J, Thompson HW. Protein structure classification based on conserved hydrophobic residues. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2009; 6:639-651. [PMID: 19875862 DOI: 10.1109/tcbb.2008.77] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/28/2023]
Abstract
Protein folding is frequently guided by local residue interactions that form clusters in the protein core. The interactions between residue clusters serve as potential nucleation sites in the folding process. Evidence postulates that the residue interactions are governed by the hydrophobic propensities that the residues possess. An array of hydrophobicity scales has been developed to determine the hydrophobic propensities of residues under different environmental conditions. In this work, we propose a graph-theory-based data mining framework to extract and isolate protein structural features that sustain invariance in evolutionary-related proteins, through the integrated analysis of five well-known hydrophobicity scales over the 3D structure of proteins. We hypothesize that proteins of the same homology contain conserved hydrophobic residues and exhibit analogous residue interaction patterns in the folded state. The results obtained demonstrate that discriminatory residue interaction patterns shared among proteins of the same family can be employed for both the structural and the functional annotation of proteins. We obtained on the average 90 percent accuracy in protein classification with a significantly small feature vector compared to previous results in the area. This work presents an elaborate study, as well as validation evidence, to illustrate the efficacy of the method and the correctness of results reported.
Affiliation(s)
- Pradeep Chowriappa
- Data Mining Research Laboratory and the Department of Computer Science, College of Engineering and Science, Louisiana Tech University, PO Box 10348, Nethken Hall, Ruston, LA 71272, USA.
|
23
|
Faure G, Bornot A, de Brevern AG. Analysis of protein contacts into Protein Units. Biochimie 2009; 91:876-87. [PMID: 19383526 DOI: 10.1016/j.biochi.2009.04.008] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2008] [Accepted: 04/13/2009] [Indexed: 11/18/2022]
Abstract
Three-dimensional structures of proteins are the support of their biological functions. Their folds are maintained by inter-residue interactions, which are one of the main focuses for understanding the mechanisms of protein folding and stability. Furthermore, protein structures can be composed of single or multiple functional domains that can fold and function independently. Hence, dividing a protein into domains is useful for obtaining an accurate structure and function determination. In previous studies, we elucidated protein contact properties according to different definitions and developed a novel methodology named Protein Peeling. Within protein structures, Protein Peeling characterizes small successive compact units along the sequence called protein units (PUs). The cutting done by Protein Peeling maximizes the number of contacts within the PUs and minimizes the number of contacts between them. This method is thus a relevant tool in the context of protein folding research, particularly regarding the hierarchical model proposed by George Rose. Here, we accurately analyze the PUs at different levels of cutting, using a non-redundant protein databank. Distribution of PU sizes, number of PUs or their accessibility are screened to determine their common and different features. Moreover, we highlight the preferential amino acid interactions inside and between PUs. Our results show that PUs are clearly an intermediate level between secondary structures and protein structural domains.
Affiliation(s)
- Guilhem Faure
- INSERM UMR-S 726, Equipe de Bioinformatique Génomique et Moléculaire (EBGM), DSIMB, Université Paris Diderot - Paris 7, case 7113, 2 place Jussieu, 75251 Paris, France
|
24
|
Dong Q, Wang X, Lin L. Prediction of protein local structures and folding fragments based on building-block library. Proteins 2008; 72:353-66. [PMID: 18214964 DOI: 10.1002/prot.21931] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
In recent years, protein structure prediction using local structure information has made great progress. In this study, a novel and effective method is developed to predict the local structure and the folding fragments of proteins. First, the proteins with known structures are split into fragments. Second, these fragments, represented by dihedrals, are clustered to produce the building blocks (BBs). Third, an efficient machine learning method is used to predict the local structures of proteins from sequence profiles. Finally, a bi-gram model, trained by an iterated algorithm, is introduced to simulate the interactions of these BBs. For test proteins, the building-block lattice is constructed, which contains all the folding fragments of the proteins. The local structures and the optimal fragments are then obtained by the dynamic programming algorithm. The experiment is performed on a subset of the PDB database with sequence identity less than 25%. The results show that the performance of the method is better than the method that uses only sequence information. When multiple paths are returned, the average classification accuracy of local structures is 72.27% and the average prediction accuracy of local structures is 67.72%, which is a significant improvement in comparison with previous studies. The method can predict not only the local structures but also the folding fragments of proteins. This work is helpful for the ab initio protein structure prediction and especially, the understanding of the folding process of proteins.
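The final step in this abstract, finding the optimal path through a lattice of candidate building blocks under a bi-gram model with dynamic programming, resembles a Viterbi-style search. The sketch below is a generic version under stated assumptions: per-position candidate scores and bi-gram transition scores are given as log-probabilities, and the name `best_block_path` and its input layout are hypothetical. It is not the authors' algorithm or scoring function.

```python
import math


def best_block_path(position_scores, bigram_scores):
    """Highest-scoring path through a lattice of candidate building blocks.

    position_scores : non-empty list (one entry per position) of dicts
                      {block: log-probability}
    bigram_scores   : dict {(previous_block, block): log-probability};
                      missing pairs are treated as impossible transitions
    Returns (path, total log-probability).
    """
    neg_inf = float("-inf")
    best = dict(position_scores[0])  # best score of any path ending in each block
    backpointers = []
    for scores in position_scores[1:]:
        new_best, pointers = {}, {}
        for block, local in scores.items():
            # Pick the predecessor that maximizes the accumulated score.
            prev, total = max(
                ((p, s + bigram_scores.get((p, block), neg_inf))
                 for p, s in best.items()),
                key=lambda item: item[1],
            )
            new_best[block] = total + local
            pointers[block] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final block.
    last = max(best, key=best.get)
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


# Toy lattice with two hypothetical building blocks, "A" and "B".
log = math.log
positions = [
    {"A": log(0.6), "B": log(0.4)},
    {"A": log(0.5), "B": log(0.5)},
    {"A": log(0.1), "B": log(0.9)},
]
bigrams = {("A", "A"): log(0.7), ("A", "B"): log(0.3),
           ("B", "A"): log(0.4), ("B", "B"): log(0.6)}
print(best_block_path(positions, bigrams))
```

Working in log space keeps long paths numerically stable, and the cost is linear in sequence length and quadratic in the number of candidate blocks per position.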
Affiliation(s)
- Qiwen Dong
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China.
|
25
|
Faure G, Bornot A, de Brevern AG. Protein contacts, inter-residue interactions and side-chain modelling. Biochimie 2008; 90:626-39. [DOI: 10.1016/j.biochi.2007.11.007] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2007] [Accepted: 11/22/2007] [Indexed: 10/22/2022]
|
26
|
Abstract
Domains are considered to be the building blocks of protein structures. A protein can contain a single domain or multiple domains, each one typically associated with a specific function. The combination of domains determines the function of the protein, its subcellular localization and the interactions it is involved in. Determining the domain structure of a protein is important for multiple reasons, including protein function analysis and structure prediction. This chapter reviews the different approaches for domain prediction and discusses lessons learned from the application of these methods.
Affiliation(s)
- Helgi Ingolfsson
- Department of Physiology and Biophysics, Weill Medical College of Cornell University, Ithaca, NY, USA
|
27
|
ProCKSI: a decision support system for Protein (structure) Comparison, Knowledge, Similarity and Information. BMC Bioinformatics 2007; 8:416. [PMID: 17963510 PMCID: PMC2222653 DOI: 10.1186/1471-2105-8-416] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2007] [Accepted: 10/26/2007] [Indexed: 11/19/2022] Open
Abstract
Background: We introduce the decision support system for Protein (Structure) Comparison, Knowledge, Similarity and Information (ProCKSI). ProCKSI integrates various protein similarity measures through an easy-to-use interface that allows the comparison of multiple proteins simultaneously. It employs the Universal Similarity Metric (USM), the Maximum Contact Map Overlap (MaxCMO) of protein structures and other external methods such as the DaliLite and the TM-align methods, the Combinatorial Extension (CE) of the optimal path, and the FAST Align and Search Tool (FAST). Additionally, ProCKSI allows the user to upload a user-defined similarity matrix supplementing the methods mentioned, and computes a similarity consensus in order to provide a rich, integrated, multicriteria view of large datasets of protein structures. Results: We present ProCKSI's architecture and workflow, describing its intuitive user interface, and show its potential on three distinct test cases. In the first case, ProCKSI is used to evaluate the results of a previous CASP competition, assessing the similarity of proposed models for given targets where the structures could have a large deviation from one another. To perform this type of comparison reliably, we introduce a new consensus method. The second study deals with the verification of a classification scheme for protein kinases, originally derived by sequence comparison by Hanks and Hunter, but here we use a consensus similarity measure based on structures. In the third experiment using the Rost and Sander dataset (RS126), we investigate how a combination of different sets of similarity measures influences the quality and performance of ProCKSI's new consensus measure. ProCKSI performs well with all three datasets, showing its potential for complex, simultaneous multi-method assessment of structural similarity in large protein datasets. Furthermore, combining different similarity measures is usually more robust than relying on one single, unique measure. Conclusion: Based on a diverse set of similarity measures, ProCKSI computes a consensus similarity profile for the entire protein set. All results can be clustered, visualised, analysed and easily compared with each other through a simple and intuitive interface. ProCKSI is publicly available at for academic and non-commercial use.
|
28
|
De Brevern AG, Etchebest C, Benros C, Hazout S. "Pinning strategy": a novel approach for predicting the backbone structure in terms of protein blocks from sequence. J Biosci 2007; 32:51-70. [PMID: 17426380 DOI: 10.1007/s12038-007-0006-3] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
The description of protein 3D structures can be performed through a library of 3D fragments, named a structural alphabet. Our structural alphabet is composed of 16 small protein fragments of 5 C alpha in length, called protein blocks (PBs). It allows an efficient approximation of the 3D protein structures and a correct prediction of the local structure. The 72 most frequent series of 5 consecutive PBs, called structural words (SWs), are able to cover more than 90% of the 3D structures. PBs are highly conditioned by the presence of a limited number of transitions between them. In this study, we propose a new method called "pinning strategy" that uses this specific feature to predict long protein fragments. Its goal is to define highly probable successions of PBs. It starts from the most probable SW and is then extended with overlapping SWs. Starting from an initial prediction rate of 34.4%, the use of the SWs instead of the PBs allows a gain of 4.5%. The pinning strategy simply applied to the SWs increases the prediction accuracy to 39.9%. In a second step, the sequence-structure relationship is optimized and the prediction accuracy reaches 43.6%.
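The notion of structural words (series of 5 consecutive PBs) and the coverage they achieve can be made concrete with a small counting sketch. It assumes structures are already encoded as strings with one letter per PB; the function names and the toy sequences are hypothetical, and the numbers it prints have no relation to the 72 words or the 90% coverage reported in the abstract.

```python
from collections import Counter


def structural_words(pb_sequences, length=5):
    """Count every window of `length` consecutive PBs across all sequences."""
    counts = Counter()
    for seq in pb_sequences:
        for start in range(len(seq) - length + 1):
            counts[seq[start:start + length]] += 1
    return counts


def coverage(pb_sequences, words, length=5):
    """Fraction of positions lying inside at least one occurrence of a word."""
    covered = total = 0
    for seq in pb_sequences:
        mask = [False] * len(seq)
        for start in range(len(seq) - length + 1):
            if seq[start:start + length] in words:
                for pos in range(start, start + length):
                    mask[pos] = True
        covered += sum(mask)
        total += len(seq)
    return covered / total if total else 0.0


# Toy PB strings (hypothetical; real encodings come from a PB assignment tool).
sequences = ["mmmmmmm", "abcde"]
counts = structural_words(sequences)
top_words = {word for word, _ in counts.most_common(1)}
print(top_words, coverage(sequences, top_words))
```

Because occurrences overlap, coverage is counted per position rather than per word occurrence, which avoids double counting residues shared by neighbouring windows.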
Affiliation(s)
- A G De Brevern
- INSERM, U726, Equipe de Bioinformatique Génomique et Moléculaire (EBGM), Université Paris 7, case 7113, 2, place Jussieu, 75251 Paris Cedex 05, France.
|
29
|
Taylor WR. Evolutionary transitions in protein fold space. Curr Opin Struct Biol 2007; 17:354-61. [PMID: 17580115 DOI: 10.1016/j.sbi.2007.06.002] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2007] [Revised: 04/11/2007] [Accepted: 06/06/2007] [Indexed: 10/23/2022]
Abstract
With the number of known protein folds potentially approaching completion, the problems associated with their systematic classification are evaluated. It is argued that it will be difficult, if not impossible, to find a general metric based on pairwise comparison that will provide a satisfactory classification. It is suggested that some progress may be made through comparison against a library of idealised 'template' folds, but a proper solution can only be attained if this includes a model of the underlying evolutionary processes. These processes are considered with examples of some unexpected relationships among folds, including circular permutations. The problem is finally set in the wider context of the genetic environment, introducing complications relating to introns, gene fixation and population size.
Affiliation(s)
- William R Taylor
- Division of Mathematical Biology, National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK.
|
30
|
Gelly JC, Etchebest C, Hazout S, de Brevern A. Protein Peeling 2: a web server to convert protein structures into series of protein units. Nucleic Acids Res 2006; 34:W75-8. [PMID: 16845113 PMCID: PMC1538916 DOI: 10.1093/nar/gkl292] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Protein Peeling 2 (PP2) is a web server for the automatic identification of protein units (PUs) given the 3D coordinates of a protein. PUs are an intermediate level of protein structure description between protein domains and secondary structures. It is a new tool to better understand and analyze the organization of protein structures. PP2 uses only the matrices of protein contact probabilities and cuts the protein structures optimally using the Matthews correlation coefficient. An index assesses the compactness quality of each PU. Results are given both textually and graphically using the JMol and PyMol software. The server can be accessed from .
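To illustrate how a Matthews-type correlation can guide where to cut a chain, the sketch below scores every single cut position of a toy protein. It is a simplification under stated assumptions: contacts are binary rather than probabilities, only one cut is made, and a residue pair counts as a "positive prediction" when both residues fall on the same side of the cut. The function names and the scoring setup are illustrative and are not taken from the PP2 server.

```python
import math
from itertools import combinations


def matthews(tp, tn, fp, fn):
    """Matthews correlation coefficient of a 2x2 contingency table."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def best_single_cut(n_residues, contacts):
    """Find the cut that best separates intra-unit from inter-unit contacts.

    contacts : set of (i, j) residue index pairs with i < j (binary contacts,
               an assumption; PP2 itself works on contact probabilities)
    A cut at position c splits residues into [0, c) and [c, n_residues).
    Returns (cut position, score), or (None, -inf) for fewer than 2 residues.
    """
    best_cut, best_score = None, float("-inf")
    for cut in range(1, n_residues):
        tp = tn = fp = fn = 0
        for i, j in combinations(range(n_residues), 2):
            same_unit = (i < cut) == (j < cut)
            in_contact = (i, j) in contacts
            if same_unit and in_contact:
                tp += 1  # contact kept inside a unit
            elif same_unit:
                fp += 1  # same unit, no contact
            elif in_contact:
                fn += 1  # contact broken by the cut
            else:
                tn += 1  # different units, no contact
        score = matthews(tp, tn, fp, fn)
        if score > best_score:
            best_cut, best_score = cut, score
    return best_cut, best_score


# Toy protein: residues 0-2 and 3-5 form two fully connected compact units.
toy_contacts = {(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)}
print(best_single_cut(6, toy_contacts))
```

A hierarchical peeling procedure would apply such a cut recursively to the resulting units; the compactness index mentioned in the abstract is a separate quality measure not modelled here.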
Affiliation(s)
| | - C. Etchebest
- INSERM, U726, Equipe de Bioinformatique Génomique et Moléculaire (EBGM), Université Paris 7, case 7113, 2, place Jussieu, 75251 Paris Cedex 05, France
| | - S. Hazout
- INSERM, U726, Equipe de Bioinformatique Génomique et Moléculaire (EBGM), Université Paris 7, case 7113, 2, place Jussieu, 75251 Paris Cedex 05, France
| | - A.G. de Brevern
- INSERM, U726, Equipe de Bioinformatique Génomique et Moléculaire (EBGM), Université Paris 7, case 7113, 2, place Jussieu, 75251 Paris Cedex 05, France
- To whom correspondence should be addressed. Tel: +33 1 44 27 77 31; Fax: +33 1 43 26 38 30
|