1
Su Z, Dhusia K, Wu Y. Encoding the space of protein-protein binding interfaces by artificial intelligence. Comput Biol Chem 2024; 110:108080. [PMID: 38643609 DOI: 10.1016/j.compbiolchem.2024.108080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 04/03/2024] [Accepted: 04/17/2024] [Indexed: 04/23/2024]
Abstract
The physical interactions between proteins are largely determined by the structural properties at their binding interfaces. It was found that the binding interfaces in distinctive protein complexes are highly similar. The structural properties underlying different binding interfaces could be further captured by artificial intelligence. In order to test this hypothesis, we broke protein-protein binding interfaces into pairs of interacting fragments. We employed a generative model to encode these interface fragment pairs in a low-dimensional latent space. After training, new conformations of interface fragment pairs were generated. We found that, by only using a small number of interface fragment pairs that were generated by artificial intelligence, we were able to guide the assembly of protein complexes into their native conformations. These results demonstrate that the conformational space of fragment pairs at protein-protein binding interfaces is highly degenerate. Features in this degenerate space can be well characterized by artificial intelligence. In summary, our machine learning method will be potentially useful to search for and predict the conformations of unknown protein-protein interactions.
Affiliation(s)
- Zhaoqian Su
  - Data Science Institute, Vanderbilt University, 1001 19th Ave S, Nashville, TN 37212, USA
- Kalyani Dhusia
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA

2
Dapkūnas J, Olechnovič K, Venclovas Č. Modeling of protein complexes in CASP14 with emphasis on the interaction interface prediction. Proteins 2021; 89:1834-1843. [PMID: 34176161 PMCID: PMC9292421 DOI: 10.1002/prot.26167] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/01/2021] [Revised: 06/21/2021] [Accepted: 06/23/2021] [Indexed: 01/08/2023]
Abstract
The goal of CASP experiments is to monitor the progress in the protein structure prediction field. During the 14th CASP edition we aimed to test our capabilities of predicting structures of protein complexes. Our protocol for modeling protein assemblies included both template‐based modeling and free docking. Structural templates were identified using sensitive sequence‐based searches. If sequence‐based searches failed, we performed structure‐based template searches using selected CASP server models. In the absence of reliable templates we applied free docking starting from monomers generated by CASP servers. We evaluated and ranked models of protein complexes using an improved version of our protein structure quality assessment method, VoroMQA, taking into account both interaction interface and global structure scores. If reliable templates could be identified, generally accurate models of protein assemblies were generated with the exception of an antibody‐antigen interaction. The success of free docking mainly depended on the accuracy of initial subunit models and on the scoring of docking solutions. To put our overall results in perspective, we analyzed our performance in the context of other CASP groups. Although the subunits in our assembly models often were not of the top quality, these models had, overall, the best‐predicted intersubunit interfaces according to several accuracy measures. We attribute our relative success primarily to the emphasis on the interaction interface when modeling and scoring.
Affiliation(s)
- Justas Dapkūnas
  - Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Kliment Olechnovič
  - Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Česlovas Venclovas
  - Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania

3
Gong W, Guerler A, Zhang C, Warner E, Li C, Zhang Y. Integrating Multimeric Threading With High-throughput Experiments for Structural Interactome of Escherichia coli. J Mol Biol 2021; 433:166944. [PMID: 33741411 DOI: 10.1016/j.jmb.2021.166944] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/20/2020] [Revised: 03/06/2021] [Accepted: 03/09/2021] [Indexed: 10/21/2022]
Abstract
Genome-wide protein-protein interaction (PPI) determination remains a significant unsolved problem in structural biology. The difficulty is twofold since high-throughput experiments (HTEs) have often a relatively high false-positive rate in assigning PPIs, and PPI quaternary structures are more difficult to solve than tertiary structures using traditional structural biology techniques. We proposed a uniform pipeline, Threpp, to address both problems. Starting from a pair of monomer sequences, Threpp first threads both sequences through a complex structure library, where the alignment score is combined with HTE data using a naïve Bayesian classifier model to predict the likelihood of two chains to interact with each other. Next, quaternary complex structures of the identified PPIs are constructed by reassembling monomeric alignments with dimeric threading frameworks through interface-specific structural alignments. The pipeline was applied to the Escherichia coli genome and created 35,125 confident PPIs which is 4.5-fold higher than HTE alone. Graphic analyses of the PPI networks show a scale-free cluster size distribution, consistent with previous studies, which was found critical to the robustness of genome evolution and the centrality of functionally important proteins that are essential to E. coli survival. Furthermore, complex structure models were constructed for all predicted E. coli PPIs based on the quaternary threading alignments, where 6771 of them were found to have a high confidence score that corresponds to the correct fold of the complexes with a TM-score >0.5, and 39 showed a close consistency with the later released experimental structures with an average TM-score = 0.73. These results demonstrated the significant usefulness of threading-based homologous modeling in both genome-wide PPI network detection and complex structural construction.
Affiliation(s)
- Weikang Gong
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA; Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Aysam Guerler
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Chengxin Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Elisa Warner
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Chunhua Li
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA; Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Yang Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, 48109, USA

4
Hertle R, Nazet J, Semmelmann F, Schlee S, Funke F, Merkl R, Sterner R. Reprogramming the Specificity of a Protein Interface by Computational and Data-Driven Design. Structure 2020; 29:292-304.e3. [PMID: 33296666 DOI: 10.1016/j.str.2020.11.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2020] [Revised: 09/21/2020] [Accepted: 11/16/2020] [Indexed: 10/22/2022]
Abstract
The formation of specific protein complexes in a cell is a non-trivial problem given the co-existence of thousands of different polypeptide chains. A particularly difficult case are two glutamine amidotransferase complexes (anthranilate synthase [AS] and aminodeoxychorismate synthase [ADCS]), which are composed of homologous pairs of synthase and glutaminase subunits. We have attempted to identify discriminating interface residues of the glutaminase subunit TrpG from AS, which are responsible for its specific interaction with the synthase subunit TrpEx and prevent binding to the closely related synthase subunit PabB from ADCS. For this purpose, TrpG-specific interface residues were grafted into the glutaminase subunit PabA from ADCS by two different approaches, namely a computational and a data-driven one. Both approaches resulted in PabA variants that bound TrpEx with higher affinity than PabB. Hence, we have accomplished a reprogramming of protein-protein interaction specificity that provides insights into the evolutionary adaptation of protein interfaces.
Affiliation(s)
- Regina Hertle
  - Institute of Biophysics and Physical Biochemistry, Regensburg Center for Biochemistry, University of Regensburg, 93040 Regensburg, Germany
- Julian Nazet
  - Institute of Biophysics and Physical Biochemistry, Regensburg Center for Biochemistry, University of Regensburg, 93040 Regensburg, Germany
- Florian Semmelmann
  - Institute of Biophysics and Physical Biochemistry, Regensburg Center for Biochemistry, University of Regensburg, 93040 Regensburg, Germany
- Sandra Schlee
  - Institute of Biophysics and Physical Biochemistry, Regensburg Center for Biochemistry, University of Regensburg, 93040 Regensburg, Germany
- Franziska Funke
  - Institute of Biophysics and Physical Biochemistry, Regensburg Center for Biochemistry, University of Regensburg, 93040 Regensburg, Germany
- Rainer Merkl
  - Institute of Biophysics and Physical Biochemistry, Regensburg Center for Biochemistry, University of Regensburg, 93040 Regensburg, Germany
- Reinhard Sterner
  - Institute of Biophysics and Physical Biochemistry, Regensburg Center for Biochemistry, University of Regensburg, 93040 Regensburg, Germany

5
Vangaveti S, Vreven T, Zhang Y, Weng Z. Integrating ab initio and template-based algorithms for protein-protein complex structure prediction. Bioinformatics 2020; 36:751-757. [PMID: 31393558 DOI: 10.1093/bioinformatics/btz623] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 02/07/2019] [Revised: 07/03/2019] [Accepted: 08/06/2019] [Indexed: 11/15/2022] Open
Abstract
MOTIVATION: Template-based and template-free methods have both been widely used in predicting the structures of protein-protein complexes. Template-based modeling is effective when a reliable template is available, while template-free methods are required for predicting the binding modes or interfaces that have not been previously observed. Our goal is to combine the two methods to improve computational protein-protein complex structure prediction. RESULTS: Here, we present a method to identify and combine high-confidence predictions of a template-based method (SPRING) with a template-free method (ZDOCK). Cross-validated using the protein-protein docking benchmark version 5.0, our method (ZING) achieved a success rate of 68.2%, outperforming SPRING and ZDOCK, with success rates of 52.1% and 35.9% respectively, when the top 10 predictions were considered per test case. In conclusion, a statistics-based method that evaluates and integrates predictions from template-based and template-free methods is more successful than either method independently. AVAILABILITY AND IMPLEMENTATION: ZING is available for download as a Github repository (https://github.com/weng-lab/ZING.git). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sweta Vangaveti
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Thom Vreven
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Yang Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Zhiping Weng
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA

6
Evolutionary diversification of protein-protein interactions by interface add-ons. Proc Natl Acad Sci U S A 2017; 114:E8333-E8342. [PMID: 28923934 DOI: 10.1073/pnas.1707335114] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 01/24/2023] Open
Abstract
Cells contain a multitude of protein complexes whose subunits interact with high specificity. However, the number of different protein folds and interface geometries found in nature is limited. This raises the question of how protein-protein interaction specificity is achieved on the structural level and how the formation of nonphysiological complexes is avoided. Here, we describe structural elements called interface add-ons that fulfill this function and elucidate their role for the diversification of protein-protein interactions during evolution. We identified interface add-ons in 10% of a representative set of bacterial, heteromeric protein complexes. The importance of interface add-ons for protein-protein interaction specificity is demonstrated by an exemplary experimental characterization of over 30 cognate and hybrid glutamine amidotransferase complexes in combination with comprehensive genetic profiling and protein design. Moreover, growth experiments showed that the lack of interface add-ons can lead to physiologically harmful cross-talk between essential biosynthetic pathways. In sum, our complementary in silico, in vitro, and in vivo analysis argues that interface add-ons are a practical and widespread evolutionary strategy to prevent the formation of nonphysiological complexes by specializing protein-protein interactions.

7
Xiong P, Zhang C, Zheng W, Zhang Y. BindProfX: Assessing Mutation-Induced Binding Affinity Change by Protein Interface Profiles with Pseudo-Counts. J Mol Biol 2016; 429:426-434. [PMID: 27899282 DOI: 10.1016/j.jmb.2016.11.022] [Citation(s) in RCA: 77] [Impact Index Per Article: 9.6] [Received: 08/17/2016] [Revised: 11/22/2016] [Accepted: 11/23/2016] [Indexed: 11/27/2022]
Abstract
Understanding how gene-level mutations affect the binding affinity of protein-protein interactions is a key issue of protein engineering. Due to the complexity of the problem, using physical force field to predict the mutation-induced binding free-energy change remains challenging. In this work, we present a renewed approach to calculate the impact of gene mutations on the binding affinity through the structure-based profiling of protein-protein interfaces, where the binding free-energy change (ΔΔG) is counted as the logarithm of relative probability of mutant amino acids over wild-type ones in the interface alignment matrix; three pseudo-counts are introduced to alleviate the limit of the current interface library. Compared with a previous profile score that was based on the log-odds likelihood calculation, the correlation between predicted and experimental ΔΔG of single-site mutations is increased in this approach from 0.33 to 0.68. The structure-based profile score is found complementary to the physical potentials, where a linear combination of the profile score with the FoldX potential could increase the ΔΔG correlation from 0.46 to 0.74. It is also shown that the profile score is robust for counting the coupling effect of multiple individual mutations. For the mutations involving more than two mutation sites where the correlation between FoldX and experimental data vanishes, the profile-based calculation retains a strong correlation with the experimental measurements.
Affiliation(s)
- Peng Xiong
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Chengxin Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Wei Zheng
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Yang Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA

8
Folador EL, de Carvalho PVSD, Silva WM, Ferreira RS, Silva A, Gromiha M, Ghosh P, Barh D, Azevedo V, Röttger R. In silico identification of essential proteins in Corynebacterium pseudotuberculosis based on protein-protein interaction networks. BMC Systems Biology 2016; 10:103. [PMID: 27814699 PMCID: PMC5097352 DOI: 10.1186/s12918-016-0346-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 06/07/2016] [Accepted: 10/18/2016] [Indexed: 12/27/2022]
Abstract
Background: Corynebacterium pseudotuberculosis (Cp) is a gram-positive bacterium that is classified into equi and ovis serovars. The serovar ovis is the etiological agent of caseous lymphadenitis, a chronic infection affecting sheep and goats, causing economic losses due to carcass condemnation and decreased production of meat, wool, and milk. Current diagnosis or treatment protocols are not fully effective and, thus, require further research into Cp pathogenesis. Results: Here, we mapped known protein-protein interactions (PPI) from various species to nine Cp strains to reconstruct parts of the potential Cp interactome and to identify potentially essential proteins serving as putative drug targets. On average, we predict 16,669 interactions for each of the nine strains (with 15,495 interactions shared among all strains). An in silico sanity check suggests that the potential networks were not formed by spurious interactions but have a strong biological bias. With the inferred Cp networks we identify 181 essential proteins, among which 41 are non-host homologous. Conclusions: The list of candidate interactions of the Cp strains lays the basis for developing novel hypotheses and designing corresponding wet-lab studies. The non-host homologous essential proteins are attractive targets for therapeutic and diagnostic purposes. They allow for searching for small-molecule inhibitors of binding interactions, enabling modern drug discovery. Overall, the predicted Cp PPI networks form a valuable and versatile tool for researchers interested in Corynebacterium pseudotuberculosis. Electronic supplementary material: The online version of this article (doi:10.1186/s12918-016-0346-4) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Edson Luiz Folador
  - Department of General Biology, Instituto de Ciências Biológicas (ICB), Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil; Institute of Biological Sciences, Federal University of Para, Belém, PA, Brazil; Biotechnology Center (CBiotec), Federal University of Paraiba (UFPB), João Pessoa, Brazil
- Paulo Vinícius Sanches Daltro de Carvalho
  - Department of General Biology, Instituto de Ciências Biológicas (ICB), Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil; Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Wanderson Marques Silva
  - Department of General Biology, Instituto de Ciências Biológicas (ICB), Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil
- Rafaela Salgado Ferreira
  - Department of Biochemistry and Immunology, Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil
- Artur Silva
  - Institute of Biological Sciences, Federal University of Para, Belém, PA, Brazil
- Michael Gromiha
  - Department of Biotechnology, Indian Institute of Technology (IIT) Madras, Tamilnadu, India
- Preetam Ghosh
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Debmalya Barh
  - Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal, India
- Vasco Azevedo
  - Department of General Biology, Instituto de Ciências Biológicas (ICB), Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil
- Richard Röttger
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark

9
Szklarczyk D, Morris JH, Cook H, Kuhn M, Wyder S, Simonovic M, Santos A, Doncheva NT, Roth A, Bork P, Jensen LJ, von Mering C. The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible. Nucleic Acids Res 2016; 45:D362-D368. [PMID: 27924014 PMCID: PMC5210637 DOI: 10.1093/nar/gkw937] [Citation(s) in RCA: 4755] [Impact Index Per Article: 594.4] [Received: 09/15/2016] [Accepted: 10/06/2016] [Indexed: 02/06/2023] Open
Abstract
A system-wide understanding of cellular function requires knowledge of all functional interactions between the expressed proteins. The STRING database aims to collect and integrate this information, by consolidating known and predicted protein–protein association data for a large number of organisms. The associations in STRING include direct (physical) interactions, as well as indirect (functional) interactions, as long as both are specific and biologically meaningful. Apart from collecting and reassessing available experimental data on protein–protein interactions, and importing known pathways and protein complexes from curated databases, interaction predictions are derived from the following sources: (i) systematic co-expression analysis, (ii) detection of shared selective signals across genomes, (iii) automated text-mining of the scientific literature and (iv) computational transfer of interaction knowledge between organisms based on gene orthology. In the latest version 10.5 of STRING, the biggest changes are concerned with data dissemination: the web frontend has been completely redesigned to reduce dependency on outdated browser technologies, and the database can now also be queried from inside the popular Cytoscape software framework. Further improvements include automated background analysis of user inputs for functional enrichments, and streamlined download options. The STRING resource is available online, at http://string-db.org/.
Affiliation(s)
- Damian Szklarczyk
  - Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- John H Morris
  - Resource on Biocomputing, Visualization, and Informatics, University of California, San Francisco, CA 94158-2517, USA
- Helen Cook
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
- Michael Kuhn
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Stefan Wyder
  - Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Milan Simonovic
  - Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Alberto Santos
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
- Nadezhda T Doncheva
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
- Alexander Roth
  - Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Peer Bork
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany; Molecular Medicine Partnership Unit, University of Heidelberg and European Molecular Biology Laboratory, 69117 Heidelberg, Germany; Max Delbrück Centre for Molecular Medicine, 13125 Berlin, Germany; Department of Bioinformatics, Biocenter, University of Würzburg, 97074 Würzburg, Germany
- Lars J Jensen
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
- Christian von Mering
  - Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland

10
Garma LD, Medina M, Juffer AH. Structure-based classification of FAD binding sites: A comparative study of structural alignment tools. Proteins 2016; 84:1728-1747. [PMID: 27580869 DOI: 10.1002/prot.25158] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 01/22/2016] [Revised: 07/29/2016] [Accepted: 08/24/2016] [Indexed: 11/06/2022]
Abstract
A total of six different structural alignment tools (TM-Align, TriangleMatch, CLICK, ProBis, SiteEngine and GA-SI) were assessed for their ability to perform two particular tasks: (i) discriminating FAD (flavin adenine dinucleotide) from non-FAD binding sites, and (ii) performing an all-to-all comparison on a set of 883 FAD binding sites for the purpose of classifying them. For the first task, the consistency of each alignment method was evaluated, showing that every method is able to distinguish FAD and non-FAD binding sites with a high Matthews correlation coefficient. Additionally, GA-SI was found to provide alignments different from those of the other approaches. The results obtained for the second task revealed more significant differences among alignment methods, as reflected in the poor correlation of their results and highlighted clearly by the independent evaluation of the structural superimpositions generated by each method. The classification itself was performed using the combined results of all methods, using the best result found for each comparison of binding sites. A number of different clustering methods (Single-linkage, UPGMA, Complete-linkage, SPICKER and k-Means clustering) were also used. The groups of similar binding sites (proteins) or clusters generated by the best performing method were further analyzed in terms of local sequence identity, local structural similarity and conservation of analogous contacts with the FAD ligands. Each of the clusters was characterized by a unique set of structural features or patterns, demonstrating that the groups generated truly reflect the structural diversity of FAD binding sites.
Affiliation(s)
- Leonardo D Garma
  - Biocenter Oulu, and Faculty of Biochemistry and Molecular Medicine, University of Oulu, FI-90014 University of Oulu, Oulu, Finland
- Milagros Medina
  - Department of Biochemistry and Molecular and Cellular Biology, Institute for Biocomputation and Physics of Complex Systems (BIFI), University of Zaragoza, Zaragoza, 50009, Spain
- André H Juffer
  - Biocenter Oulu, and Faculty of Biochemistry and Molecular Medicine, University of Oulu, FI-90014 University of Oulu, Oulu, Finland

11
Frezza E, Lavery R. Internal Normal Mode Analysis (iNMA) Applied to Protein Conformational Flexibility. J Chem Theory Comput 2015; 11:5503-12. [DOI: 10.1021/acs.jctc.5b00724] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Indexed: 11/29/2022]
Affiliation(s)
- Elisa Frezza
  - BMSSI, UMR 5086 CNRS/Univ. Lyon I, Institut de Biologie et Chimie des Protéines, 7 passage du Vercors, Lyon 69367, France
- Richard Lavery
  - BMSSI, UMR 5086 CNRS/Univ. Lyon I, Institut de Biologie et Chimie des Protéines, 7 passage du Vercors, Lyon 69367, France

12
Goncearenco A, Shaytan AK, Shoemaker BA, Panchenko AR. Structural Perspectives on the Evolutionary Expansion of Unique Protein-Protein Binding Sites. Biophys J 2015. [PMID: 26213149 DOI: 10.1016/j.bpj.2015.06.056] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 10/23/2022] Open
Abstract
Structures of protein complexes provide atomistic insights into protein interactions. Human proteins represent a quarter of all structures in the Protein Data Bank; however, available protein complexes cover less than 10% of the human proteome. Although it is theoretically possible to infer interactions in human proteins based on structures of homologous protein complexes, it is still unclear to what extent protein interactions and binding sites are conserved, and whether protein complexes from remotely related species can be used to infer interactions and binding sites. We considered biological units of protein complexes and clustered protein-protein binding sites into similarity groups based on their structure and sequence, which allowed us to identify unique binding sites. We showed that the growth rate of the number of unique binding sites in the Protein Data Bank was much slower than the growth rate of the number of structural complexes. Next, we investigated the evolutionary roots of unique binding sites and identified the major phyletic branches with the largest expansion in the number of novel binding sites. We found that many binding sites could be traced to the universal common ancestor of all cellular organisms, whereas relatively few binding sites emerged at the major evolutionary branching points. We analyzed the physicochemical properties of unique binding sites and found that the most ancient sites were the largest in size, involved many salt bridges, and were the most compact and least planar. In contrast, binding sites that appeared more recently in the evolution of eukaryotes were characterized by a larger fraction of polar and aromatic residues, and were less compact and more planar, possibly due to their more transient nature and roles in signaling processes.
Affiliation(s)
- Alexander Goncearenco
  - Computational Biology Branch of the National Center for Biotechnology Information, Bethesda, Maryland
- Alexey K Shaytan
  - Computational Biology Branch of the National Center for Biotechnology Information, Bethesda, Maryland
- Benjamin A Shoemaker
  - Computational Biology Branch of the National Center for Biotechnology Information, Bethesda, Maryland
- Anna R Panchenko
  - Computational Biology Branch of the National Center for Biotechnology Information, Bethesda, Maryland

13
Zhang Z, Schindler CEM, Lange OF, Zacharias M. Application of Enhanced Sampling Monte Carlo Methods for High-Resolution Protein-Protein Docking in Rosetta. PLoS One 2015; 10:e0125941. [PMID: 26053419 PMCID: PMC4459952 DOI: 10.1371/journal.pone.0125941] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 02/12/2015] [Accepted: 03/26/2015] [Indexed: 11/30/2022] Open
Abstract
The high-resolution refinement of docked protein-protein complexes can provide valuable structural and mechanistic insight into protein complex formation complementing experiment. Monte Carlo (MC) based approaches are frequently applied to sample putative interaction geometries of proteins including also possible conformational changes of the binding partners. In order to explore efficiency improvements of the MC sampling, several enhanced sampling techniques, including temperature or Hamiltonian replica exchange and well-tempered ensemble approaches, have been combined with the MC method and were evaluated on 20 protein complexes using unbound partner structures. The well-tempered ensemble method combined with a 2-dimensional temperature and Hamiltonian replica exchange scheme (WTE-H-REMC) was identified as the most efficient search strategy. Comparison with prolonged MC searches indicates that the WTE-H-REMC approach requires approximately 5 times fewer MC steps to identify near native docking geometries compared to conventional MC searches.
Affiliation(s)
- Zhe Zhang
  - Physik-Department T38, Technische Universität München, James-Franck-Str. 1, 84748 Garching, Germany
- Oliver F. Lange
  - Biomolecular NMR and Munich Center for Integrated Protein Science, Department Chemie, Technische Universität München, Lichtenbergstr. 4, 85748 Garching, Germany
- Martin Zacharias
  - Physik-Department T38, Technische Universität München, James-Franck-Str. 1, 84748 Garching, Germany
  - * E-mail:
|
14
|
|
15
|
Aumentado-Armstrong TT, Istrate B, Murgita RA. Algorithmic approaches to protein-protein interaction site prediction. Algorithms Mol Biol 2015; 10:7. [PMID: 25713596 PMCID: PMC4338852 DOI: 10.1186/s13015-015-0033-9] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Received: 08/23/2014] [Accepted: 01/07/2015] [Indexed: 12/19/2022] Open
Abstract
Interaction sites on protein surfaces mediate virtually all biological activities, and their identification holds promise for disease treatment and drug design. Novel algorithmic approaches for the prediction of these sites have been produced at a rapid rate, and the field has seen significant advancement over the past decade. However, the most current methods have not yet been reviewed in a systematic and comprehensive fashion. Herein, we describe the intricacies of the biological theory, datasets, and features required for modern protein-protein interaction site (PPIS) prediction, and present an integrative analysis of the state-of-the-art algorithms and their performance. First, the major sources of data used by predictors are reviewed, including training sets, evaluation sets, and methods for their procurement. Then, the features employed and their importance in the biological characterization of PPISs are explored. This is followed by a discussion of the methodologies adopted in contemporary prediction programs, as well as their relative performance on the datasets most recently used for evaluation. In addition, the potential utility that PPIS identification holds for rational drug design, hotspot prediction, and computational molecular docking is described. Finally, an analysis of the most promising areas for future development of the field is presented.
|
16
|
Villoutreix BO, Kuenemann MA, Poyet JL, Bruzzoni-Giovanelli H, Labbé C, Lagorce D, Sperandio O, Miteva MA. Drug-Like Protein-Protein Interaction Modulators: Challenges and Opportunities for Drug Discovery and Chemical Biology. Mol Inform 2014; 33:414-437. [PMID: 25254076 PMCID: PMC4160817 DOI: 10.1002/minf.201400040] [Citation(s) in RCA: 84] [Impact Index Per Article: 8.4] [Received: 03/24/2014] [Accepted: 04/21/2014] [Indexed: 12/13/2022]
Abstract
Fundamental processes in living cells are largely controlled by macromolecular interactions and among them, protein-protein interactions (PPIs) have a critical role while their dysregulations can contribute to the pathogenesis of numerous diseases. Although PPIs were considered as attractive pharmaceutical targets already some years ago, they have been thus far largely unexploited for therapeutic interventions with low molecular weight compounds. Several limiting factors, from technological hurdles to conceptual barriers, are known, which, taken together, explain why research in this area has been relatively slow. However, this last decade, the scientific community has challenged the dogma and became more enthusiastic about the modulation of PPIs with small drug-like molecules. In fact, several success stories were reported both, at the preclinical and clinical stages. In this review article, written for the 2014 International Summer School in Chemoinformatics (Strasbourg, France), we discuss in silico tools (essentially post 2012) and databases that can assist the design of low molecular weight PPI modulators (these tools can be found at www.vls3d.com). We first introduce the field of protein-protein interaction research, discuss key challenges and comment recently reported in silico packages, protocols and databases dedicated to PPIs. Then, we illustrate how in silico methods can be used and combined with experimental work to identify PPI modulators.
Affiliation(s)
- Bruno O Villoutreix
  - Université Paris Diderot, Sorbonne Paris Cité, UMRS 973 Inserm, Paris 75013, France
  - Inserm, U973, Paris 75013, France
  - CDithem, Faculté de Pharmacie, 1 rue du Prof Laguesse, 59000 Lille, France
- Melaine A Kuenemann
  - Université Paris Diderot, Sorbonne Paris Cité, UMRS 973 Inserm, Paris 75013, France
  - Inserm, U973, Paris 75013, France
- Jean-Luc Poyet
  - Université Paris Diderot, Sorbonne Paris Cité, UMRS 973 Inserm, Paris 75013, France
  - Inserm, U973, Paris 75013, France
  - IUH, Hôpital Saint-Louis, Paris, France
  - CDithem, Faculté de Pharmacie, 1 rue du Prof Laguesse, 59000 Lille, France
- Heriberto Bruzzoni-Giovanelli
  - Université Paris Diderot, Sorbonne Paris Cité, UMRS 973 Inserm, Paris 75013, France
  - Inserm, U973, Paris 75013, France
  - CIC, Clinical investigation center, Hôpital Saint-Louis, Paris, France
- Céline Labbé
  - Université Paris Diderot, Sorbonne Paris Cité, UMRS 973 Inserm, Paris 75013, France
  - Inserm, U973, Paris 75013, France
- David Lagorce
  - Université Paris Diderot, Sorbonne Paris Cité, UMRS 973 Inserm, Paris 75013, France
  - Inserm, U973, Paris 75013, France
- Olivier Sperandio
  - Université Paris Diderot, Sorbonne Paris Cité, UMRS 973 Inserm, Paris 75013, France
  - Inserm, U973, Paris 75013, France
  - CDithem, Faculté de Pharmacie, 1 rue du Prof Laguesse, 59000 Lille, France
- Maria A Miteva
  - Université Paris Diderot, Sorbonne Paris Cité, UMRS 973 Inserm, Paris 75013, France
  - Inserm, U973, Paris 75013, France
|
17
|
Cukuroglu E, Gursoy A, Nussinov R, Keskin O. Non-redundant unique interface structures as templates for modeling protein interactions. PLoS One 2014; 9:e86738. [PMID: 24475173 PMCID: PMC3903793 DOI: 10.1371/journal.pone.0086738] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Received: 09/28/2013] [Accepted: 12/18/2013] [Indexed: 01/16/2023] Open
Abstract
Improvements in experimental techniques increasingly provide structural data relating to protein-protein interactions. Classification of structural details of protein-protein interactions can provide valuable insights for modeling and abstracting design principles. Here, we aim to cluster protein-protein interactions by their interface structures, and to exploit these clusters to obtain and study shared and distinct protein binding sites. We find that there are 22604 unique interface structures in the PDB. These unique interfaces, which provide a rich resource of structural data of protein-protein interactions, can be used for template-based docking. We test the specificity of these non-redundant unique interface structures by finding protein pairs which have multiple binding sites. We suggest that residues with more than 40% relative accessible surface area should be considered as surface residues in template-based docking studies. This comprehensive study of protein interface structures can serve as a resource for the community. The dataset can be accessed at http://prism.ccbb.ku.edu.tr/piface.
Affiliation(s)
- Engin Cukuroglu
  - Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, Istanbul, Turkey
- Attila Gursoy
  - Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, Istanbul, Turkey
- Ruth Nussinov
  - National Cancer Institute, Cancer and Inflammation Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc., National Cancer Institute, Frederick, Maryland, United States of America
  - Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Ozlem Keskin
  - Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, Istanbul, Turkey
|
18
|
Cummings RD, Pierce JM. The challenge and promise of glycomics. Chem Biol 2014; 21:1-15. [PMID: 24439204 PMCID: PMC3955176 DOI: 10.1016/j.chembiol.2013.12.010] [Citation(s) in RCA: 280] [Impact Index Per Article: 28.0] [Received: 12/11/2013] [Revised: 12/27/2013] [Accepted: 12/30/2013] [Indexed: 01/22/2023]
Abstract
Glycomics is a broad and emerging scientific discipline focused on defining the structures and functional roles of glycans in biological systems. The staggering complexity of the glycome, minimally defined as the repertoire of glycans expressed in a cell or organism, has resulted in many challenges that must be overcome; these are being addressed by new advances in mass spectrometry as well as by the expansion of genetic and cell biology studies. Conversely, identifying the specific glycan recognition determinants of glycan-binding proteins by employing the new technology of glycan microarrays is providing insights into how glycans function in recognition and signaling within an organism and with microbes and pathogens. The promises of a more complete knowledge of glycomes are immense in that glycan modifications of intracellular and extracellular proteins have critical functions in almost all biological pathways.
Affiliation(s)
- Richard D Cummings
  - Department of Biochemistry, Emory Glycomics Center, Emory University School of Medicine, 1510 Clifton Road NE, Atlanta, GA 30322, USA
- J Michael Pierce
  - Complex Carbohydrate Research Center, Department of Biochemistry and Molecular Biology, University of Georgia, 315 Riverbend Road, Athens, GA 30602, USA
|
19
|
Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein–protein interaction. Expert Rev Proteomics 2014; 11:107-20. [DOI: 10.1586/14789450.2014.875857] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Indexed: 01/03/2023]
|
20
|
Folador EL, Hassan SS, Lemke N, Barh D, Silva A, Ferreira RS, Azevedo V. An improved interolog mapping-based computational prediction of protein–protein interactions with increased network coverage. Integr Biol (Camb) 2014; 6:1080-7. [DOI: 10.1039/c4ib00136b] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 12/22/2022]
Abstract
Automated and efficient methods that map ortholog interactions from several organisms and public databases (pDB) are needed to identify new interactions in an organism of interest (interolog mapping).
Affiliation(s)
- Edson Luiz Folador
  - Department of General Biology
  - Instituto de Ciências Biológicas (ICB)
  - Federal University of Minas Gerais (UFMG)
  - Belo Horizonte, Brazil
- Syed Shah Hassan
  - Department of General Biology
  - Instituto de Ciências Biológicas (ICB)
  - Federal University of Minas Gerais (UFMG)
  - Belo Horizonte, Brazil
- Ney Lemke
  - Laboratory of Bioinformatic and Computational Biofisic
  - Instituto de Biociência
  - Universidade Estadual de São Paulo (UNESP)
  - Botucatu, Brazil
- Debmalya Barh
  - Centre for Genomics and Applied Gene Technology
  - Institute of Integrative Omics and Applied Biotechnology (IIOAB)
  - Purba Medinipur, India
- Artur Silva
  - Instituto de Ciências Biológicas
  - Universidade Federal do Para
  - Belém, Brazil
- Rafaela Salgado Ferreira
  - Department of Biochemistry and Immunology
  - Federal University of Minas Gerais (UFMG)
  - Belo Horizonte, Brazil
- Vasco Azevedo
  - Department of General Biology
  - Instituto de Ciências Biológicas (ICB)
  - Federal University of Minas Gerais (UFMG)
  - Belo Horizonte, Brazil
|
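As a rough illustration of the interolog mapping idea named in the Folador et al. abstract above (transferring known interactions from a source organism to orthologous protein pairs in an organism of interest), a minimal sketch follows. It is not the authors' pipeline: the function name `map_interologs`, the dictionary-based ortholog table, and the toy protein identifiers are all hypothetical, and real methods typically add filtering and scoring steps that are omitted here.

```python
def map_interologs(source_interactions, orthologs):
    """Predict target-organism interactions from source-organism ones.

    source_interactions: iterable of (protein_a, protein_b) pairs known
        to interact in the source organism.
    orthologs: dict mapping a source protein to a list of its orthologs
        in the target organism (hypothetical input format).
    Returns a set of unordered target pairs, stored as sorted tuples.
    """
    predicted = set()
    for a, b in source_interactions:
        # A pair is transferred only if both partners have orthologs.
        for target_a in orthologs.get(a, ()):
            for target_b in orthologs.get(b, ()):
                predicted.add(tuple(sorted((target_a, target_b))))
    return predicted


# Toy data: protein "C" has no ortholog, so the B-C interaction is dropped.
known = [("A", "B"), ("B", "C")]
ortholog_table = {"A": ["a1"], "B": ["b1", "b2"]}
predictions = map_interologs(known, ortholog_table)
print(sorted(predictions))
```

Storing each pair as a sorted tuple treats interactions as undirected, so the same predicted pair is not counted twice when it is reached from several source interactions.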
21
|
Template-based structure modeling of protein-protein interactions. Curr Opin Struct Biol 2013; 24:10-23. [PMID: 24721449 DOI: 10.1016/j.sbi.2013.11.005] [Citation(s) in RCA: 116] [Impact Index Per Article: 10.5] [Received: 09/14/2013] [Revised: 10/29/2013] [Accepted: 11/21/2013] [Indexed: 01/21/2023]
Abstract
The structure of protein-protein complexes can be constructed by using the known structure of other protein complexes as a template. The complex structure templates are generally detected either by homology-based sequence alignments or, given the structure of monomer components, by structure-based comparisons. Critical improvements have been made in recent years by utilizing interface recognition and by recombining monomer and complex template libraries. Encouraging progress has also been witnessed in genome-wide applications of template-based modeling, with modeling accuracy comparable to high-throughput experimental data. Nevertheless, bottlenecks exist due to the incompleteness of the protein-protein complex structure library and the lack of methods for distant homologous template identification and full-length complex structure refinement.
|
22
|
Ahmed MH, Habtemariam M, Safo MK, Scarsdale JN, Spyrakis F, Cozzini P, Mozzarelli A, Kellogg GE. Unintended consequences? Water molecules at biological and crystallographic protein–protein interfaces. Comput Biol Chem 2013; 47:126-41. [DOI: 10.1016/j.compbiolchem.2013.08.009] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 06/21/2013] [Revised: 08/27/2013] [Accepted: 08/27/2013] [Indexed: 01/31/2023]
|
23
|
Gardiner J. Evolutionary basins of attraction and convergence in plants and animals. Commun Integr Biol 2013; 6:e26760. [PMID: 24505506 PMCID: PMC3914912 DOI: 10.4161/cib.26760] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 09/06/2013] [Accepted: 10/09/2013] [Indexed: 11/19/2022] Open
Abstract
Living organisms evolve, in part, according to the underlying properties of the amino acids and other compounds of which they are composed. Thus there are evolutionary basins of attraction that living organisms will tend to evolve toward. These processes are complex and probably beyond our current capabilities to fully envisage. But progress is being made toward an understanding of such principles by efforts to catalog protein folds and protein–protein interactions. Even plants and animals show convergent evolution, possibly driven by underlying evolutionary basins of attraction. Physical and chemical parameters and the properties of proteins present in the last common ancestor of these 2 taxa, including a putative connexin ancestor, may have played key roles here. Thus evolution is perhaps not as random as is sometimes depicted, but will follow predefined pathways. Here I address convergent evolution in plants and animals beginning at the molecular level and progressing to the organismic one.
Affiliation(s)
- John Gardiner
  - The School of Biological Sciences; The University of Sydney; Camperdown, NSW Australia
|
24
|
Zhang Z, Lange OF. Replica exchange improves sampling in low-resolution docking stage of RosettaDock. PLoS One 2013; 8:e72096. [PMID: 24009670 PMCID: PMC3756964 DOI: 10.1371/journal.pone.0072096] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 02/14/2013] [Accepted: 07/10/2013] [Indexed: 11/18/2022] Open
Abstract
Many protein-protein docking protocols are based on a shotgun approach, in which thousands of independent random-start trajectories minimize the rigid-body degrees of freedom. Another strategy is enumerative sampling as used in ZDOCK. Here, we introduce an alternative strategy, ReplicaDock, using a small number of long trajectories of temperature replica exchange. We compare replica exchange sampling as low-resolution stage of RosettaDock with RosettaDock's original shotgun sampling as well as with ZDOCK. A benchmark of 30 complexes starting from structures of the unbound binding partners shows improved performance for ReplicaDock and ZDOCK when compared to shotgun sampling at equal or less computational expense. ReplicaDock and ZDOCK consistently reach lower energies and generate significantly more near-native conformations than shotgun sampling. Accordingly, they both improve typical metrics of prediction quality of complex structures after refinement. Additionally, the refined ReplicaDock ensembles reach significantly lower interface energies and many previously hidden features of the docking energy landscape become visible when ReplicaDock is applied.
Affiliation(s)
- Zhe Zhang
  - Biomolecular NMR and Munich Center for Integrated Protein Science, Department Chemie, Technische Universität München, Garching, Germany
- Oliver F. Lange
  - Biomolecular NMR and Munich Center for Integrated Protein Science, Department Chemie, Technische Universität München, Garching, Germany
  - Institute of Structural Biology, Helmholtz Zentrum München, Neuherberg, Germany
  - * E-mail:
|
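As a rough illustration of the temperature replica exchange sampling mentioned in the Zhang and Lange abstract above, the sketch below computes the standard Metropolis acceptance probability for swapping the configurations of two replicas held at different temperatures. It is not RosettaDock or ReplicaDock code: the function name `swap_acceptance`, the reduced units (Boltzmann constant set to 1 by default), and the example energies are assumptions made purely for illustration.

```python
import math


def swap_acceptance(energy_i, energy_j, temp_i, temp_j, k_b=1.0):
    """Metropolis probability of exchanging replicas i and j.

    Uses the textbook criterion
        p = min(1, exp((beta_i - beta_j) * (E_i - E_j)))
    with beta = 1 / (k_b * T). Units are assumed to be consistent.
    """
    beta_i = 1.0 / (k_b * temp_i)
    beta_j = 1.0 / (k_b * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # A non-negative exponent means the swap is always accepted.
    return 1.0 if delta >= 0 else math.exp(delta)


# The colder replica (T = 1) holds the higher energy: the swap is certain.
p_favourable = swap_acceptance(-10.0, -12.0, temp_i=1.0, temp_j=2.0)
# The colder replica already holds the lower energy: the swap is less likely.
p_unfavourable = swap_acceptance(-12.0, -10.0, temp_i=1.0, temp_j=2.0)
print(p_favourable, p_unfavourable)
```

In a full simulation this probability would be compared with a uniform random number at regular intervals between neighbouring temperature levels; that loop is left out to keep the sketch short.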
25
|
Vreven T, Hwang H, Pierce BG, Weng Z. Evaluating template-based and template-free protein-protein complex structure prediction. Brief Bioinform 2013; 15:169-76. [PMID: 23818491 DOI: 10.1093/bib/bbt047] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Indexed: 12/27/2022] Open
Abstract
We compared the performance of template-free (docking) and template-based methods for the prediction of protein-protein complex structures. We found similar performance for a template-based method based on threading (COTH) and another template-based method based on structural alignment (PRISM). The template-based methods showed similar performance to a docking method (ZDOCK) when the latter was allowed one prediction for each complex, but when the same number of predictions was allowed for each method, the docking approach outperformed template-based approaches. We identified strengths and weaknesses in each method. Template-based approaches were better able to handle complexes that involved conformational changes upon binding. Furthermore, the threading-based and docking methods were better than the structural-alignment-based method for enzyme-inhibitor complex prediction. Finally, we show that the near-native (correct) predictions were generally not shared by the various approaches, suggesting that integrating their results could be the superior strategy.
Affiliation(s)
- Thom Vreven
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, ASC-5th floor room 1069, 368 Plantation St., Worcester, MA 01605, USA
|
26
|
Guerler A, Govindarajoo B, Zhang Y. Mapping monomeric threading to protein-protein structure prediction. J Chem Inf Model 2013; 53:717-25. [PMID: 23413988 DOI: 10.1021/ci300579r] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.5] [Indexed: 11/29/2022]
Abstract
The key step of template-based protein-protein structure prediction is the recognition of complexes from experimental structure libraries that have similar quaternary fold. Maintaining two monomer and dimer structure libraries is however laborious, and inappropriate library construction can degrade template recognition coverage. We propose a novel strategy SPRING to identify complexes by mapping monomeric threading alignments to protein-protein interactions based on the original oligomer entries in the PDB, which does not rely on library construction and increases the efficiency and quality of complex template recognitions. SPRING is tested on 1838 nonhomologous protein complexes which can recognize correct quaternary template structures with a TM score >0.5 in 1115 cases after excluding homologous proteins. The average TM score of the first model is 60% and 17% higher than that by HHsearch and COTH, respectively, while the number of targets with an interface RMSD <2.5 Å by SPRING is 134% and 167% higher than these competing methods. SPRING is controlled with ZDOCK on 77 docking benchmark proteins. Although the relative performance of SPRING and ZDOCK depends on the level of homology filters, a combination of the two methods can result in a significantly higher model quality than ZDOCK at all homology thresholds. These data demonstrate a new efficient approach to quaternary structure recognition that is ready to use for genome-scale modeling of protein-protein interactions due to the high speed and accuracy.
Affiliation(s)
- Aysam Guerler
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109, United States
|