1
Chopra K, Burdak B, Sharma K, Kembhavi A, Mande SC, Chauhan R. CoRNeA: A Pipeline to Decrypt the Inter-Protein Interfaces from Amino Acid Sequence Information. Biomolecules 2020; 10:biom10060938. [PMID: 32580303 PMCID: PMC7356028 DOI: 10.3390/biom10060938] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 05/11/2020] [Revised: 05/26/2020] [Accepted: 05/27/2020] [Indexed: 12/27/2022]
Abstract
Decrypting the interface residues of protein complexes provides insight into the functions of the proteins and, hence, the overall cellular machinery. Computational methods have been devised in the past to predict interface residues from amino acid sequence information, but these methods have mostly been applied to prokaryotic protein complexes. Since the composition and rate of evolution of the primary sequence differ between prokaryotes and eukaryotes, it is important to develop a method specifically for eukaryotic complexes. Here, we report a new hybrid pipeline for predicting protein-protein interaction interfaces in a pairwise manner from the amino acid sequence information of the interacting proteins. Named CoRNeA, it is based on a framework of Co-evolution, machine learning (Random Forest), and Network Analysis, and is trained specifically on eukaryotic protein complexes. We use Co-evolution, physicochemical properties, and contact potential as the major groups of features to train the Random Forest classifier. We also incorporate the intra-contact information of the individual proteins to eliminate false positives from the predictions, keeping in mind that the amino acid sequence of a protein holds information for its own folding and not only for its interface propensities. Our predictions on example datasets show that CoRNeA not only enhances the prediction of true interface residues but also reduces false positive rates significantly.
Affiliation(s)
- Kriti Chopra
  - National Centre for Cell Science, Pune 411007, Maharashtra, India
- Bhawna Burdak
  - National Centre for Cell Science, Pune 411007, Maharashtra, India
- Kaushal Sharma
  - Inter-University Centre for Astronomy and Astrophysics, Pune 411007, Maharashtra, India
- Ajit Kembhavi
  - Inter-University Centre for Astronomy and Astrophysics, Pune 411007, Maharashtra, India
- Shekhar C. Mande
  - Council of Scientific and Industrial Research (CSIR), New Delhi 110001, India
- Radha Chauhan
  - National Centre for Cell Science, Pune 411007, Maharashtra, India
  - Correspondence: Tel.: +91-20-25708255
2
Mura C, Draizen EJ, Bourne PE. Structural biology meets data science: does anything change? Curr Opin Struct Biol 2018; 52:95-102. [PMID: 30267935 DOI: 10.1016/j.sbi.2018.09.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 07/24/2018] [Revised: 08/31/2018] [Accepted: 09/07/2018] [Indexed: 01/22/2023]
Abstract
Data science has emerged from the proliferation of digital data, coupled with advances in algorithms, software and hardware (e.g., GPU computing). Innovations in structural biology have been driven by similar factors, spurring us to ask: can these two fields impact one another in deep and hitherto unforeseen ways? We posit that the answer is yes. New biological knowledge lies in the relationships between sequence, structure, function and disease, all of which play out on the stage of evolution, and data science enables us to elucidate these relationships at scale. Here, we consider the above question from the five key pillars of data science: acquisition, engineering, analytics, visualization and policy, with an emphasis on machine learning as the premier analytics approach.
Affiliation(s)
- Cameron Mura
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, VA 22908, USA
- Eli J Draizen
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, VA 22908, USA
- Philip E Bourne
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, VA 22908, USA; Data Science Institute, University of Virginia, Charlottesville, VA 22904, USA
3
Integrating computational methods and experimental data for understanding the recognition mechanism and binding affinity of protein-protein complexes. Progress in Biophysics and Molecular Biology 2017; 128:33-38. [PMID: 28069340 DOI: 10.1016/j.pbiomolbio.2017.01.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 03/30/2016] [Revised: 01/04/2017] [Accepted: 01/05/2017] [Indexed: 01/09/2023]
Abstract
Protein-protein interactions perform several functions inside the cell. Understanding the recognition mechanism and binding affinity of protein-protein complexes is a challenging problem in experimental and computational biology. In this review, we focus on two aspects: (i) understanding the recognition mechanism and (ii) predicting the binding affinity. The first part deals with computational techniques for identifying the binding site residues and the contribution of important interactions to the recognition mechanism of protein-protein complexes, in comparison with experimental observations. The second part is devoted to methods developed for discriminating high- and low-affinity complexes and for predicting the binding affinity of protein-protein complexes, either from three-dimensional structural information or from the amino acid sequence alone. The overall view enhances our understanding of the integration of experimental data and computational methods, the recognition mechanism of protein-protein complexes, and their binding affinity.
4
Li M, Goncearenco A, Panchenko AR. Annotating Mutational Effects on Proteins and Protein Interactions: Designing Novel and Revisiting Existing Protocols. Methods Mol Biol 2017; 1550:235-260. [PMID: 28188534 PMCID: PMC5388446 DOI: 10.1007/978-1-4939-6747-6_17] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 05/02/2023]
Abstract
In this review we describe a protocol to annotate the effects of missense mutations on proteins, their functions, stability, and binding. For this purpose we present a collection of the most comprehensive databases that store different types of sequencing data on missense mutations, and we discuss their relationships, possible intersections, and unique features. Next, we suggest an annotation workflow using state-of-the-art methods and highlight their usability, advantages, and limitations for different cases. Finally, we address the particularly difficult problem of deciphering the molecular mechanisms of mutations in proteins and protein complexes to understand the origins and mechanisms of diseases.
Affiliation(s)
- Minghui Li
  - National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD, 20894, USA
- Alexander Goncearenco
  - National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD, 20894, USA
- Anna R Panchenko
  - National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD, 20894, USA
5
Exploring Protein-Protein Interactions as Drug Targets for Anti-cancer Therapy with In Silico Workflows. Methods Mol Biol 2017; 1647:221-236. [PMID: 28809006 DOI: 10.1007/978-1-4939-7201-2_15] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Indexed: 12/01/2022]
Abstract
We describe a computational protocol to aid the design of small molecule and peptide drugs that target protein-protein interactions, particularly for anti-cancer therapy. To achieve this goal, we explore multiple strategies, including finding binding hot spots, incorporating chemical similarity and bioactivity data, and sampling similar binding sites from homologous protein complexes. We demonstrate how to combine existing interdisciplinary resources with examples of semi-automated workflows. Finally, we discuss several major problems, including the occurrence of drug-resistant mutations, drug promiscuity, and the design of dual-effect inhibitors.
6
Physical and molecular bases of protein thermal stability and cold adaptation. Curr Opin Struct Biol 2016; 42:117-128. [PMID: 28040640 DOI: 10.1016/j.sbi.2016.12.007] [Citation(s) in RCA: 114] [Impact Index Per Article: 12.7] [Received: 10/01/2016] [Revised: 11/15/2016] [Accepted: 12/11/2016] [Indexed: 11/20/2022]
Abstract
The molecular bases of thermal and cold stability and adaptation, which allow proteins to remain folded and functional in the temperature ranges in which their host organisms live and grow, are still only partially elucidated. Indeed, both experimental and computational studies fail to yield a fully precise and global physical picture, essentially because all effects are context-dependent and thus quite intricate to unravel. We present a snapshot of the current state of knowledge of this highly complex and challenging issue, whose resolution would enable large-scale rational protein design.
7
Sudha G, Srinivasan N. Comparative analyses of quaternary arrangements in homo-oligomeric proteins in superfamilies: Functional implications. Proteins 2016; 84:1190-202. [PMID: 27177429 DOI: 10.1002/prot.25065] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 03/14/2016] [Revised: 05/03/2016] [Accepted: 05/08/2016] [Indexed: 11/08/2022]
Abstract
A comprehensive analysis of the quaternary features of distantly related homo-oligomeric proteins is the focus of the current study. This study has been performed at the levels of quaternary state, symmetry, and quaternary structure. Quaternary state and quaternary structure refer to the number of subunits and the spatial arrangement of subunits, respectively. Using a large dataset of available 3D structures of biologically relevant assemblies, we show that only 53% of the distantly related homo-oligomeric proteins have the same quaternary state. Among these homologous homo-oligomers with the same quaternary state, conservation of quaternary structure is observed in only 38% of the pairs. In 36% of the pairs of distantly related homo-oligomers with different quaternary states, the larger assembly in a pair shows high structural similarity with the entire quaternary structure of the related protein with the lower quaternary state, which is referred to as the "Russian doll effect." The differences in quaternary state and structure have been suggested to contribute to functional diversity. Detailed investigations show that even though the gross functions of many distantly related homo-oligomers are the same, finer-level differences in molecular functions are manifested by differences in quaternary states and structures. Comparison of structures of biological assemblies in distantly and closely related homo-oligomeric proteins throughout the study differentiates the effects of sequence divergence on quaternary structure and function. Knowledge inferred from this study can provide insights for improved protein structure classification and function prediction of homo-oligomers.
Affiliation(s)
- Govindarajan Sudha
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, 560012, India
8
Muratcioglu S, Guven-Maiorov E, Keskin Ö, Gursoy A. Advances in template-based protein docking by utilizing interfaces towards completing structural interactome. Curr Opin Struct Biol 2015; 35:87-92. [PMID: 26539658 DOI: 10.1016/j.sbi.2015.10.001] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 05/30/2015] [Revised: 10/09/2015] [Accepted: 10/13/2015] [Indexed: 11/27/2022]
Abstract
The increase in the number of structurally determined protein complexes strengthens template-based docking (TBD) methods for modelling protein-protein interactions (PPIs). These methods utilize the known structures of protein complexes as templates to predict the quaternary structure of the target proteins. The templates may be partial or complete structures. Interface based (partial) methods have recently gained interest due in part to the observation that the interface regions are reusable. We describe how available template interfaces can be used to obtain the structural models of protein interactions. Despite the agreement that a majority of the protein complexes can be modelled using the available Protein Data Bank (PDB) structures, a handful of studies argue that we need more template proteins to increase the structural coverage of PPIs. We also discuss the performance of the interface TBD methods at large scale, and the significance of capturing multiple conformations for improving accuracy.
Affiliation(s)
- Serena Muratcioglu
  - Department of Chemical and Biological Engineering, Koc University, 34450 Istanbul, Turkey; Center for Computational Biology and Bioinformatics, Koc University, 34450 Istanbul, Turkey
- Emine Guven-Maiorov
  - Department of Chemical and Biological Engineering, Koc University, 34450 Istanbul, Turkey; Center for Computational Biology and Bioinformatics, Koc University, 34450 Istanbul, Turkey
- Özlem Keskin
  - Department of Chemical and Biological Engineering, Koc University, 34450 Istanbul, Turkey; Center for Computational Biology and Bioinformatics, Koc University, 34450 Istanbul, Turkey
- Attila Gursoy
  - Department of Computer Engineering, Koc University, 34450 Istanbul, Turkey; Center for Computational Biology and Bioinformatics, Koc University, 34450 Istanbul, Turkey