1
Martin J. AlphaFold2 Predicts Whether Proteins Interact Amidst Confounding Structural Compatibility. J Chem Inf Model 2024; 64:1473-1480. [PMID: 38373070 DOI: 10.1021/acs.jcim.3c01805] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2024]
Abstract
Predicting whether two proteins physically interact is one of the holy grails of computational biology, galvanized by rapid advancements in deep learning. AlphaFold2, although not developed with this goal, is promising in this respect. Here, I test the prediction capability of AlphaFold2 on a very challenging data set, where proteins are structurally compatible, even when they do not interact. AlphaFold2 achieves high discrimination between interacting and non-interacting proteins, and the cases of misclassifications can either be rescued by revisiting the input sequences or can suggest false positives and negatives in the data set. AlphaFold2 is thus not impaired by the compatibility between protein structures and has the potential to be applied on a large scale.
Affiliation(s)
- Juliette Martin
- Univ Lyon, CNRS, UMR 5086 MMSB, 7 passage du Vercors F-69367, Lyon, France
- Laboratory of Biology and Modeling of the Cell, Ecole Normale Supérieure de Lyon, CNRS UMR 5239, Inserm U1293, University Claude Bernard Lyon 1, 69364, Lyon, France

2
Vyas P, Kumar PBS, Das SL. Sorting of proteins with shape and curvature anisotropy on a lipid bilayer tube. Soft Matter 2022; 18:1653-1665. [PMID: 35132986 DOI: 10.1039/d2sm00077f] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/14/2023]
Abstract
Curvature induced sorting of lipid membrane bound proteins has been widely studied through experiments that induce curvature variation in a giant unilamellar lipid-bilayer vesicle with adsorbed proteins by pulling thin cylindrical tethers. In the theoretical space, this has been supplemented with models that capture curvature dependent interaction between membrane and idealized protein particles, through free energy contributions. Many membrane proteins such as the BAR domain proteins are known to have extremely anisotropic shapes and soft interacting potentials, whereas the idealizations of protein particles explored in models have only assumed them as hard disk-like particles with curvature anisotropy. Here, we present a model of sorting of the proteins while including the effects of softness in their interaction potentials, shape anisotropy in the protein structure, and curvature anisotropy in the interactions with the membrane. This is based on a clean separation of free energy contributions from non-ideal fluid behavior of soft anisotropic particles and curvature interactions between proteins and membranes. We probe the behavior of the sorting function under limiting conditions and show that it converges to the previously derived models. In addition to this, we present a comparison of the variation in sorting ratio due to the observed variation in the shape parameter values in known membrane proteins. Finally, using published experimental data for membrane proteins, we perform fitting and derive model parameters. We observe that shape anisotropy adversely affects the sorting of proteins to a high curvature region, whereas curvature anisotropy and softer interaction between proteins favor sorting.
Affiliation(s)
- Pranav Vyas
- Department of Bioengineering, Stanford University, Stanford, California 94305, USA.
- P B Sunil Kumar
- Department of Physics, Indian Institute of Technology Palakkad, Palakkad 678623, India
- Sovan Lal Das
- Physical and Chemical Biology Laboratory and Department of Mechanical Engineering, Indian Institute of Technology Palakkad, Palakkad 678623, India

3
Prévost C, Sacquin-Mora S. Moving pictures: Reassessing docking experiments with a dynamic view of protein interfaces. Proteins 2021; 89:1315-1323. [PMID: 34038009 DOI: 10.1002/prot.26152] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/12/2021] [Revised: 03/22/2021] [Accepted: 05/19/2021] [Indexed: 11/06/2022]
Abstract
The modeling of protein assemblies at the atomic level remains a central issue in structural biology, as protein interactions play a key role in numerous cellular processes. This problem is traditionally addressed using docking tools, where the quality of the models is based on their similarity to a single reference experimental structure. However, using a static reference does not take into account the dynamic quality of the protein interface. Here, we used all-atom classical Molecular Dynamics simulations to investigate the stability of the reference interface for three complexes that previously served as targets in the CAPRI competition. For each one of these targets, we also ran MD simulations for ten models that are distributed over the High, Medium and Acceptable accuracy categories. To assess the quality of these models from a dynamic perspective, we set up new criteria which take into account the stability of the reference experimental protein interface. We show that, when the protein interfaces are allowed to evolve along time, the original ranking based on the static CAPRI criteria no longer holds as over 50% of the docking models undergo a category change (which can be either toward a better or a lower accuracy group) when reassessing their quality using dynamic information.
Affiliation(s)
- Chantal Prévost
- CNRS, Laboratoire de Biochimie Théorique, UPR9080, Université de Paris, Paris, France; Institut de Biologie Physico-Chimique, Fondation Edmond de Rothschild, PSL Research University, Paris, France
- Sophie Sacquin-Mora
- CNRS, Laboratoire de Biochimie Théorique, UPR9080, Université de Paris, Paris, France; Institut de Biologie Physico-Chimique, Fondation Edmond de Rothschild, PSL Research University, Paris, France

4
Hossain MS, Hossan MI, Mizan S, Moin AT, Yasmin F, Akash AS, Powshi SN, Hasan AR, Chowdhury AS. Immunoinformatics approach to designing a multi-epitope vaccine against Saint Louis Encephalitis Virus. Informatics in Medicine Unlocked 2021. [DOI: 10.1016/j.imu.2020.100500] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 01/15/2023]

5
Guo F, Zou Q, Yang G, Wang D, Tang J, Xu J. Identifying protein-protein interface via a novel multi-scale local sequence and structural representation. BMC Bioinformatics 2019; 20:483. [PMID: 31874604 PMCID: PMC6929278 DOI: 10.1186/s12859-019-3048-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 08/14/2019] [Accepted: 08/21/2019] [Indexed: 12/23/2022]
Abstract
Background Protein-protein interaction plays a key role in a multitude of biological processes, such as signal transduction, de novo drug design, immune responses, and enzymatic activities. Gaining insights of various binding abilities can deepen our understanding of the interaction. It is of great interest to understand how proteins in a complex interact with each other. Many efficient methods have been developed for identifying protein-protein interface. Results In this paper, we obtain the local information on protein-protein interface, through multi-scale local average block and hexagon structure construction. Given a pair of proteins, we use a trained support vector regression (SVR) model to select best configurations. On Benchmark v4.0, our method achieves average Irmsd value of 3.28Å and overall Fnat value of 63%, which improves upon Irmsd of 3.89Å and Fnat of 49% for ZRANK, and Irmsd of 3.99Å and Fnat of 46% for ClusPro. On CAPRI targets, our method achieves average Irmsd value of 3.45Å and overall Fnat value of 46%, which improves upon Irmsd of 4.18Å and Fnat of 40% for ZRANK, and Irmsd of 5.12Å and Fnat of 32% for ClusPro. The success rates by our method, FRODOCK 2.0, InterEvDock and SnapDock on Benchmark v4.0 are 41.5%, 29.0%, 29.4% and 37.0%, respectively. Conclusion Experiments show that our method performs better than some state-of-the-art methods, based on the prediction quality improved in terms of CAPRI evaluation criteria. All these results demonstrate that our method is a valuable technological tool for identifying protein-protein interface.
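The Fnat values quoted in this abstract follow the CAPRI convention: the fraction of residue-residue contacts in the reference interface that are reproduced in a docking model. A minimal Python sketch of that ratio is given here, assuming the contacts have already been extracted as (receptor residue, ligand residue) pairs. The function name `fnat` and the toy residue labels are hypothetical and are not taken from the paper.

```python
def fnat(native_contacts, model_contacts):
    """Fraction of native interface contacts that also appear in the model."""
    native = set(native_contacts)
    if not native:
        raise ValueError("reference interface has no contacts")
    # Count the native contacts recovered by the model, then normalise.
    return len(native & set(model_contacts)) / len(native)


# Toy example with made-up residue labels (illustrative only).
native = {("A10", "B5"), ("A11", "B5"), ("A12", "B7"), ("A30", "B9")}
model = {("A10", "B5"), ("A12", "B7"), ("A40", "B2")}
print(fnat(native, model))  # 0.5
```

Contacts present only in the model (here `("A40", "B2")`) do not lower Fnat; they are penalised by other CAPRI measures such as the interface RMSD.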
Affiliation(s)
- Fei Guo
- College of Intelligence and Computing, Tianjin University, Tianjin, People's Republic of China.
- Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, People's Republic of China
- Guang Yang
- School of Economics, Nankai University, Tianjin, People's Republic of China
- Dan Wang
- Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong
- Jijun Tang
- College of Intelligence and Computing, Tianjin University, Tianjin, People's Republic of China; Department of Computer Science and Engineering, University of South Carolina, Columbia, USA
- Junhai Xu
- College of Intelligence and Computing, Tianjin University, Tianjin, People's Republic of China

6
Roel-Touris J, Don CG, Honorato RV, Rodrigues JPGLM, Bonvin AMJJ. Less Is More: Coarse-Grained Integrative Modeling of Large Biomolecular Assemblies with HADDOCK. J Chem Theory Comput 2019; 15:6358-6367. [PMID: 31539250 PMCID: PMC6854652 DOI: 10.1021/acs.jctc.9b00310] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Indexed: 12/30/2022]
Abstract
Predicting the 3D structure of protein interactions remains a challenge in the field of computational structural biology. This is in part due to difficulties in sampling the complex energy landscape of multiple interacting flexible polypeptide chains. Coarse-graining approaches, which reduce the number of degrees of freedom of the system, help address this limitation by smoothing the energy landscape, allowing an easier identification of the global energy minimum. They also accelerate the calculations, allowing for modeling larger assemblies. Here, we present the implementation of the MARTINI coarse-grained force field for proteins into HADDOCK, our integrative modeling platform. Docking and refinement are performed at the coarse-grained level, and the resulting models are then converted back to atomistic resolution through a distance restraints-guided morphing procedure. Our protocol, tested on the largest complexes of the protein docking benchmark 5, shows an overall ∼7-fold speed increase compared to standard all-atom calculations, while maintaining a similar accuracy and yielding substantially more near-native solutions. To showcase the potential of our method, we performed simultaneous 7 body docking to model the 1:6 KaiC-KaiB complex, integrating mutagenesis and hydrogen/deuterium exchange data from mass spectrometry with symmetry restraints, and validated the resulting models against a recently published cryo-EM structure.
Affiliation(s)
- Jorge Roel-Touris
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht 3584CH, The Netherlands
- Charleen G Don
- Department of Pharmaceutical Sciences, University of Basel, 4056 Basel, Switzerland
- Rodrigo V Honorato
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht 3584CH, The Netherlands
- João P G L M Rodrigues
- Department of Structural Biology, Stanford University School of Medicine, Stanford, California 94305, United States
- Alexandre M J J Bonvin
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht 3584CH, The Netherlands

7
Yu L, Yao S, Gao L, Zha Y. Conserved Disease Modules Extracted From Multilayer Heterogeneous Disease and Gene Networks for Understanding Disease Mechanisms and Predicting Disease Treatments. Front Genet 2019; 9:745. [PMID: 30713550 PMCID: PMC6346701 DOI: 10.3389/fgene.2018.00745] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 11/07/2018] [Accepted: 12/27/2018] [Indexed: 12/29/2022]
Abstract
Disease relationship studies are important for understanding the pathogenesis of complex diseases and for diagnosis, prognosis, and drug development. Traditional approaches consider one type of disease data or aggregate multiple types of disease data into a single network, which loses important temporal or context-related information and may distort the actual organization. Therefore, it is necessary to apply a multilayer network model that considers multiple types of relationships between diseases and the important interplay between different relationships. Further, modules extracted from multilayer networks are smaller and have more overlap, and so better capture the actual organization. Here, we constructed a weighted four-layer disease-disease similarity network to characterize the associations at different levels between diseases. Then, a tensor-based computational framework was used to extract Conserved Disease Modules (CDMs) from the four-layer disease network. After filtering, nine significant CDMs were retained. A statistical significance test confirmed the significance of the nine CDMs. Compared with modules obtained from the four single-layer networks, CDMs are smaller, better represent the actual relationships, and contain potential disease-disease relationships. KEGG pathway enrichment analysis and literature mining further confirmed that these CDMs are highly reliable. Furthermore, the CDMs can be applied to predict potential drugs for diseases. Molecular docking techniques were used to provide direct evidence that the drugs can treat the related disease. Taking Rheumatoid Arthritis (RA) as a case, we found three potential drugs: Carvedilol, Metoprolol, and Ramipril. Many studies have pointed out that Carvedilol and Ramipril have an effect on RA. Overall, the CDMs extracted from multilayer networks provide an understanding of disease mechanisms from the perspective of multilayer networks, and also provide an effective way to predict potential drugs for a disease based on its neighbors in the same CDM.
Affiliation(s)
- Liang Yu
- School of Computer Science and Technology, Xidian University, Xi'an, China
- Shunyu Yao
- School of Computer Science and Technology, Xidian University, Xi'an, China
- Lin Gao
- School of Computer Science and Technology, Xidian University, Xi'an, China
- Yunhong Zha
- Department of Neurology, Institute of Neural Regeneration and Repair, Three Gorges University College of Medicine, The First Hospital of Yichang, Yichang, China

8
Lagarde N, Carbone A, Sacquin-Mora S. Hidden partners: Using cross-docking calculations to predict binding sites for proteins with multiple interactions. Proteins 2018; 86:723-737. [DOI: 10.1002/prot.25506] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 10/06/2017] [Revised: 03/23/2018] [Accepted: 04/07/2018] [Indexed: 02/06/2023]
Affiliation(s)
- Nathalie Lagarde
- Laboratoire de Biochimie Théorique, CNRS UPR9080, Institut de Biologie Physico-Chimique, University Paris Diderot, Sorbonne Paris Cité, 13 rue Pierre et Marie Curie; Paris 75005 France
- Alessandra Carbone
- Laboratoire de Biologie Computationnelle et Quantitative, CNRS UMR7238, UPMC Univ-Paris 6, Sorbonne Université, 4 place Jussieu; Paris 75005 France
- Institut Universitaire de France; Paris 75005 France
- Sophie Sacquin-Mora
- Laboratoire de Biochimie Théorique, CNRS UPR9080, Institut de Biologie Physico-Chimique, University Paris Diderot, Sorbonne Paris Cité, 13 rue Pierre et Marie Curie; Paris 75005 France

9
Liu G, Chai B, Yang K, Yu J, Zhou X. Overlapping functional modules detection in PPI network with pair-wise constrained non-negative matrix tri-factorisation. IET Syst Biol 2018. [PMID: 29533217 PMCID: PMC8687432 DOI: 10.1049/iet-syb.2017.0084] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/15/2022]
Abstract
A large amount of protein–protein interaction (PPI) data has been generated by high‐throughput experimental techniques. Uncovering functional modules from PPI networks will help us better understand the underlying mechanisms of cellular functions. Numerous computational algorithms have been designed over the past decades to identify functional modules automatically. However, most community detection methods (non‐overlapping or overlapping types) are unsupervised models, which cannot incorporate well‐known protein complexes as prior knowledge. The authors propose a novel semi‐supervised model named pairwise constrained non‐negative matrix tri‐factorisation (PCNMTF), which takes full advantage of well‐known protein complexes to find overlapping functional modules from PPI networks based on a protein module indicator matrix and a module correlation matrix simultaneously. PCNMTF determinately models and learns the mixed module memberships of each protein by considering the correlation among modules simultaneously based on non‐negative matrix tri‐factorisation. The experimental results on both synthetic and real‐world biological networks demonstrate that PCNMTF finds more precise functional modules than state‐of‐the‐art methods.
Affiliation(s)
- Guangming Liu
- Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, No. 3 Shangyuancun Haidian District, Beijing, People's Republic of China
- Bianfang Chai
- Department of Information Engineering, Hebei GEO University, Shijiazhuang, People's Republic of China
- Kuo Yang
- Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, No. 3 Shangyuancun Haidian District, Beijing, People's Republic of China
- Jian Yu
- Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, No. 3 Shangyuancun Haidian District, Beijing, People's Republic of China
- Xuezhong Zhou
- Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, No. 3 Shangyuancun Haidian District, Beijing, People's Republic of China.

10
Qiu Z, Zhou B, Yuan J. Protein–protein interaction site predictions with minimum covariance determinant and Mahalanobis distance. J Theor Biol 2017; 433:57-63. [DOI: 10.1016/j.jtbi.2017.08.026] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 04/26/2017] [Revised: 08/26/2017] [Accepted: 08/30/2017] [Indexed: 10/18/2022]

11
Yu L, Wang B, Ma X, Gao L. The extraction of drug-disease correlations based on module distance in incomplete human interactome. BMC Systems Biology 2016; 10:111. [PMID: 28155709 PMCID: PMC5260043 DOI: 10.1186/s12918-016-0364-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Indexed: 02/01/2023]
Abstract
BACKGROUND Extracting drug-disease correlations is crucial in unveiling disease mechanisms, as well as discovering new indications of available drugs, or drug repositioning. Both the interactome and the knowledge of disease-associated and drug-associated genes remain incomplete. RESULTS We present a new method to predict the associations between drugs and diseases. Our method is based on a module distance, which is originally proposed to calculate distances between modules in incomplete human interactome. We first map all the disease genes and drug genes to a combined protein interaction network. Then based on the module distance, we calculate the distances between drug gene sets and disease gene sets, and take the distances as the relationships of drug-disease pairs. We also filter possible false positive drug-disease correlations by p-value. Finally, we validate the top-100 drug-disease associations related to six drugs in the predicted results. CONCLUSION The overlapping between our predicted correlations with those reported in Comparative Toxicogenomics Database (CTD) and literatures, and their enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways demonstrate our approach can not only effectively identify new drug indications, but also provide new insight into drug-disease discovery.
Affiliation(s)
- Liang Yu
- School of Computer Science and Technology, Xidian University, Xi'an, 710071, People's Republic of China.
- Bingbo Wang
- School of Computer Science and Technology, Xidian University, Xi'an, 710071, People's Republic of China
- Xiaoke Ma
- School of Computer Science and Technology, Xidian University, Xi'an, 710071, People's Republic of China
- Lin Gao
- School of Computer Science and Technology, Xidian University, Xi'an, 710071, People's Republic of China

12
Du T, Liao L, Wu CH. Enhancing interacting residue prediction with integrated contact matrix prediction in protein-protein interaction. EURASIP Journal on Bioinformatics & Systems Biology 2016; 2016:17. [PMID: 27818677 PMCID: PMC5075339 DOI: 10.1186/s13637-016-0051-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/02/2016] [Accepted: 09/25/2016] [Indexed: 11/10/2022]
Abstract
Identifying the residues in a protein that are involved in protein-protein interaction and identifying the contact matrix for a pair of interacting proteins are two computational tasks at different levels of an in-depth analysis of protein-protein interaction. Various methods for solving these two problems have been reported in the literature. However, the interacting residue prediction and contact matrix prediction were handled by and large independently in those existing methods, though intuitively good prediction of interacting residues will help with predicting the contact matrix. In this work, we developed a novel protein interacting residue prediction system, contact matrix-interaction profile hidden Markov model (CM-ipHMM), with the integration of contact matrix prediction and the ipHMM interaction residue prediction. We propose to leverage what is learned from the contact matrix prediction and utilize the predicted contact matrix as "feedback" to enhance the interaction residue prediction. The CM-ipHMM model showed significant improvement over the previous method that uses the ipHMM for predicting interaction residues only. It indicates that the downstream contact matrix prediction could help the interaction site prediction.
Affiliation(s)
- Tianchuan Du
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716 USA
- Li Liao
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716 USA
- Cathy H Wu
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716 USA

13
Chakavorty A, Li L, Alexov E. Electrostatic component of binding energy: Interpreting predictions from Poisson-Boltzmann equation and modeling protocols. J Comput Chem 2016; 37:2495-507. [PMID: 27546093 PMCID: PMC5030180 DOI: 10.1002/jcc.24475] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 05/18/2016] [Revised: 08/03/2016] [Accepted: 08/06/2016] [Indexed: 01/11/2023]
Abstract
Macromolecular interactions are essential for understanding numerous biological processes and are typically characterized by the binding free energy. An important component of the binding free energy is the electrostatics, which is frequently modeled via solutions of the Poisson-Boltzmann equation (PBE). However, numerous works have shown that the electrostatic component (ΔΔGelec) of the binding free energy is very sensitive to the parameters used and the modeling protocol. This prompted some researchers to question the robustness of PBE in predicting ΔΔGelec. We argue that the sensitivity of the absolute ΔΔGelec calculated with PBE using different input parameters and definitions does not indicate a PBE deficiency; rather, this is what should be expected. We show how the apparent sensitivity should be interpreted in terms of the underlying changes in several numerical and physical parameters. We demonstrate that the PBE approach is robust within each considered force field (CHARMM-27, AMBER-94, and OPLS-AA) once the corresponding structures are energy minimized. This observation holds despite using two different molecular surface definitions, indicating again that PBE delivers consistent results within a particular force field. The fact that PBE-delivered ΔΔGelec values may differ if calculated with different modeling protocols is not a deficiency of PBE, but a natural result of the differences in the force field parameters and potential functions used for energy minimization. In addition, while the absolute ΔΔGelec values calculated with different force fields differ, their ordering remains practically the same, allowing for consistent ranking regardless of the force field used.
Affiliation(s)
- Arghya Chakavorty
- Computational Biophysics and Bioinformatics, Department of Physics and Astronomy, Clemson University, Clemson, South Carolina, 29634
- Lin Li
- Computational Biophysics and Bioinformatics, Department of Physics and Astronomy, Clemson University, Clemson, South Carolina, 29634
- Emil Alexov
- Computational Biophysics and Bioinformatics, Department of Physics and Astronomy, Clemson University, Clemson, South Carolina, 29634.

14
Tonddast-Navaei S, Skolnick J. Are protein-protein interfaces special regions on a protein's surface? J Chem Phys 2016; 143:243149. [PMID: 26723634 DOI: 10.1063/1.4937428] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 01/09/2023]
Abstract
Protein-protein interactions (PPIs) are involved in many cellular processes. Experimentally obtained protein quaternary structures provide the location of protein-protein interfaces, the surface region of a given protein that interacts with another. These regions are termed half-interfaces (HIs). Canonical HIs cover roughly one third of a protein's surface and were found to have more hydrophobic residues than the non-interface surface region. In addition, the classical view of protein HIs was that there are a few (if not one) HIs per protein that are structurally and chemically unique. However, on average, a given protein interacts with at least a dozen others. This raises the question of whether they use the same or other HIs. By copying HIs from monomers with the same folds in solved quaternary structures, we introduce the concept of geometric HIs (HIs whose geometry has a significant match to other known interfaces) and show that on average they cover three quarters of a protein's surface. We then demonstrate that in some cases, these geometric HI could result in real physical interactions (which may or may not be biologically relevant). The composition of the new HIs is on average more charged compared to most known ones, suggesting that the current protein interface database is biased towards more hydrophobic, possibly more obligate, complexes. Finally, our results provide evidence for interface fuzziness and PPI promiscuity. Thus, the classical view of unique, well defined HIs needs to be revisited as HIs are another example of coarse-graining that is used by nature.
Affiliation(s)
- Sam Tonddast-Navaei
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street N.W., Atlanta, Georgia 30318, USA
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street N.W., Atlanta, Georgia 30318, USA

15
Guo F, Ding Y, Li SC, Shen C, Wang L. Protein–protein interface prediction based on hexagon structure similarity. Comput Biol Chem 2016; 63:83-88. [DOI: 10.1016/j.compbiolchem.2016.02.008] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 01/19/2016] [Accepted: 02/01/2016] [Indexed: 01/17/2023]

16
Im W, Liang J, Olson A, Zhou HX, Vajda S, Vakser IA. Challenges in structural approaches to cell modeling. J Mol Biol 2016; 428:2943-64. [PMID: 27255863 PMCID: PMC4976022 DOI: 10.1016/j.jmb.2016.05.024] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 02/21/2016] [Revised: 05/19/2016] [Accepted: 05/24/2016] [Indexed: 11/17/2022]
Abstract
Computational modeling is essential for structural characterization of biomolecular mechanisms across the broad spectrum of scales. Adequate understanding of biomolecular mechanisms inherently involves our ability to model them. Structural modeling of individual biomolecules and their interactions has been rapidly progressing. However, in terms of the broader picture, the focus is shifting toward larger systems, up to the level of a cell. Such modeling involves a more dynamic and realistic representation of the interactomes in vivo, in a crowded cellular environment, as well as membranes and membrane proteins, and other cellular components. Structural modeling of a cell complements computational approaches to cellular mechanisms based on differential equations, graph models, and other techniques to model biological networks, imaging data, etc. Structural modeling along with other computational and experimental approaches will provide a fundamental understanding of life at the molecular level and lead to important applications to biology and medicine. A cross section of diverse approaches presented in this review illustrates the developing shift from the structural modeling of individual molecules to that of cell biology. Studies in several related areas are covered: biological networks; automated construction of three-dimensional cell models using experimental data; modeling of protein complexes; prediction of non-specific and transient protein interactions; thermodynamic and kinetic effects of crowding; cellular membrane modeling; and modeling of chromosomes. The review presents an expert opinion on the current state-of-the-art in these various aspects of structural modeling in cellular biology, and the prospects of future developments in this emerging field.
Collapse
Affiliation(s)
- Wonpil Im
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS 66047, United States.
| | - Jie Liang
- Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607, United States.
| | - Arthur Olson
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States.
| | - Huan-Xiang Zhou
- Department of Physics and Institute of Molecular Biophysics, Florida State University, Tallahassee, FL 32306, United States.
| | - Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, United States.
| | - Ilya A Vakser
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS 66047, United States.
| |
|
17
|
Vamparys L, Laurent B, Carbone A, Sacquin-Mora S. Great interactions: How binding incorrect partners can teach us about protein recognition and function. Proteins 2016; 84:1408-21. [PMID: 27287388 PMCID: PMC5516155 DOI: 10.1002/prot.25086] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 01/29/2016] [Revised: 06/01/2016] [Accepted: 06/02/2016] [Indexed: 12/29/2022]
Abstract
Protein–protein interactions play a key part in most biological processes and understanding their mechanism is a fundamental problem leading to numerous practical applications. The prediction of protein binding sites in particular is of paramount importance since proteins now represent a major class of therapeutic targets. Among other methods, docking simulations between two proteins known to interact can be a useful tool for the prediction of likely binding patches on a protein surface. From the analysis of the protein interfaces generated by a massive cross‐docking experiment using the 168 proteins of the Docking Benchmark 2.0, where all possible protein pairs, and not only experimental ones, have been docked together, we show that it is also possible to predict a protein's binding residues without having any prior knowledge regarding its potential interaction partners. Evaluating the performance of cross‐docking predictions using the area under the specificity‐sensitivity ROC curve (AUC) leads to an AUC value of 0.77 for the complete benchmark (compared to the 0.5 AUC value obtained for random predictions). Furthermore, a new clustering analysis performed on the binding patches that are scattered on the protein surface shows that their distribution and growth depend on the protein's functional group. Finally, in several cases, the binding‐site predictions resulting from the cross‐docking simulations lead to the identification of an alternate interface, which corresponds to the interaction with a biomolecular partner that is not included in the original benchmark. Proteins 2016; 84:1408–1421. © 2016 The Authors Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
Affiliation(s)
- Lydie Vamparys
- Laboratoire De Biochimie Théorique, CNRS UPR 9080, Institut De Biologie Physico-Chimique, 13 Rue Pierre Et Marie Curie, Paris, 75005, France
| | - Benoist Laurent
- Laboratoire De Biochimie Théorique, CNRS UPR 9080, Institut De Biologie Physico-Chimique, 13 Rue Pierre Et Marie Curie, Paris, 75005, France
| | - Alessandra Carbone
- Sorbonne Universités, UPMC Univ-Paris 6, CNRS UMR7238, Laboratoire De Biologie Computationnelle Et Quantitative, 15 Rue De L'Ecole De Médecine, Paris, 75006, France; Institut Universitaire De France, Paris, 75005, France
| | - Sophie Sacquin-Mora
- Laboratoire De Biochimie Théorique, CNRS UPR 9080, Institut De Biologie Physico-Chimique, 13 Rue Pierre Et Marie Curie, Paris, 75005, France.
| |
|
18
|
Rigid-Docking Approaches to Explore Protein-Protein Interaction Space. ADVANCES IN BIOCHEMICAL ENGINEERING/BIOTECHNOLOGY 2016; 160:33-55. [PMID: 27830312 DOI: 10.1007/10_2016_41] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/15/2023]
Abstract
Protein-protein interactions play core roles in living cells, especially in the regulatory systems. As information on proteins has rapidly accumulated in publicly available databases, much effort has been made to obtain a better picture of protein-protein interaction networks using protein tertiary structure data. Predicting relevant interacting partners from their tertiary structure is a challenging task and computer science methods have the potential to assist with this. Protein-protein rigid docking has been utilized by several projects. Docking-based approaches have two advantages: they can suggest binding poses of predicted binding partners, which helps in understanding the interaction mechanisms, and comparing docking results of binders and non-binders can lead to understanding the specificity of protein-protein interactions from a structural viewpoint. In this review we focus on explaining current computational methods to predict pairwise direct protein-protein interactions that form protein complexes.
|
19
|
Soner S, Ozbek P, Garzon JI, Ben-Tal N, Haliloglu T. DynaFace: Discrimination between Obligatory and Non-obligatory Protein-Protein Interactions Based on the Complex's Dynamics. PLoS Comput Biol 2015; 11:e1004461. [PMID: 26506003 PMCID: PMC4623975 DOI: 10.1371/journal.pcbi.1004461] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 03/30/2015] [Accepted: 07/08/2015] [Indexed: 12/31/2022] Open
Abstract
Protein-protein interfaces have been evolutionarily-designed to enable transduction between the interacting proteins. Thus, we hypothesize that analysis of the dynamics of the complex can reveal details about the nature of the interaction, and in particular whether it is obligatory, i.e., persists throughout the entire lifetime of the proteins, or not. Indeed, normal mode analysis, using the Gaussian network model, shows that for the most part obligatory and non-obligatory complexes differ in their decomposition into dynamic domains, i.e., the mobile elements of the protein complex. The dynamic domains of obligatory complexes often mix segments from the interacting chains, and the hinges between them do not overlap with the interface between the chains. In contrast, in non-obligatory complexes the interface often hinges between dynamic domains, held together through few anchor residues on one side of the interface that interact with their counterpart grooves in the other end. In automatic analysis, 117 of 139 obligatory (84.2%) and 203 of 246 non-obligatory (82.5%) complexes are correctly classified by our method: DynaFace. We further use DynaFace to predict obligatory and non-obligatory interactions among a set of 300 putative protein complexes. DynaFace is available at: http://safir.prc.boun.edu.tr/dynaface.
Affiliation(s)
- Seren Soner
- Department of Computer Engineering and Polymer Research Center, Bogazici University, Istanbul, Turkey
| | - Pemra Ozbek
- Department of Bioengineering, Marmara University, Istanbul, Turkey
| | - Jose Ignacio Garzon
- Departments of Biochemistry and Molecular Biophysics and Systems Biology and Howard Hughes Medical Institute, Columbia University, New York, New York, United States of America
| | - Nir Ben-Tal
- Department of Biochemistry and Molecular Biology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
| | - Turkan Haliloglu
- Department of Chemical Engineering and Polymer Research Center, Bogazici University, Istanbul, Turkey
|
20
|
Hou Q, Dutilh BE, Huynen MA, Heringa J, Feenstra KA. Sequence specificity between interacting and non-interacting homologs identifies interface residues--a homodimer and monomer use case. BMC Bioinformatics 2015; 16:325. [PMID: 26449222 PMCID: PMC4599308 DOI: 10.1186/s12859-015-0758-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 03/30/2015] [Accepted: 09/30/2015] [Indexed: 11/17/2022] Open
Abstract
Background: Protein families participating in protein-protein interactions may contain sub-families that have different binding characteristics, ranging from tight binding to showing no interaction at all. Composition differences at the sequence level in these sub-families are often decisive to their differential functional interaction. Methods to predict interface sites from protein sequences typically exploit conservation as a signal. Here, instead, we provide proof of concept that the sequence specificity between interacting versus non-interacting groups can be exploited to recognise interaction sites. Results: We collected homodimeric and monomeric proteins and formed homologous groups, each having an interacting (homodimer) subgroup and a non-interacting (monomer) subgroup. We then compiled multiple sequence alignments of the proteins in the homologous groups and identified compositional differences between the homodimeric and monomeric subgroups for each of the alignment positions. Our results show that this specificity signal distinguishes interface and other surface residues with 40.9% recall and up to 25.1% precision. Conclusions: To our best knowledge, this is the first large scale study that exploits sequence specificity between interacting and non-interacting homologs to predict interaction sites from sequence information only. The performance obtained indicates that this signal contains valuable information to identify protein-protein interaction sites.
Affiliation(s)
- Qingzhen Hou
- Center for Integrative Bioinformatics VU (IBIVU), Vrije University Amsterdam, De Boelelaan 1081A, 1081 HV, Amsterdam, The Netherlands.
| | - Bas E Dutilh
- Theoretical Biology and Bioinformatics, Utrecht University, Padualaan 8, 3584 CH, Utrecht, The Netherlands; Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Geert Grooteplein 28, 6525 GA, Nijmegen, The Netherlands; Department of Marine Biology, Institute of Biology, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil.
| | - Martijn A Huynen
- Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Geert Grooteplein 28, 6525 GA, Nijmegen, The Netherlands.
| | - Jaap Heringa
- Center for Integrative Bioinformatics VU (IBIVU), Vrije University Amsterdam, De Boelelaan 1081A, 1081 HV, Amsterdam, The Netherlands.
| | - K Anton Feenstra
- Center for Integrative Bioinformatics VU (IBIVU), Vrije University Amsterdam, De Boelelaan 1081A, 1081 HV, Amsterdam, The Netherlands.
| |
|
21
|
Guo F, Li SC, Wei Z, Zhu D, Shen C, Wang L. Structural neighboring property for identifying protein-protein binding sites. BMC SYSTEMS BIOLOGY 2015; 9 Suppl 5:S3. [PMID: 26356630 PMCID: PMC4565107 DOI: 10.1186/1752-0509-9-s5-s3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 02/04/2023]
Abstract
Background: Protein-protein interactions play a key role in the control of many biological functions, and their study supports applications such as drug design and functional analysis. Determination of binding sites is widely applied in molecular biology research. Therefore, many efficient methods have been developed for identifying binding sites. In this paper, we calculate the structural neighboring property through the Voronoi diagram. Using 6,438 complexes, we study local biases of the structural neighboring property at the interface. Results: We propose a novel statistical method to extract interacting residues, and interacting patches can be clustered as predicted interface residues. In addition, the structural neighboring property can be adopted to construct a new energy function for evaluating docking solutions. It includes a new statistical property as well as existing energy items. Compared to existing methods, our approach improves the overall Fnat value by at least 3%. On Benchmark v4.0, our method has an average Irmsd value of 3.31 Å and an overall Fnat value of 63%, which improves upon Irmsd of 3.89 Å and Fnat of 49% for ZRANK, and Irmsd of 3.99 Å and Fnat of 46% for ClusPro. On the CAPRI targets, our method has an average Irmsd value of 3.46 Å and an overall Fnat value of 45%, which improves upon Irmsd of 4.18 Å and Fnat of 40% for ZRANK, and Irmsd of 5.12 Å and Fnat of 32% for ClusPro. Conclusions: Experiments show that our method achieves better results than some state-of-the-art methods for identifying protein-protein binding sites, with the prediction quality improved in terms of CAPRI evaluation criteria.
|
22
|
Identification of Protein–Protein Interactions by Detecting Correlated Mutation at the Interface. J Chem Inf Model 2015; 55:2042-9. [DOI: 10.1021/acs.jcim.5b00320] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Indexed: 02/01/2023]
|
23
|
Vakser IA. Protein-protein docking: from interaction to interactome. Biophys J 2015; 107:1785-1793. [PMID: 25418159 DOI: 10.1016/j.bpj.2014.08.033] [Citation(s) in RCA: 184] [Impact Index Per Article: 20.4] [Received: 07/14/2014] [Revised: 08/17/2014] [Accepted: 08/27/2014] [Indexed: 12/29/2022] Open
Abstract
The protein-protein docking problem is one of the focal points of activity in computational biophysics and structural biology. The three-dimensional structure of a protein-protein complex, generally, is more difficult to determine experimentally than the structure of an individual protein. Adequate computational techniques to model protein interactions are important because of the growing number of known protein structures, particularly in the context of structural genomics. Docking offers tools for fundamental studies of protein interactions and provides a structural basis for drug design. Protein-protein docking is the prediction of the structure of the complex, given the structures of the individual proteins. At the heart of the docking methodology is the notion of steric and physicochemical complementarity at the protein-protein interface. Originally, mostly high-resolution, experimentally determined (primarily by x-ray crystallography) protein structures were considered for docking. However, more recently, the focus has been shifting toward lower-resolution modeled structures. Docking approaches have to deal with the conformational changes between unbound and bound structures, as well as the inaccuracies of the interacting modeled structures, often in a high-throughput mode needed for modeling of large networks of protein interactions. The growing number of docking developers is engaged in the community-wide assessments of predictive methodologies. The development of more powerful and adequate docking approaches is facilitated by rapidly expanding information and data resources, growing computational capabilities, and a deeper understanding of the fundamental principles of protein interactions.
Affiliation(s)
- Ilya A Vakser
- Center for Bioinformatics and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas.
| |
|
24
|
Krull F, Korff G, Elghobashi-Meinhardt N, Knapp EW. ProPairs: A Data Set for Protein–Protein Docking. J Chem Inf Model 2015; 55:1495-507. [DOI: 10.1021/acs.jcim.5b00082] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 12/17/2022]
Affiliation(s)
- Florian Krull
- Institute of Chemistry and Biochemistry, Freie Universität Berlin, Fabeckstrasse 36a, 14195 Berlin, Germany
| | - Gerrit Korff
- Institute of Chemistry and Biochemistry, Freie Universität Berlin, Fabeckstrasse 36a, 14195 Berlin, Germany
| | - Nadia Elghobashi-Meinhardt
- Institute of Chemistry and Biochemistry, Freie Universität Berlin, Fabeckstrasse 36a, 14195 Berlin, Germany
| | - Ernst-Walter Knapp
- Institute of Chemistry and Biochemistry, Freie Universität Berlin, Fabeckstrasse 36a, 14195 Berlin, Germany
| |
|
25
|
Menche J, Sharma A, Kitsak M, Ghiassian SD, Vidal M, Loscalzo J, Barabási AL. Disease networks. Uncovering disease-disease relationships through the incomplete interactome. Science 2015; 347:1257601. [PMID: 25700523 PMCID: PMC4435741 DOI: 10.1126/science.1257601] [Citation(s) in RCA: 875] [Impact Index Per Article: 97.2] [Indexed: 01/02/2023]
Abstract
According to the disease module hypothesis, the cellular components associated with a disease segregate in the same neighborhood of the human interactome, the map of biologically relevant molecular interactions. Yet, given the incompleteness of the interactome and the limited knowledge of disease-associated genes, it is not obvious if the available data have sufficient coverage to map out modules associated with each disease. Here we derive mathematical conditions for the identifiability of disease modules and show that the network-based location of each disease module determines its pathobiological relationship to other diseases. For example, diseases with overlapping network modules show significant coexpression patterns, symptom similarity, and comorbidity, whereas diseases residing in separated network neighborhoods are phenotypically distinct. These tools represent an interactome-based platform to predict molecular commonalities between phenotypically related diseases, even if they do not share primary disease genes.
Affiliation(s)
- Jörg Menche
- Center for Complex Networks Research and Department of Physics, Northeastern University, 110 Forsyth Street, 111 Dana Research Center, Boston, MA 02115, USA. Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA 02215, USA. Center for Network Science, Central European University, Nador u. 9, 1051 Budapest, Hungary
| | - Amitabh Sharma
- Center for Complex Networks Research and Department of Physics, Northeastern University, 110 Forsyth Street, 111 Dana Research Center, Boston, MA 02115, USA. Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA 02215, USA
| | - Maksim Kitsak
- Center for Complex Networks Research and Department of Physics, Northeastern University, 110 Forsyth Street, 111 Dana Research Center, Boston, MA 02115, USA. Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA 02215, USA
| | - Susan Dina Ghiassian
- Center for Complex Networks Research and Department of Physics, Northeastern University, 110 Forsyth Street, 111 Dana Research Center, Boston, MA 02115, USA. Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA 02215, USA
| | - Marc Vidal
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA 02215, USA. Department of Genetics, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
| | - Joseph Loscalzo
- Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, 75 Francis Street, Boston, MA 02115, USA
| | - Albert-László Barabási
- Center for Complex Networks Research and Department of Physics, Northeastern University, 110 Forsyth Street, 111 Dana Research Center, Boston, MA 02115, USA. Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA 02215, USA. Center for Network Science, Central European University, Nador u. 9, 1051 Budapest, Hungary. Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, 75 Francis Street, Boston, MA 02115, USA.
| |
|
26
|
Aumentado-Armstrong TT, Istrate B, Murgita RA. Algorithmic approaches to protein-protein interaction site prediction. Algorithms Mol Biol 2015; 10:7. [PMID: 25713596 PMCID: PMC4338852 DOI: 10.1186/s13015-015-0033-9] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Received: 08/23/2014] [Accepted: 01/07/2015] [Indexed: 12/19/2022] Open
Abstract
Interaction sites on protein surfaces mediate virtually all biological activities, and their identification holds promise for disease treatment and drug design. Novel algorithmic approaches for the prediction of these sites have been produced at a rapid rate, and the field has seen significant advancement over the past decade. However, the most current methods have not yet been reviewed in a systematic and comprehensive fashion. Herein, we describe the intricacies of the biological theory, datasets, and features required for modern protein-protein interaction site (PPIS) prediction, and present an integrative analysis of the state-of-the-art algorithms and their performance. First, the major sources of data used by predictors are reviewed, including training sets, evaluation sets, and methods for their procurement. Then, the features employed and their importance in the biological characterization of PPISs are explored. This is followed by a discussion of the methodologies adopted in contemporary prediction programs, as well as their relative performance on the datasets most recently used for evaluation. In addition, the potential utility that PPIS identification holds for rational drug design, hotspot prediction, and computational molecular docking is described. Finally, an analysis of the most promising areas for future development of the field is presented.
|
27
|
Matsuzaki Y, Ohue M, Uchikoga N, Akiyama Y. Protein-protein interaction network prediction by using rigid-body docking tools: application to bacterial chemotaxis. Protein Pept Lett 2015; 21:790-8. [PMID: 23855669 PMCID: PMC4440392 DOI: 10.2174/09298665113209990066] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 08/23/2012] [Revised: 02/27/2013] [Accepted: 03/03/2013] [Indexed: 11/22/2022]
Abstract
Core elements of cell regulation are made up of protein-protein interaction (PPI) networks. However, many parts of the cell regulatory systems include unknown PPIs. To approach this problem, we have developed a computational method of high-throughput PPI network prediction based on all-to-all rigid-body docking of protein tertiary structures. The prediction system accepts a set of data comprising protein tertiary structures as input and generates a list of possible interacting pairs from all the combinations as output. A crucial advantage of this docking-based method is that it provides predictions of protein pairs that increase our understanding of biological pathways through analysis of the candidate complex structures, which gives insight into novel interaction mechanisms. Although such exhaustive docking calculation requires massive computational resources, recent advancements in the computational sciences have made such large-scale calculations feasible. […] different rigid-body docking tools with different scoring models. We found that the predicted interactions differed between the results from the two tools. When the positive predictions from both of the docking tools were combined, all the core signaling interactions were correctly predicted with the exception of interactions activated by protein phosphorylation. Large-scale PPI prediction using tertiary structures is an effective approach that has a wide range of potential applications. This method is especially useful for identifying novel PPIs of new pathways that control cellular behavior.
Affiliation(s)
| | | | | | - Yutaka Akiyama
- Graduate School of Information Science and Engineering, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo 152-8550, Japan.
| |
|
28
|
Abstract
Regulated interactions between proteins govern signaling pathways within and between cells. Structural studies on protein complexes formed reversibly and/or transiently illustrate the remarkable diversity of interactions, both in terms of interfacial size and nature. In recent years, "domain-peptide" interactions have gained much greater recognition and may be viewed as both pre-translational and posttranslational-dependent functional switches. Our understanding of the multistep regulation of auto-inhibited multidomain proteins has also grown. Their activity may be understood as the "combinatorial" output of multiple input signals, including phosphorylation, location, and mechanical force. The prospects for bridging the gap between the new "systems biology" data and the traditional "reductionist" data are also discussed.
Affiliation(s)
- Robert C Liddington
- Sanford-Burnham Medical Research Institute, 10901 North Torrey Pines Road, La Jolla, CA, 92037, USA
| |
|
29
|
Advances in Human Biology: Combining Genetics and Molecular Biophysics to Pave the Way for Personalized Diagnostics and Medicine. ACTA ACUST UNITED AC 2014. [DOI: 10.1155/2014/471836] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 12/17/2022]
Abstract
Advances in several biology-oriented initiatives such as genome sequencing and structural genomics, along with the progress made through traditional biological and biochemical research, have opened up a unique opportunity to better understand the molecular effects of human diseases. Human DNA can vary significantly from person to person and determines an individual’s physical characteristics and their susceptibility to diseases. Armed with an individual’s DNA sequence, researchers and physicians can check for defects known to be associated with certain diseases by utilizing various databases. However, for unclassified DNA mutations, or in order to reveal the molecular mechanism behind the effects, the mutations have to be mapped onto the corresponding networks and macromolecular structures and then analyzed to reveal their effect on the wild-type properties of the biological processes involved. Predicting the effect of DNA mutations on an individual’s health is typically referred to as personalized or companion diagnostics. Furthermore, once the molecular mechanism of the mutations is revealed, the patient should be given the drugs that are most appropriate for the individual genome, an approach referred to as pharmacogenomics. Altogether, the shift in focus in medicine towards more genomic-oriented practices is the foundation of personalized medicine. The progress made in these rapidly developing fields is outlined.
|
30
|
Andreani J, Guerois R. Evolution of protein interactions: From interactomes to interfaces. Arch Biochem Biophys 2014; 554:65-75. [DOI: 10.1016/j.abb.2014.05.010] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 02/10/2014] [Revised: 04/28/2014] [Accepted: 05/12/2014] [Indexed: 12/16/2022]
|
31
|
Ochoa D, Pazos F. Practical aspects of protein co-evolution. Front Cell Dev Biol 2014; 2:14. [PMID: 25364721 PMCID: PMC4207036 DOI: 10.3389/fcell.2014.00014] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 02/21/2014] [Accepted: 04/02/2014] [Indexed: 11/15/2022] Open
Abstract
Co-evolution is a fundamental aspect of Evolutionary Theory. At the molecular level, co-evolutionary linkages between protein families have long been used as indicators of protein interactions and functional relationships. Due to the complexity of the problem and the amount of genomic data required for these approaches to achieve good performance, it took a relatively long time from the appearance of the first ideas and concepts to the everyday application of these approaches and their incorporation into the standard toolboxes of bioinformaticians and molecular biologists. Today, these methodologies are mature (both in terms of performance and usability/implementation), and the genomic information that feeds them is large enough to allow their general application. This review tries to summarize the current landscape of co-evolution-based methodologies, with a strong emphasis on describing interesting cases where their application to important biological systems, alone or in combination with other computational and experimental approaches, has yielded new insight into those systems.
Affiliation(s)
- David Ochoa
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI) Hinxton, UK
| | - Florencio Pazos
- Computational Systems Biology Group, National Centre for Biotechnology (CNB-CSIC) Madrid, Spain
| |
|
32
|
Yugandhar K, Gromiha MM. Feature selection and classification of protein-protein complexes based on their binding affinities using machine learning approaches. Proteins 2014; 82:2088-96. [PMID: 24648146 DOI: 10.1002/prot.24564] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 01/13/2014] [Accepted: 03/14/2014] [Indexed: 12/16/2022]
Abstract
Protein-protein interactions are intrinsic to virtually every cellular process. Predicting the binding affinity of protein-protein complexes is one of the challenging problems in computational and molecular biology. In this work, we related sequence features of protein-protein complexes with their binding affinities using machine learning approaches. We set up a database of 185 protein-protein complexes for which the interacting pairs are heterodimers and their experimental binding affinities are available. On the other hand, we have developed a set of 610 features from the sequences of protein complexes and utilized Ranker search method, which is the combination of Attribute evaluator and Ranker method for selecting specific features. We have analyzed several machine learning algorithms to discriminate protein-protein complexes into high and low affinity groups based on their Kd values. Our results showed a 10-fold cross-validation accuracy of 76.1% with the combination of nine features using support vector machines. Further, we observed accuracy of 83.3% on an independent test set of 30 complexes. We suggest that our method would serve as an effective tool for identifying the interacting partners in protein-protein interaction networks and human-pathogen interactions based on the strength of interactions.
Affiliation(s)
- K Yugandhar
- Department of Biotechnology, Indian Institute of Technology Madras, Chennai, 600036, Tamil Nadu, India
|
33
|
Saccà C, Teso S, Diligenti M, Passerini A. Improved multi-level protein-protein interaction prediction with semantic-based regularization. BMC Bioinformatics 2014; 15:103. [PMID: 24725682 PMCID: PMC4004462 DOI: 10.1186/1471-2105-15-103] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 08/08/2013] [Accepted: 03/03/2014] [Indexed: 11/24/2022] [Open Access]
Abstract
Background: Protein–protein interactions can be seen as a hierarchical process occurring at three related levels: proteins bind by means of specific domains, which in turn form interfaces through patches of residues. Detailed knowledge about which domains and residues are involved in a given interaction has extensive applications to biology, including better understanding of the binding process and more efficient drug/enzyme design. Alas, most current interaction prediction methods do not identify which parts of a protein actually instantiate an interaction. Furthermore, they also fail to leverage the hierarchical nature of the problem, ignoring otherwise useful information available at the lower levels; when they do, they do not generate predictions that are guaranteed to be consistent between levels.
Results: Inspired by earlier ideas of Yip et al. (BMC Bioinformatics 10:241, 2009), in the present paper we view the problem as a multi-level learning task, with one task per level (proteins, domains and residues), and propose a machine learning method that collectively infers the binding state of all object pairs. Our method is based on Semantic Based Regularization (SBR), a flexible and theoretically sound machine learning framework that uses First Order Logic constraints to tie the learning tasks together. We introduce a set of biologically motivated rules that enforce consistent predictions between the hierarchy levels.
Conclusions: We study the empirical performance of our method using a standard validation procedure, and compare its performance against the only other existing multi-level prediction technique. We present results showing that our method substantially outperforms the competitor in several experimental settings, indicating that exploiting the hierarchical nature of the problem can lead to better predictions. In addition, our method is also guaranteed to produce interactions that are consistent with respect to the protein–domain–residue hierarchy.
Affiliation(s)
- Andrea Passerini
- Dipartimento di Ingegneria e Scienza dell'Informazione, University of Trento, Trento, Italy.
|
34
|
Schwede T. Protein modeling: what happened to the "protein structure gap"? Structure 2014; 21:1531-40. [PMID: 24010712 DOI: 10.1016/j.str.2013.08.007] [Citation(s) in RCA: 83] [Impact Index Per Article: 8.3] [Received: 07/25/2013] [Revised: 08/12/2013] [Accepted: 08/12/2013] [Indexed: 11/27/2022]
Abstract
Computational modeling of three-dimensional macromolecular structures and complexes from their sequence has been a long-standing vision in structural biology. Over the last 2 decades, a paradigm shift has occurred: starting from a large "structure knowledge gap" between the huge number of protein sequences and small number of known structures, today, some form of structural information, either experimental or template-based models, is available for the majority of amino acids encoded by common model organism genomes. With the scientific focus of interest moving toward larger macromolecular complexes and dynamic networks of interactions, the integration of computational modeling methods with low-resolution experimental techniques allows the study of large and complex molecular machines. One of the open challenges for computational modeling and prediction techniques is to convey the underlying assumptions, as well as the expected accuracy and structural variability of a specific model, which is crucial to understanding its limitations.
Affiliation(s)
- Torsten Schwede
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, 4056 Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Klingelbergstrasse 50-70, 4056 Basel, Switzerland.
|
35
|
de Moraes FR, Neshich IAP, Mazoni I, Yano IH, Pereira JGC, Salim JA, Jardine JG, Neshich G. Improving predictions of protein-protein interfaces by combining amino acid-specific classifiers based on structural and physicochemical descriptors with their weighted neighbor averages. PLoS One 2014; 9:e87107. [PMID: 24489849 PMCID: PMC3904977 DOI: 10.1371/journal.pone.0087107] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 08/16/2012] [Accepted: 12/22/2013] [Indexed: 11/18/2022] [Open Access]
Abstract
Protein-protein interactions are involved in nearly all regulatory processes in the cell and are considered one of the most important issues in molecular biology and pharmaceutical sciences but are still not fully understood. Structural and computational biology contributed greatly to the elucidation of the mechanism of protein interactions. In this paper, we present a collection of the physicochemical and structural characteristics that distinguish interface-forming residues (IFR) from free surface residues (FSR). We formulated a linear discriminative analysis (LDA) classifier to assess whether chosen descriptors from the BlueStar STING database (http://www.cbi.cnptia.embrapa.br/SMS/) are suitable for such a task. Receiver operating characteristic (ROC) analysis indicates that the particular physicochemical and structural descriptors used for building the linear classifier perform much better than a random classifier and in fact, successfully outperform some of the previously published procedures, whose performance indicators were recently compared by other research groups. The results presented here show that the selected set of descriptors can be utilized to predict IFRs, even when homologue proteins are missing (particularly important for orphan proteins where no homologue is available for comparative analysis/indication) or, when certain conformational changes accompany interface formation. The development of amino acid type specific classifiers is shown to increase IFR classification performance. Also, we found that the addition of an amino acid conservation attribute did not improve the classification prediction. This result indicates that the increase in predictive power associated with amino acid conservation is exhausted by adequate use of an extensive list of independent physicochemical and structural parameters that, by themselves, fully describe the nano-environment at protein-protein interfaces. 
The IFR classifier developed in this study is now integrated into the BlueStar STING suite of programs. Consequently, the prediction of protein-protein interfaces for all proteins available in the PDB is possible through STING_interfaces module, accessible at the following website: (http://www.cbi.cnptia.embrapa.br/SMS/predictions/index.html).
Affiliation(s)
- Fábio R. de Moraes
- Biology Institute, University of Campinas, Campinas, São Paulo, Brazil
- Brazilian Agricultural Research Corporation (EMBRAPA), National Center for Agricultural Informatics, Campinas, São Paulo, Brazil
- Izabella A. P. Neshich
- Biology Institute, University of Campinas, Campinas, São Paulo, Brazil
- Brazilian Agricultural Research Corporation (EMBRAPA), National Center for Agricultural Informatics, Campinas, São Paulo, Brazil
- Ivan Mazoni
- Biology Institute, University of Campinas, Campinas, São Paulo, Brazil
- Brazilian Agricultural Research Corporation (EMBRAPA), National Center for Agricultural Informatics, Campinas, São Paulo, Brazil
- Inácio H. Yano
- Brazilian Agricultural Research Corporation (EMBRAPA), National Center for Agricultural Informatics, Campinas, São Paulo, Brazil
- José G. C. Pereira
- Biology Institute, University of Campinas, Campinas, São Paulo, Brazil
- Brazilian Agricultural Research Corporation (EMBRAPA), National Center for Agricultural Informatics, Campinas, São Paulo, Brazil
- José A. Salim
- School of Electrical and Computer Engineering, University of Campinas, Campinas, São Paulo, Brazil
- José G. Jardine
- Brazilian Agricultural Research Corporation (EMBRAPA), National Center for Agricultural Informatics, Campinas, São Paulo, Brazil
- Goran Neshich
- Brazilian Agricultural Research Corporation (EMBRAPA), National Center for Agricultural Informatics, Campinas, São Paulo, Brazil
|
36
|
Ohue M, Matsuzaki Y, Shimoda T, Ishida T, Akiyama Y. Highly precise protein-protein interaction prediction based on consensus between template-based and de novo docking methods. BMC Proc 2013; 7:S6. [PMID: 24564962 PMCID: PMC4044902 DOI: 10.1186/1753-6561-7-s7-s6] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 12/15/2022] [Open Access]
Abstract
Background: Elucidation of protein-protein interaction (PPI) networks is important for understanding disease mechanisms and for drug discovery. Tertiary-structure-based in silico PPI prediction methods have been developed with two typical approaches: a method based on template matching with known protein structures and a method based on de novo protein docking. However, the template-based method has a narrow applicable range because of its use of template information, and the de novo docking based method does not have good prediction performance. In addition, both of these in silico prediction methods have insufficient precision, and require validation of the predicted PPIs by biological experiments, leading to considerable expenditure; therefore, PPI prediction methods with greater precision are needed.
Results: We have proposed a new structure-based PPI prediction method by combining template-based prediction and de novo docking prediction. When we applied the method to the human apoptosis signaling pathway, we obtained a precision value of 0.333, which is higher than that achieved using conventional methods (0.231 for PRISM, a template-based method, and 0.145 for MEGADOCK, a non-template-based method), while maintaining an F-measure value (0.285) comparable to that obtained using conventional methods (0.296 for PRISM, and 0.220 for MEGADOCK).
Conclusions: Our consensus method successfully predicted a PPI network with greater precision than conventional template/non-template methods, which may thus reduce the cost of validation by laboratory experiments for confirming novel PPIs from predicted PPIs. Therefore, our method may serve as an aid for promoting interactome analysis.
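To make the metrics reported in the abstract above concrete, here is a minimal Python sketch of a consensus prediction (the intersection of two predicted pair sets) scored by precision, recall and F-measure. The protein pairs, the set names and the helper functions are hypothetical illustrations, not data or code from the paper, and a plain set intersection is only an assumed stand-in for however the authors actually combine the two methods.

```python
def precision_recall_f(predicted, known):
    """Return (precision, recall, F-measure) for a set of predicted pairs."""
    true_positives = len(predicted & known)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(known) if known else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


def pair(a, b):
    """Unordered protein pair, so that (A, B) and (B, A) compare equal."""
    return frozenset((a, b))


# Hypothetical toy data: known interactions and two independent predictions.
known = {pair("A", "B"), pair("A", "C"), pair("B", "D"), pair("C", "D")}
template_based = {pair("A", "B"), pair("A", "C"), pair("A", "D"), pair("B", "C")}
docking_based = {pair("A", "B"), pair("B", "D"), pair("A", "D"),
                 pair("C", "E"), pair("A", "C")}

# Assumed consensus rule: keep only pairs predicted by both methods.
consensus = template_based & docking_based

for name, predicted in [("template", template_based),
                        ("docking", docking_based),
                        ("consensus", consensus)]:
    p, r, f = precision_recall_f(predicted, known)
    print(f"{name}: precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

In this toy case the consensus set is smaller than either input set, which is why precision can rise while recall, and therefore the F-measure, need not improve.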
|
37
|
Dong GQ, Fan H, Schneidman-Duhovny D, Webb B, Sali A. Optimized atomic statistical potentials: assessment of protein interfaces and loops. Bioinformatics 2013; 29:3158-66. [PMID: 24078704 PMCID: PMC3842762 DOI: 10.1093/bioinformatics/btt560] [Citation(s) in RCA: 98] [Impact Index Per Article: 8.9] [Received: 06/19/2013] [Revised: 08/13/2013] [Accepted: 09/22/2013] [Indexed: 01/16/2023] [Open Access]
Abstract
Motivation: Statistical potentials have been widely used for modeling whole proteins and their parts (e.g. sidechains and loops) as well as interactions between proteins, nucleic acids and small molecules. Here, we formulate the statistical potentials entirely within a statistical framework, avoiding questionable statistical mechanical assumptions and approximations, including a definition of the reference state.
Results: We derive a general Bayesian framework for inferring statistically optimized atomic potentials (SOAP) in which the reference state is replaced with data-driven 'recovery' functions. Moreover, we restrain the relative orientation between two covalent bonds instead of a simple distance between two atoms, in an effort to capture orientation-dependent interactions such as hydrogen bonds. To demonstrate this general approach, we computed statistical potentials for protein-protein docking (SOAP-PP) and loop modeling (SOAP-Loop). For docking, a near-native model is within the top 10 scoring models in 40% of the PatchDock benchmark cases, compared with 23 and 27% for the state-of-the-art ZDOCK and FireDock scoring functions, respectively. Similarly, for modeling 12-residue loops in the PLOP benchmark, the average main-chain root mean square deviation of the best scored conformations by SOAP-Loop is 1.5 Å, close to the average root mean square deviation of the best sampled conformations (1.2 Å) and significantly better than that selected by Rosetta (2.1 Å), DFIRE (2.3 Å), DOPE (2.5 Å) and PLOP scoring functions (3.0 Å). Our Bayesian framework may also result in more accurate statistical potentials for additional modeling applications, thus affording better leverage of the experimentally determined protein structures.
Availability and implementation: SOAP-PP and SOAP-Loop are available as part of MODELLER (http://salilab.org/modeller).
Affiliation(s)
- Guang Qiang Dong
- Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry and California Institute for Quantitative Biosciences (QB3), University of California, San Francisco, CA 94158, USA
|
38
|
Minhas FUAA, Geiss BJ, Ben-Hur A. PAIRpred: partner-specific prediction of interacting residues from sequence and structure. Proteins 2013; 82:1142-55. [PMID: 24243399 DOI: 10.1002/prot.24479] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.6] [Received: 08/05/2013] [Revised: 11/04/2013] [Accepted: 11/09/2013] [Indexed: 11/10/2022]
Abstract
We present a novel partner-specific protein-protein interaction site prediction method called PAIRpred. Unlike most existing machine learning binding site prediction methods, PAIRpred uses information from both proteins in a protein complex to predict pairs of interacting residues from the two proteins. PAIRpred captures sequence and structure information about residue pairs through pairwise kernels that are used for training a support vector machine classifier. As a result, PAIRpred presents a more detailed model of protein binding, and offers state of the art accuracy in predicting binding sites at the protein level as well as inter-protein residue contacts at the complex level. We demonstrate PAIRpred's performance on Docking Benchmark 4.0 and recent CAPRI targets. We present a detailed performance analysis outlining the contribution of different sequence and structure features, together with a comparison to a variety of existing interface prediction techniques. We have also studied the impact of binding-associated conformational change on prediction accuracy and found PAIRpred to be more robust to such structural changes than existing schemes. As an illustration of the potential applications of PAIRpred, we provide a case study in which PAIRpred is used to analyze the nature and specificity of the interface in the interaction of human ISG15 protein with NS1 protein from influenza A virus. Python code for PAIRpred is available at http://combi.cs.colostate.edu/supplements/pairpred/.
|
39
|
Mosca R, Pons T, Céol A, Valencia A, Aloy P. Towards a detailed atlas of protein–protein interactions. Curr Opin Struct Biol 2013; 23:929-40. [DOI: 10.1016/j.sbi.2013.07.005] [Citation(s) in RCA: 87] [Impact Index Per Article: 7.9] [Received: 06/27/2013] [Revised: 07/04/2013] [Accepted: 07/08/2013] [Indexed: 12/30/2022]
|
40
|
Kundrotas PJ, Vakser IA. Global and local structural similarity in protein-protein complexes: implications for template-based docking. Proteins 2013; 81:2137-42. [PMID: 23946125 DOI: 10.1002/prot.24392] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Received: 06/15/2013] [Revised: 07/23/2013] [Accepted: 08/02/2013] [Indexed: 02/02/2023]
Abstract
The increasing amount of structural information on protein-protein interactions makes it possible to predict the structure of protein-protein complexes by comparison/alignment of the interacting proteins to the ones in cocrystallized complexes. In the predictions based on structure similarity, the template search is performed by structural alignment of the target interactors with the entire structures or with the interface only of the subunits in cocrystallized complexes. This study investigates the scope of the structural similarity that facilitates the detection of a broad range of templates significantly divergent from the targets. The analysis of the target-template similarity is based on models of protein-protein complexes in a large representative set of heterodimers. The similarity of the biological and crystal packing interfaces, dissimilar interface structural motifs in overall similar structures, interface similarity to the full structure, and local similarity away from the interface were analyzed. The structural similarity at the protein-protein interfaces only was observed in ~25% of target-template pairs with sequence identity <20% and primarily homodimeric templates. For ~50% of the target-template pairs, the similarity at the interface was accompanied by the similarity of the whole structure. However, the structural similarity at the interfaces was still stronger than that of the noninterface parts. The study provides insights into structural and functional diversity of protein-protein complexes, and relative performance of the interface and full structure alignment in docking.
|
41
|
Topham CM, Rouquier M, Tarrat N, André I. Adaptive Smith-Waterman residue match seeding for protein structural alignment. Proteins 2013; 81:1823-39. [DOI: 10.1002/prot.24327] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/23/2013] [Revised: 04/22/2013] [Accepted: 05/15/2013] [Indexed: 12/30/2022]
Affiliation(s)
- Christopher M. Topham
- Université de Toulouse, INSA, UPS, INP, LISBP; 135 Avenue de Rangueil F-31077 Toulouse France
- CNRS, UMR5504; F-31400 Toulouse France
- INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés; F-31400 Toulouse France
- Mickaël Rouquier
- Université de Toulouse, INSA, UPS, INP, LISBP; 135 Avenue de Rangueil F-31077 Toulouse France
- CNRS, UMR5504; F-31400 Toulouse France
- INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés; F-31400 Toulouse France
- Nathalie Tarrat
- Université de Toulouse, INSA, UPS, INP, LISBP; 135 Avenue de Rangueil F-31077 Toulouse France
- CNRS, UMR5504; F-31400 Toulouse France
- INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés; F-31400 Toulouse France
- Isabelle André
- Université de Toulouse, INSA, UPS, INP, LISBP; 135 Avenue de Rangueil F-31077 Toulouse France
- CNRS, UMR5504; F-31400 Toulouse France
- INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés; F-31400 Toulouse France
|
42
|
Kundrotas PJ, Vakser IA. Protein-protein alternative binding modes do not overlap. Protein Sci 2013; 22:1141-5. [PMID: 23775945 DOI: 10.1002/pro.2295] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 02/23/2013] [Revised: 06/01/2013] [Accepted: 06/03/2013] [Indexed: 11/09/2022]
Abstract
Proteins often bind other proteins in more than one way. Thus, alternative binding modes are an essential feature of protein interactions. Such binding modes may be detected by X-ray crystallography and thus reflected in the Protein Data Bank. The alternative binding is often observed not for the protein itself but for its structural homolog. The results of this study, based on the analysis of a comprehensive set of co-crystallized protein-protein complexes, show that the alternative binding modes generally do not overlap, but are spatially separated. This effect is based on molecular recognition characteristics of the protein structures. The results are also in excellent agreement with the intermolecular energy funnel size estimates obtained previously by an independent methodology. The results provide an important insight into the principles of protein association, as well as potential guidelines for modeling of protein complexes and the design of protein interfaces.
Affiliation(s)
- Petras J Kundrotas
- Center for Bioinformatics and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66047, USA
|
43
|
Bilgin T, Kurnaz IA, Wagner A. Selection shapes the robustness of ligand-binding amino acids. J Mol Evol 2013; 76:343-9. [PMID: 23689513 DOI: 10.1007/s00239-013-9564-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 11/20/2012] [Accepted: 05/02/2013] [Indexed: 11/26/2022]
Abstract
The phenotypes of biological systems are to some extent robust to genotypic changes. Such robustness exists on multiple levels of biological organization. We analyzed this robustness for two categories of amino acids in proteins. Specifically, we studied the codons of amino acids that bind or do not bind small molecular ligands. We asked to what extent codon changes caused by mutation or mistranslation may affect physicochemical amino acid properties or protein folding. We found that the codons of ligand-binding amino acids are on average more robust than those of non-binding amino acids. Because mistranslation is usually more frequent than mutation, we speculate that selection for error mitigation at the translational level stands behind this phenomenon. Our observations suggest that natural selection can affect the robustness of very small units of biological organization.
Affiliation(s)
- Tugce Bilgin
- Institute of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland.
|
44
|
Functional site plasticity in domain superfamilies. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2013; 1834:874-89. [PMID: 23499848 PMCID: PMC3787744 DOI: 10.1016/j.bbapap.2013.02.042] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 12/04/2012] [Revised: 02/20/2013] [Accepted: 02/28/2013] [Indexed: 11/21/2022]
Abstract
We present, to our knowledge, the first quantitative analysis of functional site diversity in homologous domain superfamilies. Different types of functional sites are considered separately. Our results show that most diverse superfamilies are very plastic in terms of the spatial location of their functional sites. This is especially true for protein–protein interfaces. In contrast, we confirm that catalytic sites typically occupy only a very small number of topological locations. Small-ligand binding sites are more diverse than expected, although in a more limited manner than protein–protein interfaces. In spite of the observed diversity, our results also confirm the previously reported preferential location of functional sites. We identify a subset of homologous domain superfamilies where diversity is particularly extreme, and discuss possible reasons for such plasticity, i.e. structural diversity. Our results do not contradict previous reports of preferential co-location of sites among homologues, but rather point at the importance of not ignoring other sites, especially in large and diverse superfamilies. Data on sites exploited by different relatives, within each well annotated domain superfamily, has been made accessible from the CATH website in order to highlight versatile superfamilies or superfamilies with highly preferential sites. This information is valuable for system biology and knowledge of any constraints on protein interactions could help in understanding the dynamic control of networks in which these proteins participate. 
The novelty of our work lies in the comprehensive nature of the analysis – we have used a significantly larger dataset than previous studies – and the fact that in many superfamilies we show that different parts of the domain surface are exploited by different relatives for ligand/protein interactions, particularly in superfamilies which are diverse in sequence and structure, an observation not previously reported on such a large scale. This article is part of a Special Issue entitled: The emerging dynamic view of proteins: Protein plasticity in allostery, evolution and self-assembly.
Highlights
- Most diverse domain superfamilies have very diverse functional site locations.
- Catalytic sites are found in a small, restricted number of topological positions.
- Location of small-ligand binding sites is more diverse than expected.
- Protein–protein interfaces display the most flexibility in functional site locations.
|
45
|
Abstract
Co-evolution is a fundamental component of the theory of evolution and is essential for understanding the relationships between species in complex ecological networks. A wide range of co-evolution-inspired computational methods has been designed to predict molecular interactions, but it is only recently that important advances have been made. Breakthroughs in the handling of phylogenetic information and in disentangling indirect relationships have resulted in an improved capacity to predict interactions between proteins and contacts between different protein residues. Here, we review the main co-evolution-based computational approaches, their theoretical basis, potential applications and foreseeable developments.
Affiliation(s)
- David de Juan
- Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
|
46
|
Residue mutations and their impact on protein structure and function: detecting beneficial and pathogenic changes. Biochem J 2013; 449:581-94. [DOI: 10.1042/bj20121221] [Citation(s) in RCA: 131] [Impact Index Per Article: 11.9] [Indexed: 12/29/2022]
Abstract
The present review focuses on the evolution of proteins and the impact of amino acid mutations on function from a structural perspective. Proteins evolve under the law of natural selection and undergo alternating periods of conservative evolution and of relatively rapid change. The likelihood of mutations being fixed in the genome depends on various factors, such as the fitness of the phenotype or the position of the residues in the three-dimensional structure. For example, co-evolution of residues located close together in three-dimensional space can occur to preserve global stability. Whereas point mutations can fine-tune the protein function, residue insertions and deletions (‘decorations’ at the structural level) can sometimes modify functional sites and protein interactions more dramatically. We discuss recent developments and tools to identify such episodic mutations, and examine their applications in medical research. Such tools have been tested on simulated data and applied to real data such as viruses or animal sequences. Traditionally, there has been little if any cross-talk between the fields of protein biophysics, protein structure–function and molecular evolution. However, the last several years have seen some exciting developments in combining these approaches to obtain an in-depth understanding of how proteins evolve. For example, a better understanding of how structural constraints affect protein evolution will greatly help us to optimize our models of sequence evolution. The present review explores this new synthesis of perspectives.
|
47
|
Low-resolution structural modeling of protein interactome. Curr Opin Struct Biol 2013; 23:198-205. [PMID: 23294579 DOI: 10.1016/j.sbi.2012.12.003] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Received: 11/07/2012] [Accepted: 12/03/2012] [Indexed: 11/23/2022]
Abstract
Structural characterization of protein-protein interactions across the broad spectrum of scales is key to our understanding of life at the molecular level. Low-resolution approach to protein interactions is needed for modeling large interaction networks, given the significant level of uncertainties in large biomolecular systems and the high-throughput nature of the task. Since only a fraction of protein structures in interactome are determined experimentally, protein docking approaches are increasingly focusing on modeled proteins. Current rapid advancement of template-based modeling of protein-protein complexes is following a long standing trend in structure prediction of individual proteins. Protein-protein templates are already available for almost all interactions of structurally characterized proteins, and about one third of such templates are likely correct.
|
48
|
Kastritis PL, Bonvin AMJJ. On the binding affinity of macromolecular interactions: daring to ask why proteins interact. J R Soc Interface 2012; 10:20120835. [PMID: 23235262 PMCID: PMC3565702 DOI: 10.1098/rsif.2012.0835] [Citation(s) in RCA: 276] [Impact Index Per Article: 23.0] [Indexed: 01/13/2023] [Open Access]
Abstract
Interactions between proteins are orchestrated in a precise and time-dependent manner, underlying cellular function. The binding affinity, defined as the strength of these interactions, is translated into physico-chemical terms in the dissociation constant (Kd), the latter being an experimental measure that determines whether an interaction will be formed in solution or not. Predicting binding affinity from structural models has been a matter of active research for more than 40 years because of its fundamental role in drug development. However, all available approaches are incapable of predicting the binding affinity of protein–protein complexes from coordinates alone. Here, we examine both theoretical and experimental limitations that complicate the derivation of structure–affinity relationships. Most work so far has concentrated on binary interactions. Systems of increased complexity are far from being understood. The main physico-chemical measure that relates to binding affinity is the buried surface area, but it does not hold for flexible complexes. For the latter, there must be a significant entropic contribution that will have to be approximated in the future. We foresee that any theoretical modelling of these interactions will have to follow an integrative approach considering the biology, chemistry and physics that underlie protein–protein recognition.
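The link between the dissociation constant and binding strength mentioned in the abstract above is usually expressed through the standard thermodynamic relation ΔG° = RT ln(Kd / c°), with a reference concentration c° of 1 M. The Python sketch below applies that textbook relation; the function name, the default temperature and the unit choice (kcal/mol) are assumptions made for illustration and are not taken from the paper.

```python
import math

# Gas constant in kcal/(mol*K).
GAS_CONSTANT_KCAL = 1.987204e-3


def binding_free_energy(kd_molar, temperature_kelvin=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant.

    Uses dG = R * T * ln(Kd / c0) with c0 = 1 M, so Kd must be given in mol/L.
    More negative values correspond to tighter binding.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    return GAS_CONSTANT_KCAL * temperature_kelvin * math.log(kd_molar)


# Hypothetical example values: a nanomolar and a micromolar complex.
for kd in (1e-9, 1e-6):
    print(f"Kd = {kd:.0e} M -> dG = {binding_free_energy(kd):.2f} kcal/mol")
```

Because the relation is logarithmic, each tenfold change in Kd shifts ΔG° by a constant amount at a given temperature, which is one reason affinities are often compared on a log scale.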
Affiliation(s)
- Panagiotis L Kastritis
- Bijvoet Center for Biomolecular Research, Faculty of Science, Chemistry, Utrecht University, Padualaan 8, Utrecht, The Netherlands
|
49
|
Chen YPP. Computational methods for protein interaction and structural prediction. BIOCHIMICA ET BIOPHYSICA ACTA 2012; 1824:1416-1417. [PMID: 23084263 DOI: 10.1016/j.bbapap.2012.09.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]
|
50
|
Ochoa D, García-Gutiérrez P, Juan D, Valencia A, Pazos F. Incorporating information on predicted solvent accessibility to the co-evolution-based study of protein interactions. MOLECULAR BIOSYSTEMS 2012; 9:70-6. [PMID: 23104128 DOI: 10.1039/c2mb25325a] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 12/15/2022]
Abstract
A widespread family of methods for studying and predicting protein interactions using sequence information is based on co-evolution, quantified as similarity of phylogenetic trees. Part of the co-evolution observed between interacting proteins could be due to co-adaptation caused by inter-protein contacts. In this case, the co-evolution is expected to be more evident when evaluated on the surface of the proteins or the internal layers close to it. In this work we study the effect of incorporating information on predicted solvent accessibility into three methods for predicting protein interactions based on similarity of phylogenetic trees. We evaluate the performance of these methods in predicting different types of protein associations when trees based on positions with different characteristics of predicted accessibility are used as input. We found that predicted accessibility improves the results of two recent versions of the mirrortree methodology in predicting direct binary physical interactions, while it improves neither these methods nor the original mirrortree method in predicting other types of interactions. That improvement comes at no cost in terms of applicability, since accessibility can be predicted for any sequence. We also found that predictions of protein-protein interactions are improved when multiple sequence alignments with a richer representation of sequences (including paralogs) are incorporated in the accessibility prediction.
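The idea of quantifying co-evolution as tree similarity can be sketched as a correlation between the inter-species distance matrices of two protein families, in the spirit of the original mirrortree method. This is an illustrative sketch, not the authors' implementation: the function names and the toy matrices are hypothetical, and the accessibility-based selection of alignment positions studied in the paper is not modeled here.

```python
import math
from itertools import combinations


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("distance matrix must be square")
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / math.sqrt(var_x * var_y)


def tree_similarity(dist_a, dist_b):
    """Correlate two distance matrices built over the same ordered species."""
    if len(dist_a) != len(dist_b):
        raise ValueError("both families must cover the same species")
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))


if __name__ == "__main__":
    # Toy distances for two hypothetical families over the same four species.
    family_a = [[0, 1, 4, 5], [1, 0, 3, 6], [4, 3, 0, 2], [5, 6, 2, 0]]
    family_b = [[2 * d for d in row] for row in family_a]
    print(f"similarity: {tree_similarity(family_a, family_b):.3f}")
```

A high correlation is read as a signal of co-evolution. In the setting described above, the distance matrices would be derived from trees built only on positions with a chosen predicted accessibility.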
Affiliation(s)
- David Ochoa
- Computational Systems Biology Group, National Centre for Biotechnology (CNB-CSIC), C/Darwin, 3, Cantoblanco, 28049 Madrid, Spain
|