1
Pandey U, Behara SM, Sharma S, Patil RS, Nambiar S, Koner D, Bhukya H. DeePNAP: A Deep Learning Method to Predict Protein-Nucleic Acid Binding Affinity from Their Sequences. J Chem Inf Model 2024; 64:1806-1815. [PMID: 38458968 DOI: 10.1021/acs.jcim.3c01151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/10/2024]
Abstract
Predicting the protein-nucleic acid (PNA) binding affinity solely from their sequences is of paramount importance for the experimental design and analysis of PNA interactions (PNAIs). A large number of currently developed models for binding affinity prediction are limited to specific PNAIs while also relying on the sequence and structural information of the PNA complexes for both training and testing, and also as inputs. As the PNA complex structures available are scarce, this significantly limits the diversity and generalizability due to the small training data set. Additionally, a majority of the tools predict a single parameter, such as binding affinity or free energy changes upon mutations, rendering a model less versatile for usage. Hence, we propose DeePNAP, a machine learning-based model built from a vast and heterogeneous data set with 14,401 entries (from both eukaryotes and prokaryotes) from the ProNAB database, consisting of wild-type and mutant PNA complex binding parameters. Our model precisely predicts the binding affinity and free energy changes due to the mutation(s) of PNAIs exclusively from their sequences. While other similar tools extract features from both sequence and structure information, DeePNAP employs sequence-based features to yield high correlation coefficients between the predicted and experimental values with low root mean squared errors for PNA complexes in predicting KD and ΔΔG, implying the generalizability of DeePNAP. Additionally, we have also developed a web interface hosting DeePNAP that can serve as a powerful tool to rapidly predict binding affinities for a myriad of PNAIs with high precision toward developing a deeper understanding of their implications in various biological systems. Web interface: http://14.139.174.41:8080/.
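The two quantities named in this abstract, KD and ΔΔG, are linked by standard thermodynamics: ΔG = RT ln KD (with KD expressed relative to a 1 M standard state) and ΔΔG = ΔG(mutant) − ΔG(wild type). The sketch below only illustrates that relationship; it is not DeePNAP code. The function names, the temperature of 298.15 K, and the example KD values are assumptions made for illustration, and the sign convention for ΔΔG (positive means weaker binding) may differ from the one used in ProNAB.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def kd_to_delta_g(kd_molar, temperature_k=298.15):
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L.

    Uses dG = R * T * ln(KD), so tighter binding (smaller KD) gives a
    more negative dG. The default temperature is an assumption.
    """
    if kd_molar <= 0:
        raise ValueError("KD must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def delta_delta_g(kd_wild_type, kd_mutant, temperature_k=298.15):
    """ddG = dG(mutant) - dG(wild type), in kcal/mol.

    With this (assumed) sign convention, a positive value means the
    mutation weakens binding.
    """
    return (kd_to_delta_g(kd_mutant, temperature_k)
            - kd_to_delta_g(kd_wild_type, temperature_k))


if __name__ == "__main__":
    # Hypothetical values: a 1 nM wild-type complex and a 10 nM mutant.
    print(round(kd_to_delta_g(1e-9), 2))
    print(round(delta_delta_g(1e-9, 1e-8), 2))
```

At 298.15 K, a tenfold increase in KD corresponds to a ΔΔG of about 1.36 kcal/mol, which is a convenient scale for reading reported prediction errors.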
Affiliation(s)
- Uddeshya Pandey
  - Department of Biology, Indian Institute of Science Education and Research Tirupati, Tirupati 517507, India
- Sasi M Behara
  - Department of Biology, Indian Institute of Science Education and Research Tirupati, Tirupati 517507, India
- Siddhant Sharma
  - Department of Biology, Indian Institute of Science Education and Research Tirupati, Tirupati 517507, India
- Rachit S Patil
  - Department of Biology, Indian Institute of Science Education and Research Tirupati, Tirupati 517507, India
- Souparnika Nambiar
  - Department of Biology, Indian Institute of Science Education and Research Tirupati, Tirupati 517507, India
- Debasish Koner
  - Department of Chemistry, Indian Institute of Technology Hyderabad, Kandi 502284, India
- Hussain Bhukya
  - Department of Biology, Indian Institute of Science Education and Research Tirupati, Tirupati 517507, India
2
Gohl P, Bonet J, Fornes O, Planas-Iglesias J, Fernandez-Fuentes N, Oliva B. SBILib: a handle for protein modeling and engineering. Bioinformatics 2023; 39:btad613. [PMID: 37796837 PMCID: PMC10589914 DOI: 10.1093/bioinformatics/btad613] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 09/22/2023] [Accepted: 10/04/2023] [Indexed: 10/07/2023] [Open Access]
Abstract
SUMMARY: The SBILib Python library provides an integrated platform for the analysis of macromolecular structures and interactions. It combines simple 3D file parsing and workup methods with more advanced analytical tools. SBILib includes modules for macromolecular interactions, loops, super-secondary structures, and biological sequences, as well as wrappers for external tools with which to integrate their results and facilitate the comparative analysis of protein structures and their complexes. The library can handle macromolecular complexes formed by proteins and/or nucleic acid molecules (i.e. DNA and RNA). It is uniquely capable of parsing and calculating protein super-secondary structure and loop geometry. We have compiled a list of example scenarios to which SBILib may be applied and provided access to these within the library. AVAILABILITY AND IMPLEMENTATION: SBILib is made available on GitHub at https://github.com/structuralbioinformatics/SBILib.
Affiliation(s)
- Patrick Gohl
  - Department of Medicine and Life Sciences, SBI-GRIB, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Jaume Bonet
  - Department of Medicine and Life Sciences, SBI-GRIB, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Oriol Fornes
  - Department of Medical Genetics, Centre for Molecular Medicine and Therapeutics, BC Children’s Hospital Research Institute, University of British Columbia, Vancouver, BC V5Z 4H4, Canada
- Joan Planas-Iglesias
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, 625 00 Brno, Czech Republic
  - International Clinical Research Center, St Anne’s University Hospital Brno, 656 916 Brno, Czech Republic
- Narcís Fernandez-Fuentes
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth SY23 3DA, United Kingdom
- Baldo Oliva
  - Department of Medicine and Life Sciences, SBI-GRIB, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
3
Yang YX, Huang JY, Wang P, Zhu BT. AREA-AFFINITY: A Web Server for Machine Learning-Based Prediction of Protein-Protein and Antibody-Protein Antigen Binding Affinities. J Chem Inf Model 2023. [PMID: 37235532 DOI: 10.1021/acs.jcim.2c01499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/28/2023]
Abstract
Protein-Protein binding affinity reflects the binding strength between the binding partners. The prediction of protein-protein binding affinity is important for elucidating protein functions and also for designing protein-based therapeutics. The geometric characteristics such as area (both interface and surface areas) in the structure of a protein-protein complex play an important role in determining protein-protein interactions and their binding affinity. Here, we present a free web server for academic use, AREA-AFFINITY, for prediction of protein-protein or antibody-protein antigen binding affinity based on interface and surface areas in the structure of a protein-protein complex. AREA-AFFINITY implements 60 effective area-based protein-protein affinity predictive models and 37 effective area-based models specific for antibody-protein antigen binding affinity prediction developed in our recent studies. These models take into consideration the roles of interface and surface areas in binding affinity by using areas classified according to different amino acid types with different biophysical nature. The models with the best performances integrate machine learning methods such as neural network or random forest. These newly developed models have superior or comparable performance compared to the commonly used existing methods. AREA-AFFINITY is available for free at: https://affinity.cuhk.edu.cn/.
Affiliation(s)
- Yong Xiao Yang
  - Shenzhen Key Laboratory of Steroid Drug Discovery and Development, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Guangdong 518172, China
- Jin Yan Huang
  - Shenzhen Key Laboratory of Steroid Drug Discovery and Development, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Guangdong 518172, China
- Pan Wang
  - Shenzhen Key Laboratory of Steroid Drug Discovery and Development, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Guangdong 518172, China
- Bao Ting Zhu
  - Shenzhen Key Laboratory of Steroid Drug Discovery and Development, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Guangdong 518172, China
  - Shenzhen Bay Laboratory, Shenzhen, 518055, China
4
Guo Z, Yamaguchi R. Machine learning methods for protein-protein binding affinity prediction in protein design. Front Bioinform 2022; 2:1065703. [PMID: 36591334 PMCID: PMC9800603 DOI: 10.3389/fbinf.2022.1065703] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/10/2022] [Accepted: 12/01/2022] [Indexed: 12/23/2022] [Open Access]
Abstract
Protein-protein interactions govern a wide range of biological activity. A proper estimation of the protein-protein binding affinity is vital to design proteins with high specificity and binding affinity toward a target protein, which has a variety of applications including antibody design in immunotherapy, enzyme engineering for reaction optimization, and construction of biosensors. However, experimental and theoretical modelling methods are time-consuming, hinder the exploration of the entire protein space, and deter the identification of optimal proteins that meet the requirements of practical applications. In recent years, the rapid development in machine learning methods for protein-protein binding affinity prediction has revealed the potential of a paradigm shift in protein design. Here, we review the prediction methods and associated datasets and discuss the requirements and construction methods of binding affinity prediction models for protein design.
Affiliation(s)
- Zhongliang Guo
  - Division of Cancer Systems Biology, Aichi Cancer Center Research Institute, Nagoya, Aichi, Japan
- Rui Yamaguchi
  - Division of Cancer Systems Biology, Aichi Cancer Center Research Institute, Nagoya, Aichi, Japan
  - Division of Cancer Informatics, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan
  - *Correspondence: Rui Yamaguchi
5
Desantis F, Miotto M, Di Rienzo L, Milanetti E, Ruocco G. Spatial organization of hydrophobic and charged residues affects protein thermal stability and binding affinity. Sci Rep 2022; 12:12087. [PMID: 35840609 PMCID: PMC9287411 DOI: 10.1038/s41598-022-16338-5] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 04/16/2022] [Accepted: 07/08/2022] [Indexed: 11/12/2022] [Open Access]
Abstract
What are the molecular determinants of protein–protein binding affinity and whether they are similar to those regulating fold stability are two major questions of molecular biology, whose answers bring important implications both from a theoretical and applicative point of view. Here, we analyze chemical and physical features on a large dataset of protein–protein complexes with reliable experimental binding affinity data and compare them with a set of monomeric proteins for which melting temperature data was available. In particular, we probed the spatial organization of protein (1) intramolecular and intermolecular interaction energies among residues, (2) amino acid composition, and (3) their hydropathy features. Analyzing the interaction energies, we found that strong Coulombic interactions are preferentially associated with a high protein thermal stability, while strong intermolecular van der Waals energies correlate with stronger protein–protein binding affinity. Statistical analysis of amino acid abundances, exposed to the molecular surface and/or in interaction with the molecular partner, confirmed that hydrophobic residues present on the protein surfaces are preferentially located in the binding regions, while charged residues behave oppositely. Leveraging the important role of van der Waals interface interactions in binding affinity, we focused on the molecular surfaces in the binding regions and evaluated their shape complementarity, decomposing the molecular patches in the 2D Zernike basis. For the first time, we quantified the correlation between local shape complementarity and binding affinity via the Zernike formalism. In addition, considering the solvent interactions via the residue hydropathy, we found that the hydrophobicity of the binding regions dictates their shape complementarity as much as the correlation between van der Waals energy and binding affinity. In turn, these relationships pave the way to the fast and accurate prediction and design of optimal binding regions as the 2D Zernike formalism allows a rapid and superposition-free comparison between possible binding surfaces.
Affiliation(s)
- Fausta Desantis
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Viale Regina Elena 291, 00161, Rome, Italy
  - The Open University Affiliated Research Centre at Istituto Italiano di Tecnologia, Via Morego, 30, 16163, Genoa, Italy
- Mattia Miotto
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Viale Regina Elena 291, 00161, Rome, Italy
- Lorenzo Di Rienzo
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Viale Regina Elena 291, 00161, Rome, Italy
- Edoardo Milanetti
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Viale Regina Elena 291, 00161, Rome, Italy
  - Department of Physics, Sapienza University of Rome, Piazzale Aldo Moro, 5, 00185, Rome, Italy
- Giancarlo Ruocco
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Viale Regina Elena 291, 00161, Rome, Italy
  - Department of Physics, Sapienza University of Rome, Piazzale Aldo Moro, 5, 00185, Rome, Italy
6
Yang YX, Wang P, Zhu BT. Relative importance of interface and surface areas in protein-protein binding affinity prediction: A machine learning analysis based on linear regression and artificial neural network. Biophys Chem 2022; 283:106762. [DOI: 10.1016/j.bpc.2022.106762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2021] [Revised: 01/11/2022] [Accepted: 01/14/2022] [Indexed: 11/02/2022]
7
Dhusia K, Madrid C, Su Z, Wu Y. EXCESP: A Structure-Based Online Database for Extracellular Interactome of Cell Surface Proteins in Humans. J Proteome Res 2022; 21:349-359. [PMID: 34978816 DOI: 10.1021/acs.jproteome.1c00612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
The interactions between ectodomains of cell surface proteins are vital players in many important cellular processes, such as regulating immune responses, coordinating cell differentiation, and shaping neural plasticity. However, while the construction of a large-scale protein interactome has been greatly facilitated by the development of high-throughput experimental techniques, little progress has been made to support the discovery of extracellular interactome for cell surface proteins. Harnessed by the recent advances in computational modeling of protein-protein interactions, here we present a structure-based online database for the extracellular interactome of cell surface proteins in humans, called EXCESP. The database contains both experimentally determined and computationally predicted interactions among all type-I transmembrane proteins in humans. All structural models for these interactions and their binding affinities were further computationally modeled. Moreover, information such as expression levels of each protein in different cell types and its relation to various signaling pathways from other online resources has also been integrated into the database. In summary, the database serves as a valuable addition to the existing online resources for the study of cell surface proteins. It can contribute to the understanding of the functions of cell surface proteins in the era of systems biology.
Affiliation(s)
- Kalyani Dhusia
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
- Carlos Madrid
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
  - Laboratory for Macromolecular Analysis and Proteomics, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
- Zhaoqian Su
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
8
Wang B, Su Z, Wu Y. Computational Assessment of Protein-Protein Binding Affinity by Reverse Engineering the Energetics in Protein Complexes. Genomics Proteomics Bioinformatics 2021; 19:1012-1022. [PMID: 33838354 PMCID: PMC9403033 DOI: 10.1016/j.gpb.2021.03.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/10/2018] [Revised: 03/07/2019] [Accepted: 05/17/2019] [Indexed: 11/29/2022]
Abstract
The cellular functions of proteins are maintained by forming diverse complexes. The stability of these complexes is quantified by the measurement of binding affinity, and mutations that alter the binding affinity can cause various diseases such as cancer and diabetes. As a result, accurate estimation of the binding stability and the effects of mutations on changes of binding affinity is a crucial step to understanding the biological functions of proteins and their dysfunctional consequences. It has been hypothesized that the stability of a protein complex is dependent not only on the residues at its binding interface by pairwise interactions but also on all other remaining residues that do not appear at the binding interface. Here, we computationally reconstruct the binding affinity by decomposing it into the contributions of interfacial residues and other non-interfacial residues in a protein complex. We further assume that the contributions of both interfacial and non-interfacial residues to the binding affinity depend on their local structural environments such as solvent-accessible surfaces and secondary structural types. The weights of all corresponding parameters are optimized by Monte-Carlo simulations. After cross-validation against a large-scale dataset, we show that the model not only shows a strong correlation between the absolute values of the experimental and calculated binding affinities, but can also be an effective approach to predict the relative changes of binding affinity from mutations. Moreover, we have found that the optimized weights of many parameters can capture the first-principle chemical and physical features of molecular recognition, therefore reversely engineering the energetics of protein complexes. These results suggest that our method can serve as a useful addition to current computational approaches for predicting binding affinity and understanding the molecular mechanism of protein–protein interactions.
Affiliation(s)
- Bo Wang
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
- Zhaoqian Su
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
9
Abbasi WA, Yaseen A, Hassan FU, Andleeb S, Minhas FUAA. ISLAND: in-silico proteins binding affinity prediction using sequence information. BioData Min 2020; 13:20. [PMID: 33292419 PMCID: PMC7688004 DOI: 10.1186/s13040-020-00231-w] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 04/03/2020] [Accepted: 11/15/2020] [Indexed: 12/30/2022] [Open Access]
Abstract
BACKGROUND: Determining binding affinity in protein-protein interactions is important in the discovery and design of novel therapeutics and mutagenesis studies. Determination of binding affinity of proteins in the formation of protein complexes requires sophisticated, expensive and time-consuming experimentation which can be replaced with computational methods. Most computational prediction techniques require protein structures that limit their applicability to protein complexes with known structures. In this work, we explore sequence-based protein binding affinity prediction using machine learning. METHOD: We have used protein sequence information instead of protein structures along with machine learning techniques to accurately predict the protein binding affinity. RESULTS: We present our findings that the true generalization performance of even the state-of-the-art sequence-only predictor is far from satisfactory and that the development of machine learning methods for binding affinity prediction with improved generalization performance is still an open problem. We have also proposed a sequence-based novel protein binding affinity predictor called ISLAND which gives better accuracy than existing methods over the same validation set as well as on external independent test dataset. A cloud-based webserver implementation of ISLAND and its Python code are available at https://sites.google.com/view/wajidarshad/software. CONCLUSION: This paper highlights the fact that the true generalization performance of even the state-of-the-art sequence-only predictor of binding affinity is far from satisfactory and that the development of effective and practical methods in this domain is still an open problem.
Affiliation(s)
- Wajid Arshad Abbasi
  - Computational Biology and Data Analysis Laboratory, Department of Computer Science and Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, Pakistan
  - Biomedical Informatics Research Laboratory, Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
- Adiba Yaseen
  - Biomedical Informatics Research Laboratory, Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
- Fahad Ul Hassan
  - Biomedical Informatics Research Laboratory, Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
- Saiqa Andleeb
  - Biotechnology Laboratory, Department of Zoology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, Pakistan
10
Nithin C, Mukherjee S, Bahadur RP. A structure-based model for the prediction of protein-RNA binding affinity. RNA 2019; 25:1628-1645. [PMID: 31395671 PMCID: PMC6859855 DOI: 10.1261/rna.071779.119] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 05/04/2019] [Accepted: 08/05/2019] [Indexed: 05/28/2023]
Abstract
Protein-RNA recognition is highly affinity-driven and regulates a wide array of cellular functions. In this study, we have curated a binding affinity data set of 40 protein-RNA complexes, for which at least one unbound partner is available in the docking benchmark. The data set covers a wide affinity range of eight orders of magnitude as well as four different structural classes. On average, we find the complexes with single-stranded RNA have the highest affinity, whereas the complexes with the duplex RNA have the lowest. Nevertheless, free energy gain upon binding is the highest for the complexes with ribosomal proteins and the lowest for the complexes with tRNA with an average of -5.7 cal/mol/Å² in the entire data set. We train regression models to predict the binding affinity from the structural and physicochemical parameters of protein-RNA interfaces. The best fit model with the lowest maximum error is provided with three interface parameters: relative hydrophobicity, conformational change upon binding and relative hydration pattern. This model has been used for predicting the binding affinity on a test data set, generated using mutated structures of yeast aspartyl-tRNA synthetase, for which experimentally determined ΔG values of 40 mutations are available. The predicted ΔGempirical values highly correlate with the experimental observations. The data set provided in this study should be useful for further development of the binding affinity prediction methods. Moreover, the model developed in this study enhances our understanding on the structural basis of protein-RNA binding affinity and provides a platform to engineer protein-RNA interfaces with desired affinity.
Affiliation(s)
- Chandran Nithin
  - Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
- Sunandan Mukherjee
  - Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
- Ranjit Prasad Bahadur
  - Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
11
Su Z, Wu Y. Multiscale simulation unravel the kinetic mechanisms of inflammasome assembly. Biochim Biophys Acta Mol Cell Res 2019; 1867:118612. [PMID: 31758956 DOI: 10.1016/j.bbamcr.2019.118612] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/13/2019] [Revised: 11/11/2019] [Accepted: 11/18/2019] [Indexed: 01/16/2023]
Abstract
In the innate immune system, the host defense from the invasion of external pathogens triggers the inflammatory responses. Proteins involved in the inflammatory pathways were often found to aggregate into supramolecular oligomers, called 'inflammasome', mostly through the homotypic interaction between their domains that belong to the death domain superfamily. Although much has been known about the formation of these helical molecular machineries, the detailed correlation between the dynamics of their assembly and the structure of each domain is still not well understood. Using the filament formed by the PYD domains of adaptor molecule ASC as a test system, we constructed a new multiscale simulation framework to study the kinetics of inflammasome assembly. We found that the filament assembly is a multi-step, but highly cooperative process. Moreover, there are three types of binding interfaces between domain subunits in the ASCPYD filament. The multiscale simulation results suggest that dynamics of domain assembly are rooted in the primary protein sequence which defines the energetics of molecular recognition through three binding interfaces. Interface I plays a more regulatory role than the other two in mediating both the kinetics and the thermodynamics of assembly. Finally, the efficiency of our computational framework allows us to design mutants on a systematic scale and predict their impacts on filament assembly. In summary, this is, to the best of our knowledge, the first simulation method to model the spatial-temporal process of inflammasome assembly. Our work is a useful addition to a suite of existing experimental techniques to study the functions of inflammasome in innate immune system.
Affiliation(s)
- Zhaoqian Su
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, United States of America
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, United States of America
12
Marín-López MA, Planas-Iglesias J, Aguirre-Plans J, Bonet J, Garcia-Garcia J, Fernandez-Fuentes N, Oliva B. On the mechanisms of protein interactions: predicting their affinity from unbound tertiary structures. Bioinformatics 2018; 34:592-598. [PMID: 29028891 PMCID: PMC5860604 DOI: 10.1093/bioinformatics/btx616] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 11/22/2016] [Accepted: 09/26/2017] [Indexed: 12/12/2022] [Open Access]
Abstract
Motivation: The characterization of the protein–protein association mechanisms is crucial to understanding how biological processes occur. It has been previously shown that the early formation of non-specific encounters enhances the realization of the stereospecific (i.e. native) complex by reducing the dimensionality of the search process. The association rate for the formation of such a complex plays a crucial role in cell biology and depends on how the partners diffuse to be close to each other. Predicting the binding free energy of proteins provides new opportunities to modulate and control protein–protein interactions. However, existing methods require the 3D structure of the complex to predict its affinity, severely limiting their application to interactions with known structures. Results: We present a new approach that relies on the unbound protein structures and protein docking to predict protein–protein binding affinities. Through the study of the docking space (i.e. decoys), the method predicts the binding affinity of the query proteins when the actual structure of the complex itself is unknown. We tested our approach on a set of globular and soluble proteins of the newest affinity benchmark, obtaining accuracy values comparable to other state-of-the-art methods: a 0.4 correlation coefficient between the experimental and predicted values of ΔG and an error < 3 kcal/mol. Availability and implementation: The binding affinity predictor is implemented and available at http://sbi.upf.edu/BADock and https://github.com/badocksbi/BADock. Supplementary information: Supplementary data are available at Bioinformatics online.
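The two figures of merit quoted in this abstract, a correlation coefficient between experimental and predicted ΔG and an error in kcal/mol, can be computed as in the sketch below. It assumes the Pearson correlation and the root-mean-square error, which are common choices in this literature; the abstract itself does not say which correlation or error measure was used. The function names and the four ΔG values are hypothetical and are not taken from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def rmse(xs, ys):
    """Root-mean-square error between two equal-length sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


if __name__ == "__main__":
    # Hypothetical experimental and predicted dG values in kcal/mol.
    experimental = [-10.0, -8.0, -12.0, -9.0]
    predicted = [-9.0, -9.0, -11.0, -10.0]
    print(pearson_r(experimental, predicted))
    print(rmse(experimental, predicted))
```

The two measures answer different questions: the correlation reports whether the predictions rank the complexes correctly, while the error reports how far individual predictions are from the measured values, so a method can score well on one and poorly on the other.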
Collapse
Affiliation(s)
- Manuel Alejandro Marín-López
- Structural Bioinformatics Lab, Department of Experimental and Health Science, Universitat Pompeu Fabra, Barcelona 08003, Spain
| | - Joan Planas-Iglesias
- Division of Metabolic and Vascular Health, University of Warwick, Coventry CV4?7AL, UK
| | - Joaquim Aguirre-Plans
- Structural Bioinformatics Lab, Department of Experimental and Health Science, Universitat Pompeu Fabra, Barcelona 08003, Spain
| | - Jaume Bonet
- Laboratory of Protein Design and Immunoenginneering, School of Engineering, Ecole Polytechnique Federale de Lausanne, Lausanne 1015, Switzerland
| | - Javier Garcia-Garcia
- Structural Bioinformatics Lab, Department of Experimental and Health Science, Universitat Pompeu Fabra, Barcelona 08003, Spain
| | - Narcis Fernandez-Fuentes
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth SY23?3DA, UK
| | - Baldo Oliva
- Structural Bioinformatics Lab, Department of Experimental and Health Science, Universitat Pompeu Fabra, Barcelona 08003, Spain
| |
|
13
|
Raucci R, Laine E, Carbone A. Local Interaction Signal Analysis Predicts Protein-Protein Binding Affinity. Structure 2018; 26:905-915.e4. [PMID: 29779789 DOI: 10.1016/j.str.2018.04.006] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2017] [Revised: 02/06/2018] [Accepted: 04/10/2018] [Indexed: 12/27/2022]
Abstract
Several models estimating the strength of the interaction between proteins in a complex have been proposed. By exploring the geometry of contact distribution at protein-protein interfaces, we provide an improved model of binding energy. Local interaction signal analysis (LISA) is a radial function based on terms describing favorable and non-favorable contacts obtained by density functional theory, the support-core-rim interface residue distribution, non-interacting charged residues and secondary structures contribution. The three-dimensional organization of the contacts and their contribution on localized hot-sites over the entire interaction surface were numerically evaluated. LISA achieves a correlation of 0.81 (and a root-mean-square error of 2.35 ± 0.38 kcal/mol) when tested on 125 complexes for which experimental measurements were realized. LISA's performance is stable for subsets defined by functional composition and extent of conformational changes upon complex formation. A large-scale comparison with 17 other functions demonstrated the power of the geometrical model in the understanding of complex binding.
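Several abstracts in this list, including the one above, report performance as a correlation coefficient together with a root-mean-square error between predicted and experimental binding energies. A minimal sketch of how these two metrics are commonly computed is given below. The function names and the sample values are illustrative assumptions and are not taken from any of the cited papers.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rmse(predicted, observed):
    """Root-mean-square error between predictions and observations."""
    n = len(predicted)
    if n != len(observed) or n == 0:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


# Hypothetical binding free energies in kcal/mol (made-up numbers).
experimental = [-8.0, -10.5, -12.1, -9.3, -14.0]
predicted = [-8.9, -9.8, -11.0, -10.1, -12.7]

print("r    =", round(pearson_r(predicted, experimental), 3))
print("RMSE =", round(rmse(predicted, experimental), 3), "kcal/mol")
```

The two metrics are complementary: a scoring function can rank complexes well (high correlation) while still being offset in absolute terms (high RMSE), which is why abstracts tend to quote both.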
Affiliation(s)
- Raffaele Raucci
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 4 Place Jussieu, 75005 Paris, France; Sorbonne Université, Institut des Sciences du Calcul et des Données (ISCD), 75005 Paris, France
| | - Elodie Laine
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 4 Place Jussieu, 75005 Paris, France
| | - Alessandra Carbone
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 4 Place Jussieu, 75005 Paris, France; Institut Universitaire de France, 75005 Paris, France.
| |
|
14
|
Škrbić T, Zamuner S, Hong R, Seno F, Laio A, Trovato A. Vibrational entropy estimation can improve binding affinity prediction for non-obligatory protein complexes. Proteins 2018; 86:393-404. [DOI: 10.1002/prot.25454] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/11/2017] [Revised: 12/22/2017] [Accepted: 01/05/2018] [Indexed: 01/10/2023]
Affiliation(s)
- Tatjana Škrbić
- Faculty of Physics; International School for Advanced Studies (SISSA/ISAS); Trieste Italy
- Department of Physics and Astronomy “Galileo Galilei”; University of Padova; Padova Italy
| | - Stefano Zamuner
- Department of Physics and Astronomy “Galileo Galilei”; University of Padova; Padova Italy
| | - Rolando Hong
- Faculty of Physics; International School for Advanced Studies (SISSA/ISAS); Trieste Italy
| | - Flavio Seno
- Department of Physics and Astronomy “Galileo Galilei”; University of Padova; Padova Italy
- Padova Section, National Institute of Nuclear Physics (INFN); Padova Italy
| | - Alessandro Laio
- Faculty of Physics; International School for Advanced Studies (SISSA/ISAS); Trieste Italy
| | - Antonio Trovato
- Department of Physics and Astronomy “Galileo Galilei”; University of Padova; Padova Italy
- Padova Section, National Institute of Nuclear Physics (INFN); Padova Italy
| |
|
15
|
Bier D, Mittal S, Bravo-Rodriguez K, Sowislok A, Guillory X, Briels J, Heid C, Bartel M, Wettig B, Brunsveld L, Sanchez-Garcia E, Schrader T, Ottmann C. The Molecular Tweezer CLR01 Stabilizes a Disordered Protein-Protein Interface. J Am Chem Soc 2017; 139:16256-16263. [PMID: 29039919 PMCID: PMC5691318 DOI: 10.1021/jacs.7b07939] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2017] [Indexed: 12/13/2022]
Abstract
Protein regions that are involved in protein-protein interactions (PPIs) very often display a high degree of intrinsic disorder, which is reduced during the recognition process. A prime example is binding of the rigid 14-3-3 adapter proteins to their numerous partner proteins, whose recognition motifs undergo an extensive disorder-to-order transition. In this context, it is highly desirable to control this entropy-costly process using tailored stabilizing agents. This study reveals how the molecular tweezer CLR01 tunes the 14-3-3/Cdc25CpS216 protein-protein interaction. Protein crystallography, biophysical affinity determination and biomolecular simulations unanimously deliver a remarkable finding: a supramolecular "Janus" ligand can bind simultaneously to a flexible peptidic PPI recognition motif and to a well-structured adapter protein. This binding fills a gap in the protein-protein interface, "freezes" one of the conformational states of the intrinsically disordered Cdc25C protein partner and enhances the apparent affinity of the interaction. This is the first structural and functional proof of a supramolecular ligand targeting a PPI interface and stabilizing the binding of an intrinsically disordered recognition motif to a rigid partner protein.
Affiliation(s)
- David Bier
- Laboratory of Chemical Biology, Department of Biomedical Engineering and Institute for Complex Molecular Systems, Eindhoven University of Technology, Den Dolech 2, 5612 AZ Eindhoven, The Netherlands
- Department of Chemistry, University of Duisburg-Essen, Universitätsstrasse 7, 45117 Essen, Germany
| | - Sumit Mittal
- Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr, Germany
| | - Kenny Bravo-Rodriguez
- Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr, Germany
| | - Andrea Sowislok
- Department of Chemistry, University of Duisburg-Essen, Universitätsstrasse 7, 45117 Essen, Germany
| | - Xavier Guillory
- Laboratory of Chemical Biology, Department of Biomedical Engineering and Institute for Complex Molecular Systems, Eindhoven University of Technology, Den Dolech 2, 5612 AZ Eindhoven, The Netherlands
- Department of Chemistry, University of Duisburg-Essen, Universitätsstrasse 7, 45117 Essen, Germany
| | - Jeroen Briels
- Laboratory of Chemical Biology, Department of Biomedical Engineering and Institute for Complex Molecular Systems, Eindhoven University of Technology, Den Dolech 2, 5612 AZ Eindhoven, The Netherlands
- Department of Chemistry, University of Duisburg-Essen, Universitätsstrasse 7, 45117 Essen, Germany
| | - Christian Heid
- Department of Chemistry, University of Duisburg-Essen, Universitätsstrasse 7, 45117 Essen, Germany
| | - Maria Bartel
- Laboratory of Chemical Biology, Department of Biomedical Engineering and Institute for Complex Molecular Systems, Eindhoven University of Technology, Den Dolech 2, 5612 AZ Eindhoven, The Netherlands
| | - Burkhard Wettig
- Department of Chemistry, University of Duisburg-Essen, Universitätsstrasse 7, 45117 Essen, Germany
| | - Luc Brunsveld
- Laboratory of Chemical Biology, Department of Biomedical Engineering and Institute for Complex Molecular Systems, Eindhoven University of Technology, Den Dolech 2, 5612 AZ Eindhoven, The Netherlands
| | - Elsa Sanchez-Garcia
- Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr, Germany
| | - Thomas Schrader
- Department of Chemistry, University of Duisburg-Essen, Universitätsstrasse 7, 45117 Essen, Germany
| | - Christian Ottmann
- Laboratory of Chemical Biology, Department of Biomedical Engineering and Institute for Complex Molecular Systems, Eindhoven University of Technology, Den Dolech 2, 5612 AZ Eindhoven, The Netherlands
- Department of Chemistry, University of Duisburg-Essen, Universitätsstrasse 7, 45117 Essen, Germany
| |
|
16
|
Integrating computational methods and experimental data for understanding the recognition mechanism and binding affinity of protein-protein complexes. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2017; 128:33-38. [PMID: 28069340 DOI: 10.1016/j.pbiomolbio.2017.01.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/30/2016] [Revised: 01/04/2017] [Accepted: 01/05/2017] [Indexed: 01/09/2023]
Abstract
Protein-protein interactions perform several functions inside the cell. Understanding the recognition mechanism and binding affinity of protein-protein complexes is a challenging problem in experimental and computational biology. In this review, we focus on two aspects: (i) understanding the recognition mechanism and (ii) predicting the binding affinity. The first part deals with computational techniques for identifying the binding site residues and the contribution of important interactions for understanding the recognition mechanism of protein-protein complexes in comparison with experimental observations. The second part is devoted to the methods developed for discriminating high and low affinity complexes, and for predicting the binding affinity of protein-protein complexes either from three-dimensional structural information or from the amino acid sequence alone. The overall view enhances our understanding of the integration of experimental data and computational methods, the recognition mechanism of protein-protein complexes and the binding affinity.
|
17
|
Computational Approaches for Predicting Binding Partners, Interface Residues, and Binding Affinity of Protein-Protein Complexes. Methods Mol Biol 2017; 1484:237-253. [PMID: 27787830 DOI: 10.1007/978-1-4939-6406-2_16] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]
Abstract
Studying protein-protein interactions leads to a better understanding of the underlying principles of several biological pathways. Cost and labor-intensive experimental techniques suggest the need for computational methods to complement them. Several such state-of-the-art methods have been reported for analyzing diverse aspects such as predicting binding partners, interface residues, and binding affinity for protein-protein complexes with reliable performance. However, there are specific drawbacks for different methods that indicate the need for their improvement. This review highlights various available computational algorithms for analyzing diverse aspects of protein-protein interactions and endorses the necessity for developing new robust methods for gaining deep insights about protein-protein interactions.
|
18
|
Srinivasulu YS, Wang JR, Hsu KT, Tsai MJ, Charoenkwan P, Huang WL, Huang HL, Ho SY. Characterizing informative sequence descriptors and predicting binding affinities of heterodimeric protein complexes. BMC Bioinformatics 2015; 16 Suppl 18:S14. [PMID: 26681483 PMCID: PMC4682391 DOI: 10.1186/1471-2105-16-s18-s14] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
Background Protein-protein interactions (PPIs) are involved in various biological processes, and the underlying mechanism of the interactions plays a crucial role in therapeutics and protein engineering. Most machine learning approaches have been developed for predicting the binding affinity of protein-protein complexes based on structure and functional information. This work aims to predict the binding affinity of heterodimeric protein complexes from sequences only. Results This work proposes a support vector machine (SVM) based binding affinity classifier, called SVM-BAC, to classify heterodimeric protein complexes based on the prediction of their binding affinity. SVM-BAC identified 14 of 580 sequence descriptors (physicochemical, energetic and conformational properties of the 20 amino acids) to classify 216 heterodimeric protein complexes into low and high binding affinity. SVM-BAC yielded a training accuracy, sensitivity, specificity, AUC and test accuracy of 85.80%, 0.89, 0.83, 0.86 and 83.33%, respectively, better than existing machine learning algorithms. The 14 features and support vector regression were further used to estimate the binding affinities (pKd) of 200 heterodimeric protein complexes. In a jackknife test, the prediction achieved a correlation coefficient of 0.34 and a mean absolute error of 1.4. We further analyze three informative physicochemical properties according to their contribution to prediction performance. Results reveal that the following properties are effective in predicting the binding affinity of heterodimeric protein complexes: apparent partition energy based on buried molar fractions, relations between chemical structure and biological activity in principal component analysis IV, and normalized frequency of beta turn.
Conclusions The proposed sequence-based prediction method SVM-BAC uses an optimal feature selection method to identify 14 informative features to classify and predict binding affinity of heterodimeric protein complexes. The characterization analysis revealed that the average numbers of beta turns and hydrogen bonds at protein-protein interfaces in high binding affinity complexes are more than those in low binding affinity complexes.
|
19
|
Vangone A, Bonvin AM. Contacts-based prediction of binding affinity in protein-protein complexes. eLife 2015; 4:e07454. [PMID: 26193119 DOI: 10.7554/elife.07454] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/12/2023] Open
Abstract
Almost all critical functions in cells rely on specific protein-protein interactions. Understanding these is therefore crucial in the investigation of biological systems. Despite all past efforts, we still lack a thorough understanding of the energetics of association of proteins. Here, we introduce a new and simple approach to predict binding affinity based on functional and structural features of the biological system, namely the network of interfacial contacts. We assess its performance against a protein-protein binding affinity benchmark and show that both experimental methods used for affinity measurements and conformational changes have a strong impact on prediction accuracy. Using a subset of complexes with reliable experimental binding affinities and combining our contacts and contact-types-based model with recent observations on the role of the non-interacting surface in protein-protein interactions, we reach a high prediction accuracy for such a diverse dataset outperforming all other tested methods.
Affiliation(s)
- Anna Vangone
- Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Faculty of Science-Chemistry, Utrecht University, Utrecht, Netherlands
| | - Alexandre MJJ Bonvin
- Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Faculty of Science-Chemistry, Utrecht University, Utrecht, Netherlands
| |
|
20
|
Vangone A, Bonvin AMJJ. Contacts-based prediction of binding affinity in protein-protein complexes. eLife 2015; 4:e07454. [PMID: 26193119 PMCID: PMC4523921 DOI: 10.7554/elife.07454] [Citation(s) in RCA: 313] [Impact Index Per Article: 34.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2015] [Accepted: 07/08/2015] [Indexed: 12/13/2022] Open
Abstract
Almost all critical functions in cells rely on specific protein-protein interactions. Understanding these is therefore crucial in the investigation of biological systems. Despite all past efforts, we still lack a thorough understanding of the energetics of association of proteins. Here, we introduce a new and simple approach to predict binding affinity based on functional and structural features of the biological system, namely the network of interfacial contacts. We assess its performance against a protein-protein binding affinity benchmark and show that both experimental methods used for affinity measurements and conformational changes have a strong impact on prediction accuracy. Using a subset of complexes with reliable experimental binding affinities and combining our contacts and contact-types-based model with recent observations on the role of the non-interacting surface in protein-protein interactions, we reach a high prediction accuracy for such a diverse dataset outperforming all other tested methods.
Affiliation(s)
- Anna Vangone
- Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Faculty of Science—Chemistry, Utrecht University, Utrecht, Netherlands
| | - Alexandre MJJ Bonvin
- Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Faculty of Science—Chemistry, Utrecht University, Utrecht, Netherlands
| |
|
21
|
Erijman A, Rosenthal E, Shifman JM. How structure defines affinity in protein-protein interactions. PLoS One 2014; 9:e110085. [PMID: 25329579 PMCID: PMC4199723 DOI: 10.1371/journal.pone.0110085] [Citation(s) in RCA: 63] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2014] [Accepted: 09/14/2014] [Indexed: 01/29/2023] Open
Abstract
Protein-protein interactions (PPI) in nature are conveyed by a multitude of binding modes involving various surfaces, secondary structure elements and intermolecular interactions. This diversity results in PPI binding affinities that span more than nine orders of magnitude. Several early studies attempted to correlate PPI binding affinities to various structure-derived features with limited success. The growing number of high-resolution structures, the appearance of more precise methods for measuring binding affinities and the development of new computational algorithms enable more thorough investigations in this direction. Here, we use a large dataset of PPI structures with the documented binding affinities to calculate a number of structure-based features that could potentially define binding energetics. We explore how well each calculated biophysical feature alone correlates with binding affinity and determine the features that could be used to distinguish between high-, medium- and low- affinity PPIs. Furthermore, we test how various combinations of features could be applied to predict binding affinity and observe a slow improvement in correlation as more features are incorporated into the equation. In addition, we observe a considerable improvement in predictions if we exclude from our analysis low-resolution and NMR structures, revealing the importance of capturing exact intermolecular interactions in our calculations. Our analysis should facilitate prediction of new interactions on the genome scale, better characterization of signaling networks and design of novel binding partners for various target proteins.
Affiliation(s)
- Ariel Erijman
- Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
| | - Eran Rosenthal
- Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
| | - Julia M. Shifman
- Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
| |
|
22
|
Yugandhar K, Gromiha MM. Protein–protein binding affinity prediction from amino acid sequence. Bioinformatics 2014; 30:3583-9. [DOI: 10.1093/bioinformatics/btu580] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/19/2023] Open
|
23
|
Yugandhar K, Gromiha MM. Feature selection and classification of protein-protein complexes based on their binding affinities using machine learning approaches. Proteins 2014; 82:2088-96. [PMID: 24648146 DOI: 10.1002/prot.24564] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2014] [Accepted: 03/14/2014] [Indexed: 12/16/2022]
Abstract
Protein-protein interactions are intrinsic to virtually every cellular process. Predicting the binding affinity of protein-protein complexes is one of the challenging problems in computational and molecular biology. In this work, we related sequence features of protein-protein complexes with their binding affinities using machine learning approaches. We set up a database of 185 protein-protein complexes for which the interacting pairs are heterodimers and their experimental binding affinities are available. We also developed a set of 610 features from the sequences of the protein complexes and used the Ranker search method, which combines an attribute evaluator with the Ranker method, to select specific features. We analyzed several machine learning algorithms to discriminate protein-protein complexes into high and low affinity groups based on their Kd values. Our results showed a 10-fold cross-validation accuracy of 76.1% with the combination of nine features using support vector machines. Further, we observed an accuracy of 83.3% on an independent test set of 30 complexes. We suggest that our method would serve as an effective tool for identifying the interacting partners in protein-protein interaction networks and human-pathogen interactions based on the strength of interactions.
Affiliation(s)
- K Yugandhar
- Department of Biotechnology, Indian Institute of Technology Madras, Chennai, 600036, Tamil Nadu, India
| | | |
|
24
|
Kastritis PL, Bonvin AMJJ. On the binding affinity of macromolecular interactions: daring to ask why proteins interact. J R Soc Interface 2012; 10:20120835. [PMID: 23235262 PMCID: PMC3565702 DOI: 10.1098/rsif.2012.0835] [Citation(s) in RCA: 276] [Impact Index Per Article: 23.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/13/2023] Open
Abstract
Interactions between proteins are orchestrated in a precise and time-dependent manner, underlying cellular function. The binding affinity, defined as the strength of these interactions, is translated into physico-chemical terms in the dissociation constant (Kd), the latter being an experimental measure that determines whether an interaction will be formed in solution or not. Predicting binding affinity from structural models has been a matter of active research for more than 40 years because of its fundamental role in drug development. However, all available approaches are incapable of predicting the binding affinity of protein–protein complexes from coordinates alone. Here, we examine both theoretical and experimental limitations that complicate the derivation of structure–affinity relationships. Most work so far has concentrated on binary interactions. Systems of increased complexity are far from being understood. The main physico-chemical measure that relates to binding affinity is the buried surface area, but it does not hold for flexible complexes. For the latter, there must be a significant entropic contribution that will have to be approximated in the future. We foresee that any theoretical modelling of these interactions will have to follow an integrative approach considering the biology, chemistry and physics that underlie protein–protein recognition.
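The entries in this list report binding affinity variously as Kd, pKd, or ΔG. These quantities are linked by the standard thermodynamic relation ΔG = RT ln(Kd), with Kd expressed relative to a 1 M standard state. The sketch below shows the conversion. The function names and the default temperature of 298.15 K are assumptions chosen for illustration and do not come from the cited papers.

```python
import math

# Gas constant in kcal/(mol*K), i.e. 8.314462618 J/(mol*K) divided by 4184.
R_KCAL = 1.98720425864e-3


def kd_to_delta_g(kd_molar, temperature_k=298.15):
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L.

    Uses dG = RT ln(Kd / 1 M); tighter binding (smaller Kd) gives a more
    negative dG.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def delta_g_to_kd(delta_g_kcal, temperature_k=298.15):
    """Inverse conversion: dissociation constant (mol/L) from dG (kcal/mol)."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


def kd_to_pkd(kd_molar):
    """pKd = -log10(Kd), a unitless affinity scale used in some abstracts."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return -math.log10(kd_molar)


# Example: a nanomolar binder (hypothetical value).
kd = 1e-9
print("dG  =", round(kd_to_delta_g(kd), 2), "kcal/mol")
print("pKd =", kd_to_pkd(kd))
```

Because the relation is logarithmic, each tenfold change in Kd shifts ΔG by a constant amount, roughly 1.4 kcal/mol near room temperature, which helps when comparing errors quoted in kcal/mol with errors quoted in pKd units.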
Affiliation(s)
- Panagiotis L Kastritis
- Bijvoet Center for Biomolecular Research, Faculty of Science, Chemistry, Utrecht University, Padualaan 8, Utrecht, The Netherlands
| | | |
|
25
|
Kong R, Wang C, Ma X, Liu J, Chen W. Peptides design based on the interfacial helix of integrase dimer. CONFERENCE PROCEEDINGS : ... ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL CONFERENCE 2012; 2005:4743-6. [PMID: 17281301 DOI: 10.1109/iembs.2005.1615531] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
HIV-1 integrase (IN) plays a crucial role in the retroviral life cycle. Peptides derived from the helix of IN have been reported to have inhibitory potency. We designed a series of peptides based on interface helices alpha1 and alpha5 with the aim of increasing their inhibitory activity. The helix-forming tendency and the affinity for IN are essential for interfacial peptide inhibitors. The MD simulation and AGADIR prediction both showed favorable results for the designed peptides. The binding mode and binding free energy of the peptides and IN were investigated subsequently to test our design. The improvement in binding free energy compared with that of alpha1 and alpha5 indicates that some of the designed peptides may have a higher potency for inhibiting the dimerization of IN. This study provides some useful information for the rational design of IN peptide inhibitors.
Affiliation(s)
- R Kong
- Coll. of Life Sci. & Bioeng., Beijing Univ. of Technol
| | | | | | | | | |
|
26
|
Vreven T, Hwang H, Pierce BG, Weng Z. Prediction of protein-protein binding free energies. Protein Sci 2012; 21:396-404. [PMID: 22238219 DOI: 10.1002/pro.2027] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2011] [Revised: 12/23/2011] [Accepted: 01/04/2012] [Indexed: 11/09/2022]
Abstract
We present an energy function for predicting binding free energies of protein-protein complexes, using the three-dimensional structures of the complex and unbound proteins as input. Our function is a linear combination of nine terms and achieves a correlation coefficient of 0.63 with experimental measurements when tested on a benchmark of 144 complexes using leave-one-out cross validation. Although we systematically tested both atomic and residue-based scoring functions, the selected function is dominated by residue-based terms. Our function is stable for subsets of the benchmark stratified by experimental pH and extent of conformational change upon complex formation, with correlation coefficients ranging from 0.61 to 0.66.
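The evaluation protocol mentioned above, leave-one-out cross validation of a linear model followed by a correlation with experimental values, can be sketched as follows. This is a toy illustration under stated assumptions: it fits a single hypothetical descriptor by ordinary least squares rather than the nine-term function of the paper, and all data values and function names are invented for the example.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out_predictions(xs, ys):
    """Predict each point with a model trained on all the other points."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predictions.append(slope * xs[i] + intercept)
    return predictions


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: one interface descriptor per complex and a measured dG.
descriptor = [1200.0, 1500.0, 1800.0, 2100.0, 2400.0, 2700.0]
measured_dg = [-7.9, -9.4, -9.8, -11.6, -12.1, -13.9]

loo = leave_one_out_predictions(descriptor, measured_dg)
print("leave-one-out r =", round(pearson_r(loo, measured_dg), 3))
```

Each prediction comes from a model that never saw the complex being predicted, so the resulting correlation is a less optimistic estimate than one computed on the training fit itself.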
Affiliation(s)
- Thom Vreven
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA
| | | | | | | |
|
27
|
Tian F, Lv Y, Yang L. Structure-based prediction of protein–protein binding affinity with consideration of allosteric effect. Amino Acids 2011; 43:531-43. [DOI: 10.1007/s00726-011-1101-1] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/07/2011] [Accepted: 09/21/2011] [Indexed: 11/28/2022]
|
28
|
Moal IH, Agius R, Bates PA. Protein-protein binding affinity prediction on a diverse set of structures. Bioinformatics 2011; 27:3002-9. [PMID: 21903632 DOI: 10.1093/bioinformatics/btr513] [Citation(s) in RCA: 87] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/11/2024] Open
Abstract
MOTIVATION Accurate binding free energy functions for protein-protein interactions are imperative for a wide range of purposes. Their construction is predicated upon ascertaining the factors that influence binding and their relative importance. A recent benchmark of binding affinities has allowed, for the first time, the evaluation and construction of binding free energy models using a diverse set of complexes, and a systematic assessment of our ability to model the energetics of conformational changes. RESULTS We construct a large set of molecular descriptors using commonly available tools, introducing the use of energetic factors associated with conformational changes and disorder to order transitions, as well as features calculated on structural ensembles. The descriptors are used to train and test a binding free energy model using a consensus of four machine learning algorithms, whose performance constitutes a significant improvement over the other state of the art empirical free energy functions tested. The internal workings of the learners show how the descriptors are used, illuminating the determinants of protein-protein binding. AVAILABILITY The molecular descriptor set and descriptor values for all complexes are available in the Supplementary Material. A web server for the learners and coordinates for the bound and unbound structures can be accessed from the website: http://bmm.cancerresearchuk.org/~Affinity. CONTACT paul.bates@cancer.org.uk. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Iain H Moal
- Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, London WC2A 3LY, UK
| | | | | |
|
29
|
Kastritis PL, Moal IH, Hwang H, Weng Z, Bates PA, Bonvin AMJJ, Janin J. A structure-based benchmark for protein-protein binding affinity. Protein Sci 2011; 20:482-91. [PMID: 21213247 PMCID: PMC3064828 DOI: 10.1002/pro.580] [Citation(s) in RCA: 219] [Impact Index Per Article: 16.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/17/2010] [Revised: 12/15/2010] [Accepted: 12/16/2010] [Indexed: 11/06/2022]
Abstract
We have assembled a nonredundant set of 144 protein-protein complexes that have high-resolution structures available for both the complexes and their unbound components, and for which dissociation constants have been measured by biophysical methods. The set is diverse in terms of the biological functions it represents, with complexes that involve G-proteins and receptor extracellular domains, as well as antigen/antibody, enzyme/inhibitor, and enzyme/substrate complexes. It is also diverse in terms of the partners' affinity for each other, with Kd ranging between 10⁻⁵ and 10⁻¹⁴ M. Nine pairs of entries represent closely related complexes that have a similar structure, but a very different affinity, each pair comprising a cognate and a noncognate assembly. The unbound structures of the component proteins being available, conformation changes can be assessed. They are significant in most of the complexes, and large movements or disorder-to-order transitions are frequently observed. The set may be used to benchmark biophysical models aiming to relate affinity to structure in protein-protein interactions, taking into account the reactants and the conformation changes that accompany the association reaction, instead of just the final product.
Affiliation(s)
- Panagiotis L Kastritis
- Bijvoet Center for Biomolecular Research, Faculty of Science, Utrecht University, 3584CH Utrecht, The Netherlands
- Iain H Moal
- Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, Lincoln's Inn Fields Laboratories, London WC2A 3LY, United Kingdom
- Howook Hwang
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01605
- Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01605
- Paul A Bates
- Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, Lincoln's Inn Fields Laboratories, London WC2A 3LY, United Kingdom
- Alexandre M J J Bonvin
- Bijvoet Center for Biomolecular Research, Faculty of Science, Utrecht University, 3584CH Utrecht, The Netherlands
- Joël Janin
- Yeast Structural Genomics, IBBMC UMR 8619, Université Paris-Sud, 91405 Orsay, France
|
30
|
Mitra P, Pal D. dockYard–a repository to assist modeling of protein-protein docking. J Mol Model 2010; 17:599-606. [DOI: 10.1007/s00894-010-0758-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 04/13/2010] [Accepted: 05/12/2010] [Indexed: 02/02/2023]
|
31
|
Dell'Orco D. Fast predictions of thermodynamics and kinetics of protein-protein recognition from structures: from molecular design to systems biology. Mol Biosyst 2009; 5:323-34. [PMID: 19396368 DOI: 10.1039/b821580d] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Indexed: 01/02/2023]
Abstract
The increasing call for an overall picture of the interactions between the components of a biological system that give rise to the observed function is often summarized by the expression systems biology. Both the interpretative and predictive capabilities of holistic models of biochemical systems, however, depend to a large extent on the level of physico-chemical knowledge of the individual molecular interactions making up the network. This review is focused on the structure-based quantitative characterization of protein-protein interactions, ubiquitous in any biochemical pathway. Recently developed, fast and effective computational methods are reviewed, which allow the assessment of kinetic and thermodynamic features of the association-dissociation processes of protein complexes, both in water-soluble and membrane environments. The performance and the accuracy of fast and semi-empirical structure-based methods have reached comparable levels with respect to the classical and more elegant molecular simulations. Nevertheless, the broad accessibility and lower computational cost provide the former methods with the advantageous possibility to perform systems-level analyses including extensive in silico mutagenesis screenings and large-scale structural predictions of multiprotein complexes.
Affiliation(s)
- Daniele Dell'Orco
- Department of Chemistry, University of Modena and Reggio Emilia, Via Campi 183, 41100, Modena, Italy.
|
32
|
Moreira IS, Fernandes PA, Ramos MJ. Protein-protein docking dealing with the unknown. J Comput Chem 2009; 31:317-42. [DOI: 10.1002/jcc.21276] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.0] [Indexed: 01/02/2023]
|
33
|
Audie J. Development and validation of an empirical free energy function for calculating protein-protein binding free energy surfaces. Biophys Chem 2008; 139:84-91. [PMID: 19041170 DOI: 10.1016/j.bpc.2008.10.007] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Received: 09/22/2008] [Revised: 10/18/2008] [Accepted: 10/22/2008] [Indexed: 10/21/2022]
Abstract
In a previous paper, we described a novel empirical free energy function that was used to accurately predict experimental binding free energies for a diverse test set of 31 protein-protein complexes to within approximately 1.0 kcal. Here, we extend that work and show that an updated version of the function can be used to (1) accurately predict native binding free energies and (2) rank crystallographic, native-like and non-native binding modes in a physically realistic manner. The modified function includes terms designed to capture some of the unfavorable interactions that characterize non-native interfaces. The function was used to calculate one-dimensional binding free energy surfaces for 21 protein complexes. In roughly 90% of the cases tested, the function was used to place native-like and crystallographic binding modes in global free energy minima. Our analysis further suggests that buried hydrogen bonds might provide the key to distinguishing native from non-native interactions. To the best of our knowledge our function is the only one of its kind, a single expression that can be used to accurately calculate native and non-native binding free energies for a large number of proteins. Given the encouraging results presented in this paper, future work will focus on improving the function and applying it to the protein-protein docking problem.
Affiliation(s)
- Joseph Audie
- Department of Chemistry, Sacred Heart University, Fairfield, CT 06825, USA.
|
34
|
Li YC, Zeng ZH. Interfacial atom pair analysis. Biochemistry (Mosc) 2008; 73:231-233. [PMID: 18298380 DOI: 10.1134/s0006297908020156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/26/2023]
Abstract
The relations of the binding free energies in a dataset of 69 protein complexes with the numbers of interfacial atom pairs, as well as with the atomic distances of the pairs, are analyzed. It is found that the interfacial main-chain atom pairs contribute more to the correlation than the interfacial side-chain atom pairs do, and the polar atom pairs contribute more than the non-polar atom pairs do. Interfacial atom pairs with atomic distance in the range of 6-12 Å are the most important to explain the differences in binding free energies in the datasets.
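The descriptor discussed in this abstract, the number of interfacial atom pairs within a distance band, can be sketched as a brute-force count over two coordinate sets. This is only an illustration: the function name, the inclusive bounds, and the toy coordinates are assumptions, and the abstract does not describe how the authors actually defined or computed their pairs.

```python
import math


def count_interface_pairs(atoms_a, atoms_b, d_min=6.0, d_max=12.0):
    """Count atom pairs (one atom from each partner) whose separation
    lies in [d_min, d_max] angstroms. Coordinates are (x, y, z) tuples."""
    count = 0
    for a in atoms_a:
        for b in atoms_b:
            if d_min <= math.dist(a, b) <= d_max:
                count += 1
    return count


# Hypothetical coordinates in angstroms, for illustration only.
chain_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
chain_b = [(8.0, 0.0, 0.0), (20.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
n_pairs = count_interface_pairs(chain_a, chain_b)
print(n_pairs)
```

The double loop is quadratic in the number of atoms; for real complexes a spatial grid or k-d tree would be the usual choice, but the counting rule stays the same.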
Affiliation(s)
- Yong-Chao Li
- Institute of Biophysics, Chinese Academy of Sciences, Chaoyang District, Beijing, China
|
35
|
Audie J, Scarlata S. A novel empirical free energy function that explains and predicts protein–protein binding affinities. Biophys Chem 2007; 129:198-211. [PMID: 17600612 DOI: 10.1016/j.bpc.2007.05.021] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.8] [Received: 04/19/2007] [Revised: 05/31/2007] [Accepted: 05/31/2007] [Indexed: 11/22/2022]
Abstract
A free energy function can be defined as a mathematical expression that relates macroscopic free energy changes to microscopic or molecular properties. Free energy functions can be used to explain and predict the affinity of a ligand for a protein and to score and discriminate between native and non-native binding modes. However, there is a natural tension between developing a function fast enough to solve the scoring problem but rigorous enough to explain and predict binding affinities. Here, we present a novel, physics-based free energy function that is computationally inexpensive, yet explanatory and predictive. The function results from a derivation that assumes the cost of polar desolvation can be ignored and that includes a unique and implicit treatment of interfacial water-bridged interactions. The function was parameterized on an internally consistent, high quality training set giving R² = 0.97 and Q² = 0.91. We used the function to blindly and successfully predict binding affinities for a diverse test set of 31 wild-type protein-protein and protein-peptide complexes (R² = 0.79, rmsd = 1.2 kcal mol⁻¹). The function performed very well in direct comparison with a recently described knowledge-based potential and the function appears to be transferable. Our results indicate that our function is well suited for solving a wide range of protein/peptide design and discovery problems.
Affiliation(s)
- Joseph Audie
- Department of Physiology and Biophysics, State University of New York at Stony Brook, Stony Brook, NY 11794, USA
|
36
|
Xiang Z, Steinbach PJ, Jacobson MP, Friesner RA, Honig B. Prediction of side-chain conformations on protein surfaces. Proteins 2007; 66:814-23. [PMID: 17206724 PMCID: PMC2743384 DOI: 10.1002/prot.21099] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Indexed: 12/25/2022]
Abstract
An approach is described that improves the prediction of the conformations of surface side chains in crystal structures, given the main-chain conformation of a protein. A key element of the methodology involves the use of the colony energy. This phenomenological term favors conformations found in frequently sampled regions, thereby approximating entropic effects and serving to smooth the potential energy surface. Use of the colony energy significantly improves prediction accuracy for surface side chains with little additional computational cost. Prediction accuracy was quantified as the percentage of side-chain dihedral angles predicted to be within 40° of the angles measured by X-ray diffraction. Use of the colony energy in predictions for single side chains improved the prediction accuracy for χ₁ and χ₁₊₂ from 65 and 40% to 74 and 59%, respectively. Several other factors that affect prediction of surface side-chain conformations were also analyzed, including the extent of conformational sampling, details of the rotamer library employed, and accounting for the crystallographic environment. The prediction of conformations for polar residues on the surface was generally found to be more difficult than those for hydrophobic residues, except for polar residues participating in hydrogen bonds with other protein groups. For surface residues with hydrogen-bonded side chains, the prediction accuracy of χ₁ and χ₁₊₂ was 79 and 63%, respectively. For surface polar residues, in general (all side-chain prediction), the accuracy of χ₁ and χ₁₊₂ was only 73 and 56%, respectively. The most accurate results were obtained using the colony energy and an all-atom description that includes neighboring molecules in the crystal (protein chains and hetero atoms). Here, the accuracy of χ₁ and χ₁₊₂ predictions for surface side chains was 82 and 73%, respectively. The root mean square deviations obtained for hydrogen-bonding surface side chains were 1.64 and 1.81 Å, with and without consideration of crystal packing effects, respectively.
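The accuracy measure described in this abstract, the percentage of dihedral angles predicted within 40° of the crystallographic value, needs a periodic angle difference so that, for example, 175° and -178° count as 7° apart. The sketch below shows one way to compute it; the function names, the inclusive threshold, and the sample angles are assumptions for illustration, not details from the paper.

```python
def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees (0..180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def dihedral_accuracy(predicted, observed, tolerance_deg=40.0):
    """Percentage of predicted dihedrals within tolerance_deg of observed ones."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(
        angular_difference(p, o) <= tolerance_deg
        for p, o in zip(predicted, observed)
    )
    return 100.0 * hits / len(predicted)


# Hypothetical chi-1 angles in degrees, for illustration only.
pred = [-60.0, 175.0, 65.0, -170.0]
obs = [-65.0, -178.0, 180.0, 60.0]
print(dihedral_accuracy(pred, obs))
```

Here the first two predictions fall within the tolerance and the last two do not, so the toy accuracy is 50%.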
Affiliation(s)
- Zhexin Xiang
- Center for Molecular Modeling, Center for Information Technology, National Institutes of Health, Bethesda, Maryland 20892-5624, USA.
|
37
|
Dell'Orco D, Seeber M, De Benedetti PG, Fanelli F. Probing Fragment Complementation by Rigid-Body Docking: in Silico Reconstitution of Calbindin D9k. J Chem Inf Model 2005; 45:1429-38. [PMID: 16180920 DOI: 10.1021/ci0501995] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
Abstract
Fragment complementation is gaining an increasing impact as a nonperturbing method to probe noncovalent interactions within protein supersecondary structures. In this study, the fast Fourier transform rigid-body docking algorithm ZDOCK has been employed for in silico reconstitution of the calcium binding protein calbindin D9k, from its two EF-hand subdomains, namely, EF1 (residues 1-43) and EF2 (residues 44-75). The EF1 fragment has been used both in its wild type and in nine mutant forms, in line with in vitro experiments. Consistent with in vitro data, ZDOCK reconstituted the proper fold of wild-type and mutated calbindin, locating the nativelike structures (i.e., holding a root-mean-square deviation < 1 Å with respect to the X-ray structure) among the first 10 top-scored solutions out of 4000. Moreover, the three independent in silico reconstitutions of wild-type calbindin ranked a nativelike structure at the top of the output list, that is, the best scored one. The algorithm has been also successfully challenged in reconstituting the EF2 homodimer from two identical copies of the monomer. Furthermore, quantitative models consisting of linear correlations between thermodynamic data and ZDOCK scores were built, providing a tested tool for very fast in silico predictions of the free energy of association of protein-protein complexes solved at the atomic level and known to not undergo significant conformational changes upon binding.
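The "linear correlations between thermodynamic data and ZDOCK scores" mentioned in this abstract amount to fitting a straight line from a docking score to a free energy and then reusing it for prediction. A minimal ordinary least-squares sketch follows; the numbers are invented for illustration and are not data from the paper, and the helper names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(score, slope, intercept):
    """Estimate a free energy from a docking score with the fitted line."""
    return slope * score + intercept


# Hypothetical docking scores and free energies (kcal/mol), illustration only.
scores = [10.0, 20.0, 30.0, 40.0]
delta_g = [-5.0, -7.0, -9.0, -11.0]
slope, intercept = fit_line(scores, delta_g)
estimate = predict(25.0, slope, intercept)
```

As the abstract notes, such a model is only expected to hold for complexes that do not undergo significant conformational changes upon binding.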
Affiliation(s)
- Daniele Dell'Orco
- Department of Chemistry and Dulbecco Telethon Institute, University of Modena and Reggio Emilia, via Campi 183, 41100 Modena, Italy
|
38
|
Tame JRH. Scoring Functions – the First 100 Years. J Comput Aided Mol Des 2005; 19:445-51. [PMID: 16231202 DOI: 10.1007/s10822-005-8483-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 04/15/2005] [Accepted: 06/07/2005] [Indexed: 10/25/2022]
Abstract
The use of simple linear mathematical models to estimate chemical properties is not a new idea. Albert Einstein used very simple 'gravity-like' forces to explain the capillarity of different liquids in 1900-1901. Today such models are used in more complicated situations, and a great many have been developed to analyse interactions between proteins and their ligands. This is not surprising, since proteins are too complicated to model accurately without lengthy numerical analysis, and simple models often do at least as good a job in predicting binding constants as much more computationally expensive methods. One hundred years after Einstein's 'miraculous year' in which he transformed physics, it is instructive to recall some of his even earlier work. As approximations, 'scoring functions' are excellent, but it is dangerous to read too much into them. A few cautionary tales are presented for the beginner to the field of ligand affinity prediction by linear models.
Affiliation(s)
- Jeremy R H Tame
- Protein Design Laboratory, Yokohama City University, Suehiro 1-7-29, 230-0045, Tsurumi, Yokohama, Japan.
|