1
Zheng F, Jiang X, Wen Y, Yang Y, Li M. Systematic investigation of machine learning on limited data: A study on predicting protein-protein binding strength. Comput Struct Biotechnol J 2024; 23:460-472. [PMID: 38235359 PMCID: PMC10792694 DOI: 10.1016/j.csbj.2023.12.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 12/14/2023] [Accepted: 12/16/2023] [Indexed: 01/19/2024] Open
Abstract
The application of machine learning techniques in biological research, especially when dealing with limited data availability, poses significant challenges. In this study, we leveraged advancements in method development for predicting protein-protein binding strength to conduct a systematic investigation into the application of machine learning on limited data. The binding strength, quantitatively measured as binding affinity, is vital for understanding the processes of recognition, association, and dysfunction that occur within protein complexes. By incorporating transfer learning, integrating domain knowledge, and employing both deep learning and traditional machine learning algorithms, we mitigated the impact of data limitations and made significant advancements in predicting protein-protein binding affinity. In particular, we developed over 20 models, ultimately selecting three representative best-performing ones that belong to distinct categories. The first model is structure-based, consisting of a random forest regression and thirteen handcrafted features. The second model is sequence-based, employing an architecture that combines transferred embedding features with a multilayer perceptron. Finally, we created an ensemble model by averaging the predictions of the two aforementioned models. The comparison with other predictors on three independent datasets confirms the significant improvements achieved by our models in predicting protein-protein binding affinity. The programs for running these three models are available at https://github.com/minghuilab/BindPPI.
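The ensemble step described in this abstract, averaging the predictions of the structure-based and sequence-based models, can be sketched as follows. This is a minimal illustration and not the BindPPI implementation: the function name `ensemble_average`, the example values, and the kcal/mol units are all assumptions made for the example.

```python
from typing import Sequence


def ensemble_average(structure_preds: Sequence[float],
                     sequence_preds: Sequence[float]) -> list[float]:
    """Average two models' predictions complex by complex (unweighted mean)."""
    if len(structure_preds) != len(sequence_preds):
        raise ValueError("both models must score the same complexes")
    return [(a + b) / 2.0 for a, b in zip(structure_preds, sequence_preds)]


# Hypothetical predicted binding free energies (kcal/mol assumed) for three complexes.
rf_preds = [-10.0, -8.5, -12.0]   # stand-in for the random forest model
mlp_preds = [-11.0, -7.5, -12.0]  # stand-in for the embedding + MLP model
combined = ensemble_average(rf_preds, mlp_preds)
print(combined)
```

An unweighted mean is the simplest way to combine two regressors; whether the published ensemble applies any weighting is not stated in the abstract.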
Affiliation(s)
- Feifan Zheng
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
- Xin Jiang
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
- Yuhao Wen
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
- Yan Yang
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
- Minghui Li
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
2
Yang YX, Huang JY, Wang P, Zhu BT. AREA-AFFINITY: A Web Server for Machine Learning-Based Prediction of Protein-Protein and Antibody-Protein Antigen Binding Affinities. J Chem Inf Model 2023. [PMID: 37235532 DOI: 10.1021/acs.jcim.2c01499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/28/2023]
Abstract
Protein-protein binding affinity reflects the binding strength between the binding partners. The prediction of protein-protein binding affinity is important for elucidating protein functions and for designing protein-based therapeutics. Geometric characteristics such as area (both interface and surface areas) in the structure of a protein-protein complex play an important role in determining protein-protein interactions and their binding affinity. Here, we present AREA-AFFINITY, a web server free for academic use, for prediction of protein-protein or antibody-protein antigen binding affinity based on interface and surface areas in the structure of a protein-protein complex. AREA-AFFINITY implements 60 effective area-based protein-protein affinity predictive models and 37 effective area-based models specific for antibody-protein antigen binding affinity prediction developed in our recent studies. These models take into consideration the roles of interface and surface areas in binding affinity by using areas classified according to different amino acid types with different biophysical nature. The models with the best performances integrate machine learning methods such as neural network or random forest. These newly developed models have superior or comparable performance compared to the commonly used existing methods. AREA-AFFINITY is available for free at: https://affinity.cuhk.edu.cn/.
Affiliation(s)
- Yong Xiao Yang
  - Shenzhen Key Laboratory of Steroid Drug Discovery and Development, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Guangdong 518172, China
- Jin Yan Huang
  - Shenzhen Key Laboratory of Steroid Drug Discovery and Development, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Guangdong 518172, China
- Pan Wang
  - Shenzhen Key Laboratory of Steroid Drug Discovery and Development, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Guangdong 518172, China
- Bao Ting Zhu
  - Shenzhen Key Laboratory of Steroid Drug Discovery and Development, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Guangdong 518172, China
  - Shenzhen Bay Laboratory, Shenzhen, 518055, China
3
Pawlak JB, Hsu JCC, Xia H, Han P, Suh HW, Grove TL, Morrison J, Shi PY, Cresswell P, Laurent-Rolle M. CMPK2 restricts Zika virus replication by inhibiting viral translation. PLoS Pathog 2023; 19:e1011286. [PMID: 37075076 PMCID: PMC10150978 DOI: 10.1371/journal.ppat.1011286] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/13/2022] [Revised: 05/01/2023] [Accepted: 03/09/2023] [Indexed: 04/20/2023] Open
Abstract
Flaviviruses continue to emerge as global health threats. There are currently no Food and Drug Administration (FDA) approved antiviral treatments for flaviviral infections. Therefore, there is a pressing need to identify host and viral factors that can be targeted for effective therapeutic intervention. Type I interferon (IFN-I) production in response to microbial products is one of the host's first lines of defense against invading pathogens. Cytidine/uridine monophosphate kinase 2 (CMPK2) is a type I interferon-stimulated gene (ISG) that exerts antiviral effects. However, the molecular mechanism by which CMPK2 inhibits viral replication is unclear. Here, we report that CMPK2 expression restricts Zika virus (ZIKV) replication by specifically inhibiting viral translation and that IFN-I-induced CMPK2 contributes significantly to the overall antiviral response against ZIKV. We demonstrate that expression of CMPK2 results in a significant decrease in the replication of other pathogenic flaviviruses including dengue virus (DENV-2), Kunjin virus (KUNV) and yellow fever virus (YFV). Importantly, we determine that the N-terminal domain (NTD) of CMPK2, which lacks kinase activity, is sufficient to restrict viral translation. Thus, its kinase function is not required for CMPK2's antiviral activity. Furthermore, we identify seven conserved cysteine residues within the NTD as critical for CMPK2 antiviral activity. Thus, these residues may form an unknown functional site in the NTD of CMPK2 contributing to its antiviral function. Finally, we show that mitochondrial localization of CMPK2 is required for its antiviral effects. Given its broad antiviral activity against flaviviruses, CMPK2 is a promising potential pan-flavivirus inhibitor.
Affiliation(s)
- Joanna B. Pawlak
  - Department of Immunobiology, Yale University School of Medicine, New Haven, Connecticut, United States of America
  - Section of Infectious Diseases, Department of Internal Medicine, Yale University School of Medicine, New Haven, Connecticut, United States of America
  - Department of Microbial Pathogenesis, Yale University School of Medicine, New Haven, Connecticut, United States of America
- Jack Chun-Chieh Hsu
  - Department of Immunobiology, Yale University School of Medicine, New Haven, Connecticut, United States of America
- Hongjie Xia
  - Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, Texas, United States of America
- Patrick Han
  - Department of Immunobiology, Yale University School of Medicine, New Haven, Connecticut, United States of America
  - Department of Dermatology, Yale University School of Medicine, New Haven, Connecticut, United States of America
- Hee-Won Suh
  - Department of Biomedical Engineering, Yale University School of Engineering and Applied Science, New Haven, Connecticut, United States of America
- Tyler L. Grove
  - Department of Biochemistry, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Juliet Morrison
  - Department of Microbiology and Plant Pathology, University of California, Riverside, California, United States of America
- Pei-Yong Shi
  - Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, Texas, United States of America
  - Institute for Human Infections and Immunity, University of Texas Medical Branch, Galveston, Texas, United States of America
  - Sealy Institute for Vaccine Sciences, University of Texas Medical Branch, Galveston, Texas, United States of America
  - Sealy Institute for Drug Discovery, University of Texas Medical Branch, Galveston, Texas, United States of America
- Peter Cresswell
  - Department of Immunobiology, Yale University School of Medicine, New Haven, Connecticut, United States of America
  - Department of Cell Biology, Yale University School of Medicine, New Haven, Connecticut, United States of America
- Maudry Laurent-Rolle
  - Department of Immunobiology, Yale University School of Medicine, New Haven, Connecticut, United States of America
  - Section of Infectious Diseases, Department of Internal Medicine, Yale University School of Medicine, New Haven, Connecticut, United States of America
  - Department of Microbial Pathogenesis, Yale University School of Medicine, New Haven, Connecticut, United States of America
4
Guo Z, Yamaguchi R. Machine learning methods for protein-protein binding affinity prediction in protein design. Frontiers in Bioinformatics 2022; 2:1065703. [PMID: 36591334 PMCID: PMC9800603 DOI: 10.3389/fbinf.2022.1065703] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/10/2022] [Accepted: 12/01/2022] [Indexed: 12/23/2022] Open
Abstract
Protein-protein interactions govern a wide range of biological activity. A proper estimation of the protein-protein binding affinity is vital to design proteins with high specificity and binding affinity toward a target protein, which has a variety of applications including antibody design in immunotherapy, enzyme engineering for reaction optimization, and construction of biosensors. However, experimental and theoretical modelling methods are time-consuming, hinder the exploration of the entire protein space, and deter the identification of optimal proteins that meet the requirements of practical applications. In recent years, the rapid development in machine learning methods for protein-protein binding affinity prediction has revealed the potential of a paradigm shift in protein design. Here, we review the prediction methods and associated datasets and discuss the requirements and construction methods of binding affinity prediction models for protein design.
Affiliation(s)
- Zhongliang Guo
  - Division of Cancer Systems Biology, Aichi Cancer Center Research Institute, Nagoya, Aichi, Japan
- Rui Yamaguchi (corresponding author)
  - Division of Cancer Systems Biology, Aichi Cancer Center Research Institute, Nagoya, Aichi, Japan
  - Division of Cancer Informatics, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan
5
Romero-Molina S, Ruiz-Blanco YB, Mieres-Perez J, Harms M, Münch J, Ehrmann M, Sanchez-Garcia E. PPI-Affinity: A Web Tool for the Prediction and Optimization of Protein-Peptide and Protein-Protein Binding Affinity. J Proteome Res 2022; 21:1829-1841. [PMID: 35654412 PMCID: PMC9361347 DOI: 10.1021/acs.jproteome.2c00020] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Indexed: 11/30/2022]
Abstract
Virtual screening of protein–protein and protein–peptide interactions is a challenging task that directly impacts the processes of hit identification and hit-to-lead optimization in drug design projects involving peptide-based pharmaceuticals. Although several screening tools designed to predict the binding affinity of protein–protein complexes have been proposed, methods specifically developed to predict protein–peptide binding affinity are comparatively scarce. Frequently, predictors trained to score the affinity of small molecules are used for peptides indistinctively, despite the larger complexity and heterogeneity of interactions rendered by peptide binders. To address this issue, we introduce PPI-Affinity, a tool that leverages support vector machine (SVM) predictors of binding affinity to screen datasets of protein–protein and protein–peptide complexes, as well as to generate and rank mutants of a given structure. The performance of the SVM models was assessed on four benchmark datasets, which include protein–protein and protein–peptide binding affinity data. In addition, we evaluated our model on a set of mutants of EPI-X4, an endogenous peptide inhibitor of the chemokine receptor CXCR4, and on complexes of the serine proteases HTRA1 and HTRA3 with peptides. PPI-Affinity is freely accessible at https://protdcal.zmb.uni-due.de/PPIAffinity.
Affiliation(s)
- Sandra Romero-Molina
  - Computational Biochemistry, Center of Medical Biotechnology, University of Duisburg-Essen, Essen 45141, Germany
- Yasser B Ruiz-Blanco
  - Computational Biochemistry, Center of Medical Biotechnology, University of Duisburg-Essen, Essen 45141, Germany
- Joel Mieres-Perez
  - Computational Biochemistry, Center of Medical Biotechnology, University of Duisburg-Essen, Essen 45141, Germany
- Mirja Harms
  - Institute of Molecular Virology, Ulm University Medical Center, Ulm 89081, Germany
- Jan Münch
  - Institute of Molecular Virology, Ulm University Medical Center, Ulm 89081, Germany
  - Core Facility Functional Peptidomics, Ulm University Medical Center, Ulm 89081, Germany
- Michael Ehrmann
  - Faculty of Biology, Center of Medical Biotechnology, University of Duisburg-Essen, Essen 45141, Germany
- Elsa Sanchez-Garcia
  - Computational Biochemistry, Center of Medical Biotechnology, University of Duisburg-Essen, Essen 45141, Germany
6
Yang YX, Wang P, Zhu BT. Relative importance of interface and surface areas in protein-protein binding affinity prediction: A machine learning analysis based on linear regression and artificial neural network. Biophys Chem 2022; 283:106762. [DOI: 10.1016/j.bpc.2022.106762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2021] [Revised: 01/11/2022] [Accepted: 01/14/2022] [Indexed: 11/02/2022]
7
Dhusia K, Madrid C, Su Z, Wu Y. EXCESP: A Structure-Based Online Database for Extracellular Interactome of Cell Surface Proteins in Humans. J Proteome Res 2022; 21:349-359. [PMID: 34978816 DOI: 10.1021/acs.jproteome.1c00612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
The interactions between ectodomains of cell surface proteins are vital players in many important cellular processes, such as regulating immune responses, coordinating cell differentiation, and shaping neural plasticity. However, while the construction of a large-scale protein interactome has been greatly facilitated by the development of high-throughput experimental techniques, little progress has been made to support the discovery of the extracellular interactome for cell surface proteins. Harnessing recent advances in computational modeling of protein-protein interactions, we present here a structure-based online database for the extracellular interactome of cell surface proteins in humans, called EXCESP. The database contains both experimentally determined and computationally predicted interactions among all type-I transmembrane proteins in humans. All structural models for these interactions and their binding affinities were further computationally modeled. Moreover, information from other online resources, such as the expression levels of each protein in different cell types and its relation to various signaling pathways, has also been integrated into the database. In summary, the database serves as a valuable addition to the existing online resources for the study of cell surface proteins. It can contribute to the understanding of the functions of cell surface proteins in the era of systems biology.
Affiliation(s)
- Kalyani Dhusia
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
- Carlos Madrid
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
  - Laboratory for Macromolecular Analysis and Proteomics, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
- Zhaoqian Su
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
8
Markiewicz M, Szczelina R, Milanovic B, Subczynski WK, Pasenkiewicz-Gierula M. Chirality affects cholesterol-oxysterol association in water, a computational study. Comput Struct Biotechnol J 2021; 19:4319-4335. [PMID: 34429850 PMCID: PMC8361299 DOI: 10.1016/j.csbj.2021.07.022] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/17/2021] [Revised: 07/18/2021] [Accepted: 07/21/2021] [Indexed: 01/04/2023] Open
Abstract
Cholesterol (Chol) is the most prevalent sterol in the animal kingdom and an indispensable component of mammalian cell membranes. Chol content in the membrane is strictly controlled, although the oxidation of phospholipids may change the relative content of membrane Chol. An excess of it results in the formation of pure Chol microdomains in the membrane. It is likely that some Chol molecules detach from the domains and self-assemble in the aqueous environment. This may promote Chol microcrystallisation, which initiates the development of gallstones and atherosclerotic plaque. In this study, the molecular dynamics, free energy perturbation, umbrella sampling and Voronoi diagram methods are used to reveal the details of self-association of Chol and its oxidised forms (oxChol), namely 7α,β-hydroxycholesterol and 7α,β-hydroperoxycholesterol, in water. In the first part of the study the interactions between a sterol monomer and water over a short and longer timescale as well as the energy of hydration of each sterol are analysed. This helps one to understand Chol-Chol and Chol-OxChol with different chirality self-association in water better, which is analysed in the second part of the study. The Voronoi diagram approach is used to determine the relative arrangement of molecules in the dimer and, most importantly, to analyse the dehydration of the contacting surfaces of the assembling molecules. Free energy calculations indicate that Chol and 7β-hydroxycholesterol associate into the most stable dimer and that Chol-Chol is the next most stable of the five dimers studied. Employing different computational methods enables us to obtain an adequate picture of Chol-sterol self-association in water, which includes dynamic, energetic and temporal aspects of the process.
Affiliation(s)
- Michal Markiewicz
  - Department of Computational Biophysics and Bioinformatics, Faculty of Biochemistry, Biophysics, and Biotechnology, Jagiellonian University, Krakow, Poland
- Robert Szczelina
  - Division of Computational Mathematics, Faculty of Mathematics and Computer Science, Jagiellonian University, 30-348 Krakow, Poland
- Bozena Milanovic
  - Department of Computational Biophysics and Bioinformatics, Faculty of Biochemistry, Biophysics, and Biotechnology, Jagiellonian University, Krakow, Poland
- Witold K. Subczynski
  - Department of Biophysics, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Marta Pasenkiewicz-Gierula
  - Department of Computational Biophysics and Bioinformatics, Faculty of Biochemistry, Biophysics, and Biotechnology, Jagiellonian University, Krakow, Poland
9
Wang B, Su Z, Wu Y. Computational Assessment of Protein-Protein Binding Affinity by Reverse Engineering the Energetics in Protein Complexes. Genomics Proteomics & Bioinformatics 2021; 19:1012-1022. [PMID: 33838354 PMCID: PMC9403033 DOI: 10.1016/j.gpb.2021.03.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/10/2018] [Revised: 03/07/2019] [Accepted: 05/17/2019] [Indexed: 11/29/2022]
Abstract
The cellular functions of proteins are maintained by forming diverse complexes. The stability of these complexes is quantified by the measurement of binding affinity, and mutations that alter the binding affinity can cause various diseases such as cancer and diabetes. As a result, accurate estimation of the binding stability and the effects of mutations on changes of binding affinity is a crucial step to understanding the biological functions of proteins and their dysfunctional consequences. It has been hypothesized that the stability of a protein complex is dependent not only on the residues at its binding interface by pairwise interactions but also on all other remaining residues that do not appear at the binding interface. Here, we computationally reconstruct the binding affinity by decomposing it into the contributions of interfacial residues and other non-interfacial residues in a protein complex. We further assume that the contributions of both interfacial and non-interfacial residues to the binding affinity depend on their local structural environments such as solvent-accessible surfaces and secondary structural types. The weights of all corresponding parameters are optimized by Monte-Carlo simulations. After cross-validation against a large-scale dataset, we show that the model not only shows a strong correlation between the absolute values of the experimental and calculated binding affinities, but can also be an effective approach to predict the relative changes of binding affinity from mutations. Moreover, we have found that the optimized weights of many parameters can capture the first-principle chemical and physical features of molecular recognition, therefore reversely engineering the energetics of protein complexes. These results suggest that our method can serve as a useful addition to current computational approaches for predicting binding affinity and understanding the molecular mechanism of protein–protein interactions.
Affiliation(s)
- Bo Wang
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
- Zhaoqian Su
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
10
Abbasi WA, Yaseen A, Hassan FU, Andleeb S, Minhas FUAA. ISLAND: in-silico proteins binding affinity prediction using sequence information. BioData Min 2020; 13:20. [PMID: 33292419 PMCID: PMC7688004 DOI: 10.1186/s13040-020-00231-w] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 04/03/2020] [Accepted: 11/15/2020] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND: Determining binding affinity in protein-protein interactions is important in the discovery and design of novel therapeutics and mutagenesis studies. Determination of binding affinity of proteins in the formation of protein complexes requires sophisticated, expensive and time-consuming experimentation which can be replaced with computational methods. Most computational prediction techniques require protein structures that limit their applicability to protein complexes with known structures. In this work, we explore sequence-based protein binding affinity prediction using machine learning. METHOD: We have used protein sequence information instead of protein structures along with machine learning techniques to accurately predict the protein binding affinity. RESULTS: We present our findings that the true generalization performance of even the state-of-the-art sequence-only predictor is far from satisfactory and that the development of machine learning methods for binding affinity prediction with improved generalization performance is still an open problem. We have also proposed a sequence-based novel protein binding affinity predictor called ISLAND which gives better accuracy than existing methods over the same validation set as well as on external independent test dataset. A cloud-based webserver implementation of ISLAND and its python code are available at https://sites.google.com/view/wajidarshad/software. CONCLUSION: This paper highlights the fact that the true generalization performance of even the state-of-the-art sequence-only predictor of binding affinity is far from satisfactory and that the development of effective and practical methods in this domain is still an open problem.
Affiliation(s)
- Wajid Arshad Abbasi
  - Computational Biology and Data Analysis Laboratory, Department of Computer Science and Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, Pakistan
  - Biomedical Informatics Research Laboratory, Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
- Adiba Yaseen
  - Biomedical Informatics Research Laboratory, Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
- Fahad Ul Hassan
  - Biomedical Informatics Research Laboratory, Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
- Saiqa Andleeb
  - Biotechnology Laboratory, Department of Zoology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, Pakistan
11
Barigye SJ, Gómez-Ganau S, Serrano-Candelas E, Gozalbes R. PeptiDesCalculator: Software for computation of peptide descriptors. Definition, implementation and case studies for 9 bioactivity endpoints. Proteins 2020; 89:174-184. [PMID: 32881068 DOI: 10.1002/prot.26003] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/12/2020] [Revised: 08/05/2020] [Accepted: 08/27/2020] [Indexed: 11/09/2022]
Abstract
We present a novel Java-based program named PeptiDesCalculator for computing peptide descriptors. These descriptors include: redefinitions of known protein parameters to suit the peptide domain, generalization schemes for the global descriptions of peptide characteristics, as well as empirical descriptors based on experimental evidence on peptide stability and interaction propensity. The PeptiDesCalculator software provides a user-friendly Graphical User Interface (GUI) and is parallelized to maximize the use of computational resources available in current workstations. The PeptiDesCalculator indices are employed in modeling 8 peptide bioactivity endpoints demonstrating satisfactory behavior. Moreover, we compare the performance of a support vector machine (SVM) classifier built using 15 PeptiDesCalculator indices with that of a recently reported deep neural network (DNN) antimicrobial activity classifier, demonstrating comparable test set performance notwithstanding the remarkably lower degree of freedom for the former. This software will facilitate the development of in silico models for the prediction of peptide properties.
Affiliation(s)
- Stephen J Barigye
  - ProtoQSAR SL, Centro Europeo de Empresas Innovadoras (CEEI), Parque Tecnológico de Valencia, Valencia, Spain
  - MolDrug AI Systems SL, Valencia, Spain
- Sergi Gómez-Ganau
  - ProtoQSAR SL, Centro Europeo de Empresas Innovadoras (CEEI), Parque Tecnológico de Valencia, Valencia, Spain
  - Eurofins Agroscience Services Regulatory Spain SL, Valencia, Spain
- Eva Serrano-Candelas
  - ProtoQSAR SL, Centro Europeo de Empresas Innovadoras (CEEI), Parque Tecnológico de Valencia, Valencia, Spain
- Rafael Gozalbes
  - ProtoQSAR SL, Centro Europeo de Empresas Innovadoras (CEEI), Parque Tecnológico de Valencia, Valencia, Spain
  - MolDrug AI Systems SL, Valencia, Spain
12
Nithin C, Mukherjee S, Bahadur RP. A structure-based model for the prediction of protein-RNA binding affinity. RNA (New York, N.Y.) 2019; 25:1628-1645. [PMID: 31395671 PMCID: PMC6859855 DOI: 10.1261/rna.071779.119] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 05/04/2019] [Accepted: 08/05/2019] [Indexed: 05/28/2023]
Abstract
Protein-RNA recognition is highly affinity-driven and regulates a wide array of cellular functions. In this study, we have curated a binding affinity data set of 40 protein-RNA complexes, for which at least one unbound partner is available in the docking benchmark. The data set covers a wide affinity range of eight orders of magnitude as well as four different structural classes. On average, we find the complexes with single-stranded RNA have the highest affinity, whereas the complexes with the duplex RNA have the lowest. Nevertheless, free energy gain upon binding is the highest for the complexes with ribosomal proteins and the lowest for the complexes with tRNA with an average of -5.7 cal/mol/Å² in the entire data set. We train regression models to predict the binding affinity from the structural and physicochemical parameters of protein-RNA interfaces. The best fit model with the lowest maximum error is provided with three interface parameters: relative hydrophobicity, conformational change upon binding and relative hydration pattern. This model has been used for predicting the binding affinity on a test data set, generated using mutated structures of yeast aspartyl-tRNA synthetase, for which experimentally determined ΔG values of 40 mutations are available. The predicted ΔG_empirical values highly correlate with the experimental observations. The data set provided in this study should be useful for further development of the binding affinity prediction methods. Moreover, the model developed in this study enhances our understanding of the structural basis of protein-RNA binding affinity and provides a platform to engineer protein-RNA interfaces with desired affinity.
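For orientation, an affinity range of eight orders of magnitude can be related to binding free energy through the standard relation ΔG = RT ln(Kd). The sketch below assumes a temperature of 298.15 K and a 1 M standard state, neither of which is stated in the abstract; the function name `delta_g_from_kd` and the two example Kd values are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_from_kd(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Binding free energy in kcal/mol from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


# Two hypothetical complexes whose Kd values differ by eight orders of magnitude.
weak = delta_g_from_kd(1e-4)
strong = delta_g_from_kd(1e-12)
span = weak - strong
print(f"{weak:.2f} {strong:.2f} {span:.2f}")
```

Under these assumptions, eight orders of magnitude in Kd correspond to a spread of roughly 10.9 kcal/mol in ΔG.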
Affiliation(s)
- Chandran Nithin
  - Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
- Sunandan Mukherjee
  - Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
- Ranjit Prasad Bahadur
  - Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
|
13
|
Su Z, Wu Y. Multiscale simulation unravel the kinetic mechanisms of inflammasome assembly. Biochimica et Biophysica Acta - Molecular Cell Research 2019; 1867:118612. [PMID: 31758956 DOI: 10.1016/j.bbamcr.2019.118612] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/13/2019] [Revised: 11/11/2019] [Accepted: 11/18/2019] [Indexed: 01/16/2023]
Abstract
In the innate immune system, the host defense from the invasion of external pathogens triggers the inflammatory responses. Proteins involved in the inflammatory pathways were often found to aggregate into supramolecular oligomers, called 'inflammasome', mostly through the homotypic interaction between their domains that belong to the death domain superfamily. Although much has been known about the formation of these helical molecular machineries, the detailed correlation between the dynamics of their assembly and the structure of each domain is still not well understood. Using the filament formed by the PYD domains of adaptor molecule ASC as a test system, we constructed a new multiscale simulation framework to study the kinetics of inflammasome assembly. We found that the filament assembly is a multi-step, but highly cooperative process. Moreover, there are three types of binding interfaces between domain subunits in the ASCPYD filament. The multiscale simulation results suggest that dynamics of domain assembly are rooted in the primary protein sequence which defines the energetics of molecular recognition through three binding interfaces. Interface I plays a more regulatory role than the other two in mediating both the kinetics and the thermodynamics of assembly. Finally, the efficiency of our computational framework allows us to design mutants on a systematic scale and predict their impacts on filament assembly. In summary, this is, to the best of our knowledge, the first simulation method to model the spatial-temporal process of inflammasome assembly. Our work is a useful addition to a suite of existing experimental techniques to study the functions of inflammasome in innate immune system.
Affiliation(s)
- Zhaoqian Su
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, United States of America
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, United States of America
|
14
|
A Multiscale Computational Model for Simulating the Kinetics of Protein Complex Assembly. Methods Mol Biol 2019; 1764:401-411. [PMID: 29605930 DOI: 10.1007/978-1-4939-7759-8_26] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 02/18/2023]
Abstract
Proteins fulfill versatile biological functions by interacting with each other and forming high-order complexes. Although the order in which protein subunits assemble is important for the biological function of their final complex, this kinetic information has received comparatively little attention in recent years. Here we describe a multiscale framework that can be used to simulate the kinetics of protein complex assembly. There are two levels of models in the framework. The structural details of a protein complex are reflected by the residue-based model, while a lower-resolution model uses a rigid-body (RB) representation to simulate the process of complex assembly. These two levels of models are integrated together, so that we are able to provide the kinetic information about complex assembly with both structural details and computational efficiency.
|
15
|
Marín-López MA, Planas-Iglesias J, Aguirre-Plans J, Bonet J, Garcia-Garcia J, Fernandez-Fuentes N, Oliva B. On the mechanisms of protein interactions: predicting their affinity from unbound tertiary structures. Bioinformatics 2018; 34:592-598. [PMID: 29028891 PMCID: PMC5860604 DOI: 10.1093/bioinformatics/btx616] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 11/22/2016] [Accepted: 09/26/2017] [Indexed: 12/12/2022] Open
Abstract
Motivation: The characterization of the protein–protein association mechanisms is crucial to understanding how biological processes occur. It has been previously shown that the early formation of non-specific encounters enhances the realization of the stereospecific (i.e. native) complex by reducing the dimensionality of the search process. The association rate for the formation of such complex plays a crucial role in the cell biology and depends on how the partners diffuse to be close to each other. Predicting the binding free energy of proteins provides new opportunities to modulate and control protein–protein interactions. However, existing methods require the 3D structure of the complex to predict its affinity, severely limiting their application to interactions with known structures. Results: We present a new approach that relies on the unbound protein structures and protein docking to predict protein–protein binding affinities. Through the study of the docking space (i.e. decoys), the method predicts the binding affinity of the query proteins when the actual structure of the complex itself is unknown. We tested our approach on a set of globular and soluble proteins of the newest affinity benchmark, obtaining accuracy values comparable to other state-of-art methods: a 0.4 correlation coefficient between the experimental and predicted values of ΔG and an error < 3 kcal/mol. Availability and implementation: The binding affinity predictor is implemented and available at http://sbi.upf.edu/BADock and https://github.com/badocksbi/BADock. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Manuel Alejandro Marín-López
  - Structural Bioinformatics Lab, Department of Experimental and Health Science, Universitat Pompeu Fabra, Barcelona 08003, Spain
- Joan Planas-Iglesias
  - Division of Metabolic and Vascular Health, University of Warwick, Coventry CV4 7AL, UK
- Joaquim Aguirre-Plans
  - Structural Bioinformatics Lab, Department of Experimental and Health Science, Universitat Pompeu Fabra, Barcelona 08003, Spain
- Jaume Bonet
  - Laboratory of Protein Design and Immunoengineering, School of Engineering, Ecole Polytechnique Federale de Lausanne, Lausanne 1015, Switzerland
- Javier Garcia-Garcia
  - Structural Bioinformatics Lab, Department of Experimental and Health Science, Universitat Pompeu Fabra, Barcelona 08003, Spain
- Narcis Fernandez-Fuentes
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth SY23 3DA, UK
- Baldo Oliva
  - Structural Bioinformatics Lab, Department of Experimental and Health Science, Universitat Pompeu Fabra, Barcelona 08003, Spain
|
16
|
Lu B, Li C, Chen Q, Song J. ProBAPred: Inferring protein–protein binding affinity by incorporating protein sequence and structural features. J Bioinform Comput Biol 2018; 16:1850011. [PMID: 29954286 DOI: 10.1142/s0219720018500117] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 12/24/2022]
Abstract
Protein-protein binding interaction is the most prevalent biological activity that mediates a great variety of biological processes. The increasing availability of experimental data of protein–protein interaction allows a systematic construction of protein–protein interaction networks, significantly contributing to a better understanding of protein functions and their roles in cellular pathways and human diseases. Compared to well-established classification for protein–protein interactions (PPIs), limited work has been conducted for estimating protein–protein binding free energy, which can provide informative real-value regression models for characterizing the protein–protein binding affinity. In this study, we propose a novel ensemble computational framework, termed ProBAPred (Protein–protein Binding Affinity Predictor), for quantitative estimation of protein–protein binding affinity. A large number of sequence and structural features, including physical–chemical properties, binding energy and conformation annotations, were collected and calculated from currently available protein binding complex datasets and the literature. Feature selection based on the WEKA package was performed to identify and characterize the most informative and contributing feature subsets. Experiments on the independent test showed that our ensemble method achieved the lowest Mean Absolute Error (MAE; 1.657 kcal/mol) and the second highest correlation coefficient, compared with the existing methods. The datasets and source codes of ProBAPred, and the supplementary materials in this study can be downloaded at http://lightning.med.monash.edu/probapred/ for academic use. We anticipate that the developed ProBAPred regression models can facilitate computational characterization and experimental studies of protein–protein binding affinity.
Affiliation(s)
- Bangli Lu
  - School of Computer, Electronic and Information, and State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi University, 100 Daxue Road, 530004 Nanning, P. R. China
- Chen Li
  - Infection and Immunity Program, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, VIC 3800, Australia
- Qingfeng Chen
  - School of Computer, Electronic and Information, and State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi University, 100 Daxue Road, 530004 Nanning, P. R. China
- Jiangning Song
  - Infection and Immunity Program, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, VIC 3800, Australia
  - Monash Centre for Data Science, Faculty of Information Technology, Monash University, VIC 3800, Australia
  - ARC Centre of Excellence for Advanced Molecular Imaging, Monash University, VIC 3800, Australia
|
17
|
Raucci R, Laine E, Carbone A. Local Interaction Signal Analysis Predicts Protein-Protein Binding Affinity. Structure 2018; 26:905-915.e4. [PMID: 29779789 DOI: 10.1016/j.str.2018.04.006] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 10/26/2017] [Revised: 02/06/2018] [Accepted: 04/10/2018] [Indexed: 12/27/2022]
Abstract
Several models estimating the strength of the interaction between proteins in a complex have been proposed. By exploring the geometry of contact distribution at protein-protein interfaces, we provide an improved model of binding energy. Local interaction signal analysis (LISA) is a radial function based on terms describing favorable and non-favorable contacts obtained by density functional theory, the support-core-rim interface residue distribution, non-interacting charged residues and secondary structures contribution. The three-dimensional organization of the contacts and their contribution on localized hot-sites over the entire interaction surface were numerically evaluated. LISA achieves a correlation of 0.81 (and a root-mean-square error of 2.35 ± 0.38 kcal/mol) when tested on 125 complexes for which experimental measurements were realized. LISA's performance is stable for subsets defined by functional composition and extent of conformational changes upon complex formation. A large-scale comparison with 17 other functions demonstrated the power of the geometrical model in the understanding of complex binding.
Affiliation(s)
- Raffaele Raucci
  - Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 4 Place Jussieu, 75005 Paris, France
  - Sorbonne Université, Institut des Sciences du Calcul et des Données (ISCD), 75005 Paris, France
- Elodie Laine
  - Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 4 Place Jussieu, 75005 Paris, France
- Alessandra Carbone
  - Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 4 Place Jussieu, 75005 Paris, France
  - Institut Universitaire de France, 75005 Paris, France
|
18
|
Škrbić T, Zamuner S, Hong R, Seno F, Laio A, Trovato A. Vibrational entropy estimation can improve binding affinity prediction for non-obligatory protein complexes. Proteins 2018; 86:393-404. [DOI: 10.1002/prot.25454] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 08/11/2017] [Revised: 12/22/2017] [Accepted: 01/05/2018] [Indexed: 01/10/2023]
Affiliation(s)
- Tatjana Škrbić
  - Faculty of Physics, International School for Advanced Studies (SISSA/ISAS), Trieste, Italy
  - Department of Physics and Astronomy “Galileo Galilei”, University of Padova, Padova, Italy
- Stefano Zamuner
  - Department of Physics and Astronomy “Galileo Galilei”, University of Padova, Padova, Italy
- Rolando Hong
  - Faculty of Physics, International School for Advanced Studies (SISSA/ISAS), Trieste, Italy
- Flavio Seno
  - Department of Physics and Astronomy “Galileo Galilei”, University of Padova, Padova, Italy
  - Padova Section, National Institute of Nuclear Physics (INFN), Padova, Italy
- Alessandro Laio
  - Faculty of Physics, International School for Advanced Studies (SISSA/ISAS), Trieste, Italy
- Antonio Trovato
  - Department of Physics and Astronomy “Galileo Galilei”, University of Padova, Padova, Italy
  - Padova Section, National Institute of Nuclear Physics (INFN), Padova, Italy
|
19
|
Xie ZR, Chen J, Wu Y. Predicting Protein-protein Association Rates using Coarse-grained Simulation and Machine Learning. Sci Rep 2017; 7:46622. [PMID: 28418043 PMCID: PMC5394550 DOI: 10.1038/srep46622] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 11/17/2016] [Accepted: 03/21/2017] [Indexed: 12/20/2022] Open
Abstract
Protein–protein interactions dominate all major biological processes in living cells. We have developed a new Monte Carlo-based simulation algorithm to study the kinetic process of protein association. We tested our method on a previously used large benchmark set of 49 protein complexes. The predicted rate was overestimated in the benchmark test compared to the experimental results for a group of protein complexes. We hypothesized that this resulted from molecular flexibility at the interface regions of the interacting proteins. After applying a machine learning algorithm with input variables that accounted for both the conformational flexibility and the energetic factor of binding, we successfully identified most of the protein complexes with overestimated association rates and improved our final prediction by using a cross-validation test. This method was then applied to a new independent test set and resulted in a similar prediction accuracy to that obtained using the training set. It has been thought that diffusion-limited protein association is dominated by long-range interactions. Our results provide strong evidence that the conformational flexibility also plays an important role in regulating protein association. Our studies provide new insights into the mechanism of protein association and offer a computationally efficient tool for predicting its rate.
Affiliation(s)
- Zhong-Ru Xie
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Yeshiva University, 1300 Morris Park Avenue, Bronx, NY, 10461, USA
- Jiawen Chen
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Yeshiva University, 1300 Morris Park Avenue, Bronx, NY, 10461, USA
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Yeshiva University, 1300 Morris Park Avenue, Bronx, NY, 10461, USA
|
20
|
Integrating computational methods and experimental data for understanding the recognition mechanism and binding affinity of protein-protein complexes. Progress in Biophysics and Molecular Biology 2017; 128:33-38. [PMID: 28069340 DOI: 10.1016/j.pbiomolbio.2017.01.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 03/30/2016] [Revised: 01/04/2017] [Accepted: 01/05/2017] [Indexed: 01/09/2023]
Abstract
Protein-protein interactions perform several functions inside the cell. Understanding the recognition mechanism and binding affinity of protein-protein complexes is a challenging problem in experimental and computational biology. In this review, we focus on two aspects (i) understanding the recognition mechanism and (ii) predicting the binding affinity. The first part deals with computational techniques for identifying the binding site residues and the contribution of important interactions for understanding the recognition mechanism of protein-protein complexes in comparison with experimental observations. The second part is devoted to the methods developed for discriminating high and low affinity complexes, and predicting the binding affinity of protein-protein complexes using three-dimensional structural information and just from the amino acid sequence. The overall view enhances our understanding of the integration of experimental data and computational methods, recognition mechanism of protein-protein complexes and the binding affinity.
|
21
|
Computational Approaches for Predicting Binding Partners, Interface Residues, and Binding Affinity of Protein-Protein Complexes. Methods Mol Biol 2017; 1484:237-253. [PMID: 27787830 DOI: 10.1007/978-1-4939-6406-2_16] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 12/20/2022]
Abstract
Studying protein-protein interactions leads to a better understanding of the underlying principles of several biological pathways. Cost and labor-intensive experimental techniques suggest the need for computational methods to complement them. Several such state-of-the-art methods have been reported for analyzing diverse aspects such as predicting binding partners, interface residues, and binding affinity for protein-protein complexes with reliable performance. However, there are specific drawbacks for different methods that indicate the need for their improvement. This review highlights various available computational algorithms for analyzing diverse aspects of protein-protein interactions and endorses the necessity for developing new robust methods for gaining deep insights about protein-protein interactions.
|
22
|
Xue LC, Rodrigues JP, Kastritis PL, Bonvin AM, Vangone A. PRODIGY: a web server for predicting the binding affinity of protein-protein complexes. Bioinformatics 2016; 32:3676-3678. [PMID: 27503228 DOI: 10.1093/bioinformatics/btw514] [Citation(s) in RCA: 474] [Impact Index Per Article: 59.3] [Received: 03/29/2016] [Revised: 07/17/2016] [Accepted: 07/30/2016] [Indexed: 11/13/2022] Open
Abstract
Gaining insights into the structural determinants of protein-protein interactions holds the key for a deeper understanding of biological functions, diseases and development of therapeutics. An important aspect of this is the ability to accurately predict the binding strength for a given protein-protein complex. Here we present PROtein binDIng enerGY prediction (PRODIGY), a web server to predict the binding affinity of protein-protein complexes from their 3D structure. The PRODIGY server implements our simple but highly effective predictive model based on intermolecular contacts and properties derived from non-interface surface. Availability and implementation: PRODIGY is freely available at http://milou.science.uu.nl/services/PRODIGY. Contact: a.m.j.j.bonvin@uu.nl, a.vangone@uu.nl.
Affiliation(s)
- Li C Xue
  - Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Faculty of Science - Department of Chemistry, Utrecht University, 3584CH Utrecht, The Netherlands
- João PGLM Rodrigues
  - Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Faculty of Science - Department of Chemistry, Utrecht University, 3584CH Utrecht, The Netherlands
- Panagiotis L Kastritis
  - Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Faculty of Science - Department of Chemistry, Utrecht University, 3584CH Utrecht, The Netherlands
- Alexandre MJJ Bonvin
  - Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Faculty of Science - Department of Chemistry, Utrecht University, 3584CH Utrecht, The Netherlands
- Anna Vangone
  - Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Faculty of Science - Department of Chemistry, Utrecht University, 3584CH Utrecht, The Netherlands
|
23
|
Yan Z, Wang J. Incorporating specificity into optimization: evaluation of SPA using CSAR 2014 and CASF 2013 benchmarks. J Comput Aided Mol Des 2016; 30:219-27. [DOI: 10.1007/s10822-016-9897-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 10/28/2015] [Accepted: 01/28/2016] [Indexed: 01/04/2023]
|
24
|
Xie ZR, Chen J, Wu Y. Multiscale Model for the Assembly Kinetics of Protein Complexes. J Phys Chem B 2016; 120:621-32. [DOI: 10.1021/acs.jpcb.5b08962] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 01/02/2023]
Affiliation(s)
- Zhong-Ru Xie
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
- Jiawen Chen
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
|
25
|
Srinivasulu YS, Wang JR, Hsu KT, Tsai MJ, Charoenkwan P, Huang WL, Huang HL, Ho SY. Characterizing informative sequence descriptors and predicting binding affinities of heterodimeric protein complexes. BMC Bioinformatics 2015; 16 Suppl 18:S14. [PMID: 26681483 PMCID: PMC4682391 DOI: 10.1186/1471-2105-16-s18-s14] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 11/14/2022] Open
Abstract
Background: Protein-protein interactions (PPIs) are involved in various biological processes, and underlying mechanism of the interactions plays a crucial role in therapeutics and protein engineering. Most machine learning approaches have been developed for predicting the binding affinity of protein-protein complexes based on structure and functional information. This work aims to predict the binding affinity of heterodimeric protein complexes from sequences only. Results: This work proposes a support vector machine (SVM) based binding affinity classifier, called SVM-BAC, to classify heterodimeric protein complexes based on the prediction of their binding affinity. SVM-BAC identified 14 of 580 sequence descriptors (physicochemical, energetic and conformational properties of the 20 amino acids) to classify 216 heterodimeric protein complexes into low and high binding affinity. SVM-BAC yielded the training accuracy, sensitivity, specificity, AUC and test accuracy of 85.80%, 0.89, 0.83, 0.86 and 83.33%, respectively, better than existing machine learning algorithms. The 14 features and support vector regression were further used to estimate the binding affinities (pKd) of 200 heterodimeric protein complexes. Prediction performance of a Jackknife test was the correlation coefficient of 0.34 and mean absolute error of 1.4. We further analyze three informative physicochemical properties according to their contribution to prediction performance. Results reveal that the following properties are effective in predicting the binding affinity of heterodimeric protein complexes: apparent partition energy based on buried molar fractions, relations between chemical structure and biological activity in principal component analysis IV, and normalized frequency of beta turn.
Conclusions: The proposed sequence-based prediction method SVM-BAC uses an optimal feature selection method to identify 14 informative features to classify and predict binding affinity of heterodimeric protein complexes. The characterization analysis revealed that the average numbers of beta turns and hydrogen bonds at protein-protein interfaces in high binding affinity complexes are more than those in low binding affinity complexes.
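Note (illustrative, not the authors' code): a jackknife (leave-one-out) test of the kind mentioned in the abstract can be sketched as below. The abstract reports support vector regression on 14 descriptors; to keep the example dependency-free, this sketch swaps in an ordinary least-squares line on a single made-up descriptor, and all data values are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def jackknife_predictions(xs, ys):
    """Predict each sample with a model trained on all the other samples."""
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(slope * xs[i] + intercept)
    return preds


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mean_absolute_error(a, b):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical descriptor values and pKd labels, for illustration only.
descriptor = [0.1, 0.4, 0.5, 0.7, 0.9, 1.2]
pkd = [5.0, 6.1, 6.0, 7.2, 7.9, 9.1]

predicted = jackknife_predictions(descriptor, pkd)
r = pearson_r(pkd, predicted)
mae = mean_absolute_error(pkd, predicted)
print(f"jackknife r = {r:.2f}, MAE = {mae:.2f}")
```

Because every prediction comes from a model that never saw the held-out complex, the resulting correlation and MAE are less optimistic than training-set figures, which is the reason such a test is typically reported for small data sets.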
|
26
|
Yan Z, Wang J. Optimizing the affinity and specificity of ligand binding with the inclusion of solvation effect. Proteins 2015; 83:1632-42. [PMID: 26111900 DOI: 10.1002/prot.24848] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 03/16/2015] [Revised: 06/03/2015] [Accepted: 06/21/2015] [Indexed: 01/08/2023]
Abstract
Solvation effect is an important factor for protein-ligand binding in aqueous water. Previous scoring function of protein-ligand interactions rarely incorporates the solvation model into the quantification of protein-ligand interactions, mainly due to the immense computational cost, especially in the structure-based virtual screening, and nontransferable application of independently optimized atomic solvation parameters. In order to overcome these barriers, we effectively combine knowledge-based atom-pair potentials and the atomic solvation energy of charge-independent implicit solvent model in the optimization of binding affinity and specificity. The resulting scoring functions with optimized atomic solvation parameters is named as specificity and affinity with solvation effect (SPA-SE). The performance of SPA-SE is evaluated and compared to 20 other scoring functions, as well as SPA. The comparative results show that SPA-SE outperforms all other scoring functions in binding affinity prediction and "native" pose identification. Our optimization validates that solvation effect is an important regulator to the stability and specificity of protein-ligand binding. The development strategy of SPA-SE sets an example for other scoring function to account for the solvation effect in biomolecular recognitions.
Affiliation(s)
- Zhiqiang Yan
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences Changchun, Jilin, 130022, China
- Jin Wang
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences Changchun, Jilin, 130022, China
  - Department of Chemistry & Physics, State University of New York at Stony Brook, Stony Brook, New York, 11794-3400, USA
|
27
|
Vangone A, Bonvin AM. Contacts-based prediction of binding affinity in protein-protein complexes. eLife 2015. [PMID: 26193119 DOI: 10.7554/elife.07454] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/12/2023] Open
Affiliation(s)
- Anna Vangone
  - Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, Netherlands
- Alexandre MJJ Bonvin
  - Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, Netherlands
|
28
|
Vangone A, Bonvin AMJJ. Contacts-based prediction of binding affinity in protein-protein complexes. eLife 2015; 4:e07454. [PMID: 26193119 PMCID: PMC4523921 DOI: 10.7554/elife.07454] [Citation(s) in RCA: 309] [Impact Index Per Article: 34.3] [Received: 03/12/2015] [Accepted: 07/08/2015] [Indexed: 12/13/2022] Open
Abstract
Almost all critical functions in cells rely on specific protein-protein interactions. Understanding these is therefore crucial in the investigation of biological systems. Despite all past efforts, we still lack a thorough understanding of the energetics of association of proteins. Here, we introduce a new and simple approach to predict binding affinity based on functional and structural features of the biological system, namely the network of interfacial contacts. We assess its performance against a protein-protein binding affinity benchmark and show that both experimental methods used for affinity measurements and conformational changes have a strong impact on prediction accuracy. Using a subset of complexes with reliable experimental binding affinities and combining our contacts and contact-types-based model with recent observations on the role of the non-interacting surface in protein-protein interactions, we reach a high prediction accuracy for such a diverse dataset outperforming all other tested methods.
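Note (illustrative, not the authors' implementation): the "network of interfacial contacts" the abstract refers to can be approximated by counting residue pairs, one from each chain, that have any pair of atoms closer than a distance cutoff. The 5.5 Å cutoff, the dictionary layout, the function name and the toy coordinates below are all assumptions made for this sketch; the abstract does not specify them.

```python
import math
from itertools import product


def count_contacts(chain_a, chain_b, cutoff=5.5):
    """Count inter-chain residue pairs with at least one atom pair within `cutoff` angstroms.

    Each chain is a dict mapping a residue label to a list of (x, y, z) atom coordinates.
    """
    contacts = 0
    for atoms_a, atoms_b in product(chain_a.values(), chain_b.values()):
        # A residue pair counts once, no matter how many atom pairs are close.
        if any(math.dist(p, q) <= cutoff for p, q in product(atoms_a, atoms_b)):
            contacts += 1
    return contacts


# Toy coordinates in angstroms, invented for illustration.
chain_a = {
    "A:10": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    "A:11": [(0.0, 4.0, 0.0)],
}
chain_b = {
    "B:5": [(4.0, 0.0, 0.0)],
    "B:6": [(30.0, 30.0, 30.0)],
}

print(count_contacts(chain_a, chain_b))              # 1 contact at the assumed 5.5 A cutoff
print(count_contacts(chain_a, chain_b, cutoff=6.0))  # 2 contacts with a looser cutoff
```

A contacts-based predictor would then feed such counts (possibly split by residue type) into a regression against measured affinities; that step is not shown here.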
Affiliation(s)
- Anna Vangone
  - Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Faculty of Science-Chemistry, Utrecht University, Utrecht, Netherlands
- Alexandre MJJ Bonvin
  - Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Faculty of Science-Chemistry, Utrecht University, Utrecht, Netherlands
|
29
|
Pucci F, Bernaerts K, Teheux F, Gilis D, Rooman M. Symmetry Principles in Optimization Problems: an application to Protein Stability Prediction. IFAC-PapersOnLine 2015. [DOI: 10.1016/j.ifacol.2015.05.068] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Indexed: 11/15/2022]
|
30
|
Erijman A, Rosenthal E, Shifman JM. How structure defines affinity in protein-protein interactions. PLoS One 2014; 9:e110085. [PMID: 25329579 PMCID: PMC4199723 DOI: 10.1371/journal.pone.0110085] [Citation(s) in RCA: 63] [Impact Index Per Article: 6.3] [Received: 05/11/2014] [Accepted: 09/14/2014] [Indexed: 01/29/2023]
Abstract
Protein-protein interactions (PPI) in nature are conveyed by a multitude of binding modes involving various surfaces, secondary structure elements and intermolecular interactions. This diversity results in PPI binding affinities that span more than nine orders of magnitude. Several early studies attempted to correlate PPI binding affinities to various structure-derived features with limited success. The growing number of high-resolution structures, the appearance of more precise methods for measuring binding affinities and the development of new computational algorithms enable more thorough investigations in this direction. Here, we use a large dataset of PPI structures with the documented binding affinities to calculate a number of structure-based features that could potentially define binding energetics. We explore how well each calculated biophysical feature alone correlates with binding affinity and determine the features that could be used to distinguish between high-, medium- and low- affinity PPIs. Furthermore, we test how various combinations of features could be applied to predict binding affinity and observe a slow improvement in correlation as more features are incorporated into the equation. In addition, we observe a considerable improvement in predictions if we exclude from our analysis low-resolution and NMR structures, revealing the importance of capturing exact intermolecular interactions in our calculations. Our analysis should facilitate prediction of new interactions on the genome scale, better characterization of signaling networks and design of novel binding partners for various target proteins.
Affiliation(s)
- Ariel Erijman
  - Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Eran Rosenthal
  - Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Julia M. Shifman
  - Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
|
31
|
Park J, Saitou K. ROTAS: a rotamer-dependent, atomic statistical potential for assessment and prediction of protein structures. BMC Bioinformatics 2014; 15:307. [PMID: 25236673 PMCID: PMC4262145 DOI: 10.1186/1471-2105-15-307] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 03/02/2014] [Accepted: 09/09/2014] [Indexed: 12/31/2022]
Abstract
Background: Multibody potentials accounting for cooperative effects of molecular interactions have shown better accuracy than typical pairwise potentials. The main challenge in the development of such potentials is to find relevant structural features that characterize the tightly folded proteins. Also, the side-chains of residues adopt several specific, staggered conformations, known as rotamers within protein structures. Different molecular conformations result in different dipole moments and induce charge reorientations. However, until now modeling of the rotameric state of residues had not been incorporated into the development of multibody potentials for modeling non-bonded interactions in protein structures. Results: In this study, we develop a new multibody statistical potential which can account for the influence of rotameric states on the specificity of atomic interactions. In this potential, named “rotamer-dependent atomic statistical potential” (ROTAS), the interaction between two atoms is specified by not only the distance and relative orientation but also by two state parameters concerning the rotameric state of the residues to which the interacting atoms belong. It was clearly found that the rotameric state is correlated to the specificity of atomic interactions. Such rotamer-dependencies are not limited to specific type or certain range of interactions. The performance of ROTAS was tested using 13 sets of decoys and was compared to those of existing atomic-level statistical potentials which incorporate orientation-dependent energy terms. The results show that ROTAS performs better than other competing potentials not only in native structure recognition, but also in best model selection and correlation coefficients between energy and model quality. Conclusions: A new multibody statistical potential, ROTAS accounting for the influence of rotameric states on the specificity of atomic interactions was developed and tested on decoy sets. The results show that ROTAS has improved ability to recognize native structure from decoy models compared to other potentials. The effectiveness of ROTAS may provide insightful information for the development of many applications which require accurate side-chain modeling such as protein design, mutation analysis, and docking simulation.
Affiliation(s)
- Kazuhiro Saitou
  - Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI, USA.
|
32
|
Yugandhar K, Gromiha MM. Protein–protein binding affinity prediction from amino acid sequence. Bioinformatics 2014; 30:3583-9. [DOI: 10.1093/bioinformatics/btu580] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.8] [Indexed: 01/19/2023]
|
33
|
Proteins Feel More Than They See: Fine-Tuning of Binding Affinity by Properties of the Non-Interacting Surface. J Mol Biol 2014; 426:2632-52. [DOI: 10.1016/j.jmb.2014.04.017] [Citation(s) in RCA: 79] [Impact Index Per Article: 7.9] [Received: 11/06/2013] [Revised: 03/11/2014] [Accepted: 04/17/2014] [Indexed: 11/21/2022]
|
34
|
A functional feature analysis on diverse protein–protein interactions: application for the prediction of binding affinity. J Comput Aided Mol Des 2014; 28:619-29. [DOI: 10.1007/s10822-014-9746-y] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 01/20/2014] [Accepted: 04/26/2014] [Indexed: 11/25/2022]
|
35
|
Yugandhar K, Gromiha MM. Feature selection and classification of protein-protein complexes based on their binding affinities using machine learning approaches. Proteins 2014; 82:2088-96. [PMID: 24648146 DOI: 10.1002/prot.24564] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 01/13/2014] [Accepted: 03/14/2014] [Indexed: 12/16/2022]
Abstract
Protein-protein interactions are intrinsic to virtually every cellular process. Predicting the binding affinity of protein-protein complexes is one of the challenging problems in computational and molecular biology. In this work, we related sequence features of protein-protein complexes with their binding affinities using machine learning approaches. We set up a database of 185 protein-protein complexes for which the interacting pairs are heterodimers and their experimental binding affinities are available. On the other hand, we have developed a set of 610 features from the sequences of protein complexes and utilized Ranker search method, which is the combination of Attribute evaluator and Ranker method for selecting specific features. We have analyzed several machine learning algorithms to discriminate protein-protein complexes into high and low affinity groups based on their Kd values. Our results showed a 10-fold cross-validation accuracy of 76.1% with the combination of nine features using support vector machines. Further, we observed accuracy of 83.3% on an independent test set of 30 complexes. We suggest that our method would serve as an effective tool for identifying the interacting partners in protein-protein interaction networks and human-pathogen interactions based on the strength of interactions.
Affiliation(s)
- K Yugandhar
  - Department of Biotechnology, Indian Institute of Technology Madras, Chennai, 600036, Tamil Nadu, India
|
36
|
Yan Z, Wang J. Optimizing scoring function of protein-nucleic acid interactions with both affinity and specificity. PLoS One 2013; 8:e74443. [PMID: 24098651 PMCID: PMC3787031 DOI: 10.1371/journal.pone.0074443] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 06/01/2013] [Accepted: 08/02/2013] [Indexed: 12/14/2022]
Abstract
Protein-nucleic acid (protein-DNA and protein-RNA) recognition is fundamental to the regulation of gene expression. Determination of the structures of the protein-nucleic acid recognition and insight into their interactions at molecular level are vital to understanding the regulation function. Recently, quantitative computational approach has been becoming an alternative of experimental technique for predicting the structures and interactions of biomolecular recognition. However, the progress of protein-nucleic acid structure prediction, especially protein-RNA, is far behind that of the protein-ligand and protein-protein structure predictions due to the lack of reliable and accurate scoring function for quantifying the protein-nucleic acid interactions. In this work, we developed an accurate scoring function (named as SPA-PN, SPecificity and Affinity of the Protein-Nucleic acid interactions) for protein-nucleic acid interactions by incorporating both the specificity and affinity into the optimization strategy. Specificity and affinity are two requirements of highly efficient and specific biomolecular recognition. Previous quantitative descriptions of the biomolecular interactions considered the affinity, but often ignored the specificity owing to the challenge of specificity quantification. We applied our concept of intrinsic specificity to connect the conventional specificity, which circumvents the challenge of specificity quantification. In addition to the affinity optimization, we incorporated the quantified intrinsic specificity into the optimization strategy of SPA-PN. The testing results and comparisons with other scoring functions validated that SPA-PN performs well on both the prediction of binding affinity and identification of native conformation. In terms of its performance, SPA-PN can be widely used to predict the protein-nucleic acid structures and quantify their interactions.
Affiliation(s)
- Zhiqiang Yan
  - Department of Chemistry & Physics, State University of New York at Stony Brook, Stony Brook, New York, United States of America
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin, China
- Jin Wang
  - Department of Chemistry & Physics, State University of New York at Stony Brook, Stony Brook, New York, United States of America
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin, China
|
37
|
Moal IH, Fernandez-Recio J. Intermolecular Contact Potentials for Protein-Protein Interactions Extracted from Binding Free Energy Changes upon Mutation. J Chem Theory Comput 2013; 9:3715-27. [PMID: 26584123 DOI: 10.1021/ct400295z] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Indexed: 01/31/2023]
Abstract
Understanding and predicting the energetics of protein-protein interactions is fundamental to the structural modeling of protein complexes. Binding free energy can be approximated as a sum of pairwise atomic or residue contact energies, which are commonly inferred from contact frequencies observed in experimental protein structures. However, such statistically inferred potentials require certain assumptions and approximation. Here, we explore the possibility of deriving atomic and residue contact potentials directly from experimental binding free energy changes following mutation and present a number of such potentials. The first set of potentials is obtained by unweighted least-squares fitting and bootstrap aggregating. The second set is calculated using a weighting scheme optimized against absolute binding affinity data, so as to account for the over-representation of certain complexes, residues, and families of interactions. The congruence of the potentials with known physical chemistry is investigated. The potentials are further validated by ranking and clustering protein-protein docking poses.
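The unweighted least-squares fit with bootstrap aggregating mentioned in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each mutation is described by a vector of changes in contact counts and that the change in binding free energy is linear in those counts. The function name `bagged_contact_energies`, the array shapes, and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np

def bagged_contact_energies(delta_counts, ddg, n_boot=200, seed=0):
    """Average least-squares contact energies over bootstrap resamples.

    delta_counts: (n_mutations, n_contact_types) change in contact counts (assumed input)
    ddg:          (n_mutations,) measured change in binding free energy
    """
    rng = np.random.default_rng(seed)
    n = len(ddg)
    fits = []
    for _ in range(n_boot):
        # Resample mutations with replacement, then fit by ordinary least squares.
        idx = rng.integers(0, n, size=n)
        coef, *_ = np.linalg.lstsq(delta_counts[idx], ddg[idx], rcond=None)
        fits.append(coef)
    # Bootstrap aggregating: average the per-resample estimates.
    return np.mean(fits, axis=0)

# Synthetic, noise-free data (illustrative values only, not experimental).
rng = np.random.default_rng(1)
delta_counts = rng.integers(-3, 4, size=(60, 4)).astype(float)
true_energies = np.array([-0.5, 0.2, 1.0, -1.2])
ddg = delta_counts @ true_energies
estimated = bagged_contact_energies(delta_counts, ddg)
```

With noise-free synthetic data each resample recovers the generating energies, so the averaged estimate matches them; with real measurements the averaging would mainly serve to reduce the variance of the fit.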
Affiliation(s)
- Iain H Moal
  - Joint BSC-IRB Research Program in Computational Biology, Life Science Department, Barcelona Supercomputing Center, C/Jordi Girona 29, 08034 Barcelona, Spain
- Juan Fernandez-Recio
  - Joint BSC-IRB Research Program in Computational Biology, Life Science Department, Barcelona Supercomputing Center, C/Jordi Girona 29, 08034 Barcelona, Spain
|
38
|
Yan Z, Guo L, Hu L, Wang J. Specificity and affinity quantification of protein-protein interactions. Bioinformatics 2013; 29:1127-33. [DOI: 10.1093/bioinformatics/btt121] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Indexed: 11/13/2022]
|
39
|
Audie J, Swanson J. Advances in the Prediction of Protein-Peptide Binding Affinities: Implications for Peptide-Based Drug Discovery. Chem Biol Drug Des 2012; 81:50-60. [DOI: 10.1111/cbdd.12076] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 12/23/2022]
|
40
|
Kastritis PL, Bonvin AMJJ. On the binding affinity of macromolecular interactions: daring to ask why proteins interact. J R Soc Interface 2012; 10:20120835. [PMID: 23235262 PMCID: PMC3565702 DOI: 10.1098/rsif.2012.0835] [Citation(s) in RCA: 276] [Impact Index Per Article: 23.0] [Indexed: 01/13/2023]
Abstract
Interactions between proteins are orchestrated in a precise and time-dependent manner, underlying cellular function. The binding affinity, defined as the strength of these interactions, is translated into physico-chemical terms in the dissociation constant (Kd), the latter being an experimental measure that determines whether an interaction will be formed in solution or not. Predicting binding affinity from structural models has been a matter of active research for more than 40 years because of its fundamental role in drug development. However, all available approaches are incapable of predicting the binding affinity of protein–protein complexes from coordinates alone. Here, we examine both theoretical and experimental limitations that complicate the derivation of structure–affinity relationships. Most work so far has concentrated on binary interactions. Systems of increased complexity are far from being understood. The main physico-chemical measure that relates to binding affinity is the buried surface area, but it does not hold for flexible complexes. For the latter, there must be a significant entropic contribution that will have to be approximated in the future. We foresee that any theoretical modelling of these interactions will have to follow an integrative approach considering the biology, chemistry and physics that underlie protein–protein recognition.
Affiliation(s)
- Panagiotis L Kastritis
  - Bijvoet Center for Biomolecular Research, Faculty of Science, Chemistry, Utrecht University, Padualaan 8, Utrecht, The Netherlands
|
41
|
Vreven T, Hwang H, Pierce BG, Weng Z. Prediction of protein-protein binding free energies. Protein Sci 2012; 21:396-404. [PMID: 22238219 DOI: 10.1002/pro.2027] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.4] [Received: 10/20/2011] [Revised: 12/23/2011] [Accepted: 01/04/2012] [Indexed: 11/09/2022]
Abstract
We present an energy function for predicting binding free energies of protein-protein complexes, using the three-dimensional structures of the complex and unbound proteins as input. Our function is a linear combination of nine terms and achieves a correlation coefficient of 0.63 with experimental measurements when tested on a benchmark of 144 complexes using leave-one-out cross validation. Although we systematically tested both atomic and residue-based scoring functions, the selected function is dominated by residue-based terms. Our function is stable for subsets of the benchmark stratified by experimental pH and extent of conformational change upon complex formation, with correlation coefficients ranging from 0.61 to 0.66.
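The evaluation protocol described here, a linear combination of terms scored by leave-one-out cross validation and a correlation coefficient, can be sketched as follows. This is an illustration under stated assumptions, not the authors' energy function: the nine columns are random placeholders for the energy terms, the 144 rows only mirror the benchmark size, and the helper name `loo_predictions` is hypothetical.

```python
import numpy as np

def loo_predictions(X, y):
    """Leave-one-out predictions from a linear model with an intercept."""
    n = len(y)
    A = np.column_stack([X, np.ones(n)])  # append intercept column
    preds = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i          # hold out complex i
        coef, *_ = np.linalg.lstsq(A[mask], y[mask], rcond=None)
        preds[i] = A[i] @ coef            # predict the held-out complex
    return preds

# Synthetic stand-in data: 144 "complexes", 9 "terms" (not real energies).
rng = np.random.default_rng(0)
X = rng.normal(size=(144, 9))
weights = rng.normal(size=9)
y = X @ weights + rng.normal(scale=2.0, size=144)

preds = loo_predictions(X, y)
r = np.corrcoef(preds, y)[0, 1]           # Pearson correlation, as in the abstract
print(f"leave-one-out Pearson r = {r:.2f}")
```

Because each prediction is made by a model that never saw the held-out complex, the resulting correlation is a less optimistic estimate than one computed on the fitted data.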
Affiliation(s)
- Thom Vreven
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA
|
42
|
Moal IH, Bates PA. Kinetic rate constant prediction supports the conformational selection mechanism of protein binding. PLoS Comput Biol 2012; 8:e1002351. [PMID: 22253587 PMCID: PMC3257286 DOI: 10.1371/journal.pcbi.1002351] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Received: 10/11/2011] [Accepted: 11/29/2011] [Indexed: 12/24/2022]
Abstract
The prediction of protein-protein kinetic rate constants provides a fundamental test of our understanding of molecular recognition, and will play an important role in the modeling of complex biological systems. In this paper, a feature selection and regression algorithm is applied to mine a large set of molecular descriptors and construct simple models for association and dissociation rate constants using empirical data. Using separate test data for validation, the predicted rate constants can be combined to calculate binding affinity with accuracy matching that of state of the art empirical free energy functions. The models show that the rate of association is linearly related to the proportion of unbound proteins in the bound conformational ensemble relative to the unbound conformational ensemble, indicating that the binding partners must adopt a geometry near to that of the bound prior to binding. Mirroring the conformational selection and population shift mechanism of protein binding, the models provide a strong separate line of evidence for the preponderance of this mechanism in protein-protein binding, complementing structural and theoretical studies.
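Combining rate constants into an affinity rests on the standard relation Kd = k_off / k_on for a simple two-state binding reaction. The sketch below shows only that arithmetic; the function name `kd_from_rates` and the example rate values are illustrative assumptions and do not come from the paper.

```python
import math

def kd_from_rates(k_on, k_off):
    """Dissociation constant (M) from k_on in 1/(M*s) and k_off in 1/s."""
    if k_on <= 0 or k_off <= 0:
        raise ValueError("rate constants must be positive")
    return k_off / k_on

# Hypothetical rates: fast association, slow dissociation.
kd = kd_from_rates(k_on=1e6, k_off=1e-3)
p_kd = -math.log10(kd)  # affinity on a log scale
print(f"Kd = {kd:.1e} M, pKd = {p_kd:.1f}")
```

Errors in the two predicted rates propagate multiplicatively into Kd, which is one reason affinities are usually compared on a logarithmic scale.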
Affiliation(s)
- Iain H. Moal
  - Protein Interactions and Docking Laboratory, Life Sciences Department, Barcelona Supercomputing Center, Barcelona, Spain
- Paul A. Bates
  - Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, London, United Kingdom
|
43
|
Li X, Zhu M, Li X, Wang HQ, Wang S. Protein-Protein Binding Affinity Prediction Based on an SVR Ensemble. Lecture Notes in Computer Science 2012. [DOI: 10.1007/978-3-642-31588-6_19] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 12/12/2022]
|
44
|
Tian F, Lv Y, Yang L. Structure-based prediction of protein–protein binding affinity with consideration of allosteric effect. Amino Acids 2011; 43:531-43. [DOI: 10.1007/s00726-011-1101-1] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.2] [Received: 04/07/2011] [Accepted: 09/21/2011] [Indexed: 11/28/2022]
|
45
|
Moal IH, Agius R, Bates PA. Protein-protein binding affinity prediction on a diverse set of structures. Bioinformatics 2011; 27:3002-9. [PMID: 21903632 DOI: 10.1093/bioinformatics/btr513] [Citation(s) in RCA: 87] [Impact Index Per Article: 6.7] [Indexed: 02/11/2024]
Abstract
Motivation: Accurate binding free energy functions for protein-protein interactions are imperative for a wide range of purposes. Their construction is predicated upon ascertaining the factors that influence binding and their relative importance. A recent benchmark of binding affinities has allowed, for the first time, the evaluation and construction of binding free energy models using a diverse set of complexes, and a systematic assessment of our ability to model the energetics of conformational changes. Results: We construct a large set of molecular descriptors using commonly available tools, introducing the use of energetic factors associated with conformational changes and disorder to order transitions, as well as features calculated on structural ensembles. The descriptors are used to train and test a binding free energy model using a consensus of four machine learning algorithms, whose performance constitutes a significant improvement over the other state of the art empirical free energy functions tested. The internal workings of the learners show how the descriptors are used, illuminating the determinants of protein-protein binding. Availability: The molecular descriptor set and descriptor values for all complexes are available in the Supplementary Material. A web server for the learners and coordinates for the bound and unbound structures can be accessed from the website: http://bmm.cancerresearchuk.org/~Affinity. Contact: paul.bates@cancer.org.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Iain H Moal
  - Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, London WC2A 3LY, UK
|
46
|
Melquiond AS, Karaca E, Kastritis PL, Bonvin AM. Next challenges in protein-protein docking: from proteome to interactome and beyond. Wiley Interdisciplinary Reviews: Computational Molecular Science 2011. [DOI: 10.1002/wcms.91] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 11/11/2022]
|
47
|
Liu S, Vakser IA. DECK: Distance and environment-dependent, coarse-grained, knowledge-based potentials for protein-protein docking. BMC Bioinformatics 2011; 12:280. [PMID: 21745398 PMCID: PMC3145612 DOI: 10.1186/1471-2105-12-280] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.4] [Received: 03/18/2011] [Accepted: 07/11/2011] [Indexed: 11/13/2022]
Abstract
Background: Computational approaches to protein-protein docking typically include scoring aimed at improving the rank of the near-native structure relative to the false-positive matches. Knowledge-based potentials improve modeling of protein complexes by taking advantage of the rapidly increasing amount of experimentally derived information on protein-protein association. An essential element of knowledge-based potentials is defining the reference state for an optimal description of the residue-residue (or atom-atom) pairs in the non-interaction state. Results: The study presents a new Distance- and Environment-dependent, Coarse-grained, Knowledge-based (DECK) potential for scoring of protein-protein docking predictions. Training sets of protein-protein matches were generated based on bound and unbound forms of proteins taken from the DOCKGROUND resource. Each residue was represented by a pseudo-atom in the geometric center of the side chain. To capture the long-range and the multi-body interactions, residues in different secondary structure elements at protein-protein interfaces were considered as different residue types. Five reference states for the potentials were defined and tested. The optimal reference state was selected and the cutoff effect on the distance-dependent potentials investigated. The potentials were validated on the docking decoys sets, showing better performance than the existing potentials used in scoring of protein-protein docking results. Conclusions: A novel residue-based statistical potential for protein-protein docking was developed and validated on docking decoy sets. The results show that the scoring function DECK can successfully identify near-native protein-protein matches and thus is useful in protein docking. In addition to the practical application of the potentials, the study provides insights into the relative utility of the reference states, the scope of the distance dependence, and the coarse-graining of the potentials.
Affiliation(s)
- Shiyong Liu
  - Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China
|
48
|
Kastritis PL, Moal IH, Hwang H, Weng Z, Bates PA, Bonvin AMJJ, Janin J. A structure-based benchmark for protein-protein binding affinity. Protein Sci 2011; 20:482-91. [PMID: 21213247 PMCID: PMC3064828 DOI: 10.1002/pro.580] [Citation(s) in RCA: 219] [Impact Index Per Article: 16.8] [Received: 11/17/2010] [Revised: 12/15/2010] [Accepted: 12/16/2010] [Indexed: 11/06/2022]
Abstract
We have assembled a nonredundant set of 144 protein-protein complexes that have high-resolution structures available for both the complexes and their unbound components, and for which dissociation constants have been measured by biophysical methods. The set is diverse in terms of the biological functions it represents, with complexes that involve G-proteins and receptor extracellular domains, as well as antigen/antibody, enzyme/inhibitor, and enzyme/substrate complexes. It is also diverse in terms of the partners' affinity for each other, with Kd ranging between 10⁻⁵ and 10⁻¹⁴ M. Nine pairs of entries represent closely related complexes that have a similar structure, but a very different affinity, each pair comprising a cognate and a noncognate assembly. The unbound structures of the component proteins being available, conformation changes can be assessed. They are significant in most of the complexes, and large movements or disorder-to-order transitions are frequently observed. The set may be used to benchmark biophysical models aiming to relate affinity to structure in protein-protein interactions, taking into account the reactants and the conformation changes that accompany the association reaction, instead of just the final product.
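To give a sense of scale for the affinity range quoted above, a dissociation constant can be converted into a binding free energy with the standard relation ΔG = RT ln(Kd / c°). The sketch below assumes a 1 M standard state and a temperature of 298.15 K; both values, and the function name `kd_to_dg`, are illustrative choices rather than settings reported for the benchmark.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def kd_to_dg(kd_molar, temperature_k=298.15):
    """Binding free energy (kcal/mol) from Kd in mol/L, 1 M standard state assumed."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)

# The two ends of the Kd range quoted in the abstract.
for kd in (1e-5, 1e-14):
    print(f"Kd = {kd:.0e} M -> dG = {kd_to_dg(kd):.1f} kcal/mol")
```

Under these assumptions the nine orders of magnitude in Kd correspond to a span of roughly 12 kcal/mol in binding free energy.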
Affiliation(s)
- Panagiotis L Kastritis
  - Bijvoet Center for Biomolecular Research, Faculty of Science, Utrecht University, 3584CH Utrecht, The Netherlands
- Iain H Moal
  - Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, Lincoln's Inn Fields Laboratories, London WC2A 3LY, United Kingdom
- Howook Hwang
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01605
- Zhiping Weng
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01605
- Paul A Bates
  - Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, Lincoln's Inn Fields Laboratories, London WC2A 3LY, United Kingdom
- Alexandre M J J Bonvin
  - Bijvoet Center for Biomolecular Research, Faculty of Science, Utrecht University, 3584CH Utrecht, The Netherlands
- Joël Janin
  - Yeast Structural Genomics, IBBMC UMR 8619, Université Paris-Sud, 91405 Orsay, France
|
49
|
Hamelryck T, Borg M, Paluszewski M, Paulsen J, Frellsen J, Andreetta C, Boomsma W, Bottaro S, Ferkinghoff-Borg J. Potentials of mean force for protein structure prediction vindicated, formalized and generalized. PLoS One 2010; 5:e13714. [PMID: 21103041 PMCID: PMC2978081 DOI: 10.1371/journal.pone.0013714] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.9] [Received: 07/07/2010] [Accepted: 10/04/2010] [Indexed: 11/26/2022]
Abstract
Understanding protein structure is of crucial importance in science, medicine and biotechnology. For about two decades, knowledge-based potentials based on pairwise distances – so-called “potentials of mean force” (PMFs) – have been center stage in the prediction and design of protein structure and the simulation of protein folding. However, the validity, scope and limitations of these potentials are still vigorously debated and disputed, and the optimal choice of the reference state – a necessary component of these potentials – is an unsolved problem. PMFs are loosely justified by analogy to the reversible work theorem in statistical physics, or by a statistical argument based on a likelihood function. Both justifications are insightful but leave many questions unanswered. Here, we show for the first time that PMFs can be seen as approximations to quantities that do have a rigorous probabilistic justification: they naturally arise when probability distributions over different features of proteins need to be combined. We call these quantities “reference ratio distributions” deriving from the application of the “reference ratio method.” This new view is not only of theoretical relevance but leads to many insights that are of direct practical use: the reference state is uniquely defined and does not require external physical insights; the approach can be generalized beyond pairwise distances to arbitrary features of protein structure; and it becomes clear for which purposes the use of these quantities is justified. We illustrate these insights with two applications, involving the radius of gyration and hydrogen bonding. In the latter case, we also show how the reference ratio method can be iteratively applied to sculpt an energy funnel. Our results considerably increase the understanding and scope of energy functions derived from known biomolecular structures.
Affiliation(s)
- Thomas Hamelryck
  - Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Mikael Borg
  - Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Martin Paluszewski
  - Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Jonas Paulsen
  - Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Jes Frellsen
  - Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Christian Andreetta
  - Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Wouter Boomsma
  - Biomedical Engineering, Technical University of Denmark (DTU) Elektro, Technical University of Denmark, Lyngby, Denmark
  - Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
- Sandro Bottaro
  - Biomedical Engineering, Technical University of Denmark (DTU) Elektro, Technical University of Denmark, Lyngby, Denmark
- Jesper Ferkinghoff-Borg
  - Biomedical Engineering, Technical University of Denmark (DTU) Elektro, Technical University of Denmark, Lyngby, Denmark
|
50
|
A Residual Level Potential of Mean Force Based Approach to Predict Protein-Protein Interaction Affinity. Lecture Notes in Computer Science 2010. [DOI: 10.1007/978-3-642-14922-1_85] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 01/31/2023]
|