51
Zhang J, Ghadermarzi S, Kurgan L. Prediction of protein-binding residues: dichotomy of sequence-based methods developed using structured complexes versus disordered proteins. Bioinformatics 2021; 36:4729-4738. [PMID: 32860044 DOI: 10.1093/bioinformatics/btaa573] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 03/07/2020] [Revised: 05/22/2020] [Accepted: 06/10/2020] [Indexed: 01/08/2023]
Abstract
MOTIVATION There are over 30 sequence-based predictors of protein-binding residues (PBRs). They use either structure-annotated or disorder-annotated training datasets, potentially creating a dichotomy where the structure-/disorder-specific models may not be able to cross over to accurately predict the other type. Moreover, the structure-trained predictors were shown to substantially cross-predict PBRs among residues that interact with non-protein partners (nucleic acids and small ligands). We address these issues by performing a first-of-its-kind comparative study of a representative collection of disorder- and structure-trained predictors using a comprehensive benchmark set with the structure- and disorder-derived annotations of PBRs (to analyze the cross-over) and the protein-, nucleic acid- and small ligand-binding proteins (to study the cross-predictions). RESULTS Three predictors provide accurate results: SCRIBER, ANCHOR and disoRDPbind. Some of the structure-trained methods make accurate predictions on the structure-annotated proteins. Similarly, the disorder-trained predictors predict well on the disorder-annotated proteins. However, the considered predictors generally fail to cross over, with the exception of SCRIBER. Our study also reveals that virtually all methods substantially cross-predict PBRs, except for SCRIBER for the structure-annotated proteins and disoRDPbind for the disorder-annotated proteins. We formulate a novel hybrid predictor, hybridPBRpred, that combines results produced by disoRDPbind and SCRIBER to accurately predict disorder- and structure-annotated PBRs. HybridPBRpred generates accurate results that cross over structure- and disorder-annotated proteins and produces a relatively low number of cross-predictions, offering an accurate alternative to predict PBRs. AVAILABILITY AND IMPLEMENTATION HybridPBRpred webserver, benchmark dataset and supplementary information are available at http://biomine.cs.vcu.edu/servers/hybridPBRpred/.
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jian Zhang
- School of Computer and Information Technology, Xinyang Normal University, Xinyang 464000, China
- Sina Ghadermarzi
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
52
McCafferty CL, Marcotte EM, Taylor DW. Simplified geometric representations of protein structures identify complementary interaction interfaces. Proteins 2021; 89:348-360. [PMID: 33140424 PMCID: PMC7855953 DOI: 10.1002/prot.26020] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/21/2020] [Revised: 09/22/2020] [Accepted: 10/25/2020] [Indexed: 12/12/2022]
Abstract
Protein-protein interactions are critical to protein function, but three-dimensional (3D) arrangements of interacting proteins have proven hard to predict, even given the identities and 3D structures of the interacting partners. Specifically, identifying the relevant pairwise interaction surfaces remains difficult, often relying on shape complementarity with molecular docking while accounting for molecular motions to optimize rigid 3D translations and rotations. However, such approaches can be computationally expensive, and faster, less accurate approximations may prove useful for large-scale prediction and assembly of 3D structures of multi-protein complexes. We asked if a reduced representation of protein geometry retains enough information about molecular properties to predict pairwise protein interaction interfaces that are tolerant of limited structural rearrangements. Here, we describe a reduced representation of 3D protein accessible surfaces on which molecular properties such as charge, hydrophobicity, and evolutionary rate can be easily mapped, implemented in the MorphProt package. Pairs of surfaces are compared to rapidly assess partner-specific potential surface complementarity. On two available benchmarks of 185 overall known protein complexes, we observe predictions comparable to other structure-based tools at correctly identifying protein interaction surfaces. Furthermore, we examined the effect of molecular motion through normal mode simulation on a benchmark receptor-ligand pair and observed no marked loss of predictive accuracy for distortions of up to 6 Å Cα-RMSD. Thus, a shape reduction of protein surfaces retains considerable information about surface complementarity, offers enhanced speed of comparison relative to more complex geometric representations, and exhibits tolerance to conformational changes.
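For readers unfamiliar with the metric, the Cα-RMSD quoted in the abstract above is the root-mean-square deviation between matched Cα atom positions of two conformations. The following is a minimal sketch, assuming the two coordinate lists are already superimposed and listed in the same residue order; the function name `ca_rmsd` and the toy coordinates are illustrative and are not taken from MorphProt.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in the input units, e.g. angstroms)
    between two equally long lists of (x, y, z) C-alpha coordinates.

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example (hypothetical coordinates): every atom is shifted by 3 units in z.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
distorted = [(0.0, 0.0, 3.0), (3.8, 0.0, 3.0)]
print(ca_rmsd(reference, distorted))
```

In practice the two conformations would first be optimally superimposed, which this sketch deliberately leaves out.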
Affiliation(s)
- Caitlyn L. McCafferty
- Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas, USA
- Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, Texas, USA
- Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas, USA
- Edward M. Marcotte
- Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas, USA
- Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, Texas, USA
- Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas, USA
- David W. Taylor
- Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas, USA
- Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, Texas, USA
- Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas, USA
- LIVESTRONG Cancer Institutes, Dell Medical School, Austin, Texas, USA
53
Bitencourt-Ferreira G, Duarte da Silva A, Filgueira de Azevedo W. Application of Machine Learning Techniques to Predict Binding Affinity for Drug Targets: A Study of Cyclin-Dependent Kinase 2. Curr Med Chem 2021; 28:253-265. [PMID: 31729287 DOI: 10.2174/2213275912666191102162959] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 06/17/2019] [Revised: 08/22/2019] [Accepted: 09/24/2019] [Indexed: 11/22/2022]
Abstract
BACKGROUND The elucidation of the structure of cyclin-dependent kinase 2 (CDK2) made it possible to develop targeted scoring functions for virtual screening aimed at identifying new inhibitors for this enzyme. CDK2 is a protein target for the development of drugs intended to modulate cell-cycle progression and control. Such drugs have potential anticancer activities. OBJECTIVE Our goal here is to review recent applications of machine learning methods to predict ligand-binding affinity for protein targets. To assess the predictive performance of classical scoring functions and targeted scoring functions, we focused our analysis on CDK2 structures. METHODS We have experimental structural data for hundreds of binary complexes of CDK2 with different ligands, many of them with inhibition constant information. We investigate here computational methods to calculate the binding affinity of CDK2 through classical scoring functions and machine-learning models. RESULTS Analysis of the predictive performance of classical scoring functions available in docking programs such as Molegro Virtual Docker, AutoDock4, and AutoDock Vina indicated that these methods failed to predict binding affinity with significant correlation with experimental data. Targeted scoring functions developed through supervised machine learning techniques showed a significant correlation with experimental data. CONCLUSION Here, we described the application of supervised machine learning techniques to generate a scoring function to predict binding affinity. Machine learning models showed superior predictive performance when compared with classical scoring functions. Analysis of the computational models obtained through machine learning could capture essential structural features responsible for binding affinity against CDK2.
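The comparison described in the abstract above rests on how well predicted binding affinities correlate with experimental ones. As a rough illustration only, the sketch below computes a Pearson correlation coefficient in plain Python. The abstract does not say which correlation measure the authors used, so Pearson is an assumption here, and the function name `pearson_r` and all affinity values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only (not data from the paper).
experimental_affinity = [5.2, 6.1, 7.4, 8.0]
predicted_affinity = [5.0, 6.3, 7.1, 8.2]
print(round(pearson_r(experimental_affinity, predicted_affinity), 3))
```

A value near 1 would indicate that a scoring function ranks and scales ligands much like the experiment does; a value near 0 would indicate no linear relationship.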
Affiliation(s)
- Gabriela Bitencourt-Ferreira
- Laboratory of Computational Systems Biology. Pontifical Catholic University of Rio Grande do Sul (PUCRS). Av. Ipiranga, 6681 Porto Alegre/RS 90619-900, Brazil
- Amauri Duarte da Silva
- Specialization Program in Bioinformatics. Pontifical Catholic University of Rio Grande do Sul (PUCRS). Av. Ipiranga, 6681 Porto Alegre/RS 90619-900, Brazil
- Walter Filgueira de Azevedo
- Laboratory of Computational Systems Biology. Pontifical Catholic University of Rio Grande do Sul (PUCRS). Av. Ipiranga, 6681 Porto Alegre/RS 90619-900, Brazil
54
Zhang F, Shi W, Zhang J, Zeng M, Li M, Kurgan L. PROBselect: accurate prediction of protein-binding residues from proteins sequences via dynamic predictor selection. Bioinformatics 2020; 36:i735-i744. [DOI: 10.1093/bioinformatics/btaa806] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Accepted: 09/07/2020] [Indexed: 12/13/2022]
Abstract
Motivation
Knowledge of protein-binding residues (PBRs) improves our understanding of protein−protein interactions, contributes to the prediction of protein functions and facilitates protein−protein docking calculations. While many sequence-based predictors of PBRs were published, they offer modest levels of predictive performance and most of them cross-predict residues that interact with other partners. One unexplored option to improve the predictive quality is to design consensus predictors that combine results produced by multiple methods.
Results
We empirically investigate the predictive performance of a representative set of nine predictors of PBRs. We report substantial differences in predictive quality when these methods are used to predict individual proteins, which contrast with the dataset-level benchmarks that are currently used to assess and compare these methods. Our analysis provides new insights into the cross-prediction concern, dissects complementarity between predictors and demonstrates that the predictive performance of the top methods depends on unique characteristics of the input protein sequence. Using these insights, we developed PROBselect, a first-of-its-kind consensus predictor of PBRs. Our design is based on dynamic predictor selection at the protein level, where the selection relies on regression-based models that accurately estimate the predictive performance of selected predictors directly from the sequence. Empirical assessment using a low-similarity test dataset shows that PROBselect provides significantly improved predictive quality when compared with the current predictors and conventional consensuses that combine residue-level predictions. Moreover, PROBselect informs the users about the expected predictive quality for the prediction generated from a given input protein.
Availability and implementation
PROBselect is available at http://bioinformatics.csu.edu.cn/PROBselect/home/index.
Supplementary information
Supplementary data are available at Bioinformatics online.
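The protein-level dynamic predictor selection described in the Results above can be pictured with a small sketch: for one input protein, each candidate predictor has an estimated quality, and the residue-level output of the best-rated predictor is returned. This is only an illustration under stated assumptions. The predictor names, quality estimates, and scores below are hypothetical placeholders, and the regression models that PROBselect uses to produce the estimates from the sequence are not reproduced here.

```python
def select_predictor(estimated_quality):
    """Return the name of the predictor with the highest estimated quality
    for one protein. `estimated_quality` maps predictor name -> estimate."""
    if not estimated_quality:
        raise ValueError("no predictors to choose from")
    return max(estimated_quality, key=estimated_quality.get)


def predict_with_selection(estimated_quality, residue_scores):
    """Pick one predictor for this protein and return its per-residue scores,
    together with the name of the chosen predictor."""
    chosen = select_predictor(estimated_quality)
    return chosen, residue_scores[chosen]


# Hypothetical inputs for a three-residue protein (illustration only).
estimated_quality = {"predictor_A": 0.61, "predictor_B": 0.74, "predictor_C": 0.58}
residue_scores = {
    "predictor_A": [0.1, 0.4, 0.8],
    "predictor_B": [0.2, 0.7, 0.9],
    "predictor_C": [0.3, 0.3, 0.5],
}
chosen, scores = predict_with_selection(estimated_quality, residue_scores)
print(chosen, scores)
```

Unlike a conventional consensus that averages residue-level predictions, this design returns one predictor's output unchanged, so the estimated quality of the chosen predictor can also be reported to the user.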
Affiliation(s)
- Fuhao Zhang
- Hunan Provincial Key Laboratory on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Wenbo Shi
- Hunan Provincial Key Laboratory on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Jian Zhang
- School of Computer and Information Technology, Xinyang Normal University, Xinyang 464000, China
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Min Zeng
- Hunan Provincial Key Laboratory on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Min Li
- Hunan Provincial Key Laboratory on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
55
Milanetti E, Miotto M, Di Rienzo L, Monti M, Gosti G, Ruocco G. 2D Zernike polynomial expansion: Finding the protein-protein binding regions. Comput Struct Biotechnol J 2020; 19:29-36. [PMID: 33363707 PMCID: PMC7750141 DOI: 10.1016/j.csbj.2020.11.051] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 09/04/2020] [Revised: 11/26/2020] [Accepted: 11/28/2020] [Indexed: 01/26/2023]
Abstract
We present a method for efficiently and effectively assessing whether and where two proteins can interact with each other to form a complex. This is still largely an open problem, even for those relatively few cases where the 3D structure of both proteins is known. In fact, even if much of the information about the interaction is encoded in the chemical and geometric features of the structures, the set of possible contact patches and of their relative orientations is too large to be computationally affordable in a reasonable time, thus preventing the compilation of a reliable interactome. Our method is able to rapidly and quantitatively measure the geometrical shape complementarity between interacting proteins by comparing their molecular iso-electron density surfaces, expanding the surface patches in terms of 2D Zernike polynomials. We first test the method against the real binding region of a large dataset of known protein complexes, reaching a success rate of 0.72. We then apply the method for the blind recognition of binding sites, identifying the real region of interaction in about 60% of the analyzed cases. Finally, we investigate how the efficiency in finding the right binding region depends on the surface roughness as a function of the expansion order.
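To make the 2D Zernike expansion mentioned above more concrete, the sketch below evaluates the standard radial part of a Zernike polynomial and compares two descriptor vectors by Euclidean distance. The radial formula is the textbook definition; comparing patches by the distance between coefficient vectors is an assumption made here for illustration, and the function names and descriptor values are hypothetical rather than taken from the authors' code.

```python
import math
from math import factorial


def zernike_radial(n, m, r):
    """Radial Zernike polynomial R_n^m(r) for 0 <= r <= 1.

    Returns 0.0 when |m| > n or n - |m| is odd, where the polynomial vanishes.
    """
    m = abs(m)
    if m > n or (n - m) % 2:
        return 0.0
    total = 0.0
    for k in range((n - m) // 2 + 1):
        coefficient = ((-1) ** k * factorial(n - k)) / (
            factorial(k)
            * factorial((n + m) // 2 - k)
            * factorial((n - m) // 2 - k)
        )
        total += coefficient * r ** (n - 2 * k)
    return total


def descriptor_distance(descriptor_a, descriptor_b):
    """Euclidean distance between two equally long descriptor vectors
    (e.g. magnitudes of expansion coefficients for two surface patches)."""
    return math.dist(descriptor_a, descriptor_b)


print(zernike_radial(2, 0, 0.5))   # R_2^0(r) = 2 r^2 - 1
print(descriptor_distance([0.0, 3.0], [4.0, 0.0]))
```

A higher expansion order `n` captures finer surface detail, which is why the abstract relates the expansion order to surface roughness.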
Affiliation(s)
- Edoardo Milanetti
- Department of Physics, Sapienza University, Piazzale Aldo Moro 5, 00185 Rome, Italy; Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
- Mattia Miotto
- Department of Physics, Sapienza University, Piazzale Aldo Moro 5, 00185 Rome, Italy; Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
- Lorenzo Di Rienzo
- Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
- Michele Monti
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain; RNA System Biology Lab, Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Via Morego 30, 16163 Genoa, Italy
- Giorgio Gosti
- Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
- Giancarlo Ruocco
- Department of Physics, Sapienza University, Piazzale Aldo Moro 5, 00185 Rome, Italy; Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
56
Chen G, Seukep AJ, Guo M. Recent Advances in Molecular Docking for the Research and Discovery of Potential Marine Drugs. Mar Drugs 2020; 18:md18110545. [PMID: 33143025 PMCID: PMC7692358 DOI: 10.3390/md18110545] [Citation(s) in RCA: 69] [Impact Index Per Article: 17.3] [Received: 09/16/2020] [Revised: 10/27/2020] [Accepted: 10/28/2020] [Indexed: 12/28/2022]
Abstract
Marine drugs have long been used and exhibit unique advantages in clinical practice. Among the marine drugs that have been approved by the Food and Drug Administration (FDA), protein–ligand interactions, such as the cytarabine–DNA polymerase, vidarabine–adenylyl cyclase, and eribulin–tubulin complexes, are important mechanisms of action underlying their efficacy. However, the complex and multi-targeted components in marine medicinal resources, their bio-active chemical basis, and their mechanisms of action have so far posed huge challenges for the discovery and development of marine drugs, and they need to be systematically investigated in depth. Molecular docking can effectively predict the binding mode and binding energy of protein–ligand complexes and has become a major method of computer-aided drug design (CADD). This powerful tool has therefore been widely used in many aspects of the research on marine drugs. This review introduces the basic principles and software of molecular docking and summarizes the applications of this method in marine drug discovery and design, including early virtual screening in the drug discovery stage, drug target discovery, potential mechanisms of action, and the prediction of drug metabolism. In addition, this review discusses the problems and prospects of molecular docking, in order to provide a stronger theoretical basis for clinical practice and for new marine drug research and development.
Affiliation(s)
- Guilin Chen
- Key Laboratory of Plant Germplasm Enhancement & Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China; (G.C.); (A.J.S.)
- Sino-Africa Joint Research Center, Chinese Academy of Sciences, Wuhan 430074, China
- Innovation Academy for Drug Discovery and Development, Chinese Academy of Sciences, Shanghai 201203, China
- Armel Jackson Seukep
- Key Laboratory of Plant Germplasm Enhancement & Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China; (G.C.); (A.J.S.)
- Sino-Africa Joint Research Center, Chinese Academy of Sciences, Wuhan 430074, China
- Innovation Academy for Drug Discovery and Development, Chinese Academy of Sciences, Shanghai 201203, China
- Department of Biomedical Sciences, Faculty of Health Sciences, University of Buea, P.O. Box 63 Buea, Cameroon
- Mingquan Guo
- Key Laboratory of Plant Germplasm Enhancement & Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China; (G.C.); (A.J.S.)
- Sino-Africa Joint Research Center, Chinese Academy of Sciences, Wuhan 430074, China
- Innovation Academy for Drug Discovery and Development, Chinese Academy of Sciences, Shanghai 201203, China
- Correspondence: ; Tel.: +86-27-8770-0850
57
A Two-Layer SVM Ensemble-Classifier to Predict Interface Residue Pairs of Protein Trimers. MOLECULES (BASEL, SWITZERLAND) 2020; 25:molecules25194353. [PMID: 32977371 PMCID: PMC7582526 DOI: 10.3390/molecules25194353] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/11/2020] [Revised: 09/16/2020] [Accepted: 09/18/2020] [Indexed: 11/29/2022]
Abstract
The study of interface residue pairs is important for understanding the interactions between monomers inside a trimer protein–protein complex. We developed a two-layer support vector machine (SVM) ensemble-classifier that considers physicochemical and geometric properties of amino acids and the influence of surrounding amino acids. Different descriptors and different combinations may give different prediction results. We propose feature combination engineering based on correlation coefficients and F-values. The accuracy of our method is 65.38% on an independent test set, indicating biological significance. Our predictions are consistent with the experimental results. This shows the effectiveness and reliability of our method for predicting interface residue pairs of protein trimers.
58
Han Y, Cheng L, Sun W. Analysis of Protein-Protein Interaction Networks through Computational Approaches. Protein Pept Lett 2020; 27:265-278. [PMID: 31692419 DOI: 10.2174/0929866526666191105142034] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/29/2019] [Revised: 05/08/2019] [Accepted: 09/26/2019] [Indexed: 01/02/2023]
Abstract
The interactions among proteins and genes are extremely important for cellular functions. Molecular interactions at the protein or gene level can be used to construct interaction networks in which the interacting species are categorized based on direct interactions or functional similarities. Compared with the limited experimental techniques, various computational tools make it possible to analyze, filter, and combine the interaction data to obtain comprehensive information about biological pathways. By efficiently integrating experimental findings on protein-protein interactions (PPIs) with computational prediction techniques, researchers have been able to gain much valuable data on PPIs, including some advanced databases. Moreover, many useful tools and visualization programs enable researchers to establish, annotate, and analyze biological networks. Here we review and list the computational methods, databases, and tools for protein-protein interaction prediction.
Affiliation(s)
- Ying Han
- Cardiovascular Department, The Fourth Affiliated Hospital of Harbin Medical University, Harbin, China
- Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Weiju Sun
- Cardiovascular Department, The First Affiliated Hospital of Harbin Medical University, Harbin, China
59
Sun D, Gong X. Tetramer protein complex interface residue pairs prediction with LSTM combined with graph representations. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2020; 1868:140504. [PMID: 32717382 DOI: 10.1016/j.bbapap.2020.140504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2020] [Revised: 06/30/2020] [Accepted: 07/16/2020] [Indexed: 10/23/2022]
Abstract
MOTIVATION Protein-protein interactions are important for many biological processes. A theoretical understanding of the structural determinants of interaction sites helps to explain the underlying mechanism of protein-protein interactions, and advanced mathematical methods that correctly predict interaction sites are therefore useful. Although previous studies have addressed the interaction interfaces of protein monomers and the interface residues between the chains of protein dimers, very few have addressed interface residue prediction for protein multimers, including trimers, tetramers and larger complexes with even more monomers. Many proteins function as multibody protein complexes, and the structural complexity of these multimers makes their interface residues difficult to predict. We therefore set out to build a method for predicting the interface residue pairs of protein tetramers. RESULTS We developed a new deep network that combines a Long Short-Term Memory (LSTM) network with a graph representation to predict the interface residue pairs of protein tetramers. Protein structure data differ from image or video data, which are well-arranged matrices (the Euclidean structure mentioned in many studies). Because non-Euclidean data do not preserve translation invariance, and because we wanted to extract spatial features from such data for deep learning, we developed an algorithm that predicts interface residue pairs from a topological graph, which relates vertices and edges as in graph theory, combined with a multilayer LSTM network. First, we selected the training and test samples from the Protein Data Bank and extracted the physicochemical and geometric features of surface residues associated with interfacial properties. Subsequently, we transformed the protein multimer data into topological graphs and predicted the interface residue pairs using the model. Several types of evaluation indicators verified the validity of the model.
Affiliation(s)
- Daiwen Sun
- Mathematics Intelligence Application LAB, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, PR China
- Xinqi Gong
- Mathematics Intelligence Application LAB, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, PR China; Beijing Advanced Innovation Center for Structural Biology, Tsinghua University, Beijing 100091, PR China.
60
Savojardo C, Martelli PL, Casadio R. Protein–Protein Interaction Methods and Protein Phase Separation. Annu Rev Biomed Data Sci 2020. [DOI: 10.1146/annurev-biodatasci-011720-104428] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/09/2022]
Abstract
In the last decade, newly developed experimental methods have made it possible to highlight that macromolecules in the cell milieu physically interact to support physiology. This has shifted the problem of protein–protein interaction from a microscopic, electron-density scale to a mesoscopic one. Further, there is now increasing evidence that proteins in the nucleus and in the cytoplasm can aggregate in membraneless organelles for different physiological reasons. In this scenario, it is urgent to face the problem of biomolecule functional annotation with efficient computational methods, suited to extract knowledge from reliable data and transfer information across different domains of investigation. Here, we review the present state of the art of our knowledge of protein–protein interaction and the computational methods that differently implement it. Furthermore, we explore experimental and computational features of a set of proteins involved in phase separation.
Affiliation(s)
- Castrense Savojardo
- Biocomputing Group, Department of Pharmacy and Biotechnology and Interdepartmental Center “Luigi Galvani” for Integrated Studies of Bioinformatics, Biophysics, and Biocomplexity, University of Bologna, 40126 Bologna, Italy
- Pier Luigi Martelli
- Biocomputing Group, Department of Pharmacy and Biotechnology and Interdepartmental Center “Luigi Galvani” for Integrated Studies of Bioinformatics, Biophysics, and Biocomplexity, University of Bologna, 40126 Bologna, Italy
- Rita Casadio
- Biocomputing Group, Department of Pharmacy and Biotechnology and Interdepartmental Center “Luigi Galvani” for Integrated Studies of Bioinformatics, Biophysics, and Biocomplexity, University of Bologna, 40126 Bologna, Italy
- Institute of Biomembranes, Bioenergetics, and Molecular Biotechnologies (IBIOM), Italian National Research Council (CNR), 70126 Bari, Italy
61
Andreani J, Quignot C, Guerois R. Structural prediction of protein interactions and docking using conservation and coevolution. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE 2020. [DOI: 10.1002/wcms.1470] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/12/2022]
Affiliation(s)
- Jessica Andreani
- Université Paris‐Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), Gif‐sur‐Yvette, France
- Chloé Quignot
- Université Paris‐Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), Gif‐sur‐Yvette, France
- Raphael Guerois
- Université Paris‐Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), Gif‐sur‐Yvette, France
62
Barreto CAV, Baptista SJ, Preto AJ, Matos-Filipe P, Mourão J, Melo R, Moreira I. Prediction and targeting of GPCR oligomer interfaces. PROGRESS IN MOLECULAR BIOLOGY AND TRANSLATIONAL SCIENCE 2020; 169:105-149. [PMID: 31952684 DOI: 10.1016/bs.pmbts.2019.11.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/18/2022]
Abstract
GPCR oligomerization has emerged as a hot topic in the GPCR field in recent years. Receptors that are part of these oligomers can influence each other's function, although it is not yet entirely understood how these interactions work. The existence of such a highly complex network of interactions between GPCRs generates the possibility of alternative targets for new therapeutic approaches. However, challenges still exist in the characterization of these complexes, especially at the interface level. Different experimental approaches, such as FRET or BRET, are usually combined to study GPCR oligomer interactions. Computational methods have been applied as a useful tool for retrieving information from GPCR sequences and the few X-ray-resolved oligomeric structures that are accessible, as well as for predicting new and trustworthy GPCR oligomeric interfaces. Machine-learning (ML) approaches have recently helped to overcome some of the limitations of other methods. By combining and evaluating multiple structure-, sequence- and co-evolution-based features in the same algorithm, it is possible to dilute the issues of particular structures and residues that arise from the experimental methodology into all-encompassing algorithms capable of accurately predicting GPCR-GPCR interfaces. All these methods, used alone or in combination, provide useful information about GPCR oligomerization and its role in GPCR function and dynamics. Altogether, we present experimental, computational and machine-learning methods used to study oligomer interfaces, as well as strategies that have been used to target these dynamic complexes.
Affiliation(s)
- Carlos A V Barreto
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal
- Salete J Baptista
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal; Centro de Ciências e Tecnologias Nucleares, Instituto Superior Técnico, Universidade de Lisboa, CTN, LRS, Portugal
- António José Preto
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal
- Pedro Matos-Filipe
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal
- Joana Mourão
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal; Institute for Interdisciplinary Research, University of Coimbra, Coimbra, Portugal
- Rita Melo
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal; Centro de Ciências e Tecnologias Nucleares, Instituto Superior Técnico, Universidade de Lisboa, CTN, LRS, Portugal
- Irina Moreira
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal; Science and Technology Faculty, University of Coimbra, Coimbra, Portugal.
63
Geng C, Jung Y, Renaud N, Honavar V, Bonvin AMJJ, Xue LC. iScore: a novel graph kernel-based function for scoring protein-protein docking models. Bioinformatics 2020; 36:112-121. [PMID: 31199455 PMCID: PMC6956772 DOI: 10.1093/bioinformatics/btz496] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 12/18/2018] [Revised: 05/08/2019] [Accepted: 06/11/2019] [Indexed: 11/12/2022]
Abstract
MOTIVATION Protein complexes play critical roles in many aspects of biological functions. Three-dimensional (3D) structures of protein complexes are critical for gaining insights into structural bases of interactions and their roles in the biomolecular pathways that orchestrate key cellular processes. Because of the expense and effort associated with experimental determinations of 3D protein complex structures, computational docking has evolved as a valuable tool to predict 3D structures of biomolecular complexes. Despite recent progress, reliably distinguishing near-native docking conformations from a large number of candidate conformations, the so-called scoring problem, remains a major challenge. RESULTS Here we present iScore, a novel approach to scoring docked conformations that combines HADDOCK energy terms with a score obtained using a graph representation of the protein-protein interfaces and a measure of evolutionary conservation. It achieves a scoring performance competitive with, or superior to, that of state-of-the-art scoring functions on two independent datasets: (i) docking software-specific models and (ii) the CAPRI score set generated by a wide variety of docking approaches (i.e. docking software-non-specific). iScore ranks among the top scoring approaches on the CAPRI score set (13 targets) when compared with the 37 scoring groups in CAPRI. The results demonstrate the utility of combining evolutionary, topological and energetic information for scoring docked conformations. This work represents the first successful application of graph kernels to protein interfaces for effective discrimination of near-native and non-native conformations of protein complexes. AVAILABILITY AND IMPLEMENTATION The iScore code is freely available from GitHub: https://github.com/DeepRank/iScore (DOI: 10.5281/zenodo.2630567). The docking models used are available from SBGrid: https://data.sbgrid.org/dataset/684.
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Cunliang Geng
- Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht 3584 CH, The Netherlands
- Yong Jung
- Bioinformatics & Genomics Graduate Program, Pennsylvania State University, University Park, PA 16802, USA
- Artificial Intelligence Research Laboratory, Pennsylvania State University, University Park, PA 16823, USA
- Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, USA
- Nicolas Renaud
- Netherlands eScience Center, Amsterdam 1098 XG, The Netherlands
- Vasant Honavar
- Bioinformatics & Genomics Graduate Program, Pennsylvania State University, University Park, PA 16802, USA
- Artificial Intelligence Research Laboratory, Pennsylvania State University, University Park, PA 16823, USA
- Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, USA
- Center for Big Data Analytics and Discovery Informatics, Pennsylvania State University, University Park, PA 16823, USA
- Institute for Cyberscience, University Park, PA 16802, USA
- Clinical and Translational Sciences Institute, University Park, PA 16802, USA
- College of Information Sciences & Technology, Pennsylvania State University, University Park, PA 16802, USA
- Alexandre M J J Bonvin
- Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht 3584 CH, The Netherlands
- Li C Xue
- Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht 3584 CH, The Netherlands
|
64
|
McNeil HE, Alav I, Torres RC, Rossiter AE, Laycock E, Legood S, Kaur I, Davies M, Wand M, Webber MA, Bavro VN, Blair JMA. Identification of binding residues between periplasmic adapter protein (PAP) and RND efflux pumps explains PAP-pump promiscuity and roles in antimicrobial resistance. PLoS Pathog 2019; 15:e1008101. [PMID: 31877175 PMCID: PMC6975555 DOI: 10.1371/journal.ppat.1008101] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/19/2018] [Revised: 01/22/2020] [Accepted: 09/20/2019] [Indexed: 11/19/2022] Open
Abstract
Active efflux due to tripartite RND efflux pumps is an important mechanism of clinically relevant antibiotic resistance in Gram-negative bacteria. These pumps are also essential for Gram-negative pathogens to cause infection and form biofilms. They consist of an inner membrane RND transporter, a periplasmic adaptor protein (PAP), and an outer membrane channel. The role of PAPs in assembly, and the identities of specific residues involved in PAP-RND binding, remain poorly understood. Using recent high-resolution structures, four 3D sites involved in PAP-RND binding within each PAP protomer were defined that correspond to nine discrete linear binding sequences or "binding boxes" within the PAP sequence. In the important human pathogen Salmonella enterica, these binding boxes are conserved within phylogenetically-related PAPs, such as AcrA and AcrE, while differing considerably between divergent PAPs such as MdsA and MdtA, despite overall conservation of the PAP structure. By analysing these binding sequences we created a predictive model of PAP-RND interaction, which suggested the determinants that may allow promiscuity between certain PAPs, but discrimination of others. We corroborated these predictions using direct phenotypic data, confirming that only AcrA and AcrE, but not MdtA or MdsA, can function with the major RND pump AcrB. Furthermore, we provide functional validation of the involvement of the binding boxes by disruptive site-directed mutagenesis. These results directly link sequence conservation within identified PAP binding sites with functional data, providing a mechanistic explanation for assembly of clinically relevant RND-pumps, and explain how Salmonella and other pathogens maintain a degree of redundancy in efflux-mediated resistance.
Overall, our study provides a novel understanding of the molecular determinants driving RND-PAP recognition by bridging the available structural information with experimental functional validation, thus providing the scientific community with a predictive model of pump contacts that could be exploited in the future for the development of targeted therapeutics and efflux pump inhibitors.
Affiliation(s)
- Helen E. McNeil
- Institute of Microbiology and Infection, College of Medical and Dental Sciences, University of Birmingham, Edgbaston, Birmingham, United Kingdom
- Ilyas Alav
- Institute of Microbiology and Infection, College of Medical and Dental Sciences, University of Birmingham, Edgbaston, Birmingham, United Kingdom
- Amanda E. Rossiter
- Institute of Microbiology and Infection, College of Medical and Dental Sciences, University of Birmingham, Edgbaston, Birmingham, United Kingdom
- Eve Laycock
- Institute of Microbiology and Infection, College of Medical and Dental Sciences, University of Birmingham, Edgbaston, Birmingham, United Kingdom
- Simon Legood
- Institute of Microbiology and Infection, College of Medical and Dental Sciences, University of Birmingham, Edgbaston, Birmingham, United Kingdom
- Inderpreet Kaur
- Institute of Microbiology and Infection, College of Medical and Dental Sciences, University of Birmingham, Edgbaston, Birmingham, United Kingdom
- Matthew Davies
- Institute of Microbiology and Infection, College of Medical and Dental Sciences, University of Birmingham, Edgbaston, Birmingham, United Kingdom
- Matthew Wand
- Public Health England, National Infection Service, Porton Down, Salisbury, Wiltshire, United Kingdom
- Mark A. Webber
- Quadram Institute Bioscience, Norwich Research Park, Norwich, United Kingdom
- Vassiliy N. Bavro
- School of Life Sciences, University of Essex, Colchester, United Kingdom
- * E-mail: (VNB); (JMAB)
- Jessica M. A. Blair
- Institute of Microbiology and Infection, College of Medical and Dental Sciences, University of Birmingham, Edgbaston, Birmingham, United Kingdom
- * E-mail: (VNB); (JMAB)
|
65
|
Deciphering interaction fingerprints from protein molecular surfaces using geometric deep learning. Nat Methods 2019; 17:184-192. [DOI: 10.1038/s41592-019-0666-6] [Citation(s) in RCA: 172] [Impact Index Per Article: 34.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2019] [Accepted: 10/28/2019] [Indexed: 02/05/2023]
|
66
|
Takemura K, Kitao A. More efficient screening of protein-protein complex model structures for reducing the number of candidates. Biophys Physicobiol 2019; 16:295-303. [PMID: 31984184 PMCID: PMC6975980 DOI: 10.2142/biophysico.16.0_295] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2019] [Accepted: 08/01/2019] [Indexed: 01/29/2023] Open
Abstract
Rigid-body protein-protein docking is very efficient in generating tens of thousands of docked complex models (decoys) in a very short time without considering structure change upon binding, but typical docking scoring functions are not necessarily sufficiently accurate to narrow these decoys down to a small number of plausible candidates. Flexible refinements and sophisticated evaluation of the decoys are thus required to achieve more accurate prediction. Since this process is time-consuming, an efficient screening method to reduce the number of decoys is necessary immediately following rigid-body dockings. We attempted to develop an efficient screening method by clustering decoys generated by the rigid-body docking ZDOCK. We introduced the three metrics ligand-root-mean-square deviation (L-RMSD), interface-ligand-RMSD (iL-RMSD), and the fraction of common contacts (FCC), and examined various ranges of cut-offs for clusters to determine the best set of clustering parameters. Although the employed clustering algorithm is simple, it successfully reduced the number of decoys. Using iL-RMSD with a cut-off radius of 8 Å, the number of decoys that contain at least one near-native model with 90% probability decreased from 4,808 to 320, a 93% reduction in the original number of decoys. Using FCC for the clustering step, the top 1,000 success rates, defined as the probability that the top 1,000 models contain at least one near-native structure, reached 97%. We conclude that the proposed method is very efficient in selecting a small number of decoys that include near-native decoys.
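The clustering idea in this abstract can be illustrated with a short sketch. It is not the authors' implementation: it assumes each decoy is represented as a set of receptor-ligand residue contacts, takes FCC to be the share of one decoy's contacts that also occur in another, and uses a simple greedy "leader" clustering. The function names, the 0.5 cut-off and the toy contact sets are hypothetical; only the 4,808 and 320 figures come from the abstract.

```python
def fcc(contacts_a, contacts_b):
    """Fraction of the contacts in A that are also present in B."""
    if not contacts_a:
        return 0.0
    return len(contacts_a & contacts_b) / len(contacts_a)


def greedy_cluster(decoys, cutoff=0.5):
    """Assign each decoy to the first cluster whose representative is similar enough.

    decoys: dict mapping decoy name -> set of (receptor_residue, ligand_residue)
            pairs, assumed to be ordered from best to worst docking score.
    Returns a list of clusters; the first member of each is its representative.
    """
    clusters = []
    for name, contacts in decoys.items():
        for cluster in clusters:
            rep = decoys[cluster[0]]
            # FCC is asymmetric, so require similarity in both directions.
            if fcc(contacts, rep) >= cutoff and fcc(rep, contacts) >= cutoff:
                cluster.append(name)
                break
        else:
            # No existing cluster matched: this decoy starts a new one.
            clusters.append([name])
    return clusters


# Hypothetical toy decoys: d1 and d2 share most contacts, d3 binds elsewhere.
decoys = {
    "d1": {(10, 55), (11, 55), (12, 60), (14, 61)},
    "d2": {(10, 55), (11, 55), (12, 60), (15, 62)},
    "d3": {(80, 5), (81, 6), (82, 7)},
}
clusters = greedy_cluster(decoys)
representatives = [cluster[0] for cluster in clusters]

# Reduction reported in the abstract: 4,808 decoys down to 320.
reduction = 1 - 320 / 4808
```

Keeping only one representative per cluster is what shrinks the decoy set; the reported figures correspond to a reduction of roughly 93%, matching the abstract.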
Affiliation(s)
- Kazuhiro Takemura
- School of Life Science and Technology, Tokyo Institute of Technology, Meguro-ku, Tokyo 152-8550, Japan
- Akio Kitao
- School of Life Science and Technology, Tokyo Institute of Technology, Meguro-ku, Tokyo 152-8550, Japan
|
67
|
Sanchez-Garcia R, Sorzano COS, Carazo JM, Segura J. BIPSPI: a method for the prediction of partner-specific protein-protein interfaces. Bioinformatics 2019; 35:470-477. [PMID: 30020406 PMCID: PMC6361243 DOI: 10.1093/bioinformatics/bty647] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2018] [Accepted: 07/17/2018] [Indexed: 11/15/2022] Open
Abstract
Motivation: Protein-Protein Interactions (PPI) are essential for most cellular processes and thus, unveiling how proteins interact is a crucial question that can be better understood by identifying which residues are responsible for the interaction. Computational approaches are orders of magnitude cheaper and faster than experimental ones, leading to a proliferation of methods aimed at predicting which residues belong to the interface of an interaction. Results: We present BIPSPI, a new machine learning-based method for the prediction of partner-specific PPI sites. Contrary to most binding site prediction methods, the proposed approach takes into account a pair of interacting proteins rather than a single one in order to predict partner-specific binding sites. BIPSPI has been trained employing sequence-based and structural features from both protein partners of each complex compiled in the Protein-Protein Docking Benchmark version 5.0 and in an additional set independently compiled. Also, a version trained only on sequences has been developed. The performance of our approach has been assessed by a leave-one-out cross-validation over different benchmarks, outperforming state-of-the-art methods. Availability and implementation: BIPSPI web server is freely available at http://bipspi.cnb.csic.es. BIPSPI code is available at https://github.com/bioinsilico/BIPSPI. Docker image is available at https://hub.docker.com/r/bioinsilico/bipspi/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ruben Sanchez-Garcia
- GN7 of the Spanish National Institute for Bioinformatics (INB), Biocomputing Unit, National Center of Biotechnology (CSIC), Instruct Image Processing Center, Madrid, Spain
- C O S Sorzano
- GN7 of the Spanish National Institute for Bioinformatics (INB), Biocomputing Unit, National Center of Biotechnology (CSIC), Instruct Image Processing Center, Madrid, Spain
- J M Carazo
- GN7 of the Spanish National Institute for Bioinformatics (INB), Biocomputing Unit, National Center of Biotechnology (CSIC), Instruct Image Processing Center, Madrid, Spain
- Joan Segura
- GN7 of the Spanish National Institute for Bioinformatics (INB), Biocomputing Unit, National Center of Biotechnology (CSIC), Instruct Image Processing Center, Madrid, Spain
|
68
|
Hadi-Alijanvand H. Soft regions of protein surface are potent for stable dimer formation. J Biomol Struct Dyn 2019; 38:3587-3598. [PMID: 31476974 DOI: 10.1080/07391102.2019.1662328] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2023]
Abstract
By having knowledge about the characteristics of protein interaction interfaces, we will be able to manipulate protein complexes for therapies. The dimer state is considered the primary alphabet of most proteins' quaternary structure. The properties of the binding interface between subunits and of the noninterface region define the specificity and stability of the intended protein complex. Considering some topological properties and amino acids' affinity for binding in interfaces of protein dimers, we construct the interface-specific recurrence plots. The data obtained from recurrence quantitative analysis and accessibility-related metrics help us to classify the protein dimers into four distinct classes. Some mechanical properties of binding interfaces are computed for each predefined class of the dimers. The computed mechanical characteristics of the binding patch region are compared with those of the nonbinding region of proteins. Our observations indicate that the mechanical properties of protein binding sites have a decisive impact on determining the dimer stability. We introduce a new concept in analyzing protein structure by considering mechanical properties of protein structure. We conclude that the interface region between subunits of stable dimers is usually mechanically softer than the interface of unstable protein dimers.
Abbreviations: AAB, average affinity for binding; ANM, anisotropic network model; APC, affinity propagation clustering; ASA, accessible surface area; CCD, inter residues distance; CSC, complex stability code; DM, distance matrix; ΔGdiss, PISA-computed dissociation free energy; GNM, Gaussian normal mode analysis; NMA, normal mode analysis; PBP, protein binding patch; PISA, proteins, interfaces, structures and assemblies; rASA, relative accessible area in respect to unfolded state of residues; RM, recurrence matrix; rP, relative protrusion; RP, recurrence plot; RQA, recurrence quantitative analysis; SEM, standard error of mean.
Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Hamid Hadi-Alijanvand
- Department of Biological Sciences, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, Iran
|
69
|
Wong ETC, Gsponer J. Predicting Protein-Protein Interfaces that Bind Intrinsically Disordered Protein Regions. J Mol Biol 2019; 431:3157-3178. [PMID: 31207240 DOI: 10.1016/j.jmb.2019.06.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2019] [Revised: 06/01/2019] [Accepted: 06/04/2019] [Indexed: 12/18/2022]
Abstract
A long-standing goal in biology is the complete annotation of function and structure on all protein-protein interactions, a large fraction of which is mediated by intrinsically disordered protein regions (IDRs). However, knowledge derived from experimental structures of such protein complexes is disproportionately small due, in part, to challenges in studying interactions of IDRs. Here, we introduce IDRBind, a computational method that by combining gradient boosted trees and conditional random field models predicts binding sites of IDRs with performance approaching state-of-the-art globular interface predictions, making it suitable for proteome-wide applications. Although designed and trained with a focus on molecular recognition features, which are long interaction-mediating-elements in IDRs, IDRBind also predicts the binding sites of short peptides more accurately than existing specialized predictors. Consistent with IDRBind's specificity, a comparison of protein interface categories uncovered uniform trends in multiple physicochemical properties, positioning molecular recognition feature interfaces between peptide and globular interfaces.
Affiliation(s)
- Eric T C Wong
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada; Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Jörg Gsponer
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada; Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, Canada.
|
70
|
The Symmetric Difference Distance: A New Way to Evaluate the Evolution of Interfaces along Molecular Dynamics Trajectories; Application to Influenza Hemagglutinin. Symmetry (Basel) 2019. [DOI: 10.3390/sym11050662] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open
Abstract
We propose a new and easy approach to evaluate structural dissimilarities between frames issued from molecular dynamics, and we test this methodology on human hemagglutinin. This protein is responsible for the entry of the influenza virus into the host cell by endocytosis, and this virus causes seasonal epidemics of infectious disease, which can be estimated to result in hundreds of thousands of deaths each year around the world. We computed the three interfaces between the three protomers of the hemagglutinin H1 homotrimer (PDB code: 1RU7) for each of its conformations generated from molecular dynamics simulation. For each conformation, we considered the set of residues involved in the union of these three interfaces. The dissimilarity between each pair of conformations was measured with our new methodology, the symmetric difference distance between the associated set of residues. The main advantages of the full procedure are: (i) it is parameter free; (ii) no spatial alignment is needed and (iii) it is simple enough so that it can be implemented by a beginner in programming. It is shown to be a relevant tool to follow the evolution of the conformation along the molecular dynamics trajectories.
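As an illustration of how little code the idea requires, the sketch below takes the distance between two conformations to be the number of interface residues that belong to exactly one of the two residue sets. This is the standard set-theoretic symmetric difference; whether the paper applies any normalization is not stated in the abstract, so none is assumed. The residue labels and frames are hypothetical.

```python
def symmetric_difference_distance(residues_a, residues_b):
    """Number of residues found in exactly one of the two interface sets."""
    return len(set(residues_a) ^ set(residues_b))


# Hypothetical interface residue sets for three MD frames ("chain:residue").
frames = [
    {"A:17", "A:18", "B:45", "C:102"},
    {"A:17", "B:45", "B:46", "C:102"},
    {"A:17", "A:18", "B:45", "C:102"},
]

# Pairwise dissimilarity matrix between frames; no structural alignment
# and no tunable parameter is involved.
distance_matrix = [
    [symmetric_difference_distance(a, b) for b in frames] for a in frames
]
```

Because only set membership is compared, the measure is symmetric and is zero exactly when two frames have identical interface residue sets.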
|
71
|
Nute M, Saleh E, Warnow T. Evaluating Statistical Multiple Sequence Alignment in Comparison to Other Alignment Methods on Protein Data Sets. Syst Biol 2019; 68:396-411. [PMID: 30329135 PMCID: PMC6472439 DOI: 10.1093/sysbio/syy068] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2018] [Revised: 09/27/2018] [Accepted: 10/11/2018] [Indexed: 01/15/2023] Open
Abstract
The estimation of multiple sequence alignments of protein sequences is a basic step in many bioinformatics pipelines, including protein structure prediction, protein family identification, and phylogeny estimation. Statistical coestimation of alignments and trees under stochastic models of sequence evolution has long been considered the most rigorous technique for estimating alignments and trees, but little is known about the accuracy of such methods on biological benchmarks. We report the results of an extensive study evaluating the most popular protein alignment methods as well as the statistical coestimation method BAli-Phy on 1192 protein data sets from established benchmarks as well as on 120 simulated data sets. Our study (which used more than 230 CPU years for the BAli-Phy analyses alone) shows that BAli-Phy has better precision and recall (with respect to the true alignments) than the other alignment methods on the simulated data sets but has consistently lower recall on the biological benchmarks (with respect to the reference alignments) than many of the other methods. In other words, we find that BAli-Phy systematically underaligns when operating on biological sequence data but shows no sign of this on simulated data. There are several potential causes for this change in performance, including model misspecification, errors in the reference alignments, and conflicts between structural alignment and evolutionary alignments, and future research is needed to determine the most likely explanation. We conclude with a discussion of the potential ramifications for each of these possibilities. [BAli-Phy; homology; multiple sequence alignment; protein sequences; structural alignment.]
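The abstract does not define its precision and recall measures, so the following is a sketch under a common assumption: both are computed over pairs of residues placed in the same alignment column (pairwise homologies). Under that assumption, an alignment that "underaligns" places too few residues together, which lowers recall while leaving precision high. The function names and toy sequences are hypothetical.

```python
from itertools import combinations


def homology_pairs(alignment):
    """Return the set of residue pairs placed in the same column.

    alignment: list of equal-length gapped strings ('-' marks a gap).
    Each residue is identified by (sequence index, ungapped position).
    """
    pairs = set()
    counters = [0] * len(alignment)
    for column in zip(*alignment):
        present = []
        for seq_idx, char in enumerate(column):
            if char != "-":
                present.append((seq_idx, counters[seq_idx]))
                counters[seq_idx] += 1
        pairs.update(combinations(present, 2))
    return pairs


def precision_recall(estimated, reference):
    """Precision and recall of the estimated alignment's residue pairs."""
    est, ref = homology_pairs(estimated), homology_pairs(reference)
    shared = len(est & ref)
    precision = shared / len(est) if est else 0.0
    recall = shared / len(ref) if ref else 0.0
    return precision, recall


# Toy example: the estimate aligns only the first two residues and
# pushes the rest into gap-only columns (an underaligned result).
reference = ["MKVL", "MKVL"]
estimated = ["MKVL--", "MK--VL"]
precision, recall = precision_recall(estimated, reference)
```

In this toy case every pair the estimate reports is correct, but half of the reference pairs are missing, which is the pattern the abstract describes for BAli-Phy on biological benchmarks.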
Affiliation(s)
- Michael Nute
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 S Wright St #101, Champaign, IL 61820, USA
- Ehsan Saleh
- Department of Computer Science, University of Illinois at Urbana-Champaign, 201 N. Goodwin Ave, Urbana, IL 61801, USA
- Tandy Warnow
- Department of Computer Science, University of Illinois at Urbana-Champaign, 201 N. Goodwin Ave, Urbana, IL 61801, USA; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, 1205 W. Clark St., Urbana, IL 61801, USA; National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
|
72
|
Straightforward Protein-Protein Interaction Interface Mapping via Random Mutagenesis and Mammalian Protein Protein Interaction Trap (MAPPIT). Int J Mol Sci 2019; 20:ijms20092058. [PMID: 31027327 PMCID: PMC6539206 DOI: 10.3390/ijms20092058] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2019] [Revised: 04/08/2019] [Accepted: 04/13/2019] [Indexed: 01/18/2023] Open
Abstract
The MAPPIT (mammalian protein protein interaction trap) method allows high-throughput detection of protein interactions by very simple co-transfection of three plasmids in HEK293T cells, followed by a luciferase readout. MAPPIT detects a large percentage of all protein interactions, including those requiring posttranslational modifications and endogenous or exogenous ligands. Here, we present a straightforward method that allows detailed mapping of interaction interfaces via MAPPIT. The method provides insight into the interaction mechanism and reveals how this is affected by disease-associated mutations. By combining error-prone polymerase chain reaction (PCR) for random mutagenesis, 96-well DNA prepping, Sanger sequencing, and MAPPIT via 384-well transfections, we test the effects of a large number of mutations of a selected protein on its protein interactions. The entire screen takes less than three months and interactions with multiple partners can be studied in parallel. The effect of mutations on the MAPPIT readout is mapped on the protein structure, allowing unbiased identification of all putative interaction sites. We have thus far analysed 6 proteins and mapped their interfaces for 16 different interaction partners. Our method is broadly applicable as the required tools are simple and widely available.
|
73
|
Galeazzi R, Laudadio E, Falconi E, Massaccesi L, Ercolani L, Mobbili G, Minnelli C, Scirè A, Cianfruglia L, Armeni T. Protein-protein interactions of human glyoxalase II: findings of a reliable docking protocol. Org Biomol Chem 2019; 16:5167-5177. [PMID: 29971290 DOI: 10.1039/c8ob01194j] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]
Abstract
Glyoxalase II (GlxII) is an antioxidant glutathione-dependent enzyme, which catalyzes the hydrolysis of S-d-lactoylglutathione to form d-lactic acid and glutathione (GSH). The last product is the most important thiol reducing agent present in all eukaryotic cells that have mitochondria and chloroplasts. It is generally known that GSH plays a crucial role not only in the cellular redox state but also in various cellular processes. One of them is protein S-glutathionylation, a process that can occur through an oxidation reaction of proteins' thiol groups by GSH. Changes in protein S-glutathionylation have been associated with a range of human diseases such as diabetes, cardiovascular and pulmonary diseases, neurodegenerative diseases and cancer. Within a major project aimed at elucidating the role of GlxII in the mechanism of S-glutathionylation, a reliable computational protocol consisting of a protein-protein docking approach followed by atomistic Molecular Dynamics (MD) simulations was developed and it was applied to the prediction of molecular associations between human GlxII (in the presence and absence of GSH) and some proteins that are known to be S-glutathionylated in vitro, such as actin, malate dehydrogenase (MDH) and glyceraldehyde-3-phosphate dehydrogenase (GAPDH). The computational results show a high propensity of GlxII to interact with actin and MDH through its active site and a high stability of the GlxII-protein systems when GSH is present. Moreover, close proximities of GSH with actin and MDH cysteine residues have been found, suggesting that GlxII could be able to perform protein S-glutathionylation by using the GSH molecule present in its catalytic site.
Affiliation(s)
- Roberta Galeazzi
- Department of Life and Environmental Sciences, Università Politecnica delle Marche, Ancona, Italy.
|
74
|
Abstract
Recent progress in the development of scientific libraries with machine-learning techniques paved the way for the implementation of integrated computational tools to predict ligand-binding affinity. The prediction of binding affinity uses the atomic coordinates of protein-ligand complexes. These new computational tools made application of a broad spectrum of machine-learning techniques to study protein-ligand interactions possible. The essential aspect of these machine-learning approaches is to train a new computational model by using technologies such as supervised machine-learning techniques, convolutional neural network, and random forest to mention the most commonly applied methods. In this chapter, we focus on supervised machine-learning techniques and their applications in the development of protein-targeted scoring functions for the prediction of binding affinity. We discuss the development of the program SAnDReS and its application to the creation of machine-learning models to predict inhibition of cyclin-dependent kinase and HIV-1 protease. Moreover, we describe the scoring function space, and how to use it to explain the development of targeted scoring functions.
Affiliation(s)
- Gabriela Bitencourt-Ferreira
- Escola de Ciências da Saúde, Pontifícia Universidade Católica do Rio Grande do Sul-PUCRS, Porto Alegre, RS, Brazil
- Walter Filgueira de Azevedo
- Escola de Ciências da Saúde, Pontifícia Universidade Católica do Rio Grande do Sul-PUCRS, Porto Alegre, RS, Brazil.
|
75
|
Jung Y, El-Manzalawy Y, Dobbs D, Honavar VG. Partner-specific prediction of RNA-binding residues in proteins: A critical assessment. Proteins 2018; 87:198-211. [PMID: 30536635 PMCID: PMC6389706 DOI: 10.1002/prot.25639] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/01/2018] [Revised: 10/10/2018] [Accepted: 11/29/2018] [Indexed: 01/06/2023]
Abstract
RNA-protein interactions play essential roles in regulating gene expression. While some RNA-protein interactions are "specific", that is, the RNA-binding proteins preferentially bind to particular RNA sequence or structural motifs, others are "non-RNA specific." Deciphering the protein-RNA recognition code is essential for comprehending the functional implications of these interactions and for developing new therapies for many diseases. Because of the high cost of experimental determination of protein-RNA interfaces, there is a need for computational methods to identify RNA-binding residues in proteins. While most of the existing computational methods for predicting RNA-binding residues in RNA-binding proteins are oblivious to the characteristics of the partner RNA, there is growing interest in methods for partner-specific prediction of RNA binding sites in proteins. In this work, we assess the performance of two recently published partner-specific protein-RNA interface prediction tools, PS-PRIP, and PRIdictor, along with our own new tools. Specifically, we introduce a novel metric, RNA-specificity metric (RSM), for quantifying the RNA-specificity of the RNA binding residues predicted by such tools. Our results show that the RNA-binding residues predicted by previously published methods are oblivious to the characteristics of the putative RNA binding partner. Moreover, when evaluated using partner-agnostic metrics, RNA partner-specific methods are outperformed by the state-of-the-art partner-agnostic methods. We conjecture that either (a) the protein-RNA complexes in PDB are not representative of the protein-RNA interactions in nature, or (b) the current methods for partner-specific prediction of RNA-binding residues in proteins fail to account for the differences in RNA partner-specific versus partner-agnostic protein-RNA interactions, or both.
Affiliation(s)
- Yong Jung
- Bioinformatics and Genomics Graduate Program, Pennsylvania State University, University Park, Pennsylvania; Artificial Intelligence Research Laboratory, Pennsylvania State University, University Park, Pennsylvania; The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania
- Yasser El-Manzalawy
- Artificial Intelligence Research Laboratory, Pennsylvania State University, University Park, Pennsylvania; Clinical and Translational Sciences Institute, Pennsylvania State University, University Park, Pennsylvania; College of Information Sciences and Technology, Pennsylvania State University, Pennsylvania
| | - Drena Dobbs
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, Iowa.,Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, Iowa
| | - Vasant G Honavar
- Bioinformatics and Genomics Graduate Program, Pennsylvania State University, University Park, Pennsylvania.,Artificial Intelligence Research Laboratory, Pennsylvania State University, University Park, Pennsylvania.,Institute for Cyberscience, Pennsylvania State University, University Park, Pennsylvania.,Clinical and Translational Sciences Institute, Pennsylvania State University, University Park, Pennsylvania.,The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania.,College of Information Sciences and Technology, Pennsylvania State University, Pennsylvania
| |
Collapse
|
76
|
Shinobu A, Takemura K, Matubayasi N, Kitao A. Refining evERdock: Improved selection of good protein-protein complex models achieved by MD optimization and use of multiple conformations. J Chem Phys 2018; 149:195101. [PMID: 30466278 DOI: 10.1063/1.5055799] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 01/09/2023] Open
Abstract
A method for evaluating binding free energy differences of protein-protein complex structures generated by protein docking was recently developed by some of us. The method, termed evERdock, combined short (2 ns) molecular dynamics (MD) simulations in explicit water and solution theory in the energy representation (ER) and succeeded in selecting the near-native complex structures from a set of decoys. In the current work, we performed longer (up to 100 ns) MD simulations before employing ER analysis in order to further refine the structures of the decoy set with improved binding free energies. Moreover, we estimated the binding free energies for each complex structure based on an average value from five individual MD snapshots. After MD simulations, all decoys exhibit a decrease in binding free energy, suggesting that proper equilibration in explicit solvent resulted in more favourably bound complexes. During the MD simulations, non-native structures tend to become unstable and in some cases dissociate, while near-native structures maintain a stable interface. The energies after the MD simulations show an improved correlation between similarity criteria (such as interface root-mean-square distance) to the native (crystal) structure and the binding free energy. In addition, calculated binding free energies show sensitivity to the number of contacts, which was demonstrated to reflect the relative stability of structures at earlier stages of the MD simulation. We therefore conclude that the additional equilibration step along with the use of multiple conformations can make the evERdock scheme more versatile under low computational cost.
Affiliation(s)
- Ai Shinobu
  - School of Life Science and Technology, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro, Tokyo 152-8550, Japan
- Kazuhiro Takemura
  - School of Life Science and Technology, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro, Tokyo 152-8550, Japan
- Nobuyuki Matubayasi
  - Division of Chemical Engineering, Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka 560-8531, Japan
- Akio Kitao
  - School of Life Science and Technology, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro, Tokyo 152-8550, Japan
|
77
|
Wong AKC, Sze-To HY, Johanning GL. Pattern to Knowledge: Deep Knowledge-Directed Machine Learning for Residue-Residue Interaction Prediction. Sci Rep 2018; 8:14841. [PMID: 30287904 PMCID: PMC6172270 DOI: 10.1038/s41598-018-32834-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/31/2018] [Accepted: 09/17/2018] [Indexed: 11/21/2022] Open
Abstract
Residue-residue close contact (R2R-C) data procured from three-dimensional protein-protein interaction (PPI) experiments are currently used for predicting residue-residue interaction (R2R-I) in PPI. However, due to complex physicochemical environments, R2R-I incidences, facilitated by multiple factors, are usually entangled in the source environment and masked in the acquired data. Here we present a novel method, P2K (Pattern to Knowledge), to disentangle R2R-I patterns and render much more succinct discriminative information expressed in different specific R2R-I statistical/functional spaces. Since such knowledge is not visible in the data acquired, we refer to it as deep knowledge. Leveraging the deep knowledge discovered to construct machine learning models for sequence-based R2R-I prediction, without trial-and-error combination of the features over external knowledge of sequences, our R2R-I predictor was validated under stringent leave-one-complex-out-alone cross-validation on a benchmark dataset, and outperformed an existing sequence-based R2R-I predictor by 28% (p: 1.9E-08). P2K is accessible via our web server at https://p2k.uwaterloo.ca.
Affiliation(s)
- Andrew K C Wong
  - Department of Systems Design Engineering, University of Waterloo, 200 University Avenue West, Waterloo, N2L 3G1, Ontario, Canada
- Ho Yin Sze-To
  - Department of Systems Design Engineering, University of Waterloo, 200 University Avenue West, Waterloo, N2L 3G1, Ontario, Canada
- Gary L Johanning
  - Biosciences Division, SRI International, 333 Ravenswood Ave, Menlo Park, CA, USA
|
78
|
Oreluk J, Liu Z, Hegde A, Li W, Packard A, Frenklach M, Zubarev D. Diagnostics of Data-Driven Models: Uncertainty Quantification of PM7 Semi-Empirical Quantum Chemical Method. Sci Rep 2018; 8:13248. [PMID: 30185953 PMCID: PMC6125339 DOI: 10.1038/s41598-018-31677-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 06/19/2018] [Accepted: 08/22/2018] [Indexed: 12/21/2022] Open
Abstract
We report an evaluation of a semi-empirical quantum chemical method PM7 from the perspective of uncertainty quantification. Specifically, we apply Bound-to-Bound Data Collaboration, an uncertainty quantification framework, to characterize (a) variability of PM7 model parameter values consistent with the uncertainty in the training data and (b) uncertainty propagation from the training data to the model predictions. Experimental heats of formation of a homologous series of linear alkanes are used as the property of interest. The training data are chemically accurate, i.e., they have very low uncertainty by the standards of computational chemistry. The analysis does not find evidence of PM7 consistency with the entire data set considered as no single set of parameter values is found that captures the experimental uncertainties of all training data. A set of parameter values for PM7 was able to capture the training data within ±1 kcal/mol, but not to the smaller level of uncertainty in the reported data. Nevertheless, PM7 was found to be consistent for subsets of the training data. In such cases, uncertainty propagation from the chemically accurate training data to the predicted values preserves error within bounds of chemical accuracy if predictions are made for the molecules of comparable size. Otherwise, the error grows linearly with the relative size of the molecules.
Affiliation(s)
- James Oreluk
  - Department of Mechanical Engineering, University of California at Berkeley, Berkeley, California, 94720-1740, USA
- Zhenyuan Liu
  - Department of Mechanical Engineering, University of California at Berkeley, Berkeley, California, 94720-1740, USA
- Arun Hegde
  - Department of Mechanical Engineering, University of California at Berkeley, Berkeley, California, 94720-1740, USA
- Wenyu Li
  - Department of Mechanical Engineering, University of California at Berkeley, Berkeley, California, 94720-1740, USA
- Andrew Packard
  - Department of Mechanical Engineering, University of California at Berkeley, Berkeley, California, 94720-1740, USA
- Michael Frenklach
  - Department of Mechanical Engineering, University of California at Berkeley, Berkeley, California, 94720-1740, USA
- Dmitry Zubarev
  - IBM Almaden Research Center, 650 Harry Road, San Jose, California, 95136, USA
|
79
|
Macalino SJY, Basith S, Clavio NAB, Chang H, Kang S, Choi S. Evolution of In Silico Strategies for Protein-Protein Interaction Drug Discovery. Molecules 2018; 23:E1963. [PMID: 30082644 PMCID: PMC6222862 DOI: 10.3390/molecules23081963] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Received: 07/17/2018] [Revised: 08/03/2018] [Accepted: 08/04/2018] [Indexed: 12/14/2022] Open
Abstract
The advent of advanced molecular modeling software, big data analytics, and high-speed processing units has led to the exponential evolution of modern drug discovery and better insights into complex biological processes and disease networks. This has progressively steered current research interests toward understanding protein-protein interaction (PPI) systems that are related to a number of relevant diseases, such as cancer, neurological illnesses, and metabolic disorders. However, targeting PPIs is challenging due to their "undruggable" binding interfaces. In this review, we focus on the current obstacles that impede PPI drug discovery, and how recent discoveries and advances in in silico approaches can alleviate these barriers to expedite the search for potential leads, as shown in several exemplary studies. We also discuss currently available information on PPI compounds and systems, along with their usefulness in molecular modeling. Finally, we conclude by presenting the limits of in silico application in drug discovery and offer a perspective on the field of computer-aided PPI drug discovery.
Affiliation(s)
- Stephani Joy Y Macalino
  - College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea
- Shaherin Basith
  - College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea
- Nina Abigail B Clavio
  - College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea
- Hyerim Chang
  - College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea
- Soosung Kang
  - College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea
- Sun Choi
  - College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea
|
80
|
Artificial intelligence in drug design. SCIENCE CHINA-LIFE SCIENCES 2018; 61:1191-1204. [PMID: 30054833 DOI: 10.1007/s11427-018-9342-2] [Citation(s) in RCA: 89] [Impact Index Per Article: 14.8] [Received: 04/24/2018] [Accepted: 05/22/2018] [Indexed: 12/27/2022]
Abstract
Thanks to rapid improvements in computing power and the rapid development of computational chemistry and biology, computer-aided drug design techniques have been successfully applied in almost every stage of the drug discovery and development pipeline to speed up research and reduce the cost and risk related to preclinical and clinical trials. Owing to the development of machine learning theory and the accumulation of pharmacological data, artificial intelligence (AI) technology, as a powerful data mining tool, has made its mark in various areas of drug design, such as virtual screening, activity scoring, quantitative structure-activity relationship (QSAR) analysis, de novo drug design, and in silico evaluation of absorption, distribution, metabolism, excretion and toxicity (ADME/T) properties. Although it is still challenging to provide a physical explanation of AI-based models, AI has been a powerful aid to drug discovery through its versatile frameworks. Recently, owing to their strong generalization ability and powerful feature extraction capability, deep learning methods have been employed to predict molecular properties and to generate desired molecules, which will further promote the application of AI technologies in drug design.
|
81
|
Tiwari PB, Chapagain PP, Üren A. Investigating molecular interactions between oxidized neuroglobin and cytochrome c. Sci Rep 2018; 8:10557. [PMID: 30002427 PMCID: PMC6043506 DOI: 10.1038/s41598-018-28836-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 01/12/2018] [Accepted: 07/02/2018] [Indexed: 11/17/2022] Open
Abstract
The formation of a complex between neuroglobin (Ngb) and cytochrome c (Cyt c) has an important biological role in preventing apoptosis. Binding of Ngb to Cyt c alone is sufficient to block the caspase 9 activation by ferric Cyt c that is released during ischemic insults. Therefore, detailed information on the Ngb-Cyt c interactions is important for understanding apoptosis. However, the exact nature of the interactions between oxidized human neuroglobin (hNgb) and Cyt c is not well understood. In this work, we used a combination of computational modeling and surface plasmon resonance experiments to obtain and characterize the complex formed between oxidized hNgb and Cyt c. We identified important residues involved in the complex formation, including K72 in Cyt c, which is otherwise known to interact with the apoptotic protease-activation factor-1. Our computational results, together with an optimized structure of the hNgb-Cyt c complex, provide unique insights into how the hNgb-Cyt c complex can abate the apoptotic cascade without an hNgb-Cyt c redox reaction.
Affiliation(s)
- Prem P Chapagain
  - Department of Physics, Florida International University, Miami, FL, USA
  - Biomolecular Sciences Institute, Florida International University, Miami, FL, USA
- Aykut Üren
  - Department of Oncology, Georgetown University, Washington D.C., USA
|
82
|
Takemura K, Matubayasi N, Kitao A. Binding free energy analysis of protein-protein docking model structures by evERdock. J Chem Phys 2018; 148:105101. [PMID: 29544320 DOI: 10.1063/1.5019864] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 01/07/2023] Open
Abstract
To aid the evaluation of protein-protein complex model structures generated by protein docking prediction (decoys), we previously developed a method to calculate the binding free energies for complexes. The method combines a short (2 ns) all-atom molecular dynamics simulation with explicit solvent and solution theory in the energy representation (ER). We showed that this method successfully selected structures similar to the native complex structure (near-native decoys) as the lowest binding free energy structures. In our current work, we applied this method (evERdock) to 100 or 300 model structures of four protein-protein complexes. The crystal structures and the near-native decoys showed the lowest binding free energy of all the examined structures, indicating that evERdock can successfully evaluate decoys. Several decoys that show low interface root-mean-square distance but relatively high binding free energy were also identified. Analysis of the fraction of native contacts, hydrogen bonds, and salt bridges at the protein-protein interface indicated that these decoys were insufficiently optimized at the interface. After optimizing the interactions around the interface by including interfacial water molecules, the binding free energies of these decoys were improved. We also investigated the effect of solute entropy on binding free energy and found that consideration of the entropy term does not necessarily improve the evaluations of decoys using the normal model analysis for entropy calculation.
Affiliation(s)
- Kazuhiro Takemura
  - Institute of Molecular and Cellular Biosciences, University of Tokyo, Bunkyo, Tokyo 113-0032, Japan
- Nobuyuki Matubayasi
  - Division of Chemical Engineering, Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka 560-8531, Japan
- Akio Kitao
  - Institute of Molecular and Cellular Biosciences, University of Tokyo, Bunkyo, Tokyo 113-0032, Japan
|
83
|
Geometric and amino acid type determinants for protein-protein interaction interfaces. QUANTITATIVE BIOLOGY 2018. [DOI: 10.1007/s40484-018-0138-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/16/2022]
|
84
|
Ahmed MS, Shahjaman M, Kabir E, Kamruzzaman M. Structure modeling to function prediction of Uncharacterized Human Protein C15orf41. Bioinformation 2018; 14:206-212. [PMID: 30108417 PMCID: PMC6077826 DOI: 10.6026/97320630014206] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/17/2018] [Revised: 04/29/2018] [Accepted: 04/29/2018] [Indexed: 01/18/2023] Open
Abstract
Dyserythropoietic anemia is a genetic disorder of erythropoiesis characterized by morphological abnormalities of erythroblasts. It is caused by mutation of the human gene C15orf41. The uncharacterized C15orf41 protein is involved in the formation of a functional complex structure. The uncharacterized C15orf41 protein is thermostable, unstable and acidic. It is associated with a TPD (Treponema Pallidum) domain (residue positions 135 to 265) and three PTM sites: K50 (Acetylation), T114 (Phosphorylation) and K176 (Ubiquitination). C15orf41 is paralogous to isoform-1 (gi|194018542|) and open reading frame isoform-CRA_c (gi|119612744|) of Homo sapiens located on chromosome 15. It interacts with the human ATP (Adenosine Triphosphate) binding domain 4 (ATPBD4) with a similarity score of 0.725 according to protein-protein interaction (PPI) network analysis. These data provide valuable insights towards the functional characterization of human gene C15orf41.
Affiliation(s)
- Md. Shakil Ahmed
  - Department of Statistics, University of Rajshahi, Rajshahi-6205, Bangladesh
- Md. Shahjaman
  - Department of Statistics, Begum Rokeya University, Rangpur-5400, Bangladesh
- Enamul Kabir
  - School of Agricultural, Computational and Environmental Sciences, University of Southern Queensland, Australia
- Md. Kamruzzaman
  - Data Science for Knowledge Creation Research Center, Seoul National University, Korea
|
85
|
Hadi-Alijanvand H, Rouhani M. Partner-Specific Prediction of Protein-Dimer Stability from Unbound Structure of Monomer. J Chem Inf Model 2018; 58:733-745. [PMID: 29444397 DOI: 10.1021/acs.jcim.7b00606] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 01/08/2023]
Abstract
Protein complexes play deterministic roles in live entities in sensing, compiling, controlling, and responding to external and internal stimuli. Thermodynamic stability is an important property of protein complexes; having knowledge about complex stability helps us to understand the basics of protein assembly-related diseases and the mechanism of protein assembly clearly. Enormous protein-protein interactions, detected by high-throughput methods, necessitate finding fast methods for predicting the stability of protein assemblies in a quantitative and qualitative manner. The existing methods of predicting complex stability need knowledge about the three-dimensional (3D) structure of the intended protein complex. Here, we introduce a new method for predicting dissociation free energy of subunits by analyzing the structural and topological properties of a protein binding patch on a single subunit of the desired protein complex. The method needs the 3D structure of just one subunit and the information about the position of the intended binding site on the surface of that subunit to predict dimer stability in a classwise manner. The patterns of structural and topological properties of a protein binding patch are decoded by recurrence quantification analysis. Nonparametric discrimination is then utilized to predict the stability class of the intended dimer with accuracy greater than 85%.
Affiliation(s)
- Hamid Hadi-Alijanvand
  - Department of Biological Sciences, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, 45137-66731, Iran
- Maryam Rouhani
  - Department of Biological Sciences, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, 45137-66731, Iran
|
86
|
Söderberg CAG, Månsson C, Bernfur K, Rutsdottir G, Härmark J, Rajan S, Al-Karadaghi S, Rasmussen M, Höjrup P, Hebert H, Emanuelsson C. Structural modelling of the DNAJB6 oligomeric chaperone shows a peptide-binding cleft lined with conserved S/T-residues at the dimer interface. Sci Rep 2018; 8:5199. [PMID: 29581438 PMCID: PMC5979959 DOI: 10.1038/s41598-018-23035-9] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 11/16/2017] [Accepted: 03/05/2018] [Indexed: 12/28/2022] Open
Abstract
The remarkably efficient suppression of amyloid fibril formation by the DNAJB6 chaperone is dependent on a set of conserved S/T-residues and an oligomeric structure, features unusual among DNAJ chaperones. We explored the structure of DNAJB6 using a combination of structural methods. Lysine-specific crosslinking mass spectrometry provided distance constraints to select a homology model of the DNAJB6 monomer, which was subsequently used in crosslink-assisted docking to generate a dimer model. A peptide-binding cleft lined with S/T-residues is formed at the monomer-monomer interface. Mixed isotope crosslinking showed that the oligomers are dynamic entities that exchange subunits. The purified protein is well folded, soluble and composed of oligomers with a varying number of subunits according to small-angle X-ray scattering (SAXS). Elongated particles (160 × 120 Å) were detected by electron microscopy and single particle reconstruction resulted in a density map of 20 Å resolution into which the DNAJB6 dimers fit. The structure of the oligomer and the S/T-rich region is of great importance for the understanding of the function of DNAJB6 and how it can bind aggregation-prone peptides and prevent amyloid diseases.
Affiliation(s)
- Cecilia Månsson
  - Department of Biochemistry and Structural Biology, Center for Molecular Protein Science, Lund University, PO Box 124, SE-221 00, Lund, Sweden
- Katja Bernfur
  - Department of Biochemistry and Structural Biology, Center for Molecular Protein Science, Lund University, PO Box 124, SE-221 00, Lund, Sweden
- Gudrun Rutsdottir
  - Department of Biochemistry and Structural Biology, Center for Molecular Protein Science, Lund University, PO Box 124, SE-221 00, Lund, Sweden
- Johan Härmark
  - School of Technology and Health, KTH Royal Institute of Technology and Department of Biosciences and Nutrition, Karolinska Institute, Stockholm, Sweden
- Sreekanth Rajan
  - School of Biological Sciences, Nanyang Technological University, Singapore, 637551, Singapore
- Salam Al-Karadaghi
  - Department of Biochemistry and Structural Biology, Center for Molecular Protein Science, Lund University, PO Box 124, SE-221 00, Lund, Sweden
- Morten Rasmussen
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark
- Peter Höjrup
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark
- Hans Hebert
  - School of Technology and Health, KTH Royal Institute of Technology and Department of Biosciences and Nutrition, Karolinska Institute, Stockholm, Sweden
- Cecilia Emanuelsson
  - Department of Biochemistry and Structural Biology, Center for Molecular Protein Science, Lund University, PO Box 124, SE-221 00, Lund, Sweden
|
87
|
Daberdaku S, Ferrari C. Exploring the potential of 3D Zernike descriptors and SVM for protein-protein interface prediction. BMC Bioinformatics 2018; 19:35. [PMID: 29409446 PMCID: PMC5802066 DOI: 10.1186/s12859-018-2043-3] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 08/29/2017] [Accepted: 01/24/2018] [Indexed: 12/22/2022] Open
Abstract
Background: The correct determination of protein–protein interaction interfaces is important for understanding disease mechanisms and for rational drug design. To date, several computational methods for the prediction of protein interfaces have been developed, but the interface prediction problem is still not fully understood. Experimental evidence suggests that the location of binding sites is imprinted in the protein structure, but there are major differences among the interfaces of the various protein types: the characterising properties can vary a lot depending on the interaction type and function. The selection of an optimal set of features characterising the protein interface and the development of an effective method to represent and capture the complex protein recognition patterns are of paramount importance for this task.
Results: In this work we investigate the potential of a novel local surface descriptor based on 3D Zernike moments for the interface prediction task. Descriptors invariant to roto-translations are extracted from circular patches of the protein surface enriched with physico-chemical properties from the HQI8 amino acid index set, and are used as samples for a binary classification problem. Support Vector Machines are used as a classifier to distinguish interface local surface patches from non-interface ones. The proposed method was validated on 16 classes of proteins extracted from the Protein–Protein Docking Benchmark 5.0 and compared to other state-of-the-art protein interface predictors (SPPIDER, PrISE and NPS-HomPPI).
Conclusions: The 3D Zernike descriptors are able to capture the similarity among patterns of physico-chemical and biochemical properties mapped on the protein surface arising from the various spatial arrangements of the underlying residues, and their usage can be easily extended to other sets of amino acid properties. The results suggest that the choice of a proper set of features characterising the protein interface is crucial for the interface prediction task, and that optimality strongly depends on the class of proteins whose interface we want to characterise. We postulate that different protein classes should be treated separately and that it is necessary to identify an optimal set of features for each protein class.
Electronic supplementary material: The online version of this article (10.1186/s12859-018-2043-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Sebastian Daberdaku
  - Department of Information Engineering, University of Padova, via Gradenigo 6/A, Padova, 35131, Italy
- Carlo Ferrari
  - Department of Information Engineering, University of Padova, via Gradenigo 6/A, Padova, 35131, Italy
|
88
|
Škrbić T, Zamuner S, Hong R, Seno F, Laio A, Trovato A. Vibrational entropy estimation can improve binding affinity prediction for non-obligatory protein complexes. Proteins 2018; 86:393-404. [DOI: 10.1002/prot.25454] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 08/11/2017] [Revised: 12/22/2017] [Accepted: 01/05/2018] [Indexed: 01/10/2023]
Affiliation(s)
- Tatjana Škrbić
  - Faculty of Physics, International School for Advanced Studies (SISSA/ISAS), Trieste, Italy
  - Department of Physics and Astronomy “Galileo Galilei”, University of Padova, Padova, Italy
- Stefano Zamuner
  - Department of Physics and Astronomy “Galileo Galilei”, University of Padova, Padova, Italy
- Rolando Hong
  - Faculty of Physics, International School for Advanced Studies (SISSA/ISAS), Trieste, Italy
- Flavio Seno
  - Department of Physics and Astronomy “Galileo Galilei”, University of Padova, Padova, Italy
  - Padova Section, National Institute of Nuclear Physics (INFN), Padova, Italy
- Alessandro Laio
  - Faculty of Physics, International School for Advanced Studies (SISSA/ISAS), Trieste, Italy
- Antonio Trovato
  - Department of Physics and Astronomy “Galileo Galilei”, University of Padova, Padova, Italy
  - Padova Section, National Institute of Nuclear Physics (INFN), Padova, Italy
|
89
|
Yang Y, Gong X. A new probability method to understand protein-protein interface formation mechanism at amino acid level. J Theor Biol 2018; 436:18-25. [DOI: 10.1016/j.jtbi.2017.09.026] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/04/2017] [Revised: 09/21/2017] [Accepted: 09/27/2017] [Indexed: 10/18/2022]
|
90
|
Conserved salt-bridge competition triggered by phosphorylation regulates the protein interactome. Proc Natl Acad Sci U S A 2017; 114:13453-13458. [PMID: 29208709 DOI: 10.1073/pnas.1711543114] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Indexed: 12/23/2022] Open
Abstract
Phosphorylation is a major regulator of protein interactions; however, the mechanisms by which regulation occurs are not well understood. Here we identify a salt-bridge competition or "theft" mechanism that enables a phospho-triggered swap of protein partners by Raf Kinase Inhibitory Protein (RKIP). RKIP transitions from inhibiting Raf-1 to inhibiting G-protein-coupled receptor kinase 2 upon phosphorylation, thereby bridging MAP kinase and G-Protein-Coupled Receptor signaling. NMR and crystallography indicate that a phosphoserine, but not a phosphomimetic, competes for a lysine from a preexisting salt bridge, initiating a partial unfolding event and promoting new protein interactions. Structural elements underlying the theft occurred early in evolution and are found in 10% of homo-oligomers and 30% of hetero-oligomers including Bax, Troponin C, and Early Endosome Antigen 1. In contrast to a direct recognition of phosphorylated residues by binding partners, the salt-bridge theft mechanism represents a facile strategy for promoting or disrupting protein interactions using solvent-accessible residues, and it can provide additional specificity at protein interfaces through local unfolding or conformational change.
|
91
|
Lensink MF, Velankar S, Baek M, Heo L, Seok C, Wodak SJ. The challenge of modeling protein assemblies: the CASP12-CAPRI experiment. Proteins 2017; 86 Suppl 1:257-273. [PMID: 29127686 DOI: 10.1002/prot.25419] [Citation(s) in RCA: 74] [Impact Index Per Article: 10.6] [Received: 07/18/2017] [Revised: 10/31/2017] [Accepted: 11/07/2017] [Indexed: 12/18/2022]
Abstract
We present the quality assessment of 5613 models submitted by predictor groups from both CAPRI and CASP for the total of 15 most tractable targets from the second joint CASP-CAPRI protein assembly prediction experiment. These targets comprised 12 homo-oligomers and 3 hetero-complexes. The bulk of the analysis focuses on 10 targets (of CAPRI Round 37), which included all 3 hetero-complexes, and whose protein chains or the full assembly could be readily modeled from structural templates in the PDB. On average, 28 CAPRI groups and 10 CASP groups (including automatic servers), submitted models for each of these 10 targets. Additionally, about 16 groups participated in the CAPRI scoring experiments. A range of acceptable to high quality models were obtained for 6 of the 10 Round 37 targets, for which templates were available for the full assembly. Poorer results were achieved for the remaining targets due to the lower quality of the templates available for the full complex or the individual protein chains, highlighting the unmet challenge of modeling the structural adjustments of the protein components that occur upon binding or which must be accounted for in template-based modeling. On the other hand, our analysis indicated that residues in binding interfaces were correctly predicted in a sizable fraction of otherwise poorly modeled assemblies and this with higher accuracy than published methods that do not use information on the binding partner. Lastly, the strengths and weaknesses of the assessment methods are evaluated and improvements suggested.
Affiliation(s)
- Marc F Lensink
  - University Lille, CNRS UMR8576 UGSF, Unité de Glycobiologie Structurale et Fonctionnelle, Lille, France
- Sameer Velankar
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Minkyung Baek
  - Department of Chemistry, Seoul National University, Seoul, Korea
- Lim Heo
  - Department of Chemistry, Seoul National University, Seoul, Korea
- Chaok Seok
  - Department of Chemistry, Seoul National University, Seoul, Korea
- Shoshana J Wodak
  - VIB Structural Biology Research Center, VUB, Pleinlaan 2, Brussels, Belgium
|
92
|
Different protein-protein interface patterns predicted by different machine learning methods. Sci Rep 2017; 7:16023. [PMID: 29167570 PMCID: PMC5700192 DOI: 10.1038/s41598-017-16397-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/05/2017] [Accepted: 11/13/2017] [Indexed: 12/02/2022] Open
Abstract
Different types of protein-protein interactions produce different protein-protein interface patterns, and different machine learning methods are suited to different types of data. Do different machine learning methods therefore favor different interface patterns in prediction? Here, four machine learning methods were employed to predict protein-protein interface residue pairs on different interface patterns. The methods perform differently on different types of proteins, which suggests that different machine learning methods tend to predict different protein-protein interface patterns. We used ANOVA and variable selection to verify our result. Our proposed methods, which take advantage of the different single methods, also achieved good prediction results compared with the single methods. In addition to the prediction of protein-protein interactions, this idea can be extended to other research areas such as protein structure prediction and design.
|
93
|
Huang RYC, Krystek SR, Felix N, Graziano RF, Srinivasan M, Pashine A, Chen G. Hydrogen/deuterium exchange mass spectrometry and computational modeling reveal a discontinuous epitope of an antibody/TL1A Interaction. MAbs 2017; 10:95-103. [PMID: 29135326 DOI: 10.1080/19420862.2017.1393595] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/03/2023] Open
Abstract
TL1A, a tumor necrosis factor-like cytokine, is a ligand for the death domain receptor DR3. TL1A, upon binding to DR3, can stimulate lymphocytes and trigger secretion of proinflammatory cytokines. Therefore, blockade of TL1A/DR3 interaction may be a potential therapeutic strategy for autoimmune and inflammatory diseases. Recently, the anti-TL1A monoclonal antibody 1 (mAb1) with a strong potency in blocking the TL1A/DR3 interaction was identified. Here, we report on the use of hydrogen/deuterium exchange mass spectrometry (HDX-MS) to obtain molecular-level details of mAb1's binding epitope on TL1A. HDX coupled with electron-transfer dissociation MS provided residue-level epitope information. The HDX dataset, in combination with solvent accessible surface area (SASA) analysis and computational modeling, revealed a discontinuous epitope within the predicted interaction interface of TL1A and DR3. The epitope regions span a distance within the approximate size of the variable domains of mAb1's heavy and light chains, indicating it uses a unique mechanism of action to block the TL1A/DR3 interaction.
Affiliation(s)
- Richard Y-C Huang
  - Bioanalytical and Discovery Analytical Sciences, Pharmaceutical Candidate Optimization, Research and Development, Bristol-Myers Squibb Company, Princeton, NJ, USA
- Stanley R Krystek
  - Molecular Discovery Technologies, Research and Development, Bristol-Myers Squibb Company, Princeton, NJ, USA
- Nathan Felix
  - Discovery Biology, Research and Development, Bristol-Myers Squibb Company, Princeton, NJ, USA
- Robert F Graziano
  - Discovery Biology, Research and Development, Bristol-Myers Squibb Company, Princeton, NJ, USA
- Mohan Srinivasan
  - Biologics Discovery California, Research and Development, Bristol-Myers Squibb Company, Redwood City, CA, USA
- Achal Pashine
  - Discovery Biology, Research and Development, Bristol-Myers Squibb Company, Princeton, NJ, USA
- Guodong Chen
  - Bioanalytical and Discovery Analytical Sciences, Pharmaceutical Candidate Optimization, Research and Development, Bristol-Myers Squibb Company, Princeton, NJ, USA
|
94
|
Ivanov SM, Cawley A, Huber RG, Bond PJ, Warwicker J. Protein-protein interactions in paralogues: Electrostatics modulates specificity on a conserved steric scaffold. PLoS One 2017; 12:e0185928. [PMID: 29016650 PMCID: PMC5634604 DOI: 10.1371/journal.pone.0185928] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2017] [Accepted: 09/21/2017] [Indexed: 12/05/2022] Open
Abstract
An improved knowledge of protein-protein interactions is essential for better understanding of metabolic and signaling networks, and cellular function. Progress tends to be based on structure determination and predictions using known structures, along with computational methods based on evolutionary information or detailed atomistic descriptions. We hypothesized that for the case of interactions across a common interface, between proteins from a pair of paralogue families or within a family of paralogues, a relatively simple interface description could distinguish between binding and non-binding pairs. Using binding data for several systems, and large-scale comparative modeling based on known template complex structures, it is found that charge-charge interactions (for groups bearing net charge) are generally a better discriminant than buried non-polar surface. This is particularly the case for paralogue families that are less divergent, with more reliable comparative modeling. We suggest that electrostatic interactions are major determinants of specificity in such systems, an observation that could be used to predict binding partners.
Affiliation(s)
- Stefan M. Ivanov
  - Manchester Institute of Biotechnology, School of Chemistry, The University of Manchester, Manchester, United Kingdom
- Andrew Cawley
  - Manchester Institute of Biotechnology, School of Chemistry, The University of Manchester, Manchester, United Kingdom
- Roland G. Huber
  - Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), Matrix, Singapore, Singapore
- Peter J. Bond
  - Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), Matrix, Singapore, Singapore
  - Department of Biological Sciences, National University of Singapore, Singapore, Singapore
- Jim Warwicker
  - Manchester Institute of Biotechnology, School of Chemistry, The University of Manchester, Manchester, United Kingdom
|
95
|
Membrane proteins structures: A review on computational modeling tools. Biochim Biophys Acta Biomembr 2017; 1859:2021-2039. [DOI: 10.1016/j.bbamem.2017.07.008] [Citation(s) in RCA: 62] [Impact Index Per Article: 8.9] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/23/2017] [Revised: 07/04/2017] [Accepted: 07/13/2017] [Indexed: 01/02/2023]
|
96
|
Moreira IS, Koukos PI, Melo R, Almeida JG, Preto AJ, Schaarschmidt J, Trellet M, Gümüş ZH, Costa J, Bonvin AMJJ. SpotOn: High Accuracy Identification of Protein-Protein Interface Hot-Spots. Sci Rep 2017; 7:8007. [PMID: 28808256 PMCID: PMC5556074 DOI: 10.1038/s41598-017-08321-2] [Citation(s) in RCA: 55] [Impact Index Per Article: 7.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2017] [Accepted: 07/07/2017] [Indexed: 12/21/2022] Open
Abstract
We present SpotOn, a web server to identify and classify interfacial residues as Hot-Spots (HS) and Null-Spots (NS). SpotOn implements a robust algorithm with a demonstrated accuracy of 0.95 and sensitivity of 0.98 on an independent test set. The predictor was developed using an ensemble machine learning approach with up-sampling of the minor class. It was trained on 53 complexes using various features, based on both protein 3D structure and sequence. The SpotOn web interface is freely available at: http://milou.science.uu.nl/services/SPOTON/.
Affiliation(s)
- Irina S Moreira
  - CNC - Center for Neuroscience and Cell Biology; Rua Larga, FMUC, Polo I, 1° andar, Universidade de Coimbra, 3004-517, Coimbra, Portugal
  - Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, 3584CH, The Netherlands
- Panagiotis I Koukos
  - Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, 3584CH, The Netherlands
- Rita Melo
  - CNC - Center for Neuroscience and Cell Biology; Rua Larga, FMUC, Polo I, 1° andar, Universidade de Coimbra, 3004-517, Coimbra, Portugal
  - Centro de Ciências e Tecnologias Nucleares, Instituto Superior Técnico, Universidade de Lisboa, Estrada Nacional 10 (ao km 139,7), 2695-066, Bobadela LRS, Portugal
- Jose G Almeida
  - CNC - Center for Neuroscience and Cell Biology; Rua Larga, FMUC, Polo I, 1° andar, Universidade de Coimbra, 3004-517, Coimbra, Portugal
- Antonio J Preto
  - CNC - Center for Neuroscience and Cell Biology; Rua Larga, FMUC, Polo I, 1° andar, Universidade de Coimbra, 3004-517, Coimbra, Portugal
- Joerg Schaarschmidt
  - Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, 3584CH, The Netherlands
- Mikael Trellet
  - Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, 3584CH, The Netherlands
- Zeynep H Gümüş
  - Department of Genetics and Genomics and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Joaquim Costa
  - CMUP/FCUP, Centro de Matemática da Universidade do Porto, Faculdade de Ciências, Rua do Campo Alegre, 4169-007, Porto, Portugal
- Alexandre M J J Bonvin
  - Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, 3584CH, The Netherlands
|
97
|
Murakami Y, Tripathi LP, Prathipati P, Mizuguchi K. Network analysis and in silico prediction of protein-protein interactions with applications in drug discovery. Curr Opin Struct Biol 2017; 44:134-142. [PMID: 28364585 DOI: 10.1016/j.sbi.2017.02.005] [Citation(s) in RCA: 61] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/26/2016] [Revised: 02/05/2017] [Accepted: 02/23/2017] [Indexed: 11/29/2022]
Abstract
Protein-protein interactions (PPIs) are vital to maintaining cellular homeostasis. Several PPI dysregulations have been implicated in the etiology of various diseases and hence PPIs have emerged as promising targets for drug discovery. Surface residues and hotspot residues at the interface of PPIs form the core regions, which play a key role in modulating cellular processes such as signal transduction and are used as starting points for drug design. In this review, we briefly discuss how PPI networks (PPINs) inferred from experimentally characterized PPI data have been utilized for knowledge discovery and how in silico approaches to PPI characterization can contribute to PPIN-based biological research. Next, we describe the principles of in silico PPI prediction and survey the existing PPI and PPI site prediction servers that are useful for drug discovery. Finally, we discuss the potential of in silico PPI prediction in drug discovery.
Affiliation(s)
- Yoichi Murakami
  - National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito Asagi, Ibaraki, Osaka 567-0085, Japan
- Lokesh P Tripathi
  - National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito Asagi, Ibaraki, Osaka 567-0085, Japan
- Philip Prathipati
  - National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito Asagi, Ibaraki, Osaka 567-0085, Japan
- Kenji Mizuguchi
  - National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito Asagi, Ibaraki, Osaka 567-0085, Japan
|
98
|
Pfeiffenberger E, Chaleil RA, Moal IH, Bates PA. A machine learning approach for ranking clusters of docked protein-protein complexes by pairwise cluster comparison. Proteins 2017; 85:528-543. [PMID: 27935158 PMCID: PMC5396268 DOI: 10.1002/prot.25218] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2016] [Revised: 11/14/2016] [Accepted: 11/21/2016] [Indexed: 01/28/2023]
Abstract
Reliable identification of near-native poses of docked protein-protein complexes is still an unsolved problem. The intrinsic heterogeneity of protein-protein interactions is challenging for traditional biophysical or knowledge based potentials and the identification of many false positive binding sites is not unusual. Often, ranking protocols are based on initial clustering of docked poses followed by the application of an energy function to rank each cluster according to its lowest energy member. Here, we present an approach of cluster ranking based not only on one molecular descriptor (e.g., an energy function) but also employing a large number of descriptors that are integrated in a machine learning model, whereby, an extremely randomized tree classifier based on 109 molecular descriptors is trained. The protocol is based on first locally enriching clusters with additional poses, the clusters are then characterized using features describing the distribution of molecular descriptors within the cluster, which are combined into a pairwise cluster comparison model to discriminate near-native from incorrect clusters. The results show that our approach is able to identify clusters containing near-native protein-protein complexes. In addition, we present an analysis of the descriptors with respect to their power to discriminate near native from incorrect clusters and how data transformations and recursive feature elimination can improve the ranking performance. Proteins 2017; 85:528-543. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Iain H. Moal
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Paul A. Bates
  - Biomolecular Modelling Laboratory, The Francis Crick Institute, London NW1 1AT, UK
|
99
|
Zhang J, Kurgan L. Review and comparative assessment of sequence-based predictors of protein-binding residues. Brief Bioinform 2017; 19:821-837. [DOI: 10.1093/bib/bbx022] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2016] [Indexed: 12/31/2022] Open
Affiliation(s)
- Jian Zhang
  - School of Computer and Information Technology, Xinyang Normal University
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
|
100
|
Molecular Simulations of Disulfide-Rich Venom Peptides with Ion Channels and Membranes. Molecules 2017; 22:362. [PMID: 28264446 PMCID: PMC6155311 DOI: 10.3390/molecules22030362] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2017] [Revised: 02/23/2017] [Accepted: 02/24/2017] [Indexed: 12/12/2022] Open
Abstract
Disulfide-rich peptides isolated from the venom of arthropods and marine animals are a rich source of potent and selective modulators of ion channels. This makes these peptides valuable lead molecules for the development of new drugs to treat neurological disorders. Consequently, much effort goes into understanding their mechanism of action. This paper presents an overview of how molecular simulations have been used to study the interactions of disulfide-rich venom peptides with ion channels and membranes. The review is focused on the use of docking, molecular dynamics simulations, and free energy calculations to (i) predict the structure of peptide-channel complexes; (ii) calculate binding free energies including the effect of peptide modifications; and (iii) study the membrane-binding properties of disulfide-rich venom peptides. The review concludes with a summary and outlook.
|