1
Jandova Z, Vargiu AV, Bonvin AMJJ. Native or Non-Native Protein-Protein Docking Models? Molecular Dynamics to the Rescue. J Chem Theory Comput 2021; 17:5944-5954. [PMID: 34342983 PMCID: PMC8444332 DOI: 10.1021/acs.jctc.1c00336] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 04/06/2021] [Indexed: 11/29/2022]
Abstract
Molecular docking excels at creating a plethora of potential models of protein-protein complexes. To correctly distinguish the favorable, native-like models from the remaining ones remains, however, a challenge. We assessed here if a protocol based on molecular dynamics (MD) simulations would allow distinguishing native from non-native models to complement scoring functions used in docking. To this end, the first models for 25 protein-protein complexes were generated using HADDOCK. Next, MD simulations complemented with machine learning were used to discriminate between native and non-native complexes based on a combination of metrics reporting on the stability of the initial models. Native models showed higher stability in almost all measured properties, including the key ones used for scoring in the Critical Assessment of PRedicted Interaction (CAPRI) competition, namely the positional root mean square deviations and fraction of native contacts from the initial docked model. A random forest classifier was trained, reaching a 0.85 accuracy in correctly distinguishing native from non-native complexes. Reasonably modest simulation lengths of the order of 50-100 ns are sufficient to reach this accuracy, which makes this approach applicable in practice.
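As a rough illustration of the two stability metrics named in this abstract, the sketch below computes a positional RMSD and the fraction of initial interface contacts that survive in a later frame. It is not the authors' protocol: the function names, the 4 Å cutoff, the one-point-per-residue representation, and the assumption that frames are already superimposed are all illustrative choices.

```python
import math

def positional_rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) points.
    Assumes the two frames are already superimposed (no fitting is done)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

def interface_contacts(chain_a, chain_b, cutoff=4.0):
    """Index pairs (i, j) whose points lie within `cutoff` of each other.
    The cutoff value is an assumption made for this toy example."""
    return {
        (i, j)
        for i, p in enumerate(chain_a)
        for j, q in enumerate(chain_b)
        if math.dist(p, q) <= cutoff
    }

def fraction_retained(initial_contacts, current_contacts):
    """Share of the initial contacts still present in the current frame."""
    if not initial_contacts:
        return 0.0
    return len(initial_contacts & current_contacts) / len(initial_contacts)

# Toy frames: chain A stays fixed, one residue of chain B drifts away.
initial_a = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
initial_b = [(0.0, 3.0, 0.0), (4.0, 3.0, 0.0)]
later_b = [(0.0, 3.0, 0.0), (4.0, 9.0, 0.0)]

start = interface_contacts(initial_a, initial_b)
now = interface_contacts(initial_a, later_b)
print(fraction_retained(start, now))                   # 0.5
print(round(positional_rmsd(initial_b, later_b), 3))   # 4.243
```

Tracked over a trajectory, values like these could serve as input features for a classifier such as the random forest mentioned above; how the authors actually aggregate them is not stated here.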
Affiliation(s)
- Zuzana Jandova
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, the Netherlands
- Attilio Vittorio Vargiu
- Physics Department, University of Cagliari, Cittadella Universitaria, S.P. 8 km 0.700, 09042 Monserrato, Italy
- Alexandre M. J. J. Bonvin
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, the Netherlands
2
Sotudian S, Desta IT, Hashemi N, Zarbafian S, Kozakov D, Vakili P, Vajda S, Paschalidis IC. Improved cluster ranking in protein-protein docking using a regression approach. Comput Struct Biotechnol J 2021; 19:2269-2278. [PMID: 33995918 PMCID: PMC8102165 DOI: 10.1016/j.csbj.2021.04.028] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/31/2021] [Revised: 04/08/2021] [Accepted: 04/09/2021] [Indexed: 11/21/2022] Open
Abstract
We develop a Regression-based Ranking by Pairwise Cluster Comparisons (RRPCC) method to rank clusters of similar protein complex conformations generated by an underlying docking program. The method leverages robust regression to predict the relative quality difference between any pair of clusters and combines these pairwise assessments to form a ranked list of clusters, from higher to lower quality. We apply RRPCC to clusters produced by the automated docking server ClusPro and, depending on the training/validation strategy, we show improvement by 24-100% in ranking acceptable or better quality clusters first, and by 15-100% in ranking medium or better quality clusters first. We compare the RRPCC-ClusPro combination to a number of alternatives, and show that very different machine learning approaches to scoring docked structures yield similar success rates. Finally, we discuss the current limitations on sampling and scoring, looking ahead to further improvements. Interestingly, some features important for improved scoring are internal energy terms that occur only due to the local energy minimization applied in the refinement stage following rigid body docking.
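To make the idea of turning pairwise comparisons into a ranked list concrete, here is a minimal sketch. The abstract does not say how RRPCC combines its pairwise assessments, so the aggregation used here (summing each cluster's predicted advantage over all others) is an assumption, and `rank_from_pairwise` and the toy matrix are hypothetical.

```python
def rank_from_pairwise(diff):
    """Rank items from a matrix of pairwise quality differences.

    diff[i][j] is the predicted quality of cluster i minus that of cluster j.
    Each cluster is scored by its summed advantage over the others
    (an assumed aggregation rule), and indices are returned best first.
    """
    n = len(diff)
    scores = [sum(diff[i][j] for j in range(n) if j != i) for i in range(n)]
    return sorted(range(n), key=lambda i: scores[i], reverse=True)

# Toy predictions for three clusters; in practice a regression model
# would supply these differences.
predicted = [
    [0.0, -1.0, 2.0],
    [1.0, 0.0, 3.0],
    [-2.0, -3.0, 0.0],
]
print(rank_from_pairwise(predicted))  # [1, 0, 2]
```

Summation is only one option; win counts or a fitted latent score per cluster would be equally plausible readings of "combines these pairwise assessments".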
Affiliation(s)
- Nasser Hashemi
- Division of Systems Engineering, Boston University, Boston, USA
- Dima Kozakov
- Laufer Center for Physical and Quantitative Biology, Institute for Advanced Computational Sciences, Stony Brook University, Stony Brook, USA
- Pirooz Vakili
- Division of Systems Engineering, Boston University, Boston, USA
- Sandor Vajda
- Department of Biomedical Engineering, Boston University
- Department of Chemistry, Boston University
- Ioannis Ch. Paschalidis
- Division of Systems Engineering, Boston University, Boston, USA
- Department of Biomedical Engineering, Boston University
- Department of Electrical & Computer Engineering, and Faculty for Computing & Data Sciences, Boston University
3
Jing Y. Research on fuzzy English automatic recognition and human-computer interaction based on machine learning. Journal of Intelligent & Fuzzy Systems 2020. [DOI: 10.3233/jifs-189057] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/25/2022]
Abstract
Fuzzy English recognition is affected by many factors, which leads to certain accuracy problems in intelligent recognition results. In order to improve the automatic recognition efficiency of fuzzy English, based on machine learning technology, this study constructs a neural network model. At the same time, this paper analyzes the research status and existing problems of handwritten character recognition, analyzes the model, and adopts multiple modules for automatic English recognition. In addition, the system is built on the basis of algorithms and model support, which makes fuzzy English recognition intelligent. Finally, in order to study the algorithm and model performance, the fuzzy English recognition is carried out through experiments. The research shows that the model constructed in this paper has certain recognition effect, which can be applied to practice, and can provide theoretical reference for subsequent related research.
Affiliation(s)
- Yuqin Jing
- School of Electronic Information Engineering, Chongqing Technology and Business Institute, Chongqing, China
4
Arya R, Paliwal S, Gupta SP, Sharma S, Madan K, Mishra A, Verma K, Chauhan N. In-silico Studies and Biological Activity of Potential BACE-1 Inhibitors. Comb Chem High Throughput Screen 2020; 24:729-736. [PMID: 32957879 DOI: 10.2174/1386207323999200918151331] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/09/2020] [Revised: 08/08/2020] [Accepted: 08/12/2020] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Alzheimer's disease is a neurological condition causing cognitive inability and dementia. The pathological lesions and neuronal damage in the brain are caused by self-aggregated fragments of mutated amyloid precursor protein (APP). OBJECTIVE: The controlled APP processing by inhibition of secretase is the strategy to reduce Aβ load to treat Alzheimer's disease. METHODS: A QSAR study was performed on 55 pyrrolidine-based ligands as BACE-1 inhibitors with an activity magnitude greater than 4 of compounds. RESULTS: In designing new BACE-1 inhibitors, the pharmacophore model with correlation (r = 0.90) and root mean square deviation (RMSD) of 0.87 was developed and validated. Further, the hits retrieved by the in-silico approach were evaluated by docking interactions. CONCLUSION: Two structurally diverse compounds exhibited Asp32 and Thr232 binding with the BACE-1 receptor. The aryl-substituted carbamate compound exhibited the highest fit value and docking score. The biological activity evaluation by in-vitro assay was found to be >0.1 μM.
Affiliation(s)
- Richa Arya
- Banasthali Vidyapith, Banasthali-304022 (Raj.), India
- Satya P Gupta
- Department of Pharmaceutical Technology, Meerut Institute of Engineering and Technology, Meerut-250005, India
- Kirtika Madan
- Banasthali Vidyapith, Banasthali-304022 (Raj.), India
- Achal Mishra
- Faculty of Pharmaceutical Sciences, Shri Shankaracharya Tech. Campus, Bhilai, India
- Kanika Verma
- Banasthali Vidyapith, Banasthali-304022 (Raj.), India
- Neha Chauhan
- Banasthali Vidyapith, Banasthali-304022 (Raj.), India
5
Tanemura KA, Pei J, Merz KM. Refinement of pairwise potentials via logistic regression to score protein-protein interactions. Proteins 2020; 88:1559-1568. [PMID: 32729132 DOI: 10.1002/prot.25973] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/25/2020] [Revised: 05/17/2020] [Accepted: 06/14/2020] [Indexed: 12/20/2022]
Abstract
Protein-protein interactions (PPIs) are ubiquitous and functionally of great importance in biological systems. Hence, the accurate prediction of PPIs by protein-protein docking and scoring tools is highly desirable in order to characterize their structure and biological function. Ab initio docking protocols are divided into the sampling of docking poses to produce at least one near-native structure, and then to evaluate the vast candidate structures by scoring. Concurrent development in both sampling and scoring is crucial for the deployment of protein-protein docking software. In the present work, we apply a machine learning model on pairwise potentials to refine the task of protein quaternary structure native structure detection among decoys. A decoy set was featurized using the Knowledge and Empirical Combined Scoring Algorithm 2 (KECSA2) pairwise potential. The highly unbalanced decoy set was then balanced using a comparison concept between native and decoy structures. The resultant comparison descriptors were used to train a logistic regression (LR) classifier. The LR model yielded the optimal performance for native detection among decoys compared with conventional scoring functions, while exhibiting lesser performance for the detection of low root mean square deviation decoy structures. Its deployment on an independent benchmark set confirms that the scoring function performs competitively relative to other scoring functions. The scripts used are available at https://github.com/TanemuraKiyoto/PPI-native-detection-via-LR.
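One way to read the "comparison concept" in this abstract is sketched below: each native/decoy pair yields two difference vectors with opposite labels, which balances the classes by construction, and a logistic regression is then fitted on those differences. This is an assumed interpretation, not the published method; the feature vectors are made-up numbers rather than KECSA2 potentials, and every function name is hypothetical. The authors' own scripts are at the repository linked above.

```python
import math

def comparison_set(native, decoys):
    """Build a balanced training set from one native and many decoys.
    (native - decoy) is labelled 1 and (decoy - native) is labelled 0."""
    features, labels = [], []
    for decoy in decoys:
        features.append([n - d for n, d in zip(native, decoy)])
        labels.append(1)
        features.append([d - n for n, d in zip(native, decoy)])
        labels.append(0)
    return features, labels

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, x):
    """Probability that the first structure of the compared pair is native."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)))

def fit_logistic(features, labels, learning_rate=0.1, epochs=500):
    """Plain batch gradient descent on the logistic loss, without a bias
    term because the comparison set is antisymmetric by construction."""
    weights = [0.0] * len(features[0])
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(features, labels):
            error = predict(weights, x) - y
            for j, v in enumerate(x):
                grad[j] += error * v
        weights = [w - learning_rate * g / len(features)
                   for w, g in zip(weights, grad)]
    return weights

# Made-up two-term energy features; lower values mimic a better score.
native = [-5.0, -3.0]
decoys = [[-2.0, -1.0], [-1.0, -2.5], [-3.0, 0.0]]

X, y = comparison_set(native, decoys)
w = fit_logistic(X, y)
print([round(predict(w, x), 2) for x in X])
```

Pairs ordered as (native, decoy) should receive probabilities above 0.5 and the reversed pairs below 0.5 on this separable toy set.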
Affiliation(s)
- Kiyoto A Tanemura
- Department of Chemistry, Michigan State University, East Lansing, Michigan, USA
- Jun Pei
- Department of Chemistry, Michigan State University, East Lansing, Michigan, USA
- Kenneth M Merz
- Department of Chemistry, Michigan State University, East Lansing, Michigan, USA
6
Abstract
Many of the biological functions of the cell are driven by protein-protein interactions. However, determining which proteins interact and exactly how they do so to enable their functions, remain major research questions. Functional interactions are dependent on a number of complicated factors; therefore, modeling the three-dimensional structure of protein-protein complexes is still considered a complex endeavor. Nevertheless, the rewards for modeling protein interactions to atomic level detail are substantial, and there are numerous examples of how models can provide useful information for drug design, protein engineering, systems biology, and understanding of the immune system. Here, we provide practical guidelines for docking proteins using the web-server, SwarmDock, a flexible protein-protein docking method. Moreover, we provide an overview of the factors that need to be considered when deciding whether docking is likely to be successful.
Affiliation(s)
- Iain H Moal
- European Bioinformatics Institute, Hinxton, UK
- Paul A Bates
- Biomolecular Modelling Laboratory, The Francis Crick Institute, London, UK
7
Perthold JW, Oostenbrink C. GroScore: Accurate Scoring of Protein–Protein Binding Poses Using Explicit-Solvent Free-Energy Calculations. J Chem Inf Model 2019; 59:5074-5085. [DOI: 10.1021/acs.jcim.9b00687] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/28/2022]
Affiliation(s)
- Jan Walther Perthold
- Institute of Molecular Modeling and Simulation, University of Natural Resources and Life Sciences, Muthgasse 18, 1190 Vienna, Austria
- Chris Oostenbrink
- Institute of Molecular Modeling and Simulation, University of Natural Resources and Life Sciences, Muthgasse 18, 1190 Vienna, Austria
8
Jankauskaite J, Jiménez-García B, Dapkunas J, Fernández-Recio J, Moal IH. SKEMPI 2.0: an updated benchmark of changes in protein-protein binding energy, kinetics and thermodynamics upon mutation. Bioinformatics 2019; 35:462-469. [PMID: 30020414 PMCID: PMC6361233 DOI: 10.1093/bioinformatics/bty635] [Citation(s) in RCA: 161] [Impact Index Per Article: 32.2] [Received: 05/16/2018] [Accepted: 07/17/2018] [Indexed: 11/18/2022] Open
Abstract
Motivation: Understanding the relationship between the sequence, structure, binding energy, binding kinetics and binding thermodynamics of protein–protein interactions is crucial to understanding cellular signaling, the assembly and regulation of molecular complexes, the mechanisms through which mutations lead to disease, and protein engineering. Results: We present SKEMPI 2.0, a major update to our database of binding free energy changes upon mutation for structurally resolved protein–protein interactions. This version now contains manually curated binding data for 7085 mutations, an increase of 133%, including changes in kinetics for 1844 mutations, enthalpy and entropy changes for 443 mutations, and 440 mutations that abolish detectable binding. Availability and implementation: The database is available as supplementary data and at https://life.bsc.es/pid/skempi2/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Justina Jankauskaite
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Brian Jiménez-García
- Barcelona Supercomputing Center (BSC), Barcelona, Spain; Bijvoet Center for Biomolecular Research, Faculty of Science, Utrecht University, Utrecht, the Netherlands
- Justas Dapkunas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Juan Fernández-Recio
- Barcelona Supercomputing Center (BSC), Barcelona, Spain; Institut de Biologia Molecular de Barcelona (IBMB), CSIC, Barcelona, Spain
- Iain H Moal
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
9
Pfeiffenberger E, Bates PA. Refinement of protein-protein complexes in contact map space with metadynamics simulations. Proteins 2019; 87:12-22. [PMID: 30370948 PMCID: PMC6492248 DOI: 10.1002/prot.25612] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 05/06/2018] [Revised: 09/21/2018] [Accepted: 09/26/2018] [Indexed: 12/18/2022]
Abstract
Accurate protein-protein complex prediction, to atomic detail, is a challenging problem. For flexible docking cases, current state-of-the-art docking methods are limited in their ability to exhaustively search the high dimensionality of the problem space. In this study, to obtain more accurate models, an investigation into the local optimization of initial docked solutions is presented with respect to a reference crystal structure. We show how physics-based refinement of protein-protein complexes in contact map space (CMS), within a metadynamics protocol, can be performed. The method uses five replicate 10 ns simulations for sampling and ranks the generated conformational snapshots with ZRANK to identify an ensemble of n snapshots for final model building. Furthermore, we investigated whether the reconstructed free energy surface (FES), or a combination of both FES and ZRANK, referred to as CSα, can help to reduce snapshot ranking error.
Affiliation(s)
- Erik Pfeiffenberger
- Biomolecular Modelling Laboratory, The Francis Crick Institute, London, United Kingdom
- Paul A. Bates
- Biomolecular Modelling Laboratory, The Francis Crick Institute, London, United Kingdom
10
Pfeiffenberger E, Bates PA. Predicting improved protein conformations with a temporal deep recurrent neural network. PLoS One 2018; 13:e0202652. [PMID: 30180164 PMCID: PMC6122789 DOI: 10.1371/journal.pone.0202652] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 04/10/2018] [Accepted: 08/07/2018] [Indexed: 02/03/2023] Open
Abstract
Accurate protein structure prediction from amino acid sequence is still an unsolved problem. The most reliable methods centre on template based modelling. However, the accuracy of these models entirely depends on the availability of experimentally resolved homologous template structures. In order to generate more accurate models, extensive physics based molecular dynamics (MD) refinement simulations are performed to sample many different conformations to find improved conformational states. In this study, we propose a deep recurrent network model, called DeepTrajectory, that is able to identify these improved conformational states, with high precision, from a variety of different MD based sampling protocols. The proposed model learns the temporal patterns of features computed from MD trajectory data in order to classify whether each recorded simulation snapshot is an improved quality conformational state, decreased quality conformational state or whether there is no perceivable change in state with respect to the starting conformation. The model was trained and tested on 904 trajectories from 42 different protein systems with a cumulative number of more than 1.7 million snapshots. We show that our model outperforms other state of the art machine-learning algorithms that do not consider temporal dependencies. To our knowledge, DeepTrajectory is the first implementation of a time-dependent deep-learning protocol that is re-trainable and able to adapt to any new MD based sampling procedure, thereby demonstrating how a neural network can be used to learn the latter part of the protein folding funnel.
Affiliation(s)
- Erik Pfeiffenberger
- Biomolecular Modelling Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, United Kingdom
- Paul A. Bates
- Biomolecular Modelling Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, United Kingdom
11
Zarbafian S, Moghadasi M, Roshandelpoor A, Nan F, Li K, Vakli P, Vajda S, Kozakov D, Paschalidis IC. Protein docking refinement by convex underestimation in the low-dimensional subspace of encounter complexes. Sci Rep 2018; 8:5896. [PMID: 29650980 PMCID: PMC5955889 DOI: 10.1038/s41598-018-23982-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 09/18/2017] [Accepted: 03/21/2018] [Indexed: 01/18/2023] Open
Abstract
We propose a novel stochastic global optimization algorithm with applications to the refinement stage of protein docking prediction methods. Our approach can process conformations sampled from multiple clusters, each roughly corresponding to a different binding energy funnel. These clusters are obtained using a density-based clustering method. In each cluster, we identify a smooth “permissive” subspace which avoids high-energy barriers and then underestimate the binding energy function using general convex polynomials in this subspace. We use the underestimator to bias sampling towards its global minimum. Sampling and subspace underestimation are repeated several times and the conformations sampled at the last iteration form a refined ensemble. We report computational results on a comprehensive benchmark of 224 protein complexes, establishing that our refined ensemble significantly improves the quality of the conformations of the original set given to the algorithm. We also devise a method to enhance the ensemble from which near-native models are selected.
Affiliation(s)
- Shahrooz Zarbafian
- Department of Mechanical Engineering, Boston University, Boston, Massachusetts, United States of America
- Mohammad Moghadasi
- Division of Systems Engineering, Boston University, Boston, Massachusetts, United States of America
- Athar Roshandelpoor
- Division of Systems Engineering, Boston University, Boston, Massachusetts, United States of America
- Feng Nan
- Division of Systems Engineering, Boston University, Boston, Massachusetts, United States of America
- Keyong Li
- Division of Systems Engineering, Boston University, Boston, Massachusetts, United States of America
- Pirooz Vakli
- Division of Systems Engineering, Boston University, Boston, Massachusetts, United States of America; Department of Mechanical Engineering, Boston University, Boston, Massachusetts, United States of America
- Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
- Dima Kozakov
- Department of Applied Mathematics and Statistics and Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, United States of America
- Ioannis Ch Paschalidis
- Division of Systems Engineering, Boston University, Boston, Massachusetts, United States of America; Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America; Department of Electrical and Computer Engineering, Boston University, Boston, Massachusetts, United States of America; 8 Saint Mary's St., Boston, MA, 02215, United States of America
12
Abstract
The atomic structures of protein complexes can provide useful information for drug design, protein engineering, systems biology, and understanding pathology. Obtaining this information experimentally can be challenging. However, if the structures of the subunits are known, then it is often possible to model the complex computationally. This chapter provides practical guidelines for docking proteins using the SwarmDock flexible protein-protein docking method, with an overview of the factors that need to be considered when deciding whether docking is likely to be successful, the preparation of structural input, generation of docked poses, analysis and ranking of docked poses, and the validation of models using external data.
Affiliation(s)
- Iain H Moal
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
- Paul A Bates
- Biomolecular Modelling Laboratory, The Francis Crick Institute, London, UK
13
Tamò G, Maesani A, Träger S, Degiacomi MT, Floreano D, Dal Peraro M. Disentangling constraints using viability evolution principles in integrative modeling of macromolecular assemblies. Sci Rep 2017; 7:235. [PMID: 28331186 PMCID: PMC5427971 DOI: 10.1038/s41598-017-00266-w] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 12/06/2016] [Accepted: 02/14/2017] [Indexed: 11/22/2022] Open
Abstract
Predicting the structure of large molecular assemblies remains a challenging task in structural biology when using integrative modeling approaches. One of the main issues stems from the treatment of heterogeneous experimental data used to predict the architecture of native complexes. We propose a new method, applied here for the first time to a set of symmetrical complexes, based on evolutionary computation that treats every available experimental input independently, bypassing the need to balance weight components assigned to aggregated fitness functions during optimization.
Affiliation(s)
- Giorgio Tamò
- Laboratory for Biomolecular Modeling, Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, CH-1015, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, CH-1015, Switzerland
- Andrea Maesani
- Laboratory of Intelligent Systems, Institute of Microengineering, École Polytechnique Fédérale de Lausanne, Lausanne, CH-1015, Switzerland
- Sylvain Träger
- Laboratory for Biomolecular Modeling, Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, CH-1015, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, CH-1015, Switzerland
- Matteo T Degiacomi
- Chemistry Research Laboratory, Department of Chemistry, University of Oxford, Oxford, UK
- Dario Floreano
- Laboratory of Intelligent Systems, Institute of Microengineering, École Polytechnique Fédérale de Lausanne, Lausanne, CH-1015, Switzerland
- Matteo Dal Peraro
- Laboratory for Biomolecular Modeling, Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, CH-1015, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, CH-1015, Switzerland