1
Lu D, Luo D, Zhang Y, Wang B. A Robust Induced Fit Docking Approach with the Combination of the Hybrid All-Atom/United-Atom/Coarse-Grained Model and Simulated Annealing. J Chem Theory Comput 2024; 20:6414-6423. [PMID: 38966989 DOI: 10.1021/acs.jctc.4c00653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/06/2024]
Abstract
Molecular docking remains an indispensable tool in computational biology and structure-based drug discovery. However, the correct prediction of binding poses remains a major challenge for molecular docking, especially for target proteins where substrate binding induces significant reorganization of the active site. Here, we introduce an Induced Fit Docking (IFD) approach named AA/UA/CG-SA-IFD, which combines a hybrid All-Atom/United-Atom/Coarse-Grained model with Simulated Annealing. In this approach, the core region is represented by the All-Atom (AA) model, while the protein environment beyond the core region and the solvent are treated with either the United-Atom (UA) or the Coarse-Grained (CG) model. By combining the Elastic Network Model (ENM) for the CG region, the hybrid model ensures a reasonable description of ligand binding and the environmental effects of the protein, facilitating highly efficient and reliable sampling of ligand binding through Simulated Annealing (SA) at a high temperature. Upon validation with two testing sets, the AA/UA/CG-SA-IFD approach demonstrates remarkable accuracy and efficiency in induced fit docking, even for challenging cases where the docked poses significantly deviate from crystal structures.
Affiliation(s)
- Dexin Lu
  - State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 360015, P. R. China
- Ding Luo
  - State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 360015, P. R. China
- Yuwei Zhang
  - Jiangsu Key Laboratory of New Power Batteries, Jiangsu Collaborative Innovation Centre of Biomedical Functional Materials, School of Chemistry and Materials Science, Nanjing Normal University, Nanjing 210023, P. R. China
- Binju Wang
  - State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 360015, P. R. China
2
Dou B, Zhu Z, Merkurjev E, Ke L, Chen L, Jiang J, Zhu Y, Liu J, Zhang B, Wei GW. Machine Learning Methods for Small Data Challenges in Molecular Science. Chem Rev 2023; 123:8736-8780. [PMID: 37384816 PMCID: PMC10999174 DOI: 10.1021/acs.chemrev.3c00189] [Citation(s) in RCA: 37] [Impact Index Per Article: 37.0] [Indexed: 07/01/2023]
Abstract
Small data are often used in scientific and engineering research due to the presence of various constraints, such as time, cost, ethics, privacy, security, and technical limitations in data acquisition. However, because big data have been the focus for the past decade, small data and their challenges have received little attention, even though they are technically more severe in machine learning (ML) and deep learning (DL) studies. Overall, the small data challenge is often compounded by issues such as data diversity, imputation, noise, imbalance, and high dimensionality. Fortunately, the current big data era is characterized by technological breakthroughs in ML, DL, and artificial intelligence (AI), which enable data-driven scientific discovery, and many advanced ML and DL technologies developed for big data have inadvertently provided solutions for small data problems. As a result, significant progress has been made in ML and DL for small data challenges in the past decade. In this review, we summarize and analyze several emerging potential solutions to small data challenges in molecular science, including chemical and biological sciences. We review both basic machine learning algorithms, such as linear regression, logistic regression (LR), k-nearest neighbor (KNN), support vector machine (SVM), kernel learning (KL), random forest (RF), and gradient boosting trees (GBT), and more advanced techniques, including artificial neural network (ANN), convolutional neural network (CNN), U-Net, graph neural network (GNN), Generative Adversarial Network (GAN), long short-term memory (LSTM), autoencoder, transformer, transfer learning, active learning, graph-based semi-supervised learning, combining deep learning with traditional machine learning, and physical model-based data augmentation. We also briefly discuss the latest advances in these methods. Finally, we conclude the survey with a discussion of promising trends in small data challenges in molecular science.
Affiliation(s)
- Bozheng Dou
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Zailiang Zhu
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Ekaterina Merkurjev
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Lu Ke
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Long Chen
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Jian Jiang
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Yueying Zhu
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Jie Liu
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Bengong Zhang
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
3
Durairaj J, de Ridder D, van Dijk AD. Beyond sequence: Structure-based machine learning. Comput Struct Biotechnol J 2022; 21:630-643. [PMID: 36659927 PMCID: PMC9826903 DOI: 10.1016/j.csbj.2022.12.039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 12/21/2022] [Accepted: 12/21/2022] [Indexed: 12/31/2022] Open
Abstract
Recent breakthroughs in protein structure prediction demarcate the start of a new era in structural bioinformatics. Combined with various advances in experimental structure determination and the uninterrupted pace at which new structures are published, this promises an age in which protein structure information is as prevalent and ubiquitous as sequence. Machine learning in protein bioinformatics has been dominated by sequence-based methods, but this is now changing to make use of the deluge of rich structural information as input. Machine learning methods making use of structures are scattered across literature and cover a number of different applications and scopes; while some try to address questions and tasks within a single protein family, others aim to capture characteristics across all available proteins. In this review, we look at the variety of structure-based machine learning approaches, how structures can be used as input, and typical applications of these approaches in protein biology. We also discuss current challenges and opportunities in this all-important and increasingly popular field.
Affiliation(s)
- Janani Durairaj
  - Biozentrum, University of Basel, Basel, Switzerland
  - Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands
- Dick de Ridder
  - Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands
- Aalt D.J. van Dijk
  - Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands
4
Love O, Pacheco Lima MC, Clark C, Cornillie S, Roalstad S, Cheatham TE. Evaluating the accuracy of the AMBER protein force fields in modeling dihydrofolate reductase structures: misbalance in the conformational arrangements of the flexible loop domains. J Biomol Struct Dyn 2022:1-15. [PMID: 35838167 PMCID: PMC9840716 DOI: 10.1080/07391102.2022.2098823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/17/2023]
Abstract
Protein flexible loop regions were once thought to be simple linkers between other more functional secondary structural elements. However, as it becomes clearer that these loop domains are critical players in a plethora of biological processes, accurate conformational sampling of 3D loop structures is vital to the advancement of drug design techniques and the overall growth of knowledge surrounding molecular systems. While experimental techniques provide a wealth of structural information, the resolution of flexible loop domains is sometimes low or entirely absent due to their complex and dynamic nature. This highlights an opportunity for de novo structure prediction using in silico methods with molecular dynamics (MD). This study evaluates the ability of some of the AMBER protein force fields (ffs) to accurately model dihydrofolate reductase (DHFR) conformations, a protein complex characterized by specific arrangements and interactions of multiple flexible loops whose conformations are determined by the presence or absence of bound ligands and cofactors. Although the AMBER ffs studied, including ff19SB, model most protein structures with rich secondary structure well, the results obtained here suggest an inability to significantly sample the expected DHFR loop-loop conformations: of the six distinct protein-ligand systems simulated, a majority lacked consistent stabilization of the experimentally derived metrics that define the three enzyme conformations. Although under-sampling and the chosen ff parameter combinations could be the cause, given past successes with these MD approaches for many protein systems, this suggests a potential misbalance in the available ff parameters required to accurately predict the structure of multiple flexible loop regions present in proteins. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Olivia Love
  - Department of Medicinal Chemistry, College of Pharmacy, University of Utah, Salt Lake City, UT, USA
- Casey Clark
  - Department of Medicinal Chemistry, College of Pharmacy, University of Utah, Salt Lake City, UT, USA
- Sean Cornillie
  - Department of Medicinal Chemistry, College of Pharmacy, University of Utah, Salt Lake City, UT, USA
- Shelly Roalstad
  - Department of Medicinal Chemistry, College of Pharmacy, University of Utah, Salt Lake City, UT, USA
- Thomas E. Cheatham
  - Department of Medicinal Chemistry, College of Pharmacy, University of Utah, Salt Lake City, UT, USA
5
Qi JH, Dong FX, Wang K, Zhang SY, Liu ZM, Wang WJ, Sun FZ, Zhang HM, Wang XL. Feasibility analysis and mechanism exploration of Rhei Radix et Rhizome-Schisandrae Sphenantherae Fructus (RS) against COVID-19. J Med Microbiol 2022; 71. [PMID: 35584000 DOI: 10.1099/jmm.0.001528] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 12/19/2022] Open
Abstract
Introduction. As a novel global epidemic, coronavirus disease 2019 (COVID-19), caused by SARS-CoV-2, has brought great suffering and disaster to mankind. Although significant progress has recently been made in vaccines against SARS-CoV-2, there are still no drugs for treating COVID-19. It is well known that traditional Chinese medicine (TCM) has achieved excellent efficacy in the treatment of COVID-19 in China. As a treasure-house of natural drugs, Chinese herbs offer a promising prospect for discovering anti-COVID-19 drugs.
Hypothesis/Gap Statement. We proposed that Rhei Radix et Rhizome-Schisandrae Sphenantherae Fructus (RS) may have potential value in the treatment of COVID-19 patients by regulating immune response, protecting the cardiovascular system, inhibiting the production of inflammatory factors, and blocking virus invasion and replication processes.
Aim. We aimed to explore the feasibility and molecular mechanisms of RS against COVID-19, to provide a reference for basic research and clinical applications.
Methodology. Through literature mining, we found that a Chinese herbal pair, RS, has potential anti-COVID-19 activity. In this study, we analysed the feasibility of RS against COVID-19 by high-throughput molecular docking and molecular dynamics simulations. Furthermore, we predicted the molecular mechanisms of RS against COVID-19 based on network pharmacology.
Results. We proved the feasibility of RS against COVID-19 by literature mining, virtual docking and molecular dynamics simulations, and found that angiotensin converting enzyme 2 (ACE2) and 3C-like protease (3CLpro) were also two critical targets for RS against COVID-19. In addition, we predicted the molecular mechanisms of RS in the treatment of COVID-19, and identified 29 main ingredients, 21 potential targets and 16 signalling pathways. Rhein, eupatin, (-)-catechin, and aloe-emodin may be important active ingredients in RS. ALB, ESR1, EGFR, HMOX1, CTSL, and RHOA may be important targets against COVID-19. Platelet activation, renin secretion, Ras signalling pathway, chemokine signalling pathway, and human cytomegalovirus infection may be important signalling pathways against COVID-19.
Conclusion. RS plays a key role in the treatment of COVID-19, which may be closely related to immune regulation, cardiovascular protection, anti-inflammation, virus invasion and replication processes.
Affiliation(s)
- Jian-Hong Qi
  - Shandong University of Traditional Chinese Medicine, Jinan 250355, PR China
- Fang-Xu Dong
  - Shandong University of Traditional Chinese Medicine, Jinan 250355, PR China
- Ke Wang
  - Shandong University of Traditional Chinese Medicine, Jinan 250355, PR China
- Shan-Yu Zhang
  - Shandong University of Traditional Chinese Medicine, Jinan 250355, PR China
- Zi-Ming Liu
  - Shandong University of Traditional Chinese Medicine, Jinan 250355, PR China
- Wen-Jing Wang
  - Shandong University of Traditional Chinese Medicine, Jinan 250355, PR China
- Feng-Zhi Sun
  - The Pharmacy Department, Maternal and Child Health Care Hospital of Shandong Province, Jinan 250014, PR China
- Hui-Min Zhang
  - Shandong Academy of Chinese Medicine, Jinan 250014, PR China
- Xiao-Long Wang
  - Shandong University of Traditional Chinese Medicine, Jinan 250355, PR China
  - Key Laboratory of Traditional Chinese Medicine Classical Theory, Ministry of Education, Shandong University of Traditional Chinese Medicine, Jinan 250355, PR China
  - Shandong Provincial Key Laboratory of Traditional Chinese Medicine for Basic Research, Shandong University of Traditional Chinese Medicine, Jinan 250355, PR China
6
Alfonso-Prieto M. Bitter Taste and Olfactory Receptors: Beyond Chemical Sensing in the Tongue and the Nose. J Membr Biol 2021; 254:343-352. [PMID: 34173018 PMCID: PMC8231087 DOI: 10.1007/s00232-021-00182-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/15/2021] [Accepted: 04/29/2021] [Indexed: 11/24/2022]
Abstract
The Up-and-Coming-Scientist section of the current issue of the Journal of Membrane Biology features the invited essay by Dr. Mercedes Alfonso-Prieto, Assistant Professor at the Forschungszentrum Jülich (FZJ), Germany, and the Heinrich-Heine University Düsseldorf, Vogt Institute for Brain Research.
Dr. Alfonso-Prieto completed her doctoral degree in chemistry at the Barcelona Science Park, Spain, in 2009, pursued post-doctoral research in computational molecular sciences at Temple University, USA, and then, as a Marie Curie post-doctoral fellow at the University of Barcelona, worked on computations of enzyme reactions and modeling of photoswitchable ligands targeting neuronal receptors. In 2016, she joined the Institute for Advanced Science and the Institute for Computational Biomedicine at the FZJ, where she pursues research on modeling and simulation of chemical senses.
The invited essay by Dr. Alfonso-Prieto discusses state-of-the-art modeling of molecular receptors involved in chemical sensing – the senses of taste and smell. These receptors, and computational methods to study them, are the focus of Dr. Alfonso-Prieto’s research. Recently, Dr. Alfonso-Prieto and colleagues have presented a new methodology to predict ligand binding poses for GPCRs, and extensive computations that deciphered the ligand selectivity determinants of bitter taste receptors. These developments inform our current understanding of how taste occurs at the molecular level.
Affiliation(s)
- Mercedes Alfonso-Prieto
  - Institute for Advanced Simulations IAS-5/Institute for Neuroscience and Medicine INM-9, Computational Biomedicine, Forschungszentrum Jülich GmbH, Jülich, Germany
  - Medical Faculty, Cécile and Oskar Vogt Institute for Brain Research, University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
7
Korshunova K, Carloni P. Ligand Affinities within the Open-Boundary Molecular Mechanics/Coarse-Grained Framework (I): Alchemical Transformations within the Hamiltonian Adaptive Resolution Scheme. J Phys Chem B 2021; 125:789-797. [PMID: 33443434 DOI: 10.1021/acs.jpcb.0c09805] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/18/2023]
Abstract
Our recently developed Open-Boundary Molecular Mechanics/Coarse Grained (OB-MM/CG) framework predicts ligand poses in important pharmaceutical targets, such as G-protein Coupled Receptors, even when experimental structural information is lacking. The approach, which is based on GROMOS and AMBER force fields, allows for grand-canonical simulations of protein-ligand complexes by using the Hamiltonian Adaptive Resolution Scheme (H-AdResS) for the solvent. Here, we present a key step toward the estimation of ligand binding affinities for their targets within this approach. This is the implementation of the H-AdResS in the GROMACS code. The accuracy of our implementation is established by calculating hydration free energies of several molecules in water by means of alchemical transformations. The deviations of the GROMOS- and AMBER-based H-AdResS results from the reference fully atomistic simulations are smaller than the accuracy of the force field and/or they are in the range of the published results. Importantly, our predictions are in good agreement with experimental data. The current implementation paves the way to the use of the OB-MM/CG framework for the study of large biological systems.
Affiliation(s)
- Ksenia Korshunova
  - Department of Physics, RWTH Aachen University, 52074 Aachen, Germany
  - Computational Biomedicine, Institute of Advanced Simulations IAS-5/Institute for Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, 52428 Jülich, Germany
- Paolo Carloni
  - Department of Physics, RWTH Aachen University, 52074 Aachen, Germany
  - Computational Biomedicine, Institute of Advanced Simulations IAS-5/Institute for Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, 52428 Jülich, Germany
  - Molecular Neuroscience and Neuroimaging (INM-11), Forschungszentrum Jülich GmbH, 52428 Jülich, Germany
8
Schneider J, Ribeiro R, Alfonso-Prieto M, Carloni P, Giorgetti A. Hybrid MM/CG Webserver: Automatic Set Up of Molecular Mechanics/Coarse-Grained Simulations for Human G Protein-Coupled Receptor/Ligand Complexes. Front Mol Biosci 2020; 7:576689. [PMID: 33102525 PMCID: PMC7500467 DOI: 10.3389/fmolb.2020.576689] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/26/2020] [Accepted: 08/13/2020] [Indexed: 12/12/2022] Open
Abstract
Hybrid Molecular Mechanics/Coarse-Grained (MM/CG) simulations help predict ligand poses in human G protein-coupled receptors (hGPCRs), the most important protein superfamily for pharmacological applications. This approach allows the description of the ligand, the binding cavity, and the surrounding water molecules at atomistic resolution, while coarse-graining the rest of the receptor. Here, we present the Hybrid MM/CG Webserver (mmcg.grs.kfa-juelich.de) that automatizes and speeds up the MM/CG simulation setup of hGPCR/ligand complexes. Initial structures for such complexes can be easily and efficiently generated with other webservers. The Hybrid MM/CG server also allows for equilibration of the systems, either fully automatically or interactively. The results are visualized online (using both interactive 3D visualizations and analysis plots), helping the user identify possible issues and modify the setup parameters accordingly. Furthermore, the prepared system can be downloaded and the simulation continued locally.
Affiliation(s)
- Jakob Schneider
  - Computational Biomedicine, Institute for Advanced Simulation IAS-5/Institute for Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, Jülich, Germany
  - JARA-Institute: Molecular Neuroscience and Neuroimaging, Institute for Neuroscience and Medicine INM-11/JARA-BRAIN Institute JBI-2, Forschungszentrum Jülich GmbH, Jülich, Germany
  - Department of Physics, RWTH Aachen University, Aachen, Germany
- Rui Ribeiro
  - Department of Biotechnology, University of Verona, Verona, Italy
- Mercedes Alfonso-Prieto
  - Computational Biomedicine, Institute for Advanced Simulation IAS-5/Institute for Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, Jülich, Germany
  - Medical Faculty, Cécile and Oskar Vogt Institute for Brain Research, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Paolo Carloni
  - Computational Biomedicine, Institute for Advanced Simulation IAS-5/Institute for Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, Jülich, Germany
  - JARA-Institute: Molecular Neuroscience and Neuroimaging, Institute for Neuroscience and Medicine INM-11/JARA-BRAIN Institute JBI-2, Forschungszentrum Jülich GmbH, Jülich, Germany
  - Department of Physics, RWTH Aachen University, Aachen, Germany
- Alejandro Giorgetti
  - Computational Biomedicine, Institute for Advanced Simulation IAS-5/Institute for Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, Jülich, Germany
  - Department of Biotechnology, University of Verona, Verona, Italy