1
Gutiérrez-Mondragón MA, Vellido A, König C. A Study on the Robustness and Stability of Explainable Deep Learning in an Imbalanced Setting: The Exploration of the Conformational Space of G Protein-Coupled Receptors. Int J Mol Sci 2024; 25:6572. [PMID: 38928278 PMCID: PMC11203844 DOI: 10.3390/ijms25126572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 06/03/2024] [Accepted: 06/12/2024] [Indexed: 06/28/2024] Open
Abstract
G protein-coupled receptors (GPCRs) are transmembrane proteins that transmit signals from the extracellular environment to the inside of the cell. Their ability to adopt various conformational states, which influence their function, makes them crucial in pharmacoproteomic studies. While many drugs target specific GPCR states to exert their effects, thereby regulating the protein's activity, unraveling the activation pathway remains challenging because of the multitude of intermediate transformations that occur throughout this process and intrinsically influence the dynamics of the receptors. In this context, computational modeling, particularly molecular dynamics (MD) simulations, may offer valuable insights into the dynamics and energetics of GPCR transformations, especially when combined with machine learning (ML) methods and techniques for achieving model interpretability for knowledge generation. The current study builds upon previous work in which the layer-wise relevance propagation (LRP) technique was employed to interpret the predictions in a multi-class classification problem concerning the conformational states of the β2-adrenergic receptor (β2AR) from MD simulations. Here, we address the challenges posed by class imbalance and extend previous analyses by evaluating the robustness and stability of deep learning (DL)-based predictions under different imbalance mitigation techniques. By meticulously evaluating explainability and imbalance strategies, we aim to produce reliable and robust insights.
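The abstract does not name the imbalance mitigation techniques that were compared. As one common example of such a technique, and purely as an illustrative assumption rather than the authors' method, the sketch below computes inverse-frequency class weights for a three-state labeling of MD frames. The function name `balanced_class_weights` and the toy label counts are hypothetical.

```python
import numpy as np


def balanced_class_weights(labels):
    """Return {class: weight} with weight = N / (K * count_of_class).

    N is the number of samples and K the number of distinct classes, so
    rare classes receive proportionally larger weights.
    """
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    weights = labels.size / (classes.size * counts)
    return {int(c): float(w) for c, w in zip(classes, weights)}


# Toy, made-up labels (assumption): 0 = inactive, 1 = intermediate, 2 = active.
frame_labels = np.array([0] * 70 + [1] * 20 + [2] * 10)
class_weights = balanced_class_weights(frame_labels)
```

Weights of this kind are typically passed to a weighted loss function so that errors on under-represented states count more during training; resampling the frames is an alternative with the same goal.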
Affiliation(s)
- Mario A. Gutiérrez-Mondragón
  - Computer Science Department, Intelligent Data Science and Artificial Intelligence (IDEAI-UPC) Research Center, Universitat Politècnica de Catalunya, 08034 Barcelona, Spain
- Alfredo Vellido
  - Computer Science Department, Intelligent Data Science and Artificial Intelligence (IDEAI-UPC) Research Center, Universitat Politècnica de Catalunya, 08034 Barcelona, Spain
  - Centro de Investigación Biomédica en Red (CIBER), 28029 Madrid, Spain
- Caroline König
  - Computer Science Department, Intelligent Data Science and Artificial Intelligence (IDEAI-UPC) Research Center, Universitat Politècnica de Catalunya, 08034 Barcelona, Spain
2
Bosio S, Bernetti M, Rocchia W, Masetti M. Similarities and Differences in Ligand Binding to Protein and RNA Targets: The Case of Riboflavin. J Chem Inf Model 2024; 64:4570-4586. [PMID: 38800845 DOI: 10.1021/acs.jcim.4c00420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
It is nowadays clear that RNA molecules can play active roles in several biological processes. As a result, an increasing number of RNAs are gradually being identified as potentially druggable targets. In particular, noncoding RNAs can adopt highly organized conformations that are suitable for drug binding. However, RNAs are still considered challenging targets due to their complex structural dynamics and high charge density. Thus, elucidating relevant features of drug-RNA binding is fundamental for advancing drug discovery. Here, by using Molecular Dynamics simulations, we compare key features of ligand binding to proteins with those observed in RNA. Specifically, we explore similarities and differences in terms of (i) conformational flexibility of the target, (ii) electrostatic contribution to binding free energy, and (iii) water and ligand dynamics. As a test case, we examine binding of the same ligand, namely riboflavin, to protein and RNA targets, specifically the riboflavin (RF) kinase and flavin mononucleotide (FMN) riboswitch. The FMN riboswitch exhibited enhanced fluctuations and explored a wider conformational space, compared to the protein target, underscoring the importance of RNA flexibility in ligand binding. Conversely, a similar electrostatic contribution to the binding free energy of riboflavin was found. Finally, greater stability of water molecules was observed in the FMN riboswitch compared to the RF kinase, possibly due to the different shape and polarity of the pockets.
Affiliation(s)
- Stefano Bosio
  - Department of Pharmacy and Biotechnology, Alma Mater Studiorum - University of Bologna, Via Belmeloro 6, 40126 Bologna, Italy
  - Computational and Chemical Biology, Fondazione Istituto Italiano di Tecnologia, Via Morego 30, I-16163 Genova, Italy
- Mattia Bernetti
  - Department of Pharmacy and Biotechnology, Alma Mater Studiorum - University of Bologna, Via Belmeloro 6, 40126 Bologna, Italy
  - Computational and Chemical Biology, Fondazione Istituto Italiano di Tecnologia, Via Morego 30, I-16163 Genova, Italy
- Walter Rocchia
  - Computational mOdelling of NanosCalE and bioPhysical sysTems (CONCEPT) Lab, Istituto Italiano di Tecnologia, Via Melen - 83, B Block, 16152 Genova, Italy
- Matteo Masetti
  - Department of Pharmacy and Biotechnology, Alma Mater Studiorum - University of Bologna, Via Belmeloro 6, 40126 Bologna, Italy
3
Oh M, da Hora GCA, Swanson JMJ. tICA-Metadynamics for Identifying Slow Dynamics in Membrane Permeation. J Chem Theory Comput 2023; 19:8886-8900. [PMID: 37943658 DOI: 10.1021/acs.jctc.3c00526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2023]
Abstract
Molecular simulations are commonly used to understand the mechanism of membrane permeation of small molecules, particularly for biomedical and pharmaceutical applications. However, despite significant advances in computing power and algorithms, calculating an accurate permeation free energy profile remains elusive for many drug molecules because it can require identifying the rate-limiting degrees of freedom (i.e., appropriate reaction coordinates). To resolve this issue, researchers have developed machine learning approaches to identify slow system dynamics. In this work, we apply time-lagged independent component analysis (tICA), an unsupervised dimensionality reduction algorithm, to molecular dynamics simulations with well-tempered metadynamics to find the slowest collective degrees of freedom of the permeation process of trimethoprim through a multicomponent membrane. We show that tICA-metadynamics yields translational and orientational collective variables (CVs) that increase convergence efficiency ∼1.5 times. However, crossing the periodic boundary is shown to introduce artifacts in the translational CV that can be corrected by taking absolute values of molecular features. Additionally, we find that the convergence of the tICA CVs is reached with approximately five membrane crossings and that data reweighting is required to avoid deviations in the translational CV.
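As a rough illustration of what tICA computes, the sketch below solves the generalized eigenproblem between the time-lagged and instantaneous covariance matrices of a feature time series, using NumPy only. It is a minimal sketch under stated assumptions, not the authors' workflow: the function name `tica`, the synthetic two-feature trajectory, and the lag of 10 frames are all hypothetical, and no metadynamics bias or reweighting is included.

```python
import numpy as np


def tica(X, lag, n_components=2, eps=1e-10):
    """Minimal tICA: return (eigenvalues, components), slowest first.

    X has shape (n_frames, n_features). Each column of `components`
    defines a linear combination of the mean-free features.
    """
    if lag < 1:
        raise ValueError("lag must be a positive number of frames")
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)                  # remove the mean of each feature
    A, B = X[:-lag], X[lag:]                # frames at time t and t + lag
    n = A.shape[0]
    c0 = (A.T @ A + B.T @ B) / (2 * n)      # instantaneous covariance
    ct = (A.T @ B + B.T @ A) / (2 * n)      # symmetrized time-lagged covariance
    # Whiten with c0 so that the generalized problem ct v = lambda c0 v
    # becomes an ordinary symmetric eigenproblem.
    s, u = np.linalg.eigh(c0)
    keep = s > eps                          # drop numerically null directions
    w = u[:, keep] / np.sqrt(s[keep])
    lam, v = np.linalg.eigh(w.T @ ct @ w)
    order = np.argsort(lam)[::-1][:n_components]
    return lam[order], w @ v[:, order]


# Synthetic example (assumption): one slow, low-variance feature and one
# fast, high-variance feature.
rng = np.random.default_rng(0)
t = np.arange(5000)
slow = np.sin(2 * np.pi * t / 1000) + 0.05 * rng.standard_normal(t.size)
fast = 3.0 * rng.standard_normal(t.size)
X = np.column_stack([slow, fast])
eigvals, components = tica(X, lag=10, n_components=2)
```

Each eigenvalue is the autocorrelation of its component at the chosen lag, so the leading component here is dominated by the slow feature even though the fast feature has the larger variance. This is the property that distinguishes tICA from a variance-based projection such as PCA.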
Affiliation(s)
- Myongin Oh
  - Department of Chemistry, University of Utah, 315 South 1400 East, Rm 2020, Salt Lake City, Utah 84112, United States
- Gabriel C A da Hora
  - Department of Chemistry, University of Utah, 315 South 1400 East, Rm 2020, Salt Lake City, Utah 84112, United States
- Jessica M J Swanson
  - Department of Chemistry, University of Utah, 315 South 1400 East, Rm 2020, Salt Lake City, Utah 84112, United States
4
Conflitti P, Raniolo S, Limongelli V. Perspectives on Ligand/Protein Binding Kinetics Simulations: Force Fields, Machine Learning, Sampling, and User-Friendliness. J Chem Theory Comput 2023; 19:6047-6061. [PMID: 37656199 PMCID: PMC10536999 DOI: 10.1021/acs.jctc.3c00641] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/14/2023] [Indexed: 09/02/2023]
Abstract
Computational techniques applied to drug discovery have gained considerable popularity for their ability to filter potentially active drugs from inactive ones, reducing the time scale and costs of preclinical investigations. The main focus of these studies has historically been the search for compounds endowed with high affinity for a specific molecular target to ensure the formation of stable and long-lasting complexes. Recent evidence has also correlated the in vivo drug efficacy with its binding kinetics, thus opening new fascinating scenarios for ligand/protein binding kinetic simulations in drug discovery. The present article examines the state of the art in the field, providing a brief summary of the most popular and advanced ligand/protein binding kinetics techniques and evaluating their current limitations and the potential solutions to reach more accurate kinetic models. Particular emphasis is put on the need for a paradigm change in the present methodologies toward ligand and protein parametrization, the force field problem, characterization of the transition states, the sampling issue, and algorithms' performance, user-friendliness, and data openness.
Affiliation(s)
- Paolo Conflitti
  - Faculty of Biomedical Sciences, Euler Institute, Università della Svizzera italiana (USI), 6900 Lugano, Switzerland
- Stefano Raniolo
  - Faculty of Biomedical Sciences, Euler Institute, Università della Svizzera italiana (USI), 6900 Lugano, Switzerland
- Vittorio Limongelli
  - Faculty of Biomedical Sciences, Euler Institute, Università della Svizzera italiana (USI), 6900 Lugano, Switzerland
  - Department of Pharmacy, University of Naples “Federico II”, 80131 Naples, Italy
5
Oh M, da Hora GCA, Swanson JMJ. tICA-Metadynamics for Identifying Slow Dynamics in Membrane Permeation. bioRxiv 2023:2023.08.16.553477. [PMID: 37645884 PMCID: PMC10462029 DOI: 10.1101/2023.08.16.553477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/31/2023]
Abstract
Molecular simulations are commonly used to understand the mechanism of membrane permeation of small molecules, particularly for biomedical and pharmaceutical applications. However, despite significant advances in computing power and algorithms, calculating an accurate permeation free energy profile remains elusive for many drug molecules because it can require identifying the rate-limiting degrees of freedom (i.e., appropriate reaction coordinates). To resolve this issue, researchers have developed machine learning approaches to identify slow system dynamics. In this work, we apply time-lagged independent component analysis (tICA), an unsupervised dimensionality reduction algorithm, to molecular dynamics simulations with well-tempered metadynamics to find the slowest collective degrees of freedom of the permeation process of trimethoprim through a multicomponent membrane. We show that tICA-metadynamics yields translational and orientational collective variables (CVs) that increase convergence efficiency ∼1.5 times. However, crossing the periodic boundary is shown to introduce artefacts in the translational CV that can be corrected by taking absolute values of molecular features. Additionally, we find that the convergence of the tICA CVs is reached with approximately five membrane crossings, and that data reweighting is required to avoid deviations in the translational CV.
6
Hagg A, Kirschner KN. Open-Source Machine Learning in Computational Chemistry. J Chem Inf Model 2023; 63:4505-4532. [PMID: 37466636 PMCID: PMC10430767 DOI: 10.1021/acs.jcim.3c00643] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/04/2023] [Indexed: 07/20/2023]
Abstract
The field of computational chemistry has seen a significant increase in the integration of machine learning concepts and algorithms. In this Perspective, we surveyed 179 open-source software projects, with corresponding peer-reviewed papers published within the last 5 years, to better understand the topics within the field being investigated by machine learning approaches. For each project, we provide a short description, the link to the code, the accompanying license type, and whether the training data and resulting models are made publicly available. Based on those deposited in GitHub repositories, the most popular employed Python libraries are identified. We hope that this survey will serve as a resource to learn about machine learning or specific architectures thereof by identifying accessible codes with accompanying papers on a topic basis. To this end, we also include computational chemistry open-source software for generating training data and fundamental Python libraries for machine learning. Based on our observations and considering the three pillars of collaborative machine learning work, open data, open source (code), and open models, we provide some suggestions to the community.
Affiliation(s)
- Alexander Hagg
  - Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
  - Department of Electrical Engineering, Mechanical Engineering and Technical Journalism, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
- Karl N. Kirschner
  - Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
  - Department of Computer Science, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
7
Gutiérrez-Mondragón MA, König C, Vellido A. Layer-Wise Relevance Analysis for Motif Recognition in the Activation Pathway of the β2-Adrenergic GPCR Receptor. Int J Mol Sci 2023; 24:1155. [PMID: 36674669 PMCID: PMC9865744 DOI: 10.3390/ijms24021155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 12/22/2022] [Accepted: 12/30/2022] [Indexed: 01/11/2023] Open
Abstract
G-protein-coupled receptors (GPCRs) are cell membrane proteins of relevance as therapeutic targets, and are associated with the development of treatments for illnesses such as diabetes, Alzheimer's, or even cancer. Therefore, comprehending the underlying mechanisms of the receptor functional properties is of particular interest in pharmacoproteomics and in disease therapy at large. Their interaction with ligands elicits multiple molecular rearrangements all along their structure, inducing activation pathways that distinctly influence the cell response. In this work, we studied GPCR signaling pathways from molecular dynamics simulations, as they provide rich information about the dynamic nature of the receptors. We focused on studying the molecular properties of the receptors using deep-learning-based methods. In particular, we designed and trained a one-dimensional convolutional neural network and illustrated its use in classifying the conformational states (active, intermediate, or inactive) of the β2-adrenergic receptor when bound to the full agonist BI-167107. Through a novel explainability-oriented investigation of the prediction results, we were able to identify and assess the contribution of individual motifs (residues) influencing a particular activation pathway. Consequently, we contribute a methodology that assists in the elucidation of the underlying mechanisms of receptor activation-deactivation.
Affiliation(s)
- Mario A. Gutiérrez-Mondragón
  - Computer Science Department, Universitat Politècnica de Catalunya—UPC BarcelonaTech, 08034 Barcelona, Spain
  - Intelligent Data Science and Artificial Intelligence (IDEAI-UPC) Research Center, Universitat Politècnica de Catalunya—UPC BarcelonaTech, 08034 Barcelona, Spain
- Caroline König
  - Computer Science Department, Universitat Politècnica de Catalunya—UPC BarcelonaTech, 08034 Barcelona, Spain
  - Intelligent Data Science and Artificial Intelligence (IDEAI-UPC) Research Center, Universitat Politècnica de Catalunya—UPC BarcelonaTech, 08034 Barcelona, Spain
- Alfredo Vellido
  - Computer Science Department, Universitat Politècnica de Catalunya—UPC BarcelonaTech, 08034 Barcelona, Spain
  - Intelligent Data Science and Artificial Intelligence (IDEAI-UPC) Research Center, Universitat Politècnica de Catalunya—UPC BarcelonaTech, 08034 Barcelona, Spain
8
Bedart C, Renault N, Chavatte P, Porcherie A, Lachgar A, Capron M, Farce A. SINAPs: A Software Tool for Analysis and Visualization of Interaction Networks of Molecular Dynamics Simulations. J Chem Inf Model 2022; 62:1425-1436. [PMID: 35239339 PMCID: PMC8966674 DOI: 10.1021/acs.jcim.1c00854] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
As long as the structural study of molecular mechanisms requires multiple molecular dynamics simulations reflecting contrasted bioactive states, the subsequent analysis of molecular interaction networks remains a bottleneck to be fairly treated and requires a user-friendly 3D view of key interactions. Structural Interaction Network Analysis Protocols (SINAPs) is a proprietary Python tool developed to (i) quickly solve key interactions able to distinguish two protein states, either from two sets of molecular dynamics simulations or from two crystallographic structures, and (ii) render a user-friendly 3D view of these key interactions through a plugin of UCSF Chimera, one of the most popular open-source viewing software packages for biomolecular systems. Through two case studies, glucose transporter-1 (GLUT-1) and A2A adenosine receptor (A2AR), SINAPs easily pinpointed key interactions observed experimentally and relevant for their bioactivities. This very effective tool was thus applied to identify the amino acids involved in the molecular enzymatic mechanisms ruling the activation of an immunomodulator drug candidate, P28 glutathione-S-transferase (P28GST). SINAPs is freely available at https://github.com/ParImmune/SINAPs.
Affiliation(s)
- Corentin Bedart
  - Univ. Lille, Inserm, CHU Lille, U1286 - Infinite - Institute for Translational Research in Inflammation, F-59000 Lille, France
  - Par’Immune, Bio-incubateur Eurasanté, 70 rue du Dr. Yersin, 59120 Loos-Lez-Lille, France
- Nicolas Renault
  - Univ. Lille, Inserm, CHU Lille, U1286 - Infinite - Institute for Translational Research in Inflammation, F-59000 Lille, France
- Philippe Chavatte
  - Univ. Lille, Inserm, CHU Lille, U1286 - Infinite - Institute for Translational Research in Inflammation, F-59000 Lille, France
- Adeline Porcherie
  - Par’Immune, Bio-incubateur Eurasanté, 70 rue du Dr. Yersin, 59120 Loos-Lez-Lille, France
- Abderrahim Lachgar
  - Par’Immune, Bio-incubateur Eurasanté, 70 rue du Dr. Yersin, 59120 Loos-Lez-Lille, France
- Monique Capron
  - Univ. Lille, Inserm, CHU Lille, U1286 - Infinite - Institute for Translational Research in Inflammation, F-59000 Lille, France
  - Par’Immune, Bio-incubateur Eurasanté, 70 rue du Dr. Yersin, 59120 Loos-Lez-Lille, France
- Amaury Farce
  - Univ. Lille, Inserm, CHU Lille, U1286 - Infinite - Institute for Translational Research in Inflammation, F-59000 Lille, France
9
Gianti E, Percec S. Machine Learning at the Interface of Polymer Science and Biology: How Far Can We Go? Biomacromolecules 2022; 23:576-591. [PMID: 35133143 DOI: 10.1021/acs.biomac.1c01436] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/12/2022]
Abstract
This Perspective outlines recent progress and future directions for using machine learning (ML), a data-driven method, to address critical questions in the design, synthesis, processing, and characterization of biomacromolecules. The achievement of these tasks requires the navigation of vast and complex chemical and biological spaces, difficult to accomplish with reasonable speed. Using modern algorithms and supercomputers, quantum physics methods are able to examine systems containing a few hundred interacting species and determine the probability of finding them in a particular region of phase space, thereby anticipating their properties. Likewise, modern approaches in chemistry and biomolecular simulation, supported by high performance computing, have culminated in producing data sets of escalating size and intrinsically high complexity. Hence, using ML to extract relevant information from these fields is of paramount importance to advance our understanding of chemical and biomolecular systems. At the heart of ML approaches lie statistical algorithms, which by evaluating a portion of a given data set, identify, learn, and manipulate the underlying rules that govern the whole data set. The assembly of a quality model to represent the data followed by the predictions and elimination of error sources are the key steps in ML. In addition to a growing infrastructure of ML tools to address complex problems, an increasing number of aspects related to our understanding of the fundamental properties of biomacromolecules are exposed to ML. These fields, including those residing at the interface of polymer science and biology (i.e., structure determination, de novo design, folding, and dynamics), strive to adopt and take advantage of the transformative power offered by approaches in the ML domain, which clearly has the potential of accelerating research in the field of biomacromolecules.
Affiliation(s)
- Eleonora Gianti
  - Institute for Computational Molecular Science (ICMS), Temple University, Philadelphia, Pennsylvania 19122, United States
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
- Simona Percec
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
10
Challenges and frontiers of computational modelling of biomolecular recognition. QRB Discovery 2022. [DOI: 10.1017/qrd.2022.11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022] Open
Abstract
Biomolecular recognition including binding of small molecules, peptides and proteins to their target receptors plays a key role in cellular function and has been targeted for therapeutic drug design. However, the high flexibility of biomolecules and slow binding and dissociation processes have presented challenges for computational modelling. Here, we review the challenges and computational approaches developed to characterise biomolecular binding, including molecular docking, molecular dynamics simulations (especially enhanced sampling) and machine learning. Further improvements are still needed in order to accurately and efficiently characterise binding structures, mechanisms, thermodynamics and kinetics of biomolecules in the future.
11
Lagoutte-Renosi J, Allemand F, Ramseyer C, Yesylevskyy S, Davani S. Molecular modeling in cardiovascular pharmacology: Current state of the art and perspectives. Drug Discov Today 2021; 27:985-1007. [PMID: 34863931 DOI: 10.1016/j.drudis.2021.11.026] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/24/2021] [Revised: 11/02/2021] [Accepted: 11/25/2021] [Indexed: 01/10/2023]
Abstract
Molecular modeling in pharmacology is a promising emerging tool for exploring drug interactions with cellular components. Recent advances in molecular simulations, big data analysis, and artificial intelligence (AI) have opened new opportunities for rationalizing drug interactions with their pharmacological targets. Despite the obvious utility and increasing impact of computational approaches, their development is not progressing at the same speed in different fields of pharmacology. Here, we review current in silico techniques used in cardiovascular diseases (CVDs), cardiological drug discovery, and assessment of cardiotoxicity. In silico techniques are paving the way to a new era in cardiovascular medicine, but their use somewhat lags behind that in other fields.
Affiliation(s)
- Jennifer Lagoutte-Renosi
  - EA 3920 Université Bourgogne Franche-Comté, 25000 Besançon, France
  - Laboratoire de Pharmacologie Clinique et Toxicologie-CHU de Besançon, 25000 Besançon, France
- Florentin Allemand
  - EA 3920 Université Bourgogne Franche-Comté, 25000 Besançon, France
  - Laboratoire Chrono Environnement UMR CNRS 6249, Université de Bourgogne Franche-Comté, 16 route de Gray, 25000 Besançon, France
- Christophe Ramseyer
  - Laboratoire Chrono Environnement UMR CNRS 6249, Université de Bourgogne Franche-Comté, 16 route de Gray, 25000 Besançon, France
- Semen Yesylevskyy
  - Laboratoire Chrono Environnement UMR CNRS 6249, Université de Bourgogne Franche-Comté, 16 route de Gray, 25000 Besançon, France
  - Department of Physics of Biological Systems, Institute of Physics of The National Academy of Sciences of Ukraine, Nauky Sve. 46, Kyiv, Ukraine
  - Receptor.ai inc, 16192 Coastal Highway, Lewes, DE, USA
- Siamak Davani
  - EA 3920 Université Bourgogne Franche-Comté, 25000 Besançon, France
  - Laboratoire de Pharmacologie Clinique et Toxicologie-CHU de Besançon, 25000 Besançon, France
12
Casalino L, Dommer AC, Gaieb Z, Barros EP, Sztain T, Ahn SH, Trifan A, Brace A, Bogetti AT, Clyde A, Ma H, Lee H, Turilli M, Khalid S, Chong LT, Simmerling C, Hardy DJ, Maia JD, Phillips JC, Kurth T, Stern AC, Huang L, McCalpin JD, Tatineni M, Gibbs T, Stone JE, Jha S, Ramanathan A, Amaro RE. AI-driven multiscale simulations illuminate mechanisms of SARS-CoV-2 spike dynamics. The International Journal of High Performance Computing Applications 2021; 35:432-451. [PMID: 38603008 PMCID: PMC8064023 DOI: 10.1177/10943420211006452] [Citation(s) in RCA: 60] [Impact Index Per Article: 20.0] [Indexed: 05/02/2023]
Abstract
We develop a generalizable AI-driven workflow that leverages heterogeneous HPC resources to explore the time-dependent dynamics of molecular systems. We use this workflow to investigate the mechanisms of infectivity of the SARS-CoV-2 spike protein, the main viral infection machinery. Our workflow enables more efficient investigation of spike dynamics in a variety of complex environments, including within a complete SARS-CoV-2 viral envelope simulation, which contains 305 million atoms and shows strong scaling on ORNL Summit using NAMD. We present several novel scientific discoveries, including the elucidation of the spike's full glycan shield, the role of spike glycans in modulating the infectivity of the virus, and the characterization of the flexible interactions between the spike and the human ACE2 receptor. We also demonstrate how AI can accelerate conformational sampling across different systems and pave the way for the future application of such methods to additional studies in SARS-CoV-2 and other molecular systems.
Affiliation(s)
- Lorenzo Casalino
  - University of California San Diego, La Jolla, CA, USA
  - Authors with symbol indicate equal contribution
- Abigail C Dommer
  - University of California San Diego, La Jolla, CA, USA
  - Authors with symbol indicate equal contribution
- Zied Gaieb
  - University of California San Diego, La Jolla, CA, USA
  - Authors with symbol indicate equal contribution
- Terra Sztain
  - University of California San Diego, La Jolla, CA, USA
- Surl-Hee Ahn
  - University of California San Diego, La Jolla, CA, USA
- Anda Trifan
  - Argonne National Lab, Lemont, IL, USA
  - University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Austin Clyde
  - Argonne National Lab, Lemont, IL, USA
  - University of Chicago, Chicago, IL, USA
- Heng Ma
  - Argonne National Lab, Lemont, IL, USA
- David J Hardy
  - University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Julio Dc Maia
  - University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Lei Huang
  - Texas Advanced Computing Center, Austin, TX, USA
- Tom Gibbs
  - NVIDIA Corporation, Santa Clara, CA, USA
- John E Stone
  - University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Shantenu Jha
  - Rutgers University, Piscataway, NJ, USA
  - Brookhaven National Lab, Upton, NY, USA
13
Bertazzo M, Gobbo D, Decherchi S, Cavalli A. Machine Learning and Enhanced Sampling Simulations for Computing the Potential of Mean Force and Standard Binding Free Energy. J Chem Theory Comput 2021; 17:5287-5300. [PMID: 34260233 PMCID: PMC8389529 DOI: 10.1021/acs.jctc.1c00177] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 02/18/2021] [Indexed: 02/07/2023]
Abstract
Computational capabilities are rapidly increasing, primarily because of the availability of GPU-based architectures. This creates unprecedented simulative possibilities for the systematic and robust computation of thermodynamic observables, including the free energy of a drug binding to a target. In contrast to calculations of relative binding free energy, which are nowadays widely exploited for drug discovery, we here push the boundary of computing the binding free energy and the potential of mean force. We introduce a novel protocol that leverages enhanced sampling, machine learning, and ad hoc algorithms to limit human intervention, computing time, and free parameters in free energy calculations. We first validate the method on a host-guest system, and then we apply the protocol to glycogen synthase kinase 3 beta, a protein kinase of pharmacological interest. Overall, we obtain a good correlation with experimental values in relative and absolute terms. While we focus on protein-ligand binding, the strategy is of broad applicability to any complex event that can be described with a path collective variable. We systematically discuss key details that influence the final result. The parameters and simulation settings are available at PLUMED-NEST to allow full reproducibility.
Affiliation(s)
- Martina Bertazzo
  - Computational & Chemical Biology, Fondazione Istituto Italiano di Tecnologia, via Morego 30, 16163 Genoa, Italy
  - Department of Pharmacy and Biotechnology (FaBiT), Alma Mater Studiorum - University of Bologna, via Belmeloro 6, 40126 Bologna, Italy
- Dorothea Gobbo
  - Computational & Chemical Biology, Fondazione Istituto Italiano di Tecnologia, via Morego 30, 16163 Genoa, Italy
- Sergio Decherchi
  - Computational & Chemical Biology, Fondazione Istituto Italiano di Tecnologia, via Morego 30, 16163 Genoa, Italy
  - BiKi Technologies s.r.l., Via XX Settembre 33/10, 16121 Genoa, Italy
- Andrea Cavalli
  - Computational & Chemical Biology, Fondazione Istituto Italiano di Tecnologia, via Morego 30, 16163 Genoa, Italy
  - Department of Pharmacy and Biotechnology (FaBiT), Alma Mater Studiorum - University of Bologna, via Belmeloro 6, 40126 Bologna, Italy
14
|
Gallego V, Naveiro R, Roca C, Ríos Insua D, Campillo NE. AI in drug development: a multidisciplinary perspective. Mol Divers 2021; 25:1461-1479. [PMID: 34251580 PMCID: PMC8342381 DOI: 10.1007/s11030-021-10266-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/21/2021] [Accepted: 06/29/2021] [Indexed: 01/09/2023]
Abstract
The introduction of a new drug to the commercial market follows a long and complex process that typically spans several years and entails large monetary costs due to a high attrition rate. Because of this, there is an urgent need to improve this process using innovative technologies such as artificial intelligence (AI). Different AI tools are being applied to support all four steps of the drug development process (basic research for drug discovery; pre-clinical phase; clinical phase; and postmarketing). Some of the main tasks where AI has proven useful include identifying molecular targets, searching for hit and lead compounds, synthesising drug-like compounds and predicting ADME-Tox. This review, on the one hand, brings a mathematical vision of some of the key AI methods used in drug development closer to medicinal chemists and, on the other hand, brings the drug development process and the use of different models closer to mathematicians. Emphasis is placed on two aspects not mentioned in similar surveys, namely, Bayesian approaches and their applications to molecular modelling, and the eventual use of the methods to actually support decisions, promoting a perfect synergy.
Affiliation(s)
- Víctor Gallego
  - Institute of Mathematical Sciences (ICMAT-CSIC), Nicolás Cabrera 13-15, 28049, Madrid, Spain
- Roi Naveiro
  - Institute of Mathematical Sciences (ICMAT-CSIC), Nicolás Cabrera 13-15, 28049, Madrid, Spain
- Carlos Roca
  - AItenea Biotech S.L. Parque Científico de Madrid, Faraday, 7, 28049, Madrid, Spain
- David Ríos Insua
  - ICMAT-CSIC and Dept. of Statistics and OR, U. Compl. Madrid, Madrid, Spain
- Nuria E Campillo
  - CIB-Margarita Salas (CSIC), Ramiro de Maeztu, 9, 28040, Madrid, Spain
|
15
|
In Silico Approaches: A Way to Unveil Novel Therapeutic Drugs for Cervical Cancer Management. Pharmaceuticals (Basel) 2021; 14:ph14080741. [PMID: 34451838 PMCID: PMC8400112 DOI: 10.3390/ph14080741] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 07/01/2021] [Revised: 07/22/2021] [Accepted: 07/27/2021] [Indexed: 02/07/2023] Open
Abstract
Cervical cancer (CC) is the fourth most common pathology in women worldwide and presents a high impact in developing countries due to limited financial resources as well as difficulties in monitoring and access to health services. Human papillomavirus (HPV) is the leading cause of CC, and despite the approval of prophylactic vaccines, there is no effective treatment for patients with pre-existing infections or HPV-induced carcinomas. High-risk (HR) HPV E6 and E7 oncoproteins are considered biomarkers in CC progression. Since the E6 structure was resolved, it has been one of the most studied targets to develop novel and specific therapeutics to treat/manage CC. Therefore, several small molecules (plant-derived or synthetic compounds) have been reported as blockers/inhibitors of E6 oncoprotein action, and computational-aided methods have been of high relevance in their discovery and development. In silico approaches have become a powerful tool for reducing the time and cost of the drug development process. Thus, this review will depict small molecules that are already being explored as HR HPV E6 protein blockers and in silico approaches to the design of novel therapeutics for managing CC. Besides, future perspectives in CC therapy will be briefly discussed.
|
16
|
Masetti M, Bertazzo M, Recanatini M, Ciurli S, Musiani F. Probing the transport of Ni(II) ions through the internal tunnels of the Helicobacter pylori UreDFG multimeric protein complex. J Inorg Biochem 2021; 223:111554. [PMID: 34325209 DOI: 10.1016/j.jinorgbio.2021.111554] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/14/2021] [Revised: 07/08/2021] [Accepted: 07/16/2021] [Indexed: 11/19/2022]
Abstract
The survival of several pathogenic bacteria, such as Helicobacter pylori (Hp), relies on the activity of the nickel-dependent enzyme urease. Nickel insertion into urease is mediated by a multimeric chaperone complex (HpUreDFG) that is responsible for the transport of Ni(II) from a conserved metal binding motif located in the UreG dimer (CPH motif) to the catalytic site of the enzyme. The X-ray structure of HpUreDFG revealed the presence of water-filled tunnels that were proposed as a route for Ni(II) translocation. Here, we probe the transport of Ni(II) through the internal tunnels of HpUreDFG, from the CPH motif to the external surface of the complex, using microsecond-long enhanced molecular dynamics simulations. The results suggest a "bucket-brigade" mechanism whereby Ni(II) can be transported through a series of stations found along these internal pathways.
Affiliation(s)
- Matteo Masetti
  - Laboratory of Computational Medicinal Chemistry, Department of Pharmacy and Biotechnology, Alma Mater Studiorum - University of Bologna, via Belmeloro 6, I-40126 Bologna, Italy
- Martina Bertazzo
  - Laboratory of Computational Medicinal Chemistry, Department of Pharmacy and Biotechnology, Alma Mater Studiorum - University of Bologna, via Belmeloro 6, I-40126 Bologna, Italy; Computational Sciences, Istituto Italiano di Tecnologia, via Morego 30, I-16163 Genova, Italy
- Maurizio Recanatini
  - Laboratory of Computational Medicinal Chemistry, Department of Pharmacy and Biotechnology, Alma Mater Studiorum - University of Bologna, via Belmeloro 6, I-40126 Bologna, Italy
- Stefano Ciurli
  - Laboratory of Bioinorganic Chemistry, Department of Pharmacy and Biotechnology, Alma Mater Studiorum - University of Bologna, viale G. Fanin 40, I-40127 Bologna, Italy
- Francesco Musiani
  - Laboratory of Bioinorganic Chemistry, Department of Pharmacy and Biotechnology, Alma Mater Studiorum - University of Bologna, viale G. Fanin 40, I-40127 Bologna, Italy
|
17
|
Pavlova A, Zhang Z, Acharya A, Lynch DL, Pang YT, Mou Z, Parks JM, Chipot C, Gumbart JC. Machine Learning Reveals the Critical Interactions for SARS-CoV-2 Spike Protein Binding to ACE2. J Phys Chem Lett 2021; 12:5494-5502. [PMID: 34086459 PMCID: PMC8204752 DOI: 10.1021/acs.jpclett.1c01494] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 05/10/2021] [Accepted: 06/02/2021] [Indexed: 05/06/2023]
Abstract
SARS-CoV and SARS-CoV-2 bind to the human ACE2 receptor in practically identical conformations, although several residues of the receptor-binding domain (RBD) differ between them. Herein, we have used molecular dynamics (MD) simulations, machine learning (ML), and free-energy perturbation (FEP) calculations to elucidate the differences in binding by the two viruses. Although only subtle differences were observed from the initial MD simulations of the two RBD-ACE2 complexes, ML identified the individual residues with the most distinctive ACE2 interactions, many of which have been highlighted in previous experimental studies. FEP calculations quantified the corresponding differences in binding free energies to ACE2, and examination of MD trajectories provided structural explanations for these differences. Lastly, the energetics of emerging SARS-CoV-2 mutations were studied, showing that the affinity of the RBD for ACE2 is increased by N501Y and E484K mutations but is slightly decreased by K417N.
Affiliation(s)
- Anna Pavlova
  - School of Physics, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Zijian Zhang
  - School of Physics, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Atanu Acharya
  - School of Physics, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Diane L. Lynch
  - School of Physics, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Yui Tik Pang
  - School of Physics, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Zhongyu Mou
  - UT/ORNL Center for Molecular Biophysics, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
- Jerry M. Parks
  - UT/ORNL Center for Molecular Biophysics, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
- Chris Chipot
  - Université de Lorraine, UMR 7019, Laboratoire International Associé CNRS and University of Illinois at Urbana-Champaign, Vandoeuvre-lès-Nancy F-54506, France
  - Department of Physics, University of Illinois at Urbana-Champaign, Urbana 61801-3003, Illinois, United States
- James C. Gumbart
  - School of Physics, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
|
18
|
Paissoni C, Camilloni C. How to Determine Accurate Conformational Ensembles by Metadynamics Metainference: A Chignolin Study Case. Front Mol Biosci 2021; 8:694130. [PMID: 34124166 PMCID: PMC8187852 DOI: 10.3389/fmolb.2021.694130] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/12/2021] [Accepted: 05/14/2021] [Indexed: 11/13/2022] Open
Abstract
The reliability and usefulness of molecular dynamics simulations of equilibrium processes rest on their statistical precision and their capability to generate conformational ensembles in agreement with available experimental knowledge. Metadynamics Metainference (M&M), coupling molecular dynamics with the enhanced sampling ability of Metadynamics and with the ability of Metainference to integrate experimental information, can in principle achieve both goals. Here we show that three different Metadynamics setups provide converged estimates of the populations of the three states populated by a model peptide. Errors are estimated correctly by block averaging, but higher precision is obtained by performing independent replicates. One effect of Metadynamics is that of dramatically decreasing the number of effective frames resulting from the simulations, and this is relevant for M&M, where the number of replicas should be large enough to capture the conformational heterogeneity behind the experimental data. Our simulations also allow us to propose that monitoring the relative error associated with conformational averaging can help to determine the minimum number of replicas to be simulated in the context of M&M simulations. Altogether, our data provide useful indications on how to generate sound conformational ensembles in agreement with experimental data.
Affiliation(s)
- Cristina Paissoni
  - Dipartimento di Bioscienze, Università degli Studi di Milano, Milan, Italy
- Carlo Camilloni
  - Dipartimento di Bioscienze, Università degli Studi di Milano, Milan, Italy
|
19
|
Casalino L, Dommer A, Gaieb Z, Barros EP, Sztain T, Ahn SH, Trifan A, Brace A, Bogetti A, Ma H, Lee H, Turilli M, Khalid S, Chong L, Simmerling C, Hardy DJ, Maia JDC, Phillips JC, Kurth T, Stern A, Huang L, McCalpin J, Tatineni M, Gibbs T, Stone JE, Jha S, Ramanathan A, Amaro RE. AI-Driven Multiscale Simulations Illuminate Mechanisms of SARS-CoV-2 Spike Dynamics. bioRxiv 2020:2020.11.19.390187. [PMID: 33236007 PMCID: PMC7685317 DOI: 10.1101/2020.11.19.390187] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 04/17/2023]
Abstract
We develop a generalizable AI-driven workflow that leverages heterogeneous HPC resources to explore the time-dependent dynamics of molecular systems. We use this workflow to investigate the mechanisms of infectivity of the SARS-CoV-2 spike protein, the main viral infection machinery. Our workflow enables more efficient investigation of spike dynamics in a variety of complex environments, including within a complete SARS-CoV-2 viral envelope simulation, which contains 305 million atoms and shows strong scaling on ORNL Summit using NAMD. We present several novel scientific discoveries, including the elucidation of the spike's full glycan shield, the role of spike glycans in modulating the infectivity of the virus, and the characterization of the flexible interactions between the spike and the human ACE2 receptor. We also demonstrate how AI can accelerate conformational sampling across different systems and pave the way for the future application of such methods to additional studies in SARS-CoV-2 and other molecular systems.
Affiliation(s)
- Anda Trifan
  - Argonne National Lab
  - University of Illinois at Urbana-Champaign
- Hyungro Lee
  - Rutgers University & Brookhaven National Lab
|