1
Wen H, Ouyang H, Shang H, Da C, Zhang T. Helix-to-sheet transition of the Aβ42 peptide revealed using an enhanced sampling strategy and Markov state model. Comput Struct Biotechnol J 2024; 23:688-699. [PMID: 38292476 PMCID: PMC10825278 DOI: 10.1016/j.csbj.2023.12.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 12/14/2023] [Accepted: 12/16/2023] [Indexed: 02/01/2024] Open
Abstract
The self-assembly of Aβ peptides into toxic oligomers and fibrils is the primary cause of Alzheimer's disease. Moreover, the conformational transition from helix to sheet is considered a crucial step in the aggregation of Aβ peptides. However, the structural details of this process still remain unclear due to the heterogeneity and transient nature of the Aβ peptides. In this study, we developed an enhanced sampling strategy that combines artificial neural networks (ANN) with metadynamics to explore the conformational space of the Aβ42 peptides. The strategy consists of two parts: applying ANN to optimize CVs and conducting metadynamics based on the resulting CVs to sample conformations. The results showed that this strategy achieved better sampling performance in terms of the distribution of sampled conformations. The sampling efficiency is increased by 10-fold compared to our previous Hamiltonian Exchange Molecular Dynamics (MD) and by 1000-fold compared to ordinary MD. Based on the sampled conformations, we constructed a Markov state model to understand the detailed transition process. The intermediate states in this process are identified, and the connecting paths are analyzed. The conformational transitions in D23-K28 and M35-V40 are proven to be crucial for aggregation. These results are helpful in clarifying the mechanism and process of Aβ42 peptide aggregation. D23-K28 and M35-V40 can be identified as potential targets for screening and designing inhibitors of Aβ peptide aggregation.
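As an illustration of the Markov state model step mentioned in this abstract, the following is a minimal sketch and not the authors' pipeline. It assumes conformations have already been clustered into discrete states; the three-state toy trajectory, the function names, and the lag time are hypothetical. The transition matrix is estimated by simple row-normalized counting, without the reversibility constraints that dedicated MSM packages usually apply.

```python
import numpy as np

def estimate_transition_matrix(dtraj, n_states, lag=1):
    """Row-normalized transition counts at the given lag (plain counting estimator)."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    # Guard against division by zero for states with no outgoing transitions.
    row_sums[row_sums == 0.0] = 1.0
    return counts / row_sums

def stationary_distribution(T):
    """Left eigenvector of T for the largest eigenvalue, normalized to sum to 1."""
    evals, evecs = np.linalg.eig(T.T)
    idx = np.argmax(evals.real)
    pi = np.abs(evecs[:, idx].real)
    return pi / pi.sum()

# Hypothetical discrete trajectory over three states
# (e.g. 0 = helical, 1 = intermediate, 2 = sheet-like; labels are assumptions).
dtraj = np.array([0, 0, 1, 1, 2, 2, 1, 0, 0, 1, 2, 2, 2, 1, 0])
T = estimate_transition_matrix(dtraj, n_states=3, lag=1)
pi = stationary_distribution(T)
```

Each row of `T` gives the probability of moving from one state to every other state within one lag time, and `pi` approximates the equilibrium populations implied by those transitions.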
Affiliation(s)
- Huilin Wen
  - School of Biomedical Engineering and Technology, Tianjin Medical University, Tianjin 300070, PR China
  - The Third Hospital of Hebei Medical University, Shijiazhuang 050051, PR China
- Hao Ouyang
  - School of Biomedical Engineering and Technology, Tianjin Medical University, Tianjin 300070, PR China
- Hao Shang
  - School of Biomedical Engineering and Technology, Tianjin Medical University, Tianjin 300070, PR China
- Chaohong Da
  - School of Biomedical Engineering and Technology, Tianjin Medical University, Tianjin 300070, PR China
- Tao Zhang
  - School of Biomedical Engineering and Technology, Tianjin Medical University, Tianjin 300070, PR China
2
Oh M, Rosa M, Xie H, Khelashvili G. Automated collective variable discovery for MFSD2A transporter from molecular dynamics simulations. Biophys J 2024; 123:2934-2955. [PMID: 38932456 PMCID: PMC11393714 DOI: 10.1016/j.bpj.2024.06.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Revised: 06/03/2024] [Accepted: 06/24/2024] [Indexed: 06/28/2024] Open
Abstract
Biomolecules often exhibit complex free energy landscapes in which long-lived metastable states are separated by large energy barriers. Overcoming these barriers to robustly sample transitions between the metastable states with classical molecular dynamics (MD) simulations presents a challenge. To circumvent this issue, collective variable (CV)-based enhanced sampling MD approaches are often employed. Traditional CV selection relies on intuition and prior knowledge of the system. This approach introduces bias, which can lead to incomplete mechanistic insights. Thus, automated CV detection is desired to gain a deeper understanding of the system/process. Analysis of MD data with various machine-learning algorithms, such as principal component analysis (PCA), support vector machine, and linear discriminant analysis (LDA) based approaches have been implemented for automated CV detection. However, their performance has not been systematically evaluated on structurally and mechanistically complex biological systems. Here, we applied these methods to MD simulations of the MFSD2A (Major Facilitator Superfamily Domain 2A) lysolipid transporter in multiple functionally relevant metastable states with the goal of identifying optimal CVs that would structurally discriminate these states. Specific emphasis was on the automated detection and interpretive power of LDA-based CVs. We found that LDA methods, which included a novel gradient descent-based multiclass harmonic variant, termed GDHLDA, we developed here, outperform PCA in class separation, exhibiting remarkable consistency in extracting CVs critical for distinguishing metastable states. Furthermore, the identified CVs included features previously associated with conformational transitions in MFSD2A. Specifically, conformational shifts in transmembrane helix 7 and in residue Y294 on this helix emerged as critical features discriminating the metastable states in MFSD2A. 
This highlights the effectiveness of LDA-based approaches in automatically extracting from MD trajectories CVs of functional relevance that can be used to drive biased MD simulations to efficiently sample conformational transitions in the molecular system.
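To make the idea of an LDA-based collective variable concrete, here is a minimal sketch of classical two-class Fisher LDA. It is not the GDHLDA variant described in the abstract; the synthetic three-feature data and the function name are assumptions chosen only for illustration. The CV is the projection of each frame's feature vector onto the direction that best separates the two labeled states.

```python
import numpy as np

def fisher_lda_direction(X0, X1):
    """Two-class Fisher LDA: unit vector proportional to Sw^{-1} (mu1 - mu0)."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter matrix, summed over both classes.
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    w = np.linalg.solve(Sw, mu1 - mu0)
    return w / np.linalg.norm(w)

# Synthetic "frames" from two metastable states (hypothetical features):
# the states differ only along feature 0.
rng = np.random.default_rng(0)
X0 = rng.normal(loc=[0.0, 0.0, 0.0], scale=1.0, size=(200, 3))
X1 = rng.normal(loc=[3.0, 0.0, 0.0], scale=1.0, size=(200, 3))

w = fisher_lda_direction(X0, X1)
cv0, cv1 = X0 @ w, X1 @ w  # CV values for frames of each state
```

The largest-magnitude entries of `w` indicate which input features discriminate the states, which is the interpretive use of LDA coefficients that the abstract emphasizes.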
Affiliation(s)
- Myongin Oh
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, New York
- Margarida Rosa
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, New York
- Hengyi Xie
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, New York
- George Khelashvili
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, New York; Institute for Computational Biomedicine, Weill Cornell Medicine, New York, New York
3
Ghosh D, Biswas A, Radhakrishna M. Advanced computational approaches to understand protein aggregation. Biophysics Reviews 2024; 5:021302. [PMID: 38681860 PMCID: PMC11045254 DOI: 10.1063/5.0180691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 03/18/2024] [Indexed: 05/01/2024]
Abstract
Protein aggregation is a widespread phenomenon implicated in debilitating diseases like Alzheimer's, Parkinson's, and cataracts, presenting complex hurdles for the field of molecular biology. In this review, we explore the evolving realm of computational methods and bioinformatics tools that have revolutionized our comprehension of protein aggregation. Beginning with a discussion of the multifaceted challenges associated with understanding this process and emphasizing the critical need for precise predictive tools, we highlight how computational techniques have become indispensable for understanding protein aggregation. We focus on molecular simulations, notably molecular dynamics (MD) simulations, spanning from atomistic to coarse-grained levels, which have emerged as pivotal tools in unraveling the complex dynamics governing protein aggregation in diseases such as cataracts, Alzheimer's, and Parkinson's. MD simulations provide microscopic insights into protein interactions and the subtleties of aggregation pathways, with advanced techniques like replica exchange molecular dynamics, Metadynamics (MetaD), and umbrella sampling enhancing our understanding by probing intricate energy landscapes and transition states. We delve into specific applications of MD simulations, elucidating the chaperone mechanism underlying cataract formation using Markov state modeling and the intricate pathways and interactions driving the toxic aggregate formation in Alzheimer's and Parkinson's disease. Transitioning we highlight how computational techniques, including bioinformatics, sequence analysis, structural data, machine learning algorithms, and artificial intelligence have become indispensable for predicting protein aggregation propensity and locating aggregation-prone regions within protein sequences. 
Throughout our exploration, we underscore the symbiotic relationship between computational approaches and empirical data, which has paved the way for potential therapeutic strategies against protein aggregation-related diseases. In conclusion, this review offers a comprehensive overview of advanced computational methodologies and bioinformatics tools that have catalyzed breakthroughs in unraveling the molecular basis of protein aggregation, with significant implications for clinical interventions, standing at the intersection of computational biology and experimental research.
Affiliation(s)
- Deepshikha Ghosh
  - Department of Biological Sciences and Engineering, Indian Institute of Technology (IIT) Gandhinagar, Palaj, Gujarat 382355, India
- Anushka Biswas
  - Department of Chemical Engineering, Indian Institute of Technology (IIT) Gandhinagar, Palaj, Gujarat 382355, India
4
Oh M, da Hora GCA, Swanson JMJ. tICA-Metadynamics for Identifying Slow Dynamics in Membrane Permeation. J Chem Theory Comput 2023; 19:8886-8900. [PMID: 37943658 PMCID: PMC11282584 DOI: 10.1021/acs.jctc.3c00526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2023]
Abstract
Molecular simulations are commonly used to understand the mechanism of membrane permeation of small molecules, particularly for biomedical and pharmaceutical applications. However, despite significant advances in computing power and algorithms, calculating an accurate permeation free energy profile remains elusive for many drug molecules because it can require identifying the rate-limiting degrees of freedom (i.e., appropriate reaction coordinates). To resolve this issue, researchers have developed machine learning approaches to identify slow system dynamics. In this work, we apply time-lagged independent component analysis (tICA), an unsupervised dimensionality reduction algorithm, to molecular dynamics simulations with well-tempered metadynamics to find the slowest collective degrees of freedom of the permeation process of trimethoprim through a multicomponent membrane. We show that tICA-metadynamics yields translational and orientational collective variables (CVs) that increase convergence efficiency ∼1.5 times. However, crossing the periodic boundary is shown to introduce artifacts in the translational CV that can be corrected by taking absolute values of molecular features. Additionally, we find that the convergence of the tICA CVs is reached with approximately five membrane crossings and that data reweighting is required to avoid deviations in the translational CV.
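For readers unfamiliar with tICA, the sketch below shows the core linear algebra under simplifying assumptions: unbiased, unweighted data and a two-feature toy trajectory. The function name, lag time, and synthetic data are hypothetical, and the reweighting of metadynamics-biased data that the abstract describes is not included. tICA solves the generalized eigenproblem C(τ)v = λC(0)v; the eigenvector with the largest eigenvalue is the slowest linear combination of features.

```python
import numpy as np

def tica(X, lag, n_components=1):
    """Minimal tICA: solve C(lag) v = lambda C(0) v via whitening (symmetrized estimator)."""
    X = X - X.mean(axis=0)
    A, B = X[:-lag], X[lag:]
    C0 = (A.T @ A + B.T @ B) / (2 * len(A))   # instantaneous covariance
    Ct = (A.T @ B + B.T @ A) / (2 * len(A))   # symmetrized time-lagged covariance
    s, U = np.linalg.eigh(C0)
    W = U / np.sqrt(s)                        # whitening transform, W.T @ C0 @ W = I
    evals, evecs = np.linalg.eigh(W.T @ Ct @ W)
    order = np.argsort(evals)[::-1][:n_components]
    return evals[order], W @ evecs[:, order]

# Toy trajectory: feature 0 is fast white noise, feature 1 is a slow AR(1) process.
rng = np.random.default_rng(1)
n = 5000
slow = np.zeros(n)
for t in range(1, n):
    slow[t] = 0.99 * slow[t - 1] + rng.normal(scale=0.1)
fast = rng.normal(size=n)
X = np.column_stack([fast, slow])

evals, vecs = tica(X, lag=10)
```

In this toy case the leading tICA vector should load mainly on the slow feature, which is the behavior one wants from a CV intended to capture rate-limiting degrees of freedom.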
Affiliation(s)
- Myongin Oh
  - Department of Chemistry, University of Utah, 315 South 1400 East, Rm 2020, Salt Lake City, Utah 84112, United States
- Gabriel C A da Hora
  - Department of Chemistry, University of Utah, 315 South 1400 East, Rm 2020, Salt Lake City, Utah 84112, United States
- Jessica M J Swanson
  - Department of Chemistry, University of Utah, 315 South 1400 East, Rm 2020, Salt Lake City, Utah 84112, United States
5
Oh M, da Hora GCA, Swanson JMJ. tICA-Metadynamics for Identifying Slow Dynamics in Membrane Permeation. bioRxiv 2023:2023.08.16.553477. [PMID: 37645884 PMCID: PMC10462029 DOI: 10.1101/2023.08.16.553477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/31/2023]
Abstract
Molecular simulations are commonly used to understand the mechanism of membrane permeation of small molecules, particularly for biomedical and pharmaceutical applications. However, despite significant advances in computing power and algorithms, calculating an accurate permeation free energy profile remains elusive for many drug molecules because it can require identifying the rate-limiting degrees of freedom (i.e., appropriate reaction coordinates). To resolve this issue, researchers have developed machine learning approaches to identify slow system dynamics. In this work, we apply time-lagged independent component analysis (tICA), an unsupervised dimensionality reduction algorithm, to molecular dynamics simulations with well-tempered metadynamics to find the slowest collective degrees of freedom of the permeation process of trimethoprim through a multicomponent membrane. We show that tICA-metadynamics yields translational and orientational collective variables (CVs) that increase convergence efficiency ∼1.5 times. However, crossing the periodic boundary is shown to introduce artefacts in the translational CV that can be corrected by taking absolute values of molecular features. Additionally, we find that the convergence of the tICA CVs is reached with approximately five membrane crossings, and that data reweighting is required to avoid deviations in the translational CV.
6
Ketkaew R, Luber S. DeepCV: A Deep Learning Framework for Blind Search of Collective Variables in Expanded Configurational Space. J Chem Inf Model 2022; 62:6352-6364. [PMID: 36445176 DOI: 10.1021/acs.jcim.2c00883] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 12/02/2022]
Abstract
We present Deep learning for Collective Variables (DeepCV), a computer code that provides an efficient and customizable implementation of the deep autoencoder neural network (DAENN) algorithm that has been developed in our group for computing collective variables (CVs) and can be used with enhanced sampling methods to reconstruct free energy surfaces of chemical reactions. DeepCV can be used to conveniently calculate molecular features, train models, generate CVs, validate rare events from sampling, and analyze a trajectory for chemical reactions of interest. We use DeepCV in an example study of the conformational transition of cyclohexene, where metadynamics simulations are performed using DAENN-generated CVs. The results show that the adopted CVs give free energies in line with those obtained by previously developed CVs and experimental results. DeepCV is open-source software written in Python/C++ object-oriented languages, based on the TensorFlow framework and distributed free of charge for noncommercial purposes, which can be incorporated into general molecular dynamics software. DeepCV also comes with several additional tools, i.e., an application program interface (API), documentation, and tutorials.
Affiliation(s)
- Rangsiman Ketkaew
  - Department of Chemistry, University of Zurich, Winterthurerstrasse 190, CH-8057 Zürich, Switzerland
- Sandra Luber
  - Department of Chemistry, University of Zurich, Winterthurerstrasse 190, CH-8057 Zürich, Switzerland
7
Spiwok V, Kurečka M, Křenek A. Collective Variable for Metadynamics Derived From AlphaFold Output. Front Mol Biosci 2022; 9:878133. [PMID: 35769910 PMCID: PMC9234394 DOI: 10.3389/fmolb.2022.878133] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/17/2022] [Accepted: 05/05/2022] [Indexed: 11/13/2022] Open
Abstract
AlphaFold is a neural network–based tool for the prediction of 3D structures of proteins. In CASP14, a blind structure prediction challenge, it performed significantly better than other competitors, making it the best available structure prediction tool. One of the outputs of AlphaFold is the probability profile of residue–residue distances. This makes it possible to score any conformation of the studied protein to express its compliance with the AlphaFold model. Here, we show how this score can be used to drive protein folding simulation by metadynamics and parallel tempering metadynamics. Using parallel tempering metadynamics, we simulated the folding of a mini-protein Trp-cage and β hairpin and predicted their folding equilibria. We observe the potential of the AlphaFold-based collective variable in applications beyond structure prediction, such as in structure refinement or prediction of the outcome of a mutation.
Affiliation(s)
- Vojtěch Spiwok
  - Department of Biochemistry and Microbiology, Faculty of Food and Biochemical Technology, University of Chemistry and Technology, Prague, Czechia
  - *Correspondence: Vojtěch Spiwok,
- Martin Kurečka
  - Institute of Computer Science, Masaryk University, Brno, Czechia
- Aleš Křenek
  - Institute of Computer Science, Masaryk University, Brno, Czechia
8
Basciu A, Callea L, Motta S, Bonvin AM, Bonati L, Vargiu AV. No dance, no partner! A tale of receptor flexibility in docking and virtual screening. Virtual Screening and Drug Docking 2022. [DOI: 10.1016/bs.armc.2022.08.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
9
Chen H, Liu H, Feng H, Fu H, Cai W, Shao X, Chipot C. MLCV: Bridging Machine-Learning-Based Dimensionality Reduction and Free-Energy Calculation. J Chem Inf Model 2021; 62:1-8. [PMID: 34939790 DOI: 10.1021/acs.jcim.1c01010] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 11/28/2022]
Abstract
Importance-sampling algorithms leaning on the definition of a model reaction coordinate (RC) are widely employed to probe processes relevant to chemistry and biology alike, spanning time scales not amenable to common, brute-force molecular dynamics (MD) simulations. In practice, the model RC often consists of a handful of collective variables (CVs) chosen on the basis of chemical intuition. However, constructing manually a low-dimensional RC model to describe an intricate geometrical transformation for the purpose of free-energy calculations and analyses remains a daunting challenge due to the inherent complexity of the conformational transitions at play. To solve this issue, remarkable progress has been made in employing machine-learning techniques, such as autoencoders, to extract the low-dimensional RC model from a large set of CVs. Implementation of the differentiable, nonlinear machine-learned CVs in common MD engines to perform free-energy calculations is, however, particularly cumbersome. To address this issue, we present here a user-friendly tool (called MLCV) that facilitates the use of machine-learned CVs in importance-sampling simulations through the popular Colvars module. Our approach is critically probed with three case examples consisting of small peptides, showcasing that through hard-coded neural network in Colvars, deep-learning and enhanced-sampling can be effectively bridged with MD simulations. The MLCV code is versatile, applicable to all the CVs available in Colvars, and can be connected to any kind of dense neural networks. We believe that MLCV provides an effective, powerful, and user-friendly platform accessible to experts and nonexperts alike for machine-learning (ML)-guided CV discovery and enhanced-sampling simulations to unveil the molecular mechanisms underlying complex biochemical processes.
Affiliation(s)
- Haochuan Chen
  - Research Center for Analytical Sciences, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin 300071, China; Tianjin Key Laboratory of Biosensing and Molecular Recognition, Tianjin 300071, China; State Key Laboratory of Medicinal Chemical Biology, Tianjin 300071, China
- Han Liu
  - Research Center for Analytical Sciences, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin 300071, China; Tianjin Key Laboratory of Biosensing and Molecular Recognition, Tianjin 300071, China; State Key Laboratory of Medicinal Chemical Biology, Tianjin 300071, China
- Heying Feng
  - Research Center for Analytical Sciences, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin 300071, China; Tianjin Key Laboratory of Biosensing and Molecular Recognition, Tianjin 300071, China; State Key Laboratory of Medicinal Chemical Biology, Tianjin 300071, China
- Haohao Fu
  - Research Center for Analytical Sciences, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin 300071, China; Tianjin Key Laboratory of Biosensing and Molecular Recognition, Tianjin 300071, China; State Key Laboratory of Medicinal Chemical Biology, Tianjin 300071, China
- Wensheng Cai
  - Research Center for Analytical Sciences, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin 300071, China; Tianjin Key Laboratory of Biosensing and Molecular Recognition, Tianjin 300071, China; State Key Laboratory of Medicinal Chemical Biology, Tianjin 300071, China
- Xueguang Shao
  - Research Center for Analytical Sciences, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin 300071, China; Tianjin Key Laboratory of Biosensing and Molecular Recognition, Tianjin 300071, China; State Key Laboratory of Medicinal Chemical Biology, Tianjin 300071, China
- Christophe Chipot
  - Laboratoire International Associé CNRS and University of Illinois at Urbana-Champaign, UMR no. 7019, Université de Lorraine, BP 70239, F-54506 Vandœuvre-lès-Nancy, France
10
Ahn SH, Ojha AA, Amaro RE, McCammon JA. Gaussian-Accelerated Molecular Dynamics with the Weighted Ensemble Method: A Hybrid Method Improves Thermodynamic and Kinetic Sampling. J Chem Theory Comput 2021; 17:7938-7951. [PMID: 34844409 DOI: 10.1021/acs.jctc.1c00770] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 11/28/2022]
Abstract
Gaussian-accelerated molecular dynamics (GaMD) is a well-established enhanced sampling method for molecular dynamics simulations that effectively samples the potential energy landscape of the system by adding a boost potential, which smoothens the surface and lowers the energy barriers between states. GaMD is unable to give time-dependent properties such as kinetics directly. On the other hand, the weighted ensemble (WE) method can efficiently sample transitions between states with its many weighted trajectories, which directly yield rates and pathways. However, convergence to equilibrium conditions remains a challenge for the WE method. Hence, we have developed a hybrid method that combines the two methods, wherein GaMD is first used to sample the potential energy landscape of the system and WE is subsequently used to further sample the potential energy landscape and kinetic properties of interest. We show that the hybrid method can sample both thermodynamic and kinetic properties more accurately and quickly compared to using either method alone.
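The boost potential mentioned in this abstract can be sketched with the commonly quoted GaMD harmonic form, ΔV = ½k(E − V)² applied only where V < E. The sketch below is an illustration under stated assumptions and not the hybrid GaMD-WE method itself: the three-point energy profile is hypothetical, energies are in arbitrary units, the threshold is set to its lower bound E = V_max, and k = k0/(V_max − V_min) with k0 = 1.

```python
import numpy as np

def gamd_boost(V, E, k):
    """Harmonic GaMD-style boost: 0.5 * k * (E - V)^2 where V < E, otherwise 0."""
    V = np.asarray(V, dtype=float)
    return np.where(V < E, 0.5 * k * (E - V) ** 2, 0.0)

# Hypothetical profile: minimum, barrier top, second minimum (arbitrary energy units).
V = np.array([0.0, 10.0, 2.0])
E = V.max()                   # threshold energy at its lower bound (assumption)
k0 = 1.0                      # effective force constant in (0, 1] (assumption)
k = k0 / (V.max() - V.min())
V_boosted = V + gamd_boost(V, E, k)
```

With these values the wells are raised while the barrier top is left unchanged, so the barrier seen from the first minimum drops from 10 to 5 units and the relative ordering of the three points is preserved.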
Affiliation(s)
- Surl-Hee Ahn
  - Department of Chemistry, University of California San Diego, La Jolla 92093, California, United States
- Anupam A Ojha
  - Department of Chemistry, University of California San Diego, La Jolla 92093, California, United States
- Rommie E Amaro
  - Department of Chemistry, University of California San Diego, La Jolla 92093, California, United States
- J Andrew McCammon
  - Department of Chemistry, University of California San Diego, La Jolla 92093, California, United States; Department of Pharmacology, University of California San Diego, La Jolla 92093, California, United States
11
Kingdon ADH, Alderwick LJ. Structure-based in silico approaches for drug discovery against Mycobacterium tuberculosis. Comput Struct Biotechnol J 2021; 19:3708-3719. [PMID: 34285773 PMCID: PMC8258792 DOI: 10.1016/j.csbj.2021.06.034] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/23/2021] [Revised: 06/22/2021] [Accepted: 06/22/2021] [Indexed: 12/12/2022] Open
Abstract
Mycobacterium tuberculosis is the causative agent of TB and was estimated to cause 1.4 million deaths in 2019, alongside 10 million new infections. Drug resistance is a growing issue, with multi-drug resistant infections representing 3.3% of all new infections; hence, novel antimycobacterial drugs are urgently required to combat this growing health emergency. Alongside this, increased knowledge of gene essentiality in the pathogenic organism and larger compound databases can aid in the discovery of new drug compounds. The number of protein structures, X-ray based and modelled, is increasing and now accounts for more than 80% of all predicted M. tuberculosis proteins, allowing novel targets to be investigated. This review will focus on structure-based in silico approaches for drug discovery, covering a range of complexities and computational demands, with associated antimycobacterial examples. This includes molecular docking, molecular dynamics simulations, ensemble docking and free energy calculations. Applications of machine learning to each of these approaches will be discussed. The need for experimental validation of computational hits is an essential component, which is unfortunately missing from many current studies. The future outlook of these approaches will also be discussed.
Key Words
- CV, collective variable
- Docking
- Drug discovery
- In silico
- LIE, Linear Interaction Energy
- MD, Molecular Dynamic
- MDR, multi-drug resistant
- MMPB(GB)SA, Molecular Mechanics with Poisson Boltzmann (or generalised Born) and Surface Area solvation
- Machine learning
- Mt, Mycobacterium tuberculosis
- Mycobacterium tuberculosis
- PTC, peptidyl transferase centre
- RMSD, root-mean square-deviation
- Tuberculosis, TB
- cMD, Classical Molecular Dynamic
- cryo-EM, cryogenic electron microscopy
- ns, nanosecond
Affiliation(s)
- Alexander D H Kingdon
  - Institute of Microbiology and Infection, School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom
- Luke J Alderwick
  - Institute of Microbiology and Infection, School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom
12
Damjanovic J, Miao J, Huang H, Lin YS. Elucidating Solution Structures of Cyclic Peptides Using Molecular Dynamics Simulations. Chem Rev 2021; 121:2292-2324. [PMID: 33426882 DOI: 10.1021/acs.chemrev.0c01087] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Indexed: 12/11/2022]
Abstract
Protein-protein interactions are vital to biological processes, but the shape and size of their interfaces make them hard to target using small molecules. Cyclic peptides have shown promise as protein-protein interaction modulators, as they can bind protein surfaces with high affinity and specificity. Dozens of cyclic peptides are already FDA approved, and many more are in various stages of development as immunosuppressants, antibiotics, antivirals, or anticancer drugs. However, most cyclic peptide drugs so far have been natural products or derivatives thereof, with de novo design having proven challenging. A key obstacle is structural characterization: cyclic peptides frequently adopt multiple conformations in solution, which are difficult to resolve using techniques like NMR spectroscopy. The lack of solution structural information prevents a thorough understanding of cyclic peptides' sequence-structure-function relationship. Here we review recent development and application of molecular dynamics simulations with enhanced sampling to studying the solution structures of cyclic peptides. We describe novel computational methods capable of sampling cyclic peptides' conformational space and provide examples of computational studies that relate peptides' sequence and structure to biological activity. We demonstrate that molecular dynamics simulations have grown from an explanatory technique to a full-fledged tool for systematic studies at the forefront of cyclic peptide therapeutic design.
Affiliation(s)
- Jovan Damjanovic
  - Department of Chemistry, Tufts University, Medford, Massachusetts 02155, United States
- Jiayuan Miao
  - Department of Chemistry, Tufts University, Medford, Massachusetts 02155, United States
- He Huang
  - Department of Chemistry, Tufts University, Medford, Massachusetts 02155, United States
- Yu-Shan Lin
  - Department of Chemistry, Tufts University, Medford, Massachusetts 02155, United States
13
Bernetti M, Bertazzo M, Masetti M. Data-Driven Molecular Dynamics: A Multifaceted Challenge. Pharmaceuticals (Basel) 2020; 13:E253. [PMID: 32961909 PMCID: PMC7557855 DOI: 10.3390/ph13090253] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 08/25/2020] [Revised: 09/14/2020] [Accepted: 09/16/2020] [Indexed: 12/18/2022] Open
Abstract
The big data concept is currently revolutionizing several fields of science including drug discovery and development. While opening up new perspectives for better drug design and related strategies, big data analysis strongly challenges our current ability to manage and exploit an extraordinarily large and possibly diverse amount of information. The recent renewal of machine learning (ML)-based algorithms is key in providing the proper framework for addressing this issue. In this respect, the impact on the exploitation of molecular dynamics (MD) simulations, which have recently reached mainstream status in computational drug discovery, can be remarkable. Here, we review the recent progress in the use of ML methods coupled to biomolecular simulations with potentially relevant implications for drug design. Specifically, we show how different ML-based strategies can be applied to the outcome of MD simulations for gaining knowledge and enhancing sampling. Finally, we discuss how intrinsic limitations of MD in accurately modeling biomolecular systems can be alleviated by including information coming from experimental data.
Affiliation(s)
- Mattia Bernetti
  - Scuola Internazionale Superiore di Studi Avanzati (SISSA), via Bonomea 265, I-34136 Trieste, Italy
- Martina Bertazzo
  - Computational Sciences, Istituto Italiano di Tecnologia, via Morego 30, I-16163 Genova, Italy
- Matteo Masetti
  - Department of Pharmacy and Biotechnology, Alma Mater Studiorum - Università di Bologna, via Belmeloro 6, I-40126 Bologna, Italy
14
Gkeka P, Stoltz G, Barati Farimani A, Belkacemi Z, Ceriotti M, Chodera JD, Dinner AR, Ferguson AL, Maillet JB, Minoux H, Peter C, Pietrucci F, Silveira A, Tkatchenko A, Trstanova Z, Wiewiora R, Lelièvre T. Machine Learning Force Fields and Coarse-Grained Variables in Molecular Dynamics: Application to Materials and Biological Systems. J Chem Theory Comput 2020; 16:4757-4775. [PMID: 32559068 PMCID: PMC8312194 DOI: 10.1021/acs.jctc.0c00355] [Citation(s) in RCA: 87] [Impact Index Per Article: 21.8] [Indexed: 12/21/2022]
Abstract
Machine learning encompasses tools and algorithms that are now becoming popular in almost all scientific and technological fields. This is true for molecular dynamics as well, where machine learning offers promises of extracting valuable information from the enormous amounts of data generated by simulation of complex systems. We provide here a review of our current understanding of goals, benefits, and limitations of machine learning techniques for computational studies on atomistic systems, focusing on the construction of empirical force fields from ab initio databases and the determination of reaction coordinates for free energy computation and enhanced sampling.
Affiliation(s)
- Paraskevi Gkeka
- Integrated Drug Discovery, Sanofi R&D, 91385 Chilly-Mazarin, France
- Gabriel Stoltz
- CERMICS, Ecole des Ponts, Marne-la-Vallée, France
- Matherials Project-Team, Inria Paris, 75012 Paris, France
- Zineb Belkacemi
- Integrated Drug Discovery, Sanofi R&D, 91385 Chilly-Mazarin, France
- CERMICS, Ecole des Ponts, Marne-la-Vallée, France
- Michele Ceriotti
- Laboratory of Computational Science and Modelling, Institute of Materials, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
- John D Chodera
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
- Aaron R Dinner
- Department of Chemistry, The University of Chicago, Chicago, Illinois 60637, United States
- Andrew L Ferguson
- Pritzker School of Molecular Engineering, University of Chicago, 5640 South Ellis Avenue, Chicago, Illinois 60637, United States
- Hervé Minoux
- Integrated Drug Discovery, Sanofi R&D, 94403 Vitry-sur-Seine, France
- Fabio Pietrucci
- UMR CNRS 7590, MNHN, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, Sorbonne Université, 75005 Paris, France
- Ana Silveira
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
- Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Zofia Trstanova
- School of Mathematics, The University of Edinburgh, Edinburgh EH9 3FD, U.K.
- Rafal Wiewiora
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
- Tony Lelièvre
- CERMICS, Ecole des Ponts, Marne-la-Vallée, France
- Matherials Project-Team, Inria Paris, 75012 Paris, France
|
15
|
Spiwok V, Kříž P. Time-Lagged t-Distributed Stochastic Neighbor Embedding (t-SNE) of Molecular Simulation Trajectories. Front Mol Biosci 2020; 7:132. [PMID: 32714941 PMCID: PMC7344294 DOI: 10.3389/fmolb.2020.00132] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 03/06/2020] [Accepted: 06/03/2020] [Indexed: 11/30/2022] Open
Abstract
Molecular simulation trajectories represent high-dimensional data. Such data can be visualized by methods of dimensionality reduction. Non-linear dimensionality reduction methods are likely to be more efficient than linear ones due to the fact that motions of atoms are non-linear. Here we test a popular non-linear t-distributed Stochastic Neighbor Embedding (t-SNE) method on analysis of trajectories of 200 ns alanine dipeptide dynamics and 208 μs Trp-cage folding and unfolding. Furthermore, we introduce a time-lagged variant of t-SNE in order to focus on rarely occurring transitions in the molecular system. This time-lagged t-SNE efficiently separates states according to distance in time. Using this method it is possible to visualize key states of studied systems (e.g., unfolded and folded protein) as well as possible kinetic traps using a two-dimensional plot. Time-lagged t-SNE is a visualization method and other applications, such as clustering and free energy modeling, must be done with caution.
Affiliation(s)
- Vojtěch Spiwok
- Department of Biochemistry and Microbiology, University of Chemistry and Technology, Prague, Czechia
- Pavel Kříž
- Department of Mathematics, University of Chemistry and Technology, Prague, Czechia
|
16
|
Liao Q. Enhanced sampling and free energy calculations for protein simulations. Prog Mol Biol Transl Sci 2020; 170:177-213. [PMID: 32145945 DOI: 10.1016/bs.pmbts.2020.01.006] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 12/26/2022]
Abstract
Molecular dynamics simulation is a powerful computational technique to study biomolecular systems, which complements experiments by providing insights into the structural dynamics relevant to biological functions at atomic scale. It can also be used to calculate the free energy landscapes of the conformational transitions to better understand the functions of the biomolecules. However, the sampling of biomolecular configurations is limited by the free energy barriers that need to be overcome, leading to considerable gaps between the timescales reached by MD simulation and those governing biological processes. To address this issue, many enhanced sampling methodologies have been developed to increase the sampling efficiency of molecular dynamics simulations and free energy calculations. Usually, enhanced sampling algorithms can be classified into methods based on collective variables (CV-based) and approaches which do not require predefined CVs (CV-free). In this chapter, the theoretical basis of free energy estimation is briefly reviewed first, followed by the reviews of the most common CV-based and CV-free methods including the presentation of some examples and recent developments. Finally, the combination of different enhanced sampling methods is discussed.
Affiliation(s)
- Qinghua Liao
- Science for Life Laboratory, Department of Chemistry-BMC, Uppsala University, Uppsala, Sweden.
|
17
|
Fleetwood O, Kasimova MA, Westerlund AM, Delemotte L. Molecular Insights from Conformational Ensembles via Machine Learning. Biophys J 2020; 118:765-780. [PMID: 31952811 PMCID: PMC7002924 DOI: 10.1016/j.bpj.2019.12.016] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 08/30/2019] [Revised: 11/21/2019] [Accepted: 12/16/2019] [Indexed: 01/04/2023] Open
Abstract
Biomolecular simulations are intrinsically high dimensional and generate noisy data sets of ever-increasing size. Extracting important features from the data is crucial for understanding the biophysical properties of molecular processes, but remains a big challenge. Machine learning (ML) provides powerful dimensionality reduction tools. However, such methods are often criticized as resembling black boxes with limited human-interpretable insight. We use methods from supervised and unsupervised ML to efficiently create interpretable maps of important features from molecular simulations. We benchmark the performance of several methods, including neural networks, random forests, and principal component analysis, using a toy model with properties reminiscent of macromolecular behavior. We then analyze three diverse biological processes: conformational changes within the soluble protein calmodulin, ligand binding to a G protein-coupled receptor, and activation of an ion channel voltage-sensor domain, unraveling features critical for signal transduction, ligand binding, and voltage sensing. This work demonstrates the usefulness of ML in understanding biomolecular states and demystifying complex simulations.
Affiliation(s)
- Oliver Fleetwood
- Science for Life Laboratory, Department of Applied Physics, KTH Royal Institute of Technology, Solna, Sweden
- Marina A Kasimova
- Science for Life Laboratory, Department of Applied Physics, KTH Royal Institute of Technology, Solna, Sweden
- Annie M Westerlund
- Science for Life Laboratory, Department of Applied Physics, KTH Royal Institute of Technology, Solna, Sweden
- Lucie Delemotte
- Science for Life Laboratory, Department of Applied Physics, KTH Royal Institute of Technology, Solna, Sweden.
|
18
|
Wang Y, Lamim Ribeiro JM, Tiwary P. Machine learning approaches for analyzing and enhancing molecular dynamics simulations. Curr Opin Struct Biol 2020; 61:139-145. [PMID: 31972477 DOI: 10.1016/j.sbi.2019.12.016] [Citation(s) in RCA: 121] [Impact Index Per Article: 30.3] [Received: 09/26/2019] [Revised: 12/16/2019] [Accepted: 12/26/2019] [Indexed: 10/25/2022]
Abstract
Molecular dynamics (MD) has become a powerful tool for studying biophysical systems, due to increasing computational power and availability of software. Although MD has made many contributions to better understanding these complex biophysical systems, there remain methodological difficulties to be surmounted. First, how to make the deluge of data generated in running even a microsecond long MD simulation human comprehensible. Second, how to efficiently sample the underlying free energy surface and kinetics. In this short perspective, we summarize machine learning based ideas that are solving both of these limitations, with a focus on their key theoretical underpinnings and remaining challenges.
Affiliation(s)
- Yihang Wang
- Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742, USA
- João Marcelo Lamim Ribeiro
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1677, New York, NY 10029, USA
- Pratyush Tiwary
- Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742, USA.
|
19
|
Patrizi B, Cozza C, Pietropaolo A, Foggi P, Siciliani de Cumis M. Synergistic Approach of Ultrafast Spectroscopy and Molecular Simulations in the Characterization of Intramolecular Charge Transfer in Push-Pull Molecules. Molecules 2020; 25:E430. [PMID: 31968694 PMCID: PMC7024558 DOI: 10.3390/molecules25020430] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 11/26/2019] [Revised: 01/14/2020] [Accepted: 01/17/2020] [Indexed: 11/28/2022] Open
Abstract
The comprehensive characterization of Intramolecular Charge Transfer (ICT) arising in push-pull molecules with a delocalized π-system of electrons is noteworthy for a bespoke design of organic materials, spanning widespread applications from photovoltaics to nanomedicine imaging devices. Photo-induced ICT is characterized by structural reorganizations, which allow the molecule to adapt to the new electronic density distribution. Herein, we discuss recent photophysical advances combined with recent progress in the computational chemistry of photoactive molecular ensembles. We focus the discussion on femtosecond Transient Absorption Spectroscopy (TAS) enabling us to follow the transition from a Locally Excited (LE) state to the ICT and to understand how the environment polarity influences radiative and non-radiative decay mechanisms. In many cases, the charge transfer transition is accompanied by structural rearrangements, such as twisting or planarization of the molecule. The possibility of an accurate prediction of the charge transfer occurring in complex molecules and molecular materials represents an enormous advantage in guiding new molecular and materials design. We briefly report on recent advances in ultrafast multidimensional spectroscopy, in particular, Two-Dimensional Electronic Spectroscopy (2DES), in unraveling the ICT nature of push-pull molecular systems. A theoretical description at the atomistic level of photo-induced molecular transitions can predict with reasonable accuracy the properties of photoactive molecules. In this framework, the review includes a discussion on the advances from simulation and modeling, which have provided, over the years, significant information on photoexcitation, emission, charge-transport, and decay pathways. Density Functional Theory (DFT) coupled with the Time-Dependent (TD) framework can describe electronic properties and dynamics for a limited system size. More recently, Machine Learning (ML) or deep learning approaches, as well as free-energy simulations containing excited state potentials, can speed up the calculations with transferable accuracy to more complex molecules with extended system size. A perspective on combining ultrafast spectroscopy with molecular simulations is foreseen for optimizing the design of photoactive compounds with tunable properties.
Affiliation(s)
- Barbara Patrizi
- National Institute of Optics-National Research Council (INO-CNR), Via Madonna del Piano 10, 50019 Sesto Fiorentino, Italy
- European Laboratory for Non-Linear Spectroscopy (LENS), Via Nello Carrara 1, 50019 Sesto Fiorentino, Italy
- Concetta Cozza
- Dipartimento di Scienze della Salute, Università di Catanzaro, Viale Europa, 88100 Catanzaro, Italy
- Adriana Pietropaolo
- Dipartimento di Scienze della Salute, Università di Catanzaro, Viale Europa, 88100 Catanzaro, Italy
- Paolo Foggi
- National Institute of Optics-National Research Council (INO-CNR), Via Madonna del Piano 10, 50019 Sesto Fiorentino, Italy
- European Laboratory for Non-Linear Spectroscopy (LENS), Via Nello Carrara 1, 50019 Sesto Fiorentino, Italy
- Dipartimento di Chimica, Biologia e Biotecnologie, Università di Perugia, Via Elce di Sotto 8, 06123 Perugia, Italy
|