1
Cheng Y, Ouyang W, Liu L, Tang L, Zhang Z, Yue X, Liang L, Hu J, Luo T. Molecular recognition of ITIM/ITSM domains with SHP2 and their allosteric effect. Phys Chem Chem Phys 2024; 26:9155-9169. [PMID: 38165855 DOI: 10.1039/d3cp03923d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/04/2024]
Abstract
Src homology 2-domain-containing tyrosine phosphatase 2 (SHP2) is a non-receptor protein tyrosine phosphatase that is widely expressed in a variety of cells and regulates the immune response of T cells through the PD-1 pathway. However, the activation mechanism and allosteric effects of SHP2 remain unclear, hindering the development of small-molecule inhibitors. In this study, the structure of the complex formed by the intact PD-1 tail and SHP2 was modeled for the first time. The molecular recognition and conformational changes of inactive and active SHP2 on binding ITIM/ITSM were compared using prolonged MD simulations. The relative flexibility of the two SH2 domains during MD simulations contributes to the recruitment of ITIM/ITSM and supports the subsequent conformational change of SHP2. The binding free energy calculation shows that inactive SHP2 has a higher affinity for ITIM/ITSM than active SHP2, mainly because the former's N-SH2 adopts the α-state. In addition, a significant decrease in the binding-energy contribution of certain residues (e.g., R32, S34, K35, T42, and K55) in conformationally transformed SHP2 contributes to this result. These detailed changes during the conformational transition provide theoretical guidance for the molecular design of novel anticancer drugs.
Affiliation(s)
- Yan Cheng
  - Breast Disease Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610000, China
  - Multi-omics Laboratory of Breast Diseases, State Key Laboratory of Biotherapy, National Collaborative Innovation Center for Biotherapy, West China Hospital, Sichuan University, China
- Weiwei Ouyang
  - Department of Thoracic Oncology, Affiliated Cancer Hospital, Guizhou Medical University, Guiyang, China
- Ling Liu
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, China
- Lingkai Tang
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, China
- Zhigang Zhang
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, China
- Xinru Yue
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, China
- Li Liang
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, China
- Jianping Hu
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, China
- Ting Luo
  - Breast Disease Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610000, China
  - Multi-omics Laboratory of Breast Diseases, State Key Laboratory of Biotherapy, National Collaborative Innovation Center for Biotherapy, West China Hospital, Sichuan University, China
2
Ge Y, Luo Q, Liu L, Shi Q, Zhang Z, Yue X, Tang L, Liang L, Hu J, Ouyang W. S288T mutation altering MmpL3 periplasmic domain channel and H-bond network: a novel dual drug resistance mechanism. J Mol Model 2024; 30:39. [PMID: 38224406 DOI: 10.1007/s00894-023-05814-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Accepted: 12/18/2023] [Indexed: 01/16/2024]
Abstract
CONTEXT: Mycobacterial membrane protein Large 3 (MmpL3) is responsible for transporting mycolic acids across the cell membrane to form the cell wall. It is essential for the survival of Mycobacterium tuberculosis (Mtb) and has become a potent anti-tuberculosis target. SQ109, an ethambutol (EMB) analogue and novel anti-tuberculosis drug, effectively inhibits MmpL3 and has completed phase 2b-3 clinical trials. Drug resistance has always been the bottleneck in the clinical treatment of tuberculosis. The S288T mutant of MmpL3 shows significant resistance to the inhibitor SQ109, but the specific mechanism of action remains unclear. The results show that the MmpL3 S288T mutation causes a local conformational change with little effect on the global structure. When MmpL3 is bound by the SQ109 inhibitor, the distance between D710 and R715 increases, resulting in H-bond destruction, but their interactions and proton transfer function are still restored. In addition, the rotation of Y44 in the S288T mutant leads to an obvious bend in the periplasmic domain channel and an increased number of contact residues, reducing substrate transport efficiency. This work not only provides a possible dual drug resistance mechanism for the MmpL3 S288T mutant but also aids the development of novel anti-tuberculosis inhibitors.
METHODS: Molecular dynamics (MD) and quantum mechanics (QM) simulations were both performed to compare inhibitor (i.e., SQ109) recognition, motion characteristics, and H-bond energy change of MmpL3 after the S288T mutation. In addition, the WT_SQ109 complex structure was obtained with a molecular docking program (AutoDock 4.2); the Molecular Mechanics/Poisson-Boltzmann Surface Area (MM-PBSA) and Solvated Interaction Energy (SIE) methods were used to calculate the binding free energies (∆Gbind); and geometric criteria were used to analyze the changes in the hydrogen bond networks.
Affiliation(s)
- Yutong Ge
  - Department of Thoracic Oncology, Affiliated Cancer Hospital, Guizhou Medical University, Guiyang, China
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, 610106, China
- Qing Luo
  - Faculty of Applied Sciences, Macao Polytechnic University, Macao, 999078, China
- Ling Liu
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, 610106, China
- Quanshan Shi
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, 610106, China
- Zhigang Zhang
  - Department of Thoracic Oncology, Affiliated Cancer Hospital, Guizhou Medical University, Guiyang, China
- Xinru Yue
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, 610106, China
- Lingkai Tang
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, 610106, China
- Li Liang
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, 610106, China
- Jianping Hu
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, 610106, China
- Weiwei Ouyang
  - Department of Thoracic Oncology, Affiliated Cancer Hospital, Guizhou Medical University, Guiyang, China
3
Herringer NSM, Dasetty S, Gandhi D, Lee J, Ferguson AL. Permutationally Invariant Networks for Enhanced Sampling (PINES): Discovery of Multimolecular and Solvent-Inclusive Collective Variables. J Chem Theory Comput 2024; 20:178-198. [PMID: 38150421 DOI: 10.1021/acs.jctc.3c00923] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/29/2023]
Abstract
The typically rugged nature of molecular free-energy landscapes can frustrate efficient sampling of the thermodynamically relevant phase space due to the presence of high free-energy barriers. Enhanced sampling techniques can improve phase space exploration by accelerating sampling along particular collective variables (CVs). A number of techniques exist for the data-driven discovery of CVs parametrizing the important large-scale motions of the system. A challenge to CV discovery is learning CVs invariant to the symmetries of the molecular system, frequently rigid translation, rigid rotation, and permutational relabeling of identical particles. Of these, permutational invariance has proved a persistent challenge in frustrating the data-driven discovery of multimolecular CVs in systems of self-assembling particles and solvent-inclusive CVs for solvated systems. In this work, we integrate permutation invariant vector (PIV) featurizations with autoencoding neural networks to learn nonlinear CVs invariant to translation, rotation, and permutation and perform interleaved rounds of CV discovery and enhanced sampling to iteratively expand the sampling of configurational phase space and obtain converged CVs and free-energy landscapes. We demonstrate the permutationally invariant network for enhanced sampling (PINES) approach in applications to the self-assembly of a 13-atom argon cluster, association/dissociation of a NaCl ion pair in water, and hydrophobic collapse of a C45H92 n-pentatetracontane polymer chain. We make the approach freely available as a new module within the PLUMED2 enhanced sampling libraries.
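As a rough illustration of the permutation invariant vector idea mentioned in this abstract, the sketch below builds a feature vector for a cluster of identical particles by passing every pair distance through a switching function and sorting the results. It is a simplified stand-in, not the PINES or PLUMED2 implementation: the function names `switching` and `piv`, the rational switching form, and the parameters `r0`, `n`, and `m` are all assumptions made for this example.

```python
import itertools
import math
import random


def switching(r, r0=1.0, n=6, m=12):
    """Rational switching function mapping a distance smoothly into (0, 1]."""
    x = r / r0
    if abs(x - 1.0) < 1e-12:
        return n / m  # analytic limit of (1 - x**n) / (1 - x**m) as x -> 1
    return (1.0 - x**n) / (1.0 - x**m)


def piv(coords, r0=1.0):
    """Sorted, switched pair distances for a set of identical particles.

    Pair distances remove rigid translation and rotation; sorting removes
    the dependence on how the identical particles are labeled.
    """
    values = [
        switching(math.dist(a, b), r0)
        for a, b in itertools.combinations(coords, 2)
    ]
    return sorted(values)


# Hypothetical 5-particle configuration, used only to exercise the function.
random.seed(0)
points = [(random.random(), random.random(), random.random()) for _ in range(5)]
reference = piv(points)

# Relabeling (reversing the particle order) leaves the feature vector unchanged.
relabeled = piv(points[::-1])
```

A real multi-species or solvent-inclusive system would sort within blocks of equivalent pair types rather than across all pairs; this sketch covers only the single-species case.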
Affiliation(s)
- Siva Dasetty
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
- Diya Gandhi
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
- Junhee Lee
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
- Andrew L Ferguson
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
4
Wang Y, Zhang D, Huang L, Zhang Z, Shi Q, Hu J, He G, Guo X, Shi H, Liang L. Uncovering the interactions between PME and PMEI at the gene and protein levels: Implications for the design of specific PMEI. J Mol Model 2023; 29:286. [PMID: 37610510 DOI: 10.1007/s00894-023-05644-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Accepted: 06/30/2023] [Indexed: 08/24/2023]
Abstract
CONTEXT: Pectin methylesterase inhibitor (PMEI) can specifically bind and inhibit the activity of pectin methylesterase (PME), and has been widely used in fruit and vegetable juice processing. However, the limited three-dimensional structure, unclear action mechanism, low thermal stability, and low biological activity of PMEI severely limit its application. In this work, molecular recognition and conformational changes of PME and PMEI were analyzed by various molecular simulation methods. Suggestions were then proposed for improving the thermal stability and affinity maturation of PMEI through semi-rational design.
METHODS: Phylogenetic trees of PME and PMEI were established using the maximum likelihood (ML) method. The results show that PME and PMEI have good sequence and structure conservation in various plants, and the simulated data can be widely adopted. MD simulations were performed using the AMBER20 package and the ff14SB force field. Protein interaction analysis indicates that H-bonds, van der Waals forces, and the salt bridge formed between K224 and ID116 are the main driving forces for mutual molecular recognition of PME and PMEI. According to the analyses of the free energy landscape (FEL), conformational clusters, and motion, association with PMEI greatly disrupts PME's dispersed functional motion mode and biological function. As shown by monitoring the changes in residue contact number and binding free energy, the IG35M/IG35R:IT93F and IT113W/IT113W:ID116W mutations contribute to thermal stability and affinity maturation of the PME-PMEI complex system, respectively. This work reveals the interaction between PME and PMEI at the gene and protein levels and provides options for modifying specific PMEI.
Affiliation(s)
- Yueteng Wang
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, 610106, China
- Derong Zhang
  - School of Marxism, Chengdu Vocational & Technical College of Industry, Chengdu, 610081, China
- Lifen Huang
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, 610106, China
- Zelan Zhang
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, 610106, China
- Quanshan Shi
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, 610106, China
- Jianping Hu
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, 610106, China
- Gang He
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, 610106, China
- Xiaoqiang Guo
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, 610106, China
- Hang Shi
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, 213001, China
- Li Liang
  - Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, School of Pharmacy, Chengdu University, Chengdu, 610106, China
5
Shmilovich K, Ferguson AL. Girsanov Reweighting Enhanced Sampling Technique (GREST): On-the-Fly Data-Driven Discovery of and Enhanced Sampling in Slow Collective Variables. J Phys Chem A 2023; 127:3497-3517. [PMID: 37036804 DOI: 10.1021/acs.jpca.3c00505] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 04/11/2023]
Abstract
Molecular dynamics simulations of microscopic phenomena are limited by the short integration time steps which are required for numerical stability but which limit the practically achievable simulation time scales. Collective variable (CV) enhanced sampling techniques apply biases to predefined collective coordinates to promote barrier crossing, phase space exploration, and sampling of rare events. The efficacy of these techniques is contingent on the selection of good CVs correlated with the molecular motions governing the long-time dynamical evolution of the system. In this work, we introduce Girsanov Reweighting Enhanced Sampling Technique (GREST) as an adaptive sampling scheme that interleaves rounds of data-driven slow CV discovery and enhanced sampling along these coordinates. Since slow CVs are inherently dynamical quantities, a key ingredient in our approach is the use of both thermodynamic and dynamical Girsanov reweighting corrections for rigorous estimation of slow CVs from biased simulation data. We demonstrate our approach on a toy 1D 4-well potential, a simple biomolecular system alanine dipeptide, and the Trp-Leu-Ala-Leu-Leu (WLALL) pentapeptide. In each case GREST learns appropriate slow CVs and drives sampling of all thermally accessible metastable states starting from zero prior knowledge of the system. We make GREST accessible to the community via a publicly available open source Python package.
Affiliation(s)
- Kirill Shmilovich
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
- Andrew L Ferguson
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
6
Ge Y, Yan H, Shi X, Wu Z, Wang Y, Zhang Z, Luo Q, Liu W, Liang L, Peng L, Hu J. Study on dietary intake, risk assessment, and molecular toxicity mechanism of benzo[α]pyrene in college students in China Bashu area. Food Sci Nutr 2022; 10:4155-4167. [PMID: 36514765 PMCID: PMC9731532 DOI: 10.1002/fsn3.3007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2021] [Revised: 07/05/2022] [Accepted: 07/14/2022] [Indexed: 12/16/2022] Open
Abstract
As an extremely strong polycyclic aromatic hydrocarbon carcinogen, benzo[α]pyrene (BaP) is often produced during food processing at high temperatures. Recently, food safety, as well as the toxicity mechanism and risk assessment of BaP, has received extensive attention. We first constructed a database of BaP pollution concentrations in Chinese daily food with over 10^4 data items, collected dietary intake data using an online survey, then assessed dietary exposure risk, and finally revealed the possible toxicity mechanism through four comparative molecular dynamics (MD) simulations. The statistical results showed that the concentration of BaP in olive oil was the highest, followed by that in fried meat products. The margins of exposure and incremental lifetime cancer risk both indicated that the dietary exposure to BaP of the participants was generally safe, but some people still carried a certain carcinogenic risk. Specifically, the health risk of the core district population was higher than that of the noncore district population in the Bashu area, and that of the female postgraduate group was higher than that of the male group with a bachelor's degree or below. In the MD trajectories, BaP binding does not affect the global motion of individual nucleic acid sequences, but local weak noncovalent interactions change greatly; it also weakens the molecular interactions of the nucleic acid with Bacillus stearothermophilus DNA polymerase I large fragment (BF) and significantly changes the cavity structure of the recognition interface. This work not only reveals the possible toxicity mechanism of BaP but also provides theoretical guidance for the subsequent optimization of food safety standards and a reference for a rational diet.
Affiliation(s)
- Yutong Ge
  - Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, School of Pharmacy, Sichuan Industrial Institute of Antibiotics, Chengdu University, Chengdu, China
- Hailian Yan
  - Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, School of Pharmacy, Sichuan Industrial Institute of Antibiotics, Chengdu University, Chengdu, China
- Xiaodong Shi
  - Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, School of Pharmacy, Sichuan Industrial Institute of Antibiotics, Chengdu University, Chengdu, China
- Zhixiang Wu
  - Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, School of Pharmacy, Sichuan Industrial Institute of Antibiotics, Chengdu University, Chengdu, China
- Yueteng Wang
  - Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, School of Pharmacy, Sichuan Industrial Institute of Antibiotics, Chengdu University, Chengdu, China
- Zelan Zhang
  - Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, School of Pharmacy, Sichuan Industrial Institute of Antibiotics, Chengdu University, Chengdu, China
- Qing Luo
  - Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, School of Pharmacy, Sichuan Industrial Institute of Antibiotics, Chengdu University, Chengdu, China
- Wei Liu
  - Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, School of Pharmacy, Sichuan Industrial Institute of Antibiotics, Chengdu University, Chengdu, China
- Li Liang
  - Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, School of Pharmacy, Sichuan Industrial Institute of Antibiotics, Chengdu University, Chengdu, China
- Lianxin Peng
  - Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, School of Pharmacy, Sichuan Industrial Institute of Antibiotics, Chengdu University, Chengdu, China
- Jianping Hu
  - Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, School of Pharmacy, Sichuan Industrial Institute of Antibiotics, Chengdu University, Chengdu, China
7
Paul TK, Taraphder S. Nonlinear Reaction Coordinate of an Enzyme Catalyzed Proton Transfer Reaction. J Phys Chem B 2022; 126:1413-1425. [PMID: 35138854 DOI: 10.1021/acs.jpcb.1c08760] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/15/2022]
Abstract
We present an in-depth study on the theoretical calculation of an optimum reaction coordinate as a linear or nonlinear combination of important collective variables (CVs) sampled from an ensemble of reactive transition paths for an intramolecular proton transfer reaction catalyzed by the enzyme human carbonic anhydrase (HCA) II. The linear models are optimized by likelihood maximization for a given number of CVs. The nonlinear models are based on an artificial neural network with the same number of CVs and optimized by minimizing the root-mean-square error in comparison to a training set of committor estimators generated for the given transition. The nonlinear reaction coordinate thus obtained yields the free energy of activation and rate constant as 9.46 kcal mol^-1 and 1.25 × 10^6 s^-1, respectively. These estimates are found to be in quantitative agreement with the known experimental results. We have also used an extended autoencoder model to show that a similar analysis can be carried out using a single CV only. The resultant free energies and kinetics of the reaction slightly overestimate the experimental data. The implications of these results are discussed using a detailed microkinetic scheme of the proton transfer reaction catalyzed by HCA II.
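To give a feel for how an activation free energy relates to a rate constant, the sketch below plugs the quoted 9.46 kcal/mol barrier into the textbook Eyring (transition state theory) expression k = κ (k_B T / h) exp(-ΔG‡ / RT). This is only an order-of-magnitude check. The abstract does not say the authors obtained their rate this way, and the temperature of 298.15 K and transmission coefficient κ = 1 are assumptions made for this example.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
H = 6.62607015e-34     # Planck constant, J*s
R_KCAL = 1.987204e-3   # gas constant, kcal/(mol*K)


def eyring_rate(dg_kcal_per_mol, temperature=298.15, kappa=1.0):
    """Eyring rate constant in s^-1 for an activation free energy in kcal/mol.

    temperature (K) and kappa (transmission coefficient) are assumed values,
    not parameters taken from the cited study.
    """
    prefactor = K_B * temperature / H
    return kappa * prefactor * math.exp(-dg_kcal_per_mol / (R_KCAL * temperature))


# Barrier height quoted in the abstract above.
rate = eyring_rate(9.46)
print(f"Eyring estimate: {rate:.2e} s^-1")
```

Under these assumptions the estimate falls within an order of magnitude of the rate constant quoted in the abstract, which is about as much agreement as a κ = 1 estimate can be expected to give.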
Affiliation(s)
- Tanmoy Kumar Paul
  - Department of Chemistry, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
- Srabani Taraphder
  - Department of Chemistry, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
8
Beyerle ER, Guenza MG. Identifying the leading dynamics of ubiquitin: A comparison between the tICA and the LE4PD slow fluctuations in amino acids' position. J Chem Phys 2021; 155:244108. [PMID: 34972386 DOI: 10.1063/5.0059688] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/13/2022] Open
Abstract
Molecular Dynamics (MD) simulations of proteins implicitly contain the information connecting the atomistic molecular structure and proteins' biologically relevant motion, where large-scale fluctuations are deemed to guide folding and function. In the complex multiscale processes described by MD trajectories, it is difficult to identify, separate, and study those large-scale fluctuations. This problem can be formulated as the need to identify a small number of collective variables that guide the slow kinetic processes. The most promising method among the ones used to study the slow leading processes in proteins' dynamics is time-structure-based independent component analysis, also called time-lagged independent component analysis (tICA), which identifies the dominant components in a noisy signal. Recently, we developed an anisotropic Langevin approach for the dynamics of proteins, called the anisotropic Langevin Equation for Protein Dynamics or LE4PD-XYZ. This approach partitions the protein's MD dynamics into mostly uncorrelated, wavelength-dependent, diffusive modes. It associates with each mode a free-energy map, where one measures the spatial extension and the time evolution of the mode-dependent, slow dynamical fluctuations. Here, we compare the tICA modes' predictions with the collective LE4PD-XYZ modes. We observe that the two methods consistently identify the nature and extension of the slowest fluctuation processes. The tICA separates the leading processes into a smaller number of slow modes than the LE4PD does. The LE4PD provides time-dependent information at short times and a formal connection to the physics of the kinetic processes, both of which are missing in the purely statistical analysis of tICA.
Affiliation(s)
- E R Beyerle
  - Institute for Fundamental Science and Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon 97403, USA
- M G Guenza
  - Institute for Fundamental Science and Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon 97403, USA
9
Ghorbani M, Prasad S, Klauda JB, Brooks BR. Variational embedding of protein folding simulations using Gaussian mixture variational autoencoders. J Chem Phys 2021; 155:194108. [PMID: 34800961 DOI: 10.1063/5.0069708] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 11/14/2022] Open
Abstract
Conformational sampling of biomolecules using molecular dynamics simulations often produces a large amount of high-dimensional data that is difficult to interpret using conventional analysis techniques. Dimensionality reduction methods are thus required to extract useful and relevant information. Here, we devise a machine learning method, the Gaussian mixture variational autoencoder (GMVAE), that can simultaneously perform dimensionality reduction and clustering of biomolecular conformations in an unsupervised way. We show that GMVAE can learn a reduced representation of the free energy landscape of protein folding with highly separated clusters that correspond to the metastable states during folding. Since GMVAE uses a mixture of Gaussians as its prior, it can directly acknowledge the multi-basin nature of the protein folding free energy landscape. To make the model end-to-end differentiable, we use a Gumbel-softmax distribution. We test the model on three long-timescale protein folding trajectories and show that the GMVAE embedding resembles the folding funnel, with folded states down the funnel and unfolded states outside the funnel path. Additionally, we show that the latent space of GMVAE can be used for kinetic analysis, and Markov state models built on this embedding produce folding and unfolding timescales that are in close agreement with other rigorous dynamical embeddings such as time-lagged independent component analysis.
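The Gumbel-softmax distribution mentioned in this abstract can be sketched in a few lines: Gumbel noise is added to the cluster logits and a temperature-scaled softmax turns the result into a relaxed one-hot vector. The code below is a generic illustration of that trick in plain Python, not the GMVAE authors' implementation. The function name `gumbel_softmax_sample`, the three example probabilities, and the temperature of 0.5 are assumptions chosen for this example.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature=1.0, rng=random):
    """Draw one relaxed one-hot sample from the categorical defined by logits."""
    perturbed = []
    for logit in logits:
        u = max(rng.random(), 1e-12)       # guard against log(0)
        gumbel = -math.log(-math.log(u))   # standard Gumbel(0, 1) noise
        perturbed.append((logit + gumbel) / temperature)
    top = max(perturbed)                   # subtract the max for numerical stability
    weights = [math.exp(p - top) for p in perturbed]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical 3-cluster assignment probabilities, expressed as logits.
rng = random.Random(0)
logits = [math.log(0.7), math.log(0.2), math.log(0.1)]
sample = gumbel_softmax_sample(logits, temperature=0.5, rng=rng)
```

Lower temperatures push samples toward one-hot vectors, while higher temperatures smooth them toward uniform. In an automatic-differentiation framework the same expression is differentiable with respect to the logits, which is what allows a discrete cluster assignment to sit inside an end-to-end trainable model.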
Affiliation(s)
- Mahdi Ghorbani
  - Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20824, USA
- Samarjeet Prasad
  - Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20824, USA
- Jeffery B Klauda
  - Department of Chemical and Biomolecular Engineering, University of Maryland, College Park, Maryland 20742, USA
- Bernard R Brooks
  - Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20824, USA
10
Maiti KS. Two-dimensional Infrared Spectroscopy Reveals Better Insights of Structure and Dynamics of Protein. Molecules 2021; 26:6893. [PMID: 34833985 PMCID: PMC8618531 DOI: 10.3390/molecules26226893] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/15/2021] [Revised: 11/07/2021] [Accepted: 11/10/2021] [Indexed: 11/25/2022] Open
Abstract
Proteins play an important role in the biological and biochemical processes taking place in living systems. To uncover these fundamental processes, it is essential to understand the structure and dynamics of proteins. Vibrational spectroscopy is an established tool to explore protein structure and dynamics. In particular, two-dimensional infrared (2DIR) spectroscopy has already proven its versatility in exploring protein structure and ultrafast dynamics, and it offers essentially unprecedented time resolution for observing the vibrational dynamics of proteins. Drawing on several examples from our theoretical and experimental efforts, we establish here that two-dimensional vibrational spectroscopy provides substantially more information than one-dimensional vibrational spectroscopy. The structural information of the protein is encoded in the position, shape, and strength of the peaks in 2DIR spectra. The time evolution of the 2DIR spectra allows for the visualisation of molecular motions.
Affiliation(s)
- Kiran Sankar Maiti
  - Max-Planck-Institut für Quantenoptik, Hans-Kopfermann-Straße 1, 85748 Garching, Germany; Tel.: +49-89-289-54056
  - Lehrstuhl für Experimental Physik, Ludwig-Maximilians-Universität München, Am Coulombwall 1, 85748 Garching, Germany
11
Tsai ST, Tiwary P. On the distance between A and B in molecular configuration space. Molecular Simulation 2021. [DOI: 10.1080/08927022.2020.1761548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
Affiliation(s)
- Sun-Ting Tsai
  - Department of Physics and Institute for Physical Science and Technology, University of Maryland, College Park, MD, USA
- Pratyush Tiwary
  - Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park, MD, USA
12
Chen M. Collective variable-based enhanced sampling and machine learning. Eur Phys J B 2021; 94:211. [PMID: 34697536 PMCID: PMC8527828 DOI: 10.1140/epjb/s10051-021-00220-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/02/2021] [Accepted: 10/03/2021] [Indexed: 05/14/2023]
Abstract
Collective variable-based enhanced sampling methods have been widely used to study thermodynamic properties of complex systems. Efficiency and accuracy of these enhanced sampling methods are affected by two factors: constructing appropriate collective variables for enhanced sampling and generating accurate free energy surfaces. Recently, many machine learning techniques have been developed to improve the quality of collective variables and the accuracy of free energy surfaces. Although machine learning has achieved great successes in improving enhanced sampling methods, there are still many challenges and open questions. In this perspective, we shall review recent developments on integrating machine learning techniques and collective variable-based enhanced sampling approaches. We also discuss challenges and future research directions including generating kinetic information, exploring high-dimensional free energy surfaces, and efficiently sampling all-atom configurations.
Affiliation(s)
- Ming Chen
- Department of Chemistry, Purdue University, West Lafayette, IN 47907 USA
13
Topel M, Ferguson AL. Reconstruction of protein structures from single-molecule time series. J Chem Phys 2020; 153:194102. [DOI: 10.1063/5.0024732] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/14/2022]
Affiliation(s)
- Maximilian Topel
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, USA
- Andrew L. Ferguson
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, USA
14
BP[dG]-induced distortions to DNA polymerase and DNA duplex: A detailed mechanism of BP adducts blocking replication. Food Chem Toxicol 2020; 140:111325. [DOI: 10.1016/j.fct.2020.111325] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/30/2020] [Revised: 03/15/2020] [Accepted: 04/04/2020] [Indexed: 01/21/2023]
15
Mitsuta Y, Shigeta Y. Analytical Method Using a Scaled Hypersphere Search for High-Dimensional Metadynamics Simulations. J Chem Theory Comput 2020; 16:3869-3878. [PMID: 32384233 DOI: 10.1021/acs.jctc.0c00010] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/22/2022]
Abstract
Metadynamics (MTD) is one of the most effective methods for calculating the free energy surface and finding rare events. Nevertheless, numerous studies using MTD have been carried out using 3D or lower dimensional collective variables (CVs), as higher dimensional CVs require costly computational resources and the obtained results are too complex for the important events to be understood. The latter issue can be conveniently solved by utilizing the free energy reaction network (FERN), which is a graph structure consisting of edges of minimum free energy paths (MFEPs), nodes of equilibrium (EQ) points, and transition state (TS) points. In the present article, a new method for exploring FERNs on high-dimensional CVs using MTD and the scaled hypersphere search (SHS) method is described. A test calculation based on the MTD-SHS simulation of met-enkephalin in explicit water with 7 CVs was conducted. As a result, 889 EQ points and 1805 TS points were found. The MTD-SHS approach can find MFEPs exhaustively; therefore, the FERNs can be estimated without any a priori knowledge of the EQ and TS points.
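For orientation, the bias potential that plain metadynamics accumulates along a CV can be sketched as a sum of Gaussians deposited at previously visited CV values. The following is a minimal one-dimensional illustration in Python with NumPy; the function name `metad_bias` and the `height` and `sigma` defaults are assumptions made here, and the sketch is not the MTD-SHS procedure of the article above.

```python
import numpy as np

def metad_bias(s, centers, height=1.0, sigma=0.1):
    """Sum-of-Gaussians bias V(s) for plain (non-well-tempered) metadynamics.

    s       : CV value(s) at which to evaluate the bias
    centers : CV values where Gaussians were deposited earlier
    height, sigma : Gaussian height and width (illustrative defaults)
    """
    s = np.atleast_1d(np.asarray(s, dtype=float))
    centers = np.asarray(centers, dtype=float)
    if centers.size == 0:
        # No Gaussians deposited yet: the bias is zero everywhere.
        return np.zeros_like(s)
    # Pairwise differences between evaluation points and deposited centers.
    diff = s[:, None] - centers[None, :]
    return height * np.exp(-0.5 * (diff / sigma) ** 2).sum(axis=1)

# Example: bias on a grid after three deposits along a hypothetical CV.
grid = np.linspace(-1.0, 1.0, 5)
bias = metad_bias(grid, centers=[-0.5, 0.0, 0.0], height=0.5, sigma=0.2)
```

Repeated deposits at the same CV value stack up, which is what gradually pushes the simulation out of a free energy minimum.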
Affiliation(s)
- Yuki Mitsuta
- Center for Computational Sciences, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan
- Yasuteru Shigeta
- Center for Computational Sciences, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan
16
Liu X, Zhang Y, Duan H, Luo Q, Liu W, Liang L, Wan H, Chang S, Hu J, Shi H. Inhibition Mechanism of Indoleamine 2, 3-Dioxygenase 1 (IDO1) by Amidoxime Derivatives and Its Revelation in Drug Design: Comparative Molecular Dynamics Simulations. Front Mol Biosci 2020; 6:164. [PMID: 32047753 PMCID: PMC6997135 DOI: 10.3389/fmolb.2019.00164] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/16/2019] [Accepted: 12/31/2019] [Indexed: 02/05/2023]
Abstract
For cancer treatment, in addition to the three standard therapies of surgery, chemotherapy, and radiotherapy, immunotherapy has become the fourth internationally-recognized alternative treatment. Indoleamine 2, 3-dioxygenase 1 (IDO1) catalyzes the conversion of tryptophan to kynurenine, causing tryptophan depletion, and is an important target in the research and development of anticancer drugs. Epacadostat (INCB024360) is currently one of the most potent IDO1 inhibitors; nevertheless, its inhibition mechanism remains elusive. In this work, comparative molecular dynamics simulations were performed to reveal that the high inhibitory activity of INCB024360 mainly comes from two aspects: disturbing the ligand delivery tunnel and thereby preventing small molecules such as oxygen and water from accessing the active site, as well as hindering the shuttle of substrate tryptophan with product kynurenine through the heme binding pocket. The scanning of key residues showed that the L234 and R231 residues were both crucial to the catalytic activity of IDO1. Upon association with INCB024360, L234 forms a stable hydrogen bond with G262, which significantly affects the spatial position of the G262-A264 loop and then greatly disturbs the orderliness of the ligand delivery tunnel. In addition, the cleavage of the hydrogen bond between G380 and R231 increases the mobility of the GTGG conserved region, leading to the closure of the substrate tryptophan channel. This work provides new ideas for understanding the action mechanism of amidoxime derivatives, improving their inhibitory activity and developing novel inhibitors of IDO1.
Affiliation(s)
- Xinyu Liu
- Laboratory of Tumor Targeted and Immune Therapy, Clinical Research Center for Breast, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University and Collaborative Innovation Center for Biotherapy, Chengdu, China
- Yiwen Zhang
- Laboratory of Tumor Targeted and Immune Therapy, Clinical Research Center for Breast, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University and Collaborative Innovation Center for Biotherapy, Chengdu, China
- Huaichuan Duan
- Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, College of Pharmacy and Biological Engineering, Sichuan Industrial Institute of Antibiotics, Chengdu University, Chengdu, China
- Qing Luo
- Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, College of Pharmacy and Biological Engineering, Sichuan Industrial Institute of Antibiotics, Chengdu University, Chengdu, China
- Wei Liu
- Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, College of Pharmacy and Biological Engineering, Sichuan Industrial Institute of Antibiotics, Chengdu University, Chengdu, China
- Li Liang
- Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, College of Pharmacy and Biological Engineering, Sichuan Industrial Institute of Antibiotics, Chengdu University, Chengdu, China
- Hua Wan
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou, China
- Shan Chang
- School of Electrical and Information Engineering, Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, China
- Jianping Hu
- Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, College of Pharmacy and Biological Engineering, Sichuan Industrial Institute of Antibiotics, Chengdu University, Chengdu, China
- Hubing Shi
- Laboratory of Tumor Targeted and Immune Therapy, Clinical Research Center for Breast, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University and Collaborative Innovation Center for Biotherapy, Chengdu, China
17
Pearce P, Woodhouse FG, Forrow A, Kelly A, Kusumaatmaja H, Dunkel J. Learning dynamical information from static protein and sequencing data. Nat Commun 2019; 10:5368. [PMID: 31772168 PMCID: PMC6879630 DOI: 10.1038/s41467-019-13307-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 09/05/2018] [Accepted: 10/24/2019] [Indexed: 11/09/2022]
Abstract
Many complex processes, from protein folding to neuronal network dynamics, can be described as stochastic exploration of a high-dimensional energy landscape. Although efficient algorithms for cluster detection in high-dimensional spaces have been developed over the last two decades, considerably less is known about the reliable inference of state transition dynamics in such settings. Here we introduce a flexible and robust numerical framework to infer Markovian transition networks directly from time-independent data sampled from stationary equilibrium distributions. We demonstrate the practical potential of the inference scheme by reconstructing the network dynamics for several protein-folding transitions, gene-regulatory network motifs, and HIV evolution pathways. The predicted network topologies and relative transition time scales agree well with direct estimates from time-dependent molecular dynamics data, stochastic simulations, and phylogenetic trees, respectively. Owing to its generic structure, the framework introduced here will be applicable to high-throughput RNA and protein-sequencing datasets, and future cryo-electron microscopy (cryo-EM) data.
Affiliation(s)
- Philip Pearce
- Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts, 02139-4307, USA
- Francis G Woodhouse
- Mathematical Institute, University of Oxford, Andrew Wiles Building, Woodstock Road, Oxford, OX2 6GG, UK
- Aden Forrow
- Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts, 02139-4307, USA; Mathematical Institute, University of Oxford, Andrew Wiles Building, Woodstock Road, Oxford, OX2 6GG, UK
- Ashley Kelly
- Department of Physics, Durham University, South Road, Durham, DH1 3LE, UK
- Halim Kusumaatmaja
- Department of Physics, Durham University, South Road, Durham, DH1 3LE, UK.
- Jörn Dunkel
- Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts, 02139-4307, USA.
18
Zhang J, Lei YK, Che X, Zhang Z, Yang YI, Gao YQ. Deep Representation Learning for Complex Free-Energy Landscapes. J Phys Chem Lett 2019; 10:5571-5576. [PMID: 31476868 DOI: 10.1021/acs.jpclett.9b02012] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 06/10/2023]
Abstract
In this Letter, we analyzed the inductive bias underlying complex free-energy landscapes (FELs) and exploited it to train deep neural networks that yield a reduced and clustered representation of the FEL. Our parametric method, called information distilling of metastability (IDM), is end-to-end differentiable and thus scalable to ultralarge data sets. IDM performs clustering while simultaneously reducing the dimensionality. Moreover, as an unsupervised learning method, IDM differs from many existing dimensionality reduction and clustering methods in that it requires neither a cherry-picked distance metric nor the ground-truth number of clusters defined a priori, and it can be used to unroll and zoom in on the hierarchical FEL with respect to different time scales. Through multiple experiments, we show that IDM can achieve physically meaningful representations that partition the FEL into well-defined metastable states, which are hence amenable to downstream tasks such as mechanism analysis and kinetic modeling.
Affiliation(s)
- Jun Zhang
- Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, 100871 Beijing, China
- Biodynamic Optical Imaging Center, Peking University, 100871 Beijing, China
- Yao-Kun Lei
- Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, 100871 Beijing, China
- Xing Che
- Institute of Molecular Biophysics, Florida State University, Tallahassee, Florida 32306, United States
- Zhen Zhang
- Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, 100871 Beijing, China
- Department of Physics, Tangshan Normal University, 063000 Tangshan, China
- Yi Isaac Yang
- Institute of Systems Biology, Shenzhen Bay Laboratory, 518055 Shenzhen, China
- Yi Qin Gao
- Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, 100871 Beijing, China
- Biodynamic Optical Imaging Center, Peking University, 100871 Beijing, China
- Institute of Systems Biology, Shenzhen Bay Laboratory, 518055 Shenzhen, China
19
Xie T, Wu Z, Gu J, Guo R, Yan X, Duan H, Liu X, Liu W, Liang L, Wan H, Luo Y, Tang D, Shi H, Hu J. The global motion affecting electron transfer in Plasmodium falciparum type II NADH dehydrogenases: a novel non-competitive mechanism for quinoline ketone derivative inhibitors. Phys Chem Chem Phys 2019; 21:18105-18118. [PMID: 31396604 DOI: 10.1039/c9cp02645b] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/15/2022]
Abstract
With the emergence of drug-resistant Plasmodium falciparum, the treatment of malaria has become a significant challenge; therefore, the development of antimalarial drugs acting on new targets is extremely urgent. In Plasmodium falciparum, type II nicotinamide adenine dinucleotide (NADH) dehydrogenase (NDH-2) is responsible for catalyzing the transfer of two electrons from NADH to flavin adenine dinucleotide (FAD), which in turn transfers the electrons to coenzyme Q (CoQ). As an entry enzyme for oxidative phosphorylation, NDH-2 has become one of the popular targets for the development of new antimalarial drugs. In this study, reliable motion trajectories of the NDH-2 complex with its co-factors (NADH and FAD) and inhibitor, RYL-552, were obtained by comparative molecular dynamics simulations. The influence of cofactor binding on the global motion of NDH-2 was explored through conformational clustering, principal component analysis and free energy landscape. The molecular interactions of NDH-2 before and after its binding with the inhibitor RYL-552 were analyzed, and the key residues and important hydrogen bonds were also determined. The results show that the association of RYL-552 results in the weakening of intramolecular hydrogen bonds and large allosterism of NDH-2. There was a significant positive correlation between the angular change of the key pocket residues in the NADH-FAD-pockets that represents the global functional motion and the change in distance between NADH-C4 and FAD-N5 that represents the electron transfer efficiency. Finally, the possible non-competitive inhibitory mechanism of RYL-552 was proposed. Specifically, the association of inhibitors with NDH-2 significantly affects the global motion mode of NDH-2, leading to widening of the distance between NADH and FAD through cooperative motion induction; this reduces the electron transfer efficiency of the mitochondrial respiratory chain. 
The simulation results provide useful theoretical guidance for subsequent antimalarial drug design based on the NDH-2 structure and the respiratory chain electron transfer mechanism.
Affiliation(s)
- Tao Xie
- College of Pharmacy and Biological Engineering, Sichuan Industrial Institute of Antibiotics, Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, Chengdu University, Chengdu, 610106, China.
20
Nagel D, Weber A, Lickert B, Stock G. Dynamical coring of Markov state models. J Chem Phys 2019; 150:094111. [DOI: 10.1063/1.5081767] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 01/29/2023]
Affiliation(s)
- Daniel Nagel
- Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg, Germany
- Anna Weber
- Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg, Germany
- Benjamin Lickert
- Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg, Germany
- Gerhard Stock
- Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg, Germany
21
Abstract
This chapter discusses the way in which dimensionality reduction algorithms such as diffusion maps and sketch-map can be used to analyze molecular dynamics trajectories. The first part discusses how these various algorithms function as well as practical issues such as landmark selection and how these algorithms can be used when the data to be analyzed comes from enhanced sampling trajectories. In the later part a comparison between the results obtained by applying various algorithms to two sets of sample data is performed and discussed. This section is then followed by a summary of how one algorithm in particular, sketch-map, has been applied to a range of problems. The chapter concludes with a discussion on the directions that we believe this field is currently moving.
22
Wang J, Ferguson AL. Recovery of Protein Folding Funnels from Single-Molecule Time Series by Delay Embeddings and Manifold Learning. J Phys Chem B 2018; 122:11931-11952. [PMID: 30428261 DOI: 10.1021/acs.jpcb.8b08800] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/30/2022]
Abstract
The stability and folding of proteins are governed by the underlying single-molecule free energy surface (smFES) mapping the free energy of the molecule as a function of configurational state. Ascertaining the smFES is of great value in understanding and engineering protein structure and function. By integrating tools from dynamical systems theory and nonlinear manifold learning, we describe an approach to reconstruct the multidimensional smFES for a protein from a time series in a single experimentally measurable observable. We employ Takens' delay embeddings to project the time series into a high-dimensional space in which the projected dynamics are $C^1$-equivalent to the true system dynamics and employ diffusion maps to recover a low-dimensional reconstruction of the smFES that is equivalent to the true smFES up to a smooth and invertible transformation. We validate the approach in molecular dynamics simulations of Trp-cage, Villin, and BBA to demonstrate that landscapes recovered from univariate time series in the head-to-tail distance are topologically identical (they precisely preserve the metastable states and folding pathways) and topographically approximate (the free energy barrier heights and well depths are approximately preserved) to the true landscapes determined from complete knowledge of all atomic coordinates. We go on to show that the reconstructed landscapes reliably predict temperature denaturation and identify point mutations and groups of mutations critical to folding. These results demonstrate that protein folding funnels can be reconstructed from experimentally measurable time series and used to understand and engineer folding.
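The delay-embedding step mentioned in this abstract can be illustrated with a short sketch: a scalar time series is turned into vectors of lagged copies of itself. This is a minimal Python/NumPy illustration; the function name `delay_embed` and the parameters `dim` and `lag` are hypothetical choices made here, and the sketch omits the diffusion-map step and any procedure for selecting the embedding dimension or lag.

```python
import numpy as np

def delay_embed(series, dim, lag):
    """Takens-style delay embedding of a 1D time series.

    Row t of the result is (x[t], x[t + lag], ..., x[t + (dim - 1) * lag]).
    """
    series = np.asarray(series, dtype=float)
    n_rows = series.size - (dim - 1) * lag
    if n_rows <= 0:
        raise ValueError("time series too short for this dim and lag")
    # Each column is the series shifted by a multiple of the lag.
    columns = [series[i * lag : i * lag + n_rows] for i in range(dim)]
    return np.stack(columns, axis=1)

# Example: embed a noiseless sine wave (a stand-in for a measured distance).
t = np.linspace(0.0, 20.0, 500)
embedded = delay_embed(np.sin(t), dim=3, lag=10)
```

Each row of `embedded` could then serve as one input point for a manifold learning method such as diffusion maps.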
Affiliation(s)
- Jiang Wang
- Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois 61801, United States
- Andrew L Ferguson
- Institute for Molecular Engineering, University of Chicago, 5640 South Ellis Avenue, Chicago, Illinois 60637, United States
23
Sittel F, Stock G. Perspective: Identification of collective variables and metastable states of protein dynamics. J Chem Phys 2018; 149:150901. [PMID: 30342445 DOI: 10.1063/1.5049637] [Citation(s) in RCA: 84] [Impact Index Per Article: 14.0] [Indexed: 01/25/2023]
Abstract
The statistical analysis of molecular dynamics simulations requires dimensionality reduction techniques, which yield a low-dimensional set of collective variables (CVs) $\{x_i\} = \mathbf{x}$ that in some sense describe the essential dynamics of the system. Considering the distribution $P(\mathbf{x})$ of the CVs, the primal goal of a statistical analysis is to detect the characteristic features of $P(\mathbf{x})$, in particular, its maxima and their connection paths. This is because these features characterize the low-energy regions and the energy barriers of the corresponding free energy landscape $\Delta G(\mathbf{x}) = -k_\mathrm{B} T \ln P(\mathbf{x})$, and therefore amount to the metastable states and transition regions of the system. In this perspective, we outline a systematic strategy to identify CVs and metastable states, which subsequently can be employed to construct a Langevin or a Markov state model of the dynamics. In particular, we account for the still limited sampling typically achieved by molecular dynamics simulations, which in practice seriously limits the applicability of theories (e.g., assuming ergodicity) and black-box software tools (e.g., using redundant input coordinates). We show that it is essential to use internal (rather than Cartesian) input coordinates, employ dimensionality reduction methods that avoid rescaling errors (such as principal component analysis), and perform density based (rather than k-means-type) clustering. Finally, we briefly discuss a machine learning approach to dimensionality reduction, which highlights the essential internal coordinates of a system and may reveal hidden reaction mechanisms.
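The relation between the CV distribution and the free energy landscape stated in this abstract can be made concrete with a histogram estimate along a single CV. The sketch below is a minimal Python/NumPy illustration under stated assumptions: the function name `free_energy_profile`, the bin count, and the synthetic double-well samples are all hypothetical, and energies are returned in units of `kT`.

```python
import numpy as np

def free_energy_profile(samples, bins=50, kT=1.0):
    """Estimate dG(x) = -kT * ln P(x) from samples of one CV.

    Returns bin centers and free energies shifted so the minimum is zero.
    Empty bins are assigned +inf because P(x) = 0 there.
    """
    hist, edges = np.histogram(samples, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    populated = hist > 0
    dG = np.full(hist.shape, np.inf)
    dG[populated] = -kT * np.log(hist[populated])
    dG -= dG[populated].min()  # set the global minimum to zero
    return centers, dG

# Example: synthetic samples from two metastable states near x = -1 and x = +1.
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(-1.0, 0.3, 5000),
                          rng.normal(1.0, 0.3, 5000)])
centers, dG = free_energy_profile(samples, bins=40)
```

In such a profile the maxima of the sampled distribution appear as free energy minima (metastable states), and the sparsely sampled region between them appears as a barrier.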
Affiliation(s)
- Florian Sittel
- Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg, Germany
- Gerhard Stock
- Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg, Germany
24
Chen W, Ferguson AL. Molecular enhanced sampling with autoencoders: On-the-fly collective variable discovery and accelerated free energy landscape exploration. J Comput Chem 2018; 39:2079-2102. [PMID: 30368832 DOI: 10.1002/jcc.25520] [Citation(s) in RCA: 113] [Impact Index Per Article: 18.8] [Received: 11/20/2017] [Accepted: 06/14/2018] [Indexed: 01/08/2023]
Abstract
Macromolecular and biomolecular folding landscapes typically contain high free energy barriers that impede efficient sampling of configurational space by standard molecular dynamics simulation. Biased sampling can artificially drive the simulation along prespecified collective variables (CVs), but success depends critically on the availability of good CVs associated with the important collective dynamical motions. Nonlinear machine learning techniques can identify such CVs but typically do not furnish an explicit relationship with the atomic coordinates necessary to perform biased sampling. In this work, we employ auto-associative artificial neural networks ("autoencoders") to learn nonlinear CVs that are explicit and differentiable functions of the atomic coordinates. Our approach offers substantial speedups in exploration of configurational space, and is distinguished from existing approaches by its capacity to simultaneously discover and directly accelerate along data-driven CVs. We demonstrate the approach in simulations of alanine dipeptide and Trp-cage, and have developed an open-source and freely available implementation within OpenMM. © 2018 Wiley Periodicals, Inc.
Affiliation(s)
- Wei Chen
- Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois, 61801
- Andrew L Ferguson
- Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois, 61801; Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, 1304 W Green Street, Urbana, Illinois, 61801; Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, 600 South Mathews Avenue, Urbana, Illinois, 61801
25
Chen W, Tan AR, Ferguson AL. Collective variable discovery and enhanced sampling using autoencoders: Innovations in network architecture and error function design. J Chem Phys 2018; 149:072312. [PMID: 30134681 DOI: 10.1063/1.5023804] [Citation(s) in RCA: 80] [Impact Index Per Article: 13.3] [Indexed: 01/01/2023]
Abstract
Auto-associative neural networks ("autoencoders") present a powerful nonlinear dimensionality reduction technique to mine data-driven collective variables from molecular simulation trajectories. This technique furnishes explicit and differentiable expressions for the nonlinear collective variables, making it ideally suited for integration with enhanced sampling techniques for accelerated exploration of configurational space. In this work, we describe a number of sophistications of the neural network architectures to improve and generalize the process of interleaved collective variable discovery and enhanced sampling. We employ circular network nodes to accommodate periodicities in the collective variables, hierarchical network architectures to rank-order the collective variables, and generalized encoder-decoder architectures to support bespoke error functions for network training to incorporate prior knowledge. We demonstrate our approach in blind collective variable discovery and enhanced sampling of the configurational free energy landscapes of alanine dipeptide and Trp-cage using an open-source plugin developed for the OpenMM molecular simulation package.
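One common way to realize a circular node, assumed here for illustration and not confirmed by the abstract as the exact construction used, is to pair two bottleneck pre-activations and project them onto the unit circle, so that the pair encodes a single periodic angle. The sketch below uses Python with NumPy; the names `circular_node` and `node_angle` and the `eps` guard are hypothetical.

```python
import numpy as np

def circular_node(a, b, eps=1e-12):
    """Project a pair of pre-activations (a, b) onto the unit circle.

    The outputs (p, q) satisfy p**2 + q**2 = 1 (up to eps), so together
    they carry one periodic degree of freedom.
    """
    r = np.sqrt(a ** 2 + b ** 2) + eps  # eps avoids division by zero
    return a / r, b / r

def node_angle(p, q):
    """Recover the periodic CV value in (-pi, pi] from the node outputs."""
    return np.arctan2(q, p)

# Example: pre-activations that differ only in magnitude encode the same angle.
p1, q1 = circular_node(3.0, 4.0)
p2, q2 = circular_node(30.0, 40.0)
theta1, theta2 = node_angle(p1, q1), node_angle(p2, q2)
```

Because the angle is periodic by construction, a CV such as a backbone dihedral does not suffer an artificial jump between -pi and pi.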
Affiliation(s)
- Wei Chen
- Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois 61801, USA
- Aik Rui Tan
- Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, 1304 West Green Street, Urbana, Illinois 61801, USA
- Andrew L Ferguson
- Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois 61801, USA
26
Sun X, Yan X, Zhuo W, Gu J, Zuo K, Liu W, Liang L, Gan Y, He G, Wan H, Gou X, Shi H, Hu J. PD-L1 Nanobody Competitively Inhibits the Formation of the PD-1/PD-L1 Complex: Comparative Molecular Dynamics Simulations. Int J Mol Sci 2018; 19:E1984. [PMID: 29986511 PMCID: PMC6073277 DOI: 10.3390/ijms19071984] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 06/01/2018] [Revised: 07/02/2018] [Accepted: 07/04/2018] [Indexed: 12/22/2022]
Abstract
The anti-PD-L1 monoclonal antibody (mAb) targeting PD-1/PD-L1 immune checkpoint has achieved outstanding results in clinical application and has become one of the most popular anti-cancer drugs. The mechanism of molecular recognition and inhibition of PD-L1 mAbs is not yet clear, which hinders the subsequent antibody design and modification. In this work, the trajectories of PD-1/PD-L1 and nanobody/PD-L1 complexes were obtained via comparative molecular dynamics simulations. Then, a series of physicochemical parameters including hydrogen bond, dihedral angle distribution, pKa value and binding free energy, and so forth, were all comparatively analyzed to investigate the recognition difference between PD-L1 and PD-1 and nanobody. Both LR113 (the amino acid residues in PD-L1 are represented by the lower left sign of L) and LR125 residues of PD-L1 undergo significant conformational change after association with mAbs, which dominates a strong electrostatic interaction. Solvation effect analysis revealed that solvent-water enhanced molecular recognition between PD-L1 and nanobody. By combining the analyses of the time-dependent root mean squared fluctuation (RMSF), free energy landscape, clustering and energy decomposition, the potential inhibition mechanism was proposed that the nanobody competitively and specifically bound to the β-sheet groups of PD-L1, reduced the PD-L1’s flexibility and finally blocked the formation of PD-1/PD-L1 complex. Based on the simulation results, site-directed mutagenesis of ND99 (the amino acid residues in Nano are displayed by the lower left sign of N) and NQ116 in the nanobody may be beneficial for improving antibody activity. This work offers some structural guidance for the design and modification of anticancer mAbs based on the structure of the PD-1/PD-L1 complex.
Affiliation(s)
- Xin Sun
- College of Pharmacy and Biological Engineering, Sichuan Industrial Institute of Antibiotics, Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, Antibiotics Research and Re-evaluation Key Laboratory of Sichuan Province, Chengdu University, Chengdu 610106, China.
- Xiao Yan
- College of Pharmacy and Biological Engineering, Sichuan Industrial Institute of Antibiotics, Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, Antibiotics Research and Re-evaluation Key Laboratory of Sichuan Province, Chengdu University, Chengdu 610106, China.
- Wei Zhuo
- Ministry of Education Key Laboratory of Protein Science, Tsinghua-Peking Joint Center for Life Sciences, Beijing Advanced Innovation Center for Structural Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China.
- Jinke Gu
- Ministry of Education Key Laboratory of Protein Science, Tsinghua-Peking Joint Center for Life Sciences, Beijing Advanced Innovation Center for Structural Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China.
- Ke Zuo
- College of Pharmacy and Biological Engineering, Sichuan Industrial Institute of Antibiotics, Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, Antibiotics Research and Re-evaluation Key Laboratory of Sichuan Province, Chengdu University, Chengdu 610106, China.
- Wei Liu
- College of Pharmacy and Biological Engineering, Sichuan Industrial Institute of Antibiotics, Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, Antibiotics Research and Re-evaluation Key Laboratory of Sichuan Province, Chengdu University, Chengdu 610106, China.
- Li Liang
- College of Pharmacy and Biological Engineering, Sichuan Industrial Institute of Antibiotics, Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, Antibiotics Research and Re-evaluation Key Laboratory of Sichuan Province, Chengdu University, Chengdu 610106, China.
- Ya Gan
- College of Pharmacy and Biological Engineering, Sichuan Industrial Institute of Antibiotics, Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, Antibiotics Research and Re-evaluation Key Laboratory of Sichuan Province, Chengdu University, Chengdu 610106, China.
- Gang He
- College of Pharmacy and Biological Engineering, Sichuan Industrial Institute of Antibiotics, Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, Antibiotics Research and Re-evaluation Key Laboratory of Sichuan Province, Chengdu University, Chengdu 610106, China.
- Hua Wan
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China.
- Xiaojun Gou
- College of Pharmacy and Biological Engineering, Sichuan Industrial Institute of Antibiotics, Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, Antibiotics Research and Re-evaluation Key Laboratory of Sichuan Province, Chengdu University, Chengdu 610106, China.
| | - Hubing Shi
- Laboratory of tumor targeted and immune therapy, Clinical Research Center for Breast, State Key Laboratory of Biotherapy, Sichuan University and Collaborative Innovation Center for Biotherapy, Chengdu 610041, China.
| | - Jianping Hu
- College of Pharmacy and Biological Engineering, Sichuan Industrial Institute of Antibiotics, Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, Antibiotics Research and Re-evaluation Key Laboratory of Sichuan Province, Chengdu University, Chengdu 610106, China.
| |
|
27
|
Zhang J, Chen M. Unfolding Hidden Barriers by Active Enhanced Sampling. Phys Rev Lett 2018; 121:010601. [PMID: 30028174 DOI: 10.1103/physrevlett.121.010601] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 11/10/2017] [Indexed: 05/27/2023]
Abstract
Collective variable (CV) or order parameter based enhanced sampling algorithms have achieved great success due to their ability to efficiently explore the rough potential energy landscapes of complex systems. However, the degeneracy of microscopic configurations, originating from the orthogonal space perpendicular to the CVs, is likely to shadow "hidden barriers" and greatly reduce the efficiency of CV-based sampling. Here we demonstrate that systematic machine learning CV, through enhanced sampling, can iteratively lift such degeneracies on the fly. We introduce an active learning scheme that consists of a parametric CV learner based on deep neural network and a CV-based enhanced sampler. Our active enhanced sampling algorithm is capable of identifying the least informative regions based on a historical sample, forming a positive feedback loop between the CV learner and sampler. This approach is able to globally preserve kinetic characteristics by incrementally enhancing both sample completeness and CV quality.
Affiliation(s)
- Jing Zhang
- KLA-Tencor, One Technology Drive, Milpitas, California 95035, USA
| | - Ming Chen
- Department of Chemistry, University of California, Berkeley, California 94720, USA
| |
|
28
|
Stock G, Hamm P. A non-equilibrium approach to allosteric communication. Philos Trans R Soc Lond B Biol Sci 2018; 373:20170187. [PMID: 29735740 PMCID: PMC5941181 DOI: 10.1098/rstb.2017.0187] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Accepted: 01/16/2018] [Indexed: 12/16/2022] Open
Abstract
While the theory of protein folding is well developed, including concepts such as rugged energy landscape, folding funnel, etc., the same degree of understanding has not been reached for the description of the dynamics of allosteric transitions in proteins. This is not only due to the small size of the structural change upon ligand binding to an allosteric site, but also due to challenges in designing experiments that directly observe such an allosteric transition. On the basis of recent pump-probe-type experiments (Buchli et al. 2013 Proc. Natl Acad. Sci. USA 110, 11725-11730. (doi:10.1073/pnas.1306323110)) and non-equilibrium molecular dynamics simulations (Buchenberg et al. 2017 Proc. Natl Acad. Sci. USA 114, E6804-E6811. (doi:10.1073/pnas.1707694114)) studying a photoswitchable PDZ2 domain as a model for an allosteric transition, we outline in this perspective how such a description of allosteric communication might look. That is, calculating the dynamical content of both experiment and simulation (which agree remarkably well with each other), we find that allosteric communication shares some properties with downhill folding, except that it is an 'order-order' transition. Discussing the multiscale and hierarchical features of the dynamics, the validity of linear response theory as well as the meaning of 'allosteric pathways', we conclude that non-equilibrium experiments and simulations are a promising way to study dynamical aspects of allostery. This article is part of a discussion meeting issue 'Allostery and molecular machines'.
Affiliation(s)
- Gerhard Stock
- Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, Freiburg, Germany
| | - Peter Hamm
- Department of Chemistry, University of Zurich, Zurich, Switzerland
| |
|
29
|
Wang J, Gayatri M, Ferguson AL. Coarse-Grained Molecular Simulation and Nonlinear Manifold Learning of Archipelago Asphaltene Aggregation and Folding. J Phys Chem B 2018; 122:6627-6647. [DOI: 10.1021/acs.jpcb.8b01634] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 11/28/2022]
Affiliation(s)
- Jiang Wang
- Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois 61801, United States
| | - Mohit Gayatri
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, 600 South Mathews Avenue, Urbana, Illinois 61801, United States
| | - Andrew L. Ferguson
- Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois 61801, United States
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, 600 South Mathews Avenue, Urbana, Illinois 61801, United States
- Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, 1304 West Green Street, Urbana, Illinois 61801, United States
| |
|
30
|
Ferguson AL. Machine learning and data science in soft materials engineering. J Phys Condens Matter 2018; 30:043002. [PMID: 29111979 DOI: 10.1088/1361-648x/aa98bd] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Indexed: 05/10/2023]
Abstract
In many branches of materials science it is now routine to generate data sets of such large size and dimensionality that conventional methods of analysis fail. Paradigms and tools from data science and machine learning can provide scalable approaches to identify and extract trends and patterns within voluminous data sets, perform guided traversals of high-dimensional phase spaces, and furnish data-driven strategies for inverse materials design. This topical review provides an accessible introduction to machine learning tools in the context of soft and biological materials by 'de-jargonizing' data science terminology, presenting a taxonomy of machine learning techniques, and surveying the mathematical underpinnings and software implementations of popular tools, including principal component analysis, independent component analysis, diffusion maps, support vector machines, and relative entropy. We present illustrative examples of machine learning applications in soft matter, including inverse design of self-assembling materials, nonlinear learning of protein folding landscapes, high-throughput antimicrobial peptide design, and data-driven materials design engines. We close with an outlook on the challenges and opportunities for the field.
Affiliation(s)
- Andrew L Ferguson
- Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, 1304 West Green Street, Urbana, IL 61801, United States of America. Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, 600 South Mathews Avenue, Urbana, IL 61801, United States of America. Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, IL 61801, United States of America. Frederick Seitz Materials Research Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States of America. Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States of America
| |
|
31
|
Wang J, Ferguson AL. A Study of the Morphology, Dynamics, and Folding Pathways of Ring Polymers with Supramolecular Topological Constraints Using Molecular Simulation and Nonlinear Manifold Learning. Macromolecules 2018. [DOI: 10.1021/acs.macromol.7b01684] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 02/07/2023]
Affiliation(s)
- Jiang Wang
- Department of Physics, Department of Materials Science and Engineering, and Department of Chemical and Biomolecular Engineering, University of Illinois Urbana−Champaign, Urbana, Illinois 61801, United States
| | - Andrew L. Ferguson
- Department of Physics, Department of Materials Science and Engineering, and Department of Chemical and Biomolecular Engineering, University of Illinois Urbana−Champaign, Urbana, Illinois 61801, United States
| |
|
32
|
Wang J, Ferguson AL. Nonlinear machine learning in simulations of soft and biological materials. Mol Simul 2017. [DOI: 10.1080/08927022.2017.1400164] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 12/27/2022]
Affiliation(s)
- J. Wang
- Department of Physics, University of Illinois Urbana-Champaign, Urbana, IL, USA
| | - A. L. Ferguson
- Department of Physics, University of Illinois Urbana-Champaign, Urbana, IL, USA
- Department of Materials Science and Engineering, University of Illinois Urbana-Champaign, Urbana, IL, USA
- Department of Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign, Urbana, IL, USA
| |
|
33
|
Affiliation(s)
- Zhen-Gang Wang
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
| |
|
34
|
Cloete R, Akurugu WA, Werely CJ, van Helden PD, Christoffels A. Structural and functional effects of nucleotide variation on the human TB drug metabolizing enzyme arylamine N-acetyltransferase 1. J Mol Graph Model 2017. [PMID: 28628859 DOI: 10.1016/j.jmgm.2017.04.026] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 11/17/2022]
Abstract
The human arylamine N-acetyltransferase 1 (NAT1) enzyme plays a vital role in determining the duration of action of amine-containing drugs such as para-aminobenzoic acid (PABA) by influencing the balance between detoxification and metabolic activation of these drugs. Recently, four novel single nucleotide polymorphisms (SNPs) were identified within a South African mixed ancestry population. Modeling the effects of these SNPs within the structural protein was done to assess possible structure and function changes in the enzyme. The use of molecular dynamics simulations and stability predictions indicated less thermodynamically stable protein structures containing E264K and V231G, while the N245I change showed a stabilizing effect. Coincidentally, the N245I change displayed a similar free energy landscape profile to the known R64W amino acid substitution (slow acetylator), while the R242M displayed a similar profile to the published variant, I263V (proposed fast acetylator), and the wild type protein structure. Similarly, principal component analysis indicated that two amino acid substitutions (E264K and V231G) occupied fewer conformational clusters of folded states as compared to the WT and were found to be destabilizing (may affect protein function). However, two of the four novel SNPs that result in amino acid changes (V231G and N245I) were predicted by both SIFT and POLYPHEN-2 algorithms to affect NAT1 protein function, while two other SNPs that result in R242M and E264K substitutions showed contradictory results based on SIFT and POLYPHEN-2 analysis. In conclusion, the structural methods were able to verify that two non-synonymous substitutions (E264K and V231G) can destabilize the protein structure, and are in agreement with mCSM predictions, and should therefore be experimentally tested for NAT1 activity. These findings could inform a strategy of incorporating genotypic data (i.e., functional SNP alleles) with phenotypic information (slow or fast acetylator) to better prescribe effective treatment using drugs metabolized by NAT1.
Affiliation(s)
- Ruben Cloete
- South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Private Bag X17, Bellville, Cape Town 7535, South Africa.
| | - Wisdom A Akurugu
- South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Private Bag X17, Bellville, Cape Town 7535, South Africa.
| | - Cedric J Werely
- SAMRC Centre for Molecular and Cellular Biology, and DST-NRF Centre of Excellence for Biomedical TB Research. Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, P.O. Box 241, Cape Town 8000, South Africa.
| | - Paul D van Helden
- SAMRC Centre for Molecular and Cellular Biology, and DST-NRF Centre of Excellence for Biomedical TB Research. Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, P.O. Box 241, Cape Town 8000, South Africa.
| | - Alan Christoffels
- South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Private Bag X17, Bellville, Cape Town 7535, South Africa.
| |
|
35
|
Ferguson AL. BayesWHAM: A Bayesian approach for free energy estimation, reweighting, and uncertainty quantification in the weighted histogram analysis method. J Comput Chem 2017; 38:1583-1605. [PMID: 28475830 DOI: 10.1002/jcc.24800] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Received: 08/05/2016] [Revised: 11/27/2016] [Accepted: 03/19/2017] [Indexed: 01/18/2023]
Abstract
The weighted histogram analysis method (WHAM) is a powerful approach to estimate molecular free energy surfaces (FES) from biased simulation data. Bayesian reformulations of WHAM are valuable in proving statistically optimal use of the data and providing a transparent means to incorporate regularizing priors and estimate statistical uncertainties. In this work, we develop a fully Bayesian treatment of WHAM to generate statistically optimal FES estimates in any number of biasing dimensions under arbitrary choices of the Bayes prior. Rigorous uncertainty estimates are generated by Metropolis-Hastings sampling from the Bayes posterior. We also report a means to project the FES and its uncertainties into arbitrary auxiliary order parameters beyond those in which biased sampling was conducted. We demonstrate the approaches in applications of alanine dipeptide and the unthreading of a synthetic mimic of the astexin-3 lasso peptide. Open-source MATLAB and Python implementations of our codes are available for free public download. © 2017 Wiley Periodicals, Inc.
Affiliation(s)
- Andrew L Ferguson
- Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801; Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801
| |
|
36
|
Hashemian B, Millán D, Arroyo M. Charting molecular free-energy landscapes with an atlas of collective variables. J Chem Phys 2016; 145:174109. [DOI: 10.1063/1.4966262] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 12/19/2022] Open
Affiliation(s)
- Behrooz Hashemian
- LaCàN, Universitat Politècnica de Catalunya–BarcelonaTech, Barcelona, Spain
| | - Daniel Millán
- LaCàN, Universitat Politècnica de Catalunya–BarcelonaTech, Barcelona, Spain
| | - Marino Arroyo
- LaCàN, Universitat Politècnica de Catalunya–BarcelonaTech, Barcelona, Spain
| |
|
37
|
Sittel F, Stock G. Robust Density-Based Clustering To Identify Metastable Conformational States of Proteins. J Chem Theory Comput 2016; 12:2426-35. [PMID: 27058020 DOI: 10.1021/acs.jctc.5b01233] [Citation(s) in RCA: 56] [Impact Index Per Article: 7.0] [Indexed: 12/20/2022]
Abstract
A density-based clustering method is proposed that is deterministic, computationally efficient, and self-consistent in its parameter choice. By calculating a geometric coordinate space density for every point of a given data set, a local free energy is defined. On the basis of these free energy estimates, the frames are lumped into local free energy minima, ultimately forming microstates separated by local free energy barriers. The algorithm is embedded into a complete workflow to robustly generate Markov state models from molecular dynamics trajectories. It consists of (i) preprocessing of the data via principal component analysis in order to reduce the dimensionality of the problem, (ii) proposed density-based clustering to generate microstates, and (iii) dynamical clustering via the most probable path algorithm to construct metastable states. To characterize the resulting state-resolved conformational distribution, dihedral angle content color plots are introduced which identify structural differences of protein states in a concise way. To illustrate the performance of the method, three well-established model problems are adopted: conformational transitions of hepta-alanine, folding of villin headpiece, and functional dynamics of bovine pancreatic trypsin inhibitor.
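The local free energy estimate described in this abstract can be sketched as follows. This is a minimal pure-Python illustration, assuming the density is taken as the number of points within a fixed radius and the free energy (in units of kT) as the negative logarithm of that count relative to the largest count. The function name `local_free_energy`, the radius, and the toy points are hypothetical, and the subsequent lumping of frames into minima is not shown.

```python
import math

def local_free_energy(points, radius):
    """Estimate a free energy (in units of kT) per point from neighbor counts."""
    r2 = radius * radius
    counts = []
    for p in points:
        n = sum(
            1 for q in points
            if sum((a - b) ** 2 for a, b in zip(p, q)) <= r2
        )
        counts.append(n)  # the count includes the point itself, so n >= 1
    n_max = max(counts)
    # Densest point gets free energy 0; sparser points get positive values.
    return [-math.log(n / n_max) for n in counts]

# Toy data (assumed): two dense clumps joined by one sparsely populated bridge point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 0.0),
          (2.0, 0.0), (2.1, 0.0), (2.0, 0.1)]
energies = local_free_energy(points, radius=0.3)
```

In this toy set the bridge point at (1.0, 0.0) has the fewest neighbors and therefore the highest estimated free energy, which is the kind of barrier that would separate two microstates.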
Affiliation(s)
- Florian Sittel
- Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg, Germany
| | - Gerhard Stock
- Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg, Germany
| |
|
38
|
Wang J, Ferguson AL. Nonlinear reconstruction of single-molecule free-energy surfaces from univariate time series. Phys Rev E 2016; 93:032412. [PMID: 27078395 DOI: 10.1103/physreve.93.032412] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 05/10/2015] [Indexed: 01/27/2023]
Abstract
The stable conformations and dynamical fluctuations of polymers and macromolecules are governed by the underlying single-molecule free energy surface. By integrating ideas from dynamical systems theory with nonlinear manifold learning, we have recovered single-molecule free energy surfaces from univariate time series in a single coarse-grained system observable. Using Takens' Delay Embedding Theorem, we expand the univariate time series into a high dimensional space in which the dynamics are equivalent to those of the molecular motions in real space. We then apply the diffusion map nonlinear manifold learning algorithm to extract a low-dimensional representation of the free energy surface that is diffeomorphic to that computed from a complete knowledge of all system degrees of freedom. We validate our approach in molecular dynamics simulations of a C24H50 n-alkane chain to demonstrate that the two-dimensional free energy surface extracted from the atomistic simulation trajectory is - subject to spatial and temporal symmetries - geometrically and topologically equivalent to that recovered from a knowledge of only the head-to-tail distance of the chain. Our approach lays the foundations to extract empirical single-molecule free energy surfaces directly from experimental measurements.
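The delay-embedding step mentioned in this abstract can be illustrated with a short sketch. It assumes a uniformly sampled scalar series and builds vectors of lagged copies of it. The function name `delay_embed`, the embedding dimension, the lag, and the sine-wave stand-in for a head-to-tail distance trace are all hypothetical choices, and the diffusion map stage is not shown.

```python
import math

def delay_embed(series, dim, lag):
    """Return delay vectors [x(t), x(t+lag), ..., x(t+(dim-1)*lag)]."""
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    n_vectors = len(series) - (dim - 1) * lag
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and lag")
    return [
        [series[t + k * lag] for k in range(dim)]
        for t in range(n_vectors)
    ]

# Toy observable (assumed) standing in for a measured univariate time series.
signal = [math.sin(0.1 * t) for t in range(200)]
embedded = delay_embed(signal, dim=3, lag=5)
```

Each row of `embedded` is one point in the reconstructed space; in practice the dimension and lag would have to be chosen for the system at hand.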
Affiliation(s)
- Jiang Wang
- Department of Physics, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
| | - Andrew L Ferguson
- Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA; Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
| |
|
39
|
Ernst M, Sittel F, Stock G. Contact- and distance-based principal component analysis of protein dynamics. J Chem Phys 2015; 143:244114. [DOI: 10.1063/1.4938249] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Indexed: 11/14/2022] Open
|
40
|
Mansbach RA, Ferguson AL. Machine learning of single molecule free energy surfaces and the impact of chemistry and environment upon structure and dynamics. J Chem Phys 2015; 142:105101. [DOI: 10.1063/1.4914144] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Indexed: 02/06/2023] Open
|
41
|
Hashemian B, Arroyo M. Topological obstructions in the way of data-driven collective variables. J Chem Phys 2015; 142:044102. [PMID: 25637964 DOI: 10.1063/1.4906425] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 01/11/2023] Open
Abstract
Nonlinear dimensionality reduction (NLDR) techniques are increasingly used to visualize molecular trajectories and to create data-driven collective variables for enhanced sampling simulations. The success of these methods relies on their ability to identify the essential degrees of freedom characterizing conformational changes. Here, we show that NLDR methods face serious obstacles when the underlying collective variables present periodicities, e.g., arising from proper dihedral angles. As a result, NLDR methods collapse very distant configurations, thus leading to misinterpretations and inefficiencies in enhanced sampling. Here, we identify this largely overlooked problem and discuss possible approaches to overcome it. We also characterize the geometry and topology of conformational changes of alanine dipeptide, a benchmark system for testing new methods to identify collective variables.
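A toy illustration of the collapse described in this abstract, under simplifying assumptions: a single periodic angle is reduced to one continuous scalar coordinate, here the cosine, which is a hypothetical choice and not a method from the paper. Two configurations that are half a turn apart on the circle then receive the same coordinate value.

```python
import math

def one_d_cv(theta):
    """A continuous 1D coordinate for a periodic angle (hypothetical choice)."""
    return math.cos(theta)

def circular_separation(a, b):
    """Shortest angular separation between two angles, in radians."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

theta_1 = math.radians(90.0)
theta_2 = math.radians(-90.0)

# Gap between the two configurations as seen by the 1D coordinate.
cv_gap = abs(one_d_cv(theta_1) - one_d_cv(theta_2))
# Gap between the same two configurations measured on the circle.
true_gap = circular_separation(theta_1, theta_2)
```

Here `cv_gap` is numerically zero while `true_gap` is half a turn, so a sampler biased along this coordinate could not tell the two configurations apart.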
Affiliation(s)
- Behrooz Hashemian
- LaCàN, Universitat Politècnica de Catalunya–BarcelonaTech, Barcelona, Spain
| | - Marino Arroyo
- LaCàN, Universitat Politècnica de Catalunya–BarcelonaTech, Barcelona, Spain
| |
|
42
|
George Priya Doss C, Rajith B, Magesh R, Ashish Kumar A. Influence of the SNPs on the structural stability of CBS protein: Insight from molecular dynamics simulations. ACTA ACUST UNITED AC 2014. [DOI: 10.1007/s11515-014-1320-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/19/2023]
|
43
|
Hashemian B, Millán D, Arroyo M. Modeling and enhanced sampling of molecular systems with smooth and nonlinear data-driven collective variables. J Chem Phys 2014; 139:214101. [PMID: 24320358 DOI: 10.1063/1.4830403] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Indexed: 11/14/2022] Open
Abstract
Collective variables (CVs) are low-dimensional representations of the state of a complex system, which help us rationalize molecular conformations and sample free energy landscapes with molecular dynamics simulations. Given their importance, there is need for systematic methods that effectively identify CVs for complex systems. In recent years, nonlinear manifold learning has shown its ability to automatically characterize molecular collective behavior. Unfortunately, these methods fail to provide a differentiable function mapping high-dimensional configurations to their low-dimensional representation, as required in enhanced sampling methods. We introduce a methodology that, starting from an ensemble representative of molecular flexibility, builds smooth and nonlinear data-driven collective variables (SandCV) from the output of nonlinear manifold learning algorithms. We demonstrate the method with a standard benchmark molecule, alanine dipeptide, and show how it can be non-intrusively combined with off-the-shelf enhanced sampling methods, here the adaptive biasing force method. We illustrate how enhanced sampling simulations with SandCV can explore regions that were poorly sampled in the original molecular ensemble. We further explore the transferability of SandCV from a simpler system, alanine dipeptide in vacuum, to a more complex system, alanine dipeptide in explicit water.
Affiliation(s)
- Behrooz Hashemian
- LaCàN, Universitat Politècnica de Catalunya - BarcelonaTech, Campus Nord, 08034 Barcelona, Spain
| | | | | |
|
44
|
Sicard F, Senet P. Reconstructing the free-energy landscape of Met-enkephalin using dihedral principal component analysis and well-tempered metadynamics. J Chem Phys 2014; 138:235101. [PMID: 23802984 DOI: 10.1063/1.4810884] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Indexed: 12/12/2022] Open
Abstract
Well-Tempered Metadynamics (WTmetaD) is an efficient method to enhance the reconstruction of the free-energy surface of proteins. WTmetaD guarantees a faster convergence in the long time limit in comparison with the standard metadynamics. It still suffers, however, from the same limitation, i.e., the non-trivial choice of pertinent collective variables (CVs). To circumvent this problem, we couple WTmetaD with a set of CVs generated from a dihedral Principal Component Analysis (dPCA) on the Ramachandran dihedral angles describing the backbone structure of the protein. The dPCA provides a generic method to extract relevant CVs built from internal coordinates, and does not depend on the alignment to an arbitrarily chosen reference structure as usual in Cartesian PCA. We illustrate the robustness of this method in the case of a reference model protein, the small and very diffusive Met-enkephalin pentapeptide. We propose a justification a posteriori of the considered number of CVs necessary to bias the metadynamics simulation in terms of the one-dimensional free-energy profiles associated with Ramachandran dihedral angles along the amino-acid sequence.
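The dPCA preprocessing mentioned in this abstract can be sketched as follows. A minimal pure-Python illustration, assuming the common formulation in which each dihedral angle is replaced by its cosine and sine before a covariance matrix is built; the helper names `dihedral_features` and `covariance` and the four toy frames are hypothetical. Diagonalising the covariance matrix (for instance with `numpy.linalg.eigh`) would then give the principal components; that step and the metadynamics biasing are not shown.

```python
import math

def dihedral_features(frame):
    """Map dihedral angles (radians) to [cos a1, sin a1, cos a2, sin a2, ...]."""
    feats = []
    for angle in frame:
        feats.extend((math.cos(angle), math.sin(angle)))
    return feats

def covariance(rows):
    """Sample covariance matrix of equally weighted rows (divides by n - 1)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]

# Hypothetical trajectory: (phi, psi) pairs in degrees for four frames.
frames_deg = [(-60, -45), (-65, -40), (-120, 130), (179, -179)]
frames = [[math.radians(a) for a in f] for f in frames_deg]
features = [dihedral_features(f) for f in frames]
cov = covariance(features)
```

Because the features are built from internal angles, no structural alignment to a reference frame is needed before computing `cov`, which is the advantage over Cartesian PCA that the abstract points out.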
Affiliation(s)
- François Sicard
- Laboratoire Interdisciplinaire Carnot de Bourgogne, UMR 6303 CNRS-Université de Bourgogne, 9 Avenue A. Savary, BP 47 870, F-21078 Dijon Cedex, France.
| | | |
|
45
|
Jain A, Stock G. Hierarchical Folding Free Energy Landscape of HP35 Revealed by Most Probable Path Clustering. J Phys Chem B 2014; 118:7750-60. [DOI: 10.1021/jp410398a] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Indexed: 12/20/2022]
Affiliation(s)
- Abhinav Jain
- Biomolecular Dynamics, Institute of Physics and Freiburg Institute for Advanced Studies (FRIAS), Albert Ludwigs University, 79104 Freiburg, Germany
| | - Gerhard Stock
- Biomolecular Dynamics, Institute of Physics and Freiburg Institute for Advanced Studies (FRIAS), Albert Ludwigs University, 79104 Freiburg, Germany
| |
|
46
|
Li SS, Huang CY, Hao JJ, Wang CS. A polarizable dipole-dipole interaction model for evaluation of the interaction energies for NH···OC and CH···OC hydrogen-bonded complexes. J Comput Chem 2013; 35:415-26. [DOI: 10.1002/jcc.23473] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 06/18/2013] [Revised: 09/30/2013] [Accepted: 10/03/2013] [Indexed: 02/02/2023]
Affiliation(s)
- Shu-Shi Li
- Department of Chemistry; Liaoning Normal University; Dalian 116029 People's Republic of China
| | - Cui-Ying Huang
- Department of Chemistry; Liaoning Normal University; Dalian 116029 People's Republic of China
| | - Jiao-Jiao Hao
- Department of Chemistry; Liaoning Normal University; Dalian 116029 People's Republic of China
| | - Chang-Sheng Wang
- Department of Chemistry; Liaoning Normal University; Dalian 116029 People's Republic of China
| |
|
47
|
Cormanich RA, Ducati LC, Tormena CF, Rittner R. A theoretical investigation of the dictating forces in small amino acid conformational preferences: The case of glycine, sarcosine and N,N-dimethylglycine. Chem Phys 2013. [DOI: 10.1016/j.chemphys.2013.05.007] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 12/01/2022]
|
48
|
Berezovska G, Prada-Gracia D, Mostarda S, Rao F. Accounting for the kinetics in order parameter analysis: lessons from theoretical models and a disordered peptide. J Chem Phys 2013. [PMID: 23181288 DOI: 10.1063/1.4764868] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 12/21/2022] Open
Abstract
Molecular simulations as well as single molecule experiments have been widely analyzed in terms of order parameters, the latter representing candidate probes for the relevant degrees of freedom. Although this approach is very intuitive, mounting evidence has shown that such descriptions are inaccurate, leading to ambiguous definitions of states and wrong kinetics. To overcome these limitations, a framework making use of order parameter fluctuations in conjunction with complex network analysis is investigated. Derived from recent advances in the analysis of single molecule time traces, this approach takes into account the fluctuations around each time point to distinguish between states that have similar values of the order parameter but different dynamics. Snapshots with similar fluctuations are used as nodes of a transition network, the clusterization of which into states provides accurate Markov-state-models of the system under study. Application of the methodology to theoretical models with a noisy order parameter as well as the dynamics of a disordered peptide illustrates the possibility to build accurate descriptions of molecular processes on the sole basis of order parameter time series without using any supplementary information.
Affiliation(s)
- Ganna Berezovska
- Freiburg Institute for Advanced Studies, School of Soft Matter Research, Freiburg im Breisgau, Germany
| | | | | | | |
|
49
|
Wan H, Hu JP, Tian XH, Chang S. Molecular dynamics simulations of wild type and mutants of human complement receptor 2 complexed with C3d. Phys Chem Chem Phys 2013; 15:1241-51. [DOI: 10.1039/c2cp41388d] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Indexed: 11/21/2022]
|
50
|
Abstract
Recent molecular dynamics simulations of biopolymers have shown that in many cases the global features of the free energy landscape can be characterized in terms of the metastable conformational states of the system. To identify these states, a conceptually and computationally simple approach is proposed. It consists of (i) an initial preprocessing via principal component analysis to reduce the dimensionality of the data, followed by k-means clustering to generate up to 10^4 microstates, (ii) the most probable path algorithm to identify the metastable states of the system, and (iii) boundary corrections of these states via the introduction of cluster cores in order to obtain the correct dynamics. By adopting two well-studied model problems, hepta-alanine and the villin headpiece protein, the potential and the performance of the approach are demonstrated.
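The k-means part of step (i) in this abstract can be sketched as follows. This is a minimal pure-Python Lloyd iteration on made-up two-dimensional points that stand in for PCA-reduced frames; the function name `kmeans`, the choice of the first k points as initial centers, and the toy data are assumptions for illustration and not the authors' implementation. The most probable path algorithm and the core-based boundary corrections of steps (ii) and (iii) are not shown.

```python
def kmeans(points, k, n_iter=50):
    """Plain Lloyd iteration; centers start at the first k points."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Toy data (assumed): two well-separated groups of frames in a 2D reduced space.
points = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.2, 5.0), (0.0, 0.2), (5.0, 5.2)]
labels, centers = kmeans(points, k=2)
```

Each label plays the role of a microstate index for the corresponding frame; a production workflow would use far more clusters and a more careful initialisation.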
Affiliation(s)
- Abhinav Jain
- Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg, Germany
| | - Gerhard Stock
- Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg, Germany
| |
|