1
Chen L, Roe DR, Kochert M, Simmerling C, Miranda-Quintana RA. k-Means NANI: An Improved Clustering Algorithm for Molecular Dynamics Simulations. J Chem Theory Comput 2024; 20:5583-5597. [PMID: 38905589 PMCID: PMC11541788 DOI: 10.1021/acs.jctc.4c00308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/23/2024]
Abstract
One of the key challenges of k-means clustering is the seed selection or the initial centroid estimation since the clustering result depends heavily on this choice. Alternatives such as k-means++ have mitigated this limitation by estimating the centroids using an empirical probability distribution. However, with high-dimensional and complex data sets such as those obtained from molecular simulation, k-means++ fails to partition the data in an optimal manner. Furthermore, stochastic elements in all flavors of k-means++ will lead to a lack of reproducibility. K-means N-Ary Natural Initiation (NANI) is presented as an alternative to tackle this challenge by using efficient n-ary comparisons to both identify high-density regions in the data and select a diverse set of initial conformations. Centroids generated from NANI are not only representative of the data and different from one another, helping k-means to partition the data accurately, but also deterministic, providing consistent cluster populations across replicates. From peptide and protein folding molecular simulations, NANI was able to create compact and well-separated clusters as well as accurately find the metastable states that agree with the literature. NANI can cluster diverse data sets and be used as a standalone tool or as part of our MDANCE clustering package.
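The seed dependence described in this abstract can be illustrated with a small, self-contained sketch. It is not the NANI algorithm: `deterministic_seeds` is a hypothetical stand-in that starts from the point nearest the global mean and then repeatedly adds the point farthest from the seeds chosen so far, so that repeated runs give identical clusters. All function names and the toy data are assumptions made for illustration.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def deterministic_seeds(points, k):
    """Hypothetical deterministic initializer (NOT NANI itself):
    take the point nearest the global mean, then repeatedly add the
    point whose nearest seed is farthest away (maximin diversity)."""
    center = mean(points)
    seeds = [min(points, key=lambda p: dist2(p, center))]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(dist2(p, s) for s in seeds)))
    return seeds


def kmeans(points, seeds, n_iter=50):
    """Plain Lloyd iterations starting from the given seeds."""
    centroids = list(seeds)
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda j: dist2(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean(members)
    return labels, centroids


# Toy data: three well-separated groups of three points each (an assumption).
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
          (5.0, 5.0), (5.2, 4.9), (4.9, 5.3),
          (10.0, 0.0), (10.1, 0.2), (9.8, 0.1)]

labels_a, centroids_a = kmeans(points, deterministic_seeds(points, 3))
labels_b, centroids_b = kmeans(points, deterministic_seeds(points, 3))
print(labels_a == labels_b)  # no random numbers are used, so runs agree
```

Because no random number generator is involved, the cluster populations are the same on every run, which is the reproducibility property the abstract contrasts with k-means++.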
Affiliation(s)
- Lexin Chen
  - Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
  - Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
- Daniel R Roe
  - Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Matthew Kochert
  - Laufer Center for Physical & Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
  - Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Carlos Simmerling
  - Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
  - Laufer Center for Physical & Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
  - Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York 11794, United States
- Ramón Alain Miranda-Quintana
  - Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
  - Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
2
Chen L, Roe DR, Kochert M, Simmerling C, Miranda-Quintana RA. k-Means NANI: an improved clustering algorithm for Molecular Dynamics simulations. bioRxiv 2024:2024.03.07.583975. [PMID: 38496504 PMCID: PMC10942464 DOI: 10.1101/2024.03.07.583975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
One of the key challenges of k-means clustering is the seed selection or the initial centroid estimation since the clustering result depends heavily on this choice. Alternatives such as k-means++ have mitigated this limitation by estimating the centroids using an empirical probability distribution. However, with high-dimensional and complex datasets such as those obtained from molecular simulation, k-means++ fails to partition the data in an optimal manner. Furthermore, stochastic elements in all flavors of k-means++ will lead to a lack of reproducibility. K-means N-Ary Natural Initiation (NANI) is presented as an alternative to tackle this challenge by using efficient n-ary comparisons to both identify high-density regions in the data and select a diverse set of initial conformations. Centroids generated from NANI are not only representative of the data and different from one another, helping k-means to partition the data accurately, but also deterministic, providing consistent cluster populations across replicates. From peptide and protein folding molecular simulations, NANI was able to create compact and well-separated clusters as well as accurately find the metastable states that agree with the literature. NANI can cluster diverse datasets and be used as a standalone tool or as part of our MDANCE clustering package.
Affiliation(s)
- Lexin Chen
  - Department of Chemistry, University of Florida, FL, USA
  - Quantum Theory Project, University of Florida, FL, USA
- Daniel R Roe
  - Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, USA
- Matthew Kochert
  - Laufer Center for Physical & Quantitative Biology, Stony Brook University, Stony Brook, 11794, USA
  - Department of Chemistry, Stony Brook University, Stony Brook 11794, USA
- Carlos Simmerling
  - Laufer Center for Physical & Quantitative Biology, Stony Brook University, Stony Brook, 11794, USA
  - Department of Chemistry, Stony Brook University, Stony Brook 11794, USA
  - Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook 11794, USA
3
Ricardi N, González-Espinoza CE, Adam S, Church JR, Schapiro I, Wesołowski TA. Embedding Nonrigid Solutes in an Averaged Environment: A Case Study on Rhodopsins. J Chem Theory Comput 2023; 19:5289-5302. [PMID: 37441785 PMCID: PMC10413860 DOI: 10.1021/acs.jctc.3c00285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/15/2023]
Abstract
Many simulation methods concerning solvated molecules are based on the assumption that the solvated species and the solvent can be characterized by some representative structures of the solute and some embedding potential corresponding to this structure. While the averaging of the solvent configurations to obtain an embedding potential has been studied in great detail, this hinges on a single solute structure representation. This assumption is re-examined and generalized for conformationally flexible solutes and tested on 4 nonrigid systems. In this generalized approach, the solute is characterized by a set of representative structures and the corresponding embedding potentials. The representative structures are identified by means of subdividing the statistical ensemble, which in this work is generated by a constant-temperature molecular dynamics simulation. The embedding potential defined in the Frozen-Density Embedding Theory is used to characterize the average effect of the solvent in each subensemble. The numerical examples concern the vertical excitation energies of protonated retinal Schiff bases in protein environments. It is comprehensively shown that subensemble averaging leads to huge computational savings compared with explicit averaging of the excitation energies in the whole ensemble while introducing only minor errors in the case of the systems examined.
Affiliation(s)
- Niccolò Ricardi
  - Department of Physical Chemistry, University of Geneva, 1205 Geneva, Switzerland
- Suliman Adam
  - Fritz Haber Center for Molecular Dynamics, Hebrew University of Jerusalem Israel, 91904 Jerusalem, Israel
- Jonathan R Church
  - Fritz Haber Center for Molecular Dynamics, Hebrew University of Jerusalem Israel, 91904 Jerusalem, Israel
- Igor Schapiro
  - Fritz Haber Center for Molecular Dynamics, Hebrew University of Jerusalem Israel, 91904 Jerusalem, Israel
4
Perrella F, Coppola F, Rega N, Petrone A. An Expedited Route to Optical and Electronic Properties at Finite Temperature via Unsupervised Learning. Molecules 2023; 28:3411. [PMID: 37110644 PMCID: PMC10144358 DOI: 10.3390/molecules28083411] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/26/2023] [Revised: 04/06/2023] [Accepted: 04/07/2023] [Indexed: 04/29/2023] Open
Abstract
Electronic properties and absorption spectra are the grounds to investigate molecular electronic states and their interactions with the environment. Modeling and computations are required for the molecular understanding and design strategies of photo-active materials and sensors. However, the interpretation of such properties demands expensive computations and dealing with the interplay of electronic excited states with the conformational freedom of the chromophores in complex matrices (i.e., solvents, biomolecules, crystals) at finite temperature. Computational protocols combining time-dependent density functional theory and ab initio molecular dynamics (MD) have become very powerful in this field, although they still require a large number of computations for a detailed reproduction of electronic properties, such as band shapes. Besides the ongoing research in more traditional computational chemistry fields, data analysis and machine learning methods have been increasingly employed as complementary approaches for efficient data exploration, prediction and model development, starting from the data resulting from MD simulations and electronic structure calculations. In this work, the dataset reduction capabilities of unsupervised clustering techniques applied to MD trajectories are proposed and tested for the ab initio modeling of electronic absorption spectra of two challenging case studies: a non-covalent charge-transfer dimer and a ruthenium complex in solution at room temperature. The K-medoids clustering technique is applied and is proven to reduce the total cost of excited-state calculations on an MD sampling by ∼100 times with no loss in accuracy; it also provides an easier understanding of the representative structures (medoids) to be analyzed on the molecular scale.
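The idea of replacing a full trajectory by a few medoids can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the protocol of the cited work: `k_medoids` is a simple alternating (Voronoi-iteration) variant with a naive deterministic start, the "frames" are made-up one-dimensional coordinates, and `expensive_property` is a hypothetical stand-in for a costly excited-state calculation.

```python
def k_medoids(frames, k, dist, n_iter=100):
    """Alternating k-medoids on frame indices.
    Assumes all frames are distinct, so every medoid stays in its own cluster.
    Returns a dict mapping each medoid index to its member indices."""
    medoids = list(range(k))  # naive start: the first k frames (an assumption)
    clusters = {}
    for _ in range(n_iter):
        # Assignment step: each frame joins its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i in range(len(frames)):
            nearest = min(medoids, key=lambda m: dist(frames[i], frames[m]))
            clusters[nearest].append(i)
        # Update step: the new medoid minimizes the summed distance to members.
        new_medoids = [
            min(members, key=lambda c: sum(dist(frames[c], frames[j]) for j in members))
            for members in clusters.values()
        ]
        if sorted(new_medoids) == sorted(medoids):  # converged
            break
        medoids = new_medoids
    return clusters


def expensive_property(x):
    """Hypothetical placeholder for a costly per-frame calculation."""
    return x * x


# Made-up 1D "frames" forming two groups (integers keep the arithmetic exact).
frames = [10, 13, 8, 50, 54, 47, 11, 51, 9, 52]
clusters = k_medoids(frames, 2, dist=lambda a, b: abs(a - b))

# Full average: one evaluation per frame.
full_average = sum(expensive_property(x) for x in frames) / len(frames)

# Reduced average: one evaluation per medoid, weighted by cluster population.
reduced_average = sum(
    len(members) * expensive_property(frames[m]) for m, members in clusters.items()
) / len(frames)

print(len(clusters), "evaluations instead of", len(frames))
print(full_average, reduced_average)
```

In this toy case the population-weighted medoid average uses two evaluations instead of ten and lands close to the full average; how well this carries over to real spectra depends on the system and the chosen distance metric.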
Affiliation(s)
- Fulvio Perrella
  - Scuola Superiore Meridionale, Largo San Marcellino 10, I-80138 Napoli, Italy
- Federico Coppola
  - Scuola Superiore Meridionale, Largo San Marcellino 10, I-80138 Napoli, Italy
- Nadia Rega
  - Scuola Superiore Meridionale, Largo San Marcellino 10, I-80138 Napoli, Italy
  - Department of Chemical Sciences, University of Napoli Federico II, Complesso Universitario di M.S. Angelo, via Cintia 21, I-80126 Napoli, Italy
  - Istituto Nazionale di Fisica Nucleare, Sezione di Napoli, Complesso Universitario di M.S. Angelo ed. 6, via Cintia 21, I-80126 Napoli, Italy
- Alessio Petrone
  - Scuola Superiore Meridionale, Largo San Marcellino 10, I-80138 Napoli, Italy
  - Department of Chemical Sciences, University of Napoli Federico II, Complesso Universitario di M.S. Angelo, via Cintia 21, I-80126 Napoli, Italy
  - Istituto Nazionale di Fisica Nucleare, Sezione di Napoli, Complesso Universitario di M.S. Angelo ed. 6, via Cintia 21, I-80126 Napoli, Italy
5
Abudayah A, Daoud S, Al-Sha'er M, Taha M. Pharmacophore Modeling of Targets Infested with Activity Cliffs via Molecular Dynamics Simulation Coupled with QSAR and Comparison with other Pharmacophore Generation Methods: KDR as Case Study. Mol Inform 2022; 41:e2200049. [PMID: 35973966 DOI: 10.1002/minf.202200049] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/27/2022] [Accepted: 08/15/2022] [Indexed: 11/07/2022]
Abstract
Activity cliffs (ACs) are defined as pairs of structurally similar compounds with large differences in their potencies against a certain biotarget. We recently proposed that potent AC members induce significant entropically driven conformational modifications of the target that unveil additional binding interactions, while their weakly potent counterparts are enthalpically driven binders with little influence on the protein target. We herein propose to extract pharmacophores for AC-infested target(s) from molecular dynamics (MD) frames of purely "enthalpic" potent binder(s) complexed within the particular target. Genetic function algorithm/machine learning (GFA/ML) can then be employed to search for the best possible combination of MD pharmacophore(s) capable of explaining bioactivity variations within a list of inhibitors. We compared the performance of this approach with established ligand-based and structure-based methods. Kinase insert domain receptor (KDR) was used as a case study. KDR plays a crucial role in angiogenic signaling and its inhibitors have been approved for cancer treatment. Interestingly, the GFA/ML-selected, MD-based pharmacophores performed comparably to ligand-based and structure-based pharmacophores. The resulting pharmacophores and QSAR models were used to capture hits from the National Cancer Institute list of compounds. The most active hit showed an anti-KDR IC50 of 2.76 µM.
Affiliation(s)
- Mutasem Taha
  - Faculty of Pharmacy, University of Jordan, Jordan
6
Baltrukevich H, Podlewska S. From Data to Knowledge: Systematic Review of Tools for Automatic Analysis of Molecular Dynamics Output. Front Pharmacol 2022; 13:844293. [PMID: 35359865 PMCID: PMC8960308 DOI: 10.3389/fphar.2022.844293] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/27/2021] [Accepted: 01/26/2022] [Indexed: 12/02/2022] Open
Abstract
The increasing number of available crystal structures on the one hand, and the boost in computational power available for computer-aided drug design tasks on the other, have led to structure-based drug design tools being used intensively in drug development pipelines. Docking and molecular dynamics simulations, key representatives of the structure-based approaches, provide detailed information about the potential interaction of a ligand with a target receptor. However, they require a three-dimensional structure of the protein and a relatively large amount of computational resources. As both docking and molecular dynamics are now used much more extensively, the amount of data output from these procedures is also growing. Consequently, there are more and more approaches that facilitate the analysis and interpretation of the results of structure-based tools. In this review, we comprehensively summarize approaches for handling molecular dynamics simulation output. The review covers both statistical and machine-learning-based tools, as well as various forms of depiction of molecular dynamics output.
Affiliation(s)
- Hanna Baltrukevich
  - Maj Institute of Pharmacology, Polish Academy of Sciences, Kraków, Poland
  - Faculty of Pharmacy, Chair of Technology and Biotechnology of Medical Remedies, Jagiellonian University Medical College in Krakow, Kraków, Poland
- Sabina Podlewska
  - Maj Institute of Pharmacology, Polish Academy of Sciences, Kraków, Poland
7
Exploring energy landscapes at the DFTB quantum level using the threshold algorithm: the case of the anionic metal cluster Au$_{20}^{-}$. Theor Chem Acc 2021. [DOI: 10.1007/s00214-021-02748-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
8
Kozak F, Kurzbach D. How to assess the structural dynamics of transcription factors by integrating sparse NMR and EPR constraints with molecular dynamics simulations. Comput Struct Biotechnol J 2021; 19:2097-2105. [PMID: 33995905 PMCID: PMC8085671 DOI: 10.1016/j.csbj.2021.04.020] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/01/2021] [Revised: 04/07/2021] [Accepted: 04/07/2021] [Indexed: 12/12/2022] Open
Abstract
We review recent advances in modeling structural ensembles of transcription factors from nuclear magnetic resonance (NMR) and electron paramagnetic resonance (EPR) spectroscopic data, integrated with molecular dynamics (MD) simulations. We focus on approaches that confirm computed conformational ensembles by sparse constraints obtained from magnetic resonance. This combination enables the deduction of functional and structural protein models even if nuclear Overhauser effects (NOEs) are too scarce for conventional structure determination. We highlight recent insights into the folding-upon-DNA binding transitions of intrinsically disordered transcription factors that could be assessed using such integrative approaches.
Affiliation(s)
- Fanny Kozak
  - University Vienna, Faculty of Chemistry, Institute of Biological Chemistry, Waehringer Str. 38, 1090 Vienna, Austria
- Dennis Kurzbach
  - University Vienna, Faculty of Chemistry, Institute of Biological Chemistry, Waehringer Str. 38, 1090 Vienna, Austria
9
Bomediano Camillo LDM, Ferreira GC, Duran AFA, da Silva FRS, Garcia W, Scott AL, Sasaki SD. Structural modelling and thermostability of a serine protease inhibitor belonging to the Kunitz-BPTI family from the Rhipicephalus microplus tick. Biochimie 2020; 181:226-233. [PMID: 33359560 DOI: 10.1016/j.biochi.2020.12.014] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/14/2020] [Revised: 11/09/2020] [Accepted: 12/18/2020] [Indexed: 10/22/2022]
Abstract
rBmTI-A is a recombinant serine protease inhibitor that belongs to the Kunitz-BPTI family and was cloned from the Rhipicephalus microplus tick. rBmTI-A has inhibitory activity against bovine trypsin, human plasma kallikrein, human neutrophil elastase and plasmin, with dissociation constants in the nM range. It is characterized by two inhibitory domains, and each domain contains six cysteines that form three disulfide bonds, which contribute to the high stability of its structure. Previous studies suggest that rBmTI-A has protective potential against pulmonary emphysema in mice as well as anti-inflammatory potential. In addition, rBmTI-A showed potent inhibitory activity against in vitro vessel formation. In this study, the tertiary structure of rBmTI-A was modeled. The structure stabilization was evaluated by molecular dynamics analysis. Circular dichroism spectroscopy data corroborated the secondary structure found by the homology modelling. Circular dichroism data also showed that rBmTI-A is thermostable up to approximately 70 °C, which was corroborated by inhibitory assays toward trypsin.
Affiliation(s)
- Graziele Cristina Ferreira
  - Centro de Ciências Naturais e Humanas, Universidade Federal do ABC, São Bernardo do Campo, São Paulo, Brazil
- Wanius Garcia
  - Centro de Ciências Naturais e Humanas, Universidade Federal do ABC, Santo André, São Paulo, Brazil
- Ana Lígia Scott
  - Centro de Matemática, Computação e Cognição, Universidade Federal do ABC, Santo André, São Paulo, Brazil
- Sergio Daishi Sasaki
  - Centro de Ciências Naturais e Humanas, Universidade Federal do ABC, São Bernardo do Campo, São Paulo, Brazil
10
Bernetti M, Bertazzo M, Masetti M. Data-Driven Molecular Dynamics: A Multifaceted Challenge. Pharmaceuticals (Basel) 2020; 13:E253. [PMID: 32961909 PMCID: PMC7557855 DOI: 10.3390/ph13090253] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 08/25/2020] [Revised: 09/14/2020] [Accepted: 09/16/2020] [Indexed: 12/18/2022] Open
Abstract
The big data concept is currently revolutionizing several fields of science including drug discovery and development. While opening up new perspectives for better drug design and related strategies, big data analysis strongly challenges our current ability to manage and exploit an extraordinarily large and possibly diverse amount of information. The recent renewal of machine learning (ML)-based algorithms is key in providing the proper framework for addressing this issue. In this respect, the impact on the exploitation of molecular dynamics (MD) simulations, which have recently reached mainstream status in computational drug discovery, can be remarkable. Here, we review the recent progress in the use of ML methods coupled to biomolecular simulations with potentially relevant implications for drug design. Specifically, we show how different ML-based strategies can be applied to the outcome of MD simulations for gaining knowledge and enhancing sampling. Finally, we discuss how intrinsic limitations of MD in accurately modeling biomolecular systems can be alleviated by including information coming from experimental data.
Affiliation(s)
- Mattia Bernetti
  - Scuola Internazionale Superiore di Studi Avanzati (SISSA), via Bonomea 265, I-34136 Trieste, Italy
- Martina Bertazzo
  - Computational Sciences, Istituto Italiano di Tecnologia, via Morego 30, I-16163 Genova, Italy
- Matteo Masetti
  - Department of Pharmacy and Biotechnology, Alma Mater Studiorum - Università di Bologna, via Belmeloro 6, I-40126 Bologna, Italy
11
Schöberl M, Zabaras N, Koutsourelakis PS. Predictive collective variable discovery with deep Bayesian models. J Chem Phys 2019; 150:024109. [DOI: 10.1063/1.5058063] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 11/14/2022] Open
Affiliation(s)
- Markus Schöberl
  - Center for Informatics and Computational Science, University of Notre Dame, 311 Cushing Hall, Notre Dame, Indiana 46556, USA
  - Continuum Mechanics Group, Technical University of Munich, Boltzmannstraße 15, 85748 Garching, Germany
- Nicholas Zabaras
  - Center for Informatics and Computational Science, University of Notre Dame, 311 Cushing Hall, Notre Dame, Indiana 46556, USA
12
Chen W, Ferguson AL. Molecular enhanced sampling with autoencoders: On-the-fly collective variable discovery and accelerated free energy landscape exploration. J Comput Chem 2018; 39:2079-2102. [PMID: 30368832 DOI: 10.1002/jcc.25520] [Citation(s) in RCA: 121] [Impact Index Per Article: 17.3] [Received: 11/20/2017] [Accepted: 06/14/2018] [Indexed: 01/08/2023]
Abstract
Macromolecular and biomolecular folding landscapes typically contain high free energy barriers that impede efficient sampling of configurational space by standard molecular dynamics simulation. Biased sampling can artificially drive the simulation along prespecified collective variables (CVs), but success depends critically on the availability of good CVs associated with the important collective dynamical motions. Nonlinear machine learning techniques can identify such CVs but typically do not furnish an explicit relationship with the atomic coordinates necessary to perform biased sampling. In this work, we employ auto-associative artificial neural networks ("autoencoders") to learn nonlinear CVs that are explicit and differentiable functions of the atomic coordinates. Our approach offers substantial speedups in exploration of configurational space, and is distinguished from existing approaches by its capacity to simultaneously discover and directly accelerate along data-driven CVs. We demonstrate the approach in simulations of alanine dipeptide and Trp-cage, and have developed an open-source and freely available implementation within OpenMM. © 2018 Wiley Periodicals, Inc.
Affiliation(s)
- Wei Chen
  - Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois, 61801
- Andrew L Ferguson
  - Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois, 61801
  - Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, 1304 W Green Street, Urbana, Illinois, 61801
  - Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, 600 South Mathews Avenue, Urbana, Illinois, 61801
13
Chen W, Tan AR, Ferguson AL. Collective variable discovery and enhanced sampling using autoencoders: Innovations in network architecture and error function design. J Chem Phys 2018; 149:072312. [PMID: 30134681 DOI: 10.1063/1.5023804] [Citation(s) in RCA: 86] [Impact Index Per Article: 12.3] [Indexed: 01/01/2023] Open
Abstract
Auto-associative neural networks ("autoencoders") present a powerful nonlinear dimensionality reduction technique to mine data-driven collective variables from molecular simulation trajectories. This technique furnishes explicit and differentiable expressions for the nonlinear collective variables, making it ideally suited for integration with enhanced sampling techniques for accelerated exploration of configurational space. In this work, we describe a number of sophistications of the neural network architectures to improve and generalize the process of interleaved collective variable discovery and enhanced sampling. We employ circular network nodes to accommodate periodicities in the collective variables, hierarchical network architectures to rank-order the collective variables, and generalized encoder-decoder architectures to support bespoke error functions for network training to incorporate prior knowledge. We demonstrate our approach in blind collective variable discovery and enhanced sampling of the configurational free energy landscapes of alanine dipeptide and Trp-cage using an open-source plugin developed for the OpenMM molecular simulation package.
Affiliation(s)
- Wei Chen
  - Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois 61801, USA
- Aik Rui Tan
  - Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, 1304 West Green Street, Urbana, Illinois 61801, USA
- Andrew L Ferguson
  - Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois 61801, USA
14
Peng JH, Wang W, Yu YQ, Gu HL, Huang X. Clustering algorithms to analyze molecular dynamics simulation trajectories for complex chemical and biological systems. Chin J Chem Phys 2018. [DOI: 10.1063/1674-0068/31/cjcp1806147] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 12/20/2022]
Affiliation(s)
- Jun-hui Peng
  - HKUST-Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China
  - Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Wei Wang
  - HKUST-Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China
  - Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Ye-qing Yu
  - HKUST-Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China
  - Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Han-lin Gu
  - Department of Mathematics, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Xuhui Huang
  - HKUST-Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China
  - Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
  - Center of Systems Biology and Human Health, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
  - State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
15
Neelamraju S, Oligschleger C, Schön JC. The threshold algorithm: Description of the methodology and new developments. J Chem Phys 2017; 147:152713. [DOI: 10.1063/1.4985912] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 11/15/2022] Open
Affiliation(s)
- Sridhar Neelamraju
  - National Centre for Biological Sciences, Tata Institute of Fundamental Research, GKVK Campus, Bengaluru 560065, India
- Christina Oligschleger
  - University of Applied Sciences Bonn-Rhein-Sieg, Von-Liebig-Str. 20, D-53359 Rheinbach, Germany
- J. Christian Schön
  - Max Planck Institute for Solid State Research, Heisenbergstr. 1, D-70569 Stuttgart, Germany
16
Combining molecular dynamics simulation and ligand-receptor contacts analysis as a new approach for pharmacophore modeling: beta-secretase 1 and check point kinase 1 as case studies. J Comput Aided Mol Des 2016; 30:1149-1163. [PMID: 27722817 DOI: 10.1007/s10822-016-9984-2] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 05/17/2016] [Accepted: 10/03/2016] [Indexed: 01/19/2023]
17
Abramyan TM, Snyder JA, Thyparambil AA, Stuart SJ, Latour RA. Cluster analysis of molecular simulation trajectories for systems where both conformation and orientation of the sampled states are important. J Comput Chem 2016; 37:1973-82. [PMID: 27292100 DOI: 10.1002/jcc.24416] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 03/09/2016] [Accepted: 05/17/2016] [Indexed: 01/28/2023]
Abstract
Clustering methods have been widely used to group together similar conformational states from molecular simulations of biomolecules in solution. For applications such as the interaction of a protein with a surface, the orientation of the protein relative to the surface is also an important clustering parameter because of its potential effect on adsorbed-state bioactivity. This study presents cluster analysis methods that are specifically designed for systems where both molecular orientation and conformation are important, and the methods are demonstrated using test cases of adsorbed proteins for validation. Additionally, because cluster analysis can be a very subjective process, an objective procedure for identifying both the optimal number of clusters and the best clustering algorithm to be applied to analyze a given dataset is presented. The method is demonstrated for several agglomerative hierarchical clustering algorithms used in conjunction with three cluster validation techniques. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Tigran M Abramyan
  - Department of Bioengineering, 501 Rhodes Engineering Research Center, Clemson University, Clemson, South Carolina, 29634
- James A Snyder
  - Department of Bioengineering, 501 Rhodes Engineering Research Center, Clemson University, Clemson, South Carolina, 29634
- Aby A Thyparambil
  - Department of Bioengineering, 501 Rhodes Engineering Research Center, Clemson University, Clemson, South Carolina, 29634
- Steven J Stuart
  - Department of Chemistry, 369 Hunter Laboratory, Clemson University, Clemson, South Carolina, 29634
- Robert A Latour
  - Department of Bioengineering, 501 Rhodes Engineering Research Center, Clemson University, Clemson, South Carolina, 29634
18
|
Roth CA, Dreyfus T, Robert CH, Cazals F. Hybridizing rapidly exploring random trees and basin hopping yields an improved exploration of energy landscapes. J Comput Chem 2015; 37:739-52. [DOI: 10.1002/jcc.24256] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 08/14/2015] [Revised: 10/05/2015] [Accepted: 10/15/2015] [Indexed: 12/25/2022]
Affiliation(s)
- Christine-Andrea Roth
- Laboratoire De Biochimie Théorique, CNRS, UPR 9080, Univ Paris Diderot, Sorbonne Paris Cité; 13 Rue Pierre Et Marie Curie Paris 75005 France
- Tom Dreyfus
- Laboratoire De Biochimie Théorique, CNRS, UPR 9080, Univ Paris Diderot, Sorbonne Paris Cité; 13 Rue Pierre Et Marie Curie Paris 75005 France
- Charles H. Robert
- Laboratoire De Biochimie Théorique, CNRS, UPR 9080, Univ Paris Diderot, Sorbonne Paris Cité; 13 Rue Pierre Et Marie Curie Paris 75005 France
- Frédéric Cazals
- Laboratoire De Biochimie Théorique, CNRS, UPR 9080, Univ Paris Diderot, Sorbonne Paris Cité; 13 Rue Pierre Et Marie Curie Paris 75005 France
19
De Paris R, Quevedo CV, Ruiz DDA, Norberto de Souza O. An Effective Approach for Clustering InhA Molecular Dynamics Trajectory Using Substrate-Binding Cavity Features. PLoS One 2015. [PMID: 26218832 PMCID: PMC4517875 DOI: 10.1371/journal.pone.0133172] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 01/15/2023] Open
Abstract
Protein receptor conformations, obtained from molecular dynamics (MD) simulations, have become a promising treatment of its explicit flexibility in molecular docking experiments applied to drug discovery and development. However, incorporating the entire ensemble of MD conformations in docking experiments to screen large candidate compound libraries is currently an unfeasible task. Clustering algorithms have been widely used as a means to reduce such ensembles to a manageable size. Most studies investigate different algorithms using pairwise Root-Mean Square Deviation (RMSD) values for all, or part of the MD conformations. Nevertheless, the RMSD only may not be the most appropriate gauge to cluster conformations when the target receptor has a plastic active site, since they are influenced by changes that occur on other parts of the structure. Hence, we have applied two partitioning methods (k-means and k-medoids) and four agglomerative hierarchical methods (Complete linkage, Ward's, Unweighted Pair Group Method and Weighted Pair Group Method) to analyze and compare the quality of partitions between a data set composed of properties from an enzyme receptor substrate-binding cavity and two data sets created using different RMSD approaches. Ensembles of representative MD conformations were generated by selecting a medoid of each group from all partitions analyzed. We investigated the performance of our new method for evaluating binding conformation of drug candidates to the InhA enzyme, which were performed by cross-docking experiments between a 20 ns MD trajectory and 20 different ligands. Statistical analyses showed that the novel ensemble, which is represented by only 0.48% of the MD conformations, was able to reproduce 75% of all dynamic behaviors within the binding cavity for the docking experiments performed. 
Moreover, this new approach not only outperforms the other two RMSD-clustering solutions, but it also shows to be a promising strategy to distill biologically relevant information from MD trajectories, especially for docking purposes.
Affiliation(s)
- Renata De Paris
- Grupo de Pesquisa em Inteligência de Negócio—GPIN, Faculdade de Informática, PUCRS, Av. Ipiranga, 6681-Prédio 32, sala 628, Porto Alegre, RS, Brasil
- Christian V. Quevedo
- Grupo de Pesquisa em Inteligência de Negócio—GPIN, Faculdade de Informática, PUCRS, Av. Ipiranga, 6681-Prédio 32, sala 628, Porto Alegre, RS, Brasil
- Duncan D. A. Ruiz
- Grupo de Pesquisa em Inteligência de Negócio—GPIN, Faculdade de Informática, PUCRS, Av. Ipiranga, 6681-Prédio 32, sala 628, Porto Alegre, RS, Brasil
- Osmar Norberto de Souza
- Laboratório de Bioinformática, Modelagem e Simulação de Biossistemas—LABIO, Faculdade de Informática, PUCRS, Av. Ipiranga, 6681- Building 32, Room 602, Porto Alegre, RS, Brasil
20
Cazals F, Dreyfus T, Mazauric D, Roth CA, Robert CH. Conformational ensembles and sampled energy landscapes: Analysis and comparison. J Comput Chem 2015; 36:1213-31. [DOI: 10.1002/jcc.23913] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 10/28/2014] [Revised: 02/25/2015] [Accepted: 03/02/2015] [Indexed: 12/11/2022]
Affiliation(s)
- Frédéric Cazals
- Inria 2004 route des Lucioles, BP 93; F-06902 Sophia-Antipolis; FRANCE
- Tom Dreyfus
- Inria 2004 route des Lucioles, BP 93; F-06902 Sophia-Antipolis; FRANCE
- Dorian Mazauric
- Inria 2004 route des Lucioles, BP 93; F-06902 Sophia-Antipolis; FRANCE
- Charles H. Robert
- CNRS Laboratory of Theoretical Biochemistry (LBT) Institut de Biologie Physico-Chimique 13; rue Pierre et Marie Curie 75005 Paris
21
Role of indirect readout mechanism in TATA box binding protein-DNA interaction. J Comput Aided Mol Des 2015; 29:283-95. [PMID: 25575717 DOI: 10.1007/s10822-014-9828-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/26/2014] [Accepted: 12/18/2014] [Indexed: 12/11/2022]
Abstract
Gene expression generally initiates with recognition by the TATA-box binding protein (TBP) of the minor groove of DNA at the TATA box sequence, where the DNA structure differs significantly from B-DNA. We have carried out molecular dynamics simulations of the TBP-DNA system to understand how the DNA structure alters for efficient binding. We observed that the protein is rigid, while the DNA of the TATA box sequence has an inherent flexibility in terms of bending and minor groove widening. The bending analysis of the free DNA and the TBP-bound DNA systems indicates the presence of some similar structures. Principal coordinate ordination analysis also indicates that some structural features of the protein-bound and free DNA are similar. Thus we suggest that the DNA of the TATA box sequence regularly oscillates between several alternate structures, and the one suitable for TBP binding is induced further by the protein for proper complex formation.
22
Smeeton LC, Oakley MT, Johnston RL. Visualizing energy landscapes with metric disconnectivity graphs. J Comput Chem 2014; 35:1481-90. [PMID: 24866379 PMCID: PMC4285870 DOI: 10.1002/jcc.23643] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 12/20/2013] [Revised: 03/12/2014] [Accepted: 04/14/2014] [Indexed: 11/24/2022]
Abstract
The visualization of multidimensional energy landscapes is important, providing insight into the kinetics and thermodynamics of a system, as well as the range of structures a system can adopt. It is, however, highly nontrivial, with the number of dimensions required for a faithful reproduction of the landscape far higher than can be represented in two or three dimensions. Metric disconnectivity graphs provide a possible solution, incorporating the landscape connectivity information present in disconnectivity graphs with structural information in the form of a metric. In this study, we present a new software package, PyConnect, which is capable of producing both disconnectivity graphs and metric disconnectivity graphs in two or three dimensions. We present as a test case the analysis of the 69-bead BLN coarse-grained model protein and show that, by choosing appropriate order parameters, metric disconnectivity graphs can resolve correlations between structural features on the energy landscape with the landscape's energetic and kinetic properties.
Affiliation(s)
- Lewis C Smeeton
- School of Chemistry, University of Birmingham, Edgbaston, Birmingham, B15 2TT, United Kingdom
23
Pitera JW. Expected distributions of root-mean-square positional deviations in proteins. J Phys Chem B 2014; 118:6526-30. [PMID: 24655018 DOI: 10.1021/jp412776d] [Citation(s) in RCA: 74] [Impact Index Per Article: 6.7] [Indexed: 11/29/2022]
Abstract
The atom positional root-mean-square deviation (RMSD) is a standard tool for comparing the similarity of two molecular structures. It is used to characterize the quality of biomolecular simulations, to cluster conformations, and as a reaction coordinate for conformational changes. This work presents an approximate analytic form for the expected distribution of RMSD values for a protein or polymer fluctuating about a stable native structure. The mean and maximum of the expected distribution are independent of chain length for long chains and linearly proportional to the average atom positional root-mean-square fluctuations (RMSF). To approximate the RMSD distribution for random-coil or unfolded ensembles, numerical distributions of RMSD were generated for ensembles of self-avoiding and non-self-avoiding random walks. In both cases, for all reference structures tested for chains more than three monomers long, the distributions have a maximum distant from the origin with a power-law dependence on chain length. The purely entropic nature of this result implies that care must be taken when interpreting stable high-RMSD regions of the free-energy landscape as "intermediates" or well-defined stable states.
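As a reminder of the quantity this abstract discusses, the following is a minimal sketch of an atom-positional RMSD between two structures. It assumes the structures are already superimposed and have the same atom ordering; the function name `rmsd` and the three-atom coordinates are hypothetical and are not taken from the cited work.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of (x, y, z) points.

    Assumption: the two structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty structures with the same atom count")
    total = 0.0
    for p, q in zip(coords_a, coords_b):
        # Squared Euclidean distance between the matching atoms.
        total += sum((pi - qi) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(total / len(coords_a))


# Hypothetical three-atom example: every atom is displaced by 1 unit along x.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
displaced = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
print(rmsd(reference, displaced))  # 1.0
```

In practice an optimal superposition step usually precedes this calculation; it is omitted here to keep the sketch short.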
Affiliation(s)
- Jed W Pitera
- IBM Research - Almaden, 650 Harry Road, San Jose, California 95120, United States
24
Sullivan DC, Lim C. Configurational Entropy of Proteins: Covariance Matrix versus Cumulative Distribution Calculations. J Chin Chem Soc-Taip 2013. [DOI: 10.1002/jccs.200400177] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]
25
Rohrdanz MA, Zheng W, Clementi C. Discovering Mountain Passes via Torchlight: Methods for the Definition of Reaction Coordinates and Pathways in Complex Macromolecular Reactions. Annu Rev Phys Chem 2013; 64:295-316. [DOI: 10.1146/annurev-physchem-040412-110006] [Citation(s) in RCA: 150] [Impact Index Per Article: 12.5] [Indexed: 11/09/2022]
Affiliation(s)
- Wenwei Zheng
- Department of Chemistry, Rice University, Houston, Texas 77005;
26
Carra C, Cucinotta FA. Accurate prediction of the binding free energy and analysis of the mechanism of the interaction of replication protein A (RPA) with ssDNA. J Mol Model 2011; 18:2761-83. [PMID: 22116609 DOI: 10.1007/s00894-011-1288-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 03/14/2011] [Accepted: 10/19/2011] [Indexed: 10/15/2022]
Abstract
The eukaryotic replication protein A (RPA) has several pivotal functions in the cell metabolism, such as chromosomal replication, prevention of hairpin formation, DNA repair and recombination, and signaling after DNA damage. Moreover, RPA seems to have a crucial role in organizing the sequential assembly of DNA processing proteins along single stranded DNA (ssDNA). The strong RPA affinity for ssDNA, K_A between 10⁻⁹ and 10⁻¹⁰ M, is characterized by a low cooperativity with minor variation for changes on the nucleotide sequence. Recently, new data on RPA interactions was reported, including the binding free energy of the complex RPA70AB with dC₈ and dC₅, which has been estimated to be -10 ± 0.4 kcal mol⁻¹ and -7 ± 1 kcal mol⁻¹, respectively. In view of these results we performed a study based on molecular dynamics aimed to reproduce the absolute binding free energy of RPA70AB with the dC₅ and dC₈ oligonucleotides. We used several tools to analyze the binding free energy, rigidity, and time evolution of the complex. The results obtained by MM-PBSA method, with the use of ligand free geometry as a reference for the receptor in the separate trajectory approach, are in excellent agreement with the experimental data, with ±4 kcal mol⁻¹ error. This result shows that the MM-PB(GB)SA methods can provide accurate quantitative estimates of the binding free energy for interacting complexes when appropriate geometries are used for the receptor, ligand and complex. The decomposition of the MM-GBSA energy for each residue in the receptor allowed us to correlate the change of the affinity of the mutated protein with the ΔG(gas+sol) contribution of the residue considered in the mutation. The agreement with experiment is optimal and a strong change in the binding free energy can be considered as the dominant factor in the loss for the binding affinity resulting from mutation.
Affiliation(s)
- Claudio Carra
- Universities Space Research Association, Houston, TX 77058, USA.
27
Jansen M, Doll K, Schön JC. Addressing chemical diversity by employing the energy landscape concept. Acta Crystallogr A 2010; 66:518-34. [DOI: 10.1107/s0108767310026371] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Received: 07/27/2009] [Accepted: 07/04/2010] [Indexed: 11/11/2022] Open
Affiliation(s)
- Martin Jansen
- Max Planck Institute for Solid State Research, Heisenbergstrasse 1, D-70569 Stuttgart, Germany
28
Carra C, Cucinotta FA. Binding selectivity of RecA to a single stranded DNA, a computational approach. J Mol Model 2010; 17:133-50. [PMID: 20386943 DOI: 10.1007/s00894-010-0694-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 12/02/2009] [Accepted: 02/23/2010] [Indexed: 11/29/2022]
Abstract
Homologous recombination (HR) is the major DNA double strand break repair pathway which maintains the genomic integrity. It is fundamental for the survivability and functionality of all organisms. One of the initial steps in HR is the formation of the nucleoprotein filament composed of a single stranded DNA chain surrounded by the recombinase protein. The filament orchestrates the search for an undamaged homologue, as a template for the repair process. Our theoretical study was aimed at elucidating the selectivity of the interaction between a monomer of the recombinase enzyme in Escherichia coli, EcRecA, the bacterial homologue of human Rad51, with a series of oligonucleotides of nine bases length. The complex, equilibrated for 20 ns with Langevin dynamics, was inserted in a periodic box with an 8 Å buffer of water molecules explicitly described by the TIP3P model. The absolute binding free energies are calculated in an implicit solvent using the Poisson-Boltzmann (PB) and the generalized Born (GB) solvent accessible surface area, using the MM-PB(GB)SA model. The solute entropic contribution is also calculated by normal mode analysis. The results underline how a significant contribution of the binding free energy is due to the interaction with the Arg196, a critical amino acid for the activity of the enzyme. The study revealed how the binding affinity of EcRecA is significantly higher toward dT₉ rather than dA₉, as expected from the experimental results.
Affiliation(s)
- Claudio Carra
- Universities Space Research Association, 2101 NASA Parkway, Houston, TX 77058, USA.
29
Carra C, Cucinotta FA. Binding Sites of the E. Coli DNA Recombinase Protein to the ssDNA: A Computational Study. J Biomol Struct Dyn 2010; 27:407-28. [DOI: 10.1080/07391102.2010.10507327] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 10/28/2022]
30
Schön JC, Jansen M. Determination, prediction, and understanding of structures, using the energy landscapes of chemical systems – Part II. Z Kristallogr 2009. [DOI: 10.1524/zkri.216.7.361.20362] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.9] [Indexed: 11/24/2022]
Abstract
In the past decade, new theoretical approaches have been developed to determine, predict and understand the structure of chemical compounds. The central element of these methods has been the investigation of the energy landscape of chemical systems. Applications range from extended crystalline and amorphous compounds over clusters and molecular crystals to proteins. In this review, we are going to give an introduction to energy landscapes and methods for their investigation, together with a number of examples. These include structure prediction of extended and molecular crystals, structure prediction and folding of proteins, structure analysis of zeolites, and structure determination of crystals from powder diffraction data.
31
Krishnan R, Walton EB, Van Vliet KJ. Characterizing rare-event property distributions via replicate molecular dynamics simulations of proteins. J Mol Model 2009; 15:1383-9. [PMID: 19418077 DOI: 10.1007/s00894-009-0504-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 02/16/2009] [Accepted: 03/09/2009] [Indexed: 11/26/2022]
Abstract
As computational resources increase, molecular dynamics simulations of biomolecules are becoming an increasingly informative complement to experimental studies. In particular, it has now become feasible to use multiple initial molecular configurations to generate an ensemble of replicate production-run simulations that allows for more complete characterization of rare events such as ligand-receptor unbinding. However, there are currently no explicit guidelines for selecting an ensemble of initial configurations for replicate simulations. Here, we use clustering analysis and steered molecular dynamics simulations to demonstrate that the configurational changes accessible in molecular dynamics simulations of biomolecules do not necessarily correlate with observed rare-event properties. This informs selection of a representative set of initial configurations. We also employ statistical analysis to identify the minimum number of replicate simulations required to sufficiently sample a given biomolecular property distribution. Together, these results suggest a general procedure for generating an ensemble of replicate simulations that will maximize accurate characterization of rare-event property distributions in biomolecules.
Affiliation(s)
- Ranjani Krishnan
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
32
Krishnan R, Oommen B, Walton EB, Maloney JM, Van Vliet KJ. Modeling and simulation of chemomechanics at the cell-matrix interface. Cell Adh Migr 2008; 2:83-94. [PMID: 19262102 DOI: 10.4161/cam.2.2.6154] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/19/2022] Open
Abstract
Chemomechanical characteristics of the extracellular materials with which cells interact can have a profound impact on cell adhesion and migration. To understand and modulate such complex multiscale processes, a detailed understanding of the feedback between a cell and the adjacent microenvironment is crucial. Here, we use computational modeling and simulation to examine the cell-matrix interaction at both the molecular and continuum lengthscales. Using steered molecular dynamics, we consider how extracellular matrix (ECM) stiffness and extracellular pH influence the interaction between cell surface adhesion receptors and extracellular matrix ligands, and we predict potential consequences for focal adhesion formation and dissolution. Using continuum level finite element simulations and analytical methods to model cell-induced ECM deformation as a function of ECM stiffness and thickness, we consider the implications toward design of synthetic substrata for cell biology experiments that intend to decouple chemical and mechanical cues.
Affiliation(s)
- Ranjani Krishnan
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139-4307, USA
33
Shao J, Tanner SW, Thompson N, Cheatham TE. Clustering Molecular Dynamics Trajectories: 1. Characterizing the Performance of Different Clustering Algorithms. J Chem Theory Comput 2007; 3:2312-34. [DOI: 10.1021/ct700119m] [Citation(s) in RCA: 614] [Impact Index Per Article: 34.1] [Indexed: 11/29/2022]
Affiliation(s)
- Jianyin Shao
- Departments of Medicinal Chemistry, Pharmaceutics and Pharmaceutical Chemistry, and Bioengineering, College of Pharmacy, University of Utah, 2000 East 30 South, Skaggs Hall 201, Salt Lake City, Utah 84112
- Stephen W. Tanner
- Departments of Medicinal Chemistry, Pharmaceutics and Pharmaceutical Chemistry, and Bioengineering, College of Pharmacy, University of Utah, 2000 East 30 South, Skaggs Hall 201, Salt Lake City, Utah 84112
- Nephi Thompson
- Departments of Medicinal Chemistry, Pharmaceutics and Pharmaceutical Chemistry, and Bioengineering, College of Pharmacy, University of Utah, 2000 East 30 South, Skaggs Hall 201, Salt Lake City, Utah 84112
- Thomas E. Cheatham
- Departments of Medicinal Chemistry, Pharmaceutics and Pharmaceutical Chemistry, and Bioengineering, College of Pharmacy, University of Utah, 2000 East 30 South, Skaggs Hall 201, Salt Lake City, Utah 84112
34
Plaku E, Stamati H, Clementi C, Kavraki LE. Fast and reliable analysis of molecular motion using proximity relations and dimensionality reduction. Proteins 2007; 67:897-907. [PMID: 17380507 DOI: 10.1002/prot.21337] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.1] [Indexed: 11/09/2022]
Abstract
The analysis of molecular motion starting from extensive sampling of molecular configurations remains an important and challenging task in computational biology. Existing methods require a significant amount of time to extract the most relevant motion information from such data sets. In this work, we provide a practical tool for molecular motion analysis. The proposed method builds upon the recent ScIMAP (Scalable Isomap) method, which, by using proximity relations and dimensionality reduction, has been shown to reliably extract from simulation data a few parameters that capture the main, linear and/or nonlinear, modes of motion of a molecular system. The results we present in the context of protein folding reveal that the proposed method characterizes the folding process essentially as well as ScIMAP. At the same time, by projecting the simulation data and computing proximity relations in a low-dimensional Euclidean space, it renders such analysis computationally practical. In many instances, the proposed method reduces the computational cost from several CPU months to just a few CPU hours, making it possible to analyze extensive simulation data in a matter of a few hours using only a single processor. These results establish the proposed method as a reliable and practical tool for analyzing motions of considerably large molecular systems and proteins with complex folding mechanisms.
Affiliation(s)
- Erion Plaku
- Department of Computer Science, Rice University, Houston, Texas 77005, USA
35
Li Y. Bayesian model based clustering analysis: application to a molecular dynamics trajectory of the HIV-1 integrase catalytic core. J Chem Inf Model 2006; 46:1742-50. [PMID: 16859306 DOI: 10.1021/ci050463u] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Abstract
This work describes the application of a Bayesian method for clustering protein conformations sampled during a molecular dynamics simulation of the HIV-1 integrase catalytic core. A clustering analysis is carried out under the assumption of normal distribution without fixing the number of clusters in advance. Some performance measures, such as posterior probability and class cross entropy, are used to determine the most probable set of clusters. The Bayesian clustering method results in meaningful groups identifying transitions between conformational ensembles. The dihedral angles involved in such transitions are also examined in detail. The conformations in high dimensional space are projected into 3D space employing a multidimensional scaling technique to provide a visual inspection.
Affiliation(s)
- Yan Li
- Department of Applied Chemistry, Ocean University of China, Qingdao 266003, P. R. China.
36
Sullivan DC, Lim C. Quantifying Polypeptide Conformational Space: Sensitivity to Conformation and Ensemble Definition. J Phys Chem B 2006; 110:16707-17. [PMID: 16913810 DOI: 10.1021/jp0569133] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
Abstract
Quantifying the density of conformations over phase space (the conformational distribution) is needed to model important macromolecular processes such as protein folding. In this work, we quantify the conformational distribution for a simple polypeptide (N-mer polyalanine) using the cumulative distribution function (CDF), which gives the probability that two randomly selected conformations are separated by less than a "conformational" distance and whose inverse gives conformation counts as a function of conformational radius. An important finding is that the conformation counts obtained by the CDF inverse depend critically on the assignment of a conformation's distance span and the ensemble (e.g., unfolded state model): varying ensemble and conformation definition (1 → 2 Å) varies the CDF-based conformation counts for Ala₅₀ from 10¹¹ to 10⁶⁹. In particular, relatively short molecular dynamics (MD) relaxation of Ala₅₀'s random-walk ensemble reduces the number of conformers from 10⁵⁵ to 10¹⁴ (using a 1 Å root-mean-square-deviation radius conformation definition) pointing to potential disconnections in comparing the results from simplified models of unfolded proteins with those from all-atom MD simulations. Explicit waters are found to roughen the landscape considerably. Under some common conformation definitions, the results herein provide (i) an upper limit to the number of accessible conformations that compose unfolded states of proteins, (ii) the optimal clustering radius/conformation radius for counting conformations for a given energy and solvent model, (iii) a means of comparing various studies, and (iv) an assessment of the applicability of random search in protein folding.
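The pairwise-distance CDF described in this abstract can be illustrated with a small sketch. It assumes that "inverse" means the reciprocal of the CDF value, which is only one reading of the abstract; the function names, the toy one-dimensional "conformations", and the distance function are all hypothetical.

```python
import itertools
import math


def pairwise_cdf(conformations, distance, radius):
    """Fraction of distinct conformation pairs separated by less than `radius`."""
    pairs = list(itertools.combinations(conformations, 2))
    if not pairs:
        raise ValueError("need at least two conformations")
    close = sum(1 for a, b in pairs if distance(a, b) < radius)
    return close / len(pairs)


def conformation_count(conformations, distance, radius):
    """Reciprocal of the CDF, read here (as an assumption) as an effective conformation count."""
    p = pairwise_cdf(conformations, distance, radius)
    return math.inf if p == 0 else 1.0 / p


# Toy example: four one-dimensional "conformations" and an absolute-difference distance.
points = [0.0, 1.0, 2.0, 3.0]


def dist(a, b):
    return abs(a - b)


print(pairwise_cdf(points, dist, 1.5))        # 0.5
print(conformation_count(points, dist, 1.5))  # 2.0
```

For real trajectories the distance would typically be an RMSD between structures, and the number of pairs grows quadratically with the number of frames, so subsampling may be needed.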
Affiliation(s)
- David C Sullivan
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan.
37
Khavrutskii IV, Byrd RH, Brooks CL. A line integral reaction path approximation for large systems via nonlinear constrained optimization: Application to alanine dipeptide and the β hairpin of protein G. J Chem Phys 2006; 124:194903. [PMID: 16729840 DOI: 10.1063/1.2194544] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/15/2022] Open
Abstract
A variation of the line integral method of Elber with self-avoiding walk has been implemented using a state of the art nonlinear constrained optimization procedure. The new implementation appears to be robust in finding approximate reaction paths for small and large systems. Exact transition states and intermediates for the resulting paths can easily be pinpointed with subsequent application of the conjugate peak refinement method [S. Fischer and M. Karplus, Chem. Phys. Lett. 194, 252 (1992)] and unconstrained minimization, respectively. Unlike previous implementations utilizing a penalty function approach, the present implementation generates an exact solution of the underlying problem. Most importantly, this formulation does not require an initial guess for the path, which makes it particularly useful for studying complex molecular rearrangements. The method has been applied to conformational rearrangements of the alanine dipeptide in the gas phase and in water, and folding of the beta hairpin of protein G in water. In the latter case a procedure was developed to systematically sample the potential energy surface underlying folding and reconstruct folding pathways within the nearest-neighbor hopping approximation.
Affiliation(s)
- Ilja V Khavrutskii
- Department of Molecular Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
38
Zhou Z, Madrid M, Evanseck JD, Madura JD. Effect of a Bound Non-Nucleoside RT Inhibitor on the Dynamics of Wild-Type and Mutant HIV-1 Reverse Transcriptase. J Am Chem Soc 2005; 127:17253-60. [PMID: 16332074 DOI: 10.1021/ja053973d] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.2] [Indexed: 11/28/2022]
Abstract
HIV-1 reverse transcriptase (RT) is an important target for drugs used in the treatment of AIDS. Drugs known as non-nucleoside RT inhibitors (NNRTI) appear to alter the structural and dynamical properties of RT, which in turn inhibits RT's ability to transcribe. Molecular dynamics (MD), principal component analysis (PCA), and binding free energy simulations are employed to explore the dynamics of RT and its interaction with the bound NNRTI nevirapine, for both wild-type and mutant (V106A, Y181C, Y188C) RT. These three mutations commonly arise in the presence of nevirapine and result in resistance to the drug. We show that a bound NNRTI hinders the motion of almost all RT amino acids. The mutations, located in the non-nucleoside RT inhibitor binding pocket, partially restore RT flexibility. The binding affinities calculated by molecular mechanics/Poisson-Boltzmann surface accessibility (MM-PBSA) show that nevirapine interacts more strongly with wild-type RT than with mutant RT. The mutations cause a loss of van der Waals interactions between the drug and the binding pocket. The results from this study suggest that a good inhibitor should efficiently enter and maximally occupy the binding pocket, thereby interacting effectively with the amino acids around the binding pocket.
Affiliation(s)
- Zhigang Zhou
- Department of Chemistry and Biochemistry and Center for Computational Sciences, Duquesne University, Pittsburgh, Pennsylvania 15282, USA
39
Li Y, Zhou Z, Post CB. Dissociation of an antiviral compound from the internal pocket of human rhinovirus 14 capsid. Proc Natl Acad Sci U S A 2005; 102:7529-34. [PMID: 15899980 PMCID: PMC1140408 DOI: 10.1073/pnas.0408749102] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]
Abstract
WIN antiviral compounds bind human rhinovirus, as well as enterovirus and parechovirus, in an internal cavity located within the viral protein capsid. Access to the buried pocket necessitates deviation from the average viral protein structure identified by crystallography. We investigated the dissociation of WIN 52084 from the pocket in human rhinovirus 14 by using an adiabatic, biased molecular dynamics simulation method. Multiple dissociation trajectories are used to characterize the pathway. WIN 52084 exits between the polypeptide chain near the ends of βC and βH in a series of steps. Small, transient packing defects in the protein are sufficient for dissociation. A number of torsion-angle transitions of the antiviral compound are involved, which suggests that flexibility in antiviral compounds is important for binding. It is interesting to note that dissociation is associated with an increase in the conformational fluctuations of residues never in direct contact with WIN 52084 over the course of dissociation. These residues are N-terminal residues in the viral proteins VP3 and VP4 and are located in the interior of the capsid near the icosahedral 5-fold axis. The observed changes in dynamics may be relevant to structural changes associated with virion uncoating and its inhibition by antiviral compounds.
Affiliation(s)
- Yumin Li
- Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, IN 47907-2091, USA
40
Pan PW, Dickson RJ, Gordon HL, Rothstein SM, Tanaka S. Functionally relevant protein motions: Extracting basin-specific collective coordinates from molecular dynamics trajectories. J Chem Phys 2005; 122:34904. [PMID: 15740224 DOI: 10.1063/1.1830434] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022]
Abstract
Functionally relevant motion of proteins has been associated with a number of atoms moving in a concerted fashion along so-called "collective coordinates." We present an approach to extract collective coordinates from conformations obtained from molecular dynamics simulations. The power of this technique for differentiating local structural fluctuations between classes of conformers obtained by clustering is illustrated by analyzing nanosecond-long trajectories for the response regulator protein Spo0F of Bacillus subtilis, generated both in vacuo and using an implicit-solvent representation. Conformational clustering is performed using automated histogram filtering of the inter-Cα distances. Orthogonal (varimax) rotation of the vectors obtained by principal component analysis of these interresidue distances for the members of individual clusters is key to the interpretation of collective coordinates dominating each conformational class. The rotated loadings plots isolate significant variation in interresidue distances, and these are associated with entire mobile secondary structure elements. From this we infer concerted motions of these structural elements. For the Spo0F simulations employing an implicit-solvent representation, collective coordinates obtained in this fashion are consistent with the location of the protein's known active sites and experimentally determined mobile regions.
Affiliation(s)
- Patricia Wang Pan
- Department of Chemistry, Brock University, St. Catharines, Ontario L2S 3A1, Canada
41
Sullivan DC, Kuntz ID. Distributions in protein conformation space: implications for structure prediction and entropy. Biophys J 2004; 87:113-20. [PMID: 15240450 PMCID: PMC1304334 DOI: 10.1529/biophysj.104.041723] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.2] [Received: 02/18/2004] [Accepted: 03/23/2004] [Indexed: 11/18/2022]
Abstract
By considering how polymer structures are distributed in conformation space, we show that it is possible to quantify the difficulty of structural prediction and to provide a measure of progress for prediction calculations. The critical issue is the probability that a conformation is found within a specified distance of another conformer. We address this question by constructing a cumulative distribution function (CDF) for the average probability from observations about its limiting behavior at small displacements and numerical simulations of polyalanine chains. We can use the CDF to estimate the likelihood that a structure prediction is better than random chance. For example, the chance of randomly predicting the native backbone structure of a 150-amino-acid protein to low resolution, say within 6 Å, is 10⁻¹⁴. A high-resolution structural prediction, say to 2 Å, is immensely more difficult (10⁻⁵⁷). With additional assumptions, the CDF yields the conformational entropy of protein folding from native-state coordinate variance. Or, using values of the conformational entropy change on folding, we can estimate the native state's conformational span. For example, for a 150-mer protein, equilibrium α-carbon displacements in the native ensemble would be 0.3-0.5 Å based on TΔS of 1.42 kcal/(mol·residue).
Affiliation(s)
- David C Sullivan
- Department of Pharmaceutical Chemistry, University of California, San Francisco, 94143-2240, USA
42
Loccisano AE, Acevedo O, DeChancie J, Schulze BG, Evanseck JD. Enhanced sampling by multiple molecular dynamics trajectories: carbonmonoxy myoglobin 10 μs A0 → A1–3 transition from ten 400 picosecond simulations. J Mol Graph Model 2004; 22:369-76. [PMID: 15099833 DOI: 10.1016/j.jmgm.2003.12.004] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.9] [Indexed: 10/26/2022]
Abstract
The utility of multiple trajectories to extend the time scale of molecular dynamics simulations is reported for the spectroscopic A-states of carbonmonoxy myoglobin (MbCO). Experimentally, the A0 → A1–3 transition has been observed to be 10 μs at 300 K, which is beyond the time scale of standard molecular dynamics simulations. To simulate this transition, 10 short (400 ps) and two longer time (1.2 ns) molecular dynamics trajectories, starting from five different crystallographic and solution phase structures with random initial velocities centered in a 37 Å radius sphere of water, have been used to sample the native-fold of MbCO. Analysis of the ensemble of structures gathered over the cumulative 5.6 ns reveals two biomolecular motions involving the side chains of His64 and Arg45 to explain the spectroscopic states of MbCO. The 10 μs A0 → A1–3 transition involves the motion of His64, where the distance between His64 and CO is found to vary up to 8.8 ± 1.0 Å during the transition of His64 from the ligand (A1–3) to bulk solvent (A0). The His64 motion occurs within a single trajectory only once; however, the multiple trajectories populate the spectroscopic A-states fully. Consequently, multiple independent molecular dynamics simulations have been found to extend biomolecular motion from 5 ns of total simulation to experimental phenomena on the microsecond time scale.
Affiliation(s)
- Anne E Loccisano
- The National Energy and Technology Laboratory, Pittsburgh, PA 15236-0940, USA
43
Sullivan DC, Aynechi T, Voelz VA, Kuntz ID. Information content of molecular structures. Biophys J 2003; 85:174-90. [PMID: 12829474 PMCID: PMC1303075 DOI: 10.1016/s0006-3495(03)74464-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.8] [Received: 11/26/2002] [Accepted: 03/13/2003] [Indexed: 11/21/2022]
Abstract
For a completely enumerated set of conformers of a macromolecule or for exhaustive lattice walks of model polymers it is straightforward to use Shannon information theory to deduce the information content of the ensemble. It is also practicable to develop numerical measures of the information content of sets of exact distance constraints applied to specific conformational ensembles. We examine the effects of experimental uncertainties by considering "noisy" constraints. The introduction of noise requires additional assumptions about noise distribution and conformational clustering protocols that make the problem of measuring information content more complex. We make use of a standard concept in communication theory, the "noise sphere," to link uncertainty in measurements to information loss. Most of our numerical results are derived from two-dimensional lattice ensembles. Expressing results in terms of information per degree of freedom removes almost all of the chain length dependence. We also explore off-lattice polyalanine chains that yield surprisingly similar results.
Affiliation(s)
- David C Sullivan
- Department of Pharmaceutical Chemistry, University of California, San Francisco, California 94143-2240, USA
44
Chema D, Becker OM. A method for correlations analysis of coordinates: applications for molecular conformations. J Chem Inf Comput Sci 2002; 42:937-46. [PMID: 12132895 DOI: 10.1021/ci0103471] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/30/2022]
Abstract
We describe a new method to analyze multiple correlations between subsets of coordinates that represent a sample. The correlation is established only between specific regions of interest at the coordinates. First, the region(s) of interest are selected at each molecular coordinate. Next, a correlation matrix is constructed for the selected regions. The matrix is subject to further analysis, illuminating the multidimensional structural characteristics that exist in the conformational space. The method's abilities are demonstrated in several examples: it is used to analyze the conformational space of complex molecules, it is successfully applied to compare related conformational spaces, and it is used to analyze a diverse set of protein folding trajectories.
Affiliation(s)
- Doron Chema
- School of Chemistry, Tel Aviv University, Ramat Aviv, Tel Aviv 69978, Israel.
45
Laboulais C, Ouali M, Le Bret M, Gabarro-Arpa J. Hamming distance geometry of a protein conformational space: application to the clustering of a 4-ns molecular dynamics trajectory of the HIV-1 integrase catalytic core. Proteins 2002; 47:169-79. [PMID: 11933064 DOI: 10.1002/prot.10081] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022]
Abstract
Protein structures can be encoded into binary sequences (Gabarro-Arpa et al., Comput Chem 2000;24:693-698); these are used to define a Hamming distance in conformational space: the distance between two different molecular conformations is the number of different bits in their sequences. Each bit in the sequence arises from a partition of conformational space in two halves. Thus, the information encoded in the binary sequences is also used to characterize the regions of conformational space visited by the system. We apply this distance and its associated geometric structures to the clustering and analysis of conformations sampled during a 4-ns molecular dynamics simulation of the HIV-1 integrase catalytic core. The cluster analysis of the simulation shows a division of the trajectory into two segments of 2.6 and 1.4 ns length, which are qualitatively different: the data point to the fact that equilibration is only reached at the end of the first segment. The Hamming distance is also compared to the r.m.s. deviation measure. The analysis of the cases studied so far shows that under the same conditions the two measures behave quite differently, and that the Hamming distance appears to be more robust than the r.m.s. deviation.
Affiliation(s)
- Cyril Laboulais
- LBPA, CNRS UMR 8532, Ecole Normale Supérieure de Cachan, Cachan, France
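The Hamming distance defined in the Laboulais et al. abstract above (the number of differing bits between two binary-encoded conformations) can be illustrated with a minimal Python sketch. It assumes each conformation has already been encoded as an equal-length string of '0' and '1' characters; the encoding step from the cited Gabarro-Arpa et al. paper is not reproduced here, and the function names `hamming_distance` and `pairwise_hamming` are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def pairwise_hamming(frames: Sequence[str]) -> List[List[int]]:
    """Build the symmetric matrix of Hamming distances between all frames.

    `frames` is assumed to hold one pre-encoded bit string per conformation.
    """
    n = len(frames)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = hamming_distance(frames[i], frames[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


if __name__ == "__main__":
    # Toy bit strings standing in for encoded conformations (made-up data).
    toy_frames = ["0000", "0011", "1111"]
    for row in pairwise_hamming(toy_frames):
        print(row)
```

A distance matrix of this kind could then be passed to any clustering routine that accepts precomputed distances; which clustering method the authors used is not stated in the abstract.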
46
Affiliation(s)
- David C. Sullivan
- Department of Pharmaceutical Chemistry, University of California at San Francisco, San Francisco, California 94143-0446
- Irwin D. Kuntz
- Department of Pharmaceutical Chemistry, University of California at San Francisco, San Francisco, California 94143-0446
47
Church BW, Shalloway D. Top-down free-energy minimization on protein potential energy landscapes. Proc Natl Acad Sci U S A 2001; 98:6098-103. [PMID: 11344256 PMCID: PMC33428 DOI: 10.1073/pnas.101030498] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Indexed: 11/18/2022]
Abstract
The hierarchical properties of potential energy landscapes have been used to gain insight into thermodynamic and kinetic properties of protein ensembles. It also may be possible to use them to direct computational searches for thermodynamically stable macroscopic states, i.e., computational protein folding. To this end, we have developed a top-down search procedure in which conformation space is recursively dissected according to the intrinsic hierarchical structure of a landscape's effective-energy barriers. This procedure generates an inverted tree similar to the disconnectivity graphs generated by local minima-clustering methods, but it fundamentally differs in the manner in which the portion of the tree that is to be computationally explored is selected. A key ingredient is a branch-selection algorithm that takes advantage of statistically predictive properties of the landscape to guide searches down the tree branches that are most likely to lead to the physically relevant macroscopic states. Using the computational folding of a beta-hairpin-forming peptide as an example, we show that such predictive properties indeed exist and can be used for structure prediction by free-energy global minimization.
Affiliation(s)
- B W Church
- Biophysics Program, Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
48
Hyvönen MT, Hiltunen Y, El-Deredy W, Ojala T, Vaara J, Kovanen PT, Ala-Korpela M. Application of self-organizing maps in conformational analysis of lipids. J Am Chem Soc 2001; 123:810-6. [PMID: 11456614 DOI: 10.1021/ja0025853] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.2] [Indexed: 11/30/2022]
Abstract
The characteristics of lipid assemblies are important for the functions of biological membranes. This has led to an increasing utilization of molecular dynamics simulations for the elucidation of the structural features of biomembranes. We have applied the self-organizing map (SOM) to the analysis of the complex conformational data from a 1-ns molecular dynamics simulation of PLPC phospholipids in a membrane assembly. Mapping of 1.44 million molecular conformations to a two-dimensional array of neurons revealed, without human intervention, the main conformational features in hours. Both the whole molecule and the characteristics of the unsaturated fatty acid chains were analyzed. All major structural features were easily distinguished, such as the orientational variability of the headgroup, the mainly trans state dihedral angles of the sn-1 chain, and both straight and bent conformations of the unsaturated sn-2 chain. Furthermore, presentation of the trajectory of an individual lipid molecule on the map provides information on conformational dynamics. The present results suggest that the SOM method provides a powerful tool for routinely gaining rapid insight to the main molecular conformations as well as to the conformational dynamics of any simulated molecular assembly without the requirement of a priori knowledge.
Affiliation(s)
- M T Hyvönen
- Contribution from the Wihuri Research Institute, Kalliolinnantie 4, FIN-00140 Helsinki, Finland.
49
Hamprecht FA, Peter C, Daura X, Thiel W, van Gunsteren WF. A strategy for analysis of (molecular) equilibrium simulations: Configuration space density estimation, clustering, and visualization. J Chem Phys 2001. [DOI: 10.1063/1.1330216] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.9] [Indexed: 11/14/2022]
50