1
Forouzesh N, Ghafouri F, Tolokh IS, Onufriev AV. Optimal Dielectric Boundary for Binding Free Energy Estimates in the Implicit Solvent. J Chem Inf Model 2024; 64:9433-9448. [PMID: 39656550 DOI: 10.1021/acs.jcim.4c01190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2024]
Abstract
Accuracy of binding free energy calculations utilizing implicit solvent models is critically affected by parameters of the underlying dielectric boundary, specifically, the atomic and water probe radii. Here, a multidimensional optimization pipeline is used to find optimal atomic radii, specifically for binding calculations in the implicit solvent. To reduce overfitting, the optimization target includes separate, weighted contributions from both binding and hydration free energies. The resulting five-parameter radii set, OPT_BIND5D, is evaluated against experiment for binding free energies of 20 host-guest (H-G) systems, unrelated to the types of structures used in the training. The resulting accuracy for this H-G test set (root mean square error of 2.03 kcal/mol, mean signed error of -0.13 kcal/mol, mean absolute error of 1.68 kcal/mol, and Pearson's correlation of r = 0.79 with the experimental values) is on par with what can be expected from the fixed charge explicit solvent models. Best agreement with the experiment is achieved when the implicit salt concentration is set equal or close to the experimental conditions.
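The four accuracy measures quoted in this abstract (root mean square error, mean signed error, mean absolute error, and Pearson's r) can be computed as in the minimal sketch below. The function name `error_metrics`, the sign convention (predicted minus experimental), and the sample numbers are assumptions made for illustration; they are not taken from the paper.

```python
import math


def error_metrics(predicted, experimental):
    """Return RMSE, mean signed error, mean absolute error and Pearson r.

    Assumed sign convention: error = predicted - experimental.
    """
    if len(predicted) != len(experimental) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    n = len(predicted)
    errors = [p - e for p, e in zip(predicted, experimental)]

    rmse = math.sqrt(sum(d * d for d in errors) / n)
    mean_signed = sum(errors) / n
    mean_abs = sum(abs(d) for d in errors) / n

    # Pearson correlation between predicted and experimental values.
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    pearson_r = cov / math.sqrt(var_p * var_e)

    return {"rmse": rmse, "mse": mean_signed, "mae": mean_abs, "pearson_r": pearson_r}


# Made-up binding free energies in kcal/mol, for illustration only.
predicted = [-5.0, -7.5, -3.0, -9.0]
experimental = [-4.0, -8.0, -3.5, -8.0]
metrics = error_metrics(predicted, experimental)
```

A mean signed error near zero alongside a larger mean absolute error, as reported in the abstract, indicates errors that largely cancel in sign rather than a systematic offset.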
Affiliation(s)
- Negin Forouzesh
  - Department of Computer Science, California State University, Los Angeles, California 90032, United States
- Fatemeh Ghafouri
  - Genetics, Bioinformatics, and Computational Biology, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
- Igor S Tolokh
  - Department of Computer Science, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
- Alexey V Onufriev
  - Department of Computer Science, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
  - Department of Physics, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
  - Center for Soft Matter and Biological Physics, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States

2
Roy A, Ali T, Venkatraman V. The Area Law of Molecular Entropy: Moving beyond Harmonic Approximation. Entropy (Basel) 2024; 26:688. [PMID: 39202158 PMCID: PMC11353761 DOI: 10.3390/e26080688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2024] [Revised: 08/03/2024] [Accepted: 08/13/2024] [Indexed: 09/03/2024]
Abstract
This article shows that the gas-phase entropy of molecules is proportional to the area of the molecules, with corrections for the different curvatures of the molecular surface. The ability to estimate gas-phase entropy by the area law also allows us to calculate molecular entropy faster and more accurately than currently popular methods of estimating molecular entropy with harmonic oscillator approximation. The speed and accuracy of our method will open up new possibilities for the explicit inclusion of entropy in various computational biology methods.
Affiliation(s)
- Amitava Roy
  - Department of Biomedical and Pharmaceutical Sciences, University of Montana, Missoula, MT 59812, USA
- Tibra Ali
  - Department of Mathematics and Natural Sciences, School of Data and Science, BRAC University, Dhaka 1212, Bangladesh
- Vishwesh Venkatraman
  - Department of Chemistry, Norwegian University of Science and Technology (NTNU), 7491 Trondheim, Norway

3
Tolokh IS, Folescu DE, Onufriev AV. Inclusion of Water Multipoles into the Implicit Solvation Framework Leads to Accuracy Gains. J Phys Chem B 2024; 128:5855-5873. [PMID: 38860842 PMCID: PMC11194828 DOI: 10.1021/acs.jpcb.4c00254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 05/28/2024] [Accepted: 05/29/2024] [Indexed: 06/12/2024]
Abstract
The current practical "workhorses" of atomistic implicit solvation, the Poisson-Boltzmann (PB) and generalized Born (GB) models, face fundamental accuracy limitations. Here, we propose a computationally efficient implicit solvation framework, the Implicit Water Multipole GB (IWM-GB) model, that systematically incorporates the effects of multipole moments of water molecules in the first hydration shell of a solute, beyond the dipole water polarization already present at the PB/GB level. The framework explicitly accounts for coupling between polar and nonpolar contributions to the total solvation energy, which is missing from many implicit solvation models. An implementation of the framework, utilizing the GAFF force field and AM1-BCC atomic partial charges model, is parametrized and tested against the experimental hydration free energies of small molecules from the FreeSolv database. The resulting accuracy on the test set (RMSE ∼ 0.9 kcal/mol) is 12% better than that of the explicit solvation (TIP3P) treatment, which is orders of magnitude slower. We also find that the coupling between polar and nonpolar parts of the solvation free energy is essential to ensuring that several features of the IWM-GB model are physically meaningful, including the sign of the nonpolar contributions.
Affiliation(s)
- Igor S. Tolokh
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Dan E. Folescu
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
  - Department of Mathematics, Virginia Tech, Blacksburg, Virginia 24061, United States
- Alexey V. Onufriev
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
  - Department of Physics, Virginia Tech, Blacksburg, Virginia 24061, United States
  - Center for Soft Matter and Biological Physics, Virginia Tech, Blacksburg, Virginia 24061, United States

4
Su Z, Tong Y, Wei GW. Hodge Decomposition of Single-Cell RNA Velocity. J Chem Inf Model 2024; 64:3558-3568. [PMID: 38572676 PMCID: PMC11035094 DOI: 10.1021/acs.jcim.4c00132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 03/21/2024] [Accepted: 03/22/2024] [Indexed: 04/05/2024]
Abstract
RNA velocity has the ability to capture the cell dynamic information in the biological processes; yet, a comprehensive analysis of the cell state transitions and their associated chemical and biological processes remains a gap. In this work, we provide the Hodge decomposition, coupled with discrete exterior calculus (DEC), to unveil cell dynamics by examining the decomposed curl-free, divergence-free, and harmonic components of the RNA velocity field in a low dimensional representation, such as a UMAP or a t-SNE representation. Decomposition results show that the decomposed components distinctly reveal key cell dynamic features such as cell cycle, bifurcation, and cell lineage differentiation, regardless of the choice of the low-dimensional representations. The consistency across different representations demonstrates that the Hodge decomposition is a reliable and robust way to extract these cell dynamic features, offering unique analysis and insightful visualization of single-cell RNA velocity fields.
Affiliation(s)
- Zhe Su
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Yiying Tong
  - Department of Computer Science and Engineering, Michigan State University, East Lansing, Michigan 48824, United States
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States

5
Feng H, Cottrell S, Hozumi Y, Wei GW. Multiscale differential geometry learning of networks with applications to single-cell RNA sequencing data. Comput Biol Med 2024; 171:108211. [PMID: 38422960 PMCID: PMC10965033 DOI: 10.1016/j.compbiomed.2024.108211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Revised: 02/02/2024] [Accepted: 02/25/2024] [Indexed: 03/02/2024]
Abstract
Single-cell RNA sequencing (scRNA-seq) has emerged as a transformative technology, offering unparalleled insights into the intricate landscape of cellular diversity and gene expression dynamics. scRNA-seq analysis represents a challenging and cutting-edge frontier within the field of biological research. Differential geometry serves as a powerful mathematical tool in various applications of scientific research. In this study, we introduce, for the first time, a multiscale differential geometry (MDG) strategy for addressing the challenges encountered in scRNA-seq data analysis. We assume that intrinsic properties of cells lie on a family of low-dimensional manifolds embedded in the high-dimensional space of scRNA-seq data. Multiscale cell-cell interactive manifolds are constructed to reveal complex relationships in the cell-cell network, where curvature-based features for cells can decipher the intricate structural and biological information. We showcase the utility of our novel approach by demonstrating its effectiveness in classifying cell types. This innovative application of differential geometry in scRNA-seq analysis opens new avenues for understanding the intricacies of biological networks and holds great potential for network analysis in other fields.
Affiliation(s)
- Hongsong Feng
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
- Sean Cottrell
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
- Yuta Hozumi
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA

6
Bass L, Elder LH, Folescu DE, Forouzesh N, Tolokh IS, Karpatne A, Onufriev AV. Improving the Accuracy of Physics-Based Hydration-Free Energy Predictions by Machine Learning the Remaining Error Relative to the Experiment. J Chem Theory Comput 2024; 20:396-410. [PMID: 38149593 PMCID: PMC10950260 DOI: 10.1021/acs.jctc.3c00981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2023]
Abstract
The accuracy of computational models of water is key to atomistic simulations of biomolecules. We propose a computationally efficient way to improve the accuracy of the prediction of hydration-free energies (HFEs) of small molecules: the remaining errors of the physics-based models relative to the experiment are predicted and mitigated by machine learning (ML) as a postprocessing step. Specifically, the trained graph convolutional neural network attempts to identify the "blind spots" in the physics-based model predictions, where the complex physics of aqueous solvation is poorly accounted for, and partially corrects for them. The strategy is explored for five classical solvent models representing various accuracy/speed trade-offs, from the fast analytical generalized Born (GB) to the popular TIP3P explicit solvent model; experimental HFEs of small neutral molecules from the FreeSolv set are used for the training and testing. For all of the models, the ML correction reduces the resulting root-mean-square error relative to the experiment for HFEs of small molecules, without significant overfitting and with negligible computational overhead. For example, on the test set, the relative accuracy improvement is 47% for the fast analytical GB, making it, after the ML correction, almost as accurate as uncorrected TIP3P. For the TIP3P model, the accuracy improvement is about 39%, bringing the ML-corrected model's accuracy below the 1 kcal/mol threshold. In general, the relative benefit of the ML corrections is smaller for more accurate physics-based models, reaching the lower limit of about 20% relative accuracy gain compared with that of the physics-based treatment alone. The proposed strategy of using ML to learn the remaining error of physics-based models offers a distinct advantage over training ML alone directly on reference HFEs: it preserves the correct overall trend, even well outside of the training set.
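The post-processing idea in this abstract, learning the remaining error of a physics-based prediction and adding it back, can be illustrated with a deliberately simplified sketch. The paper uses a graph convolutional neural network; the code below swaps in a one-feature least-squares line purely for illustration. The function names, the single scalar feature, and the toy numbers are all hypothetical.

```python
import math


def fit_residual_model(features, physics_pred, experiment):
    """Fit residual = slope * feature + intercept by ordinary least squares.

    The residual is experiment - physics_pred, i.e. the error the physics
    model leaves behind. A linear fit stands in for the paper's neural network.
    """
    residuals = [e - p for e, p in zip(experiment, physics_pred)]
    n = len(features)
    mean_x = sum(features) / n
    mean_r = sum(residuals) / n
    sxx = sum((x - mean_x) ** 2 for x in features)
    if sxx == 0:
        raise ValueError("feature values must not all be identical")
    sxr = sum((x - mean_x) * (r - mean_r) for x, r in zip(features, residuals))
    slope = sxr / sxx
    intercept = mean_r - slope * mean_x
    return slope, intercept


def corrected_prediction(feature, physics_pred, model):
    """Physics-based value plus the learned estimate of its remaining error."""
    slope, intercept = model
    return physics_pred + slope * feature + intercept


def rmse(pred, ref):
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(pred))


# Synthetic toy data (kcal/mol); not taken from FreeSolv or the paper.
features = [1.0, 2.0, 3.0, 4.0]
physics_pred = [-1.0, -2.0, -3.0, -4.0]
experiment = [-0.5, -1.0, -1.5, -2.0]

model = fit_residual_model(features, physics_pred, experiment)
corrected = [corrected_prediction(x, p, model) for x, p in zip(features, physics_pred)]
rmse_before = rmse(physics_pred, experiment)
rmse_after = rmse(corrected, experiment)
```

This toy example evaluates the correction on the same points used for fitting; the accuracy gains quoted in the abstract are reported on a separate test set, which is what a real assessment would require.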
Affiliation(s)
- Lewis Bass
  - Department of Computer Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States
- Luke H Elder
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Dan E Folescu
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
  - Department of Mathematics, Virginia Tech, Blacksburg, Virginia 24061, United States
- Negin Forouzesh
  - Department of Computer Science, California State University, Los Angeles, California 90032, United States
- Igor S Tolokh
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Anuj Karpatne
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Alexey V Onufriev
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
  - Department of Physics, Virginia Tech, Blacksburg, Virginia 24061, United States
  - Center for Soft Matter and Biological Physics, Virginia Tech, Blacksburg, Virginia 24061, United States

7
Gupta A, Mukherjee A. Capturing surface complementarity in proteins using unsupervised learning and robust curvature measure. Proteins 2022; 90:1669-1683. [DOI: 10.1002/prot.26345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2021] [Revised: 03/06/2022] [Accepted: 04/01/2022] [Indexed: 11/07/2022]
Affiliation(s)
- Abhijit Gupta
  - Department of Chemistry, Indian Institute of Science Education and Research, Pune, Maharashtra, India
- Arnab Mukherjee
  - Department of Chemistry, Indian Institute of Science Education and Research, Pune, Maharashtra, India

8
Chen J, Zhao R, Tong Y, Wei GW. Evolutionary de Rham-Hodge Method. Discrete and Continuous Dynamical Systems. Series B 2021; 26:3785-3821. [PMID: 34675756 DOI: 10.3934/dcdsb.2020257] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 12/29/2022]
Abstract
The de Rham-Hodge theory is a landmark of 20th-century mathematics and has had a great impact on mathematics, physics, computer science, and engineering. This work introduces an evolutionary de Rham-Hodge method to provide a unified paradigm for the multiscale geometric and topological analysis of evolving manifolds constructed from a filtration, which induces a family of evolutionary de Rham complexes. While the present method can be easily applied to closed manifolds, emphasis is given to more challenging compact manifolds with 2-manifold boundaries, which require appropriate analysis and treatment of boundary conditions on differential forms to maintain proper topological properties. Three sets of unique evolutionary Hodge Laplacians are proposed to generate three sets of topology-preserving singular spectra, for which the multiplicities of zero eigenvalues correspond exactly to the persistent Betti numbers of dimensions 0, 1 and 2. Additionally, three sets of non-zero eigenvalues further reveal both topological persistence and geometric progression during the manifold evolution. Extensive numerical experiments are carried out via the discrete exterior calculus to demonstrate the potential of the proposed paradigm for data representation and shape analysis of both point cloud data and density maps. To demonstrate the utility of the proposed method, it is applied to protein B-factor prediction for a few challenging cases in which existing biophysical models break down.
Affiliation(s)
- Jiahui Chen
  - Department of Mathematics, Michigan State University, MI 48824, USA
- Rundong Zhao
  - Department of Computer Science and Engineering, Michigan State University, MI 48824, USA
- Yiying Tong
  - Department of Computer Science and Engineering, Michigan State University, MI 48824, USA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, MI 48824, USA

9
Abstract
Recently, machine learning (ML) has established itself in various worldwide benchmarking competitions in computational biology, including Critical Assessment of Structure Prediction (CASP) and Drug Design Data Resource (D3R) Grand Challenges. However, the intricate structural complexity and high ML dimensionality of biomolecular datasets obstruct the efficient application of ML algorithms in the field. In addition to data and algorithm, an efficient ML machinery for biomolecular predictions must include structural representation as an indispensable component. Mathematical representations that simplify the biomolecular structural complexity and reduce ML dimensionality have emerged as a prime winner in D3R Grand Challenges. This review is devoted to the recent advances in developing low-dimensional and scalable mathematical representations of biomolecules in our laboratory. We discuss three classes of mathematical approaches, including algebraic topology, differential geometry, and graph theory. We elucidate how the physical and biological challenges have guided the evolution and development of these mathematical apparatuses for massive and diverse biomolecular data. We focus the performance analysis on protein-ligand binding predictions in this review although these methods have had tremendous success in many other applications, such as protein classification, virtual screening, and the predictions of solubility, solvation free energies, toxicity, partition coefficients, protein folding stability changes upon mutation, etc.
Affiliation(s)
- Duc Duy Nguyen
  - Department of Mathematics, Michigan State University, MI 48824, USA
- Zixuan Cang
  - Department of Mathematics, Michigan State University, MI 48824, USA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA

10
Nguyen DD, Wei GW. DG-GL: Differential geometry-based geometric learning of molecular datasets. International Journal for Numerical Methods in Biomedical Engineering 2019; 35:e3179. [PMID: 30693661 PMCID: PMC6598676 DOI: 10.1002/cnm.3179] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Received: 09/25/2018] [Revised: 11/21/2018] [Accepted: 12/06/2018] [Indexed: 05/11/2023]
Abstract
MOTIVATION: Despite its great success in various physical modeling, differential geometry (DG) has rarely been devised as a versatile tool for analyzing large, diverse, and complex molecular and biomolecular datasets because of the limited understanding of its potential power in dimensionality reduction and its ability to encode essential chemical and biological information in differentiable manifolds.
RESULTS: We put forward a differential geometry-based geometric learning (DG-GL) hypothesis that the intrinsic physics of three-dimensional (3D) molecular structures lies on a family of low-dimensional manifolds embedded in a high-dimensional data space. We encode crucial chemical, physical, and biological information into 2D element interactive manifolds, extracted from a high-dimensional structural data space via a multiscale discrete-to-continuum mapping using differentiable density estimators. Differential geometry apparatuses are utilized to construct element interactive curvatures in analytical forms for certain analytically differentiable density estimators. These low-dimensional differential geometry representations are paired with a robust machine learning algorithm to showcase their descriptive and predictive powers for large, diverse, and complex molecular and biomolecular datasets. Extensive numerical experiments are carried out to demonstrate that the proposed DG-GL strategy outperforms other advanced methods in the predictions of drug discovery-related protein-ligand binding affinity, drug toxicity, and molecular solvation free energy.
AVAILABILITY AND IMPLEMENTATION: http://weilab.math.msu.edu/DG-GL/
Contact: wei@math.msu.edu
Affiliation(s)
- Duc Duy Nguyen
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824
  - Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824

11
Cang Z, Mu L, Wei GW. Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening. PLoS Comput Biol 2018; 14:e1005929. [PMID: 29309403 PMCID: PMC5774846 DOI: 10.1371/journal.pcbi.1005929] [Citation(s) in RCA: 149] [Impact Index Per Article: 21.3] [Received: 09/01/2017] [Revised: 01/19/2018] [Accepted: 12/15/2017] [Indexed: 12/05/2022]
Abstract
This work introduces a number of algebraic topology approaches, including multi-component persistent homology, multi-level persistent homology, and electrostatic persistence for the representation, characterization, and description of small molecules and biomolecular complexes. In contrast to the conventional persistent homology, multi-component persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a tailored topological description of inter- and/or intra-molecular interactions of interest. Electrostatic persistence incorporates partial charge information into topological invariants. These topological methods are paired with Wasserstein distance to characterize similarities between molecules and are further integrated with a variety of machine learning algorithms, including k-nearest neighbors, ensemble of trees, and deep convolutional neural networks, to manifest their descriptive and predictive powers for protein-ligand binding analysis and virtual screening of small molecules. Extensive numerical experiments involving 4,414 protein-ligand complexes from the PDBBind database and 128,374 ligand-target and decoy-target pairs in the DUD database are performed to test respectively the scoring power and the discriminatory power of the proposed topological learning strategies. It is demonstrated that the present topological learning outperforms other existing methods in protein-ligand binding affinity prediction and ligand-decoy discrimination.
Affiliation(s)
- Zixuan Cang
  - Department of Mathematics, Michigan State University, East Lansing, Michigan, United States of America
- Lin Mu
  - Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, United States of America
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, Michigan, United States of America
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan, United States of America
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan, United States of America

12
Onufriev AV, Izadi S. Water models for biomolecular simulations. Wiley Interdisciplinary Reviews: Computational Molecular Science 2017. [DOI: 10.1002/wcms.1347] [Citation(s) in RCA: 94] [Impact Index Per Article: 11.8] [Indexed: 12/22/2022]
Affiliation(s)
- Alexey V. Onufriev
  - Department of Physics, Virginia Tech, Blacksburg, VA, USA
  - Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
  - Center for Soft Matter and Biological Physics, Virginia Tech, Blacksburg, VA, USA
- Saeed Izadi
  - Early Stage Pharmaceutical Development, Genentech Inc., South San Francisco, CA, USA

13
Mikucki M, Zhou Y. Fast Simulation of Lipid Vesicle Deformation Using Spherical Harmonic Approximation. Communications in Computational Physics 2017; 21:40-64. [PMID: 28804520 PMCID: PMC5552105 DOI: 10.4208/cicp.oa-2015-0029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
Lipid vesicles appear ubiquitously in biological systems. Understanding how mechanical and intermolecular interactions deform vesicle membranes is a fundamental question in biophysics. In this article we develop a fast algorithm to compute the surface configurations of lipid vesicles by introducing surface harmonic functions to approximate the membrane surface. This parameterization allows an analytical computation of the membrane curvature energy and its gradient for the efficient minimization of the curvature energy using a nonlinear conjugate gradient method. Our approach drastically reduces the degrees of freedom for approximating the membrane surfaces compared to the previously developed finite element and finite difference methods. Vesicle deformations with a reduced volume larger than 0.65 can be well approximated by using as few as 49 surface harmonic functions. The method thus has great potential to reduce the computational expense of tracking multiple vesicles that deform as they interact with external fields.
Affiliation(s)
- Michael Mikucki
  - Department of Applied Mathematics & Statistics, Colorado School of Mines, Golden, Colorado, 80401, USA
- Yongcheng Zhou
  - Department of Mathematics, Colorado State University, Fort Collins, Colorado, 80523, USA

14
Nguyen DD, Wei GW. The impact of surface area, volume, curvature, and Lennard-Jones potential to solvation modeling. J Comput Chem 2016; 38:24-36. [PMID: 27718270 DOI: 10.1002/jcc.24512] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 05/20/2016] [Revised: 08/17/2016] [Accepted: 08/30/2016] [Indexed: 12/24/2022]
Abstract
This article explores the impact of surface area, volume, curvature, and Lennard-Jones (LJ) potential on solvation free energy predictions. Rigidity surfaces are utilized to generate robust analytical expressions for maximum, minimum, mean, and Gaussian curvatures of solvent-solute interfaces, and define a generalized Poisson-Boltzmann (GPB) equation with a smooth dielectric profile. Extensive correlation analysis is performed to examine the linear dependence of surface area, surface enclosed volume, maximum curvature, minimum curvature, mean curvature, and Gaussian curvature for solvation modeling. It is found that surface area and surface enclosed volume are highly correlated with each other, and poorly correlated with the various curvatures, for six test sets of molecules. Different curvatures are weakly correlated with each other across the six test sets of molecules, but are strongly correlated with each other within each test set. Based on correlation analysis, we construct twenty-six nontrivial nonpolar solvation models. Our numerical results reveal that the LJ potential plays a vital role in nonpolar solvation modeling, especially for molecules involving strong van der Waals interactions. It is found that curvatures are at least as important as surface area or surface enclosed volume in nonpolar solvation modeling. In conjunction with the GPB model, various curvature-based nonpolar solvation models are shown to offer some of the best solvation free energy predictions for a wide range of test sets. For example, root mean square errors from a model constituting surface area, volume, mean curvature, and LJ potential are less than 0.42 kcal/mol for all test sets.
Affiliation(s)
- Duc D Nguyen
  - Department of Mathematics, Michigan State University, Michigan, 48824
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, Michigan, 48824
  - Department of Electrical and Computer Engineering, Michigan State University, Michigan, 48824
  - Department of Biochemistry and Molecular Biology, Michigan State University, Michigan, 48824

15
Multiscale method for modeling binding phenomena involving large objects: application to kinesin motor domains motion along microtubules. Sci Rep 2016; 6:23249. [PMID: 26988596 PMCID: PMC4796874 DOI: 10.1038/srep23249] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 12/09/2015] [Accepted: 03/03/2016] [Indexed: 11/30/2022]
Abstract
Many biological phenomena involve the binding of proteins to a large object. Because the electrostatic forces that guide binding act over large distances, truncating the size of the system to facilitate computational modeling frequently yields inaccurate results. Our multiscale approach implements a computational focusing method that permits computation of large systems without truncating the electrostatic potential and achieves the high resolution required for modeling macromolecular interactions, all while keeping the computational time reasonable. We tested our approach on the motility of various kinesin motor domains. We found that electrostatics help guide kinesins as they walk: N-kinesins towards the plus-end, and C-kinesins towards the minus-end of microtubules. Our methodology enables computation in similar, large systems including protein binding to DNA, viruses, and membranes.

16
Abstract
Persistent homology provides a new approach for the topological simplification of big data via measuring the lifetime of intrinsic topological features in a filtration process and has found success in scientific and engineering applications. However, such success is essentially limited to qualitative data classification and analysis. Indeed, persistent homology has rarely been employed for quantitative modeling and prediction. Additionally, the present persistent homology is a passive tool, rather than a proactive technique, for classification and analysis. In this work, we outline a general protocol to construct object-oriented persistent homology methods. By means of differential geometry theory of surfaces, we construct an objective functional, namely, a surface free energy defined on the data of interest. The minimization of the objective functional leads to a Laplace-Beltrami operator which generates a multiscale representation of the initial data and offers an objective oriented filtration process. The resulting differential geometry based object-oriented persistent homology is able to preserve desirable geometric features in the evolutionary filtration and enhances the corresponding topological persistence. The cubical complex based homology algorithm is employed in the present work to be compatible with the Cartesian representation of the Laplace-Beltrami flow. The proposed Laplace-Beltrami flow based persistent homology method is extensively validated. The consistency between Laplace-Beltrami flow based filtration and Euclidean distance based filtration is confirmed on the Vietoris-Rips complex for a large number of numerical tests. The convergence and reliability of the present Laplace-Beltrami flow based cubical complex filtration approach are analyzed over various spatial and temporal mesh sizes. The Laplace-Beltrami flow based persistent homology approach is utilized to study the intrinsic topology of proteins and fullerene molecules.
Based on a quantitative model which correlates the topological persistence of the fullerene central cavity with the total curvature energy of the fullerene structure, the proposed method is used for the prediction of fullerene isomer stability. The efficiency and robustness of the present method are verified by more than 500 fullerene molecules. It is shown that the proposed persistent homology based quantitative model offers good predictions of total curvature energies for ten types of fullerene isomers. The present work offers the first example of designing object-oriented persistent homology to enhance or preserve desirable features in the original data during the filtration process and then automatically detect or extract the corresponding topological traits from the data.
Affiliation(s)
- Bao Wang
  - Department of Mathematics, Michigan State University, MI 48824, USA
- Guo-Wei Wei
  - Mathematical Biosciences Institute, The Ohio State University, Columbus, Ohio 43210, USA
|
17
|
Xia K, Wei GW. Persistent topology for cryo-EM data analysis. International Journal for Numerical Methods in Biomedical Engineering 2015; 31:n/a-n/a. [PMID: 25851063 DOI: 10.1002/cnm.2719] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 12/23/2014] [Revised: 03/13/2015] [Accepted: 03/31/2015] [Indexed: 06/04/2023]
Abstract
In this work, we introduce persistent homology for the analysis of cryo-electron microscopy (cryo-EM) density maps. We identify the topological fingerprint or topological signature of noise, which is widespread in cryo-EM data. For low signal-to-noise ratio (SNR) volumetric data, intrinsic topological features of biomolecular structures are indistinguishable from noise. To remove noise, we employ geometric flows that are found to preserve the intrinsic topological fingerprints of cryo-EM structures and diminish the topological signature of noise. In particular, persistent homology enables us to visualize the gradual separation of the topological fingerprints of cryo-EM structures from those of noise during the denoising process, which gives rise to a practical procedure for prescribing a noise threshold to extract cryo-EM structure information from noise-contaminated data after certain iterations of the geometric flow equation. To further demonstrate the utility of persistent homology for cryo-EM data analysis, we consider a microtubule intermediate structure Electron Microscopy Data (EMD 1129). Three helix models, an alpha-tubulin monomer model, an alpha-tubulin and beta-tubulin model, and an alpha-tubulin and beta-tubulin dimer model, are constructed to fit the cryo-EM data. The least square fitting leads to similarly high correlation coefficients, which indicates that structure determination via optimization is an ill-posed inverse problem. However, these models have dramatically different topological fingerprints. In particular, linkages or connectivities that discriminate one model from another play little role in the traditional density fitting or optimization but are very sensitive and crucial to topological fingerprints. The intrinsic topological features of the microtubule data are identified after topological denoising.
By a comparison of the topological fingerprints of the original data and those of three models, we found that the third model is topologically favored. The present work offers persistent homology based new strategies for topological denoising and for resolving ill-posed inverse problems.
Affiliation(s)
- Kelin Xia
  - Department of Mathematics, Michigan State University, MI 48824, USA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, MI 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
|
18
|
Xia K, Wei GW. Multidimensional persistence in biomolecular data. J Comput Chem 2015; 36:1502-20. [PMID: 26032339 PMCID: PMC4485576 DOI: 10.1002/jcc.23953] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Received: 12/24/2014] [Revised: 04/02/2015] [Accepted: 04/19/2015] [Indexed: 12/24/2022]
Abstract
Persistent homology has emerged as a popular technique for the topological simplification of big data, including biomolecular data. Multidimensional persistence bears considerable promise to bridge the gap between geometry and topology. However, its practical and robust construction has been a challenge. We introduce two families of multidimensional persistence, namely pseudomultidimensional persistence and multiscale multidimensional persistence. The former is generated via the repeated applications of persistent homology filtration to high-dimensional data, such as results from molecular dynamics or partial differential equations. The latter is constructed via isotropic and anisotropic scales that create new simplicial complexes and associated topological spaces. The utility, robustness, and efficiency of the proposed topological methods are demonstrated via protein folding, protein flexibility analysis, the topological denoising of cryoelectron microscopy data, and the scale dependence of nanoparticles. Topological transition between partial folded and unfolded proteins has been observed in multidimensional persistence. The separation between noise topological signatures and molecular topological fingerprints is achieved by the Laplace-Beltrami flow. The multiscale multidimensional persistent homology reveals relative local features in Betti-0 invariants and the relatively global characteristics of Betti-1 and Betti-2 invariants.
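For readers unfamiliar with the Betti-0 invariants mentioned in this abstract, the following is a minimal sketch of how a Betti-0 barcode can be computed for a point cloud under a distance-based filtration. It is not the authors' implementation: the function name `betti0_barcode`, the use of the full edge length (rather than half of it) as the filtration value, and the toy coordinates are all assumptions made for illustration.

```python
import itertools
import math


def betti0_barcode(points):
    """Return Betti-0 bars (birth, death) for a distance-based filtration.

    Assumption: every point is born at filtration value 0, and two
    connected components merge when the filtration value reaches the
    length of the shortest edge joining them.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )

    bars = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge here, so one Betti-0 bar dies.
            parent[root_i] = root_j
            bars.append((0.0, length))
    if n > 0:
        # One component survives for all filtration values.
        bars.append((0.0, math.inf))
    return bars


# Hypothetical toy data: three collinear points.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
bars = betti0_barcode(points)
```

Short bars correspond to points that merge with a close neighbor early in the filtration, which is one way to read the abstract's remark that Betti-0 reflects relatively local features.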
Affiliation(s)
- Kelin Xia
  - Department of Mathematics, Michigan State University, MI 48824, USA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, MI 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
|
19
|
Xia K, Wei GW. Persistent homology analysis of protein structure, flexibility, and folding. International Journal for Numerical Methods in Biomedical Engineering 2014; 30:814-44. [PMID: 24902720 PMCID: PMC4131872 DOI: 10.1002/cnm.2655] [Citation(s) in RCA: 115] [Impact Index Per Article: 10.5] [Received: 05/04/2014] [Revised: 05/19/2014] [Accepted: 05/21/2014] [Indexed: 05/04/2023]
Abstract
Proteins are the most important biomolecules for living organisms. The understanding of protein structure, function, dynamics, and transport is one of the most challenging tasks in biological science. In the present work, persistent homology is, for the first time, introduced for extracting molecular topological fingerprints (MTFs) based on the persistence of molecular topological invariants. MTFs are utilized for protein characterization, identification, and classification. The method of slicing is proposed to track the geometric origin of protein topological invariants. Both all-atom and coarse-grained representations of MTFs are constructed. A new cutoff-like filtration is proposed to shed light on the optimal cutoff distance in elastic network models. On the basis of the correlation between protein compactness, rigidity, and connectivity, we propose an accumulated bar length generated from persistent topological invariants for the quantitative modeling of protein flexibility. To this end, a correlation matrix-based filtration is developed. This approach gives rise to an accurate prediction of the optimal characteristic distance used in protein B-factor analysis. Finally, MTFs are employed to characterize protein topological evolution during protein folding and quantitatively predict the protein folding stability. Excellent consistency between our persistent homology prediction and molecular dynamics simulation is found. This work reveals the topology-function relationship of proteins.
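The "accumulated bar length" named in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition, so the version below simply assumes it is the sum of the lengths (death minus birth) of all finite bars in a barcode; the function name `accumulated_bar_length` and the example bars are hypothetical.

```python
import math


def accumulated_bar_length(bars):
    """Sum the lengths (death - birth) of all finite bars in a barcode.

    Assumption: bars that never die (death = infinity) are skipped,
    since their length is unbounded.
    """
    total = 0.0
    for birth, death in bars:
        if math.isinf(death):
            continue  # skip essential bars
        total += death - birth
    return total


# Hypothetical barcode: three finite bars and one essential bar.
example_bars = [(0.0, 1.5), (0.0, 2.0), (3.0, 4.25), (0.0, math.inf)]
total_length = accumulated_bar_length(example_bars)
```

Under this assumed definition, a barcode with many long-lived bars gives a larger total, which is the kind of scalar summary that could then be correlated with a physical quantity such as flexibility.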
Affiliation(s)
- Kelin Xia
  - Department of Mathematics, Michigan State University, MI 48824, USA
  - Center for Mathematical Molecular Biosciences, Michigan State University, MI 48824, USA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, MI 48824, USA
  - Center for Mathematical Molecular Biosciences, Michigan State University, MI 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
|