1
Hozumi Y, Tanemura KA, Wei GW. Preprocessing of Single Cell RNA Sequencing Data Using Correlated Clustering and Projection. J Chem Inf Model 2024; 64:2829-2838. [PMID: 37402705 PMCID: PMC11009150 DOI: 10.1021/acs.jcim.3c00674] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 07/06/2023]
Abstract
Single-cell RNA sequencing (scRNA-seq) is widely used to reveal heterogeneity in cells, which has given us insights into cell-cell communication, cell differentiation, and differential gene expression. However, analyzing scRNA-seq data is a challenge due to sparsity and the large number of genes involved. Therefore, dimensionality reduction and feature selection are important for removing spurious signals and enhancing the downstream analysis. We present Correlated Clustering and Projection (CCP), a new data-domain dimensionality reduction method, for the first time. CCP projects each cluster of similar genes into a supergene defined as the accumulated pairwise nonlinear gene-gene correlations among all cells. Using 14 benchmark data sets, we demonstrate that CCP has significant advantages over classical principal component analysis (PCA) for clustering and/or classification problems with intrinsically high dimensionality. In addition, we introduce the Residue-Similarity index (RSI) as a novel metric for clustering and classification and the R-S plot as a new visualization tool. We show that the RSI correlates with accuracy without requiring the knowledge of the true labels. The R-S plot provides a unique alternative to the uniform manifold approximation and projection (UMAP) and t-distributed stochastic neighbor embedding (t-SNE) for data with a large number of cell types.
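The supergene idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the gene partition is assumed to be given, and the generalized exponential kernel, the mean-distance scale `eta`, and the function name `supergenes` are all hypothetical choices made for illustration.

```python
import math

def supergenes(cells, gene_clusters, kappa=2.0):
    """Collapse each gene cluster into one supergene value per cell.

    cells: list of expression vectors, one per cell.
    gene_clusters: list of lists of gene (column) indices, assumed given.
    """
    n = len(cells)
    projected = [[0.0] * len(gene_clusters) for _ in range(n)]
    for c, genes in enumerate(gene_clusters):
        # Pairwise cell-cell distances using only the genes in this cluster.
        dist = [[math.sqrt(sum((cells[i][g] - cells[j][g]) ** 2 for g in genes))
                 for j in range(n)] for i in range(n)]
        pairs = [dist[i][j] for i in range(n) for j in range(i + 1, n)]
        mean = sum(pairs) / len(pairs) if pairs else 0.0
        eta = mean if mean > 0 else 1.0  # assumed scale: mean pairwise distance
        for i in range(n):
            # Accumulate the nonlinear correlations of cell i with all other cells.
            projected[i][c] = sum(math.exp(-((dist[i][j] / eta) ** kappa))
                                  for j in range(n) if j != i)
    return projected

# Toy data (made up): 3 cells, 3 genes, two gene clusters.
toy = [[0, 0, 1], [0, 0, 1], [5, 5, 1]]
reduced = supergenes(toy, [[0, 1], [2]])
```

In this toy run the two identical cells receive the same supergene value, and the outlying cell receives a smaller one, which is the behavior a correlation-accumulating projection should show.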
Affiliation(s)
- Yuta Hozumi
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Kiyoto Aramis Tanemura
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
2
Feng H, Cottrell S, Hozumi Y, Wei GW. Multiscale differential geometry learning of networks with applications to single-cell RNA sequencing data. Comput Biol Med 2024; 171:108211. [PMID: 38422960 PMCID: PMC10965033 DOI: 10.1016/j.compbiomed.2024.108211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Revised: 02/02/2024] [Accepted: 02/25/2024] [Indexed: 03/02/2024]
Abstract
Single-cell RNA sequencing (scRNA-seq) has emerged as a transformative technology, offering unparalleled insights into the intricate landscape of cellular diversity and gene expression dynamics. scRNA-seq analysis represents a challenging and cutting-edge frontier within the field of biological research. Differential geometry serves as a powerful mathematical tool in various applications of scientific research. In this study, we introduce, for the first time, a multiscale differential geometry (MDG) strategy for addressing the challenges encountered in scRNA-seq data analysis. We assume that intrinsic properties of cells lie on a family of low-dimensional manifolds embedded in the high-dimensional space of scRNA-seq data. Multiscale cell-cell interactive manifolds are constructed to reveal complex relationships in the cell-cell network, where curvature-based features for cells can decipher the intricate structural and biological information. We showcase the utility of our novel approach by demonstrating its effectiveness in classifying cell types. This innovative application of differential geometry in scRNA-seq analysis opens new avenues for understanding the intricacies of biological networks and holds great potential for network analysis in other fields.
Affiliation(s)
- Hongsong Feng
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
- Sean Cottrell
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
- Yuta Hozumi
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
3
Rana MM, Nguyen DD. Geometric graph learning with extended atom-types features for protein-ligand binding affinity prediction. Comput Biol Med 2023; 164:107250. [PMID: 37515872 DOI: 10.1016/j.compbiomed.2023.107250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Revised: 06/12/2023] [Accepted: 07/07/2023] [Indexed: 07/31/2023]
Abstract
Understanding and accurately predicting protein-ligand binding affinity are essential in the drug design and discovery process. At present, machine learning-based methodologies are gaining popularity as a means of predicting binding affinity due to their efficiency and accuracy, as well as the increasing availability of structural and binding affinity data for protein-ligand complexes. In biomolecular studies, graph theory has been widely applied since graphs can be used to model molecules or molecular complexes in a natural manner. In the present work, we upgrade the graph-based learners for the study of protein-ligand interactions by integrating extensive atom types such as SYBYL and extended connectivity interactive features (ECIF) into multiscale weighted colored graphs (MWCG). By pairing with the gradient boosting decision tree (GBDT) machine learning algorithm, our approach results in two different methods, namely sybylGGL-Score and ecifGGL-Score. Both of our models are extensively validated in their scoring power using three commonly used benchmark datasets in the drug design area, namely CASF-2007, CASF-2013, and CASF-2016. The performance of our best model sybylGGL-Score is compared with other state-of-the-art models in the binding affinity prediction for each benchmark. While both of our models achieve state-of-the-art results, the SYBYL atom-type model sybylGGL-Score outperforms other methods by a wide margin in all benchmarks. Finally, the best-performing SYBYL atom-type model is evaluated on two test sets that are independent of CASF benchmarks.
Affiliation(s)
- Md Masud Rana
  - Department of Mathematics, University of Kentucky, Lexington, 40506, KY, USA.
- Duc Duy Nguyen
  - Department of Mathematics, University of Kentucky, Lexington, 40506, KY, USA.
4
Zha J, Xia F. Developing Hybrid All-Atom and Ultra-Coarse-Grained Models to Investigate Taxol-Binding and Dynein Interactions on Microtubules. J Chem Theory Comput 2023; 19:5621-5632. [PMID: 37489636 DOI: 10.1021/acs.jctc.3c00275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/26/2023]
Abstract
Simulating the conformations and functions of biological macromolecules by using all-atom (AA) models is a challenging task due to expensive computational costs. One possible strategy to solve this problem is to develop hybrid all-atom and ultra-coarse-grained (AA/UCG) models of the biological macromolecules. In the AA/UCG scheme, the regions of interest are described by AA models, while the other regions are described in the UCG representation. In this study, we develop the hybrid AA/UCG models and apply them to investigate the conformational changes of microtubule-bound tubulins. The simulation results of the hybrid models elucidated why taxol molecules selectively bind microtubules but not tubulin dimers. In addition, we explore the interactions of the microtubules and dyneins. Our study shows that the hybrid AA/UCG model has great application potential in studying the function of complex biological systems.
Affiliation(s)
- Jinyin Zha
  - School of Chemistry and Molecular Engineering, NYU-ECNU Center for Computational Chemistry at NYU Shanghai, East China Normal University, Shanghai 200062, China
  - Medicinal Chemistry and Bioinformatics Center, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
- Fei Xia
  - School of Chemistry and Molecular Engineering, NYU-ECNU Center for Computational Chemistry at NYU Shanghai, East China Normal University, Shanghai 200062, China
5
Yao JF, Yang Y, Wang XC, Zhang XP. Systematic review of digital twin technology and applications. Vis Comput Ind Biomed Art 2023; 6:10. [PMID: 37249731 DOI: 10.1186/s42492-023-00137-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Accepted: 05/18/2023] [Indexed: 05/31/2023] Open
Abstract
As one of the most important applications of digitalization, intelligence, and service, the digital twin (DT) breaks through the constraints of time, space, cost, and security on physical entities, expands and optimizes the relevant functions of physical entities, and enhances their application value. This phenomenon has been widely studied in academia and industry. In this study, the concept and definition of DT, as utilized by scholars and researchers in various fields of industry, are summarized. The internal association between DT and related technologies is explained. The four stages of DT development history are identified. The fundamentals of the technology, evaluation indexes, and model frameworks are reviewed. Subsequently, a conceptual ternary model of DT based on time, space, and logic is proposed. The technology and application status of typical DT systems are described. Finally, the current technical challenges of DT technology are analyzed, and directions for future development are discussed.
Affiliation(s)
- Jun-Feng Yao
  - Center for Digital Media Computing, School of Film, Xiamen University, Xiamen 361005, China.
  - School of Informatics, Xiamen University, Xiamen 361005, China.
  - Key Laboratory of Digital Protection and Intelligent Processing of Intangible Cultural Heritage of Fujian and Taiwan, Ministry of Culture and Tourism, Xiamen 361005, China.
- Yong Yang
  - Center for Digital Media Computing, School of Film, Xiamen University, Xiamen 361005, China
- Xue-Cheng Wang
  - Center for Digital Media Computing, School of Film, Xiamen University, Xiamen 361005, China
- Xiao-Peng Zhang
  - State Key Laboratory of Multimodal Artificial Intelligence Systems, the Institute of Automation, Chinese Academy of Sciences, Beijing 101408, China
6
Rana MM, Nguyen DD. EISA-Score: Element Interactive Surface Area Score for Protein–Ligand Binding Affinity Prediction. J Chem Inf Model 2022; 62:4329-4341. [DOI: 10.1021/acs.jcim.2c00697] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Affiliation(s)
- Md Masud Rana
  - Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, United States
- Duc Duy Nguyen
  - Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, United States
7
Protein Function Analysis through Machine Learning. Biomolecules 2022; 12:biom12091246. [PMID: 36139085 PMCID: PMC9496392 DOI: 10.3390/biom12091246] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/16/2022] [Revised: 08/22/2022] [Accepted: 08/31/2022] [Indexed: 11/16/2022] Open
Abstract
Machine learning (ML) has been an important arsenal in computational biology used to elucidate protein function for decades. With the recent burgeoning of novel ML methods and applications, new ML approaches have been incorporated into many areas of computational biology dealing with protein function. We examine how ML has been integrated into a wide range of computational models to improve prediction accuracy and gain a better understanding of protein function. The applications discussed are protein structure prediction, protein engineering using sequence modifications to achieve stability and druggability characteristics, molecular docking in terms of protein–ligand binding, including allosteric effects, protein–protein interactions and protein-centric drug discovery. To quantify the mechanisms underlying protein function, a holistic approach that takes structure, flexibility, stability, and dynamics into account is required, as these aspects become inseparable through their interdependence. Another key component of protein function is conformational dynamics, which often manifest as protein kinetics. Computational methods that use ML to generate representative conformational ensembles and quantify differences in conformational ensembles important for function are included in this review. Future opportunities are highlighted for each of these topics.
8
Gao K, Wang R, Chen J, Cheng L, Frishcosy J, Huzumi Y, Qiu Y, Schluckbier T, Wei X, Wei GW. Methodology-Centered Review of Molecular Modeling, Simulation, and Prediction of SARS-CoV-2. Chem Rev 2022; 122:11287-11368. [PMID: 35594413 PMCID: PMC9159519 DOI: 10.1021/acs.chemrev.1c00965] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Indexed: 02/06/2023]
Abstract
Despite tremendous efforts in the past two years, our understanding of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), virus-host interactions, immune response, virulence, transmission, and evolution is still very limited. This limitation calls for further in-depth investigation. Computational studies have become an indispensable component in combating coronavirus disease 2019 (COVID-19) due to their low cost, their efficiency, and the fact that they are free from safety and ethical constraints. Additionally, the mechanism that governs the global evolution and transmission of SARS-CoV-2 cannot be revealed from individual experiments and was discovered by integrating genotyping of massive viral sequences, biophysical modeling of protein-protein interactions, deep mutational data, deep learning, and advanced mathematics. There exists a tsunami of literature on the molecular modeling, simulations, and predictions of SARS-CoV-2 and related developments of drugs, vaccines, antibodies, and diagnostics. To provide readers with a quick update about this literature, we present a comprehensive and systematic methodology-centered review. Aspects such as molecular biophysics, bioinformatics, cheminformatics, machine learning, and mathematics are discussed. This review will be beneficial to researchers who are looking for ways to contribute to SARS-CoV-2 studies and those who are interested in the status of the field.
Affiliation(s)
- Kaifu Gao
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Rui Wang
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Jiahui Chen
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Limei Cheng
  - Clinical Pharmacology and Pharmacometrics, Bristol Myers Squibb, Princeton, New Jersey 08536, United States
- Jaclyn Frishcosy
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Yuta Huzumi
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Yuchi Qiu
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Tom Schluckbier
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Xiaoqi Wei
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
9
Rial R, González-Durruthy M, Liu Z, Ruso JM. Conformational binding mechanism of lysozyme induced by interactions with penicillin antibiotic drugs. J Mol Liq 2022. [DOI: 10.1016/j.molliq.2022.119081] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/26/2022]
10
Alfarraj A, Wei GW. Geometric algebra generation of molecular surfaces. J R Soc Interface 2022; 19:20220117. [PMID: 35414214 PMCID: PMC9006026 DOI: 10.1098/rsif.2022.0117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
Geometric algebra is a powerful framework that unifies mathematics and physics. Since its revival in the 1960s, it has attracted great attention and has been exploited in fields like physics, computer science and engineering. This work introduces a geometric algebra method for the molecular surface generation that uses the Clifford-Fourier transform (CFT) which is a generalization of the classical Fourier transform. Notably, the classical Fourier transform and CFT differ in the derivative property in [Formula: see text] for k even. This distinction is due to the non-commutativity of geometric product of pseudoscalars with multivectors and has significant consequences in applications. We use the CFT in [Formula: see text] to benefit from the derivative property in solving partial differential equations (PDEs). The CFT is used to solve the mode decomposition process in PDE transform. Two different initial cases are proposed to make the initial shapes in the present method. The proposed method is applied first to small molecules and proteins. To validate the method, the molecular surfaces generated are compared to surfaces of other definitions. Applications are considered to protein electrostatic surface potentials and solvation free energy. This work opens the door for further applications of geometric algebra and CFT in biological sciences.
Affiliation(s)
- Azzam Alfarraj
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
  - Department of Mathematics, King Fahd University of Petroleum and Minerals, Dhahran 31261, KSA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
11
Deng X, Wang S, Han Z, Gong W, Liu Y, Li C. Dynamics of binding interactions of TDP-43 and RNA: An equally weighted multiscale elastic network model study. Proteins 2021; 90:589-600. [PMID: 34599611 DOI: 10.1002/prot.26255] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/03/2021] [Revised: 09/15/2021] [Accepted: 09/21/2021] [Indexed: 01/03/2023]
Abstract
Transactive response DNA binding protein 43 (TDP-43), an alternative-splicing regulator, can specifically bind long UG-rich RNAs, associated with a range of neurodegenerative diseases. Upon binding RNA, TDP-43 undergoes a large conformational change in which two RNA recognition motifs (RRMs) connected by a long linker are rearranged, strengthening the binding affinity of TDP-43 with RNA. We extend the equally weighted multiscale elastic network model (ewmENM), including its Gaussian network model (ewmGNM) and anisotropic network model (ewmANM), with the multiscale effect of interactions considered, to the characterization of the dynamics of binding interactions of TDP-43 and RNA. The results reveal that, upon RNA binding, flexibility is lost in TDP-43's loop3 segments, which are rich in positively charged residues, and in its highly flexible C-terminal, suggesting that these regions play RNA-anchoring, induced-fit, and conformational-adjustment roles in recognizing RNA. Additionally, based on movement coupling analyses, it is found that RNA binding strengthens the interactions among intra-RRM β-sheets and between RRMs, partially through the linker's mediating role, which stabilizes the RNA binding interface and improves RNA binding efficiency. In addition, utilizing our proposed thermodynamic cycle method combined with ewmGNM, we identify the key residues for RNA binding whose perturbations induce a large change in binding free energy. We identify not only the residues important for specific binding, but also the ones critical for the conformational rearrangement between RRMs. Furthermore, molecular dynamics simulations are also performed to validate and further interpret the ENM-based results. The study demonstrates that ewmENM is a useful avenue for investigating the dynamics of protein-RNA interactions.
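For readers unfamiliar with Gaussian network models, the sketch below builds the connectivity (Kirchhoff) matrix that a classical single-scale GNM analysis starts from. It is only an illustration: the 7 Å cutoff, the toy coordinates, and the function name `kirchhoff_matrix` are assumptions, and the abstract does not specify how the equally weighted multiscale variant combines scales, so that step is not reproduced here.

```python
import math

def kirchhoff_matrix(coords, cutoff=7.0):
    """Classical GNM connectivity matrix.

    Off-diagonal entries are -1 for residue pairs within `cutoff`,
    and each diagonal entry is the number of contacts of that residue.
    """
    n = len(coords)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1.0
                gamma[i][i] += 1.0
                gamma[j][j] += 1.0
    return gamma

# Toy C-alpha trace (made-up coordinates, in angstroms).
trace = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
gamma = kirchhoff_matrix(trace)
```

Every row of the matrix sums to zero by construction. In a full GNM analysis, residue fluctuations are then estimated from the diagonal of the pseudo-inverse of this matrix, which is omitted here to keep the sketch dependency-free.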
Affiliation(s)
- Xueqing Deng
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China
- Shihao Wang
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China
- Zhongjie Han
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China
- Weikang Gong
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China
- Yang Liu
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China
- Chunhua Li
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China
12
Szocinski T, Nguyen DD, Wei GW. AweGNN: Auto-parametrized weighted element-specific graph neural networks for molecules. Comput Biol Med 2021; 134:104460. [PMID: 34020133 DOI: 10.1016/j.compbiomed.2021.104460] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/11/2021] [Revised: 04/23/2021] [Accepted: 04/26/2021] [Indexed: 11/29/2022]
Abstract
While automated feature extraction has had tremendous success in many deep learning algorithms for image analysis and natural language processing, it does not work well for data involving complex internal structures, such as molecules. Data representations via advanced mathematics, including algebraic topology, differential geometry, and graph theory, have demonstrated superiority in a variety of biomolecular applications; however, their performance is often dependent on manual parametrization. This work introduces the auto-parametrized weighted element-specific graph neural network, dubbed AweGNN, to overcome the obstacle of this tedious parametrization process while also being a suitable technique for automated feature extraction on these internally complex biomolecular data sets. The AweGNN is a neural network model based on geometric-graph features of element-pair interactions, with its graph parameters being updated throughout the training, which results in what we call a network-enabled automatic representation (NEAR). To enhance the predictions with small data sets, we construct multi-task (MT) AweGNN models in addition to single-task (ST) AweGNN models. The proposed methods are applied to various benchmark data sets, including four data sets for quantitative toxicity analysis and another data set for solvation prediction. Extensive numerical tests show that AweGNN models can achieve state-of-the-art performance in molecular property predictions.
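To make "geometric-graph features of element-pair interactions" concrete, here is a minimal sketch of one way such features could be computed. The kernel form and the parameters `eta` and `kappa` are assumptions chosen for illustration, not the published AweGNN definition; in the abstract's terms, parameters of this kind would be tuned during training instead of being set by hand.

```python
import math
from collections import defaultdict

def element_pair_features(atoms, eta=1.0, kappa=2.0):
    """Sum a distance kernel over all atom pairs, grouped by element pair.

    atoms: list of (element, x, y, z) tuples.
    Returns a dict mapping a sorted element pair, e.g. ("C", "O"), to a float.
    """
    feats = defaultdict(float)
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            elem_i, *pos_i = atoms[i]
            elem_j, *pos_j = atoms[j]
            r = math.dist(pos_i, pos_j)
            # Assumed generalized exponential kernel: decays with distance.
            feats[tuple(sorted((elem_i, elem_j)))] += math.exp(-((r / eta) ** kappa))
    return dict(feats)

# Toy molecule (made-up coordinates).
toy_atoms = [("C", 0.0, 0.0, 0.0), ("O", 1.0, 0.0, 0.0), ("C", 0.0, 2.0, 0.0)]
features = element_pair_features(toy_atoms)
```

Grouping by element pair gives a fixed-length description regardless of molecule size, which is what makes such features usable as input to a standard neural network.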
Affiliation(s)
- Timothy Szocinski
  - Department of Mathematics, Michigan State University, MI, 48824, USA
- Duc Duy Nguyen
  - Department of Mathematics, University of Kentucky, KY, 40506, USA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, MI, 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, MI, 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, MI, 48824, USA
13
Abstract
In the global health emergency caused by coronavirus disease 2019 (COVID-19), efficient and specific therapies are urgently needed. Compared with traditional small-molecular drugs, antibody therapies are relatively easy to develop; they are as specific as vaccines in targeting severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2); and they have thus attracted much attention in the past few months. This article reviews seven existing antibodies for neutralizing SARS-CoV-2 with 3D structures deposited in the Protein Data Bank (PDB). Five 3D antibody structures associated with the SARS-CoV spike (S) protein are also evaluated for their potential in neutralizing SARS-CoV-2. The interactions of these antibodies with the S protein receptor-binding domain (RBD) are compared with those between angiotensin-converting enzyme 2 and RBD complexes. Due to the orders of magnitude in the discrepancies of experimental binding affinities, we introduce topological data analysis, a variety of network models, and deep learning to analyze the binding strength and therapeutic potential of the 14 antibody-antigen complexes. The current COVID-19 antibody clinical trials, which are not limited to the S protein target, are also reviewed.
Affiliation(s)
- Jiahui Chen
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
- Kaifu Gao
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
- Rui Wang
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
- Duc Duy Nguyen
  - Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, USA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, USA
14
Wang R, Chen J, Gao K, Hozumi Y, Yin C, Wei GW. Analysis of SARS-CoV-2 mutations in the United States suggests presence of four substrains and novel variants. Commun Biol 2021; 4:228. [PMID: 33589648 PMCID: PMC7884689 DOI: 10.1038/s42003-021-01754-6] [Citation(s) in RCA: 91] [Impact Index Per Article: 30.3] [Received: 07/27/2020] [Accepted: 11/13/2020] [Indexed: 02/07/2023] Open
Abstract
SARS-CoV-2 has been mutating since it was first sequenced in early January 2020. Here, we analyze 45,494 complete SARS-CoV-2 genome sequences in the world to understand their mutations. Among them, 12,754 sequences are from the United States. Our analysis suggests the presence of four substrains and eleven top mutations in the United States. These eleven top mutations belong to 3 disconnected groups. The first and second groups consisting of 5 and 8 concurrent mutations are prevailing, while the other group with three concurrent mutations gradually fades out. Moreover, we reveal that female immune systems are more active than those of males in responding to SARS-CoV-2 infections. One of the top mutations, 27964C > T-(S24L) on ORF8, has an unusually strong gender dependence. Based on the analysis of all mutations on the spike protein, we uncover that two of four SARS-CoV-2 substrains in the United States become potentially more infectious.
Affiliation(s)
- Rui Wang
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824 USA (grid.17088.36; 0000 0001 2150 1785)
- Jiahui Chen
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824 USA (grid.17088.36; 0000 0001 2150 1785)
- Kaifu Gao
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824 USA (grid.17088.36; 0000 0001 2150 1785)
- Yuta Hozumi
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824 USA (grid.17088.36; 0000 0001 2150 1785)
- Changchuan Yin
  - Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago, Chicago, IL 60607 USA (grid.185648.6; 0000 0001 2175 0319)
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824 USA (grid.17088.36; 0000 0001 2150 1785)
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824 USA (grid.17088.36; 0000 0001 2150 1785)
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824 USA (grid.17088.36; 0000 0001 2150 1785)
15
Zha J, Zhang Y, Xia K, Gräter F, Xia F. Coarse-Grained Simulation of Mechanical Properties of Single Microtubules With Micrometer Length. Front Mol Biosci 2021; 7:632122. [PMID: 33659274 PMCID: PMC7917235 DOI: 10.3389/fmolb.2020.632122] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/22/2020] [Accepted: 12/30/2020] [Indexed: 01/03/2023] Open
Abstract
Microtubules are one of the most important components in the cytoskeleton and play a vital role in maintaining the shape and function of cells. Because single microtubules are some micrometers long, it is difficult to simulate such a large system using an all-atom model. In this work, we use the newly developed convolutional and K-means coarse-graining (CK-CG) method to establish an ultra-coarse-grained (UCG) model of a single microtubule, on the basis of the low electron microscopy density data of microtubules. We discuss the rationale of the micro-coarse-grained microtubule models of different resolutions and explore microtubule models up to 12-micron length. We use the devised microtubule model to quantify mechanical properties of microtubules of different lengths. Our model allows mesoscopic simulations of micrometer-level biomaterials and can be further used to study important biological processes related to microtubule function.
Collapse
Affiliation(s)
- Jinyin Zha
- School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, China
| | - Yuwei Zhang
- School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, China
| | - Kelin Xia
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore, Singapore.,School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
| | - Frauke Gräter
- Interdisciplinary Centre for Scientific Computing (IWR), Heidelberg University, Heidelberg, Germany.,Heidelberg Institute for Theoretical Studies (HITS), Schloß-Wolfsbrunnenweg 35, Heidelberg, Germany.,Max Planck School Matter to Life, Jahnstraße 29, Heidelberg, Germany
| | - Fei Xia
- School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, China.,Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, China
| |
|
16
|
Wang R, Chen J, Hozumi Y, Yin C, Wei GW. Decoding Asymptomatic COVID-19 Infection and Transmission. J Phys Chem Lett 2020; 11:10007-10015. [PMID: 33179934 PMCID: PMC8150094 DOI: 10.1021/acs.jpclett.0c02765] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Indexed: 05/24/2023]
Abstract
One of the major challenges in controlling the coronavirus disease 2019 (COVID-19) outbreak is its asymptomatic transmission. The pathogenicity and virulence of asymptomatic COVID-19 remain mysterious. On the basis of the genotyping of 75775 SARS-CoV-2 genome isolates, we reveal that asymptomatic infection is linked to SARS-CoV-2 11083G>T mutation (i.e., L37F at nonstructure protein 6 (NSP6)). By analyzing the distribution of 11083G>T in various countries, we unveil that 11083G>T may correlate with the hypotoxicity of SARS-CoV-2. Moreover, we show a global decaying tendency of the 11083G>T mutation ratio indicating that 11083G>T hinders the SARS-CoV-2 transmission capacity. Artificial intelligence, sequence alignment, and network analysis are applied to show that NSP6 mutation L37F may have compromised the virus's ability to undermine the innate cellular defense against viral infection via autophagy regulation. This assessment is in good agreement with our genotyping of the SARS-CoV-2 evolution and transmission across various countries and regions over the past few months.
Affiliation(s)
| | | | | | - Changchuan Yin
- Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago, Chicago, Illinois 60607, United States
| | | |
|
17
|
Wang S, Gong W, Deng X, Liu Y, Li C. Exploring the dynamics of RNA molecules with multiscale Gaussian network model. Chem Phys 2020. [DOI: 10.1016/j.chemphys.2020.110820] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/18/2022]
|
18
|
Wang R, Nguyen DD, Wei GW. Persistent spectral graph. Int J Numer Method Biomed Eng 2020; 36:e3376. [PMID: 32515170 PMCID: PMC7719081 DOI: 10.1002/cnm.3376] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 03/29/2020] [Revised: 05/15/2020] [Accepted: 05/31/2020] [Indexed: 05/25/2023]
Abstract
Persistent homology is constrained to purely topological persistence, while multiscale graphs account only for geometric information. This work introduces persistent spectral theory to create a unified low-dimensional multiscale paradigm for revealing topological persistence and extracting geometric shapes from high-dimensional datasets. For a point-cloud dataset, a filtration procedure is used to generate a sequence of chain complexes and associated families of simplicial complexes and chains, from which we construct persistent combinatorial Laplacian matrices. We show that a full set of topological persistence can be completely recovered from the harmonic persistent spectra, that is, the spectra that have zero eigenvalues, of the persistent combinatorial Laplacian matrices. However, non-harmonic spectra of the Laplacian matrices induced by the filtration offer another powerful tool for data analysis, modeling, and prediction. In this work, fullerene stability is predicted by using both harmonic spectra and non-harmonic persistent spectra, while the latter spectra are successfully devised to analyze the structure of fullerenes and model protein flexibility, which cannot be straightforwardly extracted from the current persistent homology. The proposed method is found to provide excellent predictions of the protein B-factors for which current popular biophysical models break down.
Affiliation(s)
- Rui Wang
- Department of Mathematics, Michigan State University, MI 48824, USA
| | - Duc Duy Nguyen
- Department of Mathematics, Michigan State University, MI 48824, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, MI 48824, USA
- Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
- Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
| |
|
19
|
Pun CS, Yong BYS, Xia K. Weighted-persistent-homology-based machine learning for RNA flexibility analysis. PLoS One 2020; 15:e0237747. [PMID: 32822369 PMCID: PMC7446851 DOI: 10.1371/journal.pone.0237747] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/03/2020] [Accepted: 08/01/2020] [Indexed: 12/22/2022] Open
Abstract
Given the great significance of biomolecular flexibility in biomolecular dynamics and functional analysis, various experimental and theoretical models have been developed. Experimentally, the Debye-Waller factor, also known as the B-factor, measures atomic mean-square displacement and is usually considered an important measure of flexibility. Theoretically, elastic network models, the Gaussian network model, the flexibility-rigidity model, and other computational models have been proposed for flexibility analysis by shedding light on the inner topological structures of biomolecules. Recently, a topology-based machine learning model has been proposed. By using features from persistent homology, this model achieves a remarkably high Pearson correlation coefficient (PCC) in protein B-factor prediction. Motivated by its success, we propose weighted-persistent-homology (WPH)-based machine learning (WPHML) models for RNA flexibility analysis. Our WPH is a newly proposed model, which incorporates physical, chemical, and biological information into topological measurements using a weight function. In particular, we use local persistent homology (LPH) to focus on the topological information of local regions. Our WPHML model is validated on a well-established RNA dataset, and numerical experiments show that our model can achieve a PCC of up to 0.5822. The comparison with previous sequence-information-based learning models shows that our current model achieves a consistent improvement in performance of at least 10%.
Affiliation(s)
- Chi Seng Pun
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore, Singapore
- * E-mail: (CSP); (KX)
| | - Brandon Yung Sin Yong
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore, Singapore
| | - Kelin Xia
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore, Singapore
- School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- * E-mail: (CSP); (KX)
| |
|
20
|
Wang R, Chen J, Gao K, Hozumi Y, Yin C, Wei GW. Characterizing SARS-CoV-2 mutations in the United States. Research Square 2020:rs.3.rs-49671. [PMID: 32818213 PMCID: PMC7430589 DOI: 10.21203/rs.3.rs-49671/v1] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Indexed: 12/28/2022]
Abstract
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been mutating since it was first sequenced in early January 2020. The genetic variants have developed into a few distinct clusters with different properties. Since the United States (US) has the highest number of virus-infected patients globally, it is essential to understand the US SARS-CoV-2. Using genotyping, sequence-alignment, time-evolution, k-means clustering, protein-folding stability, algebraic topology, and network theory, we reveal that the US SARS-CoV-2 has four substrains, and that the five top US SARS-CoV-2 mutations were first detected in China (2 cases), Singapore (2 cases), and the United Kingdom (1 case). The next three top US SARS-CoV-2 mutations were first detected in the US. These eight top mutations belong to two disconnected groups. The first group, consisting of five concurrent mutations, is prevailing, while the other group, with three concurrent mutations, gradually fades out. We identify that one of the top mutations, 27964C>T-(S24L) on ORF8, has an unusually strong gender dependence. Based on the analysis of all mutations on the spike protein, we further uncover that three of the four US SARS-CoV-2 substrains have become more infectious. Our study calls for effective viral control and containment strategies in the US.
Affiliation(s)
- Rui Wang
- Department of Mathematics, Michigan State University, MI 48824, USA
| | - Jiahui Chen
- Department of Mathematics, Michigan State University, MI 48824, USA
| | - Kaifu Gao
- Department of Mathematics, Michigan State University, MI 48824, USA
| | - Yuta Hozumi
- Department of Mathematics, Michigan State University, MI 48824, USA
| | - Changchuan Yin
- Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago, Chicago, IL 60607, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, MI 48824, USA
- Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
- Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
| |
|
21
|
Zhao R, Wang M, Chen J, Tong Y, Wei GW. The de Rham-Hodge Analysis and Modeling of Biomolecules. Bull Math Biol 2020; 82:108. [PMID: 32770408 PMCID: PMC8137271 DOI: 10.1007/s11538-020-00783-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/27/2019] [Accepted: 07/20/2020] [Indexed: 12/18/2022]
Abstract
Biological macromolecules have intricate structures that underpin their biological functions. Understanding their structure-function relationships remains a challenge due to their structural complexity and functional variability. Although de Rham-Hodge theory, a landmark of twentieth-century mathematics, has had a tremendous impact on mathematics and physics, it has not been devised for macromolecular modeling and analysis. In this work, we introduce de Rham-Hodge theory as a unified paradigm for analyzing the geometry, topology, flexibility, and Hodge mode analysis of biological macromolecules. Geometric characteristics and topological invariants are obtained either from the Helmholtz-Hodge decomposition of the scalar, vector, and/or tensor fields of a macromolecule or from the spectral analysis of various Laplace-de Rham operators defined on the molecular manifolds. We propose Laplace-de Rham spectral-based models for predicting macromolecular flexibility. We further construct a Laplace-de Rham-Helfrich operator for revealing cryo-EM natural frequencies. Extensive experiments are carried out to demonstrate that the proposed de Rham-Hodge paradigm is one of the most versatile tools for the multiscale modeling and analysis of biological macromolecules and subcellular organelles. Accurate, reliable, and topological structure-preserving algorithms for implementing discrete exterior calculus (DEC) have been developed to facilitate the aforementioned modeling and analysis of biological macromolecules. The proposed de Rham-Hodge paradigm has potential applications to subcellular organelles and the structure construction from medium- or low-resolution cryo-EM maps, and functional predictions from massive biomolecular datasets.
Affiliation(s)
- Rundong Zhao
- Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA
| | - Menglun Wang
- Department of Mathematics, Michigan State University, East Lansing, MI, 48824, USA
| | - Jiahui Chen
- Department of Mathematics, Michigan State University, East Lansing, MI, 48824, USA
| | - Yiying Tong
- Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA.
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, MI, 48824, USA.
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI, 48824, USA.
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA.
| |
|
22
|
Abstract
Recently, machine learning (ML) has established itself in various worldwide benchmarking competitions in computational biology, including Critical Assessment of Structure Prediction (CASP) and Drug Design Data Resource (D3R) Grand Challenges. However, the intricate structural complexity and high ML dimensionality of biomolecular datasets obstruct the efficient application of ML algorithms in the field. In addition to data and algorithm, an efficient ML machinery for biomolecular predictions must include structural representation as an indispensable component. Mathematical representations that simplify the biomolecular structural complexity and reduce ML dimensionality have emerged as a prime winner in D3R Grand Challenges. This review is devoted to the recent advances in developing low-dimensional and scalable mathematical representations of biomolecules in our laboratory. We discuss three classes of mathematical approaches, including algebraic topology, differential geometry, and graph theory. We elucidate how the physical and biological challenges have guided the evolution and development of these mathematical apparatuses for massive and diverse biomolecular data. We focus the performance analysis on protein-ligand binding predictions in this review although these methods have had tremendous success in many other applications, such as protein classification, virtual screening, and the predictions of solubility, solvation free energies, toxicity, partition coefficients, protein folding stability changes upon mutation, etc.
Affiliation(s)
- Duc Duy Nguyen
- Department of Mathematics, Michigan State University, MI 48824, USA.
| | - Zixuan Cang
- Department of Mathematics, Michigan State University, MI 48824, USA.
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, MI 48824, USA; Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA; Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
| |
|
23
|
Nguyen DD, Gao K, Wang M, Wei GW. MathDL: mathematical deep learning for D3R Grand Challenge 4. J Comput Aided Mol Des 2020; 34:131-147. [PMID: 31734815 PMCID: PMC7376411 DOI: 10.1007/s10822-019-00237-5] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 06/10/2019] [Accepted: 10/14/2019] [Indexed: 12/17/2022]
Abstract
We present the performance of our mathematical deep learning (MathDL) models for D3R Grand Challenge 4 (GC4). This challenge involves pose prediction, affinity ranking, and free energy estimation for beta secretase 1 (BACE) as well as affinity ranking and free energy estimation for Cathepsin S (CatS). We have developed advanced mathematics, namely differential geometry, algebraic graph, and/or algebraic topology, to accurately and efficiently encode high-dimensional physical/chemical interactions into scalable low-dimensional rotational and translational invariant representations. These representations are integrated with deep learning models, such as generative adversarial networks (GAN) and convolutional neural networks (CNN), for pose prediction and energy evaluation, respectively. Overall, our MathDL models achieved the top place in pose prediction for BACE ligands in Stage 1a. Moreover, our submissions obtained the highest Spearman correlation coefficient on the affinity ranking of 460 CatS compounds, and the smallest centered root mean square error on the free energy set of 39 CatS molecules. It is worth mentioning that our docking pose prediction method has improved significantly over our previous ones.
Affiliation(s)
- Duc Duy Nguyen
- Department of Mathematics, Michigan State University, East Lansing, MI, 48824, USA
| | - Kaifu Gao
- Department of Mathematics, Michigan State University, East Lansing, MI, 48824, USA
| | - Menglun Wang
- Department of Mathematics, Michigan State University, East Lansing, MI, 48824, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, MI, 48824, USA.
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA.
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI, 48824, USA.
| |
|
24
|
Steinberg L, Russo J, Frey J. A new topological descriptor for water network structure. J Cheminform 2019; 11:48. [PMID: 31292766 PMCID: PMC6617667 DOI: 10.1186/s13321-019-0369-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 11/01/2018] [Accepted: 07/02/2019] [Indexed: 11/10/2022] Open
Abstract
Bulk water molecular dynamics simulations based on a series of atomistic water potentials (TIP3P, TIP4P/Ew, SPC/E and OPC) are compared using new techniques from the field of topological data analysis. The topological invariants (the different degrees of homology) derived from each simulation frame are used to create a series of persistence diagrams from the atomic positions. These are averaged over the simulation time using the persistence image formalism, before being normalised by their total magnitude (the L1 norm) to ensure a size independent descriptor (L1NPI). We demonstrate that the L1NPI formalism is suitable for the analysis of systems where the number of molecules varies by at least a factor of 10. Using standard machine learning techniques, a basic linear SVM, it is shown that differences in water models are able to be isolated to different degrees of homology. In particular, whereas first degree homology is able to distinguish between all atomistic potentials studied, OPC is the only potential that differs in its second degree homology. The L1 normalised persistence images are then used in the comparison of a series of Stillinger-Weber potential simulations to the atomistic potentials and the effects of changing the strength of three-body interactions on the structures is easily evident in L1NPI space, with a reduction in variance of structures as interaction strength increases being the most obvious result. Furthermore, there is a clear tracking in L1NPI space of the λ parameter. The L1NPI formalism presents a useful new technique for the analysis of water and other materials. It is approximately size-independent, and has been shown to contain information as to real structures in the system. We finally present a perspective on the use of L1NPIs and other persistent homology techniques as a descriptor for water solubility.
Affiliation(s)
- Lee Steinberg
- School of Chemistry, University of Southampton, Southampton, SO17 1BJ UK
| | - John Russo
- School of Mathematics, University of Bristol, Bristol, UK
| | - Jeremy Frey
- School of Chemistry, University of Southampton, Southampton, SO17 1BJ UK
| |
|
25
|
Zhang Y, Xia K, Cao Z, Gräter F, Xia F. A new method for the construction of coarse-grained models of large biomolecules from low-resolution cryo-electron microscopy data. Phys Chem Chem Phys 2019; 21:9720-9727. [PMID: 31025999 DOI: 10.1039/c9cp01370a] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 01/20/2023]
Abstract
The rapid development of cryo-electron microscopy (cryo-EM) has led to the generation of a significant amount of low-resolution electron density data of biomolecules. However, the atomistic details of huge biomolecules usually cannot be obtained because it is very difficult to construct all-atom models for MD simulations. Thus, it is still a challenge to make use of the rich low-resolution cryo-EM data for computer simulation and functional study. In this study, we proposed a new method called Convolutional and K-means Coarse-Graining (CK-CG) for the efficient coarse-graining of large biological systems. Using the CK-CG method, we could directly map the cryo-EM data into coarse-grained (CG) beads. Furthermore, the CG beads were parameterized with an empirical harmonic potential to construct a new CG model. We subjected the CK-CG models of the fibrillar protein assemblies F-actin and collagen to external forces in pulling dynamic simulations to assess their mechanical response. The agreement in the estimated tensile stiffness between CG models and experiments demonstrates the validity of the CK-CG method. Thus, our method provides a practical strategy for the direct construction of a structural model from low-resolution data for biological function studies.
Affiliation(s)
- Yuwei Zhang
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China.
| | | | | | | | | |
|
26
|
Nguyen DD, Wei GW. DG-GL: Differential geometry-based geometric learning of molecular datasets. Int J Numer Method Biomed Eng 2019; 35:e3179. [PMID: 30693661 PMCID: PMC6598676 DOI: 10.1002/cnm.3179] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Received: 09/25/2018] [Revised: 11/21/2018] [Accepted: 12/06/2018] [Indexed: 05/11/2023]
Abstract
MOTIVATION: Despite its great success in various physical modeling, differential geometry (DG) has rarely been devised as a versatile tool for analyzing large, diverse, and complex molecular and biomolecular datasets because of the limited understanding of its potential power in dimensionality reduction and its ability to encode essential chemical and biological information in differentiable manifolds. RESULTS: We put forward a differential geometry-based geometric learning (DG-GL) hypothesis that the intrinsic physics of three-dimensional (3D) molecular structures lies on a family of low-dimensional manifolds embedded in a high-dimensional data space. We encode crucial chemical, physical, and biological information into 2D element interactive manifolds, extracted from a high-dimensional structural data space via a multiscale discrete-to-continuum mapping using differentiable density estimators. Differential geometry apparatuses are utilized to construct element interactive curvatures in analytical forms for certain analytically differentiable density estimators. These low-dimensional differential geometry representations are paired with a robust machine learning algorithm to showcase their descriptive and predictive powers for large, diverse, and complex molecular and biomolecular datasets. Extensive numerical experiments are carried out to demonstrate that the proposed DG-GL strategy outperforms other advanced methods in the predictions of drug discovery-related protein-ligand binding affinity, drug toxicity, and molecular solvation free energy. AVAILABILITY AND IMPLEMENTATION: http://weilab.math.msu.edu/DG-GL/ CONTACT: wei@math.msu.edu.
Affiliation(s)
- Duc Duy Nguyen
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824
- Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
| |
|
27
|
Bramer D, Wei GW. Multiscale weighted colored graphs for protein flexibility and rigidity analysis. J Chem Phys 2018; 148:054103. [PMID: 29421884 DOI: 10.1063/1.5016562] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Indexed: 01/01/2023] Open
Abstract
Protein structural fluctuation, measured by Debye-Waller factors or B-factors, is known to correlate to protein flexibility and function. A variety of methods has been developed for protein Debye-Waller factor prediction and related applications to domain separation, docking pose ranking, entropy calculation, hinge detection, stability analysis, etc. Nevertheless, none of the current methodologies are able to deliver an accuracy of 0.7 in terms of the Pearson correlation coefficients averaged over a large set of proteins. In this work, we introduce a paradigm-shifting geometric graph model, multiscale weighted colored graph (MWCG), to provide a new generation of computational algorithms to significantly change the current status of protein structural fluctuation analysis. Our MWCG model divides a protein graph into multiple subgraphs based on interaction types between graph nodes and represents the protein rigidity by generalized centralities of subgraphs. MWCGs not only predict the B-factors of protein residues but also accurately analyze the flexibility of all atoms in a protein. The MWCG model is validated over a number of protein test sets and compared with many standard methods. An extensive numerical study indicates that the proposed MWCG offers an accuracy of over 0.8 and thus provides perhaps the first reliable method for estimating protein flexibility and B-factors. It also simultaneously predicts all-atom flexibility in a molecule.
Affiliation(s)
- David Bramer
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
| |
|
28
|
Xia K. Sequence-based multiscale modeling for high-throughput chromosome conformation capture (Hi-C) data analysis. PLoS One 2018; 13:e0191899. [PMID: 29408904 PMCID: PMC5800693 DOI: 10.1371/journal.pone.0191899] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 07/18/2017] [Accepted: 01/12/2018] [Indexed: 11/18/2022] Open
Abstract
In this paper, we introduce sequence-based multiscale modeling for biomolecular data analysis. We employ a spectral clustering method in our modeling and reveal the difference between sequence-based global scale clustering and local scale clustering. Essentially, two types of distances, i.e., Euclidean (or spatial) distance and genomic (or sequential) distance, can be used in data clustering. Clusters from sequence-based global scale models optimize spatial distances, meaning spatially adjacent loci are more likely to be assigned to the same cluster. Sequence-based local scale models, on the other hand, result in clusters that optimize genomic distances. That is to say, in these models, sequentially adjoining loci tend to be clustered together. We propose two sequence-based multiscale models (SeqMMs) for the study of chromosome hierarchical structures, including genomic compartments and topological associated domains (TADs). We find that genomic compartments are determined only by global scale information in the Hi-C data. The removal of all the local interactions within a band region as large as 10 Mb in genomic distance has almost no significant influence on the final compartment results. Further, in TAD analysis, we find that when the sequential scale is small, a tiny variation of the diagonal band region in a contact map will result in a great change in the predicted TAD boundaries. When the scale value is larger than a threshold value, the TAD boundaries become very consistent. This threshold value is highly related to TAD sizes. By comparing our results with those previously obtained using a spectral clustering model, we find that our method is more robust and reliable. Finally, we demonstrate that almost all TAD boundaries from both clustering methods are local minima of a TAD summation function.
Affiliation(s)
- Kelin Xia
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore
- School of Biological Sciences, Nanyang Technological University, Singapore 637371, Singapore
| |
|
29
|
Xia K. Multiscale virtual particle based elastic network model (MVP-ENM) for normal mode analysis of large-sized biomolecules. Phys Chem Chem Phys 2018; 20:658-669. [PMID: 29227479 DOI: 10.1039/c7cp07177a] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 01/20/2023]
Abstract
In this paper, a multiscale virtual particle based elastic network model (MVP-ENM) is proposed for the normal mode analysis of large-sized biomolecules. The multiscale virtual particle (MVP) model is proposed for the discretization of biomolecular density data. With this model, large-sized biomolecular structures can be coarse-grained into virtual particles such that a balance between model accuracy and computational cost can be achieved. An elastic network is constructed by assuming "connections" between virtual particles. The connection is described by a special harmonic potential function, which considers the influence of both the mass distributions and distance relations of the virtual particles. Two independent models, i.e., the multiscale virtual particle based Gaussian network model (MVP-GNM) and the multiscale virtual particle based anisotropic network model (MVP-ANM), are proposed. It has been found that in the Debye-Waller factor (B-factor) prediction, the results from our MVP-GNM with a high resolution are as good as the ones from GNM. Even with low resolutions, our MVP-GNM can still capture the global behavior of the B-factor very well, with mismatches predominantly in the regions with large B-factor values. Further, it has been demonstrated that the low-frequency eigenmodes from our MVP-ANM are highly consistent with the ones from ANM even with very low resolutions and a coarse grid. Finally, the great advantage of the MVP-ANM model for large-sized biomolecules has been demonstrated by using two poliovirus structures. The paper ends with a conclusion.
Affiliation(s)
- Kelin Xia
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371.
|
30
|
Cang Z, Wei GW. Integration of element specific persistent homology and machine learning for protein-ligand binding affinity prediction. International Journal for Numerical Methods in Biomedical Engineering 2018; 34. [PMID: 28677268 DOI: 10.1002/cnm.2914] [Citation(s) in RCA: 93] [Impact Index Per Article: 15.5] [Received: 05/14/2017] [Revised: 06/27/2017] [Accepted: 06/29/2017] [Indexed: 05/17/2023]
Abstract
Protein-ligand binding is a fundamental biological process that is paramount to many other biological processes, such as signal transduction, metabolic pathways, enzyme construction, cell secretion, and gene expression. Accurate prediction of protein-ligand binding affinities is vital to rational drug design and the understanding of protein-ligand binding and binding induced function. Existing binding affinity prediction methods are inundated with geometric detail and involve excessively high dimensions, which undermines their predictive power for massive binding data. Topology provides the ultimate level of abstraction and thus incurs too much reduction in geometric information. Persistent homology embeds geometric information into topological invariants and bridges the gap between complex geometry and abstract topology. However, it oversimplifies biological information. This work introduces element specific persistent homology (ESPH) or multicomponent persistent homology to retain crucial biological information during topological simplification. The combination of ESPH and machine learning gives rise to a powerful paradigm for macromolecular analysis. Tests on 2 large data sets indicate that the proposed topology-based machine-learning paradigm outperforms other existing methods in protein-ligand binding affinity predictions. ESPH reveals protein-ligand binding mechanisms that cannot be attained from other conventional techniques. The present approach reveals that protein-ligand hydrophobic interactions extend to 40 Å away from the binding site, which has significant ramifications for drug and protein design.
Affiliation(s)
- Zixuan Cang
- Department of Mathematics, Michigan State University, MI 48824, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, MI 48824, USA
- Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
- Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
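The element-specific idea in the abstract above can be illustrated with a minimal sketch: select atoms by element type, then compute a barcode on the selected subset only. The sketch below is restricted to 0-dimensional persistence (connected components) of a Vietoris-Rips filtration, computed with a union-find structure, and uses the edge length as the filtration value. The function names and the toy atom list are hypothetical and are not taken from the paper, whose method also involves higher-dimensional features and machine learning.

```python
import math
from itertools import combinations


def h0_barcode(points):
    """0-dimensional barcode of a Vietoris-Rips filtration.

    Every component is born at 0; a component dies at the length of the
    edge that merges it into another one (Kruskal-style processing).
    """
    n = len(points)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(length)
    bars = [(0.0, d) for d in deaths]
    if n:
        bars.append((0.0, math.inf))  # one component never dies
    return bars


def element_specific_h0(atoms, elements):
    """Barcode of the sub-point-cloud whose element symbol is in `elements`."""
    selected = [xyz for symbol, xyz in atoms if symbol in elements]
    return h0_barcode(selected)


# Hypothetical toy "molecule": (element symbol, coordinates in angstroms).
atoms = [
    ("C", (0.0, 0.0, 0.0)),
    ("C", (1.5, 0.0, 0.0)),
    ("N", (0.0, 1.0, 0.0)),
    ("O", (5.0, 0.0, 0.0)),
    ("C", (1.5, 1.5, 0.0)),
]
carbon_bars = element_specific_h0(atoms, {"C"})
carbon_nitrogen_bars = element_specific_h0(atoms, {"C", "N"})
```

Computing one barcode per element selection (carbon only, carbon plus nitrogen, and so on) yields a family of descriptors that keeps some chemical identity, which is the point the abstract makes about plain persistent homology oversimplifying biological information.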
|
31
|
Zhao R, Wang M, Tong Y, Wei GW. Divide-and-conquer strategy for large-scale Eulerian solvent excluded surface. Communications in Information and Systems 2018; 18:299-329. [PMID: 31327932 DOI: 10.4310/cis.2018.v18.n4.a5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022]
Abstract
MOTIVATION: Surface generation and visualization are some of the most important tasks in biomolecular modeling and computation. Eulerian solvent excluded surface (ESES) software provides analytical solvent excluded surface (SES) in the Cartesian grid, which is necessary for simulating many biomolecular electrostatic and ion channel models. However, large biomolecules and/or fine grid resolutions give rise to excessively large memory requirements in ESES construction. We introduce an out-of-core and parallel algorithm to improve the ESES software. RESULTS: The present approach drastically improves the spatial and temporal efficiency of ESES. The memory footprint and time complexity are analyzed and empirically verified through extensive tests with a large collection of biomolecule examples. Our results show that our algorithm can successfully reduce memory footprint through a straightforward divide-and-conquer strategy to perform the calculation of arbitrarily large proteins on a typical commodity personal computer. On multi-core computers or clusters, our algorithm can reduce the execution time by parallelizing most of the calculation as disjoint subproblems. Various comparisons with the state-of-the-art Cartesian grid based SES calculation were done to validate the present method and show the improved efficiency. This approach makes ESES a robust software for the construction of analytical solvent excluded surfaces. AVAILABILITY AND IMPLEMENTATION: http://weilab.math.msu.edu/ESES.
Affiliation(s)
- Rundong Zhao
- Department of Computer Science and Engineering, Michigan State University, MI 48824, USA
| | - Menglun Wang
- Department of Mathematics, Michigan State University, MI 48824, USA
| | - Yiying Tong
- Department of Computer Science and Engineering, Michigan State University, MI 48824, USA
| | - Guo-Wei Wei
- Department of Mathematics, and Department of Electrical and Computer Engineering, and Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
|
32
|
Cang Z, Mu L, Wei GW. Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening. PLoS Comput Biol 2018; 14:e1005929. [PMID: 29309403 PMCID: PMC5774846 DOI: 10.1371/journal.pcbi.1005929] [Citation(s) in RCA: 141] [Impact Index Per Article: 23.5] [Received: 09/01/2017] [Revised: 01/19/2018] [Accepted: 12/15/2017] [Indexed: 12/05/2022]
Abstract
This work introduces a number of algebraic topology approaches, including multi-component persistent homology, multi-level persistent homology, and electrostatic persistence for the representation, characterization, and description of small molecules and biomolecular complexes. In contrast to the conventional persistent homology, multi-component persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a tailored topological description of inter- and/or intra-molecular interactions of interest. Electrostatic persistence incorporates partial charge information into topological invariants. These topological methods are paired with Wasserstein distance to characterize similarities between molecules and are further integrated with a variety of machine learning algorithms, including k-nearest neighbors, ensemble of trees, and deep convolutional neural networks, to manifest their descriptive and predictive powers for protein-ligand binding analysis and virtual screening of small molecules. Extensive numerical experiments involving 4,414 protein-ligand complexes from the PDBBind database and 128,374 ligand-target and decoy-target pairs in the DUD database are performed to test respectively the scoring power and the discriminatory power of the proposed topological learning strategies. It is demonstrated that the present topological learning outperforms other existing methods in protein-ligand binding affinity prediction and ligand-decoy discrimination.
Affiliation(s)
- Zixuan Cang
- Department of Mathematics, Michigan State University, East Lansing, Michigan, United States of America
| | - Lin Mu
- Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, United States of America
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan, United States of America
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan, United States of America
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan, United States of America
|
33
|
Multiscale Persistent Functions for Biomolecular Structure Characterization. Bull Math Biol 2017; 80:1-31. [PMID: 29098540 DOI: 10.1007/s11538-017-0362-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 11/15/2016] [Accepted: 10/19/2017] [Indexed: 10/18/2022]
Abstract
In this paper, we introduce multiscale persistent functions for biomolecular structure characterization. The essential idea is to combine our multiscale rigidity functions (MRFs) with persistent homology analysis, so as to construct a series of multiscale persistent functions, particularly multiscale persistent entropies, for structure characterization. To clarify the fundamental idea of our method, the multiscale persistent entropy (MPE) model is discussed in great detail. Mathematically, unlike the previous persistent entropy (Chintakunta et al. in Pattern Recognit 48(2):391-401, 2015; Merelli et al. in Entropy 17(10):6872-6892, 2015; Rucco et al. in: Proceedings of ECCS 2014, Springer, pp 117-128, 2016), a special resolution parameter is incorporated into our model. Various scales can be achieved by tuning its value. Physically, our MPE can be used in conformational entropy evaluation. More specifically, it is found that our method incorporates a natural classification scheme. This is achieved through a density filtration of an MRF built from angular distributions. To further validate our model, a systematic comparison with the traditional entropy evaluation model is done. It is found that our model is able to preserve the intrinsic topological features of biomolecular data much better than traditional approaches, particularly for resolutions in the intermediate range. Moreover, by comparing with traditional entropies from various grid sizes, bond angle-based methods and a persistent homology-based support vector machine method (Cang et al. in Mol Based Math Biol 3:140-162, 2015), we find that our MPE method gives the best results in terms of average true positive rate in a classic protein structure classification test. More interestingly, all-alpha and all-beta protein classes can be clearly separated from each other with zero error only in our model.
Finally, a special protein structure index (PSI) is proposed, for the first time, to describe the "regularity" of protein structures. Basically, a protein structure is deemed regular if it has a consistent and orderly configuration. Our PSI model is tested on a database of 110 proteins; we find that structures with larger portions of loops and intrinsically disordered regions are always associated with larger PSI, meaning an irregular configuration, while proteins with larger portions of secondary structures, i.e., alpha-helix or beta-sheet, have smaller PSI. Essentially, PSI can be used to describe the "regularity" information in any system.
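For readers unfamiliar with the quantity the abstract builds on, the following is a minimal sketch of ordinary persistent entropy as it is commonly defined in the cited prior work: the Shannon entropy of the bar lengths of a barcode after normalizing them to sum to one. It does not model the resolution parameter that the multiscale version adds, and the function name and toy barcodes are assumptions made for illustration.

```python
import math


def persistent_entropy(bars):
    """Shannon entropy of normalized bar lengths.

    `bars` is a list of (birth, death) pairs with finite death values;
    zero-length bars are ignored.
    """
    lengths = [death - birth for birth, death in bars if death > birth]
    total = sum(lengths)
    if total == 0:
        return 0.0
    return -sum((l / total) * math.log(l / total) for l in lengths)


# Four bars of equal length: the entropy reaches its maximum, log(4).
uniform_bars = [(0.0, 1.0), (0.5, 1.5), (1.0, 2.0), (2.0, 3.0)]
# One dominant bar plus three short ones: the entropy is much lower.
skewed_bars = [(0.0, 10.0), (0.0, 0.1), (0.0, 0.1), (0.0, 0.1)]

uniform_entropy = persistent_entropy(uniform_bars)
skewed_entropy = persistent_entropy(skewed_bars)
```

A barcode dominated by a few long bars gives a low value, while many bars of comparable length give a high one, which is the sense in which an entropy of this kind can summarize how evenly topological features are distributed.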
|
34
|
Nguyen DD, Xiao T, Wang M, Wei GW. Rigidity Strengthening: A Mechanism for Protein–Ligand Binding. J Chem Inf Model 2017; 57:1715-1721. [DOI: 10.1021/acs.jcim.7b00226] [Citation(s) in RCA: 56] [Impact Index Per Article: 8.0] [Indexed: 11/27/2022]
Affiliation(s)
- Duc D. Nguyen
- Department of Mathematics, Department of Biochemistry and Molecular Biology, and Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
| | - Tian Xiao
- Department of Mathematics, Department of Biochemistry and Molecular Biology, and Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
| | - Menglun Wang
- Department of Mathematics, Department of Biochemistry and Molecular Biology, and Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
| | - Guo-Wei Wei
- Department of Mathematics, Department of Biochemistry and Molecular Biology, and Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
|
35
|
Abstract
Flexibility-rigidity index (FRI) has been developed as a robust, accurate, and efficient method for macromolecular thermal fluctuation analysis and B-factor prediction. The performance of FRI depends on its formulations of rigidity index and flexibility index. In this work, we introduce alternative rigidity and flexibility formulations. The structure of the classic Gaussian surface is utilized to construct a new type of rigidity index, which leads to a new class of rigidity densities with the classic Gaussian surface as a special case. Additionally, we introduce a new type of flexibility index based on the domain indicator property of normalized rigidity density. These generalized FRI (gFRI) methods have been extensively validated by the B-factor predictions of 364 proteins. Significantly outperforming the classic Gaussian network model, gFRI is a new generation of methodologies for accurate, robust, and efficient analysis of protein flexibility and fluctuation. Finally, gFRI based molecular surface generation and flexibility visualization are demonstrated.
Affiliation(s)
- Duc Duy Nguyen
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
| | - Kelin Xia
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
|
36
|
Liu B, Wang B, Zhao R, Tong Y, Wei GW. ESES: Software for Eulerian solvent excluded surface. J Comput Chem 2017; 38:446-466. [PMID: 28052350 DOI: 10.1002/jcc.24682] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 07/06/2016] [Revised: 11/02/2016] [Accepted: 11/09/2016] [Indexed: 12/17/2022]
Abstract
Solvent excluded surface (SES) is one of the most popular surface definitions in biophysics and molecular biology. In addition to its usage in biomolecular visualization, it has been widely used in implicit solvent models, in which SES is usually immersed in a Cartesian mesh. Therefore, it is important to construct SESs in the Eulerian representation for biophysical modeling and computation. This work describes a software package called Eulerian solvent excluded surface (ESES) for the generation of accurate SESs in Cartesian grids. ESES offers the description of the solvent and solute domains by specifying all the intersection points between the SES and the Cartesian grid lines. Additionally, the interface normal at each intersection point is evaluated. Furthermore, for a given biomolecule, the ESES software not only provides the whole surface area, but also partitions the surface area according to atomic types. Homology theory is utilized to detect topological features, such as loops and cavities, on the complex formed by the SES. The sizes of loops and cavities are measured based on persistent homology with an evolutionary partial differential equation-based filtration. ESES is extensively validated by surface visualization, electrostatic solvation free energy computation, surface area and volume calculations, and loop and cavity detection and their size estimation. We used the Amber PBSA test set in our electrostatic solvation energy, area, and volume validations. Our results are either calibrated by analytical values or compared with those from the MSMS software. © 2017 Wiley Periodicals, Inc.
Affiliation(s)
- Beibei Liu
- Department of Computer Science and Engineering, Michigan State University, East Lansing, Michigan, 48824
| | - Bao Wang
- Department of Mathematics, Michigan State University, East Lansing, Michigan, 48824
| | - Rundong Zhao
- Department of Computer Science and Engineering, Michigan State University, East Lansing, Michigan, 48824
| | - Yiying Tong
- Department of Computer Science and Engineering, Michigan State University, East Lansing, Michigan, 48824
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan, 48824
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan, 48824
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan, 48824
|
37
|
Nguyen DD, Wei GW. The impact of surface area, volume, curvature, and Lennard-Jones potential to solvation modeling. J Comput Chem 2016; 38:24-36. [PMID: 27718270 DOI: 10.1002/jcc.24512] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 05/20/2016] [Revised: 08/17/2016] [Accepted: 08/30/2016] [Indexed: 12/24/2022]
Abstract
This article explores the impact of surface area, volume, curvature, and Lennard-Jones (LJ) potential on solvation free energy predictions. Rigidity surfaces are utilized to generate robust analytical expressions for maximum, minimum, mean, and Gaussian curvatures of solvent-solute interfaces, and define a generalized Poisson-Boltzmann (GPB) equation with a smooth dielectric profile. Extensive correlation analysis is performed to examine the linear dependence of surface area, surface enclosed volume, maximum curvature, minimum curvature, mean curvature, and Gaussian curvature for solvation modeling. It is found that surface area and surface enclosed volume are highly correlated to each other, and poorly correlated to various curvatures for six test sets of molecules. Different curvatures are weakly correlated to each other for six test sets of molecules, but are strongly correlated to each other within each test set of molecules. Based on correlation analysis, we construct twenty-six nontrivial nonpolar solvation models. Our numerical results reveal that the LJ potential plays a vital role in nonpolar solvation modeling, especially for molecules involving strong van der Waals interactions. It is found that curvatures are at least as important as surface area or surface enclosed volume in nonpolar solvation modeling. In conjunction with the GPB model, various curvature-based nonpolar solvation models are shown to offer some of the best solvation free energy predictions for a wide range of test sets. For example, root mean square errors from a model constituting surface area, volume, mean curvature, and LJ potential are less than 0.42 kcal/mol for all test sets. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Duc D Nguyen
- Department of Mathematics, Michigan State University, Michigan, 48824
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, Michigan, 48824
- Department of Electrical and Computer Engineering, Michigan State University, Michigan, 48824
- Department of Biochemistry and Molecular Biology, Michigan State University, Michigan, 48824
|
38
|
Xia K, Opron K, Wei GW. Multiscale Gaussian network model (mGNM) and multiscale anisotropic network model (mANM). J Chem Phys 2016; 143:204106. [PMID: 26627949 DOI: 10.1063/1.4936132] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Indexed: 12/30/2022]
Abstract
Gaussian network model (GNM) and anisotropic network model (ANM) are some of the most popular methods for the study of protein flexibility and related functions. In this work, we propose generalized GNM (gGNM) and ANM methods and show that the GNM Kirchhoff matrix can be built from the ideal low-pass filter, which is a special case of a wide class of correlation functions underpinning the linear scaling flexibility-rigidity index (FRI) method. Based on the mathematical structure of correlation functions, we propose a unified framework to construct generalized Kirchhoff matrices whose matrix inverse leads to gGNMs, whereas, the direct inverse of its diagonal elements gives rise to FRI method. With this connection, we further introduce two multiscale elastic network models, namely, multiscale GNM (mGNM) and multiscale ANM (mANM), which are able to incorporate different scales into the generalized Kirchhoff matrices or generalized Hessian matrices. We validate our new multiscale methods with extensive numerical experiments. We illustrate that gGNMs outperform the original GNM method in the B-factor prediction of a set of 364 proteins. We demonstrate that for a given correlation function, FRI and gGNM methods provide essentially identical B-factor predictions when the scale value in the correlation function is sufficiently large. More importantly, we reveal intrinsic multiscale behavior in protein structures. The proposed mGNM and mANM are able to capture this multiscale behavior and thus give rise to a significant improvement of more than 11% in B-factor predictions over the original GNM and ANM methods. We further demonstrate the benefits of our mGNM through the B-factor predictions of many proteins that fail the original GNM method. We show that the proposed mGNM can also be used to analyze protein domain separations. Finally, we showcase the ability of our mANM for the analysis of protein collective motions.
Affiliation(s)
- Kelin Xia
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
| | - Kristopher Opron
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, USA
| | - Guo-Wei Wei
- Mathematical Biosciences Institute, The Ohio State University, Columbus, Ohio 43210, USA
|
39
|
Opron K, Xia K, Burton Z, Wei GW. Flexibility-rigidity index for protein-nucleic acid flexibility and fluctuation analysis. J Comput Chem 2016; 37:1283-95. [PMID: 26927815 PMCID: PMC5844491 DOI: 10.1002/jcc.24320] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 06/18/2015] [Revised: 12/02/2015] [Accepted: 01/17/2016] [Indexed: 12/29/2022]
Abstract
Protein-nucleic acid complexes are important for many cellular processes including the most essential functions such as transcription and translation. For many protein-nucleic acid complexes, flexibility of both macromolecules has been shown to be critical for specificity and/or function. The flexibility-rigidity index (FRI) has been proposed as an accurate and efficient approach for protein flexibility analysis. In this article, we introduce FRI for the flexibility analysis of protein-nucleic acid complexes. We demonstrate that a multiscale strategy, which incorporates multiple kernels to capture various length scales in biomolecular collective motions, is able to significantly improve the state of the art in the flexibility analysis of protein-nucleic acid complexes. We take advantage of the high accuracy and O(N) computational complexity of our multiscale FRI method to investigate the flexibility of ribosomal subunits, which are difficult to analyze by alternative approaches. An anisotropic FRI approach, which involves localized Hessian matrices, is utilized to study the translocation dynamics in an RNA polymerase.
Affiliation(s)
- Kristopher Opron
- Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
| | - Kelin Xia
- Department of Mathematics, Michigan State University, MI 48824, USA
| | - Zach Burton
- Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
| | - Guo-Wei Wei
- Mathematical Biosciences Institute, The Ohio State University, Columbus, Ohio 43210, USA
|
40
|
Multiscale method for modeling binding phenomena involving large objects: application to kinesin motor domains motion along microtubules. Sci Rep 2016; 6:23249. [PMID: 26988596 PMCID: PMC4796874 DOI: 10.1038/srep23249] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 12/09/2015] [Accepted: 03/03/2016] [Indexed: 11/30/2022]
Abstract
Many biological phenomena involve the binding of proteins to a large object. Because the electrostatic forces that guide binding act over large distances, truncating the size of the system to facilitate computational modeling frequently yields inaccurate results. Our multiscale approach implements a computational focusing method that permits computation of large systems without truncating the electrostatic potential and achieves the high resolution required for modeling macromolecular interactions, all while keeping the computational time reasonable. We tested our approach on the motility of various kinesin motor domains. We found that electrostatics help guide kinesins as they walk: N-kinesins towards the plus-end, and C-kinesins towards the minus-end of microtubules. Our methodology enables computation in similar, large systems including protein binding to DNA, viruses, and membranes.
|
41
|
Opron K, Xia K, Wei GW. Communication: Capturing protein multiscale thermal fluctuations. J Chem Phys 2016; 142:211101. [PMID: 26049417 DOI: 10.1063/1.4922045] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Indexed: 11/14/2022]
Abstract
Existing elastic network models are typically parametrized at a given cutoff distance and often fail to properly predict the thermal fluctuation of many macromolecules that involve multiple characteristic length scales. We introduce a multiscale flexibility-rigidity index (mFRI) method to resolve this problem. The proposed mFRI utilizes two or three correlation kernels parametrized at different length scales to capture protein interactions at corresponding scales. It is about 20% more accurate than the Gaussian network model (GNM) in the B-factor prediction of a set of 364 proteins. Additionally, the present method is able to deliver accurate predictions for some large macromolecules on which GNM fails to produce accurate predictions. Finally, for a protein of N residues, mFRI is of linear scaling (O(N)) in computational complexity, in contrast to the order of O(N^3) for GNM.
Affiliation(s)
- Kristopher Opron
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, USA
| | - Kelin Xia
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
| | - Guo-Wei Wei
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, USA
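The multiscale idea in the abstract above can be illustrated with a minimal sketch, assuming the usual FRI construction: the rigidity index of an atom is a sum of distance-based correlation kernels over all other atoms, the flexibility index is its reciprocal, and the multiscale variant evaluates the rigidity with kernels at several length scales. The kernel forms, the parameter values (`eta`, `kappa`, `upsilon`), the function names, and the toy chain are assumptions for illustration and are not taken from the paper. The double loop is a naive O(N^2) version, not the linear-scaling implementation the abstract refers to, and the regression step that maps these features to B-factors is omitted.

```python
import math


def exponential_kernel(r, eta, kappa=1.0):
    """Generalized exponential correlation kernel exp(-(r/eta)^kappa)."""
    return math.exp(-((r / eta) ** kappa))


def lorentz_kernel(r, eta, upsilon=3.0):
    """Generalized Lorentz correlation kernel 1 / (1 + (r/eta)^upsilon)."""
    return 1.0 / (1.0 + (r / eta) ** upsilon)


def rigidity_index(coords, kernel, eta):
    """mu_i = sum over j != i of kernel(|r_i - r_j|, eta)."""
    n = len(coords)
    return [
        sum(kernel(math.dist(coords[i], coords[j]), eta)
            for j in range(n) if j != i)
        for i in range(n)
    ]


def flexibility_index(coords, kernel, eta):
    """f_i = 1 / mu_i; larger values suggest larger thermal fluctuation."""
    return [1.0 / mu for mu in rigidity_index(coords, kernel, eta)]


def multiscale_rigidity(coords, kernels_and_scales):
    """One rigidity value per (kernel, eta) pair for every atom."""
    per_kernel = [rigidity_index(coords, k, eta) for k, eta in kernels_and_scales]
    return [list(row) for row in zip(*per_kernel)]


# Hypothetical straight chain of five C-alpha-like points, 3.8 angstroms apart.
chain = [(3.8 * i, 0.0, 0.0) for i in range(5)]
flex = flexibility_index(chain, exponential_kernel, eta=3.0)
features = multiscale_rigidity(
    chain, [(exponential_kernel, 3.0), (lorentz_kernel, 15.0)]
)
```

In this toy chain the end points have fewer close neighbors than the middle point, so their flexibility index is larger. In a multiscale setting, each atom's row of `features` would typically be fitted to experimental B-factors by linear regression.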
|
42
|
Xia K, Zhao Z, Wei GW. Multiresolution persistent homology for excessively large biomolecular datasets. J Chem Phys 2015; 143:134103. [PMID: 26450288 PMCID: PMC4592433 DOI: 10.1063/1.4931733] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 06/06/2015] [Accepted: 09/08/2015] [Indexed: 12/21/2022]
Abstract
Although persistent homology has emerged as a promising tool for the topological simplification of complex data, it is computationally intractable for large datasets. We introduce multiresolution persistent homology to handle excessively large datasets. We match the resolution with the scale of interest so as to represent large scale datasets with appropriate resolution. We utilize the flexibility-rigidity index to assess the topological connectivity of the data set and define a rigidity density for the filtration analysis. By appropriately tuning the resolution of the rigidity density, we are able to focus the topological lens on the scale of interest. The proposed multiresolution topological analysis is validated by a hexagonal fractal image which has three distinct scales. We further demonstrate the proposed method for extracting topological fingerprints from DNA molecules. In particular, the topological persistence of a virus capsid with 273,780 atoms is successfully analyzed, which would otherwise be inaccessible to the normal point cloud method and unreliable by using coarse-grained multiscale persistent homology. The proposed method has also been successfully applied to protein domain classification, which is the first time that persistent homology is used for practical protein domain analysis, to our knowledge. The proposed multiresolution topological method has potential applications in arbitrary data sets, such as social networks, biological networks, and graphs.
Affiliation(s)
- Kelin Xia
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
| | - Zhixiong Zhao
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
|
43
|
Wang B, Xia K, Wei GW. Matched Interface and Boundary Method for Elasticity Interface Problems. Journal of Computational and Applied Mathematics 2015; 285:203-225. [PMID: 25914439 PMCID: PMC4404752 DOI: 10.1016/j.cam.2015.02.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2023]
Abstract
Elasticity theory is an important component of continuum mechanics and has widespread applications in science and engineering. Material interfaces are ubiquitous in nature and man-made devices, and often give rise to discontinuous coefficients in the governing elasticity equations. In this work, the matched interface and boundary (MIB) method is developed to address elasticity interface problems. Linear elasticity theory for both isotropic homogeneous and inhomogeneous media is employed. In our approach, Lamé's parameters can have jumps across the interface and are allowed to be position dependent in modeling isotropic inhomogeneous material. Both strong discontinuity, i.e., discontinuous solution, and weak discontinuity, namely, discontinuous derivatives of the solution, are considered in the present study. In the proposed method, fictitious values are utilized so that the standard central finite difference schemes can be employed regardless of the interface. Interface jump conditions are enforced on the interface, which, in turn, accurately determines fictitious values. We design new MIB schemes to account for complex interface geometries. In particular, the cross derivatives in the elasticity equations are difficult to handle for complex interface geometries. We propose secondary fictitious values and construct geometry based interpolation schemes to overcome this difficulty. Numerous analytical examples are used to validate the accuracy, convergence and robustness of the present MIB method for elasticity interface problems with both small and large curvatures, strong and weak discontinuities, and constant and variable coefficients. Numerical tests indicate second order accuracy in both L∞ and L2 norms.
Affiliation(s)
- Bao Wang
- Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
| | - Kelin Xia
- Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA
| |
|
44
|
Xia K, Wei GW. Persistent topology for cryo-EM data analysis. INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN BIOMEDICAL ENGINEERING 2015; 31:n/a-n/a. [PMID: 25851063 DOI: 10.1002/cnm.2719] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 12/23/2014] [Revised: 03/13/2015] [Accepted: 03/31/2015] [Indexed: 06/04/2023]
Abstract
In this work, we introduce persistent homology for the analysis of cryo-electron microscopy (cryo-EM) density maps. We identify the topological fingerprint or topological signature of noise, which is widespread in cryo-EM data. For low signal-to-noise ratio (SNR) volumetric data, intrinsic topological features of biomolecular structures are indistinguishable from noise. To remove noise, we employ geometric flows that are found to preserve the intrinsic topological fingerprints of cryo-EM structures and diminish the topological signature of noise. In particular, persistent homology enables us to visualize the gradual separation of the topological fingerprints of cryo-EM structures from those of noise during the denoising process, which gives rise to a practical procedure for prescribing a noise threshold to extract cryo-EM structure information from noise contaminated data after certain iterations of the geometric flow equation. To further demonstrate the utility of persistent homology for cryo-EM data analysis, we consider a microtubule intermediate structure Electron Microscopy Data (EMD 1129). Three helix models, an alpha-tubulin monomer model, an alpha-tubulin and beta-tubulin model, and an alpha-tubulin and beta-tubulin dimer model, are constructed to fit the cryo-EM data. The least square fitting leads to similarly high correlation coefficients, which indicates that structure determination via optimization is an ill-posed inverse problem. However, these models have dramatically different topological fingerprints. In particular, linkages or connectivities that discriminate one model from another play little role in traditional density fitting or optimization but are very sensitive and crucial to topological fingerprints. The intrinsic topological features of the microtubule data are identified after topological denoising. By comparing the topological fingerprints of the original data with those of the three models, we found that the third model is topologically favored. The present work offers new persistent homology based strategies for topological denoising and for resolving ill-posed inverse problems.
Affiliation(s)
- Kelin Xia
- Department of Mathematics, Michigan State University, MI 48824, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, MI 48824, USA
- Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
- Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
| |
|
45
|
Wang B, Xia K, Wei GW. Second order Method for Solving 3D Elasticity Equations with Complex Interfaces. JOURNAL OF COMPUTATIONAL PHYSICS 2015; 294:405-438. [PMID: 25914422 PMCID: PMC4404754 DOI: 10.1016/j.jcp.2015.03.053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2023]
Abstract
Elastic materials are ubiquitous in nature and indispensable components in man-made devices and equipment. When a device or piece of equipment involves composite or multiple elastic materials, elasticity interface problems come into play. The solution of three-dimensional (3D) elasticity interface problems is significantly more difficult than that of elliptic counterparts due to the coupled vector components and cross derivatives in the governing elasticity equation. This work introduces the matched interface and boundary (MIB) method for solving 3D elasticity interface problems. The proposed MIB elasticity interface scheme utilizes fictitious values on irregular grid points near the material interface to replace function values in the discretization so that the elasticity equation can be discretized using the standard finite difference schemes as if there were no material interface. The interface jump conditions are rigorously enforced on the intersecting points between the interface and the mesh lines. Such an enforcement determines the fictitious values. A number of new techniques have been developed to construct efficient MIB elasticity interface schemes for dealing with cross derivatives in coupled governing equations. The proposed method is extensively validated over both weak and strong discontinuity of the solution, both piecewise constant and position-dependent material parameters, both smooth and nonsmooth interface geometries, and both small and large contrasts in the Poisson's ratio and shear modulus across the interface. Numerical experiments indicate that the present MIB method is of second order convergence in both L∞ and L2 error norms for handling arbitrarily complex interfaces, including biomolecular surfaces. To the best of our knowledge, this is the first elasticity interface method that is able to deliver second order convergence for the molecular surfaces of proteins.
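The claim of second order convergence in the L∞ and L2 norms is usually checked by halving the grid spacing and measuring how fast the error shrinks. The sketch below illustrates that check only. It is not the MIB scheme: it assumes a plain 1D Poisson problem with no interface, and the function names `solve_poisson_1d` and `observed_order` are hypothetical.

```python
import numpy as np

def solve_poisson_1d(n):
    """Solve u'' = f on [0, 1], u(0) = u(1) = 0, with central differences.

    The exact solution is taken to be u(x) = sin(pi x), so f = -pi^2 sin(pi x).
    Returns the (L-infinity, discrete L2) errors on a grid of n intervals.
    """
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)
    exact = np.sin(np.pi * x)
    f = -np.pi ** 2 * np.sin(np.pi * x[1:-1])
    # Tridiagonal operator for (u[i-1] - 2 u[i] + u[i+1]) / h^2 on interior nodes.
    A = (np.diag(-2.0 * np.ones(n - 1))
         + np.diag(np.ones(n - 2), 1)
         + np.diag(np.ones(n - 2), -1)) / h ** 2
    u = np.zeros(n + 1)
    u[1:-1] = np.linalg.solve(A, f)
    err = u - exact
    return np.max(np.abs(err)), np.sqrt(h * np.sum(err ** 2))

def observed_order(n):
    """Estimate the convergence order from grids with n and 2n intervals."""
    coarse = solve_poisson_1d(n)
    fine = solve_poisson_1d(2 * n)
    # order = log2(error(h) / error(h/2)), one value per norm
    return tuple(float(np.log2(c / f)) for c, f in zip(coarse, fine))
```

Calling `observed_order(32)` returns one estimated order per norm; values close to 2 in both are what "second order accuracy" means in practice.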
Affiliation(s)
- Bao Wang
- Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
| | - Kelin Xia
- Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA
- Center for Mathematical Molecular Biosciences, Michigan State University, East Lansing, MI 48824, USA
| |
|
46
|
Xia K, Wei GW. Multidimensional persistence in biomolecular data. J Comput Chem 2015; 36:1502-20. [PMID: 26032339 PMCID: PMC4485576 DOI: 10.1002/jcc.23953] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 12/24/2014] [Revised: 04/02/2015] [Accepted: 04/19/2015] [Indexed: 12/24/2022]
Abstract
Persistent homology has emerged as a popular technique for the topological simplification of big data, including biomolecular data. Multidimensional persistence bears considerable promise to bridge the gap between geometry and topology. However, its practical and robust construction has been a challenge. We introduce two families of multidimensional persistence, namely pseudomultidimensional persistence and multiscale multidimensional persistence. The former is generated via the repeated applications of persistent homology filtration to high-dimensional data, such as results from molecular dynamics or partial differential equations. The latter is constructed via isotropic and anisotropic scales that create new simplicial complexes and associated topological spaces. The utility, robustness, and efficiency of the proposed topological methods are demonstrated via protein folding, protein flexibility analysis, the topological denoising of cryoelectron microscopy data, and the scale dependence of nanoparticles. Topological transition between partially folded and unfolded proteins has been observed in multidimensional persistence. The separation between noise topological signatures and molecular topological fingerprints is achieved by the Laplace-Beltrami flow. The multiscale multidimensional persistent homology reveals relative local features in Betti-0 invariants and the relatively global characteristics of Betti-1 and Betti-2 invariants.
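To make the Betti-0 invariant mentioned above concrete, here is a minimal sketch of ordinary (one-parameter) Betti-0 persistence for a point cloud. It is an illustration only and not the multidimensional construction of the paper: it assumes a distance-based filtration in which two points become connected once the scale reaches their Euclidean distance (some conventions use half that distance), and the function name `betti0_barcode` is hypothetical.

```python
import numpy as np

def betti0_barcode(points):
    """Death scales of connected components in a distance filtration.

    Every point is born as its own component at scale 0. Processing edges
    in order of increasing length, each edge that joins two different
    components kills one of them. Returns the n - 1 finite death scales;
    one component persists forever and is not listed.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        # Follow parents to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (float(np.linalg.norm(pts[i] - pts[j])), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    deaths = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # merge: one component dies at this scale
            deaths.append(dist)
    return deaths
```

For three collinear points at x = 0, 1, and 5, the two finite bars end at scales 1 and 4: the close pair merges first, and the distant point joins when the scale reaches its nearest neighbor.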
Affiliation(s)
- Kelin Xia
- Department of Mathematics, Michigan State University, MI 48824, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, MI 48824, USA
- Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
- Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
| |
|
47
|
Abstract
Persistent homology has been advocated as a new strategy for the topological simplification of complex data. However, it is computationally intractable for large data sets. In this work, we introduce multiresolution persistent homology for tackling large datasets. Our basic idea is to match the resolution with the scale of interest so as to create a topological microscopy for the underlying data. We adjust the resolution via a rigidity density-based filtration. The proposed multiresolution topological analysis is validated by the study of a complex RNA molecule.
Affiliation(s)
- Kelin Xia
- Department of Mathematics, Michigan State University, East Lansing, Michigan
| | - Zhixiong Zhao
- Department of Mathematics, Michigan State University, East Lansing, Michigan
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan
| |
|
48
|
|
49
|
Opron K, Xia K, Wei GW. Fast and anisotropic flexibility-rigidity index for protein flexibility and fluctuation analysis. J Chem Phys 2015; 140:234105. [PMID: 24952521 DOI: 10.1063/1.4882258] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Indexed: 12/26/2022]
Abstract
Protein structural fluctuation, typically measured by Debye-Waller factors, or B-factors, is a manifestation of protein flexibility, which strongly correlates to protein function. The flexibility-rigidity index (FRI) is a newly proposed method for the construction of atomic rigidity functions required in the theory of continuum elasticity with atomic rigidity, which is a new multiscale formalism for describing excessively large biomolecular systems. The FRI method analyzes protein rigidity and flexibility and is capable of predicting protein B-factors without resorting to matrix diagonalization. A fundamental assumption used in the FRI is that protein structures are uniquely determined by various internal and external interactions, while the protein functions, such as stability and flexibility, are solely determined by the structure. As such, one can predict protein flexibility without resorting to the protein interaction Hamiltonian. Consequently, bypassing the matrix diagonalization, the original FRI has a computational complexity of O(N²). This work introduces a fast FRI (fFRI) algorithm for the flexibility analysis of large macromolecules. The proposed fFRI further reduces the computational complexity to O(N). Additionally, we propose anisotropic FRI (aFRI) algorithms for the analysis of protein collective dynamics. The aFRI algorithms permit adaptive Hessian matrices, from a completely global 3N × 3N matrix to completely local 3 × 3 matrices. These 3 × 3 matrices, despite being calculated locally, also contain non-local correlation information. Eigenvectors obtained from the proposed aFRI algorithms are able to demonstrate collective motions. Moreover, we investigate the performance of FRI by employing four families of radial basis correlation functions. Both parameter optimized and parameter-free FRI methods are explored. Furthermore, we compare the accuracy and efficiency of FRI with some established approaches to flexibility analysis, namely, normal mode analysis and Gaussian network model (GNM). The accuracy of the FRI method is tested using four sets of proteins, three sets of relatively small-, medium-, and large-sized structures and an extended set of 365 proteins. A fifth set of proteins is used to compare the efficiency of the FRI, fFRI, aFRI, and GNM methods. Intensive validation and comparison indicate that the FRI, particularly the fFRI, is orders of magnitude more efficient and about 10% more accurate overall than some of the most popular methods in the field. The proposed fFRI is able to predict B-factors for α-carbons of the HIV virus capsid (313 236 residues) in less than 30 seconds on a single processor using only one core. Finally, we demonstrate the application of FRI and aFRI to protein domain analysis.
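The abstract does not state the FRI formulas, so the sketch below rests on assumptions: the rigidity index of atom i is taken to be the sum over all other atoms of a radial correlation function of the pairwise distance, and the flexibility index is its reciprocal. The kernel used here, a generalized exponential exp(-(r/η)^κ), is one plausible choice of radial basis function; the parameter values and the function name `fri_flexibility` are hypothetical. The all-pairs sum shows where the O(N²) cost comes from and why no matrix diagonalization is needed.

```python
import numpy as np

def fri_flexibility(coords, eta=3.0, kappa=1.0):
    """Rigidity and flexibility indices from pairwise distances (sketch).

    coords: (N, 3) array of atom positions, e.g. C-alpha atoms.
    eta, kappa: scale and exponent of the assumed kernel
        exp(-(r / eta) ** kappa); placeholder values, not fitted ones.
    Returns (rigidity, flexibility), each of shape (N,).
    """
    coords = np.asarray(coords, dtype=float)
    # All-pairs distance matrix: this is the O(N^2) step.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    kernel = np.exp(-(dist / eta) ** kappa)
    np.fill_diagonal(kernel, 0.0)        # exclude each atom's self term
    rigidity = kernel.sum(axis=1)        # larger when neighbors are close
    flexibility = 1.0 / rigidity         # needs at least two atoms
    return rigidity, flexibility
```

In a chain of three equally spaced atoms, the middle atom has the largest rigidity and therefore the smallest flexibility. Comparing against experimental B-factors would additionally require a fit (for example, a linear regression of B-factors on the flexibility index), which is not shown here. An O(N) variant would restrict the sum to nearby atoms, which is also an assumption about how fFRI achieves its speedup.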
Affiliation(s)
- Kristopher Opron
- Department of Biochemistry and Molecular Biology, Michigan State University, Michigan 48824, USA
| | - Kelin Xia
- Department of Mathematics, Michigan State University, Michigan 48824, USA
| | - Guo-Wei Wei
- Department of Biochemistry and Molecular Biology, Michigan State University, Michigan 48824, USA
| |
|
50
|
Heterogeneous elastic network model improves description of slow motions of proteins in solution. Chem Phys Lett 2015. [DOI: 10.1016/j.cplett.2014.11.006] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 11/19/2022]
|