1
Kulkarni P, Bhattacharya S, Achuthan S, Behal A, Jolly MK, Kotnala S, Mohanty A, Rangarajan G, Salgia R, Uversky V. Intrinsically Disordered Proteins: Critical Components of the Wetware. Chem Rev 2022; 122:6614-6633. [PMID: 35170314 PMCID: PMC9250291 DOI: 10.1021/acs.chemrev.1c00848] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Indexed: 12/29/2022]
Abstract
Despite the wealth of knowledge gained about intrinsically disordered proteins (IDPs) since their discovery, there are several aspects that remain unexplored and, hence, poorly understood. A living cell is a complex adaptive system that can be described as a wetware (a metaphor used to describe the cell as a computer comprising both hardware and software and attuned to logic gates) capable of "making" decisions. In this focused Review, we discuss how IDPs, as critical components of the wetware, influence cell-fate decisions by wiring protein interaction networks to keep them minimally frustrated. Because IDPs lie between order and chaos, we explore the possibility that they can be modeled as attractors. Further, we discuss how the conformational dynamics of IDPs manifests itself as conformational noise, which can potentially amplify transcriptional noise to stochastically switch cellular phenotypes. Finally, we explore the potential role of IDPs in prebiotic evolution, in forming proteinaceous membrane-less organelles, in the origin of multicellularity, and in protein conformation-based transgenerational inheritance of acquired characteristics. Together, these ideas provide a new conceptual framework to discern how IDPs may perform critical biological functions despite their lack of structure.
Affiliation(s)
- Prakash Kulkarni
  - Department of Medical Oncology and Therapeutics Research, City of Hope National Medical Center, Duarte, CA, USA
  - Address for correspondence: Prakash Kulkarni, Department of Medical Oncology and Therapeutics Research, City of Hope National Medical Center, Duarte, CA 91010; Vladimir N. Uversky, Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, Florida 33612
- Supriyo Bhattacharya
  - Integrative Genomics Core, City of Hope National Medical Center, Duarte, CA, USA
- Srisairam Achuthan
  - Division of Research Informatics, Center for Informatics, City of Hope National Medical Center, Duarte, CA 91010, USA
- Amita Behal
  - Department of Medical Oncology and Therapeutics Research, City of Hope National Medical Center, Duarte, CA, USA
- Mohit Kumar Jolly
  - Center for BioSystems Science and Engineering, Indian Institute of Science, Bangalore 560012, India
- Sourabh Kotnala
  - Department of Medical Oncology and Therapeutics Research, City of Hope National Medical Center, Duarte, CA, USA
- Atish Mohanty
  - Department of Medical Oncology and Therapeutics Research, City of Hope National Medical Center, Duarte, CA, USA
- Govindan Rangarajan
  - Department of Mathematics, Indian Institute of Science, Bangalore 560012, India
  - Center for Neuroscience, Indian Institute of Science, Bangalore 560012, India
- Ravi Salgia
  - Department of Medical Oncology and Therapeutics Research, City of Hope National Medical Center, Duarte, CA, USA
- Vladimir Uversky
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
  - Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Institutskiy pereulok, 9, Dolgoprudny, Moscow region 141700, Russia
2
Pun CS, Yong BYS, Xia K. Weighted-persistent-homology-based machine learning for RNA flexibility analysis. PLoS One 2020; 15:e0237747. [PMID: 32822369 PMCID: PMC7446851 DOI: 10.1371/journal.pone.0237747] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/03/2020] [Accepted: 08/01/2020] [Indexed: 12/22/2022] Open
Abstract
Given the great significance of biomolecular flexibility in biomolecular dynamics and functional analysis, various experimental and theoretical models have been developed. Experimentally, the Debye-Waller factor, also known as the B-factor, measures atomic mean-square displacement and is usually considered an important measure of flexibility. Theoretically, elastic network models, the Gaussian network model, the flexibility-rigidity model, and other computational models have been proposed for flexibility analysis by shedding light on the inner topological structures of biomolecules. Recently, a topology-based machine learning model has been proposed. By using features from persistent homology, this model achieves a remarkably high Pearson correlation coefficient (PCC) in protein B-factor prediction. Motivated by its success, we propose weighted-persistent-homology (WPH)-based machine learning (WPHML) models for RNA flexibility analysis. Our WPH is a newly proposed model, which incorporates physical, chemical and biological information into topological measurements using a weight function. In particular, we use local persistent homology (LPH) to focus on the topological information of local regions. Our WPHML model is validated on a well-established RNA dataset, and numerical experiments show that our model can achieve a PCC of up to 0.5822. The comparison with previous sequence-information-based learning models shows a consistent performance improvement of at least 10% in our current model.
Affiliation(s)
- Chi Seng Pun
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore, Singapore
  - * E-mail: (CSP); (KX)
- Brandon Yung Sin Yong
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore, Singapore
- Kelin Xia
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore, Singapore
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
3
Cang Z, Munch E, Wei GW. Evolutionary homology on coupled dynamical systems with applications to protein flexibility analysis. J Appl Comput Topol 2020; 4:481-507. [PMID: 34179350 DOI: 10.1007/s41468-020-00057-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Abstract
While the spatial topological persistence is naturally constructed from a radius-based filtration, it has hardly been derived from a temporal filtration. Most topological models are designed for the global topology of a given object as a whole. There is no method reported in the literature for the topology of an individual component in an object to the best of our knowledge. For many problems in science and engineering, the topology of an individual component is important for describing its properties. We propose evolutionary homology (EH) constructed via a time evolution-based filtration and topological persistence. Our approach couples a set of dynamical systems or chaotic oscillators by the interactions of a physical system, such as a macromolecule. The interactions are approximated by weighted graph Laplacians. Simplices, simplicial complexes, algebraic groups and topological persistence are defined on the coupled trajectories of the chaotic oscillators. The resulting EH gives rise to time-dependent topological invariants or evolutionary barcodes for an individual component of the physical system, revealing its topology-function relationship. In conjunction with Wasserstein metrics, the proposed EH is applied to protein flexibility analysis, an important problem in computational biophysics. Numerical results for the B-factor prediction of a benchmark set of 364 proteins indicate that the proposed EH outperforms all the other state-of-the-art methods in the field.
Affiliation(s)
- Zixuan Cang
  - Department of Mathematics, Michigan State University
- Elizabeth Munch
  - Department of Computational Mathematics, Science and Engineering, Michigan State University
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University
4
Abstract
Recently, machine learning (ML) has established itself in various worldwide benchmarking competitions in computational biology, including Critical Assessment of Structure Prediction (CASP) and Drug Design Data Resource (D3R) Grand Challenges. However, the intricate structural complexity and high ML dimensionality of biomolecular datasets obstruct the efficient application of ML algorithms in the field. In addition to data and algorithm, an efficient ML machinery for biomolecular predictions must include structural representation as an indispensable component. Mathematical representations that simplify the biomolecular structural complexity and reduce ML dimensionality have emerged as a prime winner in D3R Grand Challenges. This review is devoted to the recent advances in developing low-dimensional and scalable mathematical representations of biomolecules in our laboratory. We discuss three classes of mathematical approaches, including algebraic topology, differential geometry, and graph theory. We elucidate how the physical and biological challenges have guided the evolution and development of these mathematical apparatuses for massive and diverse biomolecular data. We focus the performance analysis on protein-ligand binding predictions in this review although these methods have had tremendous success in many other applications, such as protein classification, virtual screening, and the predictions of solubility, solvation free energies, toxicity, partition coefficients, protein folding stability changes upon mutation, etc.
Affiliation(s)
- Duc Duy Nguyen
  - Department of Mathematics, Michigan State University, MI 48824, USA
- Zixuan Cang
  - Department of Mathematics, Michigan State University, MI 48824, USA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
5
Huber MC, Schreiber A, Schiller SM. Minimalist Protocell Design: A Molecular System Based Solely on Proteins that Form Dynamic Vesicular Membranes Embedding Enzymatic Functions. Chembiochem 2019; 20:2618-2632. [PMID: 31183952 DOI: 10.1002/cbic.201900283] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 05/01/2019] [Indexed: 12/24/2022]
Abstract
Life in its molecular context is characterized by the challenge of orchestrating structure, energy and information processes through compartmentalization and chemical transformations amenable to mimicry of protocell models. Here we present an alternative protocell model incorporating dynamic membranes based on amphiphilic elastin-like proteins (ELPs) rather than phospholipids. For the first time we demonstrate the feasibility of combining vesicular membrane formation and biocatalytic activity with molecular entities of a single class: proteins. The presented self-assembled protein-membrane-based compartments (PMBCs) accommodate either an anabolic reaction, based on free DNA ligase as an example of information transformation processes, or a catabolic process. We present a catabolic process based on a single molecular entity combining an amphiphilic protein with tobacco etch virus (TEV) protease as part of the enclosure of a reaction space and facilitating selective catalytic transformations. Combining compartmentalization and biocatalytic activity by utilizing an amphiphilic molecular building block with and without enzyme functionalization enables new strategies in bottom-up synthetic biology, regenerative medicine, pharmaceutical science and biotechnology.
Affiliation(s)
- Matthias C Huber
  - Zentrum für Biosystemanalyse (ZBSA), Albert-Ludwigs-Universität Freiburg, Habsburgerstrasse 49, 79104, Freiburg, Germany
  - Faculty of Biology, University of Freiburg, Schänzlestrasse 1, 79085, Freiburg, Germany
- Andreas Schreiber
  - Zentrum für Biosystemanalyse (ZBSA), Albert-Ludwigs-Universität Freiburg, Habsburgerstrasse 49, 79104, Freiburg, Germany
  - Faculty of Biology, University of Freiburg, Schänzlestrasse 1, 79085, Freiburg, Germany
- Stefan M Schiller
  - Zentrum für Biosystemanalyse (ZBSA), Albert-Ludwigs-Universität Freiburg, Habsburgerstrasse 49, 79104, Freiburg, Germany
  - Faculty of Biology, University of Freiburg, Schänzlestrasse 1, 79085, Freiburg, Germany
  - BIOSS Centre for Biological Signalling Studies, University of Freiburg, Schänzlestrasse 18, 79104, Freiburg, Germany
  - Cluster of Excellence livMatS @ FIT, Freiburg Center for Interactive Materials and Bioinspired Technologies, University of Freiburg, Georges-Köhler-Allee 105, 79110, Freiburg, Germany
  - IMTEK Department of Microsystems Engineering, University of Freiburg, Georges-Köhler-Allee 103, 79110, Freiburg, Germany
6
Abstract
Flexibility-rigidity index (FRI) has been developed as a robust, accurate, and efficient method for macromolecular thermal fluctuation analysis and B-factor prediction. The performance of FRI depends on its formulations of rigidity index and flexibility index. In this work, we introduce alternative rigidity and flexibility formulations. The structure of the classic Gaussian surface is utilized to construct a new type of rigidity index, which leads to a new class of rigidity densities with the classic Gaussian surface as a special case. Additionally, we introduce a new type of flexibility index based on the domain indicator property of normalized rigidity density. These generalized FRI (gFRI) methods have been extensively validated by the B-factor predictions of 364 proteins. Significantly outperforming the classic Gaussian network model, gFRI is a new generation of methodologies for accurate, robust, and efficient analysis of protein flexibility and fluctuation. Finally, gFRI based molecular surface generation and flexibility visualization are demonstrated.
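Illustrative note: the abstract does not spell out its formulas, so the following is only a minimal sketch of the basic FRI idea, assuming a generalized exponential kernel. Each atom's rigidity index is taken as a sum of distance-decaying kernel values over the other atoms, and its flexibility index as the reciprocal. The function name, the kernel choice, the parameters `eta` and `kappa`, and the exclusion of the self term are assumptions made for illustration; this is not the generalized (gFRI) formulation introduced in the paper.

```python
import numpy as np

def flexibility_index(coords, eta=3.0, kappa=1.0):
    """Sketch of a basic flexibility-rigidity index (hypothetical helper).

    coords: (N, 3) array of atomic positions (e.g. C-alpha atoms), N >= 2.
    eta:    assumed characteristic length scale of the kernel.
    kappa:  assumed exponent of the generalized exponential kernel.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between all atoms.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Generalized exponential correlation kernel (an assumed choice).
    kernel = np.exp(-(dist / eta) ** kappa)
    # Exclude each atom's interaction with itself (an assumption here).
    np.fill_diagonal(kernel, 0.0)
    rigidity = kernel.sum(axis=1)   # rigidity index per atom
    return 1.0 / rigidity           # flexibility index per atom

# Toy usage: an isolated atom far from the others comes out as most flexible.
toy = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [10.0, 0.0, 0.0]]
flex = flexibility_index(toy)
```

This dense version builds the full distance matrix and so scales quadratically with the number of atoms. In practice the flexibility index would typically be fitted to experimental B-factors by linear regression, which is not shown here.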
Affiliation(s)
- Duc Duy Nguyen
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
- Kelin Xia
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
7
Opron K, Xia K, Burton Z, Wei GW. Flexibility-rigidity index for protein-nucleic acid flexibility and fluctuation analysis. J Comput Chem 2016; 37:1283-95. [PMID: 26927815 PMCID: PMC5844491 DOI: 10.1002/jcc.24320] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 06/18/2015] [Revised: 12/02/2015] [Accepted: 01/17/2016] [Indexed: 12/29/2022]
Abstract
Protein-nucleic acid complexes are important for many cellular processes including the most essential functions such as transcription and translation. For many protein-nucleic acid complexes, flexibility of both macromolecules has been shown to be critical for specificity and/or function. The flexibility-rigidity index (FRI) has been proposed as an accurate and efficient approach for protein flexibility analysis. In this article, we introduce FRI for the flexibility analysis of protein-nucleic acid complexes. We demonstrate that a multiscale strategy, which incorporates multiple kernels to capture various length scales in biomolecular collective motions, is able to significantly improve the state of the art in the flexibility analysis of protein-nucleic acid complexes. We take advantage of the high accuracy and O(N) computational complexity of our multiscale FRI method to investigate the flexibility of ribosomal subunits, which are difficult to analyze by alternative approaches. An anisotropic FRI approach, which involves localized Hessian matrices, is utilized to study the translocation dynamics in an RNA polymerase.
Affiliation(s)
- Kristopher Opron
  - Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
- Kelin Xia
  - Department of Mathematics, Michigan State University, MI 48824, USA
- Zach Burton
  - Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
- Guo-Wei Wei
  - Mathematical Biosciences Institute, The Ohio State University, Columbus, Ohio 43210, USA
8
Opron K, Xia K, Wei GW. Communication: Capturing protein multiscale thermal fluctuations. J Chem Phys 2015; 142:211101. [PMID: 26049417 DOI: 10.1063/1.4922045] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Indexed: 11/14/2022] Open
Abstract
Existing elastic network models are typically parametrized at a given cutoff distance and often fail to properly predict the thermal fluctuation of many macromolecules that involve multiple characteristic length scales. We introduce a multiscale flexibility-rigidity index (mFRI) method to resolve this problem. The proposed mFRI utilizes two or three correlation kernels parametrized at different length scales to capture protein interactions at corresponding scales. It is about 20% more accurate than the Gaussian network model (GNM) in the B-factor prediction of a set of 364 proteins. Additionally, the present method is able to deliver accurate predictions for some large macromolecules on which GNM fails to produce accurate predictions. Finally, for a protein of N residues, mFRI is of linear scaling (O(N)) in computational complexity, in contrast to the order of O(N^3) for GNM.
Affiliation(s)
- Kristopher Opron
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, USA
- Kelin Xia
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
- Guo-Wei Wei
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, USA
9
DeForte S, Reddy KD, Uversky VN. Quarterly intrinsic disorder digest (January-February-March, 2014). Intrinsically Disordered Proteins 2016; 4:e1153395. [PMID: 28232896 DOI: 10.1080/21690707.2016.1153395] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/22/2022]
Abstract
This is the 5th issue of the Digested Disorder series that represents a reader's digest of the scientific literature on intrinsically disordered proteins. We continue to use only 2 criteria for inclusion of a paper to this digest: The publication date (a paper should be published within the covered time frame) and the topic (a paper should be dedicated to any aspect of protein intrinsic disorder). The current digest issue covers papers published during the first quarter of 2014; i.e., during the period of January, February, and March of 2014. Similar to previous issues, the papers are grouped hierarchically by topics they cover, and for each of the included papers a short description is given on its major findings.
Affiliation(s)
- Shelly DeForte
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Krishna D Reddy
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Vladimir N Uversky
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
  - USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
  - Biology Department, King Abdulaziz University, Jeddah, Kingdom of Saudi Arabia
  - Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg, Russia
10
Xia K, Wei GW. Multidimensional persistence in biomolecular data. J Comput Chem 2015; 36:1502-20. [PMID: 26032339 PMCID: PMC4485576 DOI: 10.1002/jcc.23953] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 12/24/2014] [Revised: 04/02/2015] [Accepted: 04/19/2015] [Indexed: 12/24/2022]
Abstract
Persistent homology has emerged as a popular technique for the topological simplification of big data, including biomolecular data. Multidimensional persistence bears considerable promise to bridge the gap between geometry and topology. However, its practical and robust construction has been a challenge. We introduce two families of multidimensional persistence, namely pseudomultidimensional persistence and multiscale multidimensional persistence. The former is generated via the repeated applications of persistent homology filtration to high-dimensional data, such as results from molecular dynamics or partial differential equations. The latter is constructed via isotropic and anisotropic scales that create new simplicial complexes and associated topological spaces. The utility, robustness, and efficiency of the proposed topological methods are demonstrated via protein folding, protein flexibility analysis, the topological denoising of cryoelectron microscopy data, and the scale dependence of nanoparticles. Topological transition between partial folded and unfolded proteins has been observed in multidimensional persistence. The separation between noise topological signatures and molecular topological fingerprints is achieved by the Laplace-Beltrami flow. The multiscale multidimensional persistent homology reveals relative local features in Betti-0 invariants and the relatively global characteristics of Betti-1 and Betti-2 invariants.
Affiliation(s)
- Kelin Xia
  - Department of Mathematics, Michigan State University, MI 48824, USA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, MI 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
11
Heterogeneous elastic network model improves description of slow motions of proteins in solution. Chem Phys Lett 2015. [DOI: 10.1016/j.cplett.2014.11.006] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 11/19/2022]
12
Xia K, Feng X, Tong Y, Wei GW. Persistent homology for the quantitative prediction of fullerene stability. J Comput Chem 2014; 36:408-22. [PMID: 25523342 DOI: 10.1002/jcc.23816] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Received: 09/03/2014] [Revised: 10/25/2014] [Accepted: 11/23/2014] [Indexed: 11/08/2022]
Abstract
Persistent homology is a relatively new tool often used for qualitative analysis of intrinsic topological features in images and data originated from scientific and engineering applications. In this article, we report novel quantitative predictions of the energy and stability of fullerene molecules, the very first attempt in using persistent homology in this context. The ground-state structures of a series of small fullerene molecules are first investigated with the standard Vietoris-Rips complex. We decipher all the barcodes, including both short-lived local bars and long-lived global bars arising from topological invariants, and associate them with fullerene structural details. Using accumulated bar lengths, we build quantitative models to correlate local and global Betti-2 bars, respectively with the heat of formation and total curvature energies of fullerenes. It is found that the heat of formation energy is related to the local hexagonal cavities of small fullerenes, while the total curvature energies of fullerene isomers are associated with their sphericities, which are measured by the lengths of their long-lived Betti-2 bars. Excellent correlation coefficients (>0.94) between persistent homology predictions and those of quantum or curvature analysis have been observed. A correlation matrix based filtration is introduced to further verify our findings.
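Illustrative note: the accumulated bar length mentioned in the abstract can be sketched as the sum of (death minus birth) over the bars of a barcode in a given dimension. The snippet below assumes the barcode has already been computed by some persistent homology tool and is supplied as (birth, death) pairs. The function name, the input format, and the choice to skip infinite bars are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def accumulated_bar_length(bars):
    """Sum the lengths of finite bars in a persistence barcode.

    bars: iterable of (birth, death) pairs for one homology dimension,
          e.g. Betti-2 bars. This is a hypothetical input format; bars
          with infinite death are skipped here by assumption.
    """
    bars = np.asarray(list(bars), dtype=float).reshape(-1, 2)
    finite = np.isfinite(bars[:, 1])
    return float(np.sum(bars[finite, 1] - bars[finite, 0]))

# Toy barcode: two finite bars and one bar that never dies.
toy_bars = [(0.0, 1.0), (0.5, 2.0), (1.0, float("inf"))]
total = accumulated_bar_length(toy_bars)  # 1.0 + 1.5 = 2.5
```

A quantitative model of the kind the abstract describes would then regress an energy on this scalar across a set of molecules, for example with an ordinary least-squares fit. That step is omitted because it needs reference energies that are not given here.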
Affiliation(s)
- Kelin Xia
  - Department of Mathematics, Michigan State University, Michigan 48824
13
Zhan M, Li S, Li F. Wavelet transformed Gaussian network model. Journal of Theoretical & Computational Chemistry 2014. [DOI: 10.1142/s0219633614500539] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 01/03/2023]
Abstract
Accurate prediction of the Debye–Waller temperature factor of proteins is of significant importance in the study of protein dynamics and function. This work explores the utility of wavelets for improving the performance of Gaussian network model (GNM). We propose two wavelet transformed Gaussian network models (wtGNM), namely a scale-one wtGNM and a scale-two wtGNM. Based on a set of 113 protein structures, it shows that the mean correlation with experimental results for the scale-one wtGNM is 0.714 and that for the scale-two wtGNM is 0.738. In contrast, the mean correlation for the original GNM is 0.594. Therefore, the wtGNM is a potential algorithm for improving the GNM prediction of protein B-factors.
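Illustrative note: for context, the baseline GNM prediction that the abstract compares against can be sketched as follows. A Kirchhoff (connectivity) matrix is built from residue contacts within a cutoff distance, and the predicted fluctuation of each residue is taken as proportional to the corresponding diagonal entry of the matrix pseudo-inverse. The 7 Å default cutoff, the function names, and the dense pseudo-inverse are assumptions made for illustration. The wavelet transform step of wtGNM is not shown.

```python
import numpy as np

def gnm_fluctuations(coords, cutoff=7.0):
    """Sketch of GNM relative fluctuations (hypothetical helper).

    coords: (N, 3) array of residue positions (e.g. C-alpha atoms).
    cutoff: assumed contact distance in the same units as coords.
    Returns the diagonal of the Kirchhoff pseudo-inverse, which is
    proportional to the predicted B-factors up to a scale factor.
    """
    coords = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Residues are in contact if within the cutoff (self-pairs excluded).
    contact = (dist <= cutoff) & ~np.eye(len(coords), dtype=bool)
    kirchhoff = -contact.astype(float)
    np.fill_diagonal(kirchhoff, contact.sum(axis=1))
    # The Kirchhoff matrix is singular, so use the pseudo-inverse.
    return np.diag(np.linalg.pinv(kirchhoff))

def pearson(x, y):
    """Pearson correlation between predicted and reference values."""
    return float(np.corrcoef(x, y)[0, 1])

# Toy usage: a straight five-residue chain with 3.8 spacing.
chain = [[3.8 * i, 0.0, 0.0] for i in range(5)]
fluct = gnm_fluctuations(chain, cutoff=4.0)
```

In this toy chain the terminal residues come out as more mobile than the central one, as expected for a linear contact network. Comparing predictions with experimental B-factors would use `pearson(predicted, experimental)` per protein, averaged over the data set.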
Affiliation(s)
- Meng Zhan
  - Wuhan Center for Magnetic Resonance, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, Wuhan Institute of Physics and Mathematics, Chinese Academy of Sciences, Wuhan 430071, P. R. China
- Suhong Li
  - Wuhan Center for Magnetic Resonance, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, Wuhan Institute of Physics and Mathematics, Chinese Academy of Sciences, Wuhan 430071, P. R. China
  - University of the Chinese Academy of Sciences, Beijing 100049, P. R. China
- Fan Li
  - Wuhan Center for Magnetic Resonance, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, Wuhan Institute of Physics and Mathematics, Chinese Academy of Sciences, Wuhan 430071, P. R. China
  - University of the Chinese Academy of Sciences, Beijing 100049, P. R. China
14
Xia K, Wei GW. Persistent homology analysis of protein structure, flexibility, and folding. International Journal for Numerical Methods in Biomedical Engineering 2014; 30:814-44. [PMID: 24902720 PMCID: PMC4131872 DOI: 10.1002/cnm.2655] [Citation(s) in RCA: 108] [Impact Index Per Article: 10.8] [Received: 05/04/2014] [Revised: 05/19/2014] [Accepted: 05/21/2014] [Indexed: 05/04/2023]
Abstract
Proteins are the most important biomolecules for living organisms. The understanding of protein structure, function, dynamics, and transport is one of the most challenging tasks in biological science. In the present work, persistent homology is, for the first time, introduced for extracting molecular topological fingerprints (MTFs) based on the persistence of molecular topological invariants. MTFs are utilized for protein characterization, identification, and classification. The method of slicing is proposed to track the geometric origin of protein topological invariants. Both all-atom and coarse-grained representations of MTFs are constructed. A new cutoff-like filtration is proposed to shed light on the optimal cutoff distance in elastic network models. On the basis of the correlation between protein compactness, rigidity, and connectivity, we propose an accumulated bar length generated from persistent topological invariants for the quantitative modeling of protein flexibility. To this end, a correlation matrix-based filtration is developed. This approach gives rise to an accurate prediction of the optimal characteristic distance used in protein B-factor analysis. Finally, MTFs are employed to characterize protein topological evolution during protein folding and quantitatively predict the protein folding stability. Excellent consistency between our persistent homology prediction and molecular dynamics simulation is found. This work reveals the topology-function relationship of proteins.
Affiliation(s)
- Kelin Xia
  - Department of Mathematics, Michigan State University, MI 48824, USA
  - Center for Mathematical Molecular Biosciences, Michigan State University, MI 48824, USA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, MI 48824, USA
  - Center for Mathematical Molecular Biosciences, Michigan State University, MI 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA