1
Amado D, Chaves OA, Cruz PF, Loureiro RJS, Almeida ZL, Jesus CSH, Serpa C, Brito RMM. Folding Kinetics and Volume Variation of the β-Hairpin Peptide Chignolin upon Ultrafast pH-Jumps. J Phys Chem B 2024; 128:4898-4910. [PMID: 38733339 DOI: 10.1021/acs.jpcb.3c08271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/13/2024]
Abstract
In-depth characterization of fundamental folding steps of small model peptides is crucial for a better understanding of the folding mechanisms of more complex biomacromolecules. We have previously reported on the folding/unfolding kinetics of a model α-helix. Here, we study folding transitions in chignolin (GYDPETGTWG), a short β-hairpin peptide previously used as a model to study conformational changes in β-sheet proteins. Although previously suggested, until now, the role of the Tyr2-Trp9 interaction in the folding mechanism of chignolin was not clear. In the present work, pH-dependent conformational changes of chignolin were characterized by circular dichroism (CD), nuclear magnetic resonance (NMR), ultrafast pH-jump coupled with time-resolved photoacoustic calorimetry (TR-PAC), and molecular dynamics (MD) simulations. Taken together, our results present a comprehensive view of chignolin's folding kinetics upon local pH changes and the role of the Tyr2-Trp9 interaction in the folding process. CD data show that chignolin's β-hairpin formation displays a pH-dependent skew bell-shaped curve, with a maximum close to pH 6, and a large decrease in β-sheet content at alkaline pH. The β-hairpin structure is mainly stabilized by aromatic interactions between Tyr2 and Trp9 and CH-π interactions between Tyr2 and Pro4. Unfolding of chignolin at high pH demonstrates that protonation of Tyr2 is essential for the stability of the β-hairpin. Refolding studies were triggered by laser-induced pH-jumps and detected by TR-PAC. The refolding of chignolin from high pH, mainly due to the protonation of Tyr2, is characterized by a volume expansion (10.4 mL mol⁻¹), independent of peptide concentration, in the microsecond time range (lifetime of 1.15 μs). At high pH, the presence of the deprotonated hydroxyl (tyrosinate) hinders the formation of the aromatic interaction between Tyr2 and Trp9 resulting in a more disorganized and dynamic tridimensional structure of the peptide. This was also confirmed by comparing MD simulations of chignolin under conditions mimicking neutral and high pH.
Affiliation(s)
- Daniela Amado
  - CQC-IMS, Department of Chemistry, University of Coimbra, 3004-535 Coimbra, Portugal
- Otávio A Chaves
  - CQC-IMS, Department of Chemistry, University of Coimbra, 3004-535 Coimbra, Portugal
- Pedro F Cruz
  - CQC-IMS, Department of Chemistry, University of Coimbra, 3004-535 Coimbra, Portugal
- Rui J S Loureiro
  - CQC-IMS, Department of Chemistry, University of Coimbra, 3004-535 Coimbra, Portugal
- Zaida L Almeida
  - CQC-IMS, Department of Chemistry, University of Coimbra, 3004-535 Coimbra, Portugal
- Catarina S H Jesus
  - CQC-IMS, Department of Chemistry, University of Coimbra, 3004-535 Coimbra, Portugal
- Carlos Serpa
  - CQC-IMS, Department of Chemistry, University of Coimbra, 3004-535 Coimbra, Portugal
- Rui M M Brito
  - CQC-IMS, Department of Chemistry, University of Coimbra, 3004-535 Coimbra, Portugal
2
Bou Dagher L, Madern D, Malbos P, Brochier-Armanet C. Persistent homology reveals strong phylogenetic signal in 3D protein structures. PNAS Nexus 2024; 3:pgae158. [PMID: 38689707 PMCID: PMC11058471 DOI: 10.1093/pnasnexus/pgae158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Accepted: 04/01/2024] [Indexed: 05/02/2024]
Abstract
Changes that occur in proteins over time provide a phylogenetic signal that can be used to decipher their evolutionary history and the relationships between organisms. Sequence comparison is the most common way to access this phylogenetic signal, while those based on 3D structure comparisons are still in their infancy. In this study, we propose an effective approach based on Persistent Homology Theory (PH) to extract the phylogenetic information contained in protein structures. PH provides efficient and robust algorithms for extracting and comparing geometric features from noisy datasets at different spatial resolutions. PH has a growing number of applications in the life sciences, including the study of proteins (e.g. classification, folding). However, it has never been used to study the phylogenetic signal they may contain. Here, using 518 protein families, representing 22,940 protein sequences and structures, from 10 major taxonomic groups, we show that distances calculated with PH from protein structures correlate strongly with phylogenetic distances calculated from protein sequences, at both small and large evolutionary scales. We test several methods for calculating PH distances and propose some refinements to improve their relevance for addressing evolutionary questions. This work opens up new perspectives in evolutionary biology by proposing an efficient way to access the phylogenetic signal contained in protein structures, as well as future developments of topological analysis in the life sciences.
Affiliation(s)
- Léa Bou Dagher
  - Université Claude Bernard Lyon 1, CNRS, VetAgro Sup, Laboratoire de Biométrie et Biologie Évolutive, UMR5558, F-69622 Villeurbanne, France
  - Université Claude Bernard Lyon 1, CNRS, Institut Camille Jordan, UMR5208, F-69622 Villeurbanne, France
  - Université Libanaise, Laboratoire de Mathématiques, École Doctorale en Science et Technologie, PO BOX 5 Hadath, Liban
- Dominique Madern
  - University Grenoble Alpes, CEA, CNRS, IBS, 38000 Grenoble, France
- Philippe Malbos
  - Université Claude Bernard Lyon 1, CNRS, Institut Camille Jordan, UMR5208, F-69622 Villeurbanne, France
- Céline Brochier-Armanet
  - Université Claude Bernard Lyon 1, CNRS, VetAgro Sup, Laboratoire de Biométrie et Biologie Évolutive, UMR5558, F-69622 Villeurbanne, France
3
Ehiro T. Descriptor generation from Morgan fingerprint using persistent homology. SAR and QSAR in Environmental Research 2024; 35:31-51. [PMID: 38234251 DOI: 10.1080/1062936x.2023.2301327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Accepted: 12/28/2023] [Indexed: 01/19/2024]
Abstract
In cheminformatics, molecular fingerprints (FPs) are used in various tasks such as regression and classification. However, predictive models often underutilize Morgan FP for regression and related tasks in machine learning. This study introduced descriptors derived from reshaped Morgan FPs using persistent homology for the predictive accuracy improvement. In the solvation free energy (FreeSolv) and water solubility (ESOL) datasets, persistent homology was found to enhance predictive accuracy compared to the use of only Morgan FPs. Notably, using the first-order persistence diagram (PD1) for descriptor generation resulted in more significant improvements than using the zeroth-order persistence diagram (PD0). Combining 4096 bits Morgan FPs with PD1-generated descriptors increased the average coefficient of determination in the Gaussian process regression from 0.597 to 0.667 for FreeSolv and from 0.629 to 0.654 for ESOL. Adjusting the grid size parameter during PD-based descriptor generation is crucial, as finer grids, especially with PD0, generate more descriptors but reduce predictive accuracy. Coarsening the grid or applying principal component analysis (PCA) mitigates overfitting and enhances accuracy. When descriptors were generated from Morgan FPs with randomly shuffled bit positions, coarsening the grid and/or applying PCA achieved similar accuracy improvements as when the persistent homology of the original Morgan FPs was used.
Affiliation(s)
- T Ehiro
  - Research Division of Polymer Functional Materials, Osaka Research Institute of Industrial Science and Technology, Izumi, Osaka, Japan
4
Tarín-Pelló A, Suay-García B, Forés-Martos J, Falcó A, Pérez-Gracia MT. Computer-aided drug repurposing to tackle antibiotic resistance based on topological data analysis. Comput Biol Med 2023; 166:107496. [PMID: 37793206 DOI: 10.1016/j.compbiomed.2023.107496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2023] [Revised: 08/29/2023] [Accepted: 09/15/2023] [Indexed: 10/06/2023]
Abstract
The progressive emergence of antimicrobial resistance has become a global health problem in need of rapid solution. Research into new antimicrobial drugs is imperative. Drug repositioning, together with computational mathematical prediction models, could be a fast and efficient method of searching for new antibiotics. The aim of this study was to identify compounds with potential antimicrobial capacity against Escherichia coli from US Food and Drug Administration-approved drugs, and the similarity between known drug targets and E. coli proteins using a topological structure-activity data analysis model. This model has been shown to identify molecules with known antibiotic capacity, such as carbapenems and cephalosporins, as well as new molecules that could act as antimicrobials. Topological similarities were also found between E. coli proteins and proteins from different bacterial species such as Mycobacterium tuberculosis, Pseudomonas aeruginosa and Salmonella Typhimurium, which could imply that the selected molecules have a broader spectrum than expected. These molecules include antitumor drugs, antihistamines, lipid-lowering agents, hypoglycemic agents, antidepressants, nucleotides, and nucleosides, among others. The results presented in this study prove the ability of computational mathematical prediction models to predict molecules with potential antimicrobial capacity and/or possible new pharmacological targets of interest in the design of new antibiotics and in the better understanding of antimicrobial resistance.
Affiliation(s)
- Antonio Tarín-Pelló
  - Área de Microbiología, Departamento de Farmacia, Instituto de Ciencias Biomédicas, Facultad de Ciencias de la Salud Universidad Cardenal Herrera-CEU, CEU Universities, C/ Santiago Ramón y Cajal, 46115, Alfara del Patriarca, Valencia, Spain
- Beatriz Suay-García
  - ESI International Chair@CEU-UCH, Departamento de Matemáticas, Física y Ciencias Tecnológicas, Universidad Cardenal Herrera-CEU, CEU Universities, C/ San Bartolomé 55, 46115, Alfara del Patriarca, Valencia, Spain
- Jaume Forés-Martos
  - ESI International Chair@CEU-UCH, Departamento de Matemáticas, Física y Ciencias Tecnológicas, Universidad Cardenal Herrera-CEU, CEU Universities, C/ San Bartolomé 55, 46115, Alfara del Patriarca, Valencia, Spain
- Antonio Falcó
  - ESI International Chair@CEU-UCH, Departamento de Matemáticas, Física y Ciencias Tecnológicas, Universidad Cardenal Herrera-CEU, CEU Universities, C/ San Bartolomé 55, 46115, Alfara del Patriarca, Valencia, Spain
- María-Teresa Pérez-Gracia
  - Área de Microbiología, Departamento de Farmacia, Instituto de Ciencias Biomédicas, Facultad de Ciencias de la Salud Universidad Cardenal Herrera-CEU, CEU Universities, C/ Santiago Ramón y Cajal, 46115, Alfara del Patriarca, Valencia, Spain
5
Bobrowski O, Skraba P. A universal null-distribution for topological data analysis. Sci Rep 2023; 13:12274. [PMID: 37507400 PMCID: PMC10382541 DOI: 10.1038/s41598-023-37842-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/27/2022] [Accepted: 06/28/2023] [Indexed: 07/30/2023]
Abstract
One of the most elusive challenges within the area of topological data analysis is understanding the distribution of persistence diagrams arising from data. Despite much effort and its many successful applications, this is largely an open problem. We present a surprising discovery: normalized properly, persistence diagrams arising from random point-clouds obey a universal probability law. Our statements are based on extensive experimentation on both simulated and real data, covering point-clouds with vastly different geometry, topology, and probability distributions. Our results also include an explicit well-known distribution as a candidate for the universal law. We demonstrate the power of these new discoveries by proposing a new hypothesis testing framework for computing significance values for individual topological features within persistence diagrams, providing a new quantitative way to assess the significance of structure in data.
Affiliation(s)
- Omer Bobrowski
  - Viterbi Faculty of Electrical and Computer Engineering, Technion - Israel Institute of Technology, Haifa, Israel
  - School of Mathematical Sciences, Queen Mary University of London, London, UK
- Primoz Skraba
  - School of Mathematical Sciences, Queen Mary University of London, London, UK
6
Skaf Y, Laubenbacher R. Topological data analysis in biomedicine: A review. J Biomed Inform 2022; 130:104082. [PMID: 35508272 DOI: 10.1016/j.jbi.2022.104082] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 11/30/2021] [Revised: 03/20/2022] [Accepted: 04/23/2022] [Indexed: 01/22/2023]
Abstract
Significant technological advances made in recent years have shepherded a dramatic increase in utilization of digital technologies for biomedicine: everything from the widespread use of electronic health records to improved medical imaging capabilities and the rising ubiquity of genomic sequencing contribute to a "digitization" of biomedical research and clinical care. With this shift toward computerized tools comes a dramatic increase in the amount of available data, and current tools for data analysis capable of extracting meaningful knowledge from this wealth of information have yet to catch up. This article seeks to provide an overview of emerging mathematical methods with the potential to improve the abilities of clinicians and researchers to analyze biomedical data, but may be hindered from doing so by a lack of conceptual accessibility and awareness in the life sciences research community. In particular, we focus on topological data analysis (TDA), a set of methods grounded in the mathematical field of algebraic topology that seeks to describe and harness features related to the "shape" of data. We aim to make such techniques more approachable to non-mathematicians by providing a conceptual discussion of their theoretical foundations followed by a survey of their published applications to scientific research. Finally, we discuss the limitations of these methods and suggest potential avenues for future work integrating mathematical tools into clinical care and biomedical informatics.
Affiliation(s)
- Yara Skaf
  - University of Florida, Department of Mathematics, Gainesville, FL, USA
  - University of Florida, Department of Medicine, Division of Pulmonary, Critical Care, & Sleep Medicine, Gainesville, FL, USA
- Reinhard Laubenbacher
  - University of Florida, Department of Mathematics, Gainesville, FL, USA
  - University of Florida, Department of Medicine, Division of Pulmonary, Critical Care, & Sleep Medicine, Gainesville, FL, USA
7
Le MQ, Taylor D. Persistent homology of convection cycles in network flows. Phys Rev E 2022; 105:044311. [PMID: 35590622 DOI: 10.1103/physreve.105.044311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2021] [Accepted: 03/29/2022] [Indexed: 06/15/2023]
Abstract
Convection is a well-studied topic in fluid dynamics, yet it is less understood in the context of network flows. Here, we incorporate techniques from topological data analysis (namely, persistent homology) to automate the detection and characterization of convective flows (also called cyclic or chiral flows) over networks, particularly those that arise for irreversible Markov chains. As two applications, we study convection cycles arising under the PageRank algorithm and we investigate chiral edge flows for a stochastic model of a bimonomer's configuration dynamics. Our experiments highlight how system parameters (e.g., the teleportation rate for PageRank and the transition rates of external and internal state changes for a monomer) can act as homology regularizers of convection, which we summarize with persistence barcodes and homological bifurcation diagrams. Our approach establishes a connection between the study of convection cycles and homology, the branch of mathematics that formally studies cycles, which has diverse potential applications throughout the sciences and engineering.
Affiliation(s)
- Minh Quang Le
  - Department of Mathematics, University at Buffalo, State University of New York, Buffalo, New York 14260, USA
- Dane Taylor
  - Department of Mathematics, University at Buffalo, State University of New York, Buffalo, New York 14260, USA
8
Ichinomiya T. Topological data analysis gives two folding paths in HP35(nle-nle), double mutant of villin headpiece subdomain. Sci Rep 2022; 12:2719. [PMID: 35177744 PMCID: PMC8854739 DOI: 10.1038/s41598-022-06682-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2021] [Accepted: 02/04/2022] [Indexed: 11/16/2022]
Abstract
The folding dynamics of proteins is a primary area of interest in protein science. We carried out topological data analysis (TDA) of the folding process of HP35(nle-nle), a double-mutant of the villin headpiece subdomain. Using persistent homology and non-negative matrix factorization, we reduced the dimension of protein structure and investigated the flow in the reduced space. We found this protein has two folding paths, distinguished by the pairings of inter-helix residues. Our analysis showed the excellent performance of TDA in capturing the formation of tertiary structure.
Affiliation(s)
- Takashi Ichinomiya
  - Department of Systems Biology, Gifu University School of Medicine, Yanagido 1-1, Gifu, 501-1194, Japan
  - The United Graduate School of Drug Discovery and Medical Information Sciences of Gifu University, Yanagido 1-1, Gifu, 501-1194, Japan
9
Moroni D, Pascali MA. Learning Topology: Bridging Computational Topology and Machine Learning. Pattern Recognition and Image Analysis 2021. [DOI: 10.1134/s1054661821030184] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/23/2022]