1
Wozniak S, Janson G, Feig M. Accurate Predictions of Molecular Properties of Proteins via Graph Neural Networks and Transfer Learning. bioRxiv 2024:2024.12.10.627714. [PMID: 39713395 PMCID: PMC11661272 DOI: 10.1101/2024.12.10.627714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2024]
Abstract
Machine learning has emerged as a promising approach for predicting molecular properties of proteins, as it addresses limitations of experimental and traditional computational methods. Here, we introduce GSnet, a graph neural network (GNN) trained to predict physicochemical and geometric properties including solvation free energies, diffusion constants, and hydrodynamic radii, based on three-dimensional protein structures. By leveraging transfer learning, pre-trained GSnet embeddings were adapted to predict solvent-accessible surface area (SASA) and residue-specific pKa values, achieving high accuracy and generalizability. Notably, GSnet outperformed existing protein embeddings for SASA prediction, and a locally charge-aware variant, aLCnet, approached the accuracy of simulation-based and empirical methods for pKa prediction. Our GNN framework demonstrated robustness across diverse datasets, including intrinsically disordered peptides, and scalability for high-throughput applications. These results highlight the potential of GNN-based embeddings and transfer learning to advance protein structure analysis, providing a foundation for integrating predictive models into proteome-wide studies and structural biology pipelines.
Affiliation(s)
- Spencer Wozniak
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
- Giacomo Janson
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
- Michael Feig
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
2
Ramos FC, Martínez L. Molecular dynamics and solvation structures of the β-glucosidase from Humicola insolens (BGHI) in aqueous solutions containing glucose. Int J Biol Macromol 2024; 286:138210. [PMID: 39617236 DOI: 10.1016/j.ijbiomac.2024.138210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2024] [Revised: 11/27/2024] [Accepted: 11/28/2024] [Indexed: 12/15/2024]
Abstract
The β-glucosidase enzyme is a glycosyl hydrolase that breaks down the β-1,4 linkage of cellobiose. It is inhibited by glucose at high concentrations due to competitive inhibition. However, at lower glucose concentrations, the glucose-tolerant β-glucosidase from Humicola insolens (BGHI) undergoes stimulation. Proteins, in aqueous sugar solutions, tend to be preferentially hydrated, which generally promotes their stabilization. Thus, solvation phenomena may contribute to both glucose tolerance and stimulation processes. We have performed atomistic classical Molecular Dynamics (MD) simulations of BGHI at different glucose concentrations to mimic the conditions found in the catalytic experiments. A detailed examination of the solvent environment through the calculation of minimum distance distribution functions (MDDFs) and Kirkwood-Buff (KB) integrals was performed. The enzyme is preferentially hydrated in the presence of glucose at all concentrations. Nevertheless, the hydration does not prevent the glucose from directly interacting with the BGHI surface or from entering the active site. Based on the obtained results, we hypothesize that preferential hydration is beneficial for enzyme activity. At the same time, product inhibition has little effect at lower concentrations of glucose, and at higher glucose concentrations, competition for the active site becomes predominant and the enzyme is primarily inhibited.
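As a rough illustration of the Kirkwood-Buff analysis mentioned in this abstract, the sketch below evaluates the textbook radial form G = 4π∫(g(r) − 1) r² dr with the trapezoidal rule and combines two such integrals into a preferential interaction parameter, Γ ≈ ρ_c (G_pc − G_pw). It is not the authors' MDDF-based workflow: the function names, the toy excluded-volume g(r), and all numerical values are assumptions chosen for illustration.

```python
import math


def kirkwood_buff_integral(r, g):
    """Trapezoidal estimate of G = 4*pi * integral of (g(r) - 1) * r^2 dr.

    r and g are equal-length sequences: distances and the distribution
    function sampled at those distances.
    """
    if len(r) != len(g) or len(r) < 2:
        raise ValueError("r and g must have the same length (at least 2)")
    total = 0.0
    for i in range(1, len(r)):
        f_prev = (g[i - 1] - 1.0) * r[i - 1] ** 2
        f_curr = (g[i] - 1.0) * r[i] ** 2
        total += 0.5 * (f_prev + f_curr) * (r[i] - r[i - 1])
    return 4.0 * math.pi * total


def preferential_interaction(rho_cosolvent, g_protein_cosolvent, g_protein_water):
    """Gamma ~ rho_c * (G_pc - G_pw); a negative value indicates preferential hydration."""
    return rho_cosolvent * (g_protein_cosolvent - g_protein_water)


# Toy distribution (assumption): complete exclusion below r = 1, bulk-like beyond.
r_grid = [i / 1000 for i in range(0, 3001)]
g_toy = [0.0 if r < 1.0 else 1.0 for r in r_grid]
G_toy = kirkwood_buff_integral(r_grid, g_toy)  # close to -4*pi/3 (excluded volume)
```

In this convention, a cosolvent such as glucose is preferentially excluded (the protein is preferentially hydrated) when the protein-water integral exceeds the protein-cosolvent integral, which makes Γ negative.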
Affiliation(s)
- Felipe Cardoso Ramos
  - Institute of Chemistry and Center for Computing in Engineering and Science - CCES, Universidade Estadual de Campinas (UNICAMP), Brazil
- Leandro Martínez
  - Institute of Chemistry and Center for Computing in Engineering and Science - CCES, Universidade Estadual de Campinas (UNICAMP), Brazil
3
Tammara V, Doke AA, Jha SK, Das A. Deciphering the Monomeric and Dimeric Conformational Landscapes of the Full-Length TDP-43 and the Impact of the C-Terminal Domain. ACS Chem Neurosci 2024. [PMID: 39548975 DOI: 10.1021/acschemneuro.4c00557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2024]
Abstract
The aberrant aggregation of TAR DNA-binding protein 43 kDa (TDP-43) in cells leads to the pathogenesis of multiple fatal neurodegenerative diseases. Decoding the proposed initial transition between its functional dimeric and aggregation-prone monomeric states could guide the design of a viable therapeutic strategy, an effort presently limited by the lack of structural detail for the full-length TDP-43. To achieve a complete understanding of such a delicate phase space, we employed a multiscale simulation approach that unearths numerous crucial features, broadly summarized in two categories: (1) state-independent features that involve inherent chain collapsibility, rugged polymorphic landscape dictated by the terminal domains, high β-sheet propensity, structural integrity preserved by backbone-based intrachain hydrogen bonds and electrostatic forces, the prominence of the C-terminal domain in the intrachain cross-domain interfaces, and equal participation of hydrophobic and hydrophilic (charged and polar) residues in cross-domain interfaces; and (2) dimerization-modulated characteristics that encompass slower collapsing dynamics, restricted polymorphic landscape, the dominance of side chains in interchain hydrogen bonds, the appearance of the N-terminal domain in the dimer interface, and the prominence of hydrophilic (specifically polar) residues in interchain homo- and cross-domain interfaces. In our work, the poorly characterized C-terminal domain appears as the most crucial structure-dictating domain: in its isolated state it preferentially populates a compact conformation with a high β-sheet propensity, stabilized by intrabackbone hydrogen bonds, and these signatures are comparatively faded in its integrated form. Validation of our simulated observables by a complementary spectroscopic approach on multiple counts ensures the robustness of the computationally predicted features of the TDP-43 aggregation landscape.
Affiliation(s)
- Vaishnavi Tammara
  - Physical and Materials Chemistry Division, CSIR-National Chemical Laboratory, Dr. Homi Bhabha Road, Pune, Maharashtra 411008, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Abhilasha A Doke
  - Physical and Materials Chemistry Division, CSIR-National Chemical Laboratory, Dr. Homi Bhabha Road, Pune, Maharashtra 411008, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Santosh Kumar Jha
  - Physical and Materials Chemistry Division, CSIR-National Chemical Laboratory, Dr. Homi Bhabha Road, Pune, Maharashtra 411008, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Atanu Das
  - Physical and Materials Chemistry Division, CSIR-National Chemical Laboratory, Dr. Homi Bhabha Road, Pune, Maharashtra 411008, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
4
Gandhi VD, Hua L, Lawrenz M, Latif M, Rolland AD, Campuzano IDG, Larriba-Andaluz C. Elucidating Protein Structures in the Gas Phase: Traversing Configuration Space with Biasing Methods. J Chem Theory Comput 2024; 20:9720-9733. [PMID: 39439194 DOI: 10.1021/acs.jctc.4c00288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2024]
Abstract
Achieving accurate characterization of protein structures in the gas phase continues to be a formidable challenge. To tackle this issue, the present study employs Molecular Dynamics (MD) simulations in tandem with enhanced sampling techniques (methods designed to efficiently explore protein conformations). The objective is to identify suitable structures of proteins by contrasting their calculated Collision Cross-Section (CCS) with those observed experimentally. Significant discrepancies were observed between the initial MD-simulated and experimentally measured CCS values through Ion Mobility-Mass Spectrometry (IMS-MS). To bridge this gap, we employed two distinct enhanced sampling methods, Harmonic Biasing Potential and Adaptive Biasing Force, which help the proteins overcome energy barriers to adopt more compact configurations. These techniques leverage the radius of gyration as a reaction coordinate (guiding parameter), guiding the system toward compressed states that potentially match experimental configurations more closely. The guiding forces are only employed to overcome existing barriers and are removed to allow the protein to naturally arrive at a potential gas phase configuration. The results demonstrated close alignment (within ∼4%) between simulated and experimental CCS values despite using different strengths and/or methods, validating their efficacy. This work lays the groundwork for future studies aimed at optimizing biasing methods and expanding the collective variables used for more accurate gas-phase structural predictions.
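To make the idea of biasing along the radius of gyration more concrete, here is a minimal sketch of the two quantities involved: the radius of gyration of a set of coordinates and a harmonic biasing energy U = ½·k·(Rg − Rg_target)². A helper for the percent deviation between simulated and experimental CCS values is included as well. The function names, the force constant, the target value, and the CCS numbers are hypothetical and do not come from the study; no MD engine or CCS calculator is involved.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of 3D coordinates [(x, y, z), ...]."""
    n = len(coords)
    if n == 0:
        raise ValueError("coords must not be empty")
    if masses is None:
        masses = [1.0] * n  # equal weights if no masses are supplied
    total_mass = sum(masses)
    # Centre of mass, one Cartesian component at a time.
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total_mass for k in range(3)]
    weighted_sq = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3)) for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)


def harmonic_bias_energy(rg, rg_target, force_constant):
    """Harmonic restraint on the radius of gyration: 0.5 * k * (Rg - Rg_target)^2."""
    return 0.5 * force_constant * (rg - rg_target) ** 2


def percent_deviation(simulated, experimental):
    """Absolute deviation of a simulated value from experiment, in percent."""
    return abs(simulated - experimental) / experimental * 100.0


# Toy example (assumed numbers): two equal-mass atoms 2 length units apart.
rg = radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)])  # Rg = 1.0
bias = harmonic_bias_energy(rg, rg_target=0.5, force_constant=4.0)  # 0.5
```

Setting the target below the current radius of gyration pushes the structure toward more compact states; as the abstract notes, the bias is then removed so the protein can relax on its own.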
Affiliation(s)
- Viraj D Gandhi
  - Department of Mechanical Engineering, Purdue University, West Lafayette, Indiana 47907, United States
  - Department of Mechanical and Energy Engineering, Indiana University-Purdue University, Indianapolis, Indiana 46202, United States
- Leyan Hua
  - Department of Mechanical Engineering, Purdue University, West Lafayette, Indiana 47907, United States
  - Department of Mechanical and Energy Engineering, Indiana University-Purdue University, Indianapolis, Indiana 46202, United States
- Morgan Lawrenz
  - Molecular Analytics, AMGEN Research, Thousand Oaks, California 91320, United States
- Mohsen Latif
  - Department of Mechanical Engineering, Purdue University, West Lafayette, Indiana 47907, United States
  - Department of Mechanical and Energy Engineering, Indiana University-Purdue University, Indianapolis, Indiana 46202, United States
- Amber D Rolland
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon 97403, United States
- Iain D G Campuzano
  - Molecular Analytics, AMGEN Research, Thousand Oaks, California 91320, United States
- Carlos Larriba-Andaluz
  - Department of Mechanical Engineering, Purdue University, West Lafayette, Indiana 47907, United States
  - Department of Mechanical and Energy Engineering, Indiana University-Purdue University, Indianapolis, Indiana 46202, United States
5
Shen M, Kortzak D, Ambrozak S, Bhatnagar S, Buchanan I, Liu R, Shen J. KaMLs for Predicting Protein pKa Values and Ionization States: Are Trees All You Need? bioRxiv 2024:2024.11.09.622800. [PMID: 39605739 PMCID: PMC11601431 DOI: 10.1101/2024.11.09.622800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2024]
Abstract
Despite its relevance for understanding biology and computer-aided drug discovery, accurate prediction of protein ionization states remains a formidable challenge. Physics-based approaches struggle to capture the small, competing contributions in the complex protein environment, while machine learning (ML) is hampered by scarcity of experimental data. Here we developed the pKa ML (KaML) models based on decision trees and graph attention networks (GATs), exploiting physicochemical features and a new experimental pKa database (PKAD-3) enriched with highly shifted pKa's. KaML-CBtree significantly outperforms the current state of the art in predicting pKa values and ionization states across all six titratable amino acids, notably achieving accurate predictions for deprotonated cysteines and lysines, a blind spot in previous models. The superior performance of KaMLs is achieved in part through several innovations, including separate treatment of acid and base, utilization of pKa shifts as training targets, data augmentation using AlphaFold structures, and model pre-training on a theoretical pKa database. A meta-feature analysis reveals why the lightweight tree model outperforms the more complex deep learning GAT. We release an end-to-end pKa predictor based on KaML-CBtree and the new database PKAD-3, enabling applications and laying groundwork for further advances in protein electrostatics research.
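The phrase "pKa shifts as training targets" can be illustrated with a small sketch: the regression target is the difference between a residue's pKa in the protein and a reference (model-compound) pKa for that residue type, and the ionization state at a given pH follows from comparing the pKa with the pH. The reference values below are commonly quoted model pKa's used purely as an illustrative assumption; they are not claimed to be the ones used by KaML, and the function names are hypothetical.

```python
# Illustrative reference (model-compound) pKa values; an assumption, not the
# reference set used by the authors.
MODEL_PKA = {
    "ASP": 3.67,
    "GLU": 4.25,
    "HIS": 6.54,
    "CYS": 8.55,
    "TYR": 9.84,
    "LYS": 10.40,
}


def pka_shift(residue, pka_in_protein):
    """Training target: pKa in the protein minus the reference pKa."""
    return pka_in_protein - MODEL_PKA[residue]


def shift_to_pka(residue, predicted_shift):
    """Convert a predicted shift back into an absolute pKa."""
    return MODEL_PKA[residue] + predicted_shift


def is_deprotonated(pka, ph):
    """A site is predominantly deprotonated when the pH exceeds its pKa."""
    return ph > pka


# Hypothetical buried lysine with a strongly downshifted pKa of 5.7.
shift = pka_shift("LYS", 5.7)            # about -4.7
neutral_lys = is_deprotonated(5.7, 7.0)  # True: deprotonated at pH 7
```

Learning shifts rather than absolute values removes the large residue-type offset from the target, so the model only has to explain the effect of the protein environment.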
Affiliation(s)
- Mingzhe Shen
  - Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, MD 21201
- Daniel Kortzak
  - Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, MD 21201
- Simon Ambrozak
  - Department of Computer Science, University of Maryland College Park, College Park, MD 20742
- Shubham Bhatnagar
  - Department of Computer Science, University of Maryland College Park, College Park, MD 20742
- Ruibin Liu
  - Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, MD 21201
- Jana Shen
  - Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, MD 21201
6
Ferreira SGF, Sriramoju MK, Hsu STD, Faísca PFN, Machuqueiro M. Is There a Functional Role for the Knotted Topology in Protein UCH-L1? J Chem Inf Model 2024; 64:6827-6837. [PMID: 39045738 PMCID: PMC11388461 DOI: 10.1021/acs.jcim.4c00880] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/25/2024]
Abstract
Knotted proteins are present in nature, but there is still an open issue regarding the existence of a universal role for these remarkable structures. To address this question, we used classical molecular dynamics (MD) simulations combined with in vitro experiments to investigate the role of the Gordian knot in the catalytic activity of UCH-L1. To create an unknotted form of UCH-L1, we modified its amino acid sequence by truncating several residues from its N-terminus. Remarkably, we find that deleting the first two N-terminal residues leads to a partial loss of enzyme activity with conservation of secondary structural content and knotted topological state. This happens because the integrity of the N-terminus is critical to ensure the correct alignment of the catalytic triad. However, the removal of five residues from the N-terminus, which significantly disrupts the native structure and the topological state, leads to a complete loss of enzymatic activity. Overall, our findings indicate that UCH-L1's catalytic activity depends critically on the integrity of the N-terminus and the secondary structure content, with the latter being strongly coupled with the knotted topological state.
Affiliation(s)
- Sara G F Ferreira
  - BioISI - Instituto de Biossistemas e Ciências Integrativas, Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal
- Manoj K Sriramoju
  - Institute of Biological Chemistry, Academia Sinica, Taipei 11529, Taiwan
- Shang-Te Danny Hsu
  - Institute of Biological Chemistry, Academia Sinica, Taipei 11529, Taiwan
  - International Institute for Sustainability with Knotted Chiral Meta Matter (WPI-SKCM2), Hiroshima University, 1-3-1 Kagamiyama, Higashi-Hiroshima, Hiroshima 739-8526, Japan
  - Institute of Biochemical Sciences, National Taiwan University, Taipei 11529, Taiwan
- Patrícia F N Faísca
  - BioISI - Instituto de Biossistemas e Ciências Integrativas, Departamento de Física, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal
- Miguel Machuqueiro
  - BioISI - Instituto de Biossistemas e Ciências Integrativas, Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal
7
Barreto CAV, Vitorino JNM, Reis PBPS, Machuqueiro M, Moreira IS. pKa Calculations of GPCRs: Understanding Protonation States in Receptor Activation. J Chem Inf Model 2024; 64:6850-6856. [PMID: 39150719 PMCID: PMC11388449 DOI: 10.1021/acs.jcim.4c01125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Abstract
The increase in the available G protein-coupled receptor (GPCR) structures has been pivotal in helping to understand their activation process. However, the role of protonation-conformation coupling in GPCR activation still needs to be clarified. We studied the protonation behavior of the highly conserved Asp2.50 residue in five different class A GPCRs (active and inactive conformations) using a linear response approximation (LRA) pKa calculation protocol. We observed consistent differences (1.3 pK units) for the macroscopic pKa values between the inactive and active states of the A2AR and B2AR receptors, indicating the protonation of Asp2.50 during GPCR activation. This process seems to be specific and not conserved, as no differences were observed in the pKa values of the remaining receptors (CB1R, NT1R, and GHSR).
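To give the reported 1.3 pK unit difference an energetic scale, a pKa difference can be converted into a free-energy difference with ΔG = ln(10)·R·T·ΔpKa. The sketch below does this conversion; the temperature of 298.15 K is an assumption (the abstract does not state one), and the function name is hypothetical.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol*K)


def pka_difference_to_free_energy(delta_pka, temperature_k=298.15):
    """Free-energy equivalent (kJ/mol) of a pKa difference: ln(10) * R * T * dpKa."""
    return math.log(10.0) * R_KJ_PER_MOL_K * temperature_k * delta_pka


# 1.3 pK units at the assumed temperature of 298.15 K.
delta_g = pka_difference_to_free_energy(1.3)  # roughly 7.4 kJ/mol
```

At this temperature one pK unit corresponds to about 5.7 kJ/mol, so the inactive-to-active difference reported for Asp2.50 amounts to roughly 7.4 kJ/mol in protonation free energy.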
Affiliation(s)
- Carlos A V Barreto
  - PhD Programme in Experimental Biology and Biomedicine, Institute for Interdisciplinary Research (IIIUC), University of Coimbra, Casa Costa Alemão, 3030-789 Coimbra, Portugal
  - CNC - Center for Neuroscience and Cell Biology, Center for Innovative Biomedicine and Biotechnology, University of Coimbra, 3004-504 Coimbra, Portugal
- João N M Vitorino
  - BioISI - Instituto de Biossistemas e Ciências Integrativas, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal
- Pedro B P S Reis
  - BioISI - Instituto de Biossistemas e Ciências Integrativas, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal
- Miguel Machuqueiro
  - BioISI - Instituto de Biossistemas e Ciências Integrativas, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal
- Irina S Moreira
  - CNC - Center for Neuroscience and Cell Biology, Center for Innovative Biomedicine and Biotechnology, University of Coimbra, 3004-504 Coimbra, Portugal
  - Department of Life Sciences, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal
8
Calinsky R, Levy Y. Histidine in Proteins: pH-Dependent Interplay between π-π, Cation-π, and CH-π Interactions. J Chem Theory Comput 2024; 20:6930-6945. [PMID: 39037905 PMCID: PMC11325542 DOI: 10.1021/acs.jctc.4c00606] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/24/2024]
Abstract
Histidine (His) stands out as the most versatile natural amino acid due to its side chain's facile propensity to protonate at physiological pH, leading to a transition from aromatic to cationic characteristics and thereby enabling diverse biomolecular interactions. In this study, our objective was to quantify the energetics and geometries of pairwise interactions involving His at varying pH levels. Through quantum chemical calculations, we discovered that His exhibits robust participation in both π-π and cation-π interactions, underscoring its ability to adopt a π or cationic nature, akin to other common residues. Of particular note, we found that the affinity of protonated His for aromatic residues (via cation-π interactions) is greater than the affinity of neutral His for either cationic residues (also via cation-π interactions) or aromatic residues (via π-π interactions). Furthermore, His frequently engages in CH-π interactions, and notably, depending on its protonation state, we found that some instances of hydrogen bonding by His exhibit greater stability than is typical for interamino acid hydrogen bonds. The strength of the pH-dependent pairwise energies of His with aromatic residues is supported by the abundance of pairwise interactions with His of low and high predicted pKa values. Overall, our findings illustrate the contribution of His interactions to protein stability and its potential involvement in conformational changes despite its relatively low abundance in proteins.
Affiliation(s)
- Rivka Calinsky
  - Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot 76100, Israel
- Yaakov Levy
  - Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot 76100, Israel
9
Reis PBPS, Clevert DA, Machuqueiro M. PypKa server: online pKa predictions and biomolecular structure preparation with precomputed data from PDB and AlphaFold DB. Nucleic Acids Res 2024; 52:W294-W298. [PMID: 38619040 PMCID: PMC11223823 DOI: 10.1093/nar/gkae255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 03/14/2024] [Accepted: 03/28/2024] [Indexed: 04/16/2024]
Abstract
When preparing biomolecular structures for molecular dynamics simulations, pKa calculations are required to provide at least a representative protonation state at a given pH value. Neglecting this step and adopting the reference protonation states of the amino acid residues in water often leads to wrong electrostatics and nonphysical simulations. Fortunately, several methods have been developed to prepare structures considering the protonation preference of residues in their specific environments (pKa values), and some are even available for online usage. In this work, we present the PypKa server, which allows users to run physics-based, as well as ML-accelerated methods suitable for larger systems, to obtain pKa values, isoelectric points, titration curves, and structures with representative pH-dependent protonation states compatible with commonly used force fields (AMBER, CHARMM, GROMOS). The user may upload a custom structure or submit an identifier code from PDB or UniProtKB. The results for over 200k structures taken from the Protein Data Bank and the AlphaFold DB have been precomputed, and their data can be retrieved without extra calculations. All this information can also be obtained from an application programming interface (API) facilitating its usage and integration into existing pipelines as well as other web services. The web server is available at pypka.org.
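The link between a pKa value and a "representative protonation state at a given pH" can be sketched with the Henderson-Hasselbalch relation for a single, independent titratable site: the protonated fraction is 1 / (1 + 10^(pH − pKa)). This is a simplified illustration only; it ignores coupling between sites, does not call the PypKa server or its API, and the function names and example pKa values are assumptions.

```python
def protonated_fraction(pka, ph):
    """Henderson-Hasselbalch protonated fraction of one independent site."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def representative_state(pka, ph):
    """Pick the majority protonation state at the given pH."""
    return "protonated" if protonated_fraction(pka, ph) > 0.5 else "deprotonated"


# Hypothetical sites at pH 7.0: an aspartate-like pKa of 4.0 and a lysine-like pKa of 10.4.
asp_state = representative_state(4.0, 7.0)   # "deprotonated"
lys_state = representative_state(10.4, 7.0)  # "protonated"
```

A site whose pKa lies close to the simulation pH has a protonated fraction near 0.5, which is exactly the case where assuming the reference state in water is most likely to be wrong.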
Affiliation(s)
- Pedro B P S Reis
  - BioISI – Instituto de Biossistemas e Ciências Integrativas, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal
  - Machine Learning Research, Bayer AG, Müllerstraße 178, 13353 Berlin, Germany
- Djork-Arné Clevert
  - Machine Learning Research, Bayer AG, Müllerstraße 178, 13353 Berlin, Germany
  - Machine Learning Research, Pfizer, Berlin, Germany
- Miguel Machuqueiro
  - BioISI – Instituto de Biossistemas e Ciências Integrativas, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal
10
Weymuth T, Unsleber JP, Türtscher PL, Steiner M, Sobez JG, Müller CH, Mörchen M, Klasovita V, Grimmel SA, Eckhoff M, Csizi KS, Bosia F, Bensberg M, Reiher M. SCINE-Software for chemical interaction networks. J Chem Phys 2024; 160:222501. [PMID: 38857173 DOI: 10.1063/5.0206974] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Accepted: 05/09/2024] [Indexed: 06/12/2024]
Abstract
The software for chemical interaction networks (SCINE) project aims at pushing the frontier of quantum chemical calculations on molecular structures to a new level. While calculations on individual structures as well as on simple relations between them have become routine in chemistry, new developments have pushed the frontier in the field to high-throughput calculations. Chemical relations may be created by a search for specific molecular properties in a molecular design attempt, or they can be defined by a set of elementary reaction steps that form a chemical reaction network. The software modules of SCINE have been designed to facilitate such studies. The features of the modules are (i) general applicability of the applied methodologies ranging from electronic structure (no restriction to specific elements of the periodic table) to microkinetic modeling (with little restrictions on molecularity), full modularity so that SCINE modules can also be applied as stand-alone programs or be exchanged for external software packages that fulfill a similar purpose (to increase options for computational campaigns and to provide alternatives in case of tasks that are hard or impossible to accomplish with certain programs), (ii) high stability and autonomous operations so that control and steering by an operator are as easy as possible, and (iii) easy embedding into complex heterogeneous environments for molecular structures taken individually or in the context of a reaction network. A graphical user interface unites all modules and ensures interoperability. All components of the software have been made available as open source and free of charge.
Affiliation(s)
- Thomas Weymuth
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Jan P Unsleber
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Paul L Türtscher
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Miguel Steiner
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Jan-Grimo Sobez
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Charlotte H Müller
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Maximilian Mörchen
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Veronika Klasovita
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Stephanie A Grimmel
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Marco Eckhoff
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Katja-Sophia Csizi
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Francesco Bosia
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Moritz Bensberg
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Markus Reiher
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
11
Liu S, Yang Q, Zhang L, Luo S. Accurate Protein pKa Prediction with Physical Organic Chemistry Guided 3D Protein Representation. J Chem Inf Model 2024; 64:4410-4418. [PMID: 38780156 DOI: 10.1021/acs.jcim.4c00354] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
Protein pKa is a fundamental physicochemical parameter that dictates protein structure and function. However, accurately determining protein site-pKa values remains a substantial challenge, both experimentally and theoretically. In this study, we introduce a physical organic approach, leveraging a protein structural and physical-organic-parameter-based representation (P-SPOC), to develop a rapid and intuitive model for protein pKa prediction. Our P-SPOC model achieves state-of-the-art predictive accuracy, with a mean absolute error (MAE) of 0.33 pKa units. Furthermore, we have incorporated advanced protein structure prediction models, like AlphaFold2, to approximate structures for proteins lacking three-dimensional representations, which enhances the applicability of our model in the context of structure-undetermined protein research. To promote broader accessibility within the research community, an online prediction interface was also established at isyn.luoszgroup.com.
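For readers unfamiliar with the metric quoted above, the mean absolute error (MAE) is the average of |predicted − reference| over all sites. The sketch below computes it for a pair of made-up pKa lists; the numbers are hypothetical and unrelated to the P-SPOC benchmark.

```python
def mean_absolute_error(predicted, reference):
    """Average absolute difference between predicted and reference values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Hypothetical predicted and experimental pKa values for two sites.
mae = mean_absolute_error([4.3, 6.1], [4.0, 6.5])  # (0.3 + 0.4) / 2 = 0.35
```

An MAE of 0.33 pKa units therefore means that, on average, predictions differ from the reference values by about a third of a pK unit.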
Affiliation(s)
- Siyuan Liu
  - Center of Basic Molecular Science, Department of Chemistry, Tsinghua University, Beijing 100084, China
- Qi Yang
  - Center of Basic Molecular Science, Department of Chemistry, Tsinghua University, Beijing 100084, China
- Long Zhang
  - Center of Basic Molecular Science, Department of Chemistry, Tsinghua University, Beijing 100084, China
- Sanzhong Luo
  - Center of Basic Molecular Science, Department of Chemistry, Tsinghua University, Beijing 100084, China
12
Csizi KS, Reiher M. Automated preparation of nanoscopic structures: Graph-based sequence analysis, mismatch detection, and pH-consistent protonation with uncertainty estimates. J Comput Chem 2024; 45:761-776. [PMID: 38124290 DOI: 10.1002/jcc.27276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2023] [Accepted: 11/14/2023] [Indexed: 12/23/2023]
Abstract
Structure and function in nanoscale atomistic assemblies are tightly coupled, and every atom with its specific position and even every electron will have a decisive effect on the electronic structure, and hence, on the molecular properties. Molecular simulations of nanoscopic atomistic structures therefore require accurately resolved three-dimensional input structures. If extracted from experiment, these structures often suffer from severe uncertainties, of which the lack of information on hydrogen atoms is a prominent example. Hence, experimental structures require careful review and curation, which is a time-consuming and error-prone process. Here, we present a fast and robust protocol for the automated structure analysis and pH-consistent protonation, in short, ASAP. For biomolecules as a target, the ASAP protocol integrates sequence analysis and error assessment of a given input structure. ASAP allows for pKa prediction from reference data through Gaussian process regression including uncertainty estimation and connects to system-focused atomistic modeling described in Brunken and Reiher (J. Chem. Theory Comput. 16, 2020, 1646). Although focused on biomolecules, ASAP can be extended to other nanoscopic objects, because most of its design elements rely on a general graph-based foundation guaranteeing transferability. The modular character of the underlying pipeline supports different degrees of automation, which allows for (i) efficient feedback loops for human-machine interaction with a low entrance barrier and for (ii) integration into autonomous procedures such as automated force field parametrizations. This facilitates fast switching of the pH-state through on-the-fly system-focused reparametrization during a molecular simulation at virtually no extra computational cost.
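Gaussian process regression returns both a prediction and an uncertainty estimate, which is the property the abstract highlights. The sketch below is a generic one-dimensional GP with a zero prior mean and a squared-exponential kernel, written in plain Python. It is not the ASAP implementation: the descriptor, kernel, length scale, noise level, and training data are all assumptions made for illustration.

```python
import math


def rbf(a, b, length=1.0):
    """Squared-exponential kernel with unit signal variance."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        partial = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - partial) / m[i][i]
    return x


def gp_predict(x_train, y_train, x_new, length=1.0, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at a single new point."""
    n = len(x_train)
    k = [[rbf(x_train[i], x_train[j], length) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [rbf(x_new, xi, length) for xi in x_train]
    alpha = solve(k, y_train)   # K^-1 y
    weights = solve(k, k_star)  # K^-1 k*
    mean = sum(ks * a for ks, a in zip(k_star, alpha))
    variance = rbf(x_new, x_new, length) - sum(ks * w for ks, w in zip(k_star, weights))
    return mean, max(variance, 0.0)


# Hypothetical 1D descriptor values and pKa shifts used as reference data.
x_ref = [0.0, 1.0, 2.0]
y_ref = [0.5, -0.2, 0.3]
mean_near, var_near = gp_predict(x_ref, y_ref, 1.0)  # at a reference point
mean_far, var_far = gp_predict(x_ref, y_ref, 50.0)   # far from all reference data
```

Close to the reference data the predicted variance collapses toward the noise level, whereas far from it the prediction falls back to the prior mean with the full prior variance, which is the signal that a predicted pKa should not be trusted.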
Affiliation(s)
- Katja-Sophia Csizi
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
| | - Markus Reiher
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
13
|
Cai Z, Peng H, Sun S, He J, Luo F, Huang Y. DeepKa Web Server: High-Throughput Protein pKa Prediction. J Chem Inf Model 2024; 64:2933-2940. [PMID: 38530291 DOI: 10.1021/acs.jcim.3c02013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/27/2024]
Abstract
DeepKa is a deep-learning-based protein pKa predictor proposed in our previous work. In this study, a web server was developed that enables online protein pKa prediction driven by DeepKa. The web server provides a user-friendly interface where a single step of entering a valid PDB code or uploading a PDB format file is required to submit a job. Two case studies have been attached in order to explain how pKa's calculated by the web server could be utilized by users. Finally, combining the web server with post processing as described in case studies, this work suggests a quick workflow of investigating the relationship between protein structure and function that are pH dependent. The web server of DeepKa is freely available at http://www.computbiophys.com/DeepKa/main.
Affiliation(s)
- Zhitao Cai
- College of Computer Engineering, Jimei University, Xiamen 361021, China
| | - Hao Peng
- National Pilot School of Software, Yunnan University, Kunming 650504, China
| | - Shuo Sun
- College of Computer Engineering, Jimei University, Xiamen 361021, China
| | - Jiahao He
- College of Computer Engineering, Jimei University, Xiamen 361021, China
| | - Fangfang Luo
- College of Computer Engineering, Jimei University, Xiamen 361021, China
| | - Yandong Huang
- College of Computer Engineering, Jimei University, Xiamen 361021, China
14
|
Hill JA, Nyathi Y, Horrell S, von Stetten D, Axford D, Owen RL, Beddard GS, Pearson AR, Ginn HM, Yorke BA. An ultraviolet-driven rescue pathway for oxidative stress to eye lens protein human gamma-D crystallin. Commun Chem 2024; 7:81. [PMID: 38600176 PMCID: PMC11006947 DOI: 10.1038/s42004-024-01163-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Accepted: 03/27/2024] [Indexed: 04/12/2024] Open
Abstract
Human gamma-D crystallin (HGD) is a major constituent of the eye lens. Aggregation of HGD contributes to cataract formation, the leading cause of blindness worldwide. It is unique in its longevity, maintaining its folded and soluble state for 50-60 years. One outstanding question is the structural basis of this longevity despite oxidative aging and environmental stressors including ultraviolet radiation (UV). Here we present crystallographic structures evidencing a UV-induced crystallin redox switch mechanism. The room-temperature serial synchrotron crystallographic (SSX) structure of freshly prepared crystallin mutant (R36S) shows no post-translational modifications. After aging for nine months in the absence of light, a thiol-adduct (dithiothreitol) modifying surface cysteines is observed by low-dose SSX. This is shown to be UV-labile in an acutely light-exposed structure. This suggests a mechanism by which a major source of crystallin damage, UV, may also act as a rescuing factor in a finely balanced redox system.
Affiliation(s)
- Jake A Hill
- School of Chemistry and Biosciences, University of Bradford, Richmond Road, Bradford, BD7 1DP, United Kingdom
- School of Chemistry, University of Leeds, Woodhouse Lane, Leeds, LS2 9JT, United Kingdom
| | - Yvonne Nyathi
- Faculty of Biological Sciences, University of Leeds, Woodhouse Lane, Leeds, LS2 9JT, United Kingdom
| | - Sam Horrell
- Diamond Light Source Ltd, Harwell Science and Innovation Campus, Didcot, OX11 0DE, United Kingdom
| | - David von Stetten
- European Molecular Biology Laboratory, Notkestraße 85, 22607, Hamburg, Germany
| | - Danny Axford
- Diamond Light Source Ltd, Harwell Science and Innovation Campus, Didcot, OX11 0DE, United Kingdom
| | - Robin L Owen
- Diamond Light Source Ltd, Harwell Science and Innovation Campus, Didcot, OX11 0DE, United Kingdom
| | - Godfrey S Beddard
- School of Chemistry, University of Leeds, Woodhouse Lane, Leeds, LS2 9JT, United Kingdom
- School of Chemistry, University of Edinburgh, David Brewster Road, Edinburgh, EH9 3FJ, United Kingdom
| | - Arwen R Pearson
- HARBOR, Institute for Nanostructure and Solid State Physics, Hamburg, 22761, Germany
| | - Helen M Ginn
- HARBOR, Institute for Nanostructure and Solid State Physics, Hamburg, 22761, Germany.
- Center for Free-Electron Laser Science, CFEL, Deutsches Elektronen-Synchrotron DESY, Notkestr. 85, 22607, Hamburg, Germany.
| | - Briony A Yorke
- School of Chemistry, University of Leeds, Woodhouse Lane, Leeds, LS2 9JT, United Kingdom.
15
|
Thiel A, Speranza MJ, Jadhav S, Stevens LL, Unruh DK, Ren P, Ponder JW, Shen J, Schnieders MJ. Constant-pH Simulations with the Polarizable Atomic Multipole AMOEBA Force Field. J Chem Theory Comput 2024; 20:2921-2933. [PMID: 38507252 PMCID: PMC11008096 DOI: 10.1021/acs.jctc.3c01180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Revised: 03/05/2024] [Accepted: 03/05/2024] [Indexed: 03/22/2024]
Abstract
Accurately predicting protein behavior across diverse pH environments remains a significant challenge in biomolecular simulations. Existing constant-pH molecular dynamics (CpHMD) algorithms are limited to fixed-charge force fields, hindering their application to biomolecular systems described by permanent atomic multipoles or induced dipoles. This work overcomes these limitations by introducing the first polarizable CpHMD algorithm in the context of the Atomic Multipole Optimized Energetics for Biomolecular Applications (AMOEBA) force field. Additionally, our implementation in the open-source Force Field X (FFX) software has the unique ability to handle titration state changes for crystalline systems including flexible support for all 230 space groups. The evaluation of CpHMD with the AMOEBA force field was performed on 11 crystalline peptide systems that span the titrating amino acids (Asp, Glu, His, Lys, and Cys). Titration states were correctly predicted for 15 out of the 16 amino acids present in the 11 systems, including for the coordination of Zn2+ by cysteines. The lone exception was for a HIS-ALA peptide where CpHMD predicted both neutral histidine tautomers to be equally populated, whereas the experimental model did not consider multiple conformers and diffraction data are unavailable for rerefinement. This work demonstrates the promise of polarizable CpHMD simulations for pKa predictions, the study of biochemical mechanisms such as the catalytic triad of proteases, and for improved protein-ligand binding affinity accuracy in the context of pharmaceutical lead optimization.
Affiliation(s)
- Andrew C. Thiel
- Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa 52242, United States
| | - Matthew J. Speranza
- Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa 52242, United States
| | - Sanika Jadhav
- Department of Pharmaceutical Sciences and Experimental Therapeutics, University of Iowa, Iowa City, Iowa 52242, United States
| | - Lewis L. Stevens
- Department of Pharmaceutical Sciences and Experimental Therapeutics, University of Iowa, Iowa City, Iowa 52242, United States
| | - Daniel K. Unruh
- Office of the Vice President for Research, University of Iowa, Iowa City, Iowa 52242, United States
| | - Pengyu Ren
- Department of Biomedical Engineering, University of Texas, Austin, Texas 78712, United States
| | - Jay W. Ponder
- Department of Chemistry, Washington University in St. Louis, St. Louis, Missouri 63130, United States
| | - Jana Shen
- Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, United States
| | - Michael J. Schnieders
- Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa 52242, United States
- Department of Biochemistry, University of Iowa, Iowa City, Iowa 52242, United States
16
|
Stathopulos PB, Ikura M. Aromatically stacking the odds in favour of increased ORAI1 activation. Cell Calcium 2024; 117:102841. [PMID: 38154331 DOI: 10.1016/j.ceca.2023.102841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 12/18/2023] [Accepted: 12/20/2023] [Indexed: 12/30/2023]
Affiliation(s)
- Peter B Stathopulos
- Department of Physiology and Pharmacology, Schulich School of Medicine and Dentistry, London, ON, N6A 5C1, Canada.
| | - Mitsuhiko Ikura
- Department of Medical Biophysics, University of Toronto, Princess Margaret Cancer Centre, University Health Network, Toronto, ON, M5G 2M9, Canada
17
|
Wilson C, Karttunen M, de Groot BL, Gapsys V. Accurately Predicting Protein pKa Values Using Nonequilibrium Alchemy. J Chem Theory Comput 2023; 19:7833-7845. [PMID: 37820376 PMCID: PMC10653114 DOI: 10.1021/acs.jctc.3c00721] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/29/2023] [Indexed: 10/13/2023]
Abstract
The stability, solubility, and function of a protein depend on both its net charge and the protonation states of its individual residues. pKa is a measure of the tendency for a given residue to (de)protonate at a specific pH. Although pKa values can be resolved experimentally, theory and computation provide a compelling alternative. To this end, we assess the applicability of a nonequilibrium (NEQ) alchemical free energy method to the problem of pKa prediction. On a data set of 144 residues that span 13 proteins, we report an average unsigned error of 0.77 ± 0.09, 0.69 ± 0.09, and 0.52 ± 0.04 pK for aspartate, glutamate, and lysine, respectively. This is comparable to current state-of-the-art predictors and the accuracy recently reached using free energy perturbation methods (e.g., FEP+). Moreover, we demonstrate that our open-source, pmx-based approach can accurately resolve the pKa values of coupled residues and observe a substantial performance disparity associated with the lysine partial charges in Amber14SB/Amber99SB*-ILDN, for which an underused fix already exists.
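To make the link between a residue's pKa and its protonation at a given pH concrete, the following minimal sketch applies the Henderson-Hasselbalch relation for a single, independently titrating site. The function name and the numerical values are illustrative assumptions and are not taken from the paper's data set.

```python
def protonated_fraction(pka: float, ph: float) -> float:
    """Fraction of an isolated titratable site that is protonated at a given pH.

    Uses the Henderson-Hasselbalch relation for a single site; coupling
    between neighbouring residues is deliberately ignored.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Illustrative pKa for a carboxylate side chain (assumed, not from the paper).
example_pka = 4.0
for ph in (2.0, 4.0, 6.0):
    frac = protonated_fraction(example_pka, ph)
    print(f"pH {ph:.1f}: protonated fraction = {frac:.3f}")
```

Because the fraction depends on the difference between pH and pKa through a power of ten, an error of a fraction of a pK unit in a prediction can shift the estimated population noticeably when the pH is close to the pKa.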
Affiliation(s)
- Carter J. Wilson
- Department of Mathematics, The University of Western Ontario, N6A 5B7 London, Canada
- Centre for Advanced Materials and Biomaterials Research (CAMBR), The University of Western Ontario, N6A 5B7 London, Canada
| | - Mikko Karttunen
- Centre for Advanced Materials and Biomaterials Research (CAMBR), The University of Western Ontario, N6A 5B7 London, Canada
- Department of Physics & Astronomy, The University of Western Ontario, N6A 5B7 London, Canada
- Department of Chemistry, The University of Western Ontario, N6A 5B7 London, Canada
| | - Bert L. de Groot
- Computational Biomolecular Dynamics Group, Department of Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, 37077 Göttingen, Germany
| | - Vytautas Gapsys
- Computational Biomolecular Dynamics Group, Department of Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, 37077 Göttingen, Germany
- Computational Chemistry, Janssen Research & Development, Janssen Pharmaceutica N. V., Turnhoutseweg 30, B-2340 Beerse, Belgium
18
|
Ancona N, Bastola A, Alexov E. PKAD-2: New entries and expansion of functionalities of the database of experimentally measured pKa's of proteins. Journal of Computational Biophysics and Chemistry 2023; 22:515-524. [PMID: 37520074 PMCID: PMC10373500 DOI: 10.1142/s2737416523500230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/01/2023]
Abstract
Almost all biological reactions are pH dependent, and understanding the origin of pH dependence requires knowledge of the pKa's of ionizable groups. Here we report a new edition of PKAD, PKAD-2, which is a database of experimentally measured pKa's of proteins, both wild type and mutant. The new additions include 117 wild type and 54 mutant pKa values, resulting in a total of 1742 experimentally measured pKa's. PKAD-2 includes 8 new wild type and 12 new mutant proteins, resulting in a total of 220 proteins. This new edition incorporates a visual 3D image of the highlighted residue of interest within the corresponding protein or protein complex. Hydrogen bonds were identified, counted, and implemented as a search feature. Other new search features include the number of neighboring residues <4 Å from the heaviest atom of the side chain of a given amino acid. Here, we present PKAD-2 with the intention to continuously incorporate novel features and current data, with the goal of it being used as a benchmark for computational methods.
Affiliation(s)
- Nicolas Ancona
- Department of Biological Sciences, College of Science, Clemson University, 105 Sikes Hall, Address, Clemson, SC 29634, United States of America
| | - Ananta Bastola
- School of Computing, College of Engineering, Computing and Applied Sciences, Clemson University, 105 Sikes Hall, SC 29634, United States of America
| | - Emil Alexov
- Department of Physics, College of Science, Clemson University, 105 Sikes Hall, Address, Clemson, SC 29634, United States of America
19
|
Aci-Sèche S, Bourg S, Bonnet P, Rebehmed J, de Brevern AG, Diharce J. A perspective on the sharing of docking data. Data Brief 2023; 49:109386. [PMID: 37492229 PMCID: PMC10365938 DOI: 10.1016/j.dib.2023.109386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Revised: 05/17/2023] [Accepted: 07/03/2023] [Indexed: 07/27/2023] Open
Abstract
Computational approaches are nowadays widely applied in drug discovery projects. Among these, molecular docking is the most used for hit identification against a drug target protein. However, many scientists in the field have pointed out that the data obtained from such studies are often neither available to the whole community nor reproducible. Consequently, sustaining and developing the efforts toward a large and fully transparent sharing of those data could be beneficial for all researchers in drug discovery. The purpose of this article is first to propose guidelines and recommendations on the appropriate way to conduct virtual screening experiments and second to depict the current state of sharing molecular docking data. In conclusion, we have explored and proposed several prospects to enhance data sharing from docking experiments that could be developed in the foreseeable future.
Affiliation(s)
- Samia Aci-Sèche
- Institut de Chimie Organique et Analytique (ICOA), UMR CNRS-Université d'Orléans 7311, Université d'Orléans BP 6759, Orléans Cedex 2, 45067, France
| | - Stéphane Bourg
- Institut de Chimie Organique et Analytique (ICOA), UMR CNRS-Université d'Orléans 7311, Université d'Orléans BP 6759, Orléans Cedex 2, 45067, France
| | - Pascal Bonnet
- Institut de Chimie Organique et Analytique (ICOA), UMR CNRS-Université d'Orléans 7311, Université d'Orléans BP 6759, Orléans Cedex 2, 45067, France
| | - Joseph Rebehmed
- Department of Computer Science and Mathematics, Lebanese American University, Beirut, Lebanon
| | - Alexandre G. de Brevern
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, Biologie Intégrée du Globule Rouge, UMR_S 1134, DSIMB Bioinformatics team, 75014 Paris, France
| | - Julien Diharce
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, Biologie Intégrée du Globule Rouge, UMR_S 1134, DSIMB Bioinformatics team, 75014 Paris, France
20
|
Cai Z, Liu T, Lin Q, He J, Lei X, Luo F, Huang Y. Basis for Accurate Protein pKa Prediction with Machine Learning. J Chem Inf Model 2023; 63:2936-2947. [PMID: 37146199 DOI: 10.1021/acs.jcim.3c00254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/07/2023]
Abstract
pH regulates protein structures and the associated functions in many biological processes via protonation and deprotonation of ionizable side chains where the titration equilibria are determined by pKa's. To accelerate pH-dependent molecular mechanism research in the life sciences or industrial protein and drug designs, fast and accurate pKa prediction is crucial. Here we present a theoretical pKa data set PHMD549, which was successfully applied to four distinct machine learning methods, including DeepKa, which was proposed in our previous work. To reach a valid comparison, EXP67S was selected as the test set. Encouragingly, DeepKa was improved significantly and outperforms other state-of-the-art methods, except for the constant-pH molecular dynamics, which was utilized to create PHMD549. More importantly, DeepKa reproduced experimental pKa orders of acidic dyads in five enzyme catalytic sites. Apart from structural proteins, DeepKa was found applicable to intrinsically disordered peptides. Further, in combination with solvent exposures, it is revealed that DeepKa offers the most accurate prediction under the challenging circumstance that hydrogen bonding or salt bridge interaction is partly compensated by desolvation for a buried side chain. Finally, our benchmark data qualify PHMD549 and EXP67S as the basis for future developments of protein pKa prediction tools driven by artificial intelligence. In addition, DeepKa built on PHMD549 has been proven an efficient protein pKa predictor and thus can be applied immediately to, for example, pKa database construction, protein design, drug discovery, and so on.
Affiliation(s)
- Zhitao Cai
- College of Computer Engineering, Jimei University, Xiamen 361021, China
| | - Tengzi Liu
- College of Computer Engineering, Jimei University, Xiamen 361021, China
| | - Qiaoling Lin
- College of Computer Engineering, Jimei University, Xiamen 361021, China
| | - Jiahao He
- College of Computer Engineering, Jimei University, Xiamen 361021, China
| | - Xiaowei Lei
- College of Computer Engineering, Jimei University, Xiamen 361021, China
| | - Fangfang Luo
- College of Computer Engineering, Jimei University, Xiamen 361021, China
| | - Yandong Huang
- College of Computer Engineering, Jimei University, Xiamen 361021, China
21
|
Awoonor-Williams E, Golosov AA, Hornak V. Benchmarking In Silico Tools for Cysteine pKa Prediction. J Chem Inf Model 2023; 63:2170-2180. [PMID: 36996330 DOI: 10.1021/acs.jcim.3c00004] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 04/01/2023]
Abstract
Accurate estimation of the pKa's of cysteine residues in proteins could inform targeted approaches in hit discovery. The pKa of a targetable cysteine residue in a disease-related protein is an important physiochemical parameter in covalent drug discovery, as it influences the fraction of nucleophilic thiolate amenable to chemical protein modification. Traditional structure-based in silico tools are limited in their predictive accuracy of cysteine pKa's relative to other titratable residues. Additionally, there are limited comprehensive benchmark assessments for cysteine pKa predictive tools. This raises the need for extensive assessment and evaluation of methods for cysteine pKa prediction. Here, we report the performance of several computational pKa methods, including single-structure and ensemble-based approaches, on a diverse test set of experimental cysteine pKa's retrieved from the PKAD database. The dataset consisted of 16 wildtype and 10 mutant proteins with experimentally measured cysteine pKa values. Our results highlight that these methods are varied in their overall predictive accuracies. Among the test set of wildtype proteins evaluated, the best method (MOE) yielded a mean absolute error of 2.3 pK units, highlighting the need for improvement of existing pKa methods for accurate cysteine pKa estimation. Given the limited accuracy of these methods, further development is needed before these approaches can be routinely employed to drive design decisions in early drug discovery efforts.
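Benchmark figures such as the mean absolute error quoted above are simple to reproduce once predicted and experimental values are paired. The sketch below shows one way to compute the mean absolute error and the root-mean-square error; the function names and the paired values are hypothetical and are not data from the study.

```python
import math


def mean_absolute_error(predicted, experimental):
    """Average of |predicted - experimental| over paired pKa values."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


def root_mean_square_error(predicted, experimental):
    """Square root of the mean squared deviation; penalises outliers more."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - e) ** 2 for p, e in zip(predicted, experimental))
    return math.sqrt(squared / len(predicted))


# Made-up cysteine pKa values in pK units, for illustration only.
predicted_pka = [8.0, 9.0, 6.5, 10.2]
experimental_pka = [8.5, 7.0, 6.0, 9.2]

print(f"MAE  = {mean_absolute_error(predicted_pka, experimental_pka):.2f} pK units")
print(f"RMSE = {root_mean_square_error(predicted_pka, experimental_pka):.2f} pK units")
```

Reporting both numbers is often informative: a root-mean-square error much larger than the mean absolute error suggests that a few residues are predicted very poorly while most are reasonable.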
Affiliation(s)
- Ernest Awoonor-Williams
- Novartis Institutes for BioMedical Research, 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Andrei A Golosov
- Novartis Institutes for BioMedical Research, 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Viktor Hornak
- Novartis Institutes for BioMedical Research, 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
22
|
Paccetti-Alves I, Batista MSP, Pimpão C, Victor BL, Soveral G. Unraveling the Aquaporin-3 Inhibitory Effect of Rottlerin by Experimental and Computational Approaches. Int J Mol Sci 2023; 24:6004. [PMID: 36983077 PMCID: PMC10057066 DOI: 10.3390/ijms24066004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/22/2023] [Revised: 03/17/2023] [Accepted: 03/21/2023] [Indexed: 03/30/2023] Open
Abstract
The natural polyphenolic compound Rottlerin (RoT) showed anticancer properties in a variety of human cancers through the inhibition of several target molecules implicated in tumorigenesis, revealing its potential as an anticancer agent. Aquaporins (AQPs) are found overexpressed in different types of cancers and have recently emerged as promising pharmacological targets. Increasing evidence suggests that the water/glycerol channel aquaporin-3 (AQP3) plays a key role in cancer and metastasis. Here, we report the ability of RoT to inhibit human AQP3 activity with an IC50 in the micromolar range (22.8 ± 5.82 µM for water and 6.7 ± 2.97 µM for glycerol permeability inhibition). Moreover, we have used molecular docking and molecular dynamics simulations to understand the structural determinants of RoT that explain its ability to inhibit AQP3. Our results show that RoT blocks AQP3-glycerol permeation by establishing strong and stable interactions at the extracellular region of AQP3 pores interacting with residues essential for glycerol permeation. Altogether, our multidisciplinary approach unveiled RoT as an anticancer drug against tumors where AQP3 is highly expressed providing new information to aquaporin research that may boost future drug design.
Affiliation(s)
- Inês Paccetti-Alves
- Research Institute for Medicines (iMed.ULisboa), Faculty of Pharmacy, Universidade de Lisboa, 1649-003 Lisbon, Portugal
- Department of Pharmaceutical Sciences and Medicines, Faculty of Pharmacy, Universidade de Lisboa, 1649-003 Lisbon, Portugal
| | - Marta S P Batista
- Biosystems and Integrative Sciences Institute, Faculty of Sciences, Universidade de Lisboa, 1649-003 Lisbon, Portugal
| | - Catarina Pimpão
- Research Institute for Medicines (iMed.ULisboa), Faculty of Pharmacy, Universidade de Lisboa, 1649-003 Lisbon, Portugal
- Department of Pharmaceutical Sciences and Medicines, Faculty of Pharmacy, Universidade de Lisboa, 1649-003 Lisbon, Portugal
| | - Bruno L Victor
- Biosystems and Integrative Sciences Institute, Faculty of Sciences, Universidade de Lisboa, 1649-003 Lisbon, Portugal
| | - Graça Soveral
- Research Institute for Medicines (iMed.ULisboa), Faculty of Pharmacy, Universidade de Lisboa, 1649-003 Lisbon, Portugal
- Department of Pharmaceutical Sciences and Medicines, Faculty of Pharmacy, Universidade de Lisboa, 1649-003 Lisbon, Portugal
23
|
Zacarias S, Batista MSP, Ramalho SS, Victor BL, Farinha CM. Rescue of Rare CFTR Trafficking Mutants Highlights a Structural Location-Dependent Pattern for Correction. Int J Mol Sci 2023; 24:3211. [PMID: 36834620 PMCID: PMC9961391 DOI: 10.3390/ijms24043211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Revised: 01/30/2023] [Accepted: 01/31/2023] [Indexed: 02/08/2023] Open
Abstract
Cystic Fibrosis (CF) is a genetic disease caused by mutations in the gene encoding the Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) channel. Currently, more than 2100 variants have been identified in the gene, with a large number being very rare. The approval of modulators that act on mutant CFTR protein, correcting its molecular defect and thus alleviating the burden of the disease, revolutionized the field of CF. However, these drugs do not apply to all patients with CF, especially those with rare mutations, for which there is a lack of knowledge on the molecular mechanisms of the disease and the response to modulators. In this work, we evaluated the impact of several rare putative class II mutations on the expression, processing, and response of CFTR to modulators. Novel cell models consisting of bronchial epithelial cell lines expressing CFTR with 14 rare variants were created. The variants studied are localized at Transmembrane Domain 1 (TMD1) or very close to the signature motif of Nucleotide Binding Domain 1 (NBD1). Our data show that all mutations analyzed significantly decrease CFTR processing and while TMD1 mutations respond to modulators, those localized in NBD1 do not. Molecular modeling calculations confirm that the mutations in NBD1 induce greater destabilization of CFTR structure than those in TMD1. Furthermore, the structural proximity of TMD1 mutants to the reported binding site of CFTR modulators such as VX-809 and VX-661 makes them more efficient in stabilizing the CFTR mutants analyzed. Overall, our data suggest a pattern for mutation location and impact in response to modulators that correlates with the global effect of the mutations on CFTR structure.
24
|
Sequeira JN, Rodrigues FEP, Silva TGD, Reis PBPS, Machuqueiro M. Extending the Stochastic Titration CpHMD to CHARMM36m. J Phys Chem B 2022; 126:7870-7882. [PMID: 36190807 PMCID: PMC9776569 DOI: 10.1021/acs.jpcb.2c04529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/25/2022]
Abstract
The impact of pH on proteins is significant but often neglected in molecular dynamics simulations. Constant-pH Molecular Dynamics (CpHMD) is the state-of-the-art methodology to deal with these effects. However, it still lacks widespread adoption by the scientific community. The stochastic titration CpHMD is one such method that, until now, supported only the GROMOS force field family. Here, we extend this method's implementation to include the CHARMM36m force field available in the GROMACS software package. We test this new implementation with a diverse group of proteins, namely, lysozyme, Staphylococcal nuclease, and human and E. coli thioredoxins. All proteins were conformationally stable in the simulations, even at extreme pH values. The RMSE values (pKa prediction vs experimental) obtained were very encouraging, in particular for lysozyme and human thioredoxin. We have also identified a few residues that challenged the CpHMD simulations, highlighting scenarios where the method still needs improvement independently of the force field. The CHARMM36m all-atom implementation was more computationally efficient when compared with the GROMOS 54A7, taking advantage of a shorter nonbonded interaction cutoff and a less frequent neighboring list update. The new extension will allow the study of pH effects in many systems for which this force field is particularly suited, i.e., proteins, membrane proteins, lipid bilayers, and nucleic acids.
25
|
Lin YC, Ren P, Webb LJ. AMOEBA Force Field Trajectories Improve Predictions of Accurate pKa Values of the GFP Fluorophore: The Importance of Polarizability and Water Interactions. J Phys Chem B 2022; 126:7806-7817. [PMID: 36194474 PMCID: PMC10851343 DOI: 10.1021/acs.jpcb.2c03642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Precisely quantifying the magnitude, direction, and biological functions of electric fields in proteins has long been an outstanding challenge in the field. The most widely implemented experimental method to measure such electric fields at a particular residue in a protein has been through changes in pKa of titratable residues. While many computational strategies exist to predict these values, it has been difficult to do this accurately or connect predicted results to key structural or mechanistic features of the molecule. Here, we used experimentally determined pKa values of the fluorophore in superfolder green fluorescent protein (GFP) with amino acid mutations made at position Thr 203 to evaluate the pKa prediction ability of molecular dynamics (MD) simulations using a polarizable force field, AMOEBA. Structure ensembles from AMOEBA were used to calculate pKa values of the GFP fluorophore. The calculated pKa values were then compared to trajectories using a conventional fixed charge force field (Amber03 ff). We found that the position of water molecules included in the pKa calculation had opposite effects on the pKa values between the trajectories from AMOEBA and Amber03 force fields. In AMOEBA trajectories, the inclusion of water molecules within 35 Å of the fluorophore decreased the difference between the predicted and experimental values, resulting in calculated pKa values that were within an average of 0.8 pKa unit from the experimental results. On the other hand, in Amber03 trajectories, including water molecules that were more than 5 Å from the fluorophore increased the differences between the calculated and experimental pKa values. The inaccuracy of pKa predictions determined from Amber03 trajectories was caused by a significant stabilization of the deprotonated chromophore's free energy compared to the result in AMOEBA. We rationalize the cutoffs for explicit water molecules when calculating pKa to better predict the electrostatic environment surrounding the fluorophore buried in GFP. We discuss how the results from this work will assist the prospective prediction of pKa values or other electrostatic effects in a wide variety of folded proteins.
Affiliation(s)
- Yu-Chun Lin
- Department of Chemistry, Texas Materials Institute, and Interdisciplinary Life Sciences Program, The University of Texas at Austin, 105 E 24th St. STOP A5300, Austin, TX 78712-1224
| | - Pengyu Ren
- Department of Chemistry, Texas Materials Institute, and Interdisciplinary Life Sciences Program, The University of Texas at Austin, 105 E 24th St. STOP A5300, Austin, TX 78712-1224
| | - Lauren J. Webb
- Department of Chemistry, Texas Materials Institute, and Interdisciplinary Life Sciences Program, The University of Texas at Austin, 105 E 24th St. STOP A5300, Austin, TX 78712-1224
| |
|
26
|
Reis PBPS, Bertolini M, Montanari F, Rocchia W, Machuqueiro M, Clevert DA. A Fast and Interpretable Deep Learning Approach for Accurate Electrostatics-Driven pKa Predictions in Proteins. J Chem Theory Comput 2022; 18:5068-5078. [PMID: 35837736 DOI: 10.1021/acs.jctc.2c00308] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
Abstract
Existing computational methods for estimating pKa values in proteins rely on theoretical approximations and lengthy computations. In this work, we use a data set of 6 million theoretically determined pKa shifts to train deep learning models, which are shown to rival the physics-based predictors. These neural networks managed to infer the electrostatic contributions of different chemical groups and learned the importance of solvent exposure and close interactions, including hydrogen bonds. Although trained only using theoretical data, our pKAI+ model displayed the best accuracy in a test set of ∼750 experimental values. Inference times allow speedups of more than 1000× compared to physics-based methods. By combining speed, accuracy, and a reasonable understanding of the underlying physics, our models provide a game-changing solution for fast estimations of macroscopic pKa values from ensembles of microscopic values as well as for many downstream applications such as molecular docking and constant-pH molecular dynamics simulations.
Affiliation(s)
| | - Marco Bertolini
- Machine Learning Research, Bayer A.G., Berlin 13353, Germany
| | | | - Walter Rocchia
- CONCEPT Lab, Istituto Italiano di Tecnologia (IIT), Via Melen 83, B Block, Genoa 16152, Italy
| | - Miguel Machuqueiro
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, Campo Grande, Lisboa 1749-016, Portugal
| | | |
|
27
|
Reis PBPS, Barletta GP, Gagliardi L, Fortuna S, Soler MA, Rocchia W. Antibody-Antigen Binding Interface Analysis in the Big Data Era. Front Mol Biosci 2022; 9:945808. [PMID: 35911958 PMCID: PMC9329859 DOI: 10.3389/fmolb.2022.945808] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 05/17/2022] [Accepted: 06/24/2022] [Indexed: 11/22/2022] Open
Abstract
Antibodies have become the Swiss Army tool for molecular biology and nanotechnology. Their outstanding ability to specifically recognise molecular antigens allows their use in many different applications, from medicine to industry. Moreover, the improvement of conventional structural biology techniques (e.g., X-ray, NMR), as well as the emergence of new ones (e.g., Cryo-EM), has in recent years permitted a notable increase in resolved antibody-antigen structures. This offers a unique opportunity to perform an exhaustive structural analysis of antibody-antigen interfaces by employing the large amount of data now available. To leverage this factor, different geometric as well as chemical descriptors were evaluated to perform a comprehensive characterization.
Affiliation(s)
- Pedro B. P. S. Reis
- CONCEPT Lab, Istituto Italiano di Tecnologia, Genova, Italy
- Bioisi, University of Lisbon, Lisbon, Portugal
| | - German P. Barletta
- CONCEPT Lab, Istituto Italiano di Tecnologia, Genova, Italy
- Universidad Nacional de Quilmes/CONICET, Quilmes, Argentina
- The Abdus Salam International Centre for Theoretical Physics (ICTP), Trieste, Italy
| | - Luca Gagliardi
- CONCEPT Lab, Istituto Italiano di Tecnologia, Genova, Italy
| | - Sara Fortuna
- CONCEPT Lab, Istituto Italiano di Tecnologia, Genova, Italy
| | - Miguel A. Soler
- CONCEPT Lab, Istituto Italiano di Tecnologia, Genova, Italy
- Dipartimento di Scienze Matematiche, Informatiche e Fisiche, Università di Udine, Udine, Italy
- *Correspondence: Miguel A. Soler; Walter Rocchia
| | - Walter Rocchia
- CONCEPT Lab, Istituto Italiano di Tecnologia, Genova, Italy
- *Correspondence: Miguel A. Soler; Walter Rocchia
| |
|
28
|
Chen AY, Lee J, Damjanovic A, Brooks BR. Protein pKa Prediction by Tree-Based Machine Learning. J Chem Theory Comput 2022; 18:2673-2686. [PMID: 35289611 PMCID: PMC10510853 DOI: 10.1021/acs.jctc.1c01257] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 12/22/2022]
Abstract
Protonation states of ionizable protein residues modulate many essential biological processes. For correct modeling and understanding of these processes, it is crucial to accurately determine their pKa values. Here, we present four tree-based machine learning models for protein pKa prediction. The four models, Random Forest, Extra Trees, eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM), were trained on three experimental PDB and pKa datasets, two of which included a notable portion of internal residues. We observed similar performance among the four machine learning algorithms. The best model trained on the largest dataset performs 37% better than the widely used empirical pKa prediction tool PROPKA and 15% better than the published result from the pKa prediction method DelPhiPKa. The overall root-mean-square error (RMSE) for this model is 0.69, with surface and buried RMSE values being 0.56 and 0.78, respectively, considering six residue types (Asp, Glu, His, Lys, Cys, and Tyr), and 0.63 when considering Asp, Glu, His, and Lys only. We provide pKa predictions for proteins in the human proteome from the AlphaFold Protein Structure Database and observed that 1% of Asp/Glu/Lys residues have highly shifted pKa values close to the physiological pH.
Affiliation(s)
- Ada Y. Chen
- Department of Physics & Astronomy, Johns Hopkins University, Baltimore, Maryland, 21218
- Laboratory of Computational Biology, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland, 20892
| | - Juyong Lee
- Department of Chemistry, Division of Chemistry and Biochemistry, Kangwon National University, 1 Gangwondaehak-gil, Chuncheon, 24341, Republic of Korea
| | - Ana Damjanovic
- Department of Biophysics, Johns Hopkins University, Baltimore, Maryland, 21218
| | - Bernard R. Brooks
- Laboratory of Computational Biology, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland, 20892
| |
|
29
|
Khaniya U, Mao J, Wei RJ, Gunner MR. Characterizing Protein Protonation Microstates Using Monte Carlo Sampling. J Phys Chem B 2022; 126:2476-2485. [PMID: 35344367 PMCID: PMC8997239 DOI: 10.1021/acs.jpcb.2c00139] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 01/05/2023]
Abstract
Proteins are polyelectrolytes with acidic and basic amino acids Asp, Glu, Arg, Lys, and His, making up ≈25% of the residues. The protonation state of residues, cofactors, and ligands defines a "protonation microstate". In an ensemble of proteins some residues will be ionized and others neutral, leading to a mixture of protonation microstates rather than in a single one as is often assumed. The microstate distribution changes with pH. The protein environment also modifies residue proton affinity so microstate distributions change in different reaction intermediates or as ligands are bound. Particular protonation microstates may be required for function, while others exist simply because there are many states with similar energy. Here, the protonation microstates generated in Monte Carlo sampling in MCCE are characterized in HEW lysozyme as a function of pH and bacterial photosynthetic reaction centers (RCs) in different reaction intermediates. The lowest energy and highest probability microstates are compared. The ΔG, ΔH, and ΔS between the four protonation states of Glu35 and Asp52 in lysozyme are shown to be calculated with reasonable precision. At pH 7 the lysozyme charge ranges from 6 to 10, with 24 accepted protonation microstates, while RCs have ≈50,000. A weighted Pearson correlation analysis shows coupling between residue protonation states in RCs and how they change when the quinone in the QB site is reduced. Protonation microstates can be used to define input MD parameters and provide insight into the motion of protons coupled to reactions.
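The idea of a protonation microstate distribution that shifts with pH can be illustrated with a small sketch. It is not the MCCE Monte Carlo procedure described in the abstract: it enumerates every microstate of a few acidic sites exhaustively and weights each by a Boltzmann factor. The function names, the intrinsic pKa values, and the single pairwise `coupling` term (in pK units, applied when two sites are both deprotonated and therefore both charged) are assumptions chosen for illustration only.

```python
from itertools import product


def microstate_probabilities(pkas, ph, coupling=0.0):
    """Boltzmann probabilities of all protonation microstates of acidic sites.

    pkas     -- hypothetical intrinsic pKa of each site
    ph       -- solution pH
    coupling -- assumed repulsion (in pK units) between any two sites that
                are both deprotonated, i.e. both negatively charged
    A microstate is a tuple of 0/1 flags, 1 meaning "protonated".
    """
    states = list(product((0, 1), repeat=len(pkas)))
    weights = []
    for state in states:
        # Free energy in units of RT*ln(10), relative to all sites deprotonated
        # and non-interacting: protonation costs (pH - pKa) per site.
        g = sum(n * (ph - pka) for n, pka in zip(state, pkas))
        charged = [1 - n for n in state]
        charged_pairs = sum(
            charged[i] * charged[j]
            for i in range(len(state))
            for j in range(i + 1, len(state))
        )
        g += coupling * charged_pairs
        weights.append(10.0 ** (-g))
    total = sum(weights)
    return {s: w / total for s, w in zip(states, weights)}


def protonated_fraction(probs, site):
    """Marginal probability that a given site is protonated."""
    return sum(p for state, p in probs.items() if state[site] == 1)


if __name__ == "__main__":
    # Two illustrative sites; the numbers are placeholders, not measured values.
    for ph in (3.0, 5.0, 7.0):
        probs = microstate_probabilities([6.0, 4.0], ph, coupling=1.0)
        print(ph, {s: round(p, 3) for s, p in probs.items()})
```

Exhaustive enumeration scales as 2^N in the number of sites, which is why sampling is needed for systems with many titratable groups; the sketch is only practical for a handful of sites.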
Affiliation(s)
- Umesh Khaniya
- Department of Physics, City College of New York, New York, New York 10031, United States.,Department of Physics, The Graduate Center, City University of New York, New York, New York 10016, United States
| | - Junjun Mao
- Department of Physics, City College of New York, New York, New York 10031, United States
| | - Rongmei Judy Wei
- Department of Physics, City College of New York, New York, New York 10031, United States.,Department of Chemistry, The Graduate Center, City University of New York, New York, New York 10016, United States
| | - M R Gunner
- Department of Physics, City College of New York, New York, New York 10031, United States.,Department of Physics, The Graduate Center, City University of New York, New York, New York 10016, United States.,Department of Chemistry, The Graduate Center, City University of New York, New York, New York 10016, United States
| |
|
30
|
Kozlowski LP. Proteome-pI 2.0: proteome isoelectric point database update. Nucleic Acids Res 2022; 50:D1535-D1540. [PMID: 34718696 PMCID: PMC8728302 DOI: 10.1093/nar/gkab944] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 09/19/2021] [Revised: 09/28/2021] [Accepted: 10/04/2021] [Indexed: 11/18/2022] Open
Abstract
Proteome-pI 2.0 is an update of an online database containing predicted isoelectric points and pKa dissociation constants of proteins and peptides. The isoelectric point (the pH at which a particular molecule carries no net electrical charge) is an important parameter for many analytical biochemistry and proteomics techniques. Additionally, it can be obtained directly from the pKa values of individual charged residues of the protein. The Proteome-pI 2.0 database includes data for over 61 million protein sequences from 20 115 proteomes (three to four times more than the previous release). The isoelectric point for proteins is predicted by 21 methods, whereas pKa values are inferred by one method. To facilitate bottom-up proteomics analysis, individual proteomes were digested in silico with the five most commonly used proteases (trypsin, chymotrypsin, trypsin + LysC, LysN, ArgC), and the peptides' isoelectric points and molecular weights were calculated. The database enables the retrieval of virtual 2D-PAGE plots and customized fractions of a proteome based on the isoelectric point and molecular weight. In addition, isoelectric points for proteins in NCBI non-redundant (nr), UniProt, SwissProt, and Protein Data Bank are available in both CSV and FASTA formats. The database can be accessed at http://isoelectricpointdb2.org.
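As a rough illustration of how an isoelectric point follows from the pKa values of individual charged groups, the sketch below sums Henderson-Hasselbalch charges and locates the pH of zero net charge by bisection. It does not reproduce any of the 21 prediction methods mentioned in the abstract; the function names and the example pKa values are assumptions for illustration, and each group is treated as titrating independently.

```python
from typing import Iterable, List, Tuple

# A titratable group is (pKa, sign): sign=+1 for a group that carries +1 when
# protonated (e.g. Lys, Arg, His, N-terminus), sign=-1 for a group that
# carries -1 when deprotonated (e.g. Asp, Glu, C-terminus).
Group = Tuple[float, int]


def net_charge(ph: float, groups: Iterable[Group]) -> float:
    """Net charge at a given pH from independent Henderson-Hasselbalch terms."""
    total = 0.0
    for pka, sign in groups:
        if sign > 0:
            total += 1.0 / (1.0 + 10.0 ** (ph - pka))  # protonated fraction
        else:
            total -= 1.0 / (1.0 + 10.0 ** (pka - ph))  # deprotonated fraction
    return total


def isoelectric_point(groups: Iterable[Group], lo: float = 0.0,
                      hi: float = 14.0, tol: float = 1e-4) -> float:
    """Find the pH where the net charge crosses zero.

    The net charge decreases monotonically with pH, so bisection on the
    interval [lo, hi] is sufficient.
    """
    group_list: List[Group] = list(groups)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if net_charge(mid, group_list) > 0.0:
            lo = mid  # still positively charged: pI lies at higher pH
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Illustrative two-group molecule (one acid, one base); values are placeholders.
    example = [(2.34, -1), (9.60, +1)]
    print(round(isoelectric_point(example), 2))
```

For a molecule with exactly one acidic and one basic group, the result reduces to the mean of the two pKa values, which gives a quick sanity check on the implementation.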
Affiliation(s)
- Lukasz Pawel Kozlowski
- Institute of Informatics, Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Warsaw, Mazovian Voivodeship 02-097, Poland
| |
|
31
|
Network Biology and Artificial Intelligence Drive the Understanding of the Multidrug Resistance Phenotype in Cancer. Drug Resist Updat 2022; 60:100811. [DOI: 10.1016/j.drup.2022.100811] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/29/2021] [Revised: 01/22/2022] [Accepted: 01/24/2022] [Indexed: 02/07/2023]
|
32
|
Henderson JA, Shen J. Exploring the pH- and Ligand-Dependent Flap Dynamics of Malarial Plasmepsin II. J Chem Inf Model 2021; 62:150-158. [PMID: 34964641 DOI: 10.1021/acs.jcim.1c01180] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022]
Abstract
Malaria remains a global health threat: over 400,000 deaths occurred in 2019. Plasmepsins are promising targets of antimalarial therapeutics; however, no inhibitors have reached the clinic. To fuel the progress, a detailed understanding of the pH- and ligand-dependent conformational dynamics of plasmepsins is needed. Here we present the continuous constant pH molecular dynamics study of the prototypical plasmepsin II and its complexed form with a substrate analogue. The simulations revealed that the catalytic dyads D34 and D214 are highly coupled in the apo protein and that the pepstatin binding enhances the difference in proton affinity, making D34 the general base and D214 the general acid. The simulations showed that the flap adopts an open state regardless of pH; however, upon pepstatin binding the flap can close or open depending on the protonation state of D214. These and other data are discussed and compared with the off-targets human cathepsin D and renin. This study lays the groundwork for a systematic investigation of pH- and ligand-modulated dynamics of the entire family of plasmepsins to help design more potent and selective inhibitors.
Affiliation(s)
- Jack A Henderson
- Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, United States
| | - Jana Shen
- Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, United States
| |
|
33
|
Reis PBPS, Clevert DA, Machuqueiro M. pKPDB: a protein data bank extension database of pKa and pI theoretical values. Bioinformatics 2021; 38:297-298. [PMID: 34260689 DOI: 10.1093/bioinformatics/btab518] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/30/2021] [Revised: 07/02/2021] [Accepted: 07/08/2021] [Indexed: 02/03/2023] Open
Abstract
SUMMARY: pKa values of ionizable residues and isoelectric points of proteins provide valuable local and global insights about their structure and function. These properties can be estimated with reasonably good accuracy using Poisson-Boltzmann and Monte Carlo calculations at a considerable computational cost (from some minutes to several hours). pKPDB is a database of over 12 M theoretical pKa values calculated over 120k protein structures deposited in the Protein Data Bank. By providing precomputed pKa and pI values, users can retrieve results instantaneously for their protein(s) of interest while also saving countless hours and resources that would be spent on repeated calculations. Furthermore, there is an ever-growing imbalance between experimental pKa and pI values and the number of resolved structures. This database will complement the experimental and computational data already available and can also provide crucial information regarding buried residues that are under-represented in experimental measurements. AVAILABILITY AND IMPLEMENTATION: Gzipped csv files containing pKa and isoelectric point values can be downloaded from https://pypka.org/pKPDB. To query a single PDB code please use the PypKa free server at https://pypka.org. The pKPDB source code can be found at https://github.com/mms-fcul/pKPDB. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Pedro B P S Reis
- Department of Chemistry and Biochemistry, Faculty of Sciences, University of Lisbon, 1749-016 Lisboa, Portugal.,Bayer AG, Research & Development, Pharmaceuticals. Machine Learning Research, 13353 Berlin, Germany
| | - Djork-Arné Clevert
- Bayer AG, Research & Development, Pharmaceuticals. Machine Learning Research, 13353 Berlin, Germany
| | - Miguel Machuqueiro
- Department of Chemistry and Biochemistry, Faculty of Sciences, University of Lisbon, 1749-016 Lisboa, Portugal
| |
|
34
|
Oliveira NFB, Rodrigues FEP, Vitorino JNM, Loureiro RJS, Faísca PFN, Machuqueiro M. Predicting stable binding modes from simulated dimers of the D76N mutant of β2-microglobulin. Comput Struct Biotechnol J 2021; 19:5160-5169. [PMID: 34630936 PMCID: PMC8473664 DOI: 10.1016/j.csbj.2021.09.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2021] [Revised: 09/02/2021] [Accepted: 09/02/2021] [Indexed: 11/16/2022] Open
Abstract
β2m D76N mutant populates an aggregation-prone monomer (I2) with unstructured termini. MD and MM-PBSA indicate that I2 dimers are stabilized by hydrophobic interactions. The termini regions and BC- and DE-loops are prevalent in the most stable interfaces. The most stable dimer has a limited growth potential without structural rearrangement.
The D76N mutant of the β2m protein is a biologically motivated model system to study protein aggregation. There is strong experimental evidence, supported by molecular simulations, that D76N populates a highly dynamic conformation (which we originally named I2) that exposes aggregation-prone patches as a result of the detachment of the two terminal regions. Here, we use Molecular Dynamics simulations to study the stability of an ensemble of dimers of I2 generated via protein–protein docking. MM-PBSA calculations indicate that within the ensemble of investigated dimers the major contribution to interface stabilization at physiological pH comes from hydrophobic interactions between apolar residues. Our structural analysis also reveals that the interfacial regions associated with the most stable binding modes are particularly rich in residues pertaining to both the N- and C-terminus, as well as residues from the BC- and DE-loops. On the other hand, the less stable interfaces are stabilized by intermolecular interactions involving residues from the CD- and EF-loops. By focusing on the most stable binding modes, we used a simple geometric rule to propagate the corresponding dimer interfaces. We found that, in the absence of any kind of structural rearrangement occurring at an early stage of the oligomerization pathway, some interfaces drive a self-limited growth process, while others can be propagated indefinitely, allowing the formation of long, polymerized chains. In particular, the interfacial region of the most stable binding mode reported here falls in the class of self-limited growth.
Affiliation(s)
- Nuno F B Oliveira
- BioISI - Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, Campo Grande, C8 bdg, Lisboa 1749-016, Portugal.,Department of Chemistry and Biochemistry, Faculty of Sciences, University of Lisbon, Lisboa 1749-016, Portugal
| | - Filipe E P Rodrigues
- BioISI - Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, Campo Grande, C8 bdg, Lisboa 1749-016, Portugal.,Department of Chemistry and Biochemistry, Faculty of Sciences, University of Lisbon, Lisboa 1749-016, Portugal
| | - João N M Vitorino
- BioISI - Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, Campo Grande, C8 bdg, Lisboa 1749-016, Portugal.,Department of Chemistry and Biochemistry, Faculty of Sciences, University of Lisbon, Lisboa 1749-016, Portugal
| | - Rui J S Loureiro
- BioISI - Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, Campo Grande, C8 bdg, Lisboa 1749-016, Portugal
| | - Patrícia F N Faísca
- BioISI - Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, Campo Grande, C8 bdg, Lisboa 1749-016, Portugal.,Department of Physics, Faculty of Sciences, University of Lisbon, Lisbon 1749-016, Portugal
| | - Miguel Machuqueiro
- BioISI - Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, Campo Grande, C8 bdg, Lisboa 1749-016, Portugal.,Department of Chemistry and Biochemistry, Faculty of Sciences, University of Lisbon, Lisboa 1749-016, Portugal
| |
|
35
|
Privat C, Madurga S, Mas F, Rubio-Martinez J. Unravelling Constant pH Molecular Dynamics in Oligopeptides with Explicit Solvation Model. Polymers (Basel) 2021; 13:3311. [PMID: 34641127 PMCID: PMC8512540 DOI: 10.3390/polym13193311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2021] [Revised: 09/22/2021] [Accepted: 09/24/2021] [Indexed: 11/16/2022] Open
Abstract
An accurate description of the protonation state of amino acids is essential to correctly simulate the conformational space and the mechanisms of action of proteins or other biochemical systems. The pH and the electrochemical environments are decisive factors in defining the effective pKa of amino acids and, therefore, the protonation state. However, they are poorly considered in Molecular Dynamics (MD) simulations. To deal with this problem, constant pH Molecular Dynamics (cpHMD) methods have been developed in recent decades, demonstrating a great ability to consider the effective pKa of amino acids within complex structures. Nonetheless, there are very few studies that assess the effect of these approaches on conformational sampling. In a previous work of our research group, we detected strengths and weaknesses of the discrete cpHMD method implemented in AMBER when simulating capped tripeptides in implicit solvent. Here, we extend this assessment by including explicit solvation for these peptides. To analyze the scope of the reported limitations in more depth, we also carried out simulations of oligopeptides with distinct positions of the titratable amino acids. Our study showed that the explicit solvation model does not improve the previously noted weaknesses and, furthermore, that the separation of the titratable amino acids in oligopeptides can minimize them, thus providing guidelines to improve the conformational sampling in cpHMD simulations.
|
36
|
King E, Aitchison E, Li H, Luo R. Recent Developments in Free Energy Calculations for Drug Discovery. Front Mol Biosci 2021; 8:712085. [PMID: 34458321 PMCID: PMC8387144 DOI: 10.3389/fmolb.2021.712085] [Citation(s) in RCA: 62] [Impact Index Per Article: 15.5] [Received: 05/19/2021] [Accepted: 07/27/2021] [Indexed: 01/11/2023] Open
Abstract
The grand challenge in structure-based drug design is achieving accurate prediction of binding free energies. Molecular dynamics (MD) simulations enable modeling of conformational changes critical to the binding process, leading to calculation of thermodynamic quantities involved in estimation of binding affinities. With recent advancements in computing capability and predictive accuracy, MD based virtual screening has progressed from the domain of theoretical attempts to real application in drug development. Approaches including the Molecular Mechanics Poisson Boltzmann Surface Area (MM-PBSA), Linear Interaction Energy (LIE), and alchemical methods have been broadly applied to model molecular recognition for drug discovery and lead optimization. Here we review the varied methodology of these approaches, developments enhancing simulation efficiency and reliability, remaining challenges hindering predictive performance, and applications to problems in the fields of medicine and biochemistry.
Affiliation(s)
- Edward King
- Department of Molecular Biology and Biochemistry, University of California, Irvine, CA, United States
| | - Erick Aitchison
- Department of Molecular Biology and Biochemistry, University of California, Irvine, CA, United States
| | - Han Li
- Department of Chemical and Biomolecular Engineering, University of California, Irvine, CA, United States
| | - Ray Luo
- Department of Molecular Biology and Biochemistry, University of California, Irvine, CA, United States
- Department of Chemical and Biomolecular Engineering, University of California, Irvine, CA, United States
- Department of Materials Science and Engineering, University of California, Irvine, CA, United States
- Department of Biomedical Engineering, University of California, Irvine, CA, United States
| |
|
37
|
pKa Calculations in Membrane Proteins from Molecular Dynamics Simulations. Methods Mol Biol 2021. [PMID: 34302677 DOI: 10.1007/978-1-0716-1468-6_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/01/2023]
Abstract
The conformational changes of membrane proteins are crucial to their function and usually lead to fluctuations in the electrostatic environment of the protein surface. A very effective way to quantify these changes is by calculating the pKa values of the protein's titratable residues, which can be regarded as electrostatic probes. To achieve this, we need to take advantage of the fast and reliable pKa calculators developed for globular proteins and adapt them to include the explicit effects of membranes. Here, we provide a detailed linear response approximation protocol that uses our own software (PypKa) to calculate reliable pKa values from short MD simulations of membrane proteins.
|
38
|
Petrus E, Bo C. Unlocking Phase Diagrams for Molybdenum and Tungsten Nanoclusters and Prediction of their Formation Constants. J Phys Chem A 2021; 125:5212-5219. [PMID: 34086467 DOI: 10.1021/acs.jpca.1c03292] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/30/2023]
Abstract
Understanding and controlling aqueous speciation of metal oxides are key for the discovery and development of novel materials, and challenge both experimental and computational approaches. Here we present a computational method, called POMSimulator, which is able to predict speciation phase diagrams (Conc. vs pH) for multispecies chemical equilibria in solution, and which we apply to molybdenum and tungsten isopolyoxoanions (IPAs). Starting from the MO4 monomers, and considering dimers, trimers, and larger species, the chemical reaction networks involved in the formation of [H32Mo36O128]⁸⁻ and [W12O42]¹²⁻ are sampled in an automatic manner. This information is used for setting up ∼10⁵ speciation models, and from there, we generate the speciation phase diagrams, which show an insightful picture of the behavior of IPAs in aqueous solution. Furthermore, we predict the values of 107 formation constants for a diversity of molybdenum and tungsten molecular oxides. Among these species, we could include several pentagonal-shaped species and very reactive tungsten intermediates as well. Last but not least, the calibration employed for correcting the density functional theory (DFT) Gibbs energies is remarkably similar for both metals, which suggests that a general rule might exist for correcting computed free energies for other metals.
Affiliation(s)
- Enric Petrus
- Institute of Chemical Research of Catalonia (ICIQ), The Barcelona Institute of Science and Technology (BIST), Av. Països Catalans, 16, 43007 Tarragona, Spain
| | - Carles Bo
- Institute of Chemical Research of Catalonia (ICIQ), The Barcelona Institute of Science and Technology (BIST), Av. Països Catalans, 16, 43007 Tarragona, Spain.,Departament de Química Física i Inorgánica, Universitat Rovira i Virgili, Marcel•lí Domingo s/n, 43007 Tarragona, Spain
| |
|