1
Ge Y, Pande V, Seierstad MJ, Damm-Ganamet KL. Exploring the Application of SiteMap and Site Finder for Focused Cryptic Pocket Identification. J Phys Chem B 2024; 128:6233-6245. [PMID: 38904218 DOI: 10.1021/acs.jpcb.4c00664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/22/2024]
Abstract
The characterization of cryptic pockets has been elusive, despite substantial efforts. Computational modeling approaches, such as molecular dynamics (MD) simulations, can provide atomic-level details of binding site motions and binding pathways. However, the time scale that MD can achieve at a reasonable cost often limits its application for cryptic pocket identification. Enhanced sampling techniques can improve the efficiency of MD simulations by focused sampling of important regions of the protein, but prior knowledge of the simulated system is required to define the appropriate coordinates. In the case of a novel, unknown cryptic pocket, such information is not available, limiting the application of enhanced sampling techniques for cryptic pocket identification. In this work, we explore the ability of SiteMap and Site Finder, widely used commercial packages for pocket identification, to detect focus points on the protein and further apply other advanced computational methods. The information gained from this analysis enables the use of computational modeling, including enhanced MD sampling techniques, to explore potential cryptic binding pockets suggested by SiteMap and Site Finder. Here, we examined SiteMap and Site Finder results on 136 known cryptic pockets from a combination of the PocketMiner dataset (a recently curated set of cryptic pockets), the Cryptosite Set (a classic set of cryptic pockets), and Natural killer group 2D (NKG2D, a protein target where a cryptic pocket is confirmed). Our findings demonstrate the application of existing, well-studied tools in efficiently mapping potential regions harboring cryptic pockets.
Affiliation(s)
- Yunhui Ge
  - Computer-Aided Drug Design, Therapeutics Discovery, Janssen Research & Development, 3210 Merryfield Row, San Diego, California 92121, United States
- Vineet Pande
  - Computer-Aided Drug Design, Therapeutics Discovery, Janssen Research & Development, Turnhoutseweg 30, 2340 Beerse, Belgium
- Mark J Seierstad
  - Computer-Aided Drug Design, Therapeutics Discovery, Janssen Research & Development, 3210 Merryfield Row, San Diego, California 92121, United States
- Kelly L Damm-Ganamet
  - Computer-Aided Drug Design, Therapeutics Discovery, Janssen Research & Development, 3210 Merryfield Row, San Diego, California 92121, United States
2
Raddi RM, Marshall T, Voelz VA. Automatic Forward Model Parameterization with Bayesian Inference of Conformational Populations. arXiv 2024:arXiv:2405.18532v1. [PMID: 38855540 PMCID: PMC11160882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2024]
Abstract
To quantify how well theoretical predictions of structural ensembles agree with experimental measurements, we depend on the accuracy of forward models. These models are computational frameworks that generate observable quantities from molecular configurations based on empirical relationships linking specific molecular properties to experimental measurements. Bayesian Inference of Conformational Populations (BICePs) is a reweighting algorithm that reconciles simulated ensembles with ensemble-averaged experimental observations, even when such observations are sparse and/or noisy. This is achieved by sampling the posterior distribution of conformational populations under experimental restraints as well as sampling the posterior distribution of uncertainties due to random and systematic error. In this study, we enhance the algorithm for the refinement of empirical forward model (FM) parameters. We introduce and evaluate two novel methods for optimizing FM parameters. The first method treats FM parameters as nuisance parameters, integrating over them in the full posterior distribution. The second method employs variational minimization of a quantity called the BICePs score that reports the free energy of "turning on" the experimental restraints. This technique, coupled with improved likelihood functions for handling experimental outliers, facilitates force field validation and optimization, as illustrated in recent studies (Raddi et al. 2023, 2024). Using this approach, we refine parameters that modulate the Karplus relation, crucial for accurate predictions of J-coupling constants based on dihedral angles (ϕ) between interacting nuclei. We validate this approach first with a toy model system, and then for human ubiquitin, predicting six sets of Karplus parameters for $^{3}J_{H^{N}H^{\alpha}}$, $^{3}J_{H^{\alpha}C'}$, $^{3}J_{H^{N}C^{\beta}}$, $^{3}J_{H^{N}C'}$, $^{3}J_{C'C^{\beta}}$, and $^{3}J_{C'C'}$. This approach, which does not rely on any predetermined parameterization, enhances predictive accuracy and can be used for many applications.
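As a point of reference for the forward model named in this abstract, the Karplus relation is commonly written as J(ϕ) = A cos²(ϕ + ϕ₀) + B cos(ϕ + ϕ₀) + C. The sketch below evaluates that form and its population-weighted ensemble average. The parameter values and the function names are illustrative assumptions, not the parameters or code reported by the authors.

```python
import math


def karplus(phi_deg, A, B, C, phase_deg=0.0):
    """Karplus relation: J = A*cos^2(theta) + B*cos(theta) + C, theta = phi + phase."""
    theta = math.radians(phi_deg + phase_deg)
    cos_t = math.cos(theta)
    return A * cos_t ** 2 + B * cos_t + C


def ensemble_average_j(phis_deg, populations, A, B, C, phase_deg=0.0):
    """Population-weighted average of J over a set of conformational states."""
    total = sum(populations)
    weighted = sum(
        p * karplus(phi, A, B, C, phase_deg)
        for phi, p in zip(phis_deg, populations)
    )
    return weighted / total


if __name__ == "__main__":
    # Hypothetical Karplus parameters in Hz, chosen only for illustration.
    A, B, C = 7.0, -1.5, 1.0
    for phi in (-120.0, -60.0, 60.0, 180.0):
        print(f"phi = {phi:7.1f} deg  ->  J = {karplus(phi, A, B, C):.2f} Hz")
    # Two hypothetical states with populations 0.7 and 0.3.
    print(ensemble_average_j([-60.0, 180.0], [0.7, 0.3], A, B, C))
```

Refining A, B, and C then amounts to adjusting these three numbers (per coupling type) so that the ensemble-averaged predictions agree with measured couplings.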
Affiliation(s)
- Robert M Raddi
  - Department of Chemistry, Temple University, Philadelphia, PA 19122, USA
- Tim Marshall
  - Department of Chemistry, Temple University, Philadelphia, PA 19122, USA
- Vincent A Voelz
  - Department of Chemistry, Temple University, Philadelphia, PA 19122, USA
3
Arvindekar S, Pathak AS, Majila K, Viswanath S. Optimizing representations for integrative structural modeling using Bayesian model selection. Bioinformatics 2024; 40:btae106. [PMID: 38391029 PMCID: PMC10924281 DOI: 10.1093/bioinformatics/btae106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Revised: 02/03/2024] [Accepted: 02/21/2024] [Indexed: 02/24/2024] Open
Abstract
MOTIVATION: Integrative structural modeling combines data from experiments, physical principles, statistics of previous structures, and prior models to obtain structures of macromolecular assemblies that are challenging to characterize experimentally. The choice of model representation is a key decision in integrative modeling, as it dictates the accuracy of scoring, efficiency of sampling, and resolution of analysis. But currently, the choice is usually made ad hoc, manually. RESULTS: Here, we report NestOR (Nested Sampling for Optimizing Representation), a fully automated, statistically rigorous method based on Bayesian model selection to identify the optimal coarse-grained representation for a given integrative modeling setup. Given an integrative modeling setup, it determines the optimal representations from given candidate representations based on their model evidence and sampling efficiency. The performance of NestOR was evaluated on a benchmark of four macromolecular assemblies. AVAILABILITY AND IMPLEMENTATION: NestOR is implemented in the Integrative Modeling Platform (https://integrativemodeling.org) and is available at https://github.com/isblab/nestor. Data for the benchmark is at https://www.doi.org/10.5281/zenodo.10360718.
Affiliation(s)
- Shreyas Arvindekar
  - National Center for Biological Sciences, Tata Institute of Fundamental Research, Bangalore 560065, India
- Aditi S Pathak
  - National Center for Biological Sciences, Tata Institute of Fundamental Research, Bangalore 560065, India
- Kartik Majila
  - National Center for Biological Sciences, Tata Institute of Fundamental Research, Bangalore 560065, India
- Shruthi Viswanath
  - National Center for Biological Sciences, Tata Institute of Fundamental Research, Bangalore 560065, India
4
Raddi RM, Voelz VA. Markov State Model of Solvent Features Reveals Water Dynamics in Protein-Peptide Binding. J Phys Chem B 2023; 127:10682-10690. [PMID: 38078851 DOI: 10.1021/acs.jpcb.3c04775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/22/2023]
Abstract
In this work, we investigate the role of solvent in the binding reaction of the p53 transactivation domain (TAD) peptide to its receptor MDM2. Previously, our group generated 831 μs of explicit-solvent aggregate molecular simulation trajectory data for the MDM2-p53 peptide binding reaction using large-scale distributed computing and subsequently built a Markov State Model (MSM) of the binding reaction (Zhou et al. 2017). Here, we perform a tICA analysis and construct an MSM with similar hyperparameters while using only solvent-based structural features. We find a remarkably similar landscape but accelerated implied timescales for the slowest motions. The solvent shells contributing most to the first tICA eigenvector are those centered on Lys24 and Thr18 of the p53 TAD peptide in the range of 3-6 Å. Important solvent shells were visualized to reveal solvation and desolvation transitions along the peptide-protein binding trajectories. Our results provide a solvent-centric view of the hydrophobic effect in action for a realistic peptide-protein binding scenario.
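To make the term "implied timescale" in this abstract concrete, the following is a minimal two-state sketch in plain Python. It counts transitions at a lag time τ, row-normalizes them into a transition matrix, and converts the non-stationary eigenvalue λ into a timescale t = -τ / ln λ. The two-state trajectory, the state labels, and all function names are assumptions made for illustration; this is not the authors' tICA/MSM pipeline.

```python
import math
import random


def count_transitions(dtraj, lag):
    """Count i -> j transitions separated by `lag` steps in a 0/1 state trajectory."""
    counts = [[0, 0], [0, 0]]
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i][j] += 1
    return counts


def row_normalize(counts):
    """Turn a count matrix into a row-stochastic transition matrix."""
    T = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("a state was never visited; its row cannot be estimated")
        T.append([c / total for c in row])
    return T


def implied_timescale(T, lag):
    """For a 2x2 stochastic matrix the eigenvalues are 1 and trace(T) - 1."""
    lam = T[0][0] + T[1][1] - 1.0
    if not 0.0 < lam < 1.0:
        raise ValueError("second eigenvalue must lie in (0, 1) for a real timescale")
    return -lag / math.log(lam)


if __name__ == "__main__":
    # Hypothetical two-state process, e.g. 0 = solvated shell, 1 = desolvated shell.
    true_T = [[0.9, 0.1], [0.2, 0.8]]
    rng = random.Random(0)
    state, dtraj = 0, []
    for _ in range(20000):
        dtraj.append(state)
        state = 0 if rng.random() < true_T[state][0] else 1
    T_est = row_normalize(count_transitions(dtraj, lag=1))
    print("estimated T:", T_est)
    print("implied timescale (steps):", implied_timescale(T_est, lag=1))
```

In a real MSM there are many states and one implied timescale per non-stationary eigenvalue; "accelerated implied timescales" means these values come out shorter than in the reference model.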
Affiliation(s)
- Robert M Raddi
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
- Vincent A Voelz
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
5
Arvindekar S, Pathak AS, Majila K, Viswanath S. Optimizing representations for integrative structural modeling using Bayesian model selection. bioRxiv 2023:2023.12.12.571227. [PMID: 38168172 PMCID: PMC10760022 DOI: 10.1101/2023.12.12.571227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
Motivation: Integrative structural modeling combines data from experiments, physical principles, statistics of previous structures, and prior models to obtain structures of macromolecular assemblies that are challenging to characterize experimentally. The choice of model representation is a key decision in integrative modeling, as it dictates the accuracy of scoring, efficiency of sampling, and resolution of analysis. But currently, the choice is usually made ad hoc, manually. Results: Here, we report NestOR (Nested Sampling for Optimizing Representation), a fully automated, statistically rigorous method based on Bayesian model selection to identify the optimal coarse-grained representation for a given integrative modeling setup. Given an integrative modeling setup, it determines the optimal representations from given candidate representations based on their model evidence and sampling efficiency. The performance of NestOR was evaluated on a benchmark of four macromolecular assemblies. Availability: NestOR is implemented in the Integrative Modeling Platform (https://integrativemodeling.org) and is available at https://github.com/isblab/nestor.
Affiliation(s)
- Shreyas Arvindekar
  - National Center for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India 560065
- Aditi S. Pathak
  - National Center for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India 560065
- Kartik Majila
  - National Center for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India 560065
- Shruthi Viswanath
  - National Center for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India 560065
6
Habeck M. Bayesian methods in integrative structure modeling. Biol Chem 2023; 404:741-754. [PMID: 37505205 DOI: 10.1515/hsz-2023-0145] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/27/2023] [Accepted: 07/07/2023] [Indexed: 07/29/2023]
Abstract
There is a growing interest in characterizing the structure and dynamics of large biomolecular assemblies and their interactions within the cellular environment. A diverse array of experimental techniques allows us to study biomolecular systems on a variety of length and time scales. These techniques range from imaging with light, X-rays or electrons, to spectroscopic methods, cross-linking mass spectrometry and functional genomics approaches, and are complemented by AI-assisted protein structure prediction methods. A challenge is to integrate all of these data into a model of the system and its functional dynamics. This review focuses on Bayesian approaches to integrative structure modeling. We sketch the principles of Bayesian inference, highlight recent applications to integrative modeling and conclude with a discussion of current challenges and future perspectives.
Affiliation(s)
- Michael Habeck
  - Microscopic Image Analysis Group, Jena University Hospital, D-07743 Jena, Germany
  - Max Planck Institute for Multidisciplinary Sciences, D-37077 Göttingen, Germany
7
Voelz VA, Pande VS, Bowman GR. Folding@home: Achievements from over 20 years of citizen science herald the exascale era. Biophys J 2023; 122:2852-2863. [PMID: 36945779 PMCID: PMC10398258 DOI: 10.1016/j.bpj.2023.03.028] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 09/02/2022] [Revised: 01/26/2023] [Accepted: 03/16/2023] [Indexed: 03/23/2023] Open
Abstract
Simulations of biomolecules have enormous potential to inform our understanding of biology but require extremely demanding calculations. For over 20 years, the Folding@home distributed computing project has pioneered a massively parallel approach to biomolecular simulation, harnessing the resources of citizen scientists across the globe. Here, we summarize the scientific and technical advances this perspective has enabled. As the project's name implies, the early years of Folding@home focused on driving advances in our understanding of protein folding by developing statistical methods for capturing long-timescale processes and facilitating insight into complex dynamical processes. Success laid a foundation for broadening the scope of Folding@home to address other functionally relevant conformational changes, such as receptor signaling, enzyme dynamics, and ligand binding. Continued algorithmic advances, hardware developments such as graphics processing unit (GPU)-based computing, and the growing scale of Folding@home have enabled the project to focus on new areas where massively parallel sampling can be impactful. While previous work sought to expand toward larger proteins with slower conformational changes, new work focuses on large-scale comparative studies of different protein sequences and chemical compounds to better understand biology and inform the development of small-molecule drugs. Progress on these fronts enabled the community to pivot quickly in response to the COVID-19 pandemic, expanding to become the world's first exascale computer and deploying this massive resource to provide insight into the inner workings of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus and aid the development of new antivirals. This success provides a glimpse of what is to come as exascale supercomputers come online and as Folding@home continues its work.
Affiliation(s)
- Vincent A Voelz
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania
- Gregory R Bowman
  - Departments of Biochemistry & Biophysics and of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania
8
Raddi RM, Ge Y, Voelz VA. BICePs v2.0: Software for Ensemble Reweighting Using Bayesian Inference of Conformational Populations. J Chem Inf Model 2023; 63:2370-2381. [PMID: 37027181 PMCID: PMC10278562 DOI: 10.1021/acs.jcim.2c01296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/08/2023]
Abstract
Bayesian Inference of Conformational Populations (BICePs) version 2.0 (v2.0) is a free, open-source Python package that reweights theoretical predictions of conformational state populations using sparse and/or noisy experimental measurements. In this article, we describe the implementation and usage of the latest version of BICePs (v2.0), a powerful, user-friendly and extensible package which makes several improvements upon the previous version. The algorithm now supports many experimental NMR observables (NOE distances, chemical shifts, J-coupling constants, and hydrogen-deuterium exchange protection factors), and enables convenient data preparation and processing. BICePs v2.0 can perform automatic analysis of the sampled posterior, including visualization, and evaluation of statistical significance and sampling convergence. We provide specific coding examples for these topics, and present a detailed example illustrating how to use BICePs v2.0 to reweight a theoretical ensemble using experimental measurements.
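The idea of reweighting state populations against a noisy measurement can be illustrated with a deliberately simplified sketch. It assumes a single observable, a Gaussian likelihood, and a fixed uncertainty σ, so the posterior population of each state is proportional to its prior population times exp(-(r_state - r_exp)² / 2σ²). This is not the BICePs API, and it omits the sampling over uncertainty parameters that the abstracts above describe; every name and number here is hypothetical.

```python
import math


def reweight(prior_pops, predicted, measured, sigma):
    """Posterior populations for discrete states under a fixed-sigma Gaussian likelihood."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    weights = [
        p * math.exp(-((r - measured) ** 2) / (2.0 * sigma ** 2))
        for p, r in zip(prior_pops, predicted)
    ]
    norm = sum(weights)
    return [w / norm for w in weights]


if __name__ == "__main__":
    # Hypothetical three-state ensemble with a predicted distance (in Angstrom) per state.
    prior = [0.5, 0.3, 0.2]
    predicted_distance = [3.0, 4.5, 6.0]
    posterior = reweight(prior, predicted_distance, measured=4.4, sigma=0.5)
    for p0, p1 in zip(prior, posterior):
        print(f"prior {p0:.2f} -> posterior {p1:.3f}")
```

States whose predicted observable lies close to the measurement gain population, and a larger σ (a noisier measurement) pulls the posterior back toward the prior.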
Affiliation(s)
- Robert M Raddi
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
- Yunhui Ge
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
- Vincent A Voelz
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
9
Hurley MFD, Northrup JD, Ge Y, Schafmeister CE, Voelz VA. Metal Cation-Binding Mechanisms of Q-Proline Peptoid Macrocycles in Solution. J Chem Inf Model 2021; 61:2818-2828. [PMID: 34125519 DOI: 10.1021/acs.jcim.1c00447] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
The rational design of foldable and functionalizable peptidomimetic scaffolds requires the concerted application of both computational and experimental methods. Recently, a new class of designed peptoid macrocycle incorporating spiroligomer proline mimics (Q-prolines) has been found to preorganize when bound by monovalent metal cations. To determine the solution-state structure of these cation-bound macrocycles, we employ a Bayesian inference method (BICePs) to reconcile enhanced-sampling molecular simulations with sparse ROESY correlations from experimental NMR studies to predict and design conformational and binding properties of macrocycles as functional scaffolds for peptidomimetics. Conformations predicted to be most populated in solution were then simulated in the presence of explicit cations to yield trajectories with observed binding events, revealing a highly preorganized all-trans amide conformation, whose formation is likely limited by the slow rate of cis/trans isomerization. Interestingly, this conformation differs from a racemic crystal structure solved in the absence of cation. Free energies of cation binding computed from distance-dependent potentials of mean force suggest Na+ has a higher affinity to the macrocycle than K+, with both cations binding much more strongly in acetonitrile than water. The simulated affinities are able to correctly rank the extent to which different macrocycle sequences exhibit preorganization in the presence of different metal cations and solvents, suggesting our approach is suitable for solution-state computational design.
Affiliation(s)
- Matthew F D Hurley
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
- Justin D Northrup
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
- Yunhui Ge
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
- Vincent A Voelz
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States