1
Chen L, Mondal A, Perez A, Miranda-Quintana RA. Protein Retrieval via Integrative Molecular Ensembles (PRIME) through Extended Similarity Indices. J Chem Theory Comput 2024; 20:6303-6315. [PMID: 38978294 DOI: 10.1021/acs.jctc.4c00362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Molecular dynamics (MD) simulations are ideally suited to describe conformational ensembles of biomolecules such as proteins and nucleic acids. Microsecond-long simulations are now routine, facilitated by the emergence of graphics processing units. Clustering, which groups objects based on structural similarity, is typically used to process ensembles, yielding the different states, their populations, and representative structures. A popular pipeline applies hierarchical clustering and selects each cluster's centroid as its representative. Here, we propose to improve on this approach by developing a module, Protein Retrieval via Integrative Molecular Ensembles (PRIME), that consists of tools to improve the prediction of the representative in the most populated cluster using extended continuous similarity. PRIME is integrated with our Molecular Dynamics Analysis with N-ary Clustering Ensembles (MDANCE) package and can be used as a postprocessing tool for arbitrary clustering algorithms, compatible with several MD suites. PRIME predictions produced structures that, when aligned to the experimental structure, were better superposed (lower RMSD). A further benefit of PRIME is its linear scaling, rather than the O(N^2) scaling traditionally associated with comparisons of elements in a set.
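The contrast between linear and O(N^2) scaling mentioned in the abstract can be illustrated with a generic identity. This is a minimal sketch, not PRIME's or MDANCE's actual algorithm: it assumes each frame is a flat list of coordinates and uses squared Euclidean distance as a stand-in dissimilarity. The function names are hypothetical. The sum of squared distances over all pairs equals N times the sum of squared norms minus the squared norm of the column sums, so a single pass over the frames suffices.

```python
import random

def pairwise_sq_sum_naive(frames):
    """O(N^2): sum of squared Euclidean distances over all unordered pairs."""
    total = 0.0
    for i in range(len(frames)):
        for j in range(i + 1, len(frames)):
            total += sum((a - b) ** 2 for a, b in zip(frames[i], frames[j]))
    return total

def pairwise_sq_sum_linear(frames):
    """O(N): same quantity from running column sums and a sum of squares."""
    n = len(frames)
    col_sum = [0.0] * len(frames[0])  # per-coordinate sum over all frames
    sq_sum = 0.0                      # sum of squared norms of all frames
    for frame in frames:
        for d, value in enumerate(frame):
            col_sum[d] += value
            sq_sum += value * value
    # sum_{i<j} |x_i - x_j|^2 = N * sum_i |x_i|^2 - |sum_i x_i|^2
    return n * sq_sum - sum(s * s for s in col_sum)

# Synthetic "frames" (illustrative data only): 50 frames, 6 coordinates each.
random.seed(0)
frames = [[random.random() for _ in range(6)] for _ in range(50)]
naive = pairwise_sq_sum_naive(frames)
linear = pairwise_sq_sum_linear(frames)
```

Both functions return the same value up to floating-point rounding, but the second touches each frame only once, which is the general idea behind set-level (n-ary) comparisons.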
Affiliation(s)
- Lexin Chen
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
- Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
- Arup Mondal
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
- Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
- Alberto Perez
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
- Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
- Ramón Alain Miranda-Quintana
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
- Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
2
Parui S, Brini E, Dill KA. Computing Free Energies of Fold-Switching Proteins Using MELD x MD. J Chem Theory Comput 2023; 19:6839-6847. [PMID: 37725050 DOI: 10.1021/acs.jctc.3c00679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/21/2023]
Abstract
Some proteins are conformational switches, able to transition between relatively different conformations. Understanding what drives them requires computing the free-energy difference ΔG_AB between their stable states, A and B. Molecular dynamics (MD) simulations alone are often slow because they require a reaction coordinate and must sample many transitions in between. Here, we show that modeling employing limited data (MELD) x MD on known endstates A and B is accurate and efficient because it does not require passing over barriers or knowing reaction coordinates. We validate this method on two problems: (1) it gives correct relative populations of α and β conformers for small designed chameleon sequences of protein G; and (2) it correctly predicts the conformations of the C-terminal domain (CTD) of RfaH. Free-energy methods like MELD x MD can often resolve structures that confuse machine-learning (ML) methods.
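The link between relative populations and the free-energy difference can be sketched with the standard two-state Boltzmann relation. This is a minimal illustration, not the MELD x MD protocol itself: it assumes the sign convention ΔG_AB = G_B - G_A = -RT ln(p_B / p_A), and the function name and the default temperature of 300 K are hypothetical choices.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 0.0019872041

def delta_g_ab(pop_a, pop_b, temperature=300.0):
    """Free-energy difference G_B - G_A (kcal/mol) from state populations.

    Assumed convention: a positive value means state A is favored.
    """
    if pop_a <= 0.0 or pop_b <= 0.0:
        raise ValueError("populations must be positive")
    return -R_KCAL * temperature * math.log(pop_b / pop_a)

# Illustrative populations only (not results from the paper).
dg_equal = delta_g_ab(0.5, 0.5)    # equal populations give zero
dg_a_major = delta_g_ab(0.9, 0.1)  # A is more populated, so the value is positive
```

Only the ratio of populations matters, so the same function applies whether the inputs are fractions or raw frame counts.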
Affiliation(s)
- Sridip Parui
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
- Emiliano Brini
- School of Chemistry and Materials Science, 85 Lomb Memorial Drive, Rochester, New York 14623, United States
- Ken A Dill
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York 11794, United States
3
Parui S, Robertson JC, Somani S, Tresadern G, Liu C, Dill KA. MELD-Bracket Ranks Binding Affinities of Diverse Sets of Ligands. J Chem Inf Model 2023; 63:2857-2865. [PMID: 37093848 DOI: 10.1021/acs.jcim.3c00243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2023]
Abstract
Affinity ranking of structurally diverse small-molecule ligands is a challenging problem with important applications in structure-based drug discovery. Absolute binding free energy methods can model diverse ligands, but the high computational cost of the current methods limits application to data sets with few ligands. We recently developed MELD-Bracket, a molecular dynamics method for efficient affinity ranking of ligands [JCTC 2022, 18 (1), 374-379]. It utilizes a Bayesian framework to guide sampling to relevant regions of phase space, and it couples this with a bracket-like competition on a pool of ligands. Here we find that 6-competitor MELD-Bracket can rank dozens of diverse ligands that have low structural similarity and different net charges. We benchmark it on four protein systems (PTB1B, Tyk2, BACE, and JAK3) having varied modes of interactions. We also validated 8-competitor and 12-competitor protocols. The MELD-Bracket protocols presented here may have the appropriate balance of accuracy and computational efficiency to be suitable for ranking diverse ligands from typical drug discovery campaigns.
Affiliation(s)
- Sridip Parui
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
- James C Robertson
- Janssen Research and Development, Spring House, Pennsylvania 19477, United States
- Sandeep Somani
- Janssen Research and Development, Spring House, Pennsylvania 19477, United States
- Gary Tresadern
- Janssen Research and Development, Turnhoutseweg 30, Beerse B-2340, Belgium
- Cong Liu
- Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, United States
- Ken A Dill
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York 11794, United States
4
Nassar R, Brini E, Parui S, Liu C, Dignon GL, Dill KA. Accelerating Protein Folding Molecular Dynamics Using Inter-Residue Distances from Machine Learning Servers. J Chem Theory Comput 2022; 18:1929-1935. [PMID: 35133832 PMCID: PMC9281603 DOI: 10.1021/acs.jctc.1c00916] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 11/30/2022]
Abstract
Recently, predicting the native structures of proteins has become possible using computational molecular physics (CMP), that is, physics-based force fields sampled with proper statistics, but only for small proteins. Algorithms with better scaling are needed. We describe ML x MELD x MD, a molecular dynamics (MD) method that inputs residue contacts derived from machine learning (ML) servers into MELD, a Bayesian accelerator that preserves detailed-balance statistics. Contacts are derived from trRosetta-predicted distance histograms (distograms) and are integrated into MELD's atomistic MD as spatial restraints through parametrized potential functions. In the CASP14 blind prediction event, ML x MELD x MD predicted 13 native structures to better than 4.5 Å error, including for 10 proteins in the range of 115-250 amino acids long. Also, the scaling of simulation time vs protein length is much better than unguided MD: t_sim ∼ e^(0.023N) for ML x MELD x MD vs t_sim ∼ e^(0.168N) for MD alone. This shows how machine learning information can be leveraged to advance physics-based modeling of proteins.
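To make the two scaling exponents concrete, the sketch below compares how simulation time grows with chain length under each law. It assumes the pure exponential form t_sim ∼ e^(kN) quoted in the abstract; the prefactors are not given there, so only relative growth within each method is computed, and the function names are hypothetical.

```python
import math

# Exponents quoted in the abstract: t_sim ~ exp(k * N), N = number of residues.
K_GUIDED = 0.023    # ML x MELD x MD
K_UNGUIDED = 0.168  # MD alone

def growth_factor(k, extra_residues):
    """Multiplicative increase in simulation time when the chain
    grows by `extra_residues`, assuming t_sim ~ exp(k * N)."""
    return math.exp(k * extra_residues)

def doubling_length(k):
    """Number of added residues that doubles the simulation time."""
    return math.log(2.0) / k

guided_100 = growth_factor(K_GUIDED, 100)      # roughly a 10-fold increase
unguided_100 = growth_factor(K_UNGUIDED, 100)  # roughly a 2e7-fold increase
```

Under these exponents, the guided method's cost doubles about every 30 added residues, while unguided MD doubles about every 4.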
Affiliation(s)
- Roy Nassar
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Emiliano Brini
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
- Sridip Parui
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
- Cong Liu
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Gregory L. Dignon
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
- Ken A. Dill
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
- Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York 11794, United States
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
5
Ovchinnikov S, Huang PS. Structure-based protein design with deep learning. Curr Opin Chem Biol 2021; 65:136-144. [PMID: 34547592 PMCID: PMC8671290 DOI: 10.1016/j.cbpa.2021.08.004] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 06/07/2021] [Accepted: 08/13/2021] [Indexed: 12/11/2022]
Abstract
Since the first revelation of proteins functioning as macromolecular machines through their three-dimensional structures, researchers have been intrigued by the marvelous ways in which proteins carry out biochemical processes. The aspiration to understand protein structures has fueled extensive efforts across different scientific disciplines. In recent years, it has been demonstrated that proteins with new functionality or shapes can be designed via structure-based modeling methods, and the design strategies have combined all available information - but largely piece-by-piece - from sequence-derived statistics to the detailed atomic-level modeling of chemical interactions. Despite the significant progress, incorporating data-derived approaches through the use of deep learning methods can be a game changer. In this review, we summarize current progress, compare the arc of developing the deep learning approaches with the conventional methods, and describe the motivation and concepts behind current strategies that may lead to potential future opportunities.
Affiliation(s)
- Sergey Ovchinnikov
- John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA, 02138, USA.
- Po-Ssu Huang
- Department of Bioengineering, Stanford University, Stanford, CA, 94305, USA.
6
Sharma B, Dill KA. MELD-accelerated molecular dynamics help determine amyloid fibril structures. Commun Biol 2021; 4:942. [PMID: 34354239 PMCID: PMC8342454 DOI: 10.1038/s42003-021-02461-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2020] [Accepted: 07/15/2021] [Indexed: 02/07/2023] Open
Abstract
It is challenging to determine the structures of protein fibrils such as amyloids. In principle, Molecular Dynamics (MD) modeling can aid experiments, but normal MD has been impractical for these large multi-molecules. Here, we show that MELD-accelerated MD (MELD x MD) can give amyloid structures from limited data. Five long-chain fibril structures are accurately predicted from NMR and Solid State NMR (SSNMR) data. Ten short-chain fibril structures are accurately predicted from more limited restraint information derived from the knowledge of strand directions. Although the present study only tests against structure predictions - which are the most detailed form of validation currently available - the main promise of this physical approach is ultimately in going beyond structures to also give mechanical properties, conformational ensembles, and relative stabilities.
Affiliation(s)
- Bhanita Sharma
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA
- Ken A. Dill
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA
- Department of Physics and Astronomy, Stony Brook University, Stony Brook, NY, USA
- Departments of Chemistry and Physics, Stony Brook University, Stony Brook, NY, USA
7
Abstract
Every protein has a story: how it folds, what it binds, its biological actions, and how it misbehaves in aging or disease. Stories are often inferred from a protein's shape (i.e., its structure). But increasingly, stories are told using computational molecular physics (CMP). CMP is rooted in the principled physics of driving forces and reveals granular detail of conformational populations in space and time. Recent advances are accessing longer time scales, larger actions, and blind testing, enabling more of biology's stories to be told in the language of atomistic physics.
Affiliation(s)
- Emiliano Brini
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
- Carlos Simmerling
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
- Department of Chemistry, Stony Brook University, Stony Brook, NY 11794, USA
- Ken Dill
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
- Department of Chemistry, Stony Brook University, Stony Brook, NY 11794, USA
- Department of Physics and Astronomy, Stony Brook University, Stony Brook, NY 11794, USA
8
Binding Ensembles of p53-MDM2 Peptide Inhibitors by Combining Bayesian Inference and Atomistic Simulations. Molecules 2021; 26:molecules26010198. [PMID: 33401765 PMCID: PMC7795311 DOI: 10.3390/molecules26010198] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/13/2020] [Revised: 12/26/2020] [Accepted: 12/28/2020] [Indexed: 01/21/2023] Open
Abstract
Designing peptide inhibitors of the p53-MDM2 interaction against cancer is of wide interest. Computational modeling and virtual screening are a well-established step in the rational design of small molecules, but they face challenges for binding flexible peptide molecules that fold upon binding. We look at the ability of five different peptides, three of which are intrinsically disordered, to bind to MDM2 with a new Bayesian inference approach (MELD × MD). The method is able to capture the folding-upon-binding mechanism and differentiate binding preferences between the five peptides. Processing the ensembles with statistical mechanics tools depicts the most likely bound conformations and hints at differences in the binding mechanism. Finally, the study shows the importance of capturing two driving forces to binding in this system: the ability of peptides to adopt bound conformations (ΔG_conformation) and the interaction between interface residues (ΔG_interaction).
9
Liu C, Brini E, Perez A, Dill KA. Computing Ligands Bound to Proteins Using MELD-Accelerated MD. J Chem Theory Comput 2020; 16:6377-6382. [PMID: 32910647 DOI: 10.1021/acs.jctc.0c00543] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/28/2022]
Abstract
Predicting the poses of small-molecule ligands in protein binding sites is often done by virtual screening algorithms such as DOCK. In principle, molecular dynamics (MD) using atomistic force fields could give better free-energy-based pose selection, but MD is computationally expensive. Here, we ask if modeling employing limited data (MELD)-accelerated MD (MELD × MD) can pick out the best DOCK poses taken as input. We study 30 different ligand-protein pairs. MELD × MD finds native poses, based on best free energies, in 23 out of the 30 cases, 20 of which were previously known DOCK failures. We conclude that MELD × MD can add value for predicting accurate poses of small molecules bound to proteins.
Affiliation(s)
- Cong Liu
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794-5252, United States
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11790-3400, United States
- Emiliano Brini
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794-5252, United States
- Alberto Perez
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
- Ken A Dill
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794-5252, United States
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11790-3400, United States
- Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York 11794-3800, United States
10
Khramushin A, Marcu O, Alam N, Shimony O, Padhorny D, Brini E, Dill KA, Vajda S, Kozakov D, Schueler-Furman O. Modeling beta-sheet peptide-protein interactions: Rosetta FlexPepDock in CAPRI rounds 38-45. Proteins 2020; 88:1037-1049. [PMID: 31891416 PMCID: PMC7539656 DOI: 10.1002/prot.25871] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/06/2019] [Revised: 12/17/2019] [Accepted: 12/26/2019] [Indexed: 01/09/2023]
Abstract
Peptide-protein docking is challenging due to the considerable conformational freedom of the peptide. CAPRI rounds 38-45 included two peptide-protein interactions, both characterized by a peptide forming an additional beta strand of a beta sheet in the receptor. Using the Rosetta FlexPepDock peptide docking protocol we generated top-performing, high-accuracy models for targets 134 and 135, involving an interaction between a peptide derived from L-MAG and DLC8. In addition, we were able to generate the only medium-accuracy models for a particularly challenging target, T121. In contrast to the classical peptide-mediated interaction, in which receptor side chains contact both peptide backbone and side chains, beta-sheet complementation involves a major contribution to binding by hydrogen bonds between main chain atoms. To determine how binding affinity and specificity are established in this special class of peptide-protein interactions, we extracted PeptiDBeta, a benchmark of solved structures of different protein domains that are bound by peptides via beta-sheet complementation, and tested our protocol for global peptide-docking PIPER-FlexPepDock on this dataset. We find that the beta-strand part of the peptide is sufficient to generate approximate and even high-resolution models of many interactions, but inclusion of adjacent motif residues often provides additional information necessary to achieve high-resolution model quality.
Affiliation(s)
- Alisa Khramushin
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University, Jerusalem, Israel
- Orly Marcu
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University, Jerusalem, Israel
- Nawsad Alam
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University, Jerusalem, Israel
- Orly Shimony
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University, Jerusalem, Israel
- Dzmitry Padhorny
- Department of Applied Mathematics and Statistics, Stony Brook University, New York, New York
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, New York, New York
- Emiliano Brini
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, New York, New York
- Ken A. Dill
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, New York, New York
- Department of Physics and Astronomy, Stony Brook University, New York, New York
- Department of Chemistry, Stony Brook University, New York, New York
- Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts
- Department of Chemistry, Boston University, Boston, Massachusetts
- Dima Kozakov
- Department of Applied Mathematics and Statistics, Stony Brook University, New York, New York
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, New York, New York
- Ora Schueler-Furman
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University, Jerusalem, Israel
11
Robertson JC, Nassar R, Liu C, Brini E, Dill KA, Perez A. NMR-assisted protein structure prediction with MELDxMD. Proteins 2019; 87:1333-1340. [PMID: 31350773 DOI: 10.1002/prot.25788] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 04/26/2019] [Revised: 06/27/2019] [Accepted: 07/19/2019] [Indexed: 12/19/2022]
Abstract
We describe the performance of MELD-accelerated molecular dynamics (MELDxMD) in determining protein structures in the NMR-data-assisted category in CASP13. Seeded from web server predictions, MELDxMD was found to be the best in the NMR category, over 17 targets, outperforming the next-best groups by a factor of ~4 in z-score. MELDxMD gives ensembles, not single structures; succeeds on a 326-mer, near the current upper limit for NMR structures; and predicts structures that match experimental residual dipolar couplings even though the only NMR-derived data used in the simulations were NOE-based ambiguous atom-atom contacts and backbone dihedrals. MELD can use noisy and ambiguous experimental information to reduce the MD search space. We believe MELDxMD is a promising method for determining protein structures from NMR data.
Affiliation(s)
- James C Robertson
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York
- Roy Nassar
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York
- Department of Chemistry, Stony Brook University, Stony Brook, New York
- Cong Liu
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York
- Department of Chemistry, Stony Brook University, Stony Brook, New York
- Emiliano Brini
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York
- Ken A Dill
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York
- Department of Chemistry, Stony Brook University, Stony Brook, New York
- Department of Physics & Astronomy, Stony Brook University, Stony Brook, New York
- Alberto Perez
- Department of Chemistry, University of Florida, Gainesville, Florida