1
Lu W, Zhang J, Huang W, Zhang Z, Jia X, Wang Z, Shi L, Li C, Wolynes PG, Zheng S. DynamicBind: predicting ligand-specific protein-ligand complex structure with a deep equivariant generative model. Nat Commun 2024; 15:1071. [PMID: 38316797 PMCID: PMC10844226 DOI: 10.1038/s41467-024-45461-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Accepted: 01/24/2024] [Indexed: 02/07/2024] Open
Abstract
While significant advances have been made in predicting static protein structures, the inherent dynamics of proteins, modulated by ligands, are crucial for understanding protein function and facilitating drug discovery. Traditional docking methods, frequently used in studying protein-ligand interactions, typically treat proteins as rigid. While molecular dynamics simulations can propose appropriate protein conformations, they're computationally demanding due to rare transitions between biologically relevant equilibrium states. In this study, we present DynamicBind, a deep learning method that employs equivariant geometric diffusion networks to construct a smooth energy landscape, promoting efficient transitions between different equilibrium states. DynamicBind accurately recovers ligand-specific conformations from unbound protein structures without the need for holo-structures or extensive sampling. Remarkably, it demonstrates state-of-the-art performance in docking and virtual screening benchmarks. Our experiments reveal that DynamicBind can accommodate a wide range of large protein conformational changes and identify cryptic pockets in unseen protein targets. As a result, DynamicBind shows potential in accelerating the development of small molecules for previously undruggable targets and expanding the horizons of computational drug discovery.
Affiliation(s)
- Wei Lu
  - Galixir Technologies, 200100, Shanghai, China
- Weifeng Huang
  - School of Pharmaceutical Science, Sun Yat-sen University, 510006, Guangzhou, China
- Xiangyu Jia
  - Galixir Technologies, 200100, Shanghai, China
- Zhenyu Wang
  - Galixir Technologies, 200100, Shanghai, China
- Leilei Shi
  - Galixir Technologies, 200100, Shanghai, China
- Chengtao Li
  - Galixir Technologies, 200100, Shanghai, China
- Peter G Wolynes
  - Center for Theoretical Biological Physics and Department of Chemistry, Rice University, Houston, TX, 77005, USA
- Shuangjia Zheng
  - Global Institute of Future Technology, Shanghai Jiao Tong University, 200240, Shanghai, China
2
Freiberger MI, Ruiz-Serra V, Pontes C, Romero-Durana M, Galaz-Davison P, Ramírez-Sarmiento CA, Schuster CD, Marti MA, Wolynes PG, Ferreiro DU, Parra RG, Valencia A. Local energetic frustration conservation in protein families and superfamilies. Nat Commun 2023; 14:8379. [PMID: 38104123 PMCID: PMC10725452 DOI: 10.1038/s41467-023-43801-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/23/2023] [Accepted: 11/20/2023] [Indexed: 12/19/2023] Open
Abstract
Energetic local frustration offers a biophysical perspective to interpret the effects of sequence variability on protein families. Here we present a methodology to analyze local frustration patterns within protein families and superfamilies that allows us to uncover constraints related to stability and function, and identify differential frustration patterns in families with a common ancestry. We analyze these signals in very well studied protein families such as PDZ, SH3, α and β globins and RAS families. Recent advances in protein structure prediction make it possible to analyze a vast majority of the protein space. An automatic and unsupervised proteome-wide analysis on the SARS-CoV-2 virus demonstrates the potential of our approach to enhance our understanding of the natural phenotypic diversity of protein families beyond single protein instances. We apply our method to modify biophysical properties of natural proteins based on their family properties, as well as perform unsupervised analysis of large datasets to shed light on the physicochemical signatures of poorly characterized proteins such as the ones belonging to emergent pathogens.
Affiliation(s)
- Maria I Freiberger
  - Laboratorio de Fisiología de Proteínas, Departamento de Química Biológica - IQUIBICEN/CONICET, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, C1428EGA, Argentina
- Victoria Ruiz-Serra
  - Computational Biology Group, Life Sciences Department, Barcelona Supercomputing Center, Barcelona, Spain
- Camila Pontes
  - Computational Biology Group, Life Sciences Department, Barcelona Supercomputing Center, Barcelona, Spain
- Miguel Romero-Durana
  - Computational Biology Group, Life Sciences Department, Barcelona Supercomputing Center, Barcelona, Spain
- Pablo Galaz-Davison
  - Institute for Biological and Medical Engineering, Schools of Engineering, Medicine, and Biological Sciences, Pontificia Universidad Católica de Chile, Santiago, 7820436, Chile
  - ANID - Millennium Science Initiative Program - Millennium Institute for Integrative Biology (iBio), Santiago, 8331150, Chile
- Cesar A Ramírez-Sarmiento
  - Institute for Biological and Medical Engineering, Schools of Engineering, Medicine, and Biological Sciences, Pontificia Universidad Católica de Chile, Santiago, 7820436, Chile
  - ANID - Millennium Science Initiative Program - Millennium Institute for Integrative Biology (iBio), Santiago, 8331150, Chile
- Claudio D Schuster
  - Laboratorio de Bioinformática, Departamento de Química Biológica - IQUIBICEN/CONICET, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, C1428EGA, Buenos Aires, Argentina
- Marcelo A Marti
  - Laboratorio de Bioinformática, Departamento de Química Biológica - IQUIBICEN/CONICET, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, C1428EGA, Buenos Aires, Argentina
- Peter G Wolynes
  - Center for Theoretical Biological Physics and Department of Chemistry, Rice University, Houston, TX, 77005, USA
- Diego U Ferreiro
  - Laboratorio de Fisiología de Proteínas, Departamento de Química Biológica - IQUIBICEN/CONICET, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, C1428EGA, Argentina
- R Gonzalo Parra
  - Computational Biology Group, Life Sciences Department, Barcelona Supercomputing Center, Barcelona, Spain
- Alfonso Valencia
  - Computational Biology Group, Life Sciences Department, Barcelona Supercomputing Center, Barcelona, Spain
  - Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain
3
Borges-Araújo L, Patmanidis I, Singh AP, Santos LHS, Sieradzan AK, Vanni S, Czaplewski C, Pantano S, Shinoda W, Monticelli L, Liwo A, Marrink SJ, Souza PCT. Pragmatic Coarse-Graining of Proteins: Models and Applications. J Chem Theory Comput 2023; 19:7112-7135. [PMID: 37788237 DOI: 10.1021/acs.jctc.3c00733] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 10/05/2023]
Abstract
The molecular details involved in the folding, dynamics, organization, and interaction of proteins with other molecules are often difficult to assess by experimental techniques. Consequently, computational models play an ever-increasing role in the field. However, biological processes involving large-scale protein assemblies or long time scale dynamics are still computationally expensive to study in atomistic detail. For these applications, employing coarse-grained (CG) modeling approaches has become a key strategy. In this Review, we provide an overview of what we call pragmatic CG protein models, which are strategies combining, at least in part, a physics-based implementation and a top-down experimental approach to their parametrization. In particular, we focus on CG models in which most protein residues are represented by at least two beads, allowing these models to retain some degree of chemical specificity. A description of the main modern pragmatic protein CG models is provided, including a review of the most recent applications and an outlook on future perspectives in the field.
Affiliation(s)
- Luís Borges-Araújo
  - Molecular Microbiology and Structural Biochemistry (MMSB, UMR 5086), CNRS, University of Lyon, 7 Passage du Vercors, 69007 Lyon, France
- Ilias Patmanidis
  - Department of Chemistry, Aarhus University, Langelandsgade 140, 8000 Aarhus C, Denmark
  - Groningen Biomolecular Sciences and Biotechnology Institute and Zernike Institute for Advanced Materials, University of Groningen, Nijenborgh 7, 9747 AG Groningen, The Netherlands
- Akhil P Singh
  - Department of Biology, University of Fribourg, Chemin du Musée 10, Fribourg CH-1700, Switzerland
- Lucianna H S Santos
  - Biomolecular Simulations Group, Institut Pasteur de Montevideo, Montevideo 11400, Uruguay
- Adam K Sieradzan
  - Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
- Stefano Vanni
  - Department of Biology, University of Fribourg, Chemin du Musée 10, Fribourg CH-1700, Switzerland
  - Institut de Pharmacologie Moléculaire et Cellulaire, Université Côte d'Azur, Inserm, CNRS, 06560 Valbonne, France
- Cezary Czaplewski
  - Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
- Sergio Pantano
  - Biomolecular Simulations Group, Institut Pasteur de Montevideo, Montevideo 11400, Uruguay
- Wataru Shinoda
  - Research Institute for Interdisciplinary Science, Okayama University, 3-1-1 Tsushima-naka, Kita, Okayama 700-8530, Japan
- Luca Monticelli
  - Molecular Microbiology and Structural Biochemistry (MMSB, UMR 5086), CNRS, University of Lyon, 7 Passage du Vercors, 69007 Lyon, France
- Adam Liwo
  - Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
- Siewert J Marrink
  - Groningen Biomolecular Sciences and Biotechnology Institute and Zernike Institute for Advanced Materials, University of Groningen, Nijenborgh 7, 9747 AG Groningen, The Netherlands
- Paulo C T Souza
  - Molecular Microbiology and Structural Biochemistry (MMSB, UMR 5086), CNRS, University of Lyon, 7 Passage du Vercors, 69007 Lyon, France
4
Chen Y, Jin S, Zhang M, Hu Y, Wu KL, Chung A, Wang S, Tian Z, Wang Y, Wolynes PG, Xiao H. Unleashing the potential of noncanonical amino acid biosynthesis to create cells with precision tyrosine sulfation. Nat Commun 2022; 13:5434. [PMID: 36114189 PMCID: PMC9481576 DOI: 10.1038/s41467-022-33111-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 03/21/2022] [Accepted: 09/01/2022] [Indexed: 01/31/2023] Open
Abstract
Despite the great promise of genetic code expansion technology to modulate structures and functions of proteins, external addition of ncAAs is required in most cases and it often limits the utility of genetic code expansion technology, especially to noncanonical amino acids (ncAAs) with poor membrane internalization. Here, we report the creation of autonomous cells, both prokaryotic and eukaryotic, with the ability to biosynthesize and genetically encode sulfotyrosine (sTyr), an important protein post-translational modification with low membrane permeability. These engineered cells can produce site-specifically sulfated proteins at a higher yield than cells fed exogenously with the highest level of sTyr reported in the literature. We use these autonomous cells to prepare highly potent thrombin inhibitors with site-specific sulfation. By enhancing ncAA incorporation efficiency, this added ability of cells to biosynthesize ncAAs and genetically incorporate them into proteins greatly extends the utility of genetic code expansion methods.
Affiliation(s)
- Yuda Chen
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Shikai Jin
  - Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, TX 77005, USA
  - Department of Biosciences, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Mengxi Zhang
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Yu Hu
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Kuan-Lin Wu
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Anna Chung
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Shichao Wang
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Zeru Tian
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Yixian Wang
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Peter G. Wolynes
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
  - Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, TX 77005, USA
  - Department of Biosciences, Rice University, 6100 Main Street, Houston, TX 77005, USA
  - Department of Physics, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Han Xiao
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
  - Department of Biosciences, Rice University, 6100 Main Street, Houston, TX 77005, USA
  - Department of Bioengineering, Rice University, 6100 Main Street, Houston, TX 77005, USA
5
Jin S, Bueno C, Lu W, Wang Q, Chen M, Chen X, Wolynes PG, Gao Y. Computationally exploring the mechanism of bacteriophage T7 gp4 helicase translocating along ssDNA. Proc Natl Acad Sci U S A 2022; 119:e2202239119. [PMID: 35914145 PMCID: PMC9371691 DOI: 10.1073/pnas.2202239119] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/08/2022] [Accepted: 07/05/2022] [Indexed: 12/12/2022] Open
Abstract
Bacteriophage T7 gp4 helicase has served as a model system for understanding mechanisms of hexameric replicative helicase translocation. The mechanistic basis of how nucleoside 5'-triphosphate hydrolysis and translocation of gp4 helicase are coupled is not fully resolved. Here, we used a thermodynamically benchmarked coarse-grained protein force field, Associative memory, Water mediated, Structure and Energy Model (AWSEM), with the single-stranded DNA (ssDNA) force field 3SPN.2C to investigate gp4 translocation. We found that the adenosine 5'-triphosphate (ATP) at the subunit interface stabilizes the subunit-subunit interaction and inhibits subunit translocation. Hydrolysis of ATP to adenosine 5'-diphosphate enables the translocation of one subunit, and new ATP binding at the new subunit interface finalizes the subunit translocation. The LoopD2 and the N-terminal primase domain provide transient protein-protein and protein-DNA interactions that facilitate the large-scale subunit movement. The simulations of gp4 helicase both validate our coarse-grained protein-ssDNA force field and elucidate the molecular basis of replicative helicase translocation.
Affiliation(s)
- Shikai Jin
  - Department of Biosciences, Rice University, Houston, TX 77005
  - Center for Theoretical Biological Physics, Rice University, Houston, TX 77005
- Carlos Bueno
  - Center for Theoretical Biological Physics, Rice University, Houston, TX 77005
- Wei Lu
  - Center for Theoretical Biological Physics, Rice University, Houston, TX 77005
  - Department of Physics, Rice University, Houston, TX 77005
- Qian Wang
  - Department of Physics, University of Science and Technology of China, Hefei 230026, China
- Mingchen Chen
  - Department of Research and Development, neoX Biotech, Beijing 100206, China
- Xun Chen
  - Center for Theoretical Biological Physics, Rice University, Houston, TX 77005
  - Department of Chemistry, Rice University, Houston, TX 77005
- Peter G Wolynes
  - Department of Biosciences, Rice University, Houston, TX 77005
  - Center for Theoretical Biological Physics, Rice University, Houston, TX 77005
  - Department of Physics, Rice University, Houston, TX 77005
  - Department of Chemistry, Rice University, Houston, TX 77005
- Yang Gao
  - Department of Biosciences, Rice University, Houston, TX 77005
6
Exploring the folding energy landscapes of heme proteins using a hybrid AWSEM-heme model. J Biol Phys 2022; 48:37-53. [PMID: 35000062 PMCID: PMC8866609 DOI: 10.1007/s10867-021-09596-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2021] [Accepted: 11/03/2021] [Indexed: 10/29/2022] Open
Abstract
Heme is an active center in many proteins. Here we explore computationally the role of heme in protein folding and protein structure. We model heme proteins using a hybrid model employing the AWSEM Hamiltonian, a coarse-grained forcefield for the protein chain along with AMBER, an all-atom forcefield for the heme. We carefully designed transferable force fields that model the interactions between the protein and the heme. The types of protein-ligand interactions in the hybrid model include thioester covalent bonds, coordinated covalent bonds, hydrogen bonds, and electrostatics. We explore the influence of different types of hemes (heme b and heme c) on folding and structure prediction. Including both types of heme improves the quality of protein structure predictions. The free energy landscape shows that both types of heme can act as nucleation sites for protein folding and stabilize the protein folded state. In binding the heme, coordinated covalent bonds and thioester covalent bonds for heme c drive the heme toward the native pocket. The electrostatics also facilitates the search for the binding site.
7
Giulini M, Rigoli M, Mattiotti G, Menichetti R, Tarenzi T, Fiorentini R, Potestio R. From System Modeling to System Analysis: The Impact of Resolution Level and Resolution Distribution in the Computer-Aided Investigation of Biomolecules. Front Mol Biosci 2021; 8:676976. [PMID: 34164432 PMCID: PMC8215203 DOI: 10.3389/fmolb.2021.676976] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 03/06/2021] [Accepted: 05/06/2021] [Indexed: 12/18/2022] Open
Abstract
The ever increasing computer power, together with the improved accuracy of atomistic force fields, enables researchers to investigate biological systems at the molecular level with remarkable detail. However, the relevant length and time scales of many processes of interest are still hardly within reach even for state-of-the-art hardware, thus leaving important questions often unanswered. The computer-aided investigation of many biological physics problems thus largely benefits from the usage of coarse-grained models, that is, simplified representations of a molecule at a level of resolution that is lower than atomistic. A plethora of coarse-grained models have been developed, which differ most notably in their granularity; this latter aspect determines one of the crucial open issues in the field, i.e. the identification of an optimal degree of coarsening, which enables the greatest simplification at the expenses of the smallest information loss. In this review, we present the problem of coarse-grained modeling in biophysics from the viewpoint of system representation and information content. In particular, we discuss two distinct yet complementary aspects of protein modeling: on the one hand, the relationship between the resolution of a model and its capacity of accurately reproducing the properties of interest; on the other hand, the possibility of employing a lower resolution description of a detailed model to extract simple, useful, and intelligible information from the latter.
Affiliation(s)
- Marco Giulini
  - Physics Department, University of Trento, Trento, Italy
  - INFN-TIFPA, Trento Institute for Fundamental Physics and Applications, Trento, Italy
- Marta Rigoli
  - Physics Department, University of Trento, Trento, Italy
  - INFN-TIFPA, Trento Institute for Fundamental Physics and Applications, Trento, Italy
- Giovanni Mattiotti
  - Physics Department, University of Trento, Trento, Italy
  - INFN-TIFPA, Trento Institute for Fundamental Physics and Applications, Trento, Italy
- Roberto Menichetti
  - Physics Department, University of Trento, Trento, Italy
  - INFN-TIFPA, Trento Institute for Fundamental Physics and Applications, Trento, Italy
- Thomas Tarenzi
  - Physics Department, University of Trento, Trento, Italy
  - INFN-TIFPA, Trento Institute for Fundamental Physics and Applications, Trento, Italy
- Raffaele Fiorentini
  - Physics Department, University of Trento, Trento, Italy
  - INFN-TIFPA, Trento Institute for Fundamental Physics and Applications, Trento, Italy
- Raffaello Potestio
  - Physics Department, University of Trento, Trento, Italy
  - INFN-TIFPA, Trento Institute for Fundamental Physics and Applications, Trento, Italy
8
Kamenik AS, Handle PH, Hofer F, Kahler U, Kraml J, Liedl KR. Polarizable and non-polarizable force fields: Protein folding, unfolding, and misfolding. J Chem Phys 2021; 153:185102. [PMID: 33187403 DOI: 10.1063/5.0022135] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Indexed: 11/15/2022] Open
Abstract
Molecular dynamics simulations are an invaluable tool to characterize the dynamic motions of proteins in atomistic detail. However, the accuracy of models derived from simulations inevitably relies on the quality of the underlying force field. Here, we present an evaluation of current non-polarizable and polarizable force fields (AMBER ff14SB, CHARMM 36m, GROMOS 54A7, and Drude 2013) based on the long-standing biophysical challenge of protein folding. We quantify the thermodynamics and kinetics of the β-hairpin formation using Markov state models of the fast-folding mini-protein CLN025. Furthermore, we study the (partial) folding dynamics of two more complex systems, a villin headpiece variant and a WW domain. Surprisingly, the polarizable force field in our set, Drude 2013, consistently leads to destabilization of the native state, regardless of the secondary structure element present. All non-polarizable force fields, on the other hand, stably characterize the native state ensembles in most cases even when starting from a partially unfolded conformation. Focusing on CLN025, we find that the conformational space captured with AMBER ff14SB and CHARMM 36m is comparable, but the ensembles from CHARMM 36m simulations are clearly shifted toward disordered conformations. While the AMBER ff14SB ensemble overstabilizes the native fold, CHARMM 36m and GROMOS 54A7 ensembles both agree remarkably well with experimental state populations. In addition, GROMOS 54A7 also reproduces experimental folding times most accurately. Our results further indicate an over-stabilization of helical structures with AMBER ff14SB. Nevertheless, the presented investigations strongly imply that reliable (un)folding dynamics of small proteins can be captured in feasible computational time with current additive force fields.
Affiliation(s)
- Anna S Kamenik
  - Institute of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innrain 80/82, A-6020 Innsbruck, Austria
- Philip H Handle
  - Institute of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innrain 80/82, A-6020 Innsbruck, Austria
- Florian Hofer
  - Institute of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innrain 80/82, A-6020 Innsbruck, Austria
- Ursula Kahler
  - Institute of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innrain 80/82, A-6020 Innsbruck, Austria
- Johannes Kraml
  - Institute of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innrain 80/82, A-6020 Innsbruck, Austria
- Klaus R Liedl
  - Institute of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innrain 80/82, A-6020 Innsbruck, Austria
9
Lu W, Bueno C, Schafer NP, Moller J, Jin S, Chen X, Chen M, Gu X, Davtyan A, de Pablo JJ, Wolynes PG. OpenAWSEM with Open3SPN2: A fast, flexible, and accessible framework for large-scale coarse-grained biomolecular simulations. PLoS Comput Biol 2021; 17:e1008308. [PMID: 33577557 PMCID: PMC7906472 DOI: 10.1371/journal.pcbi.1008308] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 08/26/2020] [Revised: 02/25/2021] [Accepted: 01/09/2021] [Indexed: 01/28/2023] Open
Abstract
We present OpenAWSEM and Open3SPN2, new cross-compatible implementations of coarse-grained models for protein (AWSEM) and DNA (3SPN2) molecular dynamics simulations within the OpenMM framework. These new implementations retain the chemical accuracy and intrinsic efficiency of the original models while adding GPU acceleration and the ease of forcefield modification provided by OpenMM’s Custom Forces software framework. By utilizing GPUs, we achieve around a 30-fold speedup in protein and protein-DNA simulations over the existing LAMMPS-based implementations running on a single CPU core. We showcase the benefits of OpenMM’s Custom Forces framework by devising and implementing two new potentials that allow us to address important aspects of protein folding and structure prediction and by testing the ability of the combined OpenAWSEM and Open3SPN2 to model protein-DNA binding. The first potential is used to describe the changes in effective interactions that occur as a protein becomes partially buried in a membrane. We also introduced an interaction to describe proteins with multiple disulfide bonds. Using simple pairwise disulfide bonding terms results in unphysical clustering of cysteine residues, posing a problem when simulating the folding of proteins with many cysteines. We now can computationally reproduce Anfinsen’s early Nobel prize winning experiments by using OpenMM’s Custom Forces framework to introduce a multi-body disulfide bonding term that prevents unphysical clustering. Our protein-DNA simulations show that the binding landscape is funneled towards structures that are quite similar to those found using experiments. In summary, this paper provides a simulation tool for the molecular biophysics community that is both easy to use and sufficiently efficient to simulate large proteins and large protein-DNA systems that are central to many cellular processes. 
These codes should facilitate the interplay between molecular simulations and cellular studies, which have been hampered by the large mismatch between the time and length scales accessible to molecular simulations and those relevant to cell biology. The cell’s most important pieces of machinery are large complexes of proteins often along with nucleic acids. From the ribosome, to CRISPR-Cas9, to transcription factors and DNA-wrangling proteins like the SMC-Kleisins, these complexes allow organisms to replicate and enable cells to respond to environmental cues. Computer simulation is a key technology that can be used to connect physical theories with biological reality. Unfortunately, the time and length scales accessible to molecular simulation have not kept pace with our ambition to study the cell’s molecular factories. Many simulation codes also unfortunately remain effectively locked away from the user community who need to modify them as more of the underlying physics is learned. In this paper, we present OpenAWSEM and Open3SPN2, two new easy-to-use and easy to modify implementations of efficient and accurate coarse-grained protein and DNA simulation forcefields that can now be run hundreds of times faster than before, thereby making studies of large biomolecular machines more facile.
Affiliation(s)
- Wei Lu
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas, United States of America
  - Department of Physics, Rice University, Houston, Texas, United States of America
- Carlos Bueno
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas, United States of America
  - Department of Chemistry, Rice University, Houston, Texas, United States of America
- Nicholas P. Schafer
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas, United States of America
  - Department of Chemistry, Rice University, Houston, Texas, United States of America
  - Schafer Science, LLC, Houston, Texas, United States of America
- Joshua Moller
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois, United States of America
  - Argonne National Laboratory, Lemont, Illinois, United States of America
- Shikai Jin
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas, United States of America
  - Department of Biosciences, Rice University, Houston, Texas, United States of America
- Xun Chen
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas, United States of America
  - Department of Chemistry, Rice University, Houston, Texas, United States of America
- Mingchen Chen
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas, United States of America
- Xinyu Gu
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas, United States of America
  - Department of Chemistry, Rice University, Houston, Texas, United States of America
- Aram Davtyan
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas, United States of America
- Juan J. de Pablo
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois, United States of America
  - Argonne National Laboratory, Lemont, Illinois, United States of America
- Peter G. Wolynes
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas, United States of America
  - Department of Chemistry, Rice University, Houston, Texas, United States of America
  - Department of Physics, Rice University, Houston, Texas, United States of America
  - Department of Biosciences, Rice University, Houston, Texas, United States of America
10
Chen M, Chen X, Jin S, Lu W, Lin X, Wolynes PG. Protein Structure Refinement Guided by Atomic Packing Frustration Analysis. J Phys Chem B 2020; 124:10889-10898. [PMID: 32931278 DOI: 10.1021/acs.jpcb.0c06719] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]
Abstract
Recent advances in machine learning, bioinformatics, and the understanding of the folding problem have enabled efficient predictions of protein structures with moderate accuracy, even for targets where there is little information from templates. All-atom molecular dynamics simulations provide a route to refine such predicted structures, but unguided atomistic simulations, even when lengthy in time, often fail to eliminate incorrect structural features that would prevent the structure from becoming more energetically favorable owing to the necessity of making large scale motions and to overcoming energy barriers for side chain repacking. In this study, we show that localizing packing frustration at atomic resolution by examining the statistics of the energetic changes that occur when the local environment of a site is changed allows one to identify the most likely locations of incorrect contacts. The global statistics of atomic resolution frustration in structures that have been predicted using various algorithms provide strong indicators of structural quality when tested over a database of 20 targets from previous CASP experiments. Residues that are more correctly located turn out to be more minimally frustrated than more poorly positioned sites. These observations provide a diagnosis of both global and local quality of predicted structures and thus can be used as guidance in all-atom refinement simulations of the 20 targets. Refinement simulations guided by atomic packing frustration turn out to be quite efficient and significantly improve the quality of the structures.
Affiliation(s)
- Mingchen Chen
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
| | - Xun Chen
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
| | - Shikai Jin
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Department of Biosciences, Rice University, Houston, Texas 77005, United States
| | - Wei Lu
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Department of Physics and Astronomy, Rice University, Houston, Texas 77030, United States
| | - Xingcheng Lin
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Peter G Wolynes
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Department of Biosciences, Rice University, Houston, Texas 77005, United States
- Department of Physics and Astronomy, Rice University, Houston, Texas 77030, United States
| |
|
11
|
Jin S, Miller MD, Chen M, Schafer NP, Lin X, Chen X, Phillips GN, Wolynes PG. Molecular-replacement phasing using predicted protein structures from AWSEM-Suite. IUCrJ 2020; 7:1168-1178. [PMID: 33209327 PMCID: PMC7642774 DOI: 10.1107/s2052252520013494] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/27/2020] [Accepted: 10/07/2020] [Indexed: 06/11/2023]
Abstract
The phase problem in X-ray crystallography arises from the fact that only the intensities, and not the phases, of the diffracting electromagnetic waves are measured directly. Molecular replacement can often estimate the relative phases of reflections starting with those derived from a template structure, which is usually a previously solved structure of a similar protein. The key factor in the success of molecular replacement is finding a good template structure. When no good solved template exists, predicted structures based partially on templates can sometimes be used to generate models for molecular replacement, thereby extending the lower bound of structural and sequence similarity required for successful structure determination. Here, the effectiveness is examined of structures predicted by a state-of-the-art prediction algorithm, the Associative memory, Water-mediated, Structure and Energy Model Suite (AWSEM-Suite), which has been shown to perform well in predicting protein structures in CASP13 when there is no significant sequence similarity to a solved protein or only very low sequence similarity to known templates. The performance of AWSEM-Suite structures in molecular replacement is discussed and the results show that AWSEM-Suite performs well in providing useful phase information, often performing better than I-TASSER-MR and the previous algorithm AWSEM-Template.
Affiliation(s)
- Shikai Jin
- Center for Theoretical Biological Physics, Rice University, Houston, Texas, USA
- Department of Biosciences, Rice University, Houston, Texas, USA
| | - Mingchen Chen
- Center for Theoretical Biological Physics, Rice University, Houston, Texas, USA
| | - Nicholas P. Schafer
- Center for Theoretical Biological Physics, Rice University, Houston, Texas, USA
- Department of Chemistry, Rice University, Houston, Texas, USA
| | - Xingcheng Lin
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
| | - Xun Chen
- Center for Theoretical Biological Physics, Rice University, Houston, Texas, USA
- Department of Chemistry, Rice University, Houston, Texas, USA
| | - George N. Phillips
- Department of Biosciences, Rice University, Houston, Texas, USA
- Department of Chemistry, Rice University, Houston, Texas, USA
| | - Peter G. Wolynes
- Center for Theoretical Biological Physics, Rice University, Houston, Texas, USA
- Department of Biosciences, Rice University, Houston, Texas, USA
- Department of Chemistry, Rice University, Houston, Texas, USA
- Department of Physics, Rice University, Houston, Texas, USA
| |
|