1
Borges-Araújo L, Patmanidis I, Singh AP, Santos LHS, Sieradzan AK, Vanni S, Czaplewski C, Pantano S, Shinoda W, Monticelli L, Liwo A, Marrink SJ, Souza PCT. Pragmatic Coarse-Graining of Proteins: Models and Applications. J Chem Theory Comput 2023; 19:7112-7135. [PMID: 37788237 DOI: 10.1021/acs.jctc.3c00733] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 10/05/2023]
Abstract
The molecular details involved in the folding, dynamics, organization, and interaction of proteins with other molecules are often difficult to assess by experimental techniques. Consequently, computational models play an ever-increasing role in the field. However, biological processes involving large-scale protein assemblies or long time scale dynamics are still computationally expensive to study in atomistic detail. For these applications, employing coarse-grained (CG) modeling approaches has become a key strategy. In this Review, we provide an overview of what we call pragmatic CG protein models, which are strategies combining, at least in part, a physics-based implementation and a top-down experimental approach to their parametrization. In particular, we focus on CG models in which most protein residues are represented by at least two beads, allowing these models to retain some degree of chemical specificity. A description of the main modern pragmatic protein CG models is provided, including a review of the most recent applications and an outlook on future perspectives in the field.
Affiliation(s)
- Luís Borges-Araújo
  - Molecular Microbiology and Structural Biochemistry (MMSB, UMR 5086), CNRS, University of Lyon, 7 Passage du Vercors, 69007 Lyon, France
- Ilias Patmanidis
  - Department of Chemistry, Aarhus University, Langelandsgade 140, 8000 Aarhus C, Denmark
  - Groningen Biomolecular Sciences and Biotechnology Institute and Zernike Institute for Advanced Materials, University of Groningen, Nijenborgh 7, 9747 AG Groningen, The Netherlands
- Akhil P Singh
  - Department of Biology, University of Fribourg, Chemin du Musée 10, Fribourg CH-1700, Switzerland
- Lucianna H S Santos
  - Biomolecular Simulations Group, Institut Pasteur de Montevideo, Montevideo 11400, Uruguay
- Adam K Sieradzan
  - Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
- Stefano Vanni
  - Department of Biology, University of Fribourg, Chemin du Musée 10, Fribourg CH-1700, Switzerland
  - Institut de Pharmacologie Moléculaire et Cellulaire, Université Côte d'Azur, Inserm, CNRS, 06560 Valbonne, France
- Cezary Czaplewski
  - Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
- Sergio Pantano
  - Biomolecular Simulations Group, Institut Pasteur de Montevideo, Montevideo 11400, Uruguay
- Wataru Shinoda
  - Research Institute for Interdisciplinary Science, Okayama University, 3-1-1 Tsushima-naka, Kita, Okayama 700-8530, Japan
- Luca Monticelli
  - Molecular Microbiology and Structural Biochemistry (MMSB, UMR 5086), CNRS, University of Lyon, 7 Passage du Vercors, 69007 Lyon, France
- Adam Liwo
  - Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
- Siewert J Marrink
  - Groningen Biomolecular Sciences and Biotechnology Institute and Zernike Institute for Advanced Materials, University of Groningen, Nijenborgh 7, 9747 AG Groningen, The Netherlands
- Paulo C T Souza
  - Molecular Microbiology and Structural Biochemistry (MMSB, UMR 5086), CNRS, University of Lyon, 7 Passage du Vercors, 69007 Lyon, France
2
Pan T, Jin S, Miller MD, Kyrillidis A, Phillips GN. A deep learning solution for crystallographic structure determination. IUCRJ 2023; 10:487-496. [PMID: 37409806 DOI: 10.1107/s2052252523004293] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/07/2023] [Accepted: 05/17/2023] [Indexed: 07/07/2023]
Abstract
The general de novo solution of the crystallographic phase problem is difficult and only possible under certain conditions. This paper develops an initial pathway to a deep learning neural network approach for the phase problem in protein crystallography, based on a synthetic dataset of small fragments derived from a large well curated subset of solved structures in the Protein Data Bank (PDB). In particular, electron-density estimates of simple artificial systems are produced directly from corresponding Patterson maps using a convolutional neural network architecture as a proof of concept.
Affiliation(s)
- Tom Pan
  - Department of Computer Science, Rice University, Houston, Texas, USA
- Shikai Jin
  - Department of Biosciences, Rice University, Houston, Texas, USA
3
Durairaj J, de Ridder D, van Dijk AD. Beyond sequence: Structure-based machine learning. Comput Struct Biotechnol J 2022; 21:630-643. [PMID: 36659927 PMCID: PMC9826903 DOI: 10.1016/j.csbj.2022.12.039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 12/21/2022] [Accepted: 12/21/2022] [Indexed: 12/31/2022] Open
Abstract
Recent breakthroughs in protein structure prediction demarcate the start of a new era in structural bioinformatics. Combined with various advances in experimental structure determination and the uninterrupted pace at which new structures are published, this promises an age in which protein structure information is as prevalent and ubiquitous as sequence. Machine learning in protein bioinformatics has been dominated by sequence-based methods, but this is now changing to make use of the deluge of rich structural information as input. Machine learning methods making use of structures are scattered across literature and cover a number of different applications and scopes; while some try to address questions and tasks within a single protein family, others aim to capture characteristics across all available proteins. In this review, we look at the variety of structure-based machine learning approaches, how structures can be used as input, and typical applications of these approaches in protein biology. We also discuss current challenges and opportunities in this all-important and increasingly popular field.
Affiliation(s)
- Janani Durairaj
  - Biozentrum, University of Basel, Basel, Switzerland
  - Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands
- Dick de Ridder
  - Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands
- Aalt D.J. van Dijk
  - Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands
4
Karasawa A, Andi B, Fuchs MR, Shi W, McSweeney S, Hendrickson WA, Liu Q. Multi-crystal native-SAD phasing at 5 keV with a helium environment. IUCRJ 2022; 9:768-777. [PMID: 36381147 PMCID: PMC9634608 DOI: 10.1107/s205225252200971x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2022] [Accepted: 10/03/2022] [Indexed: 06/16/2023]
Abstract
De novo structure determination from single-wavelength anomalous diffraction using native sulfur or phosphorus in biomolecules (native-SAD) is an appealing method to mitigate the labor-intensive production of heavy-atom derivatives and selenomethionyl substitutions. The native-SAD method is particularly attractive for membrane proteins, which are difficult to produce and often recalcitrant to grow into decent-sized crystals. Native-SAD uses lower-energy X-rays to enhance anomalous signals from sulfur or phosphorus. However, at lower energies, the scattering and absorption of air contribute to the background noise, reduce the signals and are thus adverse to native-SAD phasing. We have previously demonstrated native-SAD phasing at an energy of 5 keV in air at the NSLS-II FMX beamline. Here, the use of a helium path developed to reduce both the noise from background scattering and the air absorption of the diffracted X-ray beam is described. The helium path was used for collection of anomalous diffraction data at 5 keV for two proteins: thaumatin and the membrane protein TehA. Although anomalous signals from each individual crystal are very weak, robust anomalous signals are obtained from data assembled from micrometre-sized crystals. The thaumatin structure was determined from 15 microcrystals and the TehA structure from 18 microcrystals. These results demonstrate the usefulness of a helium environment in support of native-SAD phasing at 5 keV.
Affiliation(s)
- Akira Karasawa
  - Center on Membrane Protein Production and Analysis, New York Structural Biology Center, New York, NY 10027, USA
- Babak Andi
  - Photon Sciences, NSLS-II, Brookhaven National Laboratory, Upton, NY 11973, USA
- Martin R. Fuchs
  - Photon Sciences, NSLS-II, Brookhaven National Laboratory, Upton, NY 11973, USA
- Wuxian Shi
  - Photon Sciences, NSLS-II, Brookhaven National Laboratory, Upton, NY 11973, USA
- Sean McSweeney
  - Photon Sciences, NSLS-II, Brookhaven National Laboratory, Upton, NY 11973, USA
- Wayne A. Hendrickson
  - Center on Membrane Protein Production and Analysis, New York Structural Biology Center, New York, NY 10027, USA
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
  - Department of Physiology and Cellular Biophysics, Columbia University, New York, NY 10032, USA
- Qun Liu
  - Photon Sciences, NSLS-II, Brookhaven National Laboratory, Upton, NY 11973, USA
  - Biology Department, Brookhaven National Laboratory, Upton, NY 11973, USA
5
Chen Y, Jin S, Zhang M, Hu Y, Wu KL, Chung A, Wang S, Tian Z, Wang Y, Wolynes PG, Xiao H. Unleashing the potential of noncanonical amino acid biosynthesis to create cells with precision tyrosine sulfation. Nat Commun 2022; 13:5434. [PMID: 36114189 PMCID: PMC9481576 DOI: 10.1038/s41467-022-33111-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 03/21/2022] [Accepted: 09/01/2022] [Indexed: 01/31/2023] Open
Abstract
Despite the great promise of genetic code expansion technology to modulate the structures and functions of proteins, external addition of noncanonical amino acids (ncAAs) is required in most cases, which often limits the utility of the technology, especially for ncAAs with poor membrane internalization. Here, we report the creation of autonomous cells, both prokaryotic and eukaryotic, with the ability to biosynthesize and genetically encode sulfotyrosine (sTyr), an important protein post-translational modification with low membrane permeability. These engineered cells can produce site-specifically sulfated proteins at a higher yield than cells fed exogenously with the highest level of sTyr reported in the literature. We use these autonomous cells to prepare highly potent thrombin inhibitors with site-specific sulfation. By enhancing ncAA incorporation efficiency, this added ability of cells to biosynthesize ncAAs and genetically incorporate them into proteins greatly extends the utility of genetic code expansion methods.
Affiliation(s)
- Yuda Chen
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Shikai Jin
  - Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, TX 77005, USA
  - Department of Biosciences, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Mengxi Zhang
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Yu Hu
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Kuan-Lin Wu
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Anna Chung
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Shichao Wang
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Zeru Tian
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Yixian Wang
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Peter G. Wolynes
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
  - Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, TX 77005, USA
  - Department of Biosciences, Rice University, 6100 Main Street, Houston, TX 77005, USA
  - Department of Physics, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Han Xiao
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
  - Department of Biosciences, Rice University, 6100 Main Street, Houston, TX 77005, USA
  - Department of Bioengineering, Rice University, 6100 Main Street, Houston, TX 77005, USA
6
Traver MS, Bradford SE, Olmos JL, Wright ZJ, Miller MD, Xu W, Phillips GN, Bartel B. The Structure of the Arabidopsis PEX4-PEX22 Peroxin Complex-Insights Into Ubiquitination at the Peroxisomal Membrane. Front Cell Dev Biol 2022; 10:838923. [PMID: 35300425 PMCID: PMC8922245 DOI: 10.3389/fcell.2022.838923] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/18/2021] [Accepted: 01/28/2022] [Indexed: 01/11/2023] Open
Abstract
Peroxisomes are eukaryotic organelles that sequester critical oxidative reactions and process the resulting reactive oxygen species into less toxic byproducts. Peroxisome function and formation are coordinated by peroxins (PEX proteins) that guide peroxisome biogenesis and division and shuttle proteins into the lumen and membrane of the organelle. Despite the importance of peroxins in plant metabolism and development, no plant peroxin structures have been reported. Here we report the X-ray crystal structure of the PEX4-PEX22 peroxin complex from the reference plant Arabidopsis thaliana. PEX4 is a ubiquitin-conjugating enzyme (UBC) that ubiquitinates proteins associated with the peroxisomal membrane, and PEX22 is a peroxisomal membrane protein that anchors PEX4 to the peroxisome and facilitates PEX4 activity. We co-expressed Arabidopsis PEX4 as a translational fusion with the soluble PEX4-interacting domain of PEX22 in E. coli. The fusion was linked via a protease recognition site, allowing us to separate PEX4 and PEX22 following purification and solve the structure of the complex. We compared the structure of the PEX4-PEX22 complex to the previously published structures of yeast orthologs. Arabidopsis PEX4 displays the typical UBC structure expected from its sequence. Although Arabidopsis PEX22 lacks notable sequence identity to yeast PEX22, it maintains a similar Rossmann fold-like structure. Several salt bridges are positioned to contribute to the specificity of PEX22 for PEX4 versus other Arabidopsis UBCs, and the long unstructured PEX22 tether would allow PEX4-mediated ubiquitination of distant peroxisomal membrane targets without dissociation from PEX22. The Arabidopsis PEX4-PEX22 structure also revealed that the residue altered in pex4-1 (P123L), a mutant previously isolated via a forward-genetic screen for peroxisomal dysfunction, is near the active site cysteine of PEX4. 
We demonstrated in vitro UBC activity for the PEX4-PEX22 complex and found that the pex4-1 enzyme has reduced in vitro ubiquitin-conjugating activity and altered specificity compared to PEX4. Our findings illuminate the role of PEX4 and PEX22 in peroxisome structure and function and provide tools for future exploration of ubiquitination at the peroxisome surface.
Affiliation(s)
- Melissa S. Traver
  - Department of Biosciences, Rice University, Houston, TX, United States
- Sarah E. Bradford
  - Department of Biosciences, Rice University, Houston, TX, United States
- Jose Luis Olmos
  - Department of Biosciences, Rice University, Houston, TX, United States
- Zachary J. Wright
  - Department of Biosciences, Rice University, Houston, TX, United States
- Weijun Xu
  - Department of Biosciences, Rice University, Houston, TX, United States
- George N. Phillips
  - Department of Biosciences, Rice University, Houston, TX, United States
  - Department of Chemistry, Rice University, Houston, TX, United States
- Bonnie Bartel
  - Department of Biosciences, Rice University, Houston, TX, United States
7
McCoy AJ, Sammito MD, Read RJ. Implications of AlphaFold2 for crystallographic phasing by molecular replacement. Acta Crystallogr D Struct Biol 2022; 78:1-13. [PMID: 34981757 PMCID: PMC8725160 DOI: 10.1107/s2059798321012122] [Citation(s) in RCA: 48] [Impact Index Per Article: 24.0] [Received: 05/18/2021] [Accepted: 11/13/2021] [Indexed: 12/11/2022] Open
Abstract
The AlphaFold2 results in the 14th edition of Critical Assessment of Structure Prediction (CASP14) showed that accurate (low root-mean-square deviation) in silico models of protein structure domains are on the horizon, whether or not the protein is related to known structures through high-coverage sequence similarity. As highly accurate models become available, generated by harnessing the power of correlated mutations and deep learning, one of the aspects of structural biology to be impacted will be methods of phasing in crystallography. Here, the data from CASP14 are used to explore the prospects for changes in phasing methods, and in particular to explore the prospects for molecular-replacement phasing using in silico models.
Affiliation(s)
- Airlie J. McCoy
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
- Massimo D. Sammito
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
- Randy J. Read
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
8
Masrati G, Landau M, Ben-Tal N, Lupas A, Kosloff M, Kosinski J. Integrative Structural Biology in the Era of Accurate Structure Prediction. J Mol Biol 2021; 433:167127. [PMID: 34224746 DOI: 10.1016/j.jmb.2021.167127] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 05/17/2021] [Revised: 06/28/2021] [Accepted: 06/28/2021] [Indexed: 11/16/2022]
Abstract
Characterizing the three-dimensional structure of macromolecules is central to understanding their function. Traditionally, structures of proteins and their complexes have been determined using experimental techniques such as X-ray crystallography, NMR, or cryo-electron microscopy, applied individually or in an integrative manner. Meanwhile, however, computational methods for protein structure prediction have been improving their accuracy, gradually, then suddenly, with the breakthrough advance by AlphaFold2, whose models of monomeric proteins are often as accurate as experimental structures. This breakthrough foreshadows a new era of computational methods that can build accurate models for most monomeric proteins. Here, we envision how such accurate modeling methods can combine with experimental structural biology techniques, enhancing integrative structural biology. We highlight the challenges that arise when considering multiple structural conformations, protein complexes, and polymorphic assemblies. These challenges will motivate further developments, both in modeling programs and in methods to solve experimental structures, towards better and quicker investigation of structure-function relationships.
Affiliation(s)
- Gal Masrati
  - Department of Biochemistry and Molecular Biology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 6997801, Israel
- Meytal Landau
  - Department of Biology, Technion-Israel Institute of Technology, Haifa 3200003, Israel; European Molecular Biology Laboratory (EMBL), Hamburg 22607, Germany
- Nir Ben-Tal
  - Department of Biochemistry and Molecular Biology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 6997801, Israel
- Andrei Lupas
  - Department of Protein Evolution, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
- Mickey Kosloff
  - Department of Human Biology, Faculty of Natural Sciences, University of Haifa, 199 Aba Khoushy Ave., Mt. Carmel, 3498838 Haifa, Israel
- Jan Kosinski
  - European Molecular Biology Laboratory (EMBL), Hamburg 22607, Germany; Centre for Structural Systems Biology (CSSB), Hamburg 22607, Germany; Structural and Computational Biology Unit, European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany