1
ElGamacy M. Accelerating therapeutic protein design. Adv Protein Chem Struct Biol 2022; 130:85-118. [PMID: 35534117 DOI: 10.1016/bs.apcsb.2022.01.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Protein structures provide defined microenvironments that can support complex pharmacological functions otherwise unachievable by small molecules. The advent of therapeutic proteins has thus greatly broadened the range of manageable disorders. Leveraging current knowledge and recent advances in de novo protein design methods has the prospect of revolutionizing how protein drugs are discovered and developed. This review lays out the main challenges facing therapeutic protein discovery and development, and how present and future advances in protein design can accelerate protein drug pipelines.
Affiliation(s)
- Mohammad ElGamacy
- University Hospital Tübingen, Division of Translational Oncology, Tübingen, Germany; Max Planck Institute for Biology, Tübingen, Germany.
2
Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Žídek A, Potapenko A, Bridgland A, Meyer C, Kohl SAA, Ballard AJ, Cowie A, Romera-Paredes B, Nikolov S, Jain R, Adler J, Back T, Petersen S, Reiman D, Clancy E, Zielinski M, Steinegger M, Pacholska M, Berghammer T, Bodenstein S, Silver D, Vinyals O, Senior AW, Kavukcuoglu K, Kohli P, Hassabis D. Highly accurate protein structure prediction with AlphaFold. Nature 2021; 596:583-589. [PMID: 34265844 PMCID: PMC8371605 DOI: 10.1038/s41586-021-03819-2] [Citation(s) in RCA: 16326] [Impact Index Per Article: 5442.0] [Received: 05/11/2021] [Accepted: 07/12/2021] [Indexed: 02/07/2023]
Abstract
Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Through an enormous experimental effort [1-4], the structures of around 100,000 unique proteins have been determined [5], but this represents a small fraction of the billions of known protein sequences [6,7]. Structural coverage is bottlenecked by the months to years of painstaking effort required to determine a single protein structure. Accurate computational approaches are needed to address this gap and to enable large-scale structural bioinformatics. Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence (the structure prediction component of the 'protein folding problem' [8]) has been an important open research problem for more than 50 years [9]. Despite recent progress [10-14], existing methods fall far short of atomic accuracy, especially when no homologous structure is available. Here we provide the first computational method that can regularly predict protein structures with atomic accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural network-based model, AlphaFold, in the challenging 14th Critical Assessment of protein Structure Prediction (CASP14) [15], demonstrating accuracy competitive with experimental structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates physical and biological knowledge about protein structure, leveraging multi-sequence alignments, into the design of the deep learning algorithm.
Affiliation(s)
- Martin Steinegger
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Artificial Intelligence Institute, Seoul National University, Seoul, South Korea
3
Woolfson DN. A Brief History of De Novo Protein Design: Minimal, Rational, and Computational. J Mol Biol 2021; 433:167160. [PMID: 34298061 DOI: 10.1016/j.jmb.2021.167160] [Citation(s) in RCA: 69] [Impact Index Per Article: 23.0] [Received: 06/06/2021] [Revised: 07/07/2021] [Accepted: 07/12/2021] [Indexed: 12/26/2022]
Abstract
Protein design has come of age, but how will it mature? In the 1980s and the 1990s, the primary motivation for de novo protein design was to test our understanding of the informational aspect of the protein-folding problem; i.e., how does protein sequence determine protein structure and function? This necessitated minimal and rational design approaches whereby the placement of each residue in a design was reasoned using chemical principles and/or biochemical knowledge. At that time, though with some notable exceptions, the use of computers to aid design was not widespread. Over the past two decades, the tables have turned and computational protein design is firmly established. Here, I illustrate this progress through a timeline of de novo protein structures that have been solved to atomic resolution and deposited in the Protein Data Bank. From this, it is clear that the impact of rational and computational design has been considerable: More-complex and more-sophisticated designs are being targeted with many being resolved to atomic resolution. Furthermore, our ability to generate and manipulate synthetic proteins has advanced to a point where they are providing realistic alternatives to natural protein functions for applications both in vitro and in cells. Also, and increasingly, computational protein design is becoming accessible to non-specialists. This all begs the questions: Is there still a place for minimal and rational design approaches? And, what challenges lie ahead for the burgeoning field of de novo protein design as a whole?
Affiliation(s)
- Derek N Woolfson
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, UK; School of Biochemistry, University of Bristol, Biomedical Sciences Building, University Walk, Bristol BS8 1TD, UK; Bristol BioDesign Institute, University of Bristol, Life Sciences Building, Tyndall Avenue, Bristol BS8 1TQ, UK.
4
ElGamacy M, Hernandez Alvarez B. Expanding the versatility of natural and de novo designed coiled coils and helical bundles. Curr Opin Struct Biol 2021; 68:224-234. [PMID: 33964630 DOI: 10.1016/j.sbi.2021.03.011] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/27/2020] [Revised: 03/23/2021] [Accepted: 03/23/2021] [Indexed: 10/21/2022]
Abstract
Natural helical bundles (HBs) constitute a ubiquitous class of protein folds built of two or more longitudinally arranged α-helices. They adopt topologies that include symmetric, highly regular assemblies all the way to asymmetric, loosely packed domains. The diverse functional spectrum of HBs ranges from structural scaffolds to complex and dynamic effectors as molecular motors, signaling and sensing molecules, enzymes, and molecular switches. Symmetric HBs, particularly coiled coils, offer simple model systems providing an ideal entry point for protein folding and design studies. Herein, we review recent progress unveiling new structural features and functional mechanisms in natural HBs and cover staggering advances in the de novo design of HBs, giving rise to exotic structures and the creation of novel functions.
Affiliation(s)
- Mohammad ElGamacy
- Systems Biology of Development Group, Friedrich Miescher Laboratory of the Max Planck Society, Max-Planck-Ring 9, Tübingen, 72076, Germany; Division of Translational Oncology, Department of Hematology, Oncology, Clinical Immunology and Rheumatology, University Hospital Tübingen, Otfried-Müller-Strasse 10, Tübingen, 72076, Germany; Department of Protein Evolution, Max Planck Institute for Developmental Biology, Max-Planck-Ring 5, Tübingen, 72076, Germany
- Birte Hernandez Alvarez
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Max-Planck-Ring 5, Tübingen, 72076, Germany.
5
Gidley F, Parmeggiani F. Repeat proteins: designing new shapes and functions for solenoid folds. Curr Opin Struct Biol 2021; 68:208-214. [PMID: 33721772 DOI: 10.1016/j.sbi.2021.02.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/19/2020] [Revised: 01/31/2021] [Accepted: 02/01/2021] [Indexed: 10/21/2022]
Abstract
The modular nature of repeat proteins has inspired the design of regular and completely novel sequences and structures. Research in the past years has provided a broad set of design approaches and new repeat proteins that have found applications in molecular recognition, taking advantage of the natural ability of some of these families to bind proteins, peptides and nucleic acids. Here, we provide an overview on the recent trends in design of repeat proteins, particularly solenoid folds, and their applications. By exploiting the intrinsic modularity of repeats, new architectures have been designed that combine different types of repeat, are easily scalable by changing the number of repeats and can be quickly generated by using existing modular building blocks.
Affiliation(s)
- Frances Gidley
- School of Chemistry, School of Biochemistry, Bristol Biodesign Institute, University of Bristol, United Kingdom
- Fabio Parmeggiani
- School of Chemistry, School of Biochemistry, Bristol Biodesign Institute, University of Bristol, United Kingdom.
6
Hernandez Alvarez B, Skokowa J, Coles M, Mir P, Nasri M, Maksymenko K, Weidmann L, Rogers KW, Welte K, Lupas AN, Müller P, ElGamacy M. Design of novel granulopoietic proteins by topological rescaffolding. PLoS Biol 2020; 18:e3000919. [PMID: 33351791 PMCID: PMC7755208 DOI: 10.1371/journal.pbio.3000919] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/02/2019] [Accepted: 11/24/2020] [Indexed: 11/18/2022] Open
Abstract
Computational protein design is rapidly becoming more powerful, and improving the accuracy of computational methods would greatly streamline protein engineering by eliminating the need for empirical optimization in the laboratory. In this work, we set out to design novel granulopoietic agents using a rescaffolding strategy with the goal of achieving simpler and more stable proteins. All of the 4 experimentally tested designs were folded, monomeric, and stable, while the 2 determined structures agreed with the design models within less than 2.5 Å. Despite the lack of significant topological or sequence similarity to their natural granulopoietic counterpart, 2 designs bound to the granulocyte colony-stimulating factor (G-CSF) receptor and exhibited potent, but delayed, in vitro proliferative activity in a G-CSF-dependent cell line. Interestingly, the designs also induced proliferation and differentiation of primary human hematopoietic stem cells into mature granulocytes, highlighting the utility of our approach to develop highly active therapeutic leads purely based on computational design. De novo designed cytokines that activate the G-CSF receptor show that the receptor-binding information can be encoded onto stable, miniaturised protein scaffolds that possess potent granulopoietic activity; such novel proteins provide for ideal candidates for protein-based therapeutics.
Affiliation(s)
- Julia Skokowa
- University Hospital Tübingen, Division of Translational Oncology, Department of Hematology, Oncology, Clinical Immunology and Rheumatology, University Hospital Tübingen, Germany
- Murray Coles
- Max Planck Institute for Developmental Biology, Tübingen, Germany
- Perihan Mir
- University Hospital Tübingen, Division of Translational Oncology, Department of Hematology, Oncology, Clinical Immunology and Rheumatology, University Hospital Tübingen, Germany
- Masoud Nasri
- University Hospital Tübingen, Division of Translational Oncology, Department of Hematology, Oncology, Clinical Immunology and Rheumatology, University Hospital Tübingen, Germany
- Laura Weidmann
- Max Planck Institute for Developmental Biology, Tübingen, Germany
- Karl Welte
- University Hospital Tübingen, Division of Translational Oncology, Department of Hematology, Oncology, Clinical Immunology and Rheumatology, University Hospital Tübingen, Germany
- Andrei N. Lupas
- Max Planck Institute for Developmental Biology, Tübingen, Germany
- Patrick Müller
- University Hospital Tübingen, Division of Translational Oncology, Department of Hematology, Oncology, Clinical Immunology and Rheumatology, University Hospital Tübingen, Germany
- Friedrich Miescher Laboratory of the Max Planck Society Tübingen, Germany
- Mohammad ElGamacy
- Max Planck Institute for Developmental Biology, Tübingen, Germany
- University Hospital Tübingen, Division of Translational Oncology, Department of Hematology, Oncology, Clinical Immunology and Rheumatology, University Hospital Tübingen, Germany
- Friedrich Miescher Laboratory of the Max Planck Society Tübingen, Germany
- Heliopolis Biotechnology Ltd., London, United Kingdom
7
Cohan MC, Ruff KM, Pappu RV. Information theoretic measures for quantifying sequence-ensemble relationships of intrinsically disordered proteins. Protein Eng Des Sel 2020; 32:191-202. [PMID: 31375817 PMCID: PMC7462041 DOI: 10.1093/protein/gzz014] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 06/06/2019] [Accepted: 06/19/2019] [Indexed: 01/26/2023] Open
Abstract
Intrinsically disordered proteins (IDPs) contribute to a multitude of functions. De novo design of IDPs should open the door to modulating functions and phenotypes controlled by these systems. Recent design efforts have focused on compositional biases and specific sequence patterns as the design features. Analysis of the impact of these designs on sequence-function relationships indicates that individual sequence/compositional parameters are insufficient for describing sequence-function relationships in IDPs. To remedy this problem, we have developed information theoretic measures for sequence–ensemble relationships (SERs) of IDPs. These measures rely on prior availability of statistically robust conformational ensembles derived from all atom simulations. We show that the measures we have developed are useful for comparing sequence-ensemble relationships even when sequence is poorly conserved. Based on our results, we propose that de novo designs of IDPs, guided by knowledge of their SERs, should provide improved insights into their sequence–ensemble–function relationships.
Affiliation(s)
- Megan C Cohan
- Department of Biomedical Engineering and Center for Science & Engineering of Living Systems (CSELS) Washington University in St. Louis, One Brookings Drive, Campus Box 1097, St. Louis MO, USA
- Kiersten M Ruff
- Department of Biomedical Engineering and Center for Science & Engineering of Living Systems (CSELS) Washington University in St. Louis, One Brookings Drive, Campus Box 1097, St. Louis MO, USA
- Rohit V Pappu
- Department of Biomedical Engineering and Center for Science & Engineering of Living Systems (CSELS) Washington University in St. Louis, One Brookings Drive, Campus Box 1097, St. Louis MO, USA
8
Abstract
Proteins are molecular machines whose function depends on their ability to achieve complex folds with precisely defined structural and dynamic properties. The rational design of proteins from first principles, or de novo, was once considered to be impossible, but today proteins with a variety of folds and functions have been realized. We review the evolution of the field from its earliest days, placing particular emphasis on how this endeavor has illuminated our understanding of the principles underlying the folding and function of natural proteins, and is informing the design of macromolecules with unprecedented structures and properties. An initial set of milestones in de novo protein design focused on the construction of sequences that adopt folded conformations in water and membranes. The first proteins were designed from first principles using very simple physical models. As computers became more powerful, the use of the rotamer approximation allowed one to discover amino acid sequences that stabilize the desired fold. As the crystallographic database of protein structures expanded in subsequent years, it became possible to construct proteins by assembling short backbone fragments that frequently recur in Nature. The second set of milestones in de novo design involves the discovery of complex functions. Proteins have been designed to bind a variety of metals, porphyrins, and other cofactors. The design of proteins that catalyze hydrolysis and oxygen-dependent reactions has progressed significantly. However, de novo design of catalysts for energetically demanding reactions, or even of proteins that bind with high affinity and specificity to highly functionalized complex polar molecules, remains an important challenge that is now being met. Finally, protein design has contributed significantly to our understanding of membrane protein folding and transport of ions across membranes. The design of membrane proteins, or more generally of biomimetic polymers that function in mixed or non-aqueous environments, is now becoming increasingly feasible.
9
Mittl PR, Ernst P, Plückthun A. Chaperone-assisted structure elucidation with DARPins. Curr Opin Struct Biol 2020; 60:93-100. [PMID: 31918361 DOI: 10.1016/j.sbi.2019.12.009] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 08/30/2019] [Revised: 10/16/2019] [Accepted: 12/05/2019] [Indexed: 12/14/2022]
Abstract
Designed ankyrin repeat proteins (DARPins) are artificial binding proteins that have found many uses in therapy, diagnostics and biochemical research. They substantially extend the scope of antibody-derived binders. Their high affinity and specificity, rigidity, extended paratope, and facile bacterial production make them attractive for structural biology. Complexes with simple DARPins have been crystallized for a long time, but particularly the rigid helix fusion strategy has opened new opportunities. Rigid DARPin fusions expand crystallization space, enable recruitment of targets in a host lattice and reduce the size limit for cryo-EM. Besides applications in structural biology, rigid DARPin fusions also serve as molecular probes in cells to investigate spatial restraints in targets.
Affiliation(s)
- Peer R E Mittl
- Department of Biochemistry, University of Zürich, Winterthurerstrasse 190, CH-8057 Zürich, Switzerland
- Patrick Ernst
- Department of Biochemistry, University of Zürich, Winterthurerstrasse 190, CH-8057 Zürich, Switzerland
- Andreas Plückthun
- Department of Biochemistry, University of Zürich, Winterthurerstrasse 190, CH-8057 Zürich, Switzerland.
10
Ernst P, Honegger A, van der Valk F, Ewald C, Mittl PRE, Plückthun A. Rigid fusions of designed helical repeat binding proteins efficiently protect a binding surface from crystal contacts. Sci Rep 2019; 9:16162. [PMID: 31700118 PMCID: PMC6838082 DOI: 10.1038/s41598-019-52121-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 07/17/2019] [Accepted: 10/09/2019] [Indexed: 12/26/2022] Open
Abstract
Designed armadillo repeat proteins (dArmRPs) bind extended peptides in a modular way. The consensus version recognises alternating arginines and lysines, with one dipeptide per repeat. For generating new binding specificities, the rapid and robust analysis by crystallography is key. Yet, we have previously found that crystal contacts can strongly influence this analysis, by displacing the peptide and potentially distorting the overall geometry of the scaffold. Therefore, we now used protein design to minimise these effects and expand the previously described concept of shared helices to rigidly connect dArmRPs and designed ankyrin repeat proteins (DARPins), which serve as a crystallisation chaperone. To shield the peptide-binding surface from crystal contacts, we rigidly fused two DARPins to the N- and C-terminal repeat of the dArmRP and linked the two DARPins by a disulfide bond. In this ring-like structure, peptide binding, on the inside of the ring, is very regular and undistorted, highlighting the truly modular binding mode. Thus, protein design was utilised to construct a well crystallising scaffold that prevents interference from crystal contacts with peptide binding and maintains the equilibrium structure of the dArmRP. Rigid DARPin-dArmRPs fusions will also be useful when chimeric binding proteins with predefined geometries are required.
Affiliation(s)
- Patrick Ernst
- Department of Biochemistry, University of Zürich, Winterthurerstrasse 190, 8057, Zürich, Switzerland
- Annemarie Honegger
- Department of Biochemistry, University of Zürich, Winterthurerstrasse 190, 8057, Zürich, Switzerland
- Floor van der Valk
- Department of Biochemistry, University of Zürich, Winterthurerstrasse 190, 8057, Zürich, Switzerland
- Christina Ewald
- Department of Biochemistry, University of Zürich, Winterthurerstrasse 190, 8057, Zürich, Switzerland; Cytometry Facility, University of Zürich, Winterthurerstrasse 190, 8057, Zürich, Switzerland
- Peer R E Mittl
- Department of Biochemistry, University of Zürich, Winterthurerstrasse 190, 8057, Zürich, Switzerland
- Andreas Plückthun
- Department of Biochemistry, University of Zürich, Winterthurerstrasse 190, 8057, Zürich, Switzerland.
11
ElGamacy M, Coles M, Lupas A. Asymmetric protein design from conserved supersecondary structures. J Struct Biol 2018; 204:380-387. [DOI: 10.1016/j.jsb.2018.10.010] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 08/12/2018] [Revised: 10/19/2018] [Accepted: 10/25/2018] [Indexed: 10/28/2022]