1
Harteveld Z, Van Hall-Beauvais A, Morozova I, Southern J, Goverde C, Georgeon S, Rosset S, Defferrard M, Loukas A, Vandergheynst P, Bronstein MM, Correia BE. Exploring "dark-matter" protein folds using deep learning. Cell Syst 2024; 15:898-910.e5. [PMID: 39383860 DOI: 10.1016/j.cels.2024.09.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2023] [Revised: 06/13/2024] [Accepted: 09/16/2024] [Indexed: 10/11/2024]
Abstract
De novo protein design explores uncharted sequence and structure space to generate novel proteins not sampled by evolution. A main challenge in de novo design involves crafting "designable" structural templates to guide the sequence searches toward adopting target structures. We present a convolutional variational autoencoder that learns patterns of protein structure, dubbed Genesis. We coupled Genesis with trRosetta to design sequences for a set of protein folds and found that Genesis is capable of reconstructing native-like distance and angle distributions for five native folds and three novel, the so-called "dark-matter" folds as a demonstration of generalizability. We used a high-throughput assay to characterize the stability of the designs through protease resistance, obtaining encouraging success rates for folded proteins. Genesis enables exploration of the protein fold space within minutes, unrestricted by protein topologies. Our approach addresses the backbone designability problem, showing that small neural networks can efficiently learn structural patterns in proteins. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Zander Harteveld
- École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Alexandra Van Hall-Beauvais
- École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Irina Morozova
- École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Casper Goverde
- École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Stéphane Rosset
- École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Andreas Loukas
- École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Prescient Design, gRED, Roche, Basel, Switzerland
- Bruno E Correia
- École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland.
2
Albanese KI, Petrenas R, Pirro F, Naudin EA, Borucu U, Dawson WM, Scott DA, Leggett GJ, Weiner OD, Oliver TAA, Woolfson DN. Rationally seeded computational protein design of α-helical barrels. Nat Chem Biol 2024; 20:991-999. [PMID: 38902458 PMCID: PMC11288890 DOI: 10.1038/s41589-024-01642-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Accepted: 05/09/2024] [Indexed: 06/22/2024]
Abstract
Computational protein design is advancing rapidly. Here we describe efficient routes starting from validated parallel and antiparallel peptide assemblies to design two families of α-helical barrel proteins with central channels that bind small molecules. Computational designs are seeded by the sequences and structures of defined de novo oligomeric barrel-forming peptides, and adjacent helices are connected by loop building. For targets with antiparallel helices, short loops are sufficient. However, targets with parallel helices require longer connectors; namely, an outer layer of helix-turn-helix-turn-helix motifs that are packed onto the barrels. Throughout these computational pipelines, residues that define open states of the barrels are maintained. This minimizes sequence sampling, accelerating the design process. For each of six targets, just two to six synthetic genes are made for expression in Escherichia coli. On average, 70% of these genes express to give soluble monomeric proteins that are fully characterized, including high-resolution structures for most targets that match the design models with high accuracy.
Affiliation(s)
- Katherine I Albanese
- School of Chemistry, University of Bristol, Bristol, UK
- Max Planck-Bristol Centre for Minimal Biology, University of Bristol, Bristol, UK
- Fabio Pirro
- School of Chemistry, University of Bristol, Bristol, UK
- Ufuk Borucu
- School of Biochemistry, University of Bristol, Medical Sciences Building, Bristol, UK
- D Arne Scott
- Rosa Biotech, Science Creates St Philips, Bristol, UK
- Orion D Weiner
- Cardiovascular Research Institute, Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA, USA
- Derek N Woolfson
- School of Chemistry, University of Bristol, Bristol, UK.
- Max Planck-Bristol Centre for Minimal Biology, University of Bristol, Bristol, UK.
- School of Biochemistry, University of Bristol, Medical Sciences Building, Bristol, UK.
- Bristol BioDesign Institute, University of Bristol, Bristol, UK.
3
Roel-Touris J, Carcelén L, Marcos E. The structural landscape of the immunoglobulin fold by large-scale de novo design. Protein Sci 2024; 33:e4936. [PMID: 38501461 PMCID: PMC10949314 DOI: 10.1002/pro.4936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Revised: 02/02/2024] [Accepted: 02/06/2024] [Indexed: 03/20/2024]
Abstract
De novo designing immunoglobulin-like frameworks that allow for functional loop diversification shows great potential for crafting antibody-like scaffolds with fully customizable structures and functions. In this work, we combined de novo parametric design with deep-learning methods for protein structure prediction and design to explore the structural landscape of 7-stranded immunoglobulin domains. After screening folding of nearly 4 million designs, we have assembled a structurally diverse library of ~50,000 immunoglobulin domains with high-confidence AlphaFold2 predictions and structures diverging from naturally occurring ones. The designed dataset enabled us to identify structural requirements for the correct folding of immunoglobulin domains, shed light on β-sheet-β-sheet rotational preferences and how these are linked to functional properties. Our approach eliminates the need for preset loop conformations and opens the route to large-scale de novo design of immunoglobulin-like frameworks.
Affiliation(s)
- Jorge Roel-Touris
- Protein Design and Modeling Lab, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB), CSIC, Barcelona, Spain
- Lourdes Carcelén
- Protein Design and Modeling Lab, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB), CSIC, Barcelona, Spain
- Enrique Marcos
- Protein Design and Modeling Lab, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB), CSIC, Barcelona, Spain
4
Kortemme T. De novo protein design-From new structures to programmable functions. Cell 2024; 187:526-544. [PMID: 38306980 PMCID: PMC10990048 DOI: 10.1016/j.cell.2023.12.028] [Citation(s) in RCA: 36] [Impact Index Per Article: 36.0] [Received: 10/27/2023] [Revised: 12/03/2023] [Accepted: 12/19/2023] [Indexed: 02/04/2024]
Abstract
Methods from artificial intelligence (AI) trained on large datasets of sequences and structures can now "write" proteins with new shapes and molecular functions de novo, without starting from proteins found in nature. In this Perspective, I will discuss the state of the field of de novo protein design at the juncture of physics-based modeling approaches and AI. New protein folds and higher-order assemblies can be designed with considerable experimental success rates, and difficult problems requiring tunable control over protein conformations and precise shape complementarity for molecular recognition are coming into reach. Emerging approaches incorporate engineering principles (tunability, controllability, and modularity) into the design process from the beginning. Exciting frontiers lie in deconstructing cellular functions with de novo proteins and, conversely, constructing synthetic cellular signaling from the ground up. As methods improve, many more challenges are unsolved.
Affiliation(s)
- Tanja Kortemme
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA; Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA; Chan Zuckerberg Biohub, San Francisco, CA 94158, USA.
5
Roel-Touris J, Nadal M, Marcos E. Single-chain dimers from de novo immunoglobulins as robust scaffolds for multiple binding loops. Nat Commun 2023; 14:5939. [PMID: 37741853 PMCID: PMC10517939 DOI: 10.1038/s41467-023-41717-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Accepted: 09/15/2023] [Indexed: 09/25/2023] Open
Abstract
Antibody derivatives have sought to recapitulate the antigen binding properties of antibodies, but with improved biophysical attributes convenient for therapeutic, diagnostic and research applications. However, their success has been limited by the naturally occurring structure of the immunoglobulin dimer displaying hypervariable binding loops, which is hard to modify by traditional engineering approaches. Here, we devise geometrical principles for de novo designing single-chain immunoglobulin dimers, as a tunable two-domain architecture that optimizes biophysical properties through more favorable dimer interfaces. Guided by these principles, we computationally designed protein scaffolds that were hyperstable, structurally accurate and robust for accommodating multiple functional loops, both individually and in combination, as confirmed through biochemical assays and X-ray crystallography. We showcase the modularity of this architecture by deep-learning-based diversification, opening up the possibility for tailoring the number, positioning, and relative orientation of ligand-binding loops targeting one or two distal epitopes. Our results provide a route to custom-design robust protein scaffolds for harboring multiple functional loops.
Affiliation(s)
- Jorge Roel-Touris
- Protein Design and Modeling Lab, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB), CSIC, Baldiri Reixac 10, 08028, Barcelona, Spain
- Marta Nadal
- Protein Design and Modeling Lab, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB), CSIC, Baldiri Reixac 10, 08028, Barcelona, Spain
- Enrique Marcos
- Protein Design and Modeling Lab, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB), CSIC, Baldiri Reixac 10, 08028, Barcelona, Spain.
6
Cummins MC, Tripathy A, Sondek J, Kuhlman B. De novo design of stable proteins that efficaciously inhibit oncogenic G proteins. Protein Sci 2023; 32:e4713. [PMID: 37368504 PMCID: PMC10360382 DOI: 10.1002/pro.4713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2023] [Revised: 06/23/2023] [Accepted: 06/24/2023] [Indexed: 06/29/2023]
Abstract
Many protein therapeutics are competitive inhibitors that function by binding to endogenous proteins and preventing them from interacting with native partners. One effective strategy for engineering competitive inhibitors is to graft structural motifs from a native partner into a host protein. Here, we develop and experimentally test a computational protocol for embedding binding motifs in de novo designed proteins. The protocol uses an "inside-out" approach: Starting with a structural model of the binding motif docked against the target protein, the de novo protein is built by growing new structural elements off the termini of the binding motif. During backbone assembly, a score function favors backbones that introduce new tertiary contacts within the designed protein and do not introduce clashes with the target binding partner. Final sequences are designed and optimized using the molecular modeling program Rosetta. To test our protocol, we designed small helical proteins to inhibit the interaction between Gαq and its effector PLC-β isozymes. Several of the designed proteins remain folded above 90°C and bind to Gαq with equilibrium dissociation constants tighter than 80 nM. In cellular assays with oncogenic variants of Gαq, the designed proteins inhibit activation of PLC-β isozymes and Dbl-family RhoGEFs. Our results demonstrate that computational protein design, in combination with motif grafting, can be used to directly generate potent inhibitors without further optimization via high throughput screening or selection.
Affiliation(s)
- Matthew C. Cummins
- Department of Pharmacology, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Ashutosh Tripathy
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- John Sondek
- Department of Pharmacology, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Brian Kuhlman
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
7
Cummins MC, Tripathy A, Sondek J, Kuhlman B. De novo design of stable proteins that efficaciously inhibit oncogenic G proteins. bioRxiv: the preprint server for biology 2023:2023.03.28.534629. [PMID: 37034763 PMCID: PMC10081213 DOI: 10.1101/2023.03.28.534629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/22/2023]
Abstract
Many protein therapeutics are competitive inhibitors that function by binding to endogenous proteins and preventing them from interacting with native partners. One effective strategy for engineering competitive inhibitors is to graft structural motifs from a native partner into a host protein. Here, we develop and experimentally test a computational protocol for embedding binding motifs in de novo designed proteins. The protocol uses an "inside-out" approach: Starting with a structural model of the binding motif docked against the target protein, the de novo protein is built by growing new structural elements off the termini of the binding motif. During backbone assembly, a score function favors backbones that introduce new tertiary contacts within the designed protein and do not introduce clashes with the target binding partner. Final sequences are designed and optimized using the molecular modeling program Rosetta. To test our protocol, we designed small helical proteins to inhibit the interaction between Gαq and its effector PLC-β isozymes. Several of the designed proteins remain folded above 90°C and bind to Gαq with equilibrium dissociation constants tighter than 80 nM. In cellular assays with oncogenic variants of Gαq, the designed proteins inhibit activation of PLC-β isozymes and Dbl-family RhoGEFs. Our results demonstrate that computational protein design, in combination with motif grafting, can be used to directly generate potent inhibitors without further optimization via high throughput screening or selection.
Statement for broader audience: Engineered proteins that bind to specific target proteins are useful as research reagents, diagnostics, and therapeutics. We used computational protein design to engineer de novo proteins that bind and competitively inhibit the G protein, Gαq, which is an oncogene for uveal melanomas. This computational method is a general approach that should be useful for designing competitive inhibitors against other proteins of interest.
Affiliation(s)
- Matthew C. Cummins
- Department of Pharmacology, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Ashutosh Tripathy
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- John Sondek
- Department of Pharmacology, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Brian Kuhlman
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
8
Wang X, Xu K, Tan Y, Liu S, Zhou J. Possibilities of Using De Novo Design for Generating Diverse Functional Food Enzymes. Int J Mol Sci 2023; 24:3827. [PMID: 36835238 PMCID: PMC9964944 DOI: 10.3390/ijms24043827] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/11/2023] [Revised: 02/03/2023] [Accepted: 02/03/2023] [Indexed: 02/17/2023] Open
Abstract
Food enzymes have an important role in the improvement of certain food characteristics, such as texture improvement, elimination of toxins and allergens, production of carbohydrates, and enhancement of flavor and appearance. Recently, along with the development of artificial meats, food enzymes have been employed to achieve more diverse functions, especially in converting non-edible biomass to delicious foods. Reported food enzyme modifications for specific applications have highlighted the significance of enzyme engineering. However, directed evolution and rational design have shown inherent limitations due to mutation rates, which make it difficult to satisfy the stability or specific activity needs of certain applications. Generating functional enzymes using de novo design, which highly assembles naturally existing enzymes, provides potential solutions for screening desired enzymes. Here, we describe the functions and applications of food enzymes to introduce the need for food enzyme engineering. To illustrate the possibilities of using de novo design for generating diverse functional proteins, we reviewed protein modelling and de novo design methods and their implementations. The future directions for adding structural data for de novo design model training, acquiring diversified training data, and investigating the relationship between enzyme-substrate binding and activity were highlighted as challenges to overcome for the de novo design of food enzymes.
Affiliation(s)
- Xinglong Wang
- Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, School of Biotechnology, Jiangnan University, Wuxi 214122, China
- Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Kangjie Xu
- Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, School of Biotechnology, Jiangnan University, Wuxi 214122, China
- Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Yameng Tan
- Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, School of Biotechnology, Jiangnan University, Wuxi 214122, China
- Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Song Liu
- Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, School of Biotechnology, Jiangnan University, Wuxi 214122, China
- Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Jingwen Zhou
- Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, School of Biotechnology, Jiangnan University, Wuxi 214122, China
- Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Jiangsu Province Engineering Research Center of Food Synthetic Biotechnology, Jiangnan University, Wuxi 214122, China