1
Ribeiro-Filho HV, Jara GE, Guerra JVS, Cheung M, Felbinger NR, Pereira JGC, Pierce BG, Lopes-de-Oliveira PS. Exploring the Potential of Structure-Based Deep Learning Approaches for T cell Receptor Design. bioRxiv: The Preprint Server for Biology 2024:2024.04.19.590222. [PMID: 38712216 PMCID: PMC11071404 DOI: 10.1101/2024.04.19.590222] [Indexed: 05/08/2024]
Abstract
Deep learning methods, trained on the increasing set of available protein 3D structures and sequences, have substantially impacted the protein modeling and design field. These advancements have facilitated the creation of novel proteins, or the optimization of existing ones designed for specific functions, such as binding a target protein. Despite the demonstrated potential of such approaches in designing general protein binders, their application in designing immunotherapeutics remains relatively unexplored. A relevant application is the design of T cell receptors (TCRs). Given the crucial role of T cells in mediating immune responses, redirecting these cells to tumor or infected target cells through the engineering of TCRs has shown promising results in treating diseases, especially cancer. However, the computational design of TCR interactions presents challenges for current physics-based methods, particularly due to the unique natural characteristics of these interfaces, such as low affinity and cross-reactivity. For this reason, in this study, we explored the potential of two structure-based deep learning protein design methods, ProteinMPNN and ESM-IF, in designing fixed-backbone TCRs for binding target antigenic peptides presented by the MHC through different design scenarios. To evaluate TCR designs, we employed a comprehensive set of sequence- and structure-based metrics, highlighting the benefits of these methods in comparison to classical physics-based design methods and identifying deficiencies for improvement.
Affiliation(s)
- Helder V. Ribeiro-Filho
  - Brazilian Biosciences National Laboratory, Brazilian Center for Research in Energy and Materials, Campinas 13083-100, Brazil
- Gabriel E. Jara
  - Brazilian Biosciences National Laboratory, Brazilian Center for Research in Energy and Materials, Campinas 13083-100, Brazil
- João V. S. Guerra
  - Brazilian Biosciences National Laboratory, Brazilian Center for Research in Energy and Materials, Campinas 13083-100, Brazil
  - Graduate Program in Pharmaceutical Sciences, Faculty of Pharmaceutical Sciences, University of Campinas, Campinas, São Paulo, 13083-871, Brazil
- Melyssa Cheung
  - Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland 20850, USA
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland 20742, USA
- Nathaniel R. Felbinger
  - Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland 20850, USA
  - Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland 20742, USA
- José G. C. Pereira
  - Brazilian Biosciences National Laboratory, Brazilian Center for Research in Energy and Materials, Campinas 13083-100, Brazil
- Brian G. Pierce
  - Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland 20850, USA
  - Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland 20742, USA
- Paulo S. Lopes-de-Oliveira
  - Brazilian Biosciences National Laboratory, Brazilian Center for Research in Energy and Materials, Campinas 13083-100, Brazil
  - Graduate Program in Pharmaceutical Sciences, Faculty of Pharmaceutical Sciences, University of Campinas, Campinas, São Paulo, 13083-871, Brazil
2
Hong L, Kortemme T. An integrative approach to protein sequence design through multiobjective optimization. bioRxiv: The Preprint Server for Biology 2024:2024.03.01.582670. [PMID: 38496480 PMCID: PMC10942313 DOI: 10.1101/2024.03.01.582670] [Indexed: 03/19/2024]
Abstract
With recent methodological advances in the field of computational protein design, in particular those based on deep learning, there is an increasing need for frameworks that allow for coherent, direct integration of different models and objective functions into the generative design process. Here we demonstrate how evolutionary multiobjective optimization techniques can be adapted to provide such an approach. With the established Non-dominated Sorting Genetic Algorithm II (NSGA-II) as the optimization framework, we use AlphaFold2 and ProteinMPNN confidence metrics to define the objective space, and a mutation operator composed of ESM-1v and ProteinMPNN to rank and then redesign the least favorable positions. Using the multistate design problem of the foldswitching protein RfaH as an in-depth case study, we show that the evolutionary multiobjective optimization approach leads to significant reduction in the bias and variance in RfaH native sequence recovery, compared to a direct application of ProteinMPNN. We suggest that this improvement is due to three factors: (i) the use of an informative mutation operator that accelerates the sequence space exploration, (ii) the parallel, iterative design process inherent to the genetic algorithm that improves upon the ProteinMPNN autoregressive sequence decoding scheme, and (iii) the explicit approximation of the Pareto front that leads to optimal design candidates representing diverse tradeoff conditions. We anticipate this approach to be readily adaptable to different models and broadly relevant for protein design tasks with complex specifications.
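The "explicit approximation of the Pareto front" mentioned in this abstract can be illustrated with a minimal non-dominated filter. This is only a sketch of the general idea behind non-dominated sorting, not the authors' implementation: the function names `dominates` and `pareto_front`, the two-objective setup, and the example scores are hypothetical, and both objectives are assumed to be maximized.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if score vector a Pareto-dominates b (maximization).

    a dominates b when a is at least as good in every objective
    and strictly better in at least one.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return indices of candidates that no other candidate dominates."""
    front = []
    for i, s in enumerate(scores):
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i):
            front.append(i)
    return front


# Hypothetical candidate designs scored on two made-up objectives,
# e.g. (structure-confidence-like score, sequence-confidence-like score).
candidate_scores = [
    (0.9, 0.2),
    (0.5, 0.5),
    (0.2, 0.9),
    (0.4, 0.4),  # dominated by (0.5, 0.5)
    (0.9, 0.1),  # dominated by (0.9, 0.2)
]

front_indices = pareto_front(candidate_scores)
print(front_indices)
```

The surviving candidates represent different tradeoffs between the two objectives; a full NSGA-II loop would additionally rank the dominated candidates into successive fronts and apply crowding-distance selection, which this sketch omits.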
Affiliation(s)
- Lu Hong
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
- Tanja Kortemme
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
  - Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
  - Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
3
Kortemme T. De novo protein design-From new structures to programmable functions. Cell 2024; 187:526-544. [PMID: 38306980 PMCID: PMC10990048 DOI: 10.1016/j.cell.2023.12.028] [Received: 10/27/2023] [Revised: 12/03/2023] [Accepted: 12/19/2023] [Indexed: 02/04/2024]
Abstract
Methods from artificial intelligence (AI) trained on large datasets of sequences and structures can now "write" proteins with new shapes and molecular functions de novo, without starting from proteins found in nature. In this Perspective, I will discuss the state of the field of de novo protein design at the juncture of physics-based modeling approaches and AI. New protein folds and higher-order assemblies can be designed with considerable experimental success rates, and difficult problems requiring tunable control over protein conformations and precise shape complementarity for molecular recognition are coming into reach. Emerging approaches incorporate engineering principles (tunability, controllability, and modularity) into the design process from the beginning. Exciting frontiers lie in deconstructing cellular functions with de novo proteins and, conversely, constructing synthetic cellular signaling from the ground up. As methods improve, many more challenges are unsolved.
Affiliation(s)
- Tanja Kortemme
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
  - Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
  - Chan Zuckerberg Biohub, San Francisco, CA 94158, USA