1
Yu Z, Jiang M, Lan X. HeteroTCR: A heterogeneous graph neural network-based method for predicting peptide-TCR interaction. Commun Biol 2024; 7:684. [PMID: 38834836 DOI: 10.1038/s42003-024-06380-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Accepted: 05/23/2024] [Indexed: 06/06/2024] Open
Abstract
Identifying interactions between T-cell receptors (TCRs) and immunogenic peptides holds profound implications across diverse research domains and clinical scenarios. Unsupervised clustering models (UCMs) cannot predict peptide-TCR binding directly, while supervised predictive models (SPMs) often face challenges in identifying antigens previously unencountered by the immune system or possessing limited TCR binding repertoires. Therefore, we propose HeteroTCR, an SPM based on Heterogeneous Graph Neural Network (GNN), to accurately predict peptide-TCR binding probabilities. HeteroTCR captures within-type (TCR-TCR or peptide-peptide) similarity information and between-type (peptide-TCR) interaction insights for predictions on unseen peptides and TCRs, surpassing limitations of existing SPMs. Our evaluation shows HeteroTCR outperforms state-of-the-art models on independent datasets. Ablation studies and visual interpretation underscore the Heterogeneous GNN module's critical role in enhancing HeteroTCR's performance by capturing pivotal binding process features. We further demonstrate the robustness and reliability of HeteroTCR through validation using single-cell datasets, aligning with the expectation that pMHC-TCR complexes with higher predicted binding probabilities correspond to increased binding fractions.
Affiliation(s)
- Zilan Yu
  - School of Medicine, Tsinghua University, 100084, Beijing, China
  - Centre for Life Sciences, Tsinghua University, 100084, Beijing, China
- Mengnan Jiang
  - School of Medicine, Tsinghua University, 100084, Beijing, China
- Xun Lan
  - School of Medicine, Tsinghua University, 100084, Beijing, China
  - Centre for Life Sciences, Tsinghua University, 100084, Beijing, China
  - Tsinghua-Peking Center for Life Sciences, MOE Key Laboratory of Tsinghua University, Beijing, China
  - MOE Key Laboratory of Bioinformatics, Tsinghua University, 100084, Beijing, China
2
Bulashevska A, Nacsa Z, Lang F, Braun M, Machyna M, Diken M, Childs L, König R. Artificial intelligence and neoantigens: paving the path for precision cancer immunotherapy. Front Immunol 2024; 15:1394003. [PMID: 38868767 PMCID: PMC11167095 DOI: 10.3389/fimmu.2024.1394003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Accepted: 05/13/2024] [Indexed: 06/14/2024] Open
Abstract
Cancer immunotherapy has witnessed rapid advancement in recent years, with a particular focus on neoantigens as promising targets for personalized treatments. The convergence of immunogenomics, bioinformatics, and artificial intelligence (AI) has propelled the development of innovative neoantigen discovery tools and pipelines. These tools have revolutionized our ability to identify tumor-specific antigens, providing the foundation for precision cancer immunotherapy. AI-driven algorithms can process extensive amounts of data, identify patterns, and make predictions that were once challenging to achieve. However, the integration of AI comes with its own set of challenges, leaving space for further research. With particular focus on the computational approaches, in this article we have explored the current landscape of neoantigen prediction, the fundamental concepts behind, the challenges and their potential solutions providing a comprehensive overview of this rapidly evolving field.
Affiliation(s)
- Alla Bulashevska
  - Host-Pathogen-Interactions, Paul-Ehrlich-Institut, Langen, Germany
- Zsófia Nacsa
  - Host-Pathogen-Interactions, Paul-Ehrlich-Institut, Langen, Germany
- Franziska Lang
  - TRON - Translational Oncology at the University Medical Center of the Johannes Gutenberg University gGmbH, Mainz, Germany
- Markus Braun
  - Host-Pathogen-Interactions, Paul-Ehrlich-Institut, Langen, Germany
- Martin Machyna
  - Host-Pathogen-Interactions, Paul-Ehrlich-Institut, Langen, Germany
- Mustafa Diken
  - TRON - Translational Oncology at the University Medical Center of the Johannes Gutenberg University gGmbH, Mainz, Germany
- Liam Childs
  - Host-Pathogen-Interactions, Paul-Ehrlich-Institut, Langen, Germany
- Renate König
  - Host-Pathogen-Interactions, Paul-Ehrlich-Institut, Langen, Germany
3
Leary AY, Scott D, Gupta NT, Waite JC, Skokos D, Atwal GS, Hawkins PG. Designing meaningful continuous representations of T cell receptor sequences with deep generative models. Nat Commun 2024; 15:4271. [PMID: 38769289 PMCID: PMC11106309 DOI: 10.1038/s41467-024-48198-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2023] [Accepted: 04/24/2024] [Indexed: 05/22/2024] Open
Abstract
T Cell Receptor (TCR) antigen binding underlies a key mechanism of the adaptive immune response yet the vast diversity of TCRs and the complexity of protein interactions limits our ability to build useful low dimensional representations of TCRs. To address the current limitations in TCR analysis we develop a capacity-controlled disentangling variational autoencoder trained using a dataset of approximately 100 million TCR sequences, that we name TCR-VALID. We design TCR-VALID such that the model representations are low-dimensional, continuous, disentangled, and sufficiently informative to provide high-quality TCR sequence de novo generation. We thoroughly quantify these properties of the representations, providing a framework for future protein representation learning in low dimensions. The continuity of TCR-VALID representations allows fast and accurate TCR clustering and is benchmarked against other state-of-the-art TCR clustering tools and pre-trained language models.
Affiliation(s)
- Allen Y Leary
  - Regeneron Pharmaceuticals Inc., 777 Old Saw Mill River Road, Tarrytown, NY, 10591, USA
- Darius Scott
  - Regeneron Pharmaceuticals Inc., 777 Old Saw Mill River Road, Tarrytown, NY, 10591, USA
- Namita T Gupta
  - Regeneron Pharmaceuticals Inc., 777 Old Saw Mill River Road, Tarrytown, NY, 10591, USA
- Janelle C Waite
  - Regeneron Pharmaceuticals Inc., 777 Old Saw Mill River Road, Tarrytown, NY, 10591, USA
- Dimitris Skokos
  - Regeneron Pharmaceuticals Inc., 777 Old Saw Mill River Road, Tarrytown, NY, 10591, USA
- Gurinder S Atwal
  - Regeneron Pharmaceuticals Inc., 777 Old Saw Mill River Road, Tarrytown, NY, 10591, USA
- Peter G Hawkins
  - Regeneron Pharmaceuticals Inc., 777 Old Saw Mill River Road, Tarrytown, NY, 10591, USA
4
Wang A, Lin X, Chau KN, Onuchic JN, Levine H, George JT. RACER-m leverages structural features for sparse T cell specificity prediction. Sci Adv 2024; 10:eadl0161. [PMID: 38748791 PMCID: PMC11095454 DOI: 10.1126/sciadv.adl0161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Accepted: 04/10/2024] [Indexed: 05/19/2024]
Abstract
Reliable prediction of T cell specificity against antigenic signatures is a formidable task, complicated by the immense diversity of T cell receptor and antigen sequence space and the resulting limited availability of training sets for inferential models. Recent modeling efforts have demonstrated the advantage of incorporating structural information to overcome the need for extensive training sequence data, yet disentangling the heterogeneous TCR-antigen interface to accurately predict MHC-allele-restricted TCR-peptide interactions has remained challenging. Here, we present RACER-m, a coarse-grained structural model leveraging key biophysical information from the diversity of publicly available TCR-antigen crystal structures. Explicit inclusion of structural content substantially reduces the required number of training examples and maintains reliable predictions of TCR-recognition specificity and sensitivity across diverse biological contexts. Our model capably identifies biophysically meaningful point-mutant peptides that affect binding affinity, distinguishing its ability in predicting TCR specificity of point-mutants from alternative sequence-based methods. Its application is broadly applicable to studies involving both closely related and structurally diverse TCR-peptide pairs.
Affiliation(s)
- Ailun Wang
  - Center for Theoretical Biological Physics, Northeastern University, Boston, MA, USA
  - Department of Physics, Northeastern University, Boston, MA, USA
- Xingcheng Lin
  - Department of Physics, North Carolina State University, Raleigh, NC, USA
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
- Kevin Ng Chau
  - Center for Theoretical Biological Physics, Northeastern University, Boston, MA, USA
  - Department of Physics, Northeastern University, Boston, MA, USA
- José N. Onuchic
  - Departments of Physics and Astronomy, Chemistry, and Biosciences, Rice University, Houston, TX, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Herbert Levine
  - Center for Theoretical Biological Physics, Northeastern University, Boston, MA, USA
  - Department of Physics, Northeastern University, Boston, MA, USA
  - Department of Bioengineering, Northeastern University, Boston, MA, USA
- Jason T. George
  - Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
  - Department of Biomedical Engineering, Texas A&M University, Houston, TX, USA
5
Lotter W, Hassett MJ, Schultz N, Kehl KL, Van Allen EM, Cerami E. Artificial Intelligence in Oncology: Current Landscape, Challenges, and Future Directions. Cancer Discov 2024; 14:711-726. [PMID: 38597966 PMCID: PMC11131133 DOI: 10.1158/2159-8290.cd-23-1199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Revised: 01/29/2024] [Accepted: 02/28/2024] [Indexed: 04/11/2024]
Abstract
Artificial intelligence (AI) in oncology is advancing beyond algorithm development to integration into clinical practice. This review describes the current state of the field, with a specific focus on clinical integration. AI applications are structured according to cancer type and clinical domain, focusing on the four most common cancers and tasks of detection, diagnosis, and treatment. These applications encompass various data modalities, including imaging, genomics, and medical records. We conclude with a summary of existing challenges, evolving solutions, and potential future directions for the field. Significance: AI is increasingly being applied to all aspects of oncology, where several applications are maturing beyond research and development to direct clinical integration. This review summarizes the current state of the field through the lens of clinical translation along the clinical care continuum. Emerging areas are also highlighted, along with common challenges, evolving solutions, and potential future directions for the field.
Affiliation(s)
- William Lotter
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Pathology, Brigham and Women’s Hospital, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
- Michael J. Hassett
  - Harvard Medical School, Boston, MA, USA
  - Division of Population Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Nikolaus Schultz
  - Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Kenneth L. Kehl
  - Harvard Medical School, Boston, MA, USA
  - Division of Population Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Eliezer M. Van Allen
  - Harvard Medical School, Boston, MA, USA
  - Division of Population Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Ethan Cerami
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
6
Kidman J, Zemek RM, Sidhom JW, Correa D, Principe N, Sheikh F, Fear VS, Forbes CA, Chopra A, Boon L, Zaitouny A, de Jong E, Holt RA, Jones M, Millward MJ, Lassmann T, Forrest AR, Nowak AK, Watson M, Lake RA, Lesterhuis WJ, Chee J. Immune checkpoint therapy responders display early clonal expansion of tumor infiltrating lymphocytes. Oncoimmunology 2024; 13:2345859. [PMID: 38686178 PMCID: PMC11057660 DOI: 10.1080/2162402x.2024.2345859] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Accepted: 04/17/2024] [Indexed: 05/02/2024] Open
Abstract
Immune checkpoint therapy (ICT) causes durable tumour responses in a subgroup of patients, but it is not well known how T cell receptor beta (TCRβ) repertoire dynamics contribute to the therapeutic response. Using murine models that exclude variation in host genetics, environmental factors and tumour mutation burden, limiting variation between animals to naturally diverse TCRβ repertoires, we applied TCRseq, single cell RNAseq and flow cytometry to study TCRβ repertoire dynamics in ICT responders and non-responders. Increased oligoclonal expansion of TCRβ clonotypes was observed in responding tumours. Machine learning identified TCRβ CDR3 signatures unique to each tumour model, and signatures associated with ICT response at various timepoints before or during ICT. Clonally expanded CD8+ T cells in responding tumours post ICT displayed effector T cell gene signatures and phenotype. An early burst of clonal expansion during ICT is associated with response, and we report unique dynamics in TCRβ signatures associated with ICT response.
MESH Headings
- Animals
- Immune Checkpoint Inhibitors/pharmacology
- Immune Checkpoint Inhibitors/therapeutic use
- Receptors, Antigen, T-Cell, alpha-beta/genetics
- Receptors, Antigen, T-Cell, alpha-beta/metabolism
- Mice
- Lymphocytes, Tumor-Infiltrating/immunology
- Lymphocytes, Tumor-Infiltrating/drug effects
- Lymphocytes, Tumor-Infiltrating/metabolism
- CD8-Positive T-Lymphocytes/immunology
- CD8-Positive T-Lymphocytes/drug effects
- CD8-Positive T-Lymphocytes/metabolism
- Humans
- Mice, Inbred C57BL
- Female
Affiliation(s)
- Joel Kidman
  - National Centre for Asbestos Related Diseases, Institute for Respiratory Health, University of Western Australia, Perth, Australia
- Debora Correa
  - Complex Systems Group, Department of Mathematics and Statistics, University of Western Australia, Perth, Australia
- Nicola Principe
  - National Centre for Asbestos Related Diseases, Institute for Respiratory Health, University of Western Australia, Perth, Australia
- Fezaan Sheikh
  - National Centre for Asbestos Related Diseases, Institute for Respiratory Health, University of Western Australia, Perth, Australia
- Abha Chopra
  - Medical Genomics Laboratories (IIID), Centre for Molecular Medicine and Innovative Therapeutics, Health Futures Institute, Murdoch University, Murdoch, Australia
- Ayham Zaitouny
  - Complex Systems Group, Department of Mathematics and Statistics, University of Western Australia, Perth, Australia
  - Department of Mathematical Sciences, United Arab Emirates University, Al Ain, United Arab Emirates
- Emma de Jong
  - Telethon Kids Institute, Perth, Australia
  - Medical School, University of Western Australia, Perth, Australia
- Matt Jones
  - Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, The University of Western Australia, Perth, Australia
- Alistair R.R. Forrest
  - Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, The University of Western Australia, Perth, Australia
- Anna K. Nowak
  - National Centre for Asbestos Related Diseases, Institute for Respiratory Health, University of Western Australia, Perth, Australia
  - Medical School, University of Western Australia, Perth, Australia
- Mark Watson
  - Medical Genomics Laboratories (IIID), Centre for Molecular Medicine and Innovative Therapeutics, Health Futures Institute, Murdoch University, Murdoch, Australia
- Richard A. Lake
  - National Centre for Asbestos Related Diseases, Institute for Respiratory Health, University of Western Australia, Perth, Australia
- W. Joost Lesterhuis
  - National Centre for Asbestos Related Diseases, Institute for Respiratory Health, University of Western Australia, Perth, Australia
  - Telethon Kids Institute, Perth, Australia
- Jonathan Chee
  - National Centre for Asbestos Related Diseases, Institute for Respiratory Health, University of Western Australia, Perth, Australia
7
Goldner Kabeli R, Zevin S, Abargel A, Zilberberg A, Efroni S. Self-supervised learning of T cell receptor sequences exposes core properties for T cell membership. Sci Adv 2024; 10:eadk4670. [PMID: 38669334 DOI: 10.1126/sciadv.adk4670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Accepted: 03/26/2024] [Indexed: 04/28/2024]
Abstract
The T cell receptor (TCR) repertoire is an extraordinarily diverse collection of TCRs essential for maintaining the body's homeostasis and response to threats. In this study, we compiled an extensive dataset of more than 4200 bulk TCR repertoire samples, encompassing 221,176,713 sequences, alongside 6,159,652 single-cell TCR sequences from over 400 samples. From this dataset, we then selected a representative subset of 5 million bulk sequences and 4.2 million single-cell sequences to train two specialized Transformer-based language models for bulk (CVC) and single-cell (scCVC) TCR repertoires, respectively. We show that these models successfully capture TCR core qualities, such as sharing, gene composition, and single-cell properties. These qualities are emergent in the encoded TCR latent space and enable classification into TCR-based qualities such as public sequences. These models demonstrate the potential of Transformer-based language models in TCR downstream applications.
Affiliation(s)
- Romi Goldner Kabeli
  - The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
- Sarit Zevin
  - The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
- Avital Abargel
  - The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
- Alona Zilberberg
  - The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
- Sol Efroni
  - The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
8
Eskandari A, Leow TC, Rahman MBA, Oslan SN. Advances in Therapeutic Cancer Vaccines, Their Obstacles, and Prospects Toward Tumor Immunotherapy. Mol Biotechnol 2024:10.1007/s12033-024-01144-3. [PMID: 38625508 DOI: 10.1007/s12033-024-01144-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Accepted: 03/15/2024] [Indexed: 04/17/2024]
Abstract
Over the past few decades, cancer immunotherapy has experienced a significant revolution due to the advancements in immune checkpoint inhibitors (ICIs) and adoptive cell therapies (ACTs), along with their regulatory approvals. In recent times, there has been hope in the effectiveness of cancer vaccines for therapy as they have been able to stimulate de novo T-cell reactions against tumor antigens. These tumor antigens include both tumor-associated antigen (TAA) and tumor-specific antigen (TSA). Nevertheless, the constant quest to fully achieve these abilities persists. Therefore, this review offers a broad perspective on the existing status of cancer immunizations. Cancer vaccine design has been revolutionized due to the advancements made in antigen selection, the development of antigen delivery systems, and a deeper understanding of the strategic intricacies involved in effective antigen presentation. In addition, this review addresses the present condition of clinical tests and deliberates on their approaches, with a particular emphasis on the immunogenicity specific to tumors and the evaluation of effectiveness against tumors. Nevertheless, the ongoing clinical endeavors to create cancer vaccines have failed to produce remarkable clinical results as a result of substantial obstacles, such as the suppression of the tumor immune microenvironment, the identification of suitable candidates, the assessment of immune responses, and the acceleration of vaccine production. Hence, there are possibilities for the industry to overcome challenges and enhance patient results in the coming years. This can be achieved by recognizing the intricate nature of clinical issues and continuously working toward surpassing existing limitations.
Affiliation(s)
- Azadeh Eskandari
  - Enzyme and Microbial Technology Research Centre, Universiti Putra Malaysia, 43400 UPM, Serdang, Selangor, Malaysia
  - Department of Biochemistry, Faculty of Biotechnology and Biomolecular Sciences, Universiti Putra Malaysia, 43400 UPM, Serdang, Selangor, Malaysia
- Thean Chor Leow
  - Enzyme and Microbial Technology Research Centre, Universiti Putra Malaysia, 43400 UPM, Serdang, Selangor, Malaysia
  - Department of Cell and Molecular Biology, Faculty of Biotechnology and Biomolecular Sciences, Universiti Putra Malaysia, 43400 UPM, Serdang, Selangor, Malaysia
  - Enzyme Technology and X-ray Crystallography Laboratory, VacBio 5, Institute of Bioscience, Universiti Putra Malaysia, 43400 UPM, Serdang, Selangor, Malaysia
- Siti Nurbaya Oslan
  - Enzyme and Microbial Technology Research Centre, Universiti Putra Malaysia, 43400 UPM, Serdang, Selangor, Malaysia
  - Department of Biochemistry, Faculty of Biotechnology and Biomolecular Sciences, Universiti Putra Malaysia, 43400 UPM, Serdang, Selangor, Malaysia
  - Enzyme Technology and X-ray Crystallography Laboratory, VacBio 5, Institute of Bioscience, Universiti Putra Malaysia, 43400 UPM, Serdang, Selangor, Malaysia
9
Zaslavsky ME, Craig E, Michuda JK, Sehgal N, Ram-Mohan N, Lee JY, Nguyen KD, Hoh RA, Pham TD, Röltgen K, Lam B, Parsons ES, Macwana SR, DeJager W, Drapeau EM, Roskin KM, Cunningham-Rundles C, Moody MA, Haynes BF, Goldman JD, Heath JR, Nadeau KC, Pinsky BA, Blish CA, Hensley SE, Jensen K, Meyer E, Balboni I, Utz PJ, Merrill JT, Guthridge JM, James JA, Yang S, Tibshirani R, Kundaje A, Boyd SD. Disease diagnostics using machine learning of immune receptors. bioRxiv 2024:2022.04.26.489314. [PMID: 35547855 PMCID: PMC9094102 DOI: 10.1101/2022.04.26.489314] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/25/2022]
Abstract
Clinical diagnosis typically incorporates physical examination, patient history, and various laboratory tests and imaging studies, but makes limited use of the human system's own record of antigen exposures encoded by receptors on B cells and T cells. We analyzed immune receptor datasets from 593 individuals to develop MAchine Learning for Immunological Diagnosis (Mal-ID), an interpretive framework to screen for multiple illnesses simultaneously or precisely test for one condition. This approach detects specific infections, autoimmune disorders, vaccine responses, and disease severity differences. Human-interpretable features of the model recapitulate known immune responses to SARS-CoV-2, Influenza, and HIV, highlight antigen-specific receptors, and reveal distinct characteristics of Systemic Lupus Erythematosus and Type-1 Diabetes autoreactivity. This analysis framework has broad potential for scientific and clinical interpretation of human immune responses.
10
Ji H, Wang XX, Zhang Q, Zhang C, Zhang HM. Predicting TCR sequences for unseen antigen epitopes using structural and sequence features. Brief Bioinform 2024; 25:bbae210. [PMID: 38711371 PMCID: PMC11074592 DOI: 10.1093/bib/bbae210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Revised: 04/04/2024] [Accepted: 04/22/2024] [Indexed: 05/08/2024] Open
Abstract
T-cell receptor (TCR) recognition of antigens is fundamental to the adaptive immune response. With the expansion of experimental techniques, a substantial database of matched TCR-antigen pairs has emerged, presenting opportunities for computational prediction models. However, accurately forecasting the binding affinities of unseen antigen-TCR pairs remains a major challenge. Here, we present convolutional-self-attention TCR (CATCR), a novel framework tailored to enhance the prediction of epitope and TCR interactions. Our approach utilizes convolutional neural networks to extract peptide features from residue contact matrices, as generated by OpenFold, and a transformer to encode segment-based coded sequences. We introduce CATCR-D, a discriminator that can assess binding by analyzing the structural and sequence features of epitopes and CDR3-β regions. Additionally, the framework comprises CATCR-G, a generative module designed for CDR3-β sequences, which applies the pretrained encoder to deduce epitope characteristics and a transformer decoder for predicting matching CDR3-β sequences. CATCR-D achieved an AUROC of 0.89 on previously unseen epitope-TCR pairs and outperformed four benchmark models by a margin of 17.4%. CATCR-G has demonstrated high precision, recall and F1 scores, surpassing 95% in bidirectional encoder representations from transformers score assessments. Our results indicate that CATCR is an effective tool for predicting unseen epitope-TCR interactions. Incorporating structural insights enhances our understanding of the general rules governing TCR-epitope recognition significantly. The ability to predict TCRs for novel epitopes using structural and sequence information is promising, and broadening the repository of experimental TCR-epitope data could further improve the precision of epitope-TCR binding predictions.
MESH Headings
- Receptors, Antigen, T-Cell/chemistry
- Receptors, Antigen, T-Cell/immunology
- Receptors, Antigen, T-Cell/metabolism
- Receptors, Antigen, T-Cell/genetics
- Humans
- Epitopes/chemistry
- Epitopes/immunology
- Computational Biology/methods
- Neural Networks, Computer
- Epitopes, T-Lymphocyte/immunology
- Epitopes, T-Lymphocyte/chemistry
- Antigens/chemistry
- Antigens/immunology
- Amino Acid Sequence
Affiliation(s)
- Hongchen Ji
  - Department of Oncology of Xijing Hospital, Air Force Medical University, Xi’an, Shaanxi, China
- Xiang-Xu Wang
  - Department of Oncology of Xijing Hospital, Air Force Medical University, Xi’an, Shaanxi, China
- Qiong Zhang
  - Department of Oncology of Xijing Hospital, Air Force Medical University, Xi’an, Shaanxi, China
- Chengkai Zhang
  - Department of Oncology of Xijing Hospital, Air Force Medical University, Xi’an, Shaanxi, China
- Hong-Mei Zhang
  - Department of Oncology of Xijing Hospital, Air Force Medical University, Xi’an, Shaanxi, China
11
Lee H, Shin K, Lee Y, Lee S, Lee S, Lee E, Kim SW, Shin HY, Kim JH, Chung J, Kwon S. Identification of B cell subsets based on antigen receptor sequences using deep learning. Front Immunol 2024; 15:1342285. [PMID: 38576618 PMCID: PMC10991714 DOI: 10.3389/fimmu.2024.1342285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Accepted: 03/07/2024] [Indexed: 04/06/2024] Open
Abstract
B cell receptors (BCRs) denote antigen specificity, while corresponding cell subsets indicate B cell functionality. Since each B cell uniquely encodes this combination, physical isolation and subsequent processing of individual B cells become indispensable to identify both attributes. However, this approach accompanies high costs and inevitable information loss, hindering high-throughput investigation of B cell populations. Here, we present BCR-SORT, a deep learning model that predicts cell subsets from their corresponding BCR sequences by leveraging B cell activation and maturation signatures encoded within BCR sequences. Subsequently, BCR-SORT is demonstrated to improve reconstruction of BCR phylogenetic trees, and reproduce results consistent with those verified using physical isolation-based methods or prior knowledge. Notably, when applied to BCR sequences from COVID-19 vaccine recipients, it revealed inter-individual heterogeneity of evolutionary trajectories towards Omicron-binding memory B cells. Overall, BCR-SORT offers great potential to improve our understanding of B cell responses.
Collapse
Affiliation(s)
- Hyunho Lee
- Department of Electrical and Computer Engineering, Seoul National University, Seoul, Republic of Korea
| | - Kyoungseob Shin
- Department of Electrical and Computer Engineering, Seoul National University, Seoul, Republic of Korea
| | - Yongju Lee
- Department of Electrical and Computer Engineering, Seoul National University, Seoul, Republic of Korea
| | - Soobin Lee
- Department of Electrical and Computer Engineering, Seoul National University, Seoul, Republic of Korea
| | - Seungyoun Lee
- Department of Biochemistry and Molecular Biology, Seoul National University College of Medicine, Seoul, Republic of Korea
- Department of Biomedical Science, Seoul National University College of Medicine, Seoul, Republic of Korea
| | - Eunjae Lee
- Department of Biochemistry and Molecular Biology, Seoul National University College of Medicine, Seoul, Republic of Korea
- Department of Biomedical Science, Seoul National University College of Medicine, Seoul, Republic of Korea
| | - Seung Woo Kim
- Department of Neurology, Yonsei University College of Medicine, Seoul, Republic of Korea
| | - Ha Young Shin
- Department of Neurology, Yonsei University College of Medicine, Seoul, Republic of Korea
| | - Jong Hoon Kim
- Department of Dermatology and Cutaneous Biology Research Institute, Gangnam Severance Hospital, Yonsei University College of Medicine, Seoul, Republic of Korea
| | - Junho Chung
- Department of Biochemistry and Molecular Biology, Seoul National University College of Medicine, Seoul, Republic of Korea
- Department of Biomedical Science, Seoul National University College of Medicine, Seoul, Republic of Korea
- Cancer Research Institute, Seoul National University College of Medicine, Seoul, Republic of Korea
| | - Sunghoon Kwon
- Department of Electrical and Computer Engineering, Seoul National University, Seoul, Republic of Korea
- Interdisciplinary Program in Bioengineering, Seoul National University, Seoul, Republic of Korea
- Bio-MAX Institute, Seoul National University, Seoul, Republic of Korea
- Inter-University Semiconductor Research Center, Seoul National University, Seoul, Republic of Korea
|
12
|
Pothuri VS, Hogg GD, Conant L, Borcherding N, James CA, Mudd J, Williams G, Seo YD, Hawkins WG, Pillarisetty VG, DeNardo DG, Fields RC. Intratumoral T-cell receptor repertoire composition predicts overall survival in patients with pancreatic ductal adenocarcinoma. Oncoimmunology 2024; 13:2320411. [PMID: 38504847 PMCID: PMC10950267 DOI: 10.1080/2162402x.2024.2320411] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Accepted: 02/14/2024] [Indexed: 03/21/2024] Open
Abstract
Pancreatic ductal adenocarcinoma (PDAC) is a lethal malignancy that is refractory to immune checkpoint inhibitor therapy. However, intratumoral T-cell infiltration correlates with improved overall survival (OS). Herein, we characterized the diversity and antigen specificity of the PDAC T-cell receptor (TCR) repertoire to identify novel immune-relevant biomarkers. Demographic, clinical, and TCR-beta sequencing data were collated from 353 patients across three cohorts that underwent surgical resection for PDAC. TCR diversity was calculated using the Shannon-Wiener index, the Inverse Simpson index, and "True entropy." Patients were clustered by shared repertoire specificity. TCRs predictive of OS were identified and their associated transcriptional states were characterized by single-cell RNAseq. In multivariate Cox regression models controlling for relevant covariates, high intratumoral TCR diversity predicted OS across multiple cohorts. Conversely, in peripheral blood, high abundance of T-cells, but not high diversity, predicted OS. Clustering patients based on TCR specificity revealed a subset of TCRs that predicts OS. Interestingly, these TCR sequences were more likely to encode CD8+ effector memory and CD4+ regulatory T-cells (Tregs), all with the capacity to recognize beta islet-derived autoantigens. As opposed to T-cell abundance, intratumoral TCR diversity was predictive of OS in multiple PDAC cohorts, and a subset of TCRs enriched in high-diversity patients independently correlated with OS. These findings emphasize the importance of evaluating peripheral and intratumoral TCR repertoires as distinct and relevant biomarkers in PDAC.
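For readers unfamiliar with the diversity measures named in this abstract, the following is a minimal sketch of the standard textbook definitions of the Shannon-Wiener index and the Inverse Simpson index, computed from clonotype frequencies. The function names and the toy clonotype labels are illustrative assumptions and are not taken from the paper; the abstract does not define "True entropy", so it is omitted here.

```python
import math
from collections import Counter


def clone_frequencies(clonotypes):
    """Convert a list of clonotype labels (one per read or cell) to relative frequencies."""
    counts = Counter(clonotypes)
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def shannon_wiener(freqs):
    """Shannon-Wiener index H = -sum(p * ln p); higher means a more diverse repertoire."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


def inverse_simpson(freqs):
    """Inverse Simpson index 1 / sum(p^2); equals the clone count when all clones are equally frequent."""
    return 1.0 / sum(p * p for p in freqs)


if __name__ == "__main__":
    # Hypothetical toy repertoire: one expanded clone and two singletons.
    repertoire = ["clone_A", "clone_A", "clone_B", "clone_C"]
    freqs = clone_frequencies(repertoire)
    print("Shannon-Wiener:", shannon_wiener(freqs))
    print("Inverse Simpson:", inverse_simpson(freqs))
```

Both indices depend only on the frequency distribution, so they are sensitive to sequencing depth; how the study normalized for depth is not stated in the abstract.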
Affiliation(s)
- Vikram S. Pothuri
- Department of Surgery, Washington University School of Medicine, St. Louis, MO, USA
| | - Graham D. Hogg
- Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
| | - Leah Conant
- Department of Surgery, Washington University School of Medicine, St. Louis, MO, USA
| | - Nicholas Borcherding
- Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
| | - C. Alston James
- Department of Surgery, Washington University School of Medicine, St. Louis, MO, USA
| | - Jacqueline Mudd
- Department of Surgery, Washington University School of Medicine, St. Louis, MO, USA
| | - Greg Williams
- Department of Surgery, Washington University School of Medicine, St. Louis, MO, USA
| | - Yongwoo David Seo
- Department of Surgery, University of Washington School of Medicine, Seattle, WA, USA
- Department of Surgical Oncology, MD Anderson Cancer Center, Houston, TX, USA
| | - William G. Hawkins
- Department of Surgery, Washington University School of Medicine, St. Louis, MO, USA
- Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO, USA
| | - Venu G. Pillarisetty
- Department of Surgery, University of Washington School of Medicine, Seattle, WA, USA
- Fred Hutchinson Cancer Center, Seattle, WA, USA
| | - David G. DeNardo
- Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO, USA
| | - Ryan C. Fields
- Department of Surgery, Washington University School of Medicine, St. Louis, MO, USA
- Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO, USA
|
13
|
Qian X, Yang G, Li F, Zhang X, Zhu X, Lai X, Xiao X, Wang T, Wang J. DeepLION2: deep multi-instance contrastive learning framework enhancing the prediction of cancer-associated T cell receptors by attention strategy on motifs. Front Immunol 2024; 15:1345586. [PMID: 38515756 PMCID: PMC10956474 DOI: 10.3389/fimmu.2024.1345586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 02/19/2024] [Indexed: 03/23/2024] Open
Abstract
Introduction: T cell receptor (TCR) repertoires provide valuable insights into complex human diseases, including cancers. Recent advancements in immune sequencing technology have significantly improved our understanding of TCR repertoire. Some computational methods have been devised to identify cancer-associated TCRs and enable cancer detection using TCR sequencing data. However, the existing methods are often limited by their inadequate consideration of the correlations among TCRs within a repertoire, hindering the identification of crucial TCRs. Additionally, the sparsity of cancer-associated TCR distribution presents a challenge in accurate prediction. Methods: To address these issues, we presented DeepLION2, an innovative deep multi-instance contrastive learning framework specifically designed to enhance cancer-associated TCR prediction. DeepLION2 leveraged content-based sparse self-attention, focusing on the top k related TCRs for each TCR, to effectively model inter-TCR correlations. Furthermore, it adopted a contrastive learning strategy for bootstrapping parameter updates of the attention matrix, preventing the model from fixating on non-cancer-associated TCRs. Results: Extensive experimentation on diverse patient cohorts, encompassing over ten cancer types, demonstrated that DeepLION2 significantly outperformed current state-of-the-art methods in terms of accuracy, sensitivity, specificity, Matthews correlation coefficient, and area under the curve (AUC). Notably, DeepLION2 achieved impressive AUC values of 0.933, 0.880, and 0.763 on thyroid, lung, and gastrointestinal cancer cohorts, respectively. Furthermore, it effectively identified cancer-associated TCRs along with their key motifs, highlighting the amino acids that play a crucial role in TCR-peptide binding. Conclusion: These compelling results underscore DeepLION2's potential for enhancing cancer detection and facilitating personalized cancer immunotherapy. DeepLION2 is publicly available on GitHub, at https://github.com/Bioinformatics7181/DeepLION2, for academic use only.
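The "top k" sparse self-attention idea mentioned in this abstract can be illustrated generically. The sketch below is not DeepLION2's implementation: it assumes identity query/key/value projections, scaled dot-product scores, and a softmax restricted to the k highest-scoring items per row, and the function name `topk_sparse_self_attention` is hypothetical.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def topk_sparse_self_attention(embeddings, k):
    """For each item, attend only to the k items with the highest scaled dot-product score.

    embeddings: list of equal-length float vectors (e.g. one vector per TCR).
    Returns one output vector per input item.
    """
    n = len(embeddings)
    d = len(embeddings[0])
    k = min(k, n)
    outputs = []
    for query in embeddings:
        # Scaled dot-product scores against every item (including the query itself).
        scores = [_dot(query, key) / math.sqrt(d) for key in embeddings]
        # Keep only the indices of the k largest scores; all others get zero weight.
        keep = sorted(range(n), key=lambda j: scores[j], reverse=True)[:k]
        # Numerically stable softmax over the kept scores only.
        top = max(scores[j] for j in keep)
        weights = {j: math.exp(scores[j] - top) for j in keep}
        norm = sum(weights.values())
        outputs.append(
            [sum(weights[j] / norm * embeddings[j][c] for j in keep) for c in range(d)]
        )
    return outputs


if __name__ == "__main__":
    # Hypothetical 2-dimensional embeddings for three TCRs.
    toy = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
    print(topk_sparse_self_attention(toy, k=2))
```

Restricting the softmax to k neighbours keeps each item's representation from being diluted by the many unrelated items in a large repertoire, which is the motivation the abstract gives for sparsity.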
Affiliation(s)
- Xinyang Qian
- School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
- Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
| | - Guang Yang
- Department of Clinical Oncology, The Second Affiliated Hospital of Air Force Medical University, Xi’an, China
| | - Fan Li
- School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
- Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
| | - Xuanping Zhang
- School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
- Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
| | - Xiaoyan Zhu
- School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
- Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
| | - Xin Lai
- School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
- Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
| | - Xiao Xiao
- Genomics Institute, Geneplus-Shenzhen, Shenzhen, China
| | - Tao Wang
- Department of Thoracic Surgery, The Second Affiliated Hospital of Air Force Medical University, Xi’an, China
| | - Jiayin Wang
- School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
- Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
|
14
|
Chen S, McMiller TL, Soni A, Succaria F, Sidhom JW, Cappelli LC, Casciola-Rosen LA, Morales IR, Sankaran P, Berger AE, Deutsch JS, Zhu QC, Anders RA, Hooper JE, Pardoll DM, Lipson EJ, Taube JM, Topalian SL. Comparing anti-tumor and anti-self immunity in a patient with melanoma receiving immune checkpoint blockade. J Transl Med 2024; 22:241. [PMID: 38443917 PMCID: PMC10916264 DOI: 10.1186/s12967-024-04973-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Accepted: 02/09/2024] [Indexed: 03/07/2024] Open
Abstract
BACKGROUND: Tumor regression following immune checkpoint blockade (ICB) is often associated with immune-related adverse events (irAEs), marked by inflammation in non-cancerous tissues. This study was undertaken to investigate the functional relationship between anti-tumor and anti-self immunity, to facilitate irAE management while promoting anti-tumor immunity. METHODS: Multiple biopsies from tumor and inflamed tissues were collected from a patient with melanoma experiencing both tumor regression and irAEs on ICB, who underwent rapid autopsy. Immune cells infiltrating melanoma lesions and inflamed normal tissues were subjected to gene expression profiling with multiplex qRT-PCR for 122 candidate genes. Subsequently, immunohistochemistry was conducted to assess the expression of 14 candidate markers of immune cell subsets and checkpoints. TCR-beta sequencing was used to explore T cell clonal repertoires across specimens. RESULTS: While genes involved in MHC I/II antigen presentation, IFN signaling, innate immunity and immunosuppression were abundantly expressed across specimens, irAE tissues over-expressed certain genes associated with immunosuppression (CSF1R, IL10RA, IL27/EBI3, FOXP3, KLRG1, SOCS1, TGFB1), including those in the COX-2/PGE2 pathway (IL1B, PTGER1/EP1 and PTGER4/EP4). Immunohistochemistry revealed similar proportions of immunosuppressive cell subsets and checkpoint molecules across samples. TCRseq did not indicate common TCR repertoires across tumor and inflammation sites, arguing against shared antigen recognition between anti-tumor and anti-self immunity in this patient. CONCLUSIONS: This comprehensive study of a single patient with melanoma experiencing both tumor regression and irAEs on ICB explores the immune landscape across these tissues, revealing similarities between anti-tumor and anti-self immunity. Further, it highlights expression of the COX-2/PGE2 pathway, which is known to be immunosuppressive and potentially mediates ICB resistance. Ongoing clinical trials of COX-2/PGE2 pathway inhibitors targeting the major COX-2 inducer IL-1B, COX-2 itself, or the PGE2 receptors EP2 and EP4 present new opportunities to promote anti-tumor activity, but may also have the potential to enhance the severity of ICB-induced irAEs.
Affiliation(s)
- Shuming Chen
- Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
- Bloomberg~Kimmel Institute for Cancer Immunotherapy, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
| | - Tracee L McMiller
- Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
- Bloomberg~Kimmel Institute for Cancer Immunotherapy, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
| | - Abha Soni
- Department of Dermatology, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
- Contra Costa Pathology Associates, Pleasant Hill, CA, USA
| | - Farah Succaria
- Department of Dermatology, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
| | - John-William Sidhom
- Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
- Bloomberg~Kimmel Institute for Cancer Immunotherapy, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
- Mount Sinai School of Medicine, New York, NY, USA
| | - Laura C Cappelli
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
| | - Livia A Casciola-Rosen
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
| | - Isaac R Morales
- Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
- Bloomberg~Kimmel Institute for Cancer Immunotherapy, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
| | - Preethi Sankaran
- Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
- Bloomberg~Kimmel Institute for Cancer Immunotherapy, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
- Crossbow Therapeutics, Cambridge, MA, USA
| | - Alan E Berger
- Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
- Bloomberg~Kimmel Institute for Cancer Immunotherapy, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
| | - Julie Stein Deutsch
- Department of Dermatology, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
- Bloomberg~Kimmel Institute for Cancer Immunotherapy, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
| | - Qingfeng C Zhu
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
- Bloomberg~Kimmel Institute for Cancer Immunotherapy, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
| | - Robert A Anders
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
- Bloomberg~Kimmel Institute for Cancer Immunotherapy, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
| | - Jody E Hooper
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
- Stanford University School of Medicine, Palo Alto, CA, USA
| | - Drew M Pardoll
- Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
- Bloomberg~Kimmel Institute for Cancer Immunotherapy, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
| | - Evan J Lipson
- Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
- Bloomberg~Kimmel Institute for Cancer Immunotherapy, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
| | - Janis M Taube
- Department of Dermatology, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
- Bloomberg~Kimmel Institute for Cancer Immunotherapy, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
| | - Suzanne L Topalian
- Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA.
- Bloomberg~Kimmel Institute for Cancer Immunotherapy, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA.
|
15
|
Jensen MF, Nielsen M. Enhancing TCR specificity predictions by combined pan- and peptide-specific training, loss-scaling, and sequence similarity integration. eLife 2024; 12:RP93934. [PMID: 38437160 PMCID: PMC10942633 DOI: 10.7554/elife.93934] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2024] Open
Abstract
Predicting the interaction between Major Histocompatibility Complex (MHC) class I-presented peptides and T-cell receptors (TCR) holds significant implications for vaccine development, cancer treatment, and autoimmune disease therapies. However, limited paired-chain TCR data, skewed towards well-studied epitopes, hampers the development of pan-specific machine-learning (ML) models. Leveraging a larger peptide-TCR dataset, we explore various alterations to the ML architectures and training strategies to address data imbalance. This leads to an overall improved performance, particularly for peptides with scant TCR data. However, challenges persist for unseen peptides, especially those distant from training examples. We demonstrate that such ML models can be used to detect potential outliers, whose removal from training leads to improved performance. Integrating pan-specific and peptide-specific models with similarity-based predictions further improves the overall performance, especially when a low false positive rate is desirable. In the context of the IMMREP22 benchmark, this modeling framework attained state-of-the-art performance. Moreover, combining these strategies results in acceptable predictive accuracy for peptides characterized with as few as 15 positive TCRs. This observation holds great promise for rapidly expanding the peptide coverage of the current models for predicting TCR specificity. The NetTCR 2.2 model incorporating these advances is available on GitHub (https://github.com/mnielLab/NetTCR-2.2) and as a web server at https://services.healthtech.dtu.dk/services/NetTCR-2.2/.
Affiliation(s)
- Mathias Fynbo Jensen
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Lyngby, Denmark
| | - Morten Nielsen
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Lyngby, Denmark
|
16
|
Tayebi Z, Ali S, Murad T, Khan I, Patterson M. PseAAC2Vec protein encoding for TCR protein sequence classification. Comput Biol Med 2024; 170:107956. [PMID: 38217977 DOI: 10.1016/j.compbiomed.2024.107956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 12/07/2023] [Accepted: 01/01/2024] [Indexed: 01/15/2024]
Abstract
The classification and prediction of T-cell receptor (TCR) protein sequences are of significant interest in understanding the immune system and developing personalized immunotherapies. In this study, we propose a novel approach using Pseudo Amino Acid Composition (PseAAC) protein encoding for accurate TCR protein sequence classification. The PseAAC2Vec encoding method captures the physicochemical properties of amino acids and their local sequence information, enabling the representation of protein sequences as fixed-length feature vectors. By incorporating physicochemical properties such as hydrophobicity, polarity, charge, molecular weight, and solvent accessibility, PseAAC2Vec provides a comprehensive and informative characterization of TCR protein sequences. To evaluate the effectiveness of the proposed PseAAC2Vec encoding approach, we assembled a large dataset of TCR protein sequences with annotated classes. We applied the PseAAC2Vec encoding scheme to each sequence and generated feature vectors based on a specified window size. Subsequently, we employed state-of-the-art machine learning algorithms, such as support vector machines (SVM) and random forests (RF), to classify the TCR protein sequences. Experimental results on the benchmark dataset demonstrated the superior performance of the PseAAC2Vec-based approach compared to existing methods. The PseAAC2Vec encoding effectively captures the discriminative patterns in TCR protein sequences, leading to improved classification accuracy and robustness. Furthermore, the encoding scheme showed promising results across different window sizes, indicating its adaptability to varying sequence contexts.
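As background for the fixed-length encoding described in this abstract, here is a minimal sketch of the classical pseudo amino acid composition idea: 20 composition terms plus a few sequence-order correlation terms. It is a simplification and not the PseAAC2Vec method itself. It assumes a single property (the Kyte-Doolittle hydropathy scale), and the parameter names `lam` (number of correlation tiers) and `w` (weight of the correlation terms) are illustrative; how they relate to the paper's "window size" is not stated in the abstract.

```python
import math

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = sorted(KYTE_DOOLITTLE)


def _standardize(scale):
    """Rescale property values to zero mean and unit standard deviation."""
    values = list(scale.values())
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return {aa: (v - mean) / std for aa, v in scale.items()}


def pseaac(sequence, lam=3, w=0.05):
    """Return a (20 + lam)-dimensional vector that sums to 1.

    Raises ValueError if the sequence is not longer than lam, and KeyError
    if it contains a residue outside the 20 standard amino acids.
    """
    if len(sequence) <= lam:
        raise ValueError("sequence must be longer than lam")
    prop = _standardize(KYTE_DOOLITTLE)
    # Plain amino acid composition (order-independent part).
    composition = [sequence.count(aa) / len(sequence) for aa in AMINO_ACIDS]
    # Tier-j correlation: mean squared property difference between residues j apart.
    thetas = []
    for j in range(1, lam + 1):
        pairs = len(sequence) - j
        thetas.append(
            sum((prop[sequence[i]] - prop[sequence[i + j]]) ** 2 for i in range(pairs)) / pairs
        )
    denom = sum(composition) + w * sum(thetas)
    return [c / denom for c in composition] + [w * t / denom for t in thetas]


if __name__ == "__main__":
    # Hypothetical CDR3-like sequence used only as a toy input.
    vector = pseaac("CASSLGQAYEQYF", lam=3)
    print(len(vector))
```

Because the output length depends only on `lam` and not on the sequence length, vectors from sequences of different lengths can be fed directly to classifiers such as the SVM and random forest models the abstract mentions.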
Affiliation(s)
- Zahra Tayebi
- Department of Computer Science, Georgia State University, Atlanta, 30303, GA, USA.
| | - Sarwan Ali
- Department of Computer Science, Georgia State University, Atlanta, 30303, GA, USA.
| | - Taslim Murad
- Department of Computer Science, Georgia State University, Atlanta, 30303, GA, USA.
| | - Imdadullah Khan
- Department of Computer Science, Lahore University of Management Sciences, Lahore, Punjab, Pakistan.
| | - Murray Patterson
- Department of Computer Science, Georgia State University, Atlanta, 30303, GA, USA.
|
17
|
Xiong P, Liang A, Cai X, Xia T. APTAnet: an atom-level peptide-TCR interaction affinity prediction model. Biophysics Reports 2024; 10:1-14. [PMID: 38737473 PMCID: PMC11079603 DOI: 10.52601/bpr.2023.230037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 01/26/2024] [Indexed: 05/14/2024] Open
Abstract
The prediction of affinity between TCRs and peptides is crucial for the further development of TIL (Tumor-Infiltrating Lymphocytes) immunotherapy. Inspired by the broader research of drug-protein interaction (DPI), we propose an atom-level peptide-TCR interaction (PTI) affinity prediction model, APTAnet, using natural language processing methods. The APTAnet model achieved an average ROC-AUC and PR-AUC of 0.893 and 0.877, respectively, in ten-fold cross-validation on 25,675 pairs of PTI data. Furthermore, experimental results on an independent test set from the McPAS database showed that APTAnet outperformed the current mainstream models. Finally, through validation on 11 cases of real tumor patient data, we found that the APTAnet model can effectively identify tumor peptides and screen tumor-specific TCRs.
Affiliation(s)
- Peng Xiong
- School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Anyi Liang
- School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Xunhui Cai
- Institute of Pathology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
| | - Tian Xia
- School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China
- Institute of Pathology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
|
18
|
Akama-Garren EH, Yin X, Prestwood TR, Ma M, Utz PJ, Carroll MC. T cell help shapes B cell tolerance. Sci Immunol 2024; 9:eadj7029. [PMID: 38363829 DOI: 10.1126/sciimmunol.adj7029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 12/29/2023] [Indexed: 02/18/2024]
Abstract
T cell help is a crucial component of the normal humoral immune response, yet whether it promotes or restrains autoreactive B cell responses remains unclear. Here, we observe that autoreactive germinal centers require T cell help for their formation and persistence. Using retrogenic chimeras transduced with candidate TCRs, we demonstrate that a follicular T cell repertoire restricted to a single autoreactive TCR, but not a foreign antigen-specific TCR, is sufficient to initiate autoreactive germinal centers. Follicular T cell specificity influences the breadth of epitope spreading by regulating wild-type B cell entry into autoreactive germinal centers. These results demonstrate that TCR-dependent T cell help can promote loss of B cell tolerance and that epitope spreading is determined by TCR specificity.
Affiliation(s)
- Elliot H Akama-Garren
- Program in Cellular and Molecular Medicine, Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, USA
- Harvard-MIT Health Sciences and Technology, Harvard Medical School, Boston, MA 02115, USA
| | - Xihui Yin
- Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Tyler R Prestwood
- Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Minghe Ma
- Program in Cellular and Molecular Medicine, Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Paul J Utz
- Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Michael C Carroll
- Program in Cellular and Molecular Medicine, Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, USA
|
19
|
Bravi B. Development and use of machine learning algorithms in vaccine target selection. NPJ Vaccines 2024; 9:15. [PMID: 38242890 PMCID: PMC10798987 DOI: 10.1038/s41541-023-00795-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Accepted: 12/07/2023] [Indexed: 01/21/2024] Open
Abstract
Computer-aided discovery of vaccine targets has become a cornerstone of rational vaccine design. In this article, I discuss how Machine Learning (ML) can inform and guide key computational steps in rational vaccine design concerned with the identification of B and T cell epitopes and correlates of protection. I provide examples of ML models, as well as types of data and predictions for which they are built. I argue that interpretable ML has the potential to improve the identification of immunogens also as a tool for scientific discovery, by helping elucidate the molecular processes underlying vaccine-induced immune responses. I outline the limitations and challenges in terms of data availability and method development that need to be addressed to bridge the gap between advances in ML predictions and their translational application to vaccine design.
Affiliation(s)
- Barbara Bravi
- Department of Mathematics, Imperial College London, London, SW7 2AZ, UK.
|
20
|
Li X, You J, Hong L, Liu W, Guo P, Hao X. Neoantigen cancer vaccines: a new star on the horizon. Cancer Biol Med 2023; 21:j.issn.2095-3941.2023.0395. [PMID: 38164734 PMCID: PMC11033713 DOI: 10.20892/j.issn.2095-3941.2023.0395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Accepted: 11/22/2023] [Indexed: 01/03/2024] Open
Abstract
Immunotherapy represents a promising strategy for cancer treatment that utilizes immune cells or drugs to activate the patient's own immune system and eliminate cancer cells. One of the most exciting advances within this field is the targeting of neoantigens, which are peptides derived from non-synonymous somatic mutations that are found exclusively within cancer cells and absent in normal cells. Although neoantigen-based therapeutic vaccines have not received approval for standard cancer treatment, early clinical trials have yielded encouraging outcomes as standalone monotherapy or when combined with checkpoint inhibitors. Progress made in high-throughput sequencing and bioinformatics has greatly facilitated the precise and efficient identification of neoantigens. Consequently, personalized neoantigen-based vaccines tailored to each patient have been developed that are capable of eliciting a robust and long-lasting immune response which effectively eliminates tumors and prevents recurrences. This review provides a concise overview consolidating the latest clinical advances in neoantigen-based therapeutic vaccines, and also discusses challenges and future perspectives for this innovative approach, particularly emphasizing the potential of neoantigen-based therapeutic vaccines to enhance clinical efficacy against advanced solid tumors.
Affiliation(s)
- Xiaoling Li
- Cell Biotechnology Laboratory, Tianjin Cancer Hospital Airport Hospital, Tianjin 300308, China
- National Clinical Research Center for Cancer, Tianjin 300060, China
- Haihe Laboratory of Synthetic Biology, Tianjin 300090, China
| | - Jian You
- Department of Thoracic Oncology, Tianjin Cancer Hospital Airport Hospital, Tianjin 300308, China
- Department of Thoracic Oncology Surgery, Tianjin Medical University Cancer Institute & Hospital, Tianjin 300060, China
| | - Liping Hong
- Cell Biotechnology Laboratory, Tianjin Cancer Hospital Airport Hospital, Tianjin 300308, China
- National Clinical Research Center for Cancer, Tianjin 300060, China
- Haihe Laboratory of Synthetic Biology, Tianjin 300090, China
| | - Weijiang Liu
- Cell Biotechnology Laboratory, Tianjin Cancer Hospital Airport Hospital, Tianjin 300308, China
- National Clinical Research Center for Cancer, Tianjin 300060, China
- Haihe Laboratory of Synthetic Biology, Tianjin 300090, China
| | - Peng Guo
- Cell Biotechnology Laboratory, Tianjin Cancer Hospital Airport Hospital, Tianjin 300308, China
- National Clinical Research Center for Cancer, Tianjin 300060, China
- Haihe Laboratory of Synthetic Biology, Tianjin 300090, China
| | - Xishan Hao
- Cell Biotechnology Laboratory, Tianjin Cancer Hospital Airport Hospital, Tianjin 300308, China
- National Clinical Research Center for Cancer, Tianjin 300060, China
- Haihe Laboratory of Synthetic Biology, Tianjin 300090, China
- Tianjin Medical University Cancer Institute & Hospital, Tianjin 300060, China
|
21
|
Sidiropoulos DN, Ho WJ, Jaffee EM, Kagohara LT, Fertig EJ. Systems immunology spanning tumors, lymph nodes, and periphery. Cell Reports Methods 2023; 3:100670. [PMID: 38086385 PMCID: PMC10753389 DOI: 10.1016/j.crmeth.2023.100670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2023] [Revised: 10/20/2023] [Accepted: 11/17/2023] [Indexed: 12/21/2023]
Abstract
The immune system defines a complex network of tissues and cell types that orchestrate responses across the body in a dynamic manner. The local and systemic interactions between immune and cancer cells contribute to disease progression. Lymphocytes are activated in lymph nodes, traffic through the periphery, and impact cancer progression through their interactions with tumor cells. As a result, therapeutic response and resistance are mediated across tissues, and a comprehensive understanding of lymphocyte dynamics requires a systems-level approach. In this review, we highlight experimental and computational methods that can leverage the study of leukocyte trafficking through an immunomics lens and reveal how adaptive immunity shapes cancer.
Affiliation(s)
- Dimitrios N Sidiropoulos
- Johns Hopkins University School of Medicine, Baltimore, MD, USA; Johns Hopkins Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, USA; Johns Hopkins Bloomberg Kimmel Institute for Immunotherapy, Johns Hopkins Medicine, Baltimore, MD, USA; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins Medicine, Baltimore, MD, USA
| | - Won Jin Ho
- Johns Hopkins Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, USA; Johns Hopkins Bloomberg Kimmel Institute for Immunotherapy, Johns Hopkins Medicine, Baltimore, MD, USA; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins Medicine, Baltimore, MD, USA
| | - Elizabeth M Jaffee
- Johns Hopkins Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, USA; Johns Hopkins Bloomberg Kimmel Institute for Immunotherapy, Johns Hopkins Medicine, Baltimore, MD, USA; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins Medicine, Baltimore, MD, USA
| | - Luciane T Kagohara
- Johns Hopkins Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, USA; Johns Hopkins Bloomberg Kimmel Institute for Immunotherapy, Johns Hopkins Medicine, Baltimore, MD, USA; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins Medicine, Baltimore, MD, USA.
| | - Elana J Fertig
- Johns Hopkins Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, USA; Johns Hopkins Bloomberg Kimmel Institute for Immunotherapy, Johns Hopkins Medicine, Baltimore, MD, USA; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins Medicine, Baltimore, MD, USA; Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD, USA; Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
| |
|
22
|
Koyama K, Hashimoto K, Nagao C, Mizuguchi K. Attention network for predicting T-cell receptor-peptide binding can associate attention with interpretable protein structural properties. FRONTIERS IN BIOINFORMATICS 2023; 3:1274599. [PMID: 38170146 PMCID: PMC10759225 DOI: 10.3389/fbinf.2023.1274599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2023] [Accepted: 11/27/2023] [Indexed: 01/05/2024] Open
Abstract
Understanding how a T-cell receptor (TCR) recognizes its specific ligand peptide is crucial for gaining an insight into biological functions and disease mechanisms. Despite its importance, experimentally determining TCR-peptide-major histocompatibility complex (TCR-pMHC) interactions is expensive and time-consuming. To address this challenge, computational methods have been proposed, but they are typically evaluated by internal retrospective validation only, and few researchers have incorporated and tested an attention layer from language models into structural information. Therefore, in this study, we developed a machine learning model based on a modified version of Transformer, a source-target attention neural network, to predict the TCR-pMHC interaction solely from the amino acid sequences of the TCR complementarity-determining region (CDR) 3 and the peptide. This model achieved competitive performance on a benchmark dataset of the TCR-pMHC interaction, as well as on a truly new external dataset. Additionally, by analyzing the results of binding predictions, we associated the neural network weights with protein structural properties. By classifying the residues into large- and small-attention groups, we identified statistically significant properties associated with the largely attended residues such as hydrogen bonds within CDR3. The dataset that we created and the ability of our model to provide an interpretable prediction of TCR-peptide binding should increase our knowledge about molecular recognition and pave the way for designing new therapeutics.
Affiliation(s)
- Kyohei Koyama
- Laboratory for Computational Biology, Institute for Protein Research, Osaka University, Osaka, Japan
- National Institutes of Biomedical Innovation, Health and Nutrition, Osaka, Japan
- Graduate School of Frontier Biosciences, Osaka University, Osaka, Japan
| | - Kosuke Hashimoto
- Laboratory for Computational Biology, Institute for Protein Research, Osaka University, Osaka, Japan
| | - Chioko Nagao
- Laboratory for Computational Biology, Institute for Protein Research, Osaka University, Osaka, Japan
| | - Kenji Mizuguchi
- Laboratory for Computational Biology, Institute for Protein Research, Osaka University, Osaka, Japan
- National Institutes of Biomedical Innovation, Health and Nutrition, Osaka, Japan
- Graduate School of Frontier Biosciences, Osaka University, Osaka, Japan
| |
|
23
|
Fan T, Zhang M, Yang J, Zhu Z, Cao W, Dong C. Therapeutic cancer vaccines: advancements, challenges, and prospects. Signal Transduct Target Ther 2023; 8:450. [PMID: 38086815 PMCID: PMC10716479 DOI: 10.1038/s41392-023-01674-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2023] [Revised: 09/08/2023] [Accepted: 09/19/2023] [Indexed: 12/18/2023] Open
Abstract
With the development and regulatory approval of immune checkpoint inhibitors and adoptive cell therapies, cancer immunotherapy has undergone a profound transformation over the past decades. Recently, therapeutic cancer vaccines have shown promise by eliciting de novo T cell responses targeting tumor antigens, including tumor-associated antigens and tumor-specific antigens. The objective was to amplify and diversify the intrinsic repertoire of tumor-specific T cells. However, the complete realization of these capabilities remains an ongoing pursuit. Therefore, we provide an overview of the current landscape of cancer vaccines in this review. The range of antigen selection, antigen delivery systems development, and the strategic nuances underlying effective antigen presentation have pioneered cancer vaccine design. Furthermore, this review addresses the current status of clinical trials and discusses their strategies, focusing on tumor-specific immunogenicity and anti-tumor efficacy assessment. However, current clinical attempts toward developing cancer vaccines have not yielded breakthrough clinical outcomes due to significant challenges, including tumor immune microenvironment suppression, optimal candidate identification, immune response evaluation, and vaccine manufacturing acceleration. Therefore, the field is poised to overcome hurdles and improve patient outcomes in the future by acknowledging these clinical complexities and persistently striving to surmount inherent constraints.
Affiliation(s)
- Ting Fan
- Department of Oncology, East Hospital Affiliated to Tongji University, Tongji University School of Medicine, Shanghai, China
| | - Mingna Zhang
- Postgraduate Training Base, Shanghai East Hospital, Jinzhou Medical University, Shanghai, 200120, China
| | - Jingxian Yang
- Department of Oncology, East Hospital Affiliated to Tongji University, Tongji University School of Medicine, Shanghai, China
| | - Zhounan Zhu
- Department of Oncology, East Hospital Affiliated to Tongji University, Tongji University School of Medicine, Shanghai, China
| | - Wanlu Cao
- Department of Oncology, East Hospital Affiliated to Tongji University, Tongji University School of Medicine, Shanghai, China.
| | - Chunyan Dong
- Department of Oncology, East Hospital Affiliated to Tongji University, Tongji University School of Medicine, Shanghai, China.
| |
|
24
|
Shah RK, Cygan E, Kozlik T, Colina A, Zamora AE. Utilizing immunogenomic approaches to prioritize targetable neoantigens for personalized cancer immunotherapy. Front Immunol 2023; 14:1301100. [PMID: 38149253 PMCID: PMC10749952 DOI: 10.3389/fimmu.2023.1301100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2023] [Accepted: 11/29/2023] [Indexed: 12/28/2023] Open
Abstract
Advancements in sequencing technologies and bioinformatics algorithms have expanded our ability to identify tumor-specific somatic mutation-derived antigens (neoantigens). While recent studies have shown neoantigens to be compelling targets for cancer immunotherapy due to their foreign nature and high immunogenicity, the need for increasingly accurate and cost-effective approaches to rapidly identify neoantigens remains a challenging task, but essential for successful cancer immunotherapy. Currently, gene expression analysis and algorithms for variant calling can be used to generate lists of mutational profiles across patients, but more care is needed to curate these lists and prioritize the candidate neoantigens most capable of inducing an immune response. A growing amount of evidence suggests that only a handful of somatic mutations predicted by mutational profiling approaches act as immunogenic neoantigens. Hence, unbiased screening of all candidate neoantigens predicted by Whole Genome Sequencing/Whole Exome Sequencing may be necessary to more comprehensively access the full spectrum of immunogenic neoepitopes. Once putative cancer neoantigens are identified, one of the largest bottlenecks in translating these neoantigens into actionable targets for cell-based therapies is identifying the cognate T cell receptors (TCRs) capable of recognizing these neoantigens. While many TCR-directed screening and validation assays have utilized bulk samples in the past, there has been a recent surge in the number of single-cell assays that provide a more granular understanding of the factors governing TCR-pMHC interactions. The goal of this review is to provide an overview of existing strategies to identify candidate neoantigens using genomics-based approaches and methods for assessing neoantigen immunogenicity. Additionally, applications, prospects, and limitations of some of the current single-cell technologies will be discussed. Finally, we will briefly summarize some of the recent models that have been used to predict TCR antigen specificity and analyze the TCR receptor repertoire.
Affiliation(s)
- Ravi K. Shah
- Department of Medicine, Medical College of Wisconsin, Milwaukee, WI, United States
| | - Erin Cygan
- Department of Microbiology and Immunology, Medical College of Wisconsin, Milwaukee, WI, United States
| | - Tanya Kozlik
- Department of Medicine, Medical College of Wisconsin, Milwaukee, WI, United States
| | - Alfredo Colina
- Department of Microbiology and Immunology, Medical College of Wisconsin, Milwaukee, WI, United States
| | - Anthony E. Zamora
- Department of Medicine, Medical College of Wisconsin, Milwaukee, WI, United States
- Department of Microbiology and Immunology, Medical College of Wisconsin, Milwaukee, WI, United States
| |
|
25
|
Zhao M, Xu SX, Yang Y, Yuan M. GGNpTCR: A Generative Graph Structure Neural Network for Predicting Immunogenic Peptides for T-cell Immune Response. J Chem Inf Model 2023; 63:7557-7567. [PMID: 37990917 DOI: 10.1021/acs.jcim.3c01293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2023]
Abstract
Identifying the interactions between T-cell receptors (TCRs) and human antigens is a crucial step in developing new vaccines, diagnostics, and immunotherapy. Current methods primarily focus on learning binding patterns from known TCR binding repertoires by using sequence information alone without considering the binding specificity of new antigens or exogenous peptides that have not appeared in the training set. Furthermore, the spatial structure of antigens plays a critical role in immune studies and immunotherapy, which should be addressed properly in the identification of interacting TCR-antigen pairs. In this study, we introduced a novel deep learning framework based on generative graph structures, GGNpTCR, for predicting interactions between TCR and peptides from sequence information. Results of real data analysis indicate that our model achieved excellent prediction for new antigens unseen in the training data set, making significant improvements compared to existing methods. We also applied the model to a large COVID-19 data set with no antigens in the training data set, and the improvement was also significant. Furthermore, through incorporation of additional supervised mechanisms, GGNpTCR demonstrated the ability to precisely forecast the locations of peptide-TCR interactions within 3D configurations. This enhancement substantially improved the model's interpretability. In summary, based on the performance on multiple data sets, GGNpTCR has made significant progress in terms of performance, universality, and interpretability.
Affiliation(s)
- Minghua Zhao
- Department of Statistics and Finance, University of Science and Technology of China, Hefei 230026, China
| | - Steven X Xu
- Genmab US, Inc., Princeton, New Jersey 08540, United States
| | - Yaning Yang
- Department of Statistics and Finance, University of Science and Technology of China, Hefei 230026, China
| | - Min Yuan
- School of Public Health Administration, Anhui Medical University, Hefei 230032, China
| |
|
26
|
Korpela D, Jokinen E, Dumitrescu A, Huuhtanen J, Mustjoki S, Lähdesmäki H. EPIC-TRACE: predicting TCR binding to unseen epitopes using attention and contextualized embeddings. Bioinformatics 2023; 39:btad743. [PMID: 38070156 PMCID: PMC10963061 DOI: 10.1093/bioinformatics/btad743] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2023] [Revised: 11/20/2023] [Accepted: 12/07/2023] [Indexed: 12/21/2023] Open
Abstract
MOTIVATION T cells play an essential role in the adaptive immune system to fight pathogens and cancer but may also give rise to autoimmune diseases. The recognition of a peptide-MHC (pMHC) complex by a T cell receptor (TCR) is required to elicit an immune response. Many machine learning models have been developed to predict the binding, but generalizing predictions to pMHCs outside the training data remains challenging. RESULTS We have developed a new machine learning model that utilizes information about the TCR from both α and β chains, epitope sequence, and MHC. Our method uses ProtBERT embeddings for the amino acid sequences of both chains and the epitope, as well as convolution and multi-head attention architectures. We show the importance of each input feature as well as the benefit of including epitopes with only a few TCRs to the training data. We evaluate our model on existing databases and show that it compares favorably against other state-of-the-art models. AVAILABILITY AND IMPLEMENTATION https://github.com/DaniTheOrange/EPIC-TRACE.
Affiliation(s)
- Dani Korpela
- Department of Computer Science, Aalto University, 02150 Espoo, Finland
| | - Emmi Jokinen
- Department of Computer Science, Aalto University, 02150 Espoo, Finland
- Translational Immunology Research Program, Department of Clinical Chemistry and Hematology, University of Helsinki, 00290 Helsinki, Finland
- Hematology Research Unit Helsinki, Helsinki University Hospital Comprehensive Cancer Center, 00290 Helsinki, Finland
| | | | - Jani Huuhtanen
- Translational Immunology Research Program, Department of Clinical Chemistry and Hematology, University of Helsinki, 00290 Helsinki, Finland
- Hematology Research Unit Helsinki, Helsinki University Hospital Comprehensive Cancer Center, 00290 Helsinki, Finland
| | - Satu Mustjoki
- Translational Immunology Research Program, Department of Clinical Chemistry and Hematology, University of Helsinki, 00290 Helsinki, Finland
- Hematology Research Unit Helsinki, Helsinki University Hospital Comprehensive Cancer Center, 00290 Helsinki, Finland
- iCAN Digital Precision Cancer Medicine Flagship, Helsinki, Finland
| | - Harri Lähdesmäki
- Department of Computer Science, Aalto University, 02150 Espoo, Finland
| |
|
27
|
Khan AR, Reinders MJT, Khatri I. Determining epitope specificity of T-cell receptors with transformers. Bioinformatics 2023; 39:btad632. [PMID: 37847663 PMCID: PMC10636277 DOI: 10.1093/bioinformatics/btad632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2023] [Revised: 09/09/2023] [Accepted: 10/16/2023] [Indexed: 10/19/2023] Open
Abstract
SUMMARY T-cell receptors (TCRs) on T cells recognize and bind to epitopes presented by the major histocompatibility complex in case of an infection or cancer. However, the high diversity of TCRs, as well as their unique and complex binding mechanisms underlying epitope recognition, make it difficult to predict the binding between TCRs and epitopes. Here, we present the utility of transformers, a deep learning strategy that incorporates an attention mechanism that learns the informative features, and show that these models pre-trained on a large set of protein sequences outperform current strategies. We compared three pre-trained auto-encoder transformer models (ProtBERT, ProtAlbert, and ProtElectra) and one pre-trained auto-regressive transformer model (ProtXLNet) to predict the binding specificity of TCRs to 25 epitopes from the VDJdb database (human and murine). Two additional modifications were performed to incorporate gene usage of the TCRs in the four transformer models. Of all 12 transformer implementations (four models with three different modifications), a modified version of the ProtXLNet model could predict TCR-epitope pairs with the highest accuracy (weighted F1 score 0.55 simultaneously considering all 25 epitopes). The modification included additional features representing the gene names for the TCRs. We also showed that the basic implementation of transformers outperformed the previously available methods, i.e. TCRGP, TCRdist, and DeepTCR, developed for the same biological problem, especially for the hard-to-classify labels. We show that the proficiency of transformers in attention learning can be made operational in a complex biological setting like TCR binding prediction. Further ingenuity in utilizing the full potential of transformers, either through attention head visualization or introducing additional features, can extend T-cell research avenues. 
AVAILABILITY AND IMPLEMENTATION Data and code are available on https://github.com/InduKhatri/tcrformer.
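For readers unfamiliar with the metric quoted in this abstract, the following is a minimal sketch of a weighted F1 score, assuming the common support-weighted convention (per-class F1 averaged with weights proportional to the number of true examples in each class). The abstract does not state which averaging the authors used, so this is an illustration rather than their implementation; the function name `weighted_f1` and the toy labels are hypothetical.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (assumed convention)."""
    support = Counter(y_true)          # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1      # weight each class by its support
    return score


# Hypothetical toy labels; "A", "B", "C" stand in for epitope classes
# and are not data from the paper.
y_true = ["A", "A", "A", "B", "B", "C"]
y_pred = ["A", "A", "B", "B", "B", "A"]
result = weighted_f1(y_true, y_pred)
print(round(result, 3))
```

Under this convention, rare classes contribute little to the total, so a weighted F1 over many epitopes can look reasonable even when the hard-to-classify labels are predicted poorly.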
Affiliation(s)
- Abdul Rehman Khan
- Department of Intelligent Systems, Delft University of Technology, Delft 2600 GA, The Netherlands
| | - Marcel J T Reinders
- Department of Intelligent Systems, Delft University of Technology, Delft 2600 GA, The Netherlands
- Leiden Computational Biology Center, Department of Molecular Epidemiology, Leiden University Medical Center, Leiden 2333 ZA, The Netherlands
| | - Indu Khatri
- Leiden Computational Biology Center, Department of Molecular Epidemiology, Leiden University Medical Center, Leiden 2333 ZA, The Netherlands
- Department of Immunology, Leiden University Medical Center, Leiden 2333 ZA, The Netherlands
| |
|
28
|
Montemurro A, Povlsen HR, Jessen LE, Nielsen M. Benchmarking data-driven filtering for denoising of TCRpMHC single-cell data. Sci Rep 2023; 13:16147. [PMID: 37752190 PMCID: PMC10522655 DOI: 10.1038/s41598-023-43048-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2023] [Accepted: 09/18/2023] [Indexed: 09/28/2023] Open
Abstract
Pairing of the T cell receptor (TCR) with its cognate peptide-MHC (pMHC) is a cornerstone in T cell-mediated immunity. Recently, single-cell sequencing coupled with DNA-barcoded MHC multimer staining has enabled high-throughput studies of T cell specificities. However, the immense variability of TCR-pMHC interactions combined with the relatively low signal-to-noise ratio in the data generated using current technologies are complicating these studies. Several approaches have been proposed for denoising single-cell TCR-pMHC specificity data. Here, we present a benchmark evaluating two such denoising methods, ICON and ITRAP. We applied and evaluated the methods on publicly available immune profiling data provided by 10x Genomics. We find that both methods identified approximately 75% of the raw data as noise. We analyzed both internal metrics developed for the purpose and performance on independent data using machine learning methods trained on the raw and denoised 10x data. We find an increased signal-to-noise ratio comparing the denoised to the raw data for both methods, and demonstrate an overall superior performance of the ITRAP method in terms of both data consistency and performance. In conclusion, this study demonstrates that improving the data quality from high throughput studies of TCRpMHC-specificity by denoising is paramount in increasing our understanding of T cell-mediated immunity.
Affiliation(s)
- Alessandro Montemurro
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, DTU, 2800, Kgs. Lyngby, Denmark
| | - Helle Rus Povlsen
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, DTU, 2800, Kgs. Lyngby, Denmark
| | - Leon Eyrich Jessen
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, DTU, 2800, Kgs. Lyngby, Denmark
| | - Morten Nielsen
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, DTU, 2800, Kgs. Lyngby, Denmark.
| |
|
29
|
Mudd P, Borcherding N, Kim W, Quinn M, Han F, Zhou J, Sturtz A, Schmitz A, Lei T, Schattgen S, Klebert M, Suessen T, Middleton W, Goss C, Liu C, Crawford J, Thomas P, Teefey S, Presti R, O'Halloran J, Turner J, Ellebedy A. Antigen-specific CD4+ T cells exhibit distinct transcriptional phenotypes in the lymph node and blood following vaccination in humans. RESEARCH SQUARE 2023:rs.3.rs-3304466. [PMID: 37790414 PMCID: PMC10543502 DOI: 10.21203/rs.3.rs-3304466/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/05/2023]
Abstract
SARS-CoV-2 infection and mRNA vaccination induce robust CD4+ T cell responses that are critical for the development of protective immunity. Here, we evaluated spike-specific CD4+ T cells in the blood and draining lymph node (dLN) of human subjects following BNT162b2 mRNA vaccination using single-cell transcriptomics. We analyze multiple spike-specific CD4+ T cell clonotypes, including novel clonotypes we define here using Trex, a new deep learning-based reverse epitope mapping method integrating single-cell T cell receptor (TCR) sequencing and transcriptomics to predict antigen-specificity. Human dLN spike-specific T follicular helper cells (TFH) exhibited distinct phenotypes, including germinal center (GC)-TFH and IL-10+ TFH, that varied over time during the GC response. Paired TCR clonotype analysis revealed tissue-specific segregation of circulating and dLN clonotypes, despite numerous spike-specific clonotypes in each compartment. Analysis of a separate SARS-CoV-2 infection cohort revealed circulating spike-specific CD4+ T cell profiles distinct from those found following BNT162b2 vaccination. Our findings provide an atlas of human antigen-specific CD4+ T cell transcriptional phenotypes in the dLN and blood following vaccination or infection.
Affiliation(s)
| | | | | | | | | | | | | | | | | | | | | | | | | | - Charles Goss
- Division of Biostatistics, Washington University in St. Louis
| | - Chang Liu
- Washington University School of Medicine
| | | | | | | | | | - Jane O'Halloran
- Department of Emergency Medicine, Washington University in St. Louis
| | | | | |
|
30
|
Fast E, Dhar M, Chen B. TAPIR: a T-cell receptor language model for predicting rare and novel targets. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.09.12.557285. [PMID: 37745475 PMCID: PMC10515850 DOI: 10.1101/2023.09.12.557285] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/26/2023]
Abstract
T-cell receptors (TCRs) are involved in most human diseases, but linking their sequences with their targets remains an unsolved grand challenge in the field. In this study, we present TAPIR (T-cell receptor and Peptide Interaction Recognizer), a T-cell receptor (TCR) language model that predicts TCR-target interactions, with a focus on novel and rare targets. TAPIR employs deep convolutional neural network (CNN) encoders to process TCR and target sequences across flexible representations (e.g., beta-chain only, unknown MHC allele, etc.) and learns patterns of interactivity via several training tasks. This flexibility allows TAPIR to train on more than 50k either paired (alpha and beta chain) or unpaired TCRs (just alpha or beta chain) from public and proprietary databases against 1933 unique targets. TAPIR demonstrates state-of-the-art performance when predicting TCR interactivity against common benchmark targets and is the first method to demonstrate strong performance when predicting TCR interactivity against novel targets, where no examples are provided in training. TAPIR is also capable of predicting TCR interaction against MHC alleles in the absence of target information. Leveraging these capabilities, we apply TAPIR to cancer patient TCR repertoires and identify and validate a novel and potent anti-cancer T-cell receptor against a shared cancer neoantigen target (PIK3CA H1047L). We further show how TAPIR, when extended with a generative neural network, is capable of directly designing T-cell receptor sequences that interact with a target of interest.
Affiliation(s)
- Ethan Fast
- Vcreate, Inc., Menlo Park, CA, 94025, USA
| | | | | |
|
31
|
Lee CH, Huh J, Buckley PR, Jang M, Pinho MP, Fernandes RA, Antanaviciute A, Simmons A, Koohy H. A robust deep learning workflow to predict CD8+ T-cell epitopes. Genome Med 2023; 15:70. [PMID: 37705109 PMCID: PMC10498576 DOI: 10.1186/s13073-023-01225-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2023] [Accepted: 08/30/2023] [Indexed: 09/15/2023] Open
Abstract
BACKGROUND T-cells play a crucial role in the adaptive immune system by triggering responses against cancer cells and pathogens, while maintaining tolerance against self-antigens, which has sparked interest in the development of various T-cell-focused immunotherapies. However, the identification of antigens recognised by T-cells is low-throughput and laborious. To overcome some of these limitations, computational methods for predicting CD8+ T-cell epitopes have emerged. Despite recent developments, most immunogenicity algorithms struggle to learn features of peptide immunogenicity from small datasets, suffer from HLA bias and are unable to reliably predict pathology-specific CD8+ T-cell epitopes. METHODS We developed TRAP (T-cell recognition potential of HLA-I presented peptides), a robust deep learning workflow for predicting CD8+ T-cell epitopes from MHC-I presented pathogenic and self-peptides. TRAP uses transfer learning, deep learning architecture and MHC binding information to make context-specific predictions of CD8+ T-cell epitopes. TRAP also detects low-confidence predictions for peptides that differ significantly from those in the training datasets to abstain from making incorrect predictions. To estimate the immunogenicity of pathogenic peptides with low-confidence predictions, we further developed a novel metric, RSAT (relative similarity to autoantigens and tumour-associated antigens), as a complement to 'dissimilarity to self' from cancer studies. RESULTS TRAP was used to identify epitopes from glioblastoma patients as well as SARS-CoV-2 peptides, and it outperformed other algorithms in both cancer and pathogenic settings. TRAP was especially effective at extracting immunogenicity-associated properties from restricted data of emerging pathogens and translating them onto related species, as well as minimising the loss of likely epitopes in imbalanced datasets. We also demonstrated that the novel metric termed RSAT was able to estimate the immunogenicity of pathogenic peptides of various lengths and species. TRAP implementation is available at: https://github.com/ChloeHJ/TRAP. CONCLUSIONS This study presents a novel computational workflow for accurately predicting CD8+ T-cell epitopes to foster a better understanding of antigen-specific T-cell response and the development of effective clinical therapeutics.
Affiliation(s)
- Chloe H Lee
- MRC Human Immunology Unit, Medical Research Council (MRC) Weatherall Institute of Molecular Medicine (WIMM), John Radcliffe Hospital, University of Oxford, Oxford, OX3 9DS, UK
- MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, University of Oxford, Oxford, OX3 9DS, UK
| | - Jaesung Huh
- Visual Geometry Group, Department of Engineering Science, University of Oxford, Oxford, OX2 6NN, UK
| | - Paul R Buckley
- MRC Human Immunology Unit, Medical Research Council (MRC) Weatherall Institute of Molecular Medicine (WIMM), John Radcliffe Hospital, University of Oxford, Oxford, OX3 9DS, UK
- MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, University of Oxford, Oxford, OX3 9DS, UK
| | - Myeongjun Jang
- Intelligent Systems Lab, Department of Computer Science, University of Oxford, Oxford, OX1 3QG, UK
| | - Mariana Pereira Pinho
- MRC Human Immunology Unit, Medical Research Council (MRC) Weatherall Institute of Molecular Medicine (WIMM), John Radcliffe Hospital, University of Oxford, Oxford, OX3 9DS, UK
| | - Ricardo A Fernandes
- Chinese Academy of Medical Sciences (CAMS) Oxford Institute (COI), University of Oxford, Oxford, OX3 7BN, UK
| | - Agne Antanaviciute
- MRC Human Immunology Unit, Medical Research Council (MRC) Weatherall Institute of Molecular Medicine (WIMM), John Radcliffe Hospital, University of Oxford, Oxford, OX3 9DS, UK
- MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, University of Oxford, Oxford, OX3 9DS, UK
| | - Alison Simmons
- MRC Human Immunology Unit, Medical Research Council (MRC) Weatherall Institute of Molecular Medicine (WIMM), John Radcliffe Hospital, University of Oxford, Oxford, OX3 9DS, UK
- Translational Gastroenterology Unit, John Radcliffe Hospital, Oxford, OX3 9DS, UK
| | - Hashem Koohy
- MRC Human Immunology Unit, Medical Research Council (MRC) Weatherall Institute of Molecular Medicine (WIMM), John Radcliffe Hospital, University of Oxford, Oxford, OX3 9DS, UK.
- MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, University of Oxford, Oxford, OX3 9DS, UK.
- Alan Turing Fellow in Health and Medicine, The Alan Turing Institute, London, UK.
| |
|
32
|
Zhao Y, He B, Xu F, Li C, Xu Z, Su X, He H, Huang Y, Rossjohn J, Song J, Yao J. DeepAIR: A deep learning framework for effective integration of sequence and 3D structure to enable adaptive immune receptor analysis. SCIENCE ADVANCES 2023; 9:eabo5128. [PMID: 37556545 PMCID: PMC10411891 DOI: 10.1126/sciadv.abo5128] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/16/2022] [Accepted: 07/06/2023] [Indexed: 08/11/2023]
Abstract
Structural docking between the adaptive immune receptors (AIRs), including T cell receptors (TCRs) and B cell receptors (BCRs), and their cognate antigens is one of the most fundamental processes in adaptive immunity. However, current methods for predicting AIR-antigen binding largely rely on sequence-derived features of AIRs, omitting the structure features that are essential for binding affinity. In this study, we present a deep learning framework, termed DeepAIR, for the accurate prediction of AIR-antigen binding by integrating both sequence and structure features of AIRs. DeepAIR achieves a Pearson's correlation of 0.813 in predicting the binding affinity of TCR, and a median area under the receiver-operating characteristic curve (AUC) of 0.904 and 0.942 in predicting the binding reactivity of TCR and BCR, respectively. Meanwhile, using TCR and BCR repertoire, DeepAIR correctly identifies every patient with nasopharyngeal carcinoma and inflammatory bowel disease in test data. Thus, DeepAIR improves the AIR-antigen binding prediction that facilitates the study of adaptive immunity.
Affiliation(s)
- Yu Zhao
- AI Lab, Tencent, Shenzhen, China
| | - Bing He
- AI Lab, Tencent, Shenzhen, China
| | - Fan Xu
- AI Lab, Tencent, Shenzhen, China
| | - Chen Li
- Biomedicine Discovery Institute and Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia
| | | | | | | | | | - Jamie Rossjohn
- Infection and Immunity Program and Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
- Institute of Infection and Immunity, Cardiff University School of Medicine, Heath Park, Cardiff, UK
| | - Jiangning Song
- AI Lab, Tencent, Shenzhen, China
- Biomedicine Discovery Institute and Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia
| | | |
|
33
|
Myronov A, Mazzocco G, Król P, Plewczynski D. BERTrand-peptide:TCR binding prediction using Bidirectional Encoder Representations from Transformers augmented with random TCR pairing. Bioinformatics 2023; 39:btad468. [PMID: 37535685 PMCID: PMC10444968 DOI: 10.1093/bioinformatics/btad468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2023] [Revised: 06/28/2023] [Accepted: 08/01/2023] [Indexed: 08/05/2023] Open
Abstract
MOTIVATION The advent of T-cell receptor (TCR) sequencing experiments allowed for a significant increase in the amount of peptide:TCR binding data available and a number of machine-learning models appeared in recent years. High-quality prediction models for a fixed epitope sequence are feasible, provided enough known binding TCR sequences are available. However, their performance drops significantly for previously unseen peptides. RESULTS We prepare the dataset of known peptide:TCR binders and augment it with negative decoys created using healthy donors' T-cell repertoires. We employ deep learning methods commonly applied in Natural Language Processing to train a peptide:TCR binding model with a degree of cross-peptide generalization (0.69 AUROC). We demonstrate that BERTrand outperforms the published methods when evaluated on peptide sequences not used during model training. AVAILABILITY AND IMPLEMENTATION The datasets and the code for model training are available at https://github.com/SFGLab/bertrand.
Affiliation(s)
- Alexander Myronov
- Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
- Ardigen, Krakow, Poland
| | | | | | - Dariusz Plewczynski
- Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
| |
|
34
|
Povlsen HR, Bentzen AK, Kadivar M, Jessen LE, Hadrup SR, Nielsen M. Improved T cell receptor antigen pairing through data-driven filtering of sequencing information from single cells. eLife 2023; 12:e81810. [PMID: 37133356 PMCID: PMC10156162 DOI: 10.7554/elife.81810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Accepted: 03/13/2023] [Indexed: 05/04/2023] Open
Abstract
Novel single-cell-based technologies hold the promise of matching T cell receptor (TCR) sequences with their cognate peptide-MHC recognition motif in a high-throughput manner. Parallel capture of TCR transcripts and peptide-MHC is enabled through the use of reagents labeled with DNA barcodes. However, analysis and annotation of such single-cell sequencing (SCseq) data are challenged by dropout, random noise, and other technical artifacts that must be carefully handled in the downstream processing steps. We here propose a rational, data-driven method termed ITRAP (improved T cell Receptor Antigen Pairing) to deal with these challenges, filtering away likely artifacts, and enable the generation of large sets of TCR-pMHC sequence data with a high degree of specificity and sensitivity, thus outputting the most likely pMHC target per T cell. We have validated this approach across 10 different virus-specific T cell responses in 16 healthy donors. Across these samples, we have identified up to 1494 high-confidence TCR-pMHC pairs derived from 4135 single cells.
Affiliation(s)
- Helle Rus Povlsen
- Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Amalie Kai Bentzen
- Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Mohammad Kadivar
- Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Leon Eyrich Jessen
- Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Sine Reker Hadrup
- Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Morten Nielsen
- Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
| |
|
35
|
Hederman AP, Ackerman ME. Leveraging deep learning to improve vaccine design. Trends Immunol 2023; 44:333-344. [PMID: 37003949 PMCID: PMC10485910 DOI: 10.1016/j.it.2023.03.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/22/2023] [Revised: 03/05/2023] [Accepted: 03/05/2023] [Indexed: 04/03/2023]
Abstract
Deep learning has led to incredible breakthroughs in areas of research, from self-driving vehicles to solutions, to formal mathematical proofs. In the biomedical sciences, however, the revolutionary results seen in other fields are only now beginning to be realized. Vaccine research and development efforts represent an application with high public health significance. Protein structure prediction, immune repertoire analysis, and phylogenetics are three principal areas in which deep learning is poised to provide key advances. Here, we opine on some of the current challenges with deep learning and how they are being addressed. Despite the nascent stage of deep learning applications in immunological studies, there is ample opportunity to utilize this new technology to address the most challenging and burdensome infectious diseases confronting global populations.
Affiliation(s)
| | - Margaret E Ackerman
- Thayer School of Engineering, Dartmouth College, Hanover, NH, USA; Department of Microbiology and Immunology, Geisel School of Medicine, Hanover, NH, USA.
| |
|
36
|
Tippalagama R, Chihab LY, Kearns K, Lewis S, Panda S, Willemsen L, Burel JG, Lindestam Arlehamn CS. Antigen-specificity measurements are the key to understanding T cell responses. Front Immunol 2023; 14:1127470. [PMID: 37122719 PMCID: PMC10140422 DOI: 10.3389/fimmu.2023.1127470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2022] [Accepted: 03/30/2023] [Indexed: 05/02/2023] Open
Abstract
Antigen-specific T cells play a central role in the adaptive immune response and come in a wide range of phenotypes. T cell receptors (TCRs) mediate the antigen-specificities found in T cells. Importantly, high-throughput TCR sequencing provides a fingerprint which allows tracking of specific T cells and their clonal expansion in response to particular antigens. As a result, many studies have leveraged TCR sequencing in an attempt to elucidate the role of antigen-specific T cells in various contexts. Here, we discuss the published approaches to studying antigen-specific T cells and their specific TCR repertoire. Further, we discuss how these methods have been applied to study the TCR repertoire in various diseases in order to characterize the antigen-specific T cells involved in the immune control of disease.
|
37
|
Currenti J, Simmons J, Oakes J, Gaudieri S, Warren CM, Gangula R, Alves E, Ram R, Leary S, Armitage JD, Smith RM, Chopra A, Halasa NB, Pilkinton MA, Kalams SA. Tracking of activated cTfh cells following sequential influenza vaccinations reveals transcriptional profile of clonotypes driving a vaccine-induced immune response. Front Immunol 2023; 14:1133781. [PMID: 37063867 PMCID: PMC10095155 DOI: 10.3389/fimmu.2023.1133781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2022] [Accepted: 03/13/2023] [Indexed: 03/30/2023] Open
Abstract
Introduction A vaccine against influenza is available seasonally but is not 100% effective. A predictor of successful seroconversion in adults is an increase in activated circulating T follicular helper (cTfh) cells after vaccination. However, the impact of repeated annual vaccinations on long-term protection and seasonal vaccine efficacy remains unclear. Methods In this study, we examined the T cell receptor (TCR) repertoire and transcriptional profile of vaccine-induced expanded cTfh cells in individuals who received sequential seasonal influenza vaccines. We measured the magnitude of cTfh and plasmablast cell activation from day 0 (d0) to d7 post-vaccination as an indicator of a vaccine response. To assess TCR diversity and T cell expansion we sorted activated and resting cTfh cells at d0 and d7 post-vaccination and performed TCR sequencing. We also single cell sorted activated and resting cTfh cells for TCR analysis and transcriptome sequencing. Results and discussion The percent of activated cTfh cells significantly increased from d0 to d7 in each of the 2016-17 (p < 0.0001) and 2017-18 (p = 0.015) vaccine seasons with the magnitude of cTfh activation increase positively correlated with the frequency of circulating plasmablast cells in the 2016-17 (p = 0.0001) and 2017-18 (p = 0.003) seasons. At d7 post-vaccination, higher magnitudes of cTfh activation were associated with increased clonality of cTfh TCR repertoire. The TCRs from vaccine-expanded clonotypes were identified and tracked longitudinally with several TCRs found to be present in both years. The transcriptomic profile of these expanded cTfh cells at the single cell level demonstrated overrepresentation of transcripts of genes involved in the type-I interferon pathway, pathways involved in gene expression, and antigen presentation and recognition. These results identify the expansion and transcriptomic profile of vaccine-induced cTfh cells important for B cell help.
Affiliation(s)
- Jennifer Currenti
- School of Human Sciences, University of Western Australia, Crawley, WA, Australia
| | - Joshua Simmons
- Division of Infectious Diseases, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, United States
| | - Jared Oakes
- Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center, Nashville, TN, United States
| | - Silvana Gaudieri
- School of Human Sciences, University of Western Australia, Crawley, WA, Australia
- Division of Infectious Diseases, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, United States
- Institute for Immunology and Infectious Diseases, Murdoch University, Murdoch, WA, Australia
| | - Christian M. Warren
- Division of Infectious Diseases, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, United States
| | - Rama Gangula
- Division of Infectious Diseases, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, United States
| | - Eric Alves
- School of Human Sciences, University of Western Australia, Crawley, WA, Australia
| | - Ramesh Ram
- Institute for Immunology and Infectious Diseases, Murdoch University, Murdoch, WA, Australia
| | - Shay Leary
- Institute for Immunology and Infectious Diseases, Murdoch University, Murdoch, WA, Australia
| | - Jesse D. Armitage
- Telethon Kids Institute, University of Western Australia, Nedlands, WA, Australia
| | - Rita M. Smith
- Division of Infectious Diseases, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, United States
| | - Abha Chopra
- Institute for Immunology and Infectious Diseases, Murdoch University, Murdoch, WA, Australia
| | - Natasha B. Halasa
- Division of Pediatric Infectious Diseases, Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN, United States
| | - Mark A. Pilkinton
- Division of Infectious Diseases, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, United States
| | - Spyros A. Kalams
- Division of Infectious Diseases, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, United States
- Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center, Nashville, TN, United States
| |
|
38
|
Frank ML, Lu K, Erdogan C, Han Y, Hu J, Wang T, Heymach JV, Zhang J, Reuben A. T-Cell Receptor Repertoire Sequencing in the Era of Cancer Immunotherapy. Clin Cancer Res 2023; 29:994-1008. [PMID: 36413126 PMCID: PMC10011887 DOI: 10.1158/1078-0432.ccr-22-2469] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 08/09/2022] [Revised: 10/07/2022] [Accepted: 11/14/2022] [Indexed: 11/23/2022]
Abstract
T cells are integral components of the adaptive immune system, and their responses are mediated by unique T-cell receptors (TCR) that recognize specific antigens from a variety of biological contexts. As a result, analyzing the T-cell repertoire offers a better understanding of immune responses and of diseases like cancer. Next-generation sequencing technologies have greatly enabled the high-throughput analysis of the TCR repertoire. On the basis of our extensive experience in the field from the past decade, we provide an overview of TCR sequencing, from the initial library preparation steps to sequencing and analysis methods and finally to functional validation techniques. With regard to data analysis, we detail important TCR repertoire metrics and present several computational tools for predicting antigen specificity. Finally, we highlight important applications of TCR sequencing and repertoire analysis to understanding tumor biology and developing cancer immunotherapies.
Affiliation(s)
- Meredith L Frank
- Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas; The University of Texas MD Anderson Cancer Center UT Health Houston Graduate School of Biomedical Sciences, Houston, Texas
| | - Kaylene Lu
- Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas; The University of Texas MD Anderson Cancer Center UT Health Houston Graduate School of Biomedical Sciences, Houston, Texas; Department of Cancer Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas
| | - Can Erdogan
- Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas; Rice University, Houston, Texas
| | - Yi Han
- Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, University of Texas Southwestern Medical Center, Dallas, Texas
| | - Jian Hu
- The University of Texas MD Anderson Cancer Center UT Health Houston Graduate School of Biomedical Sciences, Houston, Texas; Department of Cancer Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas
| | - Tao Wang
- Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, University of Texas Southwestern Medical Center, Dallas, Texas; Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, Texas
| | - John V Heymach
- Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas; The University of Texas MD Anderson Cancer Center UT Health Houston Graduate School of Biomedical Sciences, Houston, Texas
| | - Jianjun Zhang
- Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas; The University of Texas MD Anderson Cancer Center UT Health Houston Graduate School of Biomedical Sciences, Houston, Texas; Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, Texas
| | - Alexandre Reuben
- Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas; The University of Texas MD Anderson Cancer Center UT Health Houston Graduate School of Biomedical Sciences, Houston, Texas
| |
|
39
|
Gao Y, Gao Y, Fan Y, Zhu C, Wei Z, Zhou C, Chuai G, Chen Q, Zhang H, Liu Q. Pan-Peptide Meta Learning for T-cell receptor–antigen binding recognition. Nat Mach Intell 2023. [DOI: 10.1038/s42256-023-00619-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2023]
|
40
|
Li T, Li Y, Zhu X, He Y, Wu Y, Ying T, Xie Z. Artificial intelligence in cancer immunotherapy: Applications in neoantigen recognition, antibody design and immunotherapy response prediction. Semin Cancer Biol 2023; 91:50-69. [PMID: 36870459 DOI: 10.1016/j.semcancer.2023.02.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/10/2022] [Revised: 02/13/2023] [Accepted: 02/28/2023] [Indexed: 03/06/2023]
Abstract
Cancer immunotherapy is a method of controlling and eliminating tumors by reactivating the body's cancer-immunity cycle and restoring its antitumor immune response. The increased availability of data, combined with advancements in high-performance computing and innovative artificial intelligence (AI) technology, has resulted in a rise in the use of AI in oncology research. State-of-the-art AI models for functional classification and prediction in immunotherapy research are increasingly used to support laboratory-based experiments. This review offers a glimpse of the current AI applications in immunotherapy, including neoantigen recognition, antibody design, and prediction of immunotherapy response. Advancing in this direction will result in more robust predictive models for developing better targets, drugs, and treatments, and these advancements will eventually make their way into the clinical setting, pushing AI forward in the field of precision oncology.
Affiliation(s)
- Tong Li
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Yupeng Li
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Xiaoyi Zhu
- MOE/NHC Key Laboratory of Medical Molecular Virology, Shanghai Institute of Infectious Disease and Biosecurity, School of Basic Medical Sciences, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Engineering Research Center for Synthetic Immunology, Shanghai, China
| | - Yao He
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Yanling Wu
- MOE/NHC Key Laboratory of Medical Molecular Virology, Shanghai Institute of Infectious Disease and Biosecurity, School of Basic Medical Sciences, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Engineering Research Center for Synthetic Immunology, Shanghai, China
| | - Tianlei Ying
- MOE/NHC Key Laboratory of Medical Molecular Virology, Shanghai Institute of Infectious Disease and Biosecurity, School of Basic Medical Sciences, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Engineering Research Center for Synthetic Immunology, Shanghai, China.
| | - Zhi Xie
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China; Center for Precision Medicine, Sun Yat-sen University, Guangzhou, China.
| |
|
41
|
Inferring the T cell repertoire dynamics of healthy individuals. Proc Natl Acad Sci U S A 2023; 120:e2207516120. [PMID: 36669107 PMCID: PMC9942919 DOI: 10.1073/pnas.2207516120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/21/2023] Open
Abstract
The adaptive immune system is a diverse ecosystem that responds to pathogens by selecting cells with specific receptors. While clonal expansion in response to particular immune challenges has been extensively studied, we do not know the neutral dynamics that drive the immune system in the absence of strong stimuli. Here, we learn the parameters that underlie the clonal dynamics of the T cell repertoire in healthy individuals of different ages, by applying Bayesian inference to longitudinal immune repertoire sequencing (RepSeq) data. Quantifying the experimental noise accurately for a given RepSeq technique allows us to disentangle real changes in clonal frequencies from noise. We find that the data are consistent with clone sizes following a geometric Brownian motion and show that its predicted steady state is in quantitative agreement with the observed power-law behavior of the clone-size distribution. The inferred turnover time scale of the repertoire increases with patient age and depends on the clone size in some individuals.
|
42
|
Akerman O, Isakov H, Levi R, Psevkin V, Louzoun Y. Counting is almost all you need. Front Immunol 2023; 13:1031011. [PMID: 36741395 PMCID: PMC9896581 DOI: 10.3389/fimmu.2022.1031011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2022] [Accepted: 12/27/2022] [Indexed: 01/21/2023] Open
Abstract
The immune memory repertoire encodes the history of present and past infections and immunological attributes of the individual. As such, multiple methods were proposed to use T-cell receptor (TCR) repertoires to detect disease history. We here show that the counting method outperforms two leading algorithms. We then show that the counting can be further improved using a novel attention model to weigh the different TCRs. The attention model is based on the projection of TCRs using a Variational AutoEncoder (VAE). Both counting and attention algorithms predict better than current leading algorithms whether the host had CMV and its HLA alleles. As an intermediate solution between the complex attention model and the very simple counting model, we propose a new Graph Convolutional Network approach that obtains the accuracy of the attention model and the simplicity of the counting model. The code for the models used in the paper is provided at: https://github.com/louzounlab/CountingIsAlmostAllYouNeed.
Affiliation(s)
- Ofek Akerman
- Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
- Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel
| | - Haim Isakov
- Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
| | - Reut Levi
- Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
| | - Vladimir Psevkin
- Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
| | - Yoram Louzoun
- Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
| |
|
43
|
Zhao Y, He B, Xu Z, Zhang Y, Zhao X, Huang ZA, Yang F, Wang L, Duan L, Song J, Yao J. Interpretable artificial intelligence model for accurate identification of medical conditions using immune repertoire. Brief Bioinform 2023; 24:6960620. [PMID: 36567255 DOI: 10.1093/bib/bbac555] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/22/2022] [Revised: 11/04/2022] [Accepted: 11/15/2022] [Indexed: 12/27/2022] Open
Abstract
Underlying medical conditions, such as cancer, kidney disease and heart failure, are associated with a higher risk for severe COVID-19. Accurate classification of COVID-19 patients with underlying medical conditions is critical for personalized treatment decision and prognosis estimation. In this study, we propose an interpretable artificial intelligence model termed VDJMiner to mine the underlying medical conditions and predict the prognosis of COVID-19 patients according to their immune repertoires. In a cohort of more than 1400 COVID-19 patients, VDJMiner accurately identifies multiple underlying medical conditions, including cancers, chronic kidney disease, autoimmune disease, diabetes, congestive heart failure, coronary artery disease, asthma and chronic obstructive pulmonary disease, with an average area under the receiver operating characteristic curve (AUC) of 0.961. Meanwhile, in this same cohort, VDJMiner achieves an AUC of 0.922 in predicting severe COVID-19. Moreover, VDJMiner achieves an accuracy of 0.857 in predicting the response of COVID-19 patients to tocilizumab treatment on the leave-one-out test. Additionally, VDJMiner interpretively mines and scores V(D)J gene segments of the T-cell receptors that are associated with the disease. The identified associations between single-cell V(D)J gene segments and COVID-19 are highly consistent with previous studies. The source code of VDJMiner is publicly accessible at https://github.com/TencentAILabHealthcare/VDJMiner. The web server of VDJMiner is available at https://gene.ai.tencent.com/VDJMiner/.
Affiliation(s)
- Yu Zhao
- AI Lab, Tencent, Shenzhen, China
| | - Bing He
- AI Lab, Tencent, Shenzhen, China
| | | | - Yidan Zhang
- AI Lab, Tencent, Shenzhen, China; School of Computer Science, Sichuan University, Chengdu, China
| | | | - Zhi-An Huang
- AI Lab, Tencent, Shenzhen, China; Center for Computer Science and Information Technology, City University of Hong Kong Dongguan Research Institute, Dongguan, China
| | - Fan Yang
- AI Lab, Tencent, Shenzhen, China
| | | | - Lei Duan
- School of Computer Science, Sichuan University, Chengdu, China
| | - Jiangning Song
- AI Lab, Tencent, Shenzhen, China; Monash Biomedicine Discovery Institute and Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia
| | | |
|
44
|
Finding antigens for TB vaccines: the good, the bad and the useless. Nat Med 2023; 29:35-36. [PMID: 36604539 PMCID: PMC9877171 DOI: 10.1038/s41591-022-02123-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/07/2023]
Abstract
Prospective, longitudinal clinical studies incorporating high-throughput, single-cell analyses could identify which bacterial antigens to include in TB vaccines — and which to avoid.
|
45
|
Kanduri C, Scheffer L, Pavlović M, Rand KD, Chernigovskaya M, Pirvandy O, Yaari G, Greiff V, Sandve GK. simAIRR: simulation of adaptive immune repertoires with realistic receptor sequence sharing for benchmarking of immune state prediction methods. Gigascience 2022; 12:giad074. [PMID: 37848619 PMCID: PMC10580376 DOI: 10.1093/gigascience/giad074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Revised: 07/20/2023] [Accepted: 08/29/2023] [Indexed: 10/19/2023] Open
Abstract
BACKGROUND Machine learning (ML) has gained significant attention for classifying immune states in adaptive immune receptor repertoires (AIRRs) to support the advancement of immunodiagnostics and therapeutics. Simulated data are crucial for the rigorous benchmarking of AIRR-ML methods. Existing approaches to generating synthetic benchmarking datasets result in the generation of naive repertoires missing the key feature of many shared receptor sequences (selected for common antigens) found in antigen-experienced repertoires. RESULTS We demonstrate that a common approach to generating simulated AIRR benchmark datasets can introduce biases, which may be exploited for undesired shortcut learning by certain ML methods. To mitigate undesirable access to true signals in simulated AIRR datasets, we devised a simulation strategy (simAIRR) that constructs antigen-experienced-like repertoires with a realistic overlap of receptor sequences. simAIRR can be used for constructing AIRR-level benchmarks based on a range of assumptions (or experimental data sources) for what constitutes receptor-level immune signals. This includes the possibility of making or not making any prior assumptions regarding the similarity or commonality of immune state-associated sequences that will be used as true signals. We demonstrate the real-world realism of our proposed simulation approach by showing that basic ML strategies perform similarly on simAIRR-generated and real-world experimental AIRR datasets. CONCLUSIONS This study sheds light on the potential shortcut learning opportunities for ML methods that can arise with the state-of-the-art way of simulating AIRR datasets. simAIRR is available as a Python package: https://github.com/KanduriC/simAIRR.
Affiliation(s)
- Chakravarthi Kanduri
- Centre for Bioinformatics, Department of Informatics, University of Oslo, 0373 Oslo, Norway
- UiORealArt Convergence Environment, University of Oslo, 0373 Oslo, Norway
| | - Lonneke Scheffer
- Centre for Bioinformatics, Department of Informatics, University of Oslo, 0373 Oslo, Norway
| | - Milena Pavlović
- Centre for Bioinformatics, Department of Informatics, University of Oslo, 0373 Oslo, Norway
- UiORealArt Convergence Environment, University of Oslo, 0373 Oslo, Norway
| | - Knut Dagestad Rand
- Centre for Bioinformatics, Department of Informatics, University of Oslo, 0373 Oslo, Norway
| | - Maria Chernigovskaya
- Department of Immunology and Oslo University Hospital, University of Oslo, 0373 Oslo, Norway
| | - Oz Pirvandy
- Faculty of Engineering, Bar-Ilan University, 5290002, Israel
| | - Gur Yaari
- Faculty of Engineering, Bar-Ilan University, 5290002, Israel
| | - Victor Greiff
- Department of Immunology and Oslo University Hospital, University of Oslo, 0373 Oslo, Norway
| | - Geir K Sandve
- Centre for Bioinformatics, Department of Informatics, University of Oslo, 0373 Oslo, Norway
- UiORealArt Convergence Environment, University of Oslo, 0373 Oslo, Norway
| |
|
46
|
Guo Z, Yamaguchi R. Machine learning methods for protein-protein binding affinity prediction in protein design. Frontiers in Bioinformatics 2022; 2:1065703. [PMID: 36591334 PMCID: PMC9800603 DOI: 10.3389/fbinf.2022.1065703] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/10/2022] [Accepted: 12/01/2022] [Indexed: 12/23/2022] Open
Abstract
Protein-protein interactions govern a wide range of biological activity. A proper estimation of the protein-protein binding affinity is vital to design proteins with high specificity and binding affinity toward a target protein, which has a variety of applications including antibody design in immunotherapy, enzyme engineering for reaction optimization, and construction of biosensors. However, experimental and theoretical modelling methods are time-consuming, hinder the exploration of the entire protein space, and deter the identification of optimal proteins that meet the requirements of practical applications. In recent years, the rapid development in machine learning methods for protein-protein binding affinity prediction has revealed the potential of a paradigm shift in protein design. Here, we review the prediction methods and associated datasets and discuss the requirements and construction methods of binding affinity prediction models for protein design.
Affiliation(s)
- Zhongliang Guo
- Division of Cancer Systems Biology, Aichi Cancer Center Research Institute, Nagoya, Aichi, Japan
| | - Rui Yamaguchi
- Division of Cancer Systems Biology, Aichi Cancer Center Research Institute, Nagoya, Aichi, Japan; Division of Cancer Informatics, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan. *Correspondence: Rui Yamaguchi
| |
|
47
|
Montemurro A, Jessen LE, Nielsen M. NetTCR-2.1: Lessons and guidance on how to develop models for TCR specificity predictions. Front Immunol 2022; 13:1055151. [PMID: 36561755 PMCID: PMC9763291 DOI: 10.3389/fimmu.2022.1055151] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/27/2022] [Accepted: 11/22/2022] [Indexed: 12/12/2022] Open
Abstract
T cell receptors (TCRs) define the specificity of T cells and are responsible for their interaction with peptide antigen targets presented in complex with major histocompatibility complex (MHC) molecules. Understanding the rules underlying this interaction hence forms the foundation for our understanding of basic adaptive immunology. Over the last decade, efforts have been dedicated to developing assays for high-throughput identification of peptide-specific TCRs. Based on such data, several computational methods have been proposed for predicting the TCR-pMHC interaction. The general conclusion from these studies is that the prediction of TCR interactions with MHC-peptide complexes remains highly challenging. Several reasons form the basis for this, including the scarcity and quality of data and ill-defined modeling objectives imposed by the high redundancy of the available data. In this work, we propose a framework for dealing with this redundancy, allowing us to address essential questions related to the modeling of TCR specificity, including the use of peptide- versus pan-specific models, how to best define negative data, and the performance impact of integrating the CDR1 and CDR2 loops. Further, we illustrate how and why it is strongly recommended to include simple similarity-based modeling approaches when validating an improved predictive power of machine learning models, and that such validation should include a performance evaluation as a function of "distance" to the training data, to quantify the potential for generalization of the proposed model. The conclusion of the work is that, given current data, TCR specificity is best modeled using peptide-specific approaches, integrating information from all 6 CDR loops, and with negative data constructed from a combination of true and mislabeled negatives. Comparing such machine learning models to similarity-based approaches demonstrated an increased performance gain of the former as the "distance" to the training data was increased, thus demonstrating an improved generalization ability of the machine learning-based approaches. We believe these results demonstrate that the outlined modeling framework and proposed evaluation strategy form a solid basis for investigating the modeling of TCR specificities and that adhering to such a framework will allow for faster progress within the field. The final developed model, NetTCR-2.1, is available at https://services.healthtech.dtu.dk/service.php?NetTCR-2.1.
Affiliation(s)
- Alessandro Montemurro
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, DTU, 2800 Kgs., Lyngby, Denmark
| | - Leon Eyrich Jessen
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, DTU, 2800 Kgs., Lyngby, Denmark
| | - Morten Nielsen
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, DTU, 2800 Kgs., Lyngby, Denmark; Instituto de Investigaciones Biotecnológicas, Universidad Nacional de San Martín, Buenos Aires, Argentina. *Correspondence: Morten Nielsen
| |
|
48
|
Sidhom JW, Oliveira G, Ross-MacDonald P, Wind-Rotolo M, Wu CJ, Pardoll DM, Baras AS. Deep learning reveals predictive sequence concepts within immune repertoires to immunotherapy. Science Advances 2022; 8:eabq5089. [PMID: 36112691 PMCID: PMC9481116 DOI: 10.1126/sciadv.abq5089] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 04/12/2022] [Accepted: 07/29/2022] [Indexed: 06/09/2023]
Abstract
T cell receptor (TCR) sequencing has been used to characterize the immune response to cancer. However, most analyses have been restricted to quantitative measures such as clonality that do not leverage the complementarity-determining region 3 (CDR3) sequence. We use DeepTCR, a framework of deep learning algorithms, to reveal sequence concepts that are predictive of response to immunotherapy. We demonstrate that DeepTCR can predict response and use the model to infer the antigenic specificities of the predictive signature and their unique dynamics during therapy. The predictive signature of nonresponse is associated with high frequencies of TCRs predicted to recognize tumor-specific antigens, and these tumor-specific TCRs undergo a higher degree of dynamic changes on therapy in nonresponders versus responders. These results are consistent with a biological model where the hallmark of nonresponders is an accumulation of tumor-specific T cells that undergo turnover on therapy, possibly because of the dysfunctional state of these T cells in nonresponders.
Affiliation(s)
- John-William Sidhom
- Bloomberg Kimmel Institute for Cancer Immunotherapy, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Giacomo Oliveira
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | | | | | - Catherine J. Wu
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA
| | - Drew M. Pardoll
- Bloomberg Kimmel Institute for Cancer Immunotherapy, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Alexander S. Baras
- Bloomberg Kimmel Institute for Cancer Immunotherapy, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| |
|
49
|
Ji F, Chen L, Chen Z, Luo B, Wang Y, Lan X. TCR repertoire and transcriptional signatures of circulating tumour-associated T cells facilitate effective non-invasive cancer detection. Clin Transl Med 2022; 12:e853. [PMID: 36134717 PMCID: PMC9494610 DOI: 10.1002/ctm2.853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2022] [Revised: 04/11/2022] [Accepted: 04/15/2022] [Indexed: 11/10/2022] Open
Affiliation(s)
- Fansen Ji
- Tsinghua-Peking Center for Life Sciences, MOE Key Laboratory of Tsinghua University, Beijing, China; School of Medicine, Tsinghua University, Beijing, China
| | - Lin Chen
- School of Medicine, Tsinghua University, Beijing, China; General Surgery Department, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
| | - Zhizhuo Chen
- School of Life Science, Tsinghua University, Beijing, China
| | - Bin Luo
- General Surgery Department, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
| | - Yongwang Wang
- Department of Anesthesiology, Affiliated Hospital of Guilin Medical University, Guilin, China
| | - Xun Lan
- Tsinghua-Peking Center for Life Sciences, MOE Key Laboratory of Tsinghua University, Beijing, China; School of Medicine, Tsinghua University, Beijing, China
| |
|
50
|
Cai M, Bang S, Zhang P, Lee H. ATM-TCR: TCR-Epitope Binding Affinity Prediction Using a Multi-Head Self-Attention Model. Front Immunol 2022; 13:893247. [PMID: 35874725 PMCID: PMC9299376 DOI: 10.3389/fimmu.2022.893247] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/10/2022] [Accepted: 04/27/2022] [Indexed: 11/29/2022] Open
Abstract
TCR-epitope pair binding is the key component for T cell regulation. The ability to predict whether a given pair binds is fundamental to understanding the underlying biology of the binding mechanism as well as developing T-cell mediated immunotherapy approaches. The advent of large-scale public databases containing TCR-epitope binding pairs enabled the recent development of computational prediction methods for TCR-epitope binding. However, the number of epitopes reported along with binding TCRs is far too small, resulting in poor out-of-sample performance for unseen epitopes. To address this issue, we present our model, ATM-TCR, which uses a multi-head self-attention mechanism to capture biological contextual information and improve generalization performance. Additionally, we present a novel application of the attention map from our model to improve out-of-sample performance, demonstrating it on recent SARS-CoV-2 data.
Affiliation(s)
- Michael Cai
- School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, United States; Biodesign Institute, Arizona State University, Tempe, AZ, United States
| | - Seojin Bang
- Biodesign Institute, Arizona State University, Tempe, AZ, United States
| | - Pengfei Zhang
- School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, United States; Biodesign Institute, Arizona State University, Tempe, AZ, United States
| | - Heewook Lee
- School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, United States; Biodesign Institute, Arizona State University, Tempe, AZ, United States
| |
|