1. Cai Y, Luo M, Yang W, Xu C, Wang P, Xue G, Jin X, Cheng R, Que J, Zhou W, Pang B, Xu S, Li Y, Jiang Q, Xu Z. The Deep Learning Framework iCanTCR Enables Early Cancer Detection Using the T-cell Receptor Repertoire in Peripheral Blood. Cancer Res 2024; 84:1915-1928. [PMID: 38536129 DOI: 10.1158/0008-5472.can-23-0860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2023] [Revised: 07/20/2023] [Accepted: 03/19/2024] [Indexed: 06/05/2024]
Abstract
T cells recognize tumor antigens and initiate an anticancer immune response in the very early stages of tumor development, and the antigen specificity of T cells is determined by the T-cell receptor (TCR). Therefore, monitoring changes in the TCR repertoire in peripheral blood may offer a strategy to detect various cancers at a relatively early stage. Here, we developed the deep learning framework iCanTCR to identify patients with cancer based on the TCR repertoire. The iCanTCR framework uses TCRβ sequences from an individual as an input and outputs the predicted cancer probability. The model was trained on over 2,000 publicly available TCR repertoires from 11 types of cancer and healthy controls. Analysis of several additional publicly available datasets validated the ability of iCanTCR to distinguish patients with cancer from noncancer individuals and demonstrated the capability of iCanTCR for the accurate classification of multiple cancers. Importantly, iCanTCR precisely identified individuals with early-stage cancer with an AUC of 86%. Altogether, this work provides a liquid biopsy approach to capture immune signals from peripheral blood for noninvasive cancer diagnosis. SIGNIFICANCE Development of a deep learning-based method for multicancer detection using the TCR repertoire in the peripheral blood establishes the potential of evaluating circulating immune signals for noninvasive early cancer detection.
Affiliation(s)
- Yideng Cai
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Meng Luo
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Wenyi Yang
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Chang Xu
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Pingping Wang
  - School for Interdisciplinary Medicine and Engineering, Harbin Medical University, Harbin, China
- Guangfu Xue
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Xiyun Jin
  - School for Interdisciplinary Medicine and Engineering, Harbin Medical University, Harbin, China
- Rui Cheng
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Jinhao Que
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Wenyang Zhou
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Boran Pang
  - Center for Difficult and Complicated Abdominal Surgery, Shanghai Tenth People's Hospital, Tongji University School of Medicine, Shanghai, China
- Shouping Xu
  - Department of Breast Cancer, Harbin Medical University Cancer Hospital, Harbin, China
- Yu Li
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Qinghua Jiang
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
  - School for Interdisciplinary Medicine and Engineering, Harbin Medical University, Harbin, China
- Zhaochun Xu
  - School for Interdisciplinary Medicine and Engineering, Harbin Medical University, Harbin, China
2. Zaslavsky ME, Craig E, Michuda JK, Sehgal N, Ram-Mohan N, Lee JY, Nguyen KD, Hoh RA, Pham TD, Röltgen K, Lam B, Parsons ES, Macwana SR, DeJager W, Drapeau EM, Roskin KM, Cunningham-Rundles C, Moody MA, Haynes BF, Goldman JD, Heath JR, Nadeau KC, Pinsky BA, Blish CA, Hensley SE, Jensen K, Meyer E, Balboni I, Utz PJ, Merrill JT, Guthridge JM, James JA, Yang S, Tibshirani R, Kundaje A, Boyd SD. Disease diagnostics using machine learning of immune receptors. bioRxiv 2024:2022.04.26.489314. [PMID: 35547855 PMCID: PMC9094102 DOI: 10.1101/2022.04.26.489314] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/25/2022]
Abstract
Clinical diagnosis typically incorporates physical examination, patient history, and various laboratory tests and imaging studies, but makes limited use of the human immune system's own record of antigen exposures encoded by receptors on B cells and T cells. We analyzed immune receptor datasets from 593 individuals to develop MAchine Learning for Immunological Diagnosis (Mal-ID), an interpretive framework to screen for multiple illnesses simultaneously or precisely test for one condition. This approach detects specific infections, autoimmune disorders, vaccine responses, and disease severity differences. Human-interpretable features of the model recapitulate known immune responses to SARS-CoV-2, Influenza, and HIV, highlight antigen-specific receptors, and reveal distinct characteristics of Systemic Lupus Erythematosus and Type-1 Diabetes autoreactivity. This analysis framework has broad potential for scientific and clinical interpretation of human immune responses.
3. Qian X, Yang G, Li F, Zhang X, Zhu X, Lai X, Xiao X, Wang T, Wang J. DeepLION2: deep multi-instance contrastive learning framework enhancing the prediction of cancer-associated T cell receptors by attention strategy on motifs. Front Immunol 2024; 15:1345586. [PMID: 38515756 PMCID: PMC10956474 DOI: 10.3389/fimmu.2024.1345586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 02/19/2024] [Indexed: 03/23/2024] Open
Abstract
Introduction T cell receptor (TCR) repertoires provide valuable insights into complex human diseases, including cancers. Recent advancements in immune sequencing technology have significantly improved our understanding of TCR repertoire. Some computational methods have been devised to identify cancer-associated TCRs and enable cancer detection using TCR sequencing data. However, the existing methods are often limited by their inadequate consideration of the correlations among TCRs within a repertoire, hindering the identification of crucial TCRs. Additionally, the sparsity of cancer-associated TCR distribution presents a challenge in accurate prediction. Methods To address these issues, we presented DeepLION2, an innovative deep multi-instance contrastive learning framework specifically designed to enhance cancer-associated TCR prediction. DeepLION2 leveraged content-based sparse self-attention, focusing on the top k related TCRs for each TCR, to effectively model inter-TCR correlations. Furthermore, it adopted a contrastive learning strategy for bootstrapping parameter updates of the attention matrix, preventing the model from fixating on non-cancer-associated TCRs. Results Extensive experimentation on diverse patient cohorts, encompassing over ten cancer types, demonstrated that DeepLION2 significantly outperformed current state-of-the-art methods in terms of accuracy, sensitivity, specificity, Matthews correlation coefficient, and area under the curve (AUC). Notably, DeepLION2 achieved impressive AUC values of 0.933, 0.880, and 0.763 on thyroid, lung, and gastrointestinal cancer cohorts, respectively. Furthermore, it effectively identified cancer-associated TCRs along with their key motifs, highlighting the amino acids that play a crucial role in TCR-peptide binding. Conclusion These compelling results underscore DeepLION2's potential for enhancing cancer detection and facilitating personalized cancer immunotherapy. 
DeepLION2 is publicly available on GitHub, at https://github.com/Bioinformatics7181/DeepLION2, for academic use only.
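The "content-based sparse self-attention, focusing on the top k related TCRs for each TCR" described in the abstract above can be illustrated with a minimal NumPy sketch. This is not the DeepLION2 implementation: the function name `topk_sparse_self_attention`, the use of raw embeddings as queries, keys, and values (no learned projections), and the scaling by the square root of the embedding size are all assumptions made for illustration.

```python
import numpy as np

def topk_sparse_self_attention(x, k):
    """Hypothetical sketch: each row of x (one TCR embedding) attends only
    to the k rows with the highest dot-product similarity to it."""
    n, d = x.shape
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of rows")
    scores = x @ x.T / np.sqrt(d)                  # (n, n) similarity scores
    # Column indices of the k largest scores in every row.
    topk_idx = np.argpartition(scores, -k, axis=1)[:, -k:]
    mask = np.full_like(scores, -np.inf)           # block everything ...
    np.put_along_axis(mask, topk_idx, 0.0, axis=1) # ... except the top k
    masked = scores + mask
    masked -= masked.max(axis=1, keepdims=True)    # numerically stable softmax
    weights = np.exp(masked)                       # exp(-inf) == 0 for masked entries
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x, weights

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(6, 4))               # 6 toy "TCRs", 4 features each
attended, attn = topk_sparse_self_attention(embeddings, k=3)
```

Each row of `attn` sums to one and has exactly `k` non-zero entries, which is the sparsity property the abstract refers to. A trained model would additionally learn the projections that produce the scores.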
Affiliation(s)
- Xinyang Qian
  - School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
  - Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
- Guang Yang
  - Department of Clinical Oncology, The Second Affiliated Hospital of Air Force Medical University, Xi’an, China
- Fan Li
  - School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
  - Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
- Xuanping Zhang
  - School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
  - Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
- Xiaoyan Zhu
  - School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
  - Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
- Xin Lai
  - School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
  - Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
- Xiao Xiao
  - Genomics Institute, Geneplus-Shenzhen, Shenzhen, China
- Tao Wang
  - Department of Thoracic Surgery, The Second Affiliated Hospital of Air Force Medical University, Xi’an, China
- Jiayin Wang
  - School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
  - Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
4. Tayebi Z, Ali S, Murad T, Khan I, Patterson M. PseAAC2Vec protein encoding for TCR protein sequence classification. Comput Biol Med 2024; 170:107956. [PMID: 38217977 DOI: 10.1016/j.compbiomed.2024.107956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 12/07/2023] [Accepted: 01/01/2024] [Indexed: 01/15/2024]
Abstract
The classification and prediction of T-cell receptor (TCR) protein sequences are of significant interest in understanding the immune system and developing personalized immunotherapies. In this study, we propose a novel approach using Pseudo Amino Acid Composition (PseAAC) protein encoding for accurate TCR protein sequence classification. The PseAAC2Vec encoding method captures the physicochemical properties of amino acids and their local sequence information, enabling the representation of protein sequences as fixed-length feature vectors. By incorporating physicochemical properties such as hydrophobicity, polarity, charge, molecular weight, and solvent accessibility, PseAAC2Vec provides a comprehensive and informative characterization of TCR protein sequences. To evaluate the effectiveness of the proposed PseAAC2Vec encoding approach, we assembled a large dataset of TCR protein sequences with annotated classes. We applied the PseAAC2Vec encoding scheme to each sequence and generated feature vectors based on a specified window size. Subsequently, we employed state-of-the-art machine learning algorithms, such as support vector machines (SVM) and random forests (RF), to classify the TCR protein sequences. Experimental results on the benchmark dataset demonstrated the superior performance of the PseAAC2Vec-based approach compared to existing methods. The PseAAC2Vec encoding effectively captures the discriminative patterns in TCR protein sequences, leading to improved classification accuracy and robustness. Furthermore, the encoding scheme showed promising results across different window sizes, indicating its adaptability to varying sequence contexts.
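To make the idea of a fixed-length physicochemical encoding concrete, here is a minimal sketch of the classic pseudo amino acid composition that PseAAC-style methods build on: 20 composition terms plus `lam` sequence-order correlation terms. It is not the PseAAC2Vec method itself. It uses a single property (the Kyte-Doolittle hydropathy scale), and the function name `pseaac`, the defaults `lam=3` and `w=0.05`, and the example CDR3-like sequence are assumptions chosen for illustration.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Kyte-Doolittle hydropathy index, used here as the only property (an assumption).
HYDROPATHY = {
    "A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8, "G": -0.4, "H": -3.2,
    "I": 4.5, "K": -3.9, "L": 3.8, "M": 1.9, "N": -3.5, "P": -1.6, "Q": -3.5,
    "R": -4.5, "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9, "Y": -1.3,
}

def pseaac(seq, lam=3, w=0.05):
    """Return a (20 + lam)-dimensional vector for a protein sequence."""
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")
    # Standardize the property over the 20 amino acids (zero mean, unit variance).
    values = np.array([HYDROPATHY[a] for a in AMINO_ACIDS])
    standardized = dict(zip(AMINO_ACIDS, (values - values.mean()) / values.std()))
    h = np.array([standardized[a] for a in seq])
    # Plain amino acid composition.
    freqs = np.array([seq.count(a) for a in AMINO_ACIDS], dtype=float) / len(seq)
    # Sequence-order correlation factors for gaps k = 1..lam.
    theta = np.array([np.mean((h[:-k] - h[k:]) ** 2) for k in range(1, lam + 1)])
    denom = freqs.sum() + w * theta.sum()
    return np.concatenate([freqs, w * theta]) / denom

vector = pseaac("CASSLGQGAYEQYF")  # hypothetical example sequence
```

The resulting vector has the same length for every input sequence and sums to one, so it can be passed directly to a classifier such as an SVM or random forest.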
Affiliation(s)
- Zahra Tayebi
  - Department of Computer Science, Georgia State University, Atlanta, 30303, GA, USA.
- Sarwan Ali
  - Department of Computer Science, Georgia State University, Atlanta, 30303, GA, USA.
- Taslim Murad
  - Department of Computer Science, Georgia State University, Atlanta, 30303, GA, USA.
- Imdadullah Khan
  - Department of Computer Science, Lahore University of Management Sciences, Lahore, Punjab, Pakistan.
- Murray Patterson
  - Department of Computer Science, Georgia State University, Atlanta, 30303, GA, USA.
5. Wang M, Patsenker J, Li H, Kluger Y, Kleinstein S. Language model-based B cell receptor sequence embeddings can effectively encode receptor specificity. Nucleic Acids Res 2024; 52:548-557. [PMID: 38109302 PMCID: PMC10810273 DOI: 10.1093/nar/gkad1128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 10/18/2023] [Accepted: 11/11/2023] [Indexed: 12/20/2023] Open
Abstract
High throughput sequencing of B cell receptors (BCRs) is increasingly applied to study the immense diversity of antibodies. Learning biologically meaningful embeddings of BCR sequences is beneficial for predictive modeling. Several embedding methods have been developed for BCRs, but no direct performance benchmarking exists. Moreover, the impact of the input sequence length and paired-chain information on the prediction remains to be explored. We evaluated the performance of multiple embedding models to predict BCR sequence properties and receptor specificity. Despite the differences in model architectures, most embeddings effectively capture BCR sequence properties and specificity. BCR-specific embeddings slightly outperform general protein language models in predicting specificity. In addition, incorporating full-length heavy chains and paired light chain sequences improves the prediction performance of all embeddings. This study provides insights into the properties of BCR embeddings to improve downstream prediction applications for antibody analysis and discovery.
Affiliation(s)
- Meng Wang
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
- Henry Li
  - Program in Applied Mathematics, Yale University, New Haven, CT, USA
- Yuval Kluger
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
  - Program in Applied Mathematics, Yale University, New Haven, CT, USA
  - Department of Pathology, Yale School of Medicine, New Haven, CT, USA
- Steven H Kleinstein
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
  - Department of Pathology, Yale School of Medicine, New Haven, CT, USA
  - Department of Immunobiology, Yale School of Medicine, New Haven, CT, USA
6. Minotto T, Robert PA, Hobæk Haff I, Sandve GK. Assessing the feasibility of statistical inference using synthetic antibody-antigen datasets. Stat Appl Genet Mol Biol 2024; 23:sagmb-2023-0027. [PMID: 38563699 DOI: 10.1515/sagmb-2023-0027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Accepted: 03/13/2024] [Indexed: 04/04/2024]
Abstract
Simulation frameworks are useful to stress-test predictive models when data is scarce, or to assert model sensitivity to specific data distributions. Such frameworks often need to recapitulate several layers of data complexity, including emergent properties that arise implicitly from the interaction between simulation components. Antibody-antigen binding is a complex mechanism by which an antibody sequence wraps itself around an antigen with high affinity. In this study, we use a synthetic simulation framework for antibody-antigen folding and binding on a 3D lattice that includes full details on the spatial conformation of both molecules. We investigate how emergent properties arise in this framework, in particular the physical proximity of amino acids, their presence on the binding interface, or the binding status of a sequence, and relate that to the individual and pairwise contributions of amino acids in statistical models for binding prediction. We show that weights learnt from a simple logistic regression model align with some but not all features of amino acids involved in the binding, and that predictive sequence binding patterns can be enriched. In particular, main effects correlated with the capacity of a sequence to bind any antigen, while statistical interactions were related to sequence specificity.
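The distinction the abstract draws between main effects (individual amino acids) and statistical interactions (pairs of amino acids) can be sketched as two feature maps feeding a logistic model. This is only an illustration under stated assumptions: the one-hot encoding, the function names, and the toy sequence `"CARDY"` are hypothetical, and the study's actual model specification is not given in the abstract.

```python
from itertools import combinations

import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def main_effect_features(seq):
    """One indicator per (position, amino acid): length 20 * len(seq)."""
    x = np.zeros((len(seq), 20))
    for pos, aa in enumerate(seq):
        x[pos, AA_INDEX[aa]] = 1.0
    return x.ravel()

def pairwise_features(seq):
    """One indicator per (position pair, amino acid pair)."""
    pairs = list(combinations(range(len(seq)), 2))
    x = np.zeros((len(pairs), 20, 20))
    for p, (i, j) in enumerate(pairs):
        x[p, AA_INDEX[seq[i]], AA_INDEX[seq[j]]] = 1.0
    return x.ravel()

def binding_probability(seq, w_main, w_pair, bias=0.0):
    """Logistic model: main-effect weights plus pairwise interaction weights."""
    z = bias + w_main @ main_effect_features(seq) + w_pair @ pairwise_features(seq)
    return 1.0 / (1.0 + np.exp(-z))

seq = "CARDY"  # hypothetical fixed-length sequence
w_main = np.zeros(20 * len(seq))
w_pair = np.zeros(10 * 400)  # C(5, 2) = 10 position pairs, 20 * 20 residue pairs
p = binding_probability(seq, w_main, w_pair)
```

After fitting, the entries of `w_main` would be read as main effects and the entries of `w_pair` as interaction terms, which is the kind of weight inspection the abstract describes.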
Affiliation(s)
- Thomas Minotto
  - Department of Mathematics, University of Oslo, Oslo, Norway
- Philippe A Robert
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
  - Department of Biomedicine, University of Basel, Basel, Switzerland
- Geir K Sandve
  - Department of Informatics, University of Oslo, Oslo, Norway
7. Chen C, Liu Y, Yao J, Wang K, Zhang M, Shi F, Tian Y, Gao L, Ying Y, Pan Q, Wang H, Wu J, Qi X, Wang Y, Xu D. Deep learning approaches for differentiating thyroid nodules with calcification: a two-center study. BMC Cancer 2023; 23:1139. [PMID: 37996814 PMCID: PMC10668439 DOI: 10.1186/s12885-023-11456-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Accepted: 09/27/2023] [Indexed: 11/25/2023] Open
Abstract
BACKGROUND Calcification is a common phenomenon in both benign and malignant thyroid nodules. However, the clinical significance of calcification remains unclear. Therefore, we explored a more objective method for distinguishing between benign and malignant thyroid calcified nodules. METHODS This retrospective study, conducted at two centers, involved a total of 631 thyroid nodules, all of which were pathologically confirmed. Ultrasound image sets were employed for analysis. The primary evaluation index was the area under the receiver-operator characteristic curve (AUROC). We compared the diagnostic performance of deep learning (DL) methods with that of radiologists and determined whether DL could enhance the diagnostic capabilities of radiologists. RESULTS The Xception classification model exhibited the highest performance, achieving an AUROC of up to 0.970, followed by the DenseNet169 model, which attained an AUROC of up to 0.959. Notably, both DL models outperformed radiologists (P < 0.05). The success of the Xception model can be attributed to its incorporation of deep separable convolution, which effectively reduces the model's parameter count. This feature enables the model to capture features more effectively during the feature extraction process, resulting in superior performance, particularly when dealing with limited data. CONCLUSIONS This study conclusively demonstrated that DL outperformed radiologists in differentiating between benign and malignant calcified thyroid nodules. Additionally, the diagnostic capabilities of radiologists could be enhanced with the aid of DL.
Affiliation(s)
- Chen Chen
  - Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, China
  - Wenling Big Data and Artificial Intelligence Institute in Medicine, Taizhou, 317502, China
  - Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), Taizhou, 317502, China
- Yuanzhen Liu
  - Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, China
  - Wenling Big Data and Artificial Intelligence Institute in Medicine, Taizhou, 317502, China
  - Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), Taizhou, 317502, China
- Jincao Yao
  - Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, China
  - Zhejiang Provincial Research Center for Cancer Intelligent Diagnosis and Molecular Technology, Hangzhou, 310022, China
  - Key Laboratory of Head & Neck Cancer Translational Research of Zhejiang Province, Hangzhou, 310022, China
- Kai Wang
  - Department of Ultrasound, The Affiliated Dongyang Hospital of Wenzhou Medical University, Dongyang, 317502, China
- Maoliang Zhang
  - Department of Ultrasound, The Affiliated Dongyang Hospital of Wenzhou Medical University, Dongyang, 317502, China
- Fang Shi
  - Capacity Building and Continuing Education Center of National Health Commission, Beijing, 100098, China
- Yuan Tian
  - Capacity Building and Continuing Education Center of National Health Commission, Beijing, 100098, China
- Lu Gao
  - Capacity Building and Continuing Education Center of National Health Commission, Beijing, 100098, China
- Yajun Ying
  - Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), Taizhou, 317502, China
- Qianmeng Pan
  - Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), Taizhou, 317502, China
- Hui Wang
  - Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), Taizhou, 317502, China
- Jinxin Wu
  - Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), Taizhou, 317502, China
- Xiaoqing Qi
  - Department of Ultrasound, Hangzhou Ninth People's Hospital, Hangzhou, 311225, China
- Yifan Wang
  - Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, China
  - Wenling Big Data and Artificial Intelligence Institute in Medicine, Taizhou, 317502, China
  - Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), Taizhou, 317502, China
- Dong Xu
  - Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, China
  - Wenling Big Data and Artificial Intelligence Institute in Medicine, Taizhou, 317502, China
  - Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), Taizhou, 317502, China
8. Bravi B, Di Gioacchino A, Fernandez-de-Cossio-Diaz J, Walczak AM, Mora T, Cocco S, Monasson R. A transfer-learning approach to predict antigen immunogenicity and T-cell receptor specificity. eLife 2023; 12:e85126. [PMID: 37681658 PMCID: PMC10522340 DOI: 10.7554/elife.85126] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/23/2022] [Accepted: 09/07/2023] [Indexed: 09/09/2023] Open
Abstract
Antigen immunogenicity and the specificity of binding of T-cell receptors to antigens are key properties underlying effective immune responses. Here we propose diffRBM, an approach based on transfer learning and Restricted Boltzmann Machines, to build sequence-based predictive models of these properties. DiffRBM is designed to learn the distinctive patterns in amino-acid composition that, on the one hand, underlie the antigen's probability of triggering a response, and on the other hand the T-cell receptor's ability to bind to a given antigen. We show that the patterns learnt by diffRBM allow us to predict putative contact sites of the antigen-receptor complex. We also discriminate immunogenic and non-immunogenic antigens, antigen-specific and generic receptors, reaching performances that compare favorably to existing sequence-based predictors of antigen immunogenicity and T-cell receptor specificity.
Affiliation(s)
- Barbara Bravi
  - Department of Mathematics, Imperial College London, London, United Kingdom
  - Laboratoire de Physique de l’Ecole Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université Paris-Cité, Paris, France
- Andrea Di Gioacchino
  - Laboratoire de Physique de l’Ecole Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université Paris-Cité, Paris, France
- Jorge Fernandez-de-Cossio-Diaz
  - Laboratoire de Physique de l’Ecole Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université Paris-Cité, Paris, France
- Aleksandra M Walczak
  - Laboratoire de Physique de l’Ecole Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université Paris-Cité, Paris, France
- Thierry Mora
  - Laboratoire de Physique de l’Ecole Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université Paris-Cité, Paris, France
- Simona Cocco
  - Laboratoire de Physique de l’Ecole Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université Paris-Cité, Paris, France
- Rémi Monasson
  - Laboratoire de Physique de l’Ecole Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université Paris-Cité, Paris, France
9. Yang M, Huang ZA, Zhou W, Ji J, Zhang J, He S, Zhu Z. MIX-TPI: a flexible prediction framework for TCR-pMHC interactions based on multimodal representations. Bioinformatics 2023; 39:btad475. [PMID: 37527015 PMCID: PMC10423027 DOI: 10.1093/bioinformatics/btad475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 07/05/2023] [Accepted: 07/29/2023] [Indexed: 08/03/2023] Open
Abstract
MOTIVATION The interactions between T-cell receptors (TCR) and peptide-major histocompatibility complex (pMHC) are essential for the adaptive immune system. However, identifying these interactions can be challenging due to the limited availability of experimental data, sequence data heterogeneity, and high experimental validation costs. RESULTS To address this issue, we develop a novel computational framework, named MIX-TPI, to predict TCR-pMHC interactions using amino acid sequences and physicochemical properties. Based on convolutional neural networks, MIX-TPI incorporates sequence-based and physicochemical-based extractors to refine the representations of TCR-pMHC interactions. Each modality is projected into modality-invariant and modality-specific representations to capture the uniformity and diversities between different features. A self-attention fusion layer is then adopted to form the classification module. Experimental results demonstrate the effectiveness of MIX-TPI in comparison with other state-of-the-art methods. MIX-TPI also shows good generalization capability on mutually exclusive evaluation datasets and a paired TCR dataset. AVAILABILITY AND IMPLEMENTATION The source code of MIX-TPI and the test data are available at: https://github.com/Wolverinerine/MIX-TPI.
Affiliation(s)
- Minghao Yang
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
- Zhi-An Huang
  - Research Office, City University of Hong Kong (Dongguan), Dongguan 523000, China
- Wei Zhou
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
- Junkai Ji
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
- Jun Zhang
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
- Shan He
  - School of Computer Science, University of Birmingham, Birmingham B15 2TT, United Kingdom
- Zexuan Zhu
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
  - National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen 518060, China
10. Ostmeyer J, Park JY, von Itzstein MS, Hsiehchen D, Fattah F, Gwin M, Catalan R, Khan S, Raj P, Wakeland EK, Xie Y, Gerber DE. T-cell tolerant fraction as a predictor of immune-related adverse events. J Immunother Cancer 2023; 11:e006437. [PMID: 37580069 PMCID: PMC10432621 DOI: 10.1136/jitc-2022-006437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/28/2023] [Indexed: 08/16/2023] Open
Abstract
BACKGROUND Immune checkpoint inhibitor (ICI) therapies may cause unpredictable and potentially severe autoimmune toxicities termed immune-related adverse events (irAEs). Because T cells mediate ICI effects, T cell profiling may provide insight into the risk of irAEs. Here we evaluate a novel metric, the T-cell tolerant fraction, as a predictor of future irAEs. METHODS We examined T-cell receptor beta (TRB) locus sequencing from baseline pretreatment samples from an institutional registry and previously published studies. For each patient, we used TRB sequences to calculate the T-cell tolerant fraction, which was then assessed as a predictor of future irAEs (classified as Common Terminology Criteria for Adverse Event grade 0-1 vs grade ≥2). We then compared the tolerant fraction to TRB clonality and diversity. Finally, the tolerant fraction was assessed on (1) T cells enriched against napsin A, a potential autoantigen of irAEs; (2) thymic versus peripheral blood T cells; and (3) TRBs specific for various infections and autoimmune diseases. RESULTS A total of 77 patients with cancer (22 from an institutional registry and 55 from published studies) receiving ICI therapy (43 CTLA4, 19 PD1/PDL1, 15 combination CTLA4+PD1/PDL1) were included in the study. The tolerant fraction was significantly lower in cases with clinically significant irAEs (p<0.001) and had an area under the receiver operating curve (AUC) of 0.79. The tolerant fraction was lower for each ICI treatment category, reaching statistical significance for CTLA4 (p<0.001) and demonstrating non-significant trends for PD1/PDL1 (p=0.21) and combination ICI (p=0.18). The tolerant fraction for T cells enriched against napsin A was lower than other samples. The tolerant fraction was also lower in thymic versus peripheral blood samples, and lower in some (multiple sclerosis) but not other (type 1 diabetes) autoimmune diseases.
In our study cohort, TRB clonality had an AUC of 0.62, and TRB diversity had an AUC of 0.60 for predicting irAEs. CONCLUSIONS Among patients receiving ICI, the baseline T-cell tolerant fraction may serve as a predictor of clinically significant irAEs.
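The AUC values quoted above summarize how well a single per-patient score separates the two outcome groups. A minimal sketch of that computation follows, using the pairwise (Mann-Whitney) definition of AUC. The function name `auc` and the toy numbers are hypothetical and are not data from the study. Because the abstract reports that the tolerant fraction is lower in patients with irAEs, the sketch negates the score so that higher values point toward the positive class.

```python
def auc(scores, labels):
    """Probability that a random positive case scores higher than a random
    negative case, with ties counted as one half (Mann-Whitney definition)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical tolerant fractions; label 1 marks a clinically significant irAE.
tolerant_fraction = [0.20, 0.30, 0.60, 0.70]
irae = [1, 1, 0, 0]
# Lower fraction suggests higher risk, so the negated value is used as the score.
risk_auc = auc([-t for t in tolerant_fraction], irae)
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, which gives context for the reported 0.79, 0.62, and 0.60.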
Affiliation(s)
- Jared Ostmeyer
- Peter O'Donnell School of Public Health, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Jason Y Park
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Mitchell S von Itzstein
- Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - David Hsiehchen
- Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Harold C Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Farjana Fattah
- Harold C Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Mary Gwin
- Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Rodrigo Catalan
- Harold C Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Shaheen Khan
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Prithvi Raj
- Department of Immunology, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Edward K Wakeland
- Department of Immunology, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Yang Xie
- Peter O'Donnell School of Public Health, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Harold C Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - David E Gerber
- Peter O'Donnell School of Public Health, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Harold C Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| |
|
11
|
Schmidt J, Chiffelle J, Perez MAS, Magnin M, Bobisse S, Arnaud M, Genolet R, Cesbron J, Barras D, Navarro Rodrigo B, Benedetti F, Michel A, Queiroz L, Baumgaertner P, Guillaume P, Hebeisen M, Michielin O, Nguyen-Ngoc T, Huber F, Irving M, Tissot-Renaud S, Stevenson BJ, Rusakiewicz S, Dangaj Laniti D, Bassani-Sternberg M, Rufer N, Gfeller D, Kandalaft LE, Speiser DE, Zoete V, Coukos G, Harari A. Neoantigen-specific CD8 T cells with high structural avidity preferentially reside in and eliminate tumors. Nat Commun 2023; 14:3188. [PMID: 37280206 DOI: 10.1038/s41467-023-38946-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 06/21/2022] [Accepted: 05/23/2023] [Indexed: 06/08/2023]
Abstract
The success of cancer immunotherapy depends in part on the strength of antigen recognition by T cells. Here, we characterize the T cell receptor (TCR) functional (antigen sensitivity) and structural (monomeric pMHC-TCR off-rates) avidities of 371 CD8 T cell clones specific for neoantigens, tumor-associated antigens (TAAs) or viral antigens isolated from tumors or blood of patients and healthy donors. T cells from tumors exhibit stronger functional and structural avidity than their blood counterparts. Relative to TAA-specific T cells, neoantigen-specific T cells are of higher structural avidity and, consistently, are preferentially detected in tumors. Effective tumor infiltration in mouse models is associated with high structural avidity and CXCR3 expression. Based on TCR biophysicochemical properties, we derive and apply an in silico model predicting TCR structural avidity and validate the enrichment of high-avidity T cells in patients' tumors. These observations indicate a direct relationship between neoantigen recognition, T cell functionality and tumor infiltration. These results delineate a rational approach to identify potent T cells for personalized cancer immunotherapy.
Affiliation(s)
- Julien Schmidt
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
- Center of Experimental Therapeutics, Department of Oncology, Lausanne University Hospital (CHUV), Lausanne, Switzerland
| | - Johanna Chiffelle
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
| | - Marta A S Perez
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
| | - Morgane Magnin
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
| | - Sara Bobisse
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
| | - Marion Arnaud
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
| | - Raphael Genolet
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
| | - Julien Cesbron
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
| | - David Barras
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
| | - Blanca Navarro Rodrigo
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
| | - Fabrizio Benedetti
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
| | - Alexandra Michel
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
| | - Lise Queiroz
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
| | - Petra Baumgaertner
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center of Experimental Therapeutics, Department of Oncology, Lausanne University Hospital (CHUV), Lausanne, Switzerland
| | - Philippe Guillaume
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
- Center of Experimental Therapeutics, Department of Oncology, Lausanne University Hospital (CHUV), Lausanne, Switzerland
| | - Michael Hebeisen
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
| | - Olivier Michielin
- Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
| | - Tu Nguyen-Ngoc
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
| | - Florian Huber
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
| | - Melita Irving
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
| | - Stéphanie Tissot-Renaud
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center of Experimental Therapeutics, Department of Oncology, Lausanne University Hospital (CHUV), Lausanne, Switzerland
| | - Brian J Stevenson
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
| | - Sylvie Rusakiewicz
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center of Experimental Therapeutics, Department of Oncology, Lausanne University Hospital (CHUV), Lausanne, Switzerland
| | - Denarda Dangaj Laniti
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
| | - Michal Bassani-Sternberg
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
| | - Nathalie Rufer
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
| | - David Gfeller
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
| | - Lana E Kandalaft
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
- Center of Experimental Therapeutics, Department of Oncology, Lausanne University Hospital (CHUV), Lausanne, Switzerland
| | - Daniel E Speiser
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
| | - Vincent Zoete
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
| | - George Coukos
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
- Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
| | - Alexandre Harari
- Ludwig Institute for Cancer Research, Lausanne University Hospital (CHUV) and University of Lausanne (UNIL), Agora Cancer Research Center, Lausanne, Switzerland
- Center for Cell Therapy, Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
| |
|
12
|
Hey S, Whyte D, Hoang MC, Le N, Natvig J, Wingfield C, Onyeama C, Howrylak J, Toby IT. Analysis of CDR3 Sequences from T-Cell Receptor β in Acute Respiratory Distress Syndrome. Biomolecules 2023; 13:biom13050825. [PMID: 37238695 DOI: 10.3390/biom13050825] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Revised: 05/04/2023] [Accepted: 05/06/2023] [Indexed: 05/28/2023]
Abstract
Acute Respiratory Distress Syndrome (ARDS) is an illness that typically develops in people who are significantly ill or have serious injuries. ARDS is characterized by fluid build-up that occurs in the alveoli. T-cells are implicated as playing a role in the modulation of the aberrant response leading to excessive tissue damage and, eventually, ARDS. Complementarity Determining Region 3 (CDR3) sequences derived from T-cells are key players in the adaptive immune response. This response is governed by an elaborate specificity for distinct molecules and the ability to recognize and vigorously respond to repeated exposures to the same molecules. Most of the diversity in T-cell receptors (TCRs) is contained in the CDR3 regions of the heterodimeric cell-surface receptors. For this study, we employed the novel technology of immune sequencing to assess lung edema fluid. Our goal was to explore the landscape of CDR3 clonal sequences found within these samples. We obtained more than 3615 CDR3 sequences across samples in the study. Our data demonstrate that: (1) CDR3 sequences from lung edema fluid exhibit distinct clonal populations, and (2) CDR3 sequences can be further characterized based on biochemical features. Analysis of these CDR3 sequences offers insight into the CDR3-driven T-cell repertoire of ARDS. These findings represent the first step towards applications of this technology with these types of biological samples in the context of ARDS.
Affiliation(s)
- Sara Hey
- Department of Biology, University of Dallas, Irving, TX 75062, USA
| | - Dayjah Whyte
- Department of Biology, University of Dallas, Irving, TX 75062, USA
| | - Minh-Chau Hoang
- Department of Biology, University of Dallas, Irving, TX 75062, USA
| | - Nick Le
- Department of Biology, University of Dallas, Irving, TX 75062, USA
| | - Joseph Natvig
- Department of Biology, University of Dallas, Irving, TX 75062, USA
| | - Claire Wingfield
- Department of Biology, University of Dallas, Irving, TX 75062, USA
| | | | - Judie Howrylak
- Pulmonary, Allergy and Critical Care Division, Penn State Milton S. Hershey Medical Center, Hershey, PA 17033, USA
| | - Inimary T Toby
- Department of Biology, University of Dallas, Irving, TX 75062, USA
| |
|
13
|
Sanromán ÁF, Joshi K, Au L, Chain B, Turajlic S. TCR sequencing: applications in immuno-oncology research. Immuno-Oncology Technology 2023; 17:100373. [PMID: 36908996 PMCID: PMC9996383 DOI: 10.1016/j.iotech.2023.100373] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 02/05/2023]
Abstract
• T-cell receptor (TCR) interaction with major histocompatibility complex-antigen complexes leads to antitumour responses.
• TCR sequencing analysis allows characterisation of T cells that recognise tumour neoantigens.
• T-cell clonal revival and clonal replacement potentially underpin immunotherapy responses.
Affiliation(s)
- Á F Sanromán
- Cancer Dynamics Laboratory, The Francis Crick Institute, London, UK
| | - K Joshi
- Department of Medical Oncology, The Royal Marsden NHS Foundation Trust, London, UK
- Renal and Skin Unit, The Royal Marsden NHS Foundation Trust, London, UK
| | - L Au
- Cancer Dynamics Laboratory, The Francis Crick Institute, London, UK
- Department of Medical Oncology, Peter MacCallum Cancer Centre, Melbourne, Australia
- Cancer Immunology Program, Peter MacCallum Cancer Centre, Melbourne, Australia
- Sir Peter MacCallum Department of Oncology, The University of Melbourne, Australia
| | - B Chain
- Division of Infection and Immunity, University College London, London, UK
- Department of Computer Science, University College London, London, UK
| | - S Turajlic
- Renal and Skin Unit, The Royal Marsden NHS Foundation Trust, London, UK
- Melanoma and Kidney Cancer Team, The Institute of Cancer Research, London, UK
| |
|
14
|
Andrade DS, Terrematte P, Rennó-Costa C, Zilberberg A, Efroni S. GENTLE: a novel bioinformatics tool for generating features and building classifiers from T cell repertoire cancer data. BMC Bioinformatics 2023; 24:32. [PMID: 36717789 PMCID: PMC9885559 DOI: 10.1186/s12859-023-05155-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Accepted: 01/23/2023] [Indexed: 01/31/2023]
Abstract
BACKGROUND In the global effort to discover biomarkers for cancer prognosis, prediction tools have become essential resources. TCR (T cell receptor) repertoires contain important features that differentiate healthy controls from cancer patients or differentiate outcomes for patients being treated with different drugs. Considering this, tools that can easily and quickly generate and identify important features from TCR repertoire data and build accurate classifiers to predict future outcomes are essential. RESULTS This paper introduces GENTLE (GENerator of T cell receptor repertoire features for machine LEarning): an open-source, user-friendly web-application tool that allows TCR repertoire researchers to discover important features; to create classifier models and evaluate them with metrics; and to quickly generate visualizations for data interpretation. We performed a case study with repertoires of TRegs (regulatory T cells) and TConvs (conventional T cells) from healthy controls versus patients with breast cancer. We showed that diversity features were able to distinguish between the groups. Moreover, the classifiers built with these features could correctly classify samples ('Healthy' or 'Breast Cancer') from the TRegs repertoire when trained with the TConvs repertoire, and from the TConvs repertoire when trained with the TRegs repertoire. CONCLUSION The paper walks through installing and using GENTLE and presents a case study and results to demonstrate the application's utility. GENTLE is geared towards any researcher working with TCR repertoire data and aims to discover predictive features from these data and build accurate classifiers. GENTLE is available on https://github.com/dhiego22/gentle and https://share.streamlit.io/dhiego22/gentle/main/gentle.py.
Affiliation(s)
- Dhiego Souto Andrade
- Bioinformatics Multidisciplinary Environment (BioME), Metropole Digital Institute (IMD), Federal University of Rio Grande Do Norte (UFRN), Natal, 59078-970 Brazil
| | - Patrick Terrematte
- Bioinformatics Multidisciplinary Environment (BioME), Metropole Digital Institute (IMD), Federal University of Rio Grande Do Norte (UFRN), Natal, 59078-970 Brazil
| | - César Rennó-Costa
- Bioinformatics Multidisciplinary Environment (BioME), Metropole Digital Institute (IMD), Federal University of Rio Grande Do Norte (UFRN), Natal, 59078-970 Brazil
| | - Alona Zilberberg
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
| | - Sol Efroni
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
| |
|
15
|
Akerman O, Isakov H, Levi R, Psevkin V, Louzoun Y. Counting is almost all you need. Front Immunol 2023; 13:1031011. [PMID: 36741395 PMCID: PMC9896581 DOI: 10.3389/fimmu.2022.1031011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2022] [Accepted: 12/27/2022] [Indexed: 01/21/2023]
Abstract
The immune memory repertoire encodes the history of present and past infections and immunological attributes of the individual. As such, multiple methods have been proposed to use T-cell receptor (TCR) repertoires to detect disease history. Here we show that the counting method outperforms two leading algorithms. We then show that counting can be further improved using a novel attention model to weigh the different TCRs. The attention model is based on the projection of TCRs using a Variational AutoEncoder (VAE). Both the counting and attention algorithms predict the host's CMV status and HLA alleles better than current leading algorithms. As an intermediate solution between the complex attention model and the very simple counting model, we propose a new Graph Convolutional Network approach that obtains the accuracy of the attention model and the simplicity of the counting model. The code for the models used in the paper is provided at: https://github.com/louzounlab/CountingIsAlmostAllYouNeed.
Affiliation(s)
- Ofek Akerman
- Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
- Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel
| | - Haim Isakov
- Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
| | - Reut Levi
- Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
| | - Vladimir Psevkin
- Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
| | - Yoram Louzoun
- Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
| |
|
16
|
Ostmeyer J, Cowell L, Christley S. Dynamic kernel matching for non-conforming data: A case study of T cell receptor datasets. PLoS One 2023; 18:e0265313. [PMID: 36881590 PMCID: PMC9990938 DOI: 10.1371/journal.pone.0265313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2021] [Accepted: 03/01/2022] [Indexed: 03/08/2023]
Abstract
Most statistical classifiers are designed to find patterns in data where numbers fit into rows and columns, like in a spreadsheet, but many kinds of data do not conform to this structure. To uncover patterns in non-conforming data, we describe an approach for modifying established statistical classifiers to handle such data, which we call dynamic kernel matching (DKM). As examples of non-conforming data, we consider (i) a dataset of T-cell receptor (TCR) sequences labelled by disease antigen and (ii) a dataset of sequenced TCR repertoires labelled by patient cytomegalovirus (CMV) serostatus, anticipating that both datasets contain signatures for diagnosing disease. We successfully fit statistical classifiers augmented with DKM to both datasets and report the performance on holdout data using standard metrics and metrics allowing for indeterminate diagnoses. Finally, we identify the patterns used by our statistical classifiers to generate predictions and show that these patterns agree with observations from experimental studies.
Affiliation(s)
- Jared Ostmeyer
- Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
| | - Lindsay Cowell
- Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
| | - Scott Christley
- Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
| |
|
17
|
Safra M, Werner L, Peres A, Polak P, Salamon N, Schvimer M, Weiss B, Barshack I, Shouval DS, Yaari G. A somatic hypermutation-based machine learning model stratifies individuals with Crohn's disease and controls. Genome Res 2023; 33:71-79. [PMID: 36526432 PMCID: PMC9977146 DOI: 10.1101/gr.276683.122] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/13/2022] [Accepted: 12/07/2022] [Indexed: 12/23/2022]
Abstract
Crohn's disease (CD) is a chronic relapsing-remitting inflammatory disorder of the gastrointestinal tract that is characterized by altered innate and adaptive immune function. Although massively parallel sequencing studies of the T cell receptor repertoire identified oligoclonal expansion of unique clones, much less is known about the B cell receptor (BCR) repertoire in CD. Here, we present a novel BCR repertoire sequencing data set from ileal biopsies from pediatric patients with CD and controls, and identify CD-specific somatic hypermutation (SHM) patterns, revealed by a machine learning (ML) algorithm trained on BCR repertoire sequences. Moreover, ML classification of a different data set from blood samples of adults with CD versus controls identified that V gene usage, clusters, or mutation frequencies yielded excellent results in classifying the disease (F1 > 90%). In summary, we show that an ML algorithm enables the classification of CD based on unique BCR repertoire features with high accuracy.
Affiliation(s)
- Modi Safra
- The Alexander Kofkin Faculty of Engineering, Bar Ilan University, 5290002, Ramat Gan, Israel
- Bar Ilan Institute of Nanotechnology and Advanced Materials, Bar Ilan University, 5290002, Ramat Gan, Israel
| | - Lael Werner
- Institute of Gastroenterology, Nutrition and Liver Diseases, Schneider Children's Medical Center of Israel, Petah Tikva 4920235, Israel
- Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv 6997801, Israel
| | - Ayelet Peres
- The Alexander Kofkin Faculty of Engineering, Bar Ilan University, 5290002, Ramat Gan, Israel
- Bar Ilan Institute of Nanotechnology and Advanced Materials, Bar Ilan University, 5290002, Ramat Gan, Israel
| | - Pazit Polak
- The Alexander Kofkin Faculty of Engineering, Bar Ilan University, 5290002, Ramat Gan, Israel
- Bar Ilan Institute of Nanotechnology and Advanced Materials, Bar Ilan University, 5290002, Ramat Gan, Israel
| | - Naomi Salamon
- Pediatric Gastroenterology Unit, Edmond and Lily Safra Children's Hospital, Sheba Medical Center, Ramat Gan 5262100, Israel
| | - Michael Schvimer
- Institute of Pathology, Sheba Medical Center, Ramat Gan 5262100, Israel
| | - Batia Weiss
- Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv 6997801, Israel
- Pediatric Gastroenterology Unit, Edmond and Lily Safra Children's Hospital, Sheba Medical Center, Ramat Gan 5262100, Israel
| | - Iris Barshack
- Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv 6997801, Israel
- Institute of Pathology, Sheba Medical Center, Ramat Gan 5262100, Israel
| | - Dror S. Shouval
- Institute of Gastroenterology, Nutrition and Liver Diseases, Schneider Children's Medical Center of Israel, Petah Tikva 4920235, Israel
- Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv 6997801, Israel
| | - Gur Yaari
- The Alexander Kofkin Faculty of Engineering, Bar Ilan University, 5290002, Ramat Gan, Israel
- Bar Ilan Institute of Nanotechnology and Advanced Materials, Bar Ilan University, 5290002, Ramat Gan, Israel
| |
|
18
|
Kanduri C, Scheffer L, Pavlović M, Rand KD, Chernigovskaya M, Pirvandy O, Yaari G, Greiff V, Sandve GK. simAIRR: simulation of adaptive immune repertoires with realistic receptor sequence sharing for benchmarking of immune state prediction methods. Gigascience 2022; 12:giad074. [PMID: 37848619 PMCID: PMC10580376 DOI: 10.1093/gigascience/giad074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Revised: 07/20/2023] [Accepted: 08/29/2023] [Indexed: 10/19/2023]
Abstract
BACKGROUND Machine learning (ML) has gained significant attention for classifying immune states in adaptive immune receptor repertoires (AIRRs) to support the advancement of immunodiagnostics and therapeutics. Simulated data are crucial for the rigorous benchmarking of AIRR-ML methods. Existing approaches to generating synthetic benchmarking datasets result in the generation of naive repertoires missing the key feature of many shared receptor sequences (selected for common antigens) found in antigen-experienced repertoires. RESULTS We demonstrate that a common approach to generating simulated AIRR benchmark datasets can introduce biases, which may be exploited for undesired shortcut learning by certain ML methods. To mitigate undesirable access to true signals in simulated AIRR datasets, we devised a simulation strategy (simAIRR) that constructs antigen-experienced-like repertoires with a realistic overlap of receptor sequences. simAIRR can be used for constructing AIRR-level benchmarks based on a range of assumptions (or experimental data sources) for what constitutes receptor-level immune signals. This includes the possibility of making or not making any prior assumptions regarding the similarity or commonality of immune state-associated sequences that will be used as true signals. We demonstrate the real-world realism of our proposed simulation approach by showing that basic ML strategies perform similarly on simAIRR-generated and real-world experimental AIRR datasets. CONCLUSIONS This study sheds light on the potential shortcut learning opportunities for ML methods that can arise with the state-of-the-art way of simulating AIRR datasets. simAIRR is available as a Python package: https://github.com/KanduriC/simAIRR.
Affiliation(s)
- Chakravarthi Kanduri
- Centre for Bioinformatics, Department of Informatics, University of Oslo, 0373 Oslo, Norway
- UiORealArt Convergence Environment, University of Oslo, 0373 Oslo, Norway
| | - Lonneke Scheffer
- Centre for Bioinformatics, Department of Informatics, University of Oslo, 0373 Oslo, Norway
| | - Milena Pavlović
- Centre for Bioinformatics, Department of Informatics, University of Oslo, 0373 Oslo, Norway
- UiORealArt Convergence Environment, University of Oslo, 0373 Oslo, Norway
| | - Knut Dagestad Rand
- Centre for Bioinformatics, Department of Informatics, University of Oslo, 0373 Oslo, Norway
| | - Maria Chernigovskaya
- Department of Immunology and Oslo University Hospital, University of Oslo, 0373 Oslo, Norway
| | - Oz Pirvandy
- Faculty of Engineering, Bar-Ilan University, 5290002, Israel
| | - Gur Yaari
- Faculty of Engineering, Bar-Ilan University, 5290002, Israel
| | - Victor Greiff
- Department of Immunology and Oslo University Hospital, University of Oslo, 0373 Oslo, Norway
| | - Geir K Sandve
- Centre for Bioinformatics, Department of Informatics, University of Oslo, 0373 Oslo, Norway
- UiORealArt Convergence Environment, University of Oslo, 0373 Oslo, Norway
| |
|
19
|
Multiple instance neural networks based on sparse attention for cancer detection using T-cell receptor sequences. BMC Bioinformatics 2022; 23:469. [PMID: 36348271 PMCID: PMC9644450 DOI: 10.1186/s12859-022-05012-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/18/2022] [Accepted: 10/26/2022] [Indexed: 11/11/2022]
Abstract
Early detection of cancers has been much explored due to its paramount importance in biomedical fields. Among different types of data used to answer this biological question, studies based on T cell receptors (TCRs) have recently come into the spotlight due to the growing appreciation of the role of the host immune system in tumor biology. However, the one-to-many correspondence between a patient and multiple TCR sequences hinders researchers from simply adopting classical statistical/machine learning methods. Recent work has attempted to model this type of data in the context of multiple instance learning (MIL). Despite the novel application of MIL to cancer detection using TCR sequences and the demonstrated adequate performance in several tumor types, there is still room for improvement, especially for certain cancer types. Furthermore, explainable neural network models have not been fully investigated for this application. In this article, we propose multiple instance neural networks based on sparse attention (MINN-SA) to enhance the performance in cancer detection and explainability. The sparse attention structure drops out uninformative instances in each bag, achieving both interpretability and better predictive performance in combination with the skip connection. Our experiments show that MINN-SA yields the highest area under the ROC curve scores on average measured across 10 different types of cancers, compared to existing MIL approaches. Moreover, we observe from the estimated attentions that MINN-SA can identify the TCRs that are specific for tumor antigens in the same T cell repertoire.
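To make the idea of attention that "drops out uninformative instances in each bag" concrete, the following is a minimal sketch of sparse attention pooling over a bag of TCR embeddings. It assumes sparsemax as the sparsifying transform, which is one standard choice; the abstract does not say which mechanism MINN-SA actually uses, and the function names, embeddings, and scores here are hypothetical.

```python
def sparsemax(scores):
    """Project scores onto the probability simplex; low scores get exactly 0.

    Assumption: sparsemax stands in for the paper's sparse attention,
    whose exact form is not given in the abstract.
    """
    z = sorted(scores, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for j, zj in enumerate(z, start=1):
        cumsum += zj
        # The support condition holds for a prefix of the sorted scores;
        # the last j that satisfies it fixes the threshold tau.
        if 1 + j * zj > cumsum:
            tau = (cumsum - 1) / j
    return [max(s - tau, 0.0) for s in scores]


def attention_pool(instances, scores):
    """Weighted sum of instance embeddings using sparse attention weights."""
    weights = sparsemax(scores)
    dim = len(instances[0])
    pooled = [sum(w * x[d] for w, x in zip(weights, instances)) for d in range(dim)]
    return pooled, weights


# Hypothetical bag: three TCR embeddings with unnormalized attention scores.
bag = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
pooled, weights = attention_pool(bag, [0.5, 0.4, -1.0])
print(weights)  # the third instance receives zero weight
print(pooled)
```

Unlike softmax, which assigns every instance a strictly positive weight, a sparse transform of this kind sets the weight of low-scoring instances to exactly zero, so the instances that contribute to the bag-level prediction can be read off directly.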
|
20
|
Garrido-Mesa J, Brown MA. T cell Repertoire Profiling and the Mechanism by which HLA-B27 Causes Ankylosing Spondylitis. Curr Rheumatol Rep 2022; 24:398-410. [PMID: 36197645 PMCID: PMC9666335 DOI: 10.1007/s11926-022-01090-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Accepted: 09/25/2022] [Indexed: 11/25/2022]
Abstract
Purpose of Review Ankylosing spondylitis (AS) is strongly associated with the HLA-B27 gene. The canonical function of HLA-B27 is to present antigenic peptides to CD8 lymphocytes, leading to adaptive immune responses. The ‘arthritogenic peptide’ theory as to the mechanism by which HLA-B27 induces ankylosing spondylitis proposes that HLA-B27 presents peptides derived from exogenous sources such as bacteria to CD8 lymphocytes, which subsequently cross-react with antigens at the site of inflammation of the disease, causing inflammation. This review describes findings of studies in AS involving profiling of T cell expansions and discusses future research opportunities based on these findings. Recent Findings Consistent with this theory, there is an expanding body of data showing that expansion of a restricted pool of CD8 lymphocytes is found in most AS patients yet only in a small proportion of healthy HLA-B27 carriers. Summary These exciting findings strongly support the theory that AS is driven by presentation of antigenic peptides to the adaptive immune system by HLA-B27. They point to new potential approaches to identify the exogenous and endogenous antigens involved and to potential therapies for the disease.
Affiliation(s)
- Jose Garrido-Mesa
- Department of Medical and Molecular Genetics, Faculty of Life Sciences and Medicine, King's College London, London, England
| | - Matthew A Brown
- Department of Medical and Molecular Genetics, Faculty of Life Sciences and Medicine, King's College London, London, England.
- Genomics England, Charterhouse Square, London, EC1M 6BQ, England.
|
21
|
Ji F, Chen L, Chen Z, Luo B, Wang Y, Lan X. TCR repertoire and transcriptional signatures of circulating tumour-associated T cells facilitate effective non-invasive cancer detection. Clin Transl Med 2022; 12:e853. [PMID: 36134717 PMCID: PMC9494610 DOI: 10.1002/ctm2.853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2022] [Revised: 04/11/2022] [Accepted: 04/15/2022] [Indexed: 11/10/2022]
Affiliation(s)
- Fansen Ji
- Tsinghua-Peking Center for Life Sciences, MOE Key Laboratory of Tsinghua University, Beijing, China; School of Medicine, Tsinghua University, Beijing, China
| | - Lin Chen
- School of Medicine, Tsinghua University, Beijing, China; General Surgery Department, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
| | - Zhizhuo Chen
- School of Life Science, Tsinghua University, Beijing, China
| | - Bin Luo
- General Surgery Department, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
| | - Yongwang Wang
- Department of Anesthesiology, Affiliated Hospital of Guilin Medical University, Guilin, China
| | - Xun Lan
- Tsinghua-Peking Center for Life Sciences, MOE Key Laboratory of Tsinghua University, Beijing, China; School of Medicine, Tsinghua University, Beijing, China
|
22
|
Katayama Y, Kobayashi TJ. Comparative Study of Repertoire Classification Methods Reveals Data Efficiency of k-mer Feature Extraction. Front Immunol 2022; 13:797640. [PMID: 35936014 PMCID: PMC9346074 DOI: 10.3389/fimmu.2022.797640] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/19/2021] [Accepted: 06/20/2022] [Indexed: 01/18/2023]
Abstract
The repertoire of T cell receptors encodes various types of immunological information. Machine learning is indispensable for decoding such information from repertoire datasets measured by next-generation sequencing (NGS). In particular, the classification of repertoires is the most basic task, which is relevant for a variety of scientific and clinical problems. Supported by the recent appearance of large datasets, efficient but data-expensive methods have been proposed. However, it is unclear whether they can work efficiently when the available sample size is severely restricted, as in practical situations. In this study, we demonstrate that their performance can be impaired substantially below critical sample sizes. To address this drawback, we propose MotifBoost, which exploits the information of short k-mer motifs of TCRs. MotifBoost can perform the classification as efficiently as a deep learning method on large datasets while providing more stable and reliable results on small datasets. We tested MotifBoost on four small datasets covering various conditions, such as cytomegalovirus (CMV), HIV, α-chain, and β-chain, and it remained consistently stable. We also clarify that the robustness of MotifBoost can be attributed to the efficiency of k-mer motifs as representation features of repertoires. Finally, by comparing the predictions of these methods, we show that the whole sequence identity and sequence motifs encode partially different information and that a combination of such complementary information is necessary for further development of repertoire analysis.
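As an illustration of what "short k-mer motifs" as repertoire features can look like, here is a minimal sketch that turns a list of CDR3 amino acid sequences into a normalized k-mer frequency vector. It only covers the feature-extraction idea; the function name, the choice of k = 3, and the example sequences are assumptions, and the abstract does not describe MotifBoost's actual preprocessing or classifier.

```python
from collections import Counter


def kmer_features(cdr3_sequences, k=3):
    """Return relative frequencies of all overlapping k-mers in a repertoire.

    Hypothetical helper for illustration; k=3 is an assumed default.
    """
    counts = Counter()
    for seq in cdr3_sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


# Hypothetical two-sequence repertoire.
features = kmer_features(["CASSL", "CASSF"], k=3)
for kmer, freq in sorted(features.items()):
    print(kmer, round(freq, 3))
```

Because every repertoire maps to a vector over the same k-mer vocabulary regardless of how many sequences it contains, a representation like this can be passed to a standard classifier even when only a few repertoires are available.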
Affiliation(s)
- Yotaro Katayama
- Graduate School of Engineering, The University of Tokyo, Tokyo, Japan
- *Correspondence: Yotaro Katayama,
|
23
|
Katayama Y, Yokota R, Akiyama T, Kobayashi TJ. Machine Learning Approaches to TCR Repertoire Analysis. Front Immunol 2022; 13:858057. [PMID: 35911778 PMCID: PMC9334875 DOI: 10.3389/fimmu.2022.858057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2022] [Accepted: 06/07/2022] [Indexed: 11/13/2022]
Abstract
Sparked by the development of genome sequencing technology, the quantity and quality of data handled in immunological research have been changing dramatically. Various data and database platforms are now driving the rapid progress of machine learning for immunological data analysis. Of various topics in immunology, T cell receptor repertoire analysis is one of the most important targets of machine learning for assessing the state and abnormalities of immune systems. In this paper, we review recent repertoire analysis methods based on machine learning and deep learning and discuss their prospects.
Affiliation(s)
- Yotaro Katayama
- Graduate School of Engineering, The University of Tokyo, Tokyo, Japan
- *Correspondence: Yotaro Katayama,
| | - Ryo Yokota
- National Research Institute of Police Science, Kashiwa, Chiba, Japan
| | - Taishin Akiyama
- Laboratory for Immune Homeostasis, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Graduate School of Medical Life Science, Yokohama City University, Yokohama, Japan
| | - Tetsuya J. Kobayashi
- Graduate School of Engineering, The University of Tokyo, Tokyo, Japan
- Institute of Industrial Science, The University of Tokyo, Tokyo, Japan
|
24
|
Chen Y, Ye Z, Zhang Y, Xie W, Chen Q, Lan C, Yang X, Zeng H, Zhu Y, Ma C, Tang H, Wang Q, Guan J, Chen S, Li F, Yang W, Yan H, Yu X, Zhang Z. A Deep Learning Model for Accurate Diagnosis of Infection Using Antibody Repertoires. Journal of Immunology (Baltimore, Md. : 1950) 2022; 208:2675-2685. [PMID: 35606050 DOI: 10.4049/jimmunol.2200063] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/20/2022] [Accepted: 04/11/2022] [Indexed: 06/15/2023]
Abstract
The adaptive immune receptor repertoire consists of the entire set of an individual's BCRs and TCRs and is believed to contain a record of prior immune responses and the potential for future immunity. Analyses of TCR repertoires via deep learning (DL) methods have successfully diagnosed cancers and infectious diseases, including coronavirus disease 2019. However, few studies have used DL to analyze BCR repertoires. In this study, we collected IgG H chain Ab repertoires from 276 healthy control subjects and 326 patients with various infections. We then extracted a comprehensive feature set consisting of 10 subsets of repertoire-level features and 160 sequence-level features and tested whether these features can distinguish between infected individuals and healthy control subjects. Finally, we developed an ensemble DL model, namely, DL method for infection diagnosis (https://github.com/chenyuan0510/DeepID), and used this model to differentiate between the infected and healthy individuals. Four subsets of repertoire-level features and four sequence-level features were selected because of their excellent predictive performance. The DL method for infection diagnosis outperformed traditional machine learning methods in distinguishing between healthy and infected samples (area under the curve = 0.9883) and achieved a multiclassification accuracy of 0.9104. We also observed differences between the healthy and infected groups in V gene usage, clonal expansion, the complexity of reads within clones, the physical properties in the α region, and the local flexibility of the CDR3 amino acid sequence. Our results suggest that the Ab repertoire is a promising biomarker for the diagnosis of various infections.
Affiliation(s)
- Yuan Chen
- Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Zhiming Ye
- Guangdong-Hong Kong Joint Laboratory on Immunological and Genetic Kidney Diseases, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Division of Nephrology, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Yanfang Zhang
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
| | - Wenxi Xie
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
| | - Qingyun Chen
- Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Guangdong-Hong Kong Joint Laboratory on Immunological and Genetic Kidney Diseases, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Chunhong Lan
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
| | - Xiujia Yang
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
| | - Huikun Zeng
- Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Yan Zhu
- Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Cuiyu Ma
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
| | - Haipei Tang
- Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Qilong Wang
- Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Junjie Guan
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
| | - Sen Chen
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
| | - Fenxiang Li
- Department of Infectious Disease Control and Prevention, Center for Disease Control and Prevention of Southern Theatre Command, Guangzhou, China
| | - Wei Yang
- Department of Pathology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
| | - Huacheng Yan
- Department of Infectious Disease Control and Prevention, Center for Disease Control and Prevention of Southern Theatre Command, Guangzhou, China
| | - Xueqing Yu
- Guangdong-Hong Kong Joint Laboratory on Immunological and Genetic Kidney Diseases, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Division of Nephrology, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Zhenhai Zhang
- Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- State Key Laboratory of Organ Failure Research, Division of Nephrology, Southern Medical University, Guangzhou, China
- Key Laboratory of Mental Health of the Ministry of Education, Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-Inspired Intelligence, Southern Medical University, Guangzhou, China
|
25
|
Gilboa A, Hope R, Ben Simon S, Polak P, Koren O, Yaari G. Ontogeny of the B Cell Receptor Repertoire and Microbiome in Mice. Journal of Immunology (Baltimore, Md. : 1950) 2022; 208:2713-2725. [PMID: 35623663 DOI: 10.4049/jimmunol.2100955] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2021] [Accepted: 03/30/2022] [Indexed: 06/15/2023]
Abstract
The immune system matures throughout childhood to achieve full functionality in protecting our bodies against threats. The immune system has a strong reciprocal symbiosis with the host bacterial population and the two systems co-develop, shaping each other. Despite their fundamental role in health physiology, the ontogeny of these systems is poorly characterized. In this study, we investigated the development of the BCR repertoire by analyzing high-throughput sequencing of their receptors at several time points in young C57BL/6J mice. In parallel, we explored the development of the gut microbiome. We discovered that the gut IgA repertoires change from birth to adolescence, including an increase in CDR3 lengths and somatic hypermutation levels. This contrasts with the spleen IgM repertoires, which remain stable and distinct from the IgA repertoires in the gut. We also discovered that large clones that germinate in the gut are initially confined to a specific gut compartment, then expand to nearby compartments, and later also expand to the spleen and remain there. Finally, we explored the associations between diversity indices of the B cell repertoires and the microbiome, as well as associations between bacterial and BCR clusters. Our results shed light on the ontogeny of the adaptive immune system and the microbiome, providing a baseline for future research.
Affiliation(s)
- Amit Gilboa
- Bioengineering, Faculty of Engineering, Bar Ilan University, Ramat Gan, Israel
- Bar Ilan Institute of Nanotechnologies and Advanced Materials, Bar Ilan University, Ramat Gan, Israel
| | - Ronen Hope
- Bioengineering, Faculty of Engineering, Bar Ilan University, Ramat Gan, Israel
| | - Shira Ben Simon
- Azrieli Faculty of Medicine, Bar Ilan University, Safed, Israel
| | - Pazit Polak
- Bioengineering, Faculty of Engineering, Bar Ilan University, Ramat Gan, Israel
- Bar Ilan Institute of Nanotechnologies and Advanced Materials, Bar Ilan University, Ramat Gan, Israel
| | - Omry Koren
- Azrieli Faculty of Medicine, Bar Ilan University, Safed, Israel
| | - Gur Yaari
- Bioengineering, Faculty of Engineering, Bar Ilan University, Ramat Gan, Israel
- Bar Ilan Institute of Nanotechnologies and Advanced Materials, Bar Ilan University, Ramat Gan, Israel
|
26
|
Glazer N, Akerman O, Louzoun Y. Naive and memory T cells TCR-HLA-binding prediction. Oxford Open Immunology 2022; 3:iqac001. [PMID: 36846560 PMCID: PMC9914496 DOI: 10.1093/oxfimm/iqac001] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/05/2022] [Revised: 05/01/2022] [Accepted: 05/17/2022] [Indexed: 11/12/2022]
Abstract
T cells recognize antigens through the interaction of their T cell receptor (TCR) with a peptide-major histocompatibility complex (pMHC) molecule. Following thymic positive selection, TCRs in peripheral naive T cells are expected to bind MHC alleles of the host. Peripheral clonal selection is expected to further increase the frequency of antigen-specific TCRs that bind to the host MHC alleles. To check for a systematic preference for MHC-binding T cells in TCR repertoires, we developed Natural Language Processing-based methods to predict TCR-MHC binding independently of the peptide presented for Class I MHC alleles. We trained a classifier on published TCR-pMHC binding pairs and obtained a high area under the curve (AUC) of over 0.90 on the test set. However, when applied to TCR repertoires, the accuracy of the classifier dropped. We thus developed a two-stage prediction model, based on large-scale naive and memory TCR repertoires, denoted TCR HLA-binding predictor (CLAIRE). Since each host carries multiple human leukocyte antigen (HLA) alleles, we first computed whether a TCR on a CD8 T cell binds an MHC from any of the host Class-I HLA alleles. We then performed an iteration, where we predict the binding with the most probable allele from the first round. We show that this classifier is more precise for memory than for naive cells. Moreover, it can be transferred between datasets. Finally, we developed a CD4-CD8 T cell classifier to apply CLAIRE to unsorted bulk sequencing datasets and showed a high AUC of 0.96 and 0.90 on large datasets. CLAIRE is available on GitHub at https://github.com/louzounlab/CLAIRE and as a server at https://claire.math.biu.ac.il/Home.
Affiliation(s)
- Neta Glazer
- Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
| | - Ofek Akerman
- Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
| | - Yoram Louzoun
- Correspondence address. Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel. E-mail:
|
27
|
Kanduri C, Pavlović M, Scheffer L, Motwani K, Chernigovskaya M, Greiff V, Sandve GK. Profiling the baseline performance and limits of machine learning models for adaptive immune receptor repertoire classification. Gigascience 2022; 11:6593147. [PMID: 35639633 PMCID: PMC9154052 DOI: 10.1093/gigascience/giac046] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/17/2021] [Revised: 12/23/2021] [Accepted: 04/08/2022] [Indexed: 12/14/2022]
Abstract
BACKGROUND Machine learning (ML) methodology development for the classification of immune states in adaptive immune receptor repertoires (AIRRs) has seen a recent surge of interest. However, so far, there does not exist a systematic evaluation of scenarios where classical ML methods (such as penalized logistic regression) already perform adequately for AIRR classification. This hinders investigative reorientation to those scenarios where method development of more sophisticated ML approaches may be required. RESULTS To identify those scenarios where a baseline ML method is able to perform well for AIRR classification, we generated a collection of synthetic AIRR benchmark data sets encompassing a wide range of data set architecture-associated and immune state-associated sequence patterns (signal) complexity. We trained ≈1,700 ML models with varying assumptions regarding immune signal on ≈1,000 data sets with a total of ≈250,000 AIRRs containing ≈46 billion TCRβ CDR3 amino acid sequences, thereby surpassing the sample sizes of current state-of-the-art AIRR-ML setups by two orders of magnitude. We found that L1-penalized logistic regression achieved high prediction accuracy even when the immune signal occurs only in 1 out of 50,000 AIR sequences. CONCLUSIONS We provide a reference benchmark to guide new AIRR-ML classification methodology by (i) identifying those scenarios characterized by immune signal and data set complexity, where baseline methods already achieve high prediction accuracy, and (ii) facilitating realistic expectations of the performance of AIRR-ML models given training data set properties and assumptions. Our study serves as a template for defining specialized AIRR benchmark data sets for comprehensive benchmarking of AIRR-ML methods.
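For a sense of scale, the figures quoted in this abstract can be combined in a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that all repertoires have the same size; the benchmark data sets in the study may differ in repertoire size, so the result is an order-of-magnitude guide rather than a property of any specific data set. The function and variable names are hypothetical.

```python
def expected_signal_sequences(total_sequences, total_repertoires, signal_rate):
    """Return (mean repertoire size, expected signal-bearing sequences per repertoire).

    Assumes equal-sized repertoires, which is a simplification.
    """
    mean_size = total_sequences / total_repertoires
    return mean_size, mean_size * signal_rate


# Figures taken from the abstract: ~46 billion sequences, ~250,000 AIRRs,
# and a signal present in 1 out of 50,000 sequences.
mean_size, expected = expected_signal_sequences(46e9, 250_000, 1 / 50_000)
print(f"mean repertoire size: {mean_size:,.0f} sequences")
print(f"expected signal-bearing sequences per repertoire: {expected:.2f}")
```

Under that equal-size assumption, an average repertoire of roughly 184,000 sequences would contain fewer than four signal-bearing sequences at the lowest signal frequency, which illustrates how sparse the signal is in the setting where the baseline method is reported to still achieve high accuracy.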
Affiliation(s)
- Chakravarthi Kanduri
- Centre for Bioinformatics, Department of Informatics, University of Oslo, Oslo 0373, Norway
| | - Milena Pavlović
- Centre for Bioinformatics, Department of Informatics, University of Oslo, Oslo 0373, Norway
| | - Lonneke Scheffer
- Centre for Bioinformatics, Department of Informatics, University of Oslo, Oslo 0373, Norway
| | - Keshav Motwani
- Department of Pathology, Immunology and Laboratory Medicine, University of Florida, FL 32610, USA
| | - Maria Chernigovskaya
- Department of Immunology and Oslo University Hospital, University of Oslo, Oslo, 0372, Norway
| | - Victor Greiff
- Department of Immunology and Oslo University Hospital, University of Oslo, Oslo, 0372, Norway
| | - Geir K Sandve
- Centre for Bioinformatics, Department of Informatics, University of Oslo, Oslo 0373, Norway
|
28
|
Xu Y, Qian X, Zhang X, Lai X, Liu Y, Wang J. DeepLION: Deep Multi-Instance Learning Improves the Prediction of Cancer-Associated T Cell Receptors for Accurate Cancer Detection. Front Genet 2022; 13:860510. [PMID: 35601486 PMCID: PMC9121378 DOI: 10.3389/fgene.2022.860510] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/23/2022] [Accepted: 02/23/2022] [Indexed: 01/21/2023]
Abstract
Recent studies highlight the potential of T cell receptor (TCR) repertoires in accurately detecting cancers via noninvasive sampling. Unfortunately, due to the complicated associations among cancer antigens and the possible induced T cell responses, the practical strategy for identifying cancer-associated TCRs is currently computational prediction based on TCR repertoire data. Several state-of-the-art methods were proposed in the past year or two; however, the prediction algorithms were still weakened by two major issues. First, to facilitate the computational processes, the algorithms prefer to decompose the original TCR sequences into fixed-length amino acid fragments, whereas the lengths of cancer-associated motifs are suggested to vary. Second, the correlations among TCRs in the same repertoire should be further considered, but are often ignored by the existing methods. We here developed a deep multi-instance learning method, named DeepLION, to improve the prediction of cancer-associated TCRs by considering these issues. First, DeepLION introduced a deep learning framework with alternative convolution filters and 1-max pooling operations to handle the amino acid fragments with different lengths. Then, the multi-instance learning framework modeled the TCR correlations and assigned adjusted weights for each TCR sequence during the predicting process. To validate the performance of DeepLION, we conducted a series of experiments on several cohorts of patients from nine cancer types. Compared to the existing methods, DeepLION achieved, on most of the cohorts, higher prediction accuracies, sensitivities, specificities, and areas under the curve (AUCs), with AUCs reaching 0.97 and 0.90 for the thyroid and lung cancer cohorts, respectively. Thus, DeepLION may further support the detection of cancers from TCR repertoire data.
DeepLION is publicly available on GitHub, at https://github.com/Bioinformatics7181/DeepLION, for academic usage only.
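The combination of convolution filters of several widths with 1-max pooling can be sketched in a few lines. The example below is an assumption-laden illustration rather than DeepLION's implementation: each kernel is written as a list of per-position amino acid weights (equivalent to a filter over a one-hot encoding), the weights are hand-picked instead of learned, and the function names and the example sequence are hypothetical.

```python
def conv_1max(seq, kernel):
    """Slide one kernel along the sequence and keep the largest response."""
    width = len(kernel)
    responses = [
        sum(kernel[j].get(seq[i + j], 0.0) for j in range(width))
        for i in range(len(seq) - width + 1)
    ]
    # Sequences shorter than the kernel produce no window; return 0.0 then.
    return max(responses) if responses else 0.0


def encode_tcr(seq, kernels):
    """One pooled value per kernel, so the output length is independent of len(seq)."""
    return [conv_1max(seq, kernel) for kernel in kernels]


# Hypothetical kernels of width 2 and 3 (weights chosen by hand for illustration).
kernels = [
    [{"S": 1.0}, {"S": 1.0}],
    [{"G": 1.0}, {"T": 0.5}, {"E": 1.0}],
]
print(encode_tcr("CASSLGTEAFF", kernels))
print(encode_tcr("CA", kernels))
```

Because 1-max pooling keeps only the strongest match per kernel, motifs of different lengths are detected by kernels of matching width, and every TCR is mapped to a fixed-size vector without cutting it into fixed-length fragments first.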
Affiliation(s)
- Ying Xu
- Department of Computer Science and Technology, School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, China
- Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
| | - Xinyang Qian
- Department of Computer Science and Technology, School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, China
- Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
| | - Xuanping Zhang
- Department of Computer Science and Technology, School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, China
- Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
| | - Xin Lai
- Department of Computer Science and Technology, School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, China
- Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
| | - Yuqian Liu
- Department of Computer Science and Technology, School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, China
- Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
| | - Jiayin Wang
- Department of Computer Science and Technology, School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, China
- Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
- *Correspondence: Jiayin Wang,
|
29
|
Pauken KE, Lagattuta KA, Lu BY, Lucca LE, Daud AI, Hafler DA, Kluger HM, Raychaudhuri S, Sharpe AH. TCR-sequencing in cancer and autoimmunity: barcodes and beyond. Trends Immunol 2022; 43:180-194. [PMID: 35090787 PMCID: PMC8882139 DOI: 10.1016/j.it.2022.01.002] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 11/29/2021] [Revised: 01/04/2022] [Accepted: 01/04/2022] [Indexed: 01/21/2023]
Abstract
The T cell receptor (TCR) endows T cells with antigen specificity and is central to nearly all aspects of T cell function. Each naïve T cell has a unique TCR sequence that is stably maintained during cell division. In this way, the TCR serves as a molecular barcode that tracks processes such as migration, differentiation, and proliferation of T cells. Recent technological advances have enabled sequencing of the TCR from single cells alongside deep molecular phenotypes on an unprecedented scale. In this review, we discuss strengths and limitations of TCR sequences as molecular barcodes and their application to study immune responses following Programmed Death-1 (PD-1) blockade in cancer. Additionally, we consider applications of TCR data beyond use as a barcode.
Affiliation(s)
- Kristen E Pauken
- Department of Immunology, Blavatnik Institute, Harvard Medical School, Boston, MA, USA; Evergrande Center for Immunological Diseases, Harvard Medical School and Brigham and Women's Hospital, Boston, MA, USA.
| | - Kaitlyn A Lagattuta
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA; Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Benjamin Y Lu
- Department of Neurology and Department of Immunobiology, Yale School of Medicine, New Haven, CT, USA; Department of Medicine, Yale School of Medicine, New Haven, CT, USA
| | - Liliana E Lucca
- Department of Neurology and Department of Immunobiology, Yale School of Medicine, New Haven, CT, USA
| | - Adil I Daud
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - David A Hafler
- Department of Neurology and Department of Immunobiology, Yale School of Medicine, New Haven, CT, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Harriet M Kluger
- Department of Medicine, Yale School of Medicine, New Haven, CT, USA
| | - Soumya Raychaudhuri
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA; Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Centre for Genetics and Genomics Versus Arthritis, Manchester Academic Health Science Centre, University of Manchester, Manchester M13 9PL, UK
| | - Arlene H Sharpe
- Department of Immunology, Blavatnik Institute, Harvard Medical School, Boston, MA, USA; Evergrande Center for Immunological Diseases, Harvard Medical School and Brigham and Women's Hospital, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA.
|
30
|
Joshi K, Milighetti M, Chain BM. Application of T cell receptor (TCR) repertoire analysis for the advancement of cancer immunotherapy. Curr Opin Immunol 2022; 74:1-8. [PMID: 34454284 DOI: 10.1016/j.coi.2021.07.006] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/18/2021] [Revised: 07/13/2021] [Accepted: 07/13/2021] [Indexed: 12/14/2022]
Abstract
T cell receptor (TCR) sequencing has emerged as a powerful new technology in analysis of the host-tumour interaction. The advances in NextGen sequencing technologies, coupled with powerful novel bioinformatic tools, allow quantitative and reproducible characterisation of repertoires from tumour and blood samples from an increasing number of patients with a variety of solid cancers. In this review, we consider how global metrics such as T cell clonality and diversity can be extracted from these repertoires and used to give insight into the mechanism of action of immune checkpoint blockade. Furthermore, we explore how the analysis of TCR overlap between repertoires can help define spatial and temporal heterogeneity of the anti-tumoural immune response. Finally, we review how analysis of TCR sequence and structure, either of individual TCRs or from sets of related TCRs, can be used to annotate the antigenic specificity, with important implications for the development of personalised adoptive cellular immunotherapies.
Affiliation(s)
- Kroopa Joshi
- Department of Medical Oncology, The Royal Marsden NHS Foundation Trust, London, United Kingdom
- Martina Milighetti
- Division of Infection and Immunity, University College London, London, United Kingdom
- Benjamin M Chain
- Division of Infection and Immunity, University College London, London, United Kingdom; Department of Computer Science, University College London, London, United Kingdom

31
The immuneML ecosystem for machine learning analysis of adaptive immune receptor repertoires. Nat Mach Intell 2021. [DOI: 10.1038/s42256-021-00413-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/11/2022]

32
Milighetti M, Shawe-Taylor J, Chain B. Predicting T Cell Receptor Antigen Specificity From Structural Features Derived From Homology Models of Receptor-Peptide-Major Histocompatibility Complexes. Front Physiol 2021; 12:730908. [PMID: 34566692 PMCID: PMC8456106 DOI: 10.3389/fphys.2021.730908] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/25/2021] [Accepted: 08/02/2021] [Indexed: 11/13/2022] [Open Access]
Abstract
The physical interaction between the T cell receptor (TCR) and its cognate antigen causes T cells to activate and participate in the immune response. Understanding this physical interaction is important in predicting TCR binding to a target epitope, as well as potential cross-reactivity. Here, we propose a way of collecting informative features of the binding interface from homology models of T cell receptor-peptide-major histocompatibility complex (TCR-pMHC) complexes. The information collected from these structures is sufficient to discriminate binding from non-binding TCR-pMHC pairs in multiple independent datasets. The classifier is limited by the number of crystal structures available for the homology modelling and by the size of the training set. However, the classifier shows comparable performance to sequence-based classifiers requiring much larger training sets.
Affiliation(s)
- Martina Milighetti
- Division of Infection and Immunity, University College London, London, United Kingdom
- Cancer Institute, University College London, London, United Kingdom
- John Shawe-Taylor
- Department of Computer Science, University College London, London, United Kingdom
- Benny Chain
- Division of Infection and Immunity, University College London, London, United Kingdom
- Department of Computer Science, University College London, London, United Kingdom

33
GIANA allows computationally-efficient TCR clustering and multi-disease repertoire classification by isometric transformation. Nat Commun 2021; 12:4699. [PMID: 34349111 PMCID: PMC8339063 DOI: 10.1038/s41467-021-25006-7] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 02/26/2021] [Accepted: 07/19/2021] [Indexed: 01/18/2023] [Open Access]
Abstract
Similarity in T-cell receptor (TCR) sequences implies shared antigen specificity between receptors, and could be used to discover novel therapeutic targets. However, existing methods that cluster T-cell receptor sequences by similarity are computationally inefficient, making them impractical to use on the ever-expanding datasets of the immune repertoire. Here, we developed GIANA (Geometric Isometry-based TCR AligNment Algorithm), a computationally efficient tool for this task that provides the same level of clustering specificity as TCRdist at 600 times its speed, and without sacrificing accuracy. GIANA also allows the rapid query of large reference cohorts within minutes. Using GIANA to cluster large-scale TCR datasets provides candidate disease-specific receptors, and provides a new solution to repertoire classification. Querying unseen TCR-seq samples against an existing reference differentiates samples from patients across various cohorts associated with cancer, infectious and autoimmune disease. Our results demonstrate how GIANA could be used as the basis for a TCR-based non-invasive multi-disease diagnostic platform.
Grouping T-cell receptors (TCRs) by sequence similarity could lead to new immunological insights. Here, the authors propose a tool that allows the rapid clustering of millions of TCR sequences, identifying TCRs potentially associated with the response to cancer, infectious and autoimmune diseases.

34
Patel DN, Yeagley M, Arturo JF, Falasiri S, Chobrutskiy BI, Gozlan EC, Blanck G. A comparison of immune receptor recombination databases sourced from tumour exome or RNAseq files: Verifications of immunological distinctions between primary and metastatic melanoma. Int J Immunogenet 2021; 48:409-418. [PMID: 34298587 DOI: 10.1111/iji.12550] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 06/23/2021] [Accepted: 07/11/2021] [Indexed: 02/07/2023]
Abstract
It became apparent several years ago that RNAseq and exome files prepared from tissue could be mined for adaptive immune receptor (IR) recombinations, which has given extra value to datasets originally intended for gene expression or mutation studies. For example, recovery of IR recombination reads from tumour specimen genomics files can correlate with survival rates. In particular, many benchmarking processes have been applied to the two sets of the IR recombination reads obtained from The Cancer Genome Atlas files, but these two sets have never been directly compared. Here we show that both sets largely agree regarding several parameters. For example, recovery of TRB recombination reads from both WXS and RNAseq files representing metastatic melanoma was associated with a better outcome (p < .0004 in both cases); and T-cell receptor recombination read recovery, for both genomics file types, associated very strongly with T-cell gene expression markers. However, the use of CDR3 chemical features for survival distinctions was not consistent. This topic, and the surprising result that both datasets indicated that primary melanoma with recovery of IR recombination reads, in stark contrast to metastatic melanoma, represents a worse outcome, are discussed.
Affiliation(s)
- Dhruv N Patel
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Michelle Yeagley
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Juan F Arturo
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Shayan Falasiri
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Boris I Chobrutskiy
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Etienne C Gozlan
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- George Blanck
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA; Department of Immunology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA

35
de Sousa E, Lérias JR, Beltran A, Paraschoudi G, Condeço C, Kamiki J, António PA, Figueiredo N, Carvalho C, Castillo-Martin M, Wang Z, Ligeiro D, Rao M, Maeurer M. Targeting Neoepitopes to Treat Solid Malignancies: Immunosurgery. Front Immunol 2021; 12:592031. [PMID: 34335558 PMCID: PMC8320363 DOI: 10.3389/fimmu.2021.592031] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/06/2020] [Accepted: 05/07/2021] [Indexed: 12/26/2022] [Open Access]
Abstract
Successful outcome of immune checkpoint blockade in patients with solid cancers is in part associated with a high tumor mutational burden (TMB) and the recognition of private neoantigens by T-cells. The quality and quantity of target recognition is determined by the repertoire of ‘neoepitope’-specific T-cell receptors (TCRs) in tumor-infiltrating lymphocytes (TIL), or peripheral T-cells. Interferon gamma (IFN-γ), produced by T-cells and other immune cells, is essential for controlling proliferation of transformed cells, induction of apoptosis and enhancing human leukocyte antigen (HLA) expression, thereby increasing immunogenicity of cancer cells. TCR αβ-dependent therapies should account for tumor heterogeneity and availability of the TCR repertoire capable of reacting to neoepitopes and functional HLA pathways. Immunogenic epitopes in the tumor-stroma may also be targeted to achieve tumor-containment by changing the immune-contexture in the tumor microenvironment (TME). Non protein-coding regions of the tumor-cell genome may also contain many aberrantly expressed, non-mutated tumor-associated antigens (TAAs) capable of eliciting productive anti-tumor immune responses. Whole-exome sequencing (WES) and/or RNA sequencing (RNA-Seq) of cancer tissue, combined with several layers of bioinformatic analysis is commonly used to predict possible neoepitopes present in clinical samples. At the ImmunoSurgery Unit of the Champalimaud Centre for the Unknown (CCU), a pipeline combining several tools is used for predicting private mutations from WES and RNA-Seq data followed by the construction of synthetic peptides tailored for immunological response assessment reflecting the patient’s tumor mutations, guided by MHC typing. Subsequent immunoassays allow the detection of differential IFN-γ production patterns associated with (intra-tumoral) spatiotemporal differences in TIL or peripheral T-cells versus TIL. 
These bioinformatics tools, in addition to histopathological assessment, immunological readouts from functional bioassays and deep T-cell ‘adaptome’ analyses, are expected to advance discovery and development of next-generation personalized precision medicine strategies to improve clinical outcomes in cancer in the context of i) anti-tumor vaccination strategies, ii) gauging mutation-reactive T-cell responses in biological therapies and iii) expansion of tumor-reactive T-cells for the cellular treatment of patients with cancer.
Affiliation(s)
- Eric de Sousa
- ImmunoSurgery Unit, Champalimaud Centre for the Unknown, Lisbon, Portugal
- Joana R Lérias
- ImmunoSurgery Unit, Champalimaud Centre for the Unknown, Lisbon, Portugal
- Antonio Beltran
- Department of Pathology, Champalimaud Clinical Centre, Lisbon, Portugal
- Carolina Condeço
- ImmunoSurgery Unit, Champalimaud Centre for the Unknown, Lisbon, Portugal
- Jéssica Kamiki
- ImmunoSurgery Unit, Champalimaud Centre for the Unknown, Lisbon, Portugal
- Nuno Figueiredo
- Digestive Unit, Champalimaud Clinical Centre, Lisbon, Portugal
- Carlos Carvalho
- Digestive Unit, Champalimaud Clinical Centre, Lisbon, Portugal
- Zhe Wang
- Jiangsu Industrial Technology Research Institute (JITRI), Applied Adaptome Immunology Institute, Nanjing, China
- Dário Ligeiro
- Lisbon Centre for Blood and Transplantation, Instituto Português do Sangue e Transplantação (IPST), Lisbon, Portugal
- Martin Rao
- ImmunoSurgery Unit, Champalimaud Centre for the Unknown, Lisbon, Portugal
- Markus Maeurer
- ImmunoSurgery Unit, Champalimaud Centre for the Unknown, Lisbon, Portugal; I Medical Clinic, Johannes Gutenberg University of Mainz, Mainz, Germany

36
Ostmeyer J, Cowell L, Greenberg B, Christley S. Reconstituting T cell receptor selection in-silico. Genes Immun 2021; 22:187-193. [PMID: 34127826 DOI: 10.1038/s41435-021-00141-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2021] [Revised: 05/13/2021] [Accepted: 05/26/2021] [Indexed: 11/09/2022]
Abstract
Each T cell receptor (TCR) gene is created without regard for which substances (antigens) the receptor can recognize. T cell selection culls developing T cells when their TCRs (i) fail to recognize major histocompatibility complexes (MHCs) that act as antigen presenting platforms or (ii) recognize with high affinity self-antigens derived from healthy cells and tissue. While T cell selection has been thoroughly studied, little is known about which TCRs are retained or removed by this process. Therefore, we develop an approach using TCR gene sequencing and machine learning to identify patterns in TCR protein sequences influencing the outcome of T cell receptor selection. We verify the trained models classify TCRs from developing T cells as being before selection and TCRs from mature T cells as being after selection. Our approach may provide future avenues for studying the relationship between T cell selection and conditions like autoimmune diseases.
Affiliation(s)
- Jared Ostmeyer
- Department of Population and Data Sciences, UT Southwestern Medical Center, Dallas, TX, USA
- Lindsay Cowell
- Department of Population and Data Sciences, UT Southwestern Medical Center, Dallas, TX, USA
- Benjamin Greenberg
- Department of Neurology, UT Southwestern Medical Center, Dallas, TX, USA
- Scott Christley
- Department of Population and Data Sciences, UT Southwestern Medical Center, Dallas, TX, USA

37
Xiong D, Zhang Z, Wang T, Wang X. A comparative study of multiple instance learning methods for cancer detection using T-cell receptor sequences. Comput Struct Biotechnol J 2021; 19:3255-3268. [PMID: 34141144 PMCID: PMC8192570 DOI: 10.1016/j.csbj.2021.05.038] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 02/28/2021] [Revised: 05/12/2021] [Accepted: 05/20/2021] [Indexed: 11/02/2022] [Open Access]
Abstract
As a branch of machine learning, multiple instance learning (MIL) learns from a collection of labeled bags, each containing a set of instances. The learning process is weakly supervised due to ambiguous instance labels. Since its emergence, MIL has been applied to solve various problems including content-based image retrieval, object tracking/detection, and computer-aided diagnosis. In biomedical research, the use of MIL has been focused on medical image analysis and molecule activity prediction. We review and apply 16 methods to investigate the applicability of MIL to a novel biomedical application, cancer detection using T-cell receptor (TCR) sequences. This important application can be a viable approach for large-scale cancer screening, as TCRs can be easily profiled from a subject's peripheral blood. We consider two feasible data-generating mechanisms, and for the purpose of performance evaluation, we simulate data under each mechanism, where we vary potentially important factors to mimic realistic situations. We also apply the methods to sequencing data of ten cancer types from The Cancer Genome Atlas, as an early proof of concept for distinguishing tumor patients from healthy individuals via TCR sequencing of peripheral blood. We find that, provided an appropriate MIL method is used, satisfactory performance with Area Under the Receiver Operating Characteristic Curve above 80% can be achieved for five of the ten cancers. Based on our numerical results, we make suggestions about selection of a proper method and avoidance of any method with poor performance. We further point out directions of future research and identify a pressing need for new MIL methodologies for improved performance (for some cancer types) and more explainable outcomes.
Affiliation(s)
- Danyi Xiong
- Department of Statistical Science, Southern Methodist University, 3225 Daniel Avenue, Dallas 75275, TX, USA
- Department of Population and Data Sciences, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas 75390, TX, USA
- Ze Zhang
- Department of Population and Data Sciences, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas 75390, TX, USA
- Tao Wang
- Department of Population and Data Sciences, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas 75390, TX, USA
- Xinlei Wang
- Department of Statistical Science, Southern Methodist University, 3225 Daniel Avenue, Dallas 75275, TX, USA

38
Pertseva M, Gao B, Neumeier D, Yermanos A, Reddy ST. Applications of Machine and Deep Learning in Adaptive Immunity. Annu Rev Chem Biomol Eng 2021; 12:39-62. [PMID: 33852352 DOI: 10.1146/annurev-chembioeng-101420-125021] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 11/09/2022]
Abstract
Adaptive immunity is mediated by lymphocyte B and T cells, which respectively express a vast and diverse repertoire of B cell and T cell receptors and, in conjunction with peptide antigen presentation through major histocompatibility complexes (MHCs), can recognize and respond to pathogens and diseased cells. In recent years, advances in deep sequencing have led to a massive increase in the amount of adaptive immune receptor repertoire data; additionally, proteomics techniques have led to a wealth of data on peptide-MHC presentation. These large-scale data sets are now making it possible to train machine and deep learning models, which can be used to identify complex and high-dimensional patterns in immune repertoires. This article introduces adaptive immune repertoires and machine and deep learning related to biological sequence data and then summarizes the many applications in this field, which span from predicting the immunological status of a host to the antigen specificity of individual receptors and the engineering of immunotherapeutics.
Affiliation(s)
- Margarita Pertseva
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland; Life Science Zurich Graduate School, ETH Zurich and University of Zurich, 8006 Zurich, Switzerland
- Beichen Gao
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
- Daniel Neumeier
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
- Alexander Yermanos
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland; Department of Pathology and Immunology, University of Geneva, 1205 Geneva, Switzerland; Department of Biology, Institute of Microbiology and Immunology, ETH Zurich, 8093 Zurich, Switzerland
- Sai T Reddy
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland

39
Christley S, Ostmeyer J, Quirk L, Zhang W, Sirak B, Giuliano AR, Zhang S, Monson N, Tiro J, Lucas E, Cowell LG. T Cell Receptor Repertoires Acquired via Routine Pap Testing May Help Refine Cervical Cancer and Precancer Risk Estimates. Front Immunol 2021; 12:624230. [PMID: 33868241 PMCID: PMC8050337 DOI: 10.3389/fimmu.2021.624230] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/30/2020] [Accepted: 03/09/2021] [Indexed: 12/24/2022] [Open Access]
Abstract
Cervical cancer is the fourth most common cancer and fourth leading cause of cancer death among women worldwide. In low Human Development Index settings, it ranks second. Screening and surveillance involve the cytology-based Papanicolaou (Pap) test and testing for high-risk human papillomavirus (hrHPV). The Pap test has low sensitivity to detect precursor lesions, while a single hrHPV test cannot distinguish a persistent infection from one that the immune system will naturally clear. Furthermore, among women who are hrHPV-positive and progress to high-grade cervical lesions, testing cannot identify the ~20% who would progress to cancer if not treated. Thus, reliable detection and treatment of cancers and precancers requires routine screening followed by frequent surveillance among those with past abnormal or positive results. The consequence is overtreatment, with its associated risks and complications, in screened populations and an increased risk of cancer in under-screened populations. Methods to improve cervical cancer risk assessment, particularly assays to predict regression of precursor lesions or clearance of hrHPV infection, would benefit both populations. Here we show that women who have lower risk results on follow-up testing relative to index testing have evidence of enhanced T cell clonal expansion in the index cervical cytology sample compared to women who persist with higher risk results from index to follow-up. We further show that a machine learning classifier based on the index sample T cells predicts this transition to lower risk with 95% accuracy (19/20) by leave-one-out cross-validation. Using T cell receptor deep sequencing and machine learning, we identified a biophysicochemical motif in the complementarity-determining region 3 of T cell receptor β chains whose presence predicts this transition. 
While these results must still be tested on an independent cohort in a prospective study, they suggest that this approach could improve cervical cancer screening by helping distinguish women likely to spontaneously regress from those at elevated risk of progression to cancer. The advancement of such a strategy could reduce surveillance frequency and overtreatment in screened populations and improve the delivery of screening to under-screened populations.
Affiliation(s)
- Scott Christley
- Department of Population and Data Sciences, UT Southwestern Medical Center, Dallas, TX, United States
- Jared Ostmeyer
- Department of Population and Data Sciences, UT Southwestern Medical Center, Dallas, TX, United States
- Lisa Quirk
- Department of Population and Data Sciences, UT Southwestern Medical Center, Dallas, TX, United States
- Wei Zhang
- Department of Neurology and Neurotherapeutics, Department of Immunology, UT Southwestern Medical Center, Dallas, TX, United States
- Bradley Sirak
- Center for Immunization and Infection Research, Moffitt Cancer Center, Tampa, FL, United States
- Anna R Giuliano
- Center for Immunization and Infection Research, Moffitt Cancer Center, Tampa, FL, United States
- Song Zhang
- Department of Population and Data Sciences, UT Southwestern Medical Center, Dallas, TX, United States
- Nancy Monson
- Department of Neurology and Neurotherapeutics, Department of Immunology, UT Southwestern Medical Center, Dallas, TX, United States
- Jasmin Tiro
- Department of Population and Data Sciences, UT Southwestern Medical Center, Dallas, TX, United States
- Elena Lucas
- Department of Pathology, UT Southwestern Medical Center, Dallas, TX, United States; Department of Pathology, Parkland Health and Hospital System, Dallas, TX, United States
- Lindsay G Cowell
- Department of Population and Data Sciences, UT Southwestern Medical Center, Dallas, TX, United States; Department of Neurology and Neurotherapeutics, Department of Immunology, UT Southwestern Medical Center, Dallas, TX, United States

40
Yeagley M, Chobrutskiy BI, Gozlan EC, Medikonda N, Patel DN, Falasiri S, Callahan BM, Huda T, Blanck G. Electrostatic Complementarity of T-Cell Receptor-Alpha CDR3 Domains and Mutant Amino Acids Is Associated with Better Survival Rates for Sarcomas. Pediatr Hematol Oncol 2021; 38:251-264. [PMID: 33616477 DOI: 10.1080/08880018.2020.1843576] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 01/21/2023]
Abstract
While sarcoma immunology has advanced with regard to basic, and even some applied topics, this disease has not been subject to more recent immunogenomics approaches. Thus, we assessed the immune receptor recombinations available from The Cancer Genome Atlas (TCGA) sarcoma database via tumor sample exome and RNASeq files. Results indicated that recovery of T-cell receptor-alpha recombination reads (TRA) correlated with a better survival rate, with the expression of T-cell biomarkers, and with tumor sample apoptosis signatures consistent with the longer patient survival times. Furthermore, samples representing TRA complementarity determining region-3 (CDR3) net charge per residue (NCPR) based complementarity with the corresponding sarcoma mutanome had a better survival rate, and more granzyme expression, than samples lacking such complementarity. By specifically using RNASeq-recovered TRA CDR3s and related NCPR assessments, three genes, TP53, ATRX, and RB1, were identified as being key components of the mutanome-based complementarity. Thus, these genes may represent key immune system targets for soft tissue sarcomas. Also, several key results from above were reproduced with a pediatric osteosarcoma dataset, work that led to identification of MUC6 mutations as potentially linked to a strong immune response. In sum, TRA CDR3s are likely to be important prognostic indicators, and possibly a beginning tool for immunotherapy development strategies, for adult and pediatric sarcomas.
Affiliation(s)
- Michelle Yeagley
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, Florida, USA
- Boris I Chobrutskiy
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, Florida, USA
- Etienne C Gozlan
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, Florida, USA
- Nikhila Medikonda
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, Florida, USA
- Dhruv N Patel
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, Florida, USA
- Shayan Falasiri
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, Florida, USA
- Blake M Callahan
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, Florida, USA
- Taha Huda
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, Florida, USA
- George Blanck
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, Florida, USA; Immunology Program, H. Lee Moffitt Cancer Center and Research Institute, Tampa, Florida, USA

41
Gozlan EC, Chobrutskiy BI, Zaman S, Yeagley M, Blanck G. Systemic Adaptive Immune Parameters Associated with Neuroblastoma Outcomes: the Significance of Gamma-Delta T Cells. J Mol Neurosci 2021; 71:2393-2404. [DOI: 10.1007/s12031-021-01813-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/16/2020] [Accepted: 02/08/2021] [Indexed: 12/17/2022]

42
Akbar R, Robert PA, Pavlović M, Jeliazkov JR, Snapkov I, Slabodkin A, Weber CR, Scheffer L, Miho E, Haff IH, Haug DTT, Lund-Johansen F, Safonova Y, Sandve GK, Greiff V. A compact vocabulary of paratope-epitope interactions enables predictability of antibody-antigen binding. Cell Rep 2021; 34:108856. [PMID: 33730590 DOI: 10.1016/j.celrep.2021.108856] [Citation(s) in RCA: 75] [Impact Index Per Article: 25.0] [Received: 03/12/2020] [Revised: 11/29/2020] [Accepted: 02/22/2021] [Indexed: 12/16/2022] [Open Access]
Abstract
Antibody-antigen binding relies on the specific interaction of amino acids at the paratope-epitope interface. The predictability of antibody-antigen binding is a prerequisite for de novo antibody and (neo-)epitope design. A fundamental premise for the predictability of antibody-antigen binding is the existence of paratope-epitope interaction motifs that are universally shared among antibody-antigen structures. In a dataset of non-redundant antibody-antigen structures, we identify structural interaction motifs, which together compose a commonly shared structure-based vocabulary of paratope-epitope interactions. We show that this vocabulary enables the machine learnability of antibody-antigen binding on the paratope-epitope level using generative machine learning. The vocabulary (1) is compact, less than 10^4 motifs; (2) is distinct from non-immune protein-protein interactions; and (3) mediates specific oligo- and polyreactive interactions between paratope-epitope pairs. Our work leverages combined structure- and sequence-based learning to demonstrate that machine-learning-driven predictive paratope and epitope engineering is feasible.
Collapse
Affiliation(s)
- Rahmad Akbar
- Department of Immunology, University of Oslo, Oslo, Norway.
| | | | - Milena Pavlović
- Department of Informatics, University of Oslo, Oslo, Norway; Centre for Bioinformatics, University of Oslo, Norway; K.G. Jebsen Centre for Coeliac Disease Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
| | | | - Igor Snapkov
- Department of Immunology, University of Oslo, Oslo, Norway
| | | | - Cédric R Weber
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
| | - Lonneke Scheffer
- Department of Informatics, University of Oslo, Oslo, Norway; Centre for Bioinformatics, University of Oslo, Norway
| | - Enkelejda Miho
- Institute of Medical Engineering and Medical Informatics, School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Muttenz, Switzerland
| | | | | | | | - Yana Safonova
- Computer Science and Engineering Department, University of California, San Diego, La Jolla, CA, USA
| | - Geir K Sandve
- Department of Informatics, University of Oslo, Oslo, Norway; Centre for Bioinformatics, University of Oslo, Norway; K.G. Jebsen Centre for Coeliac Disease Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
| | - Victor Greiff
- Department of Immunology, University of Oslo, Oslo, Norway.
|
43
|
Zhang Z, Xiong D, Wang X, Liu H, Wang T. Mapping the functional landscape of T cell receptor repertoires by single-T cell transcriptomics. Nat Methods 2021; 18:92-99. [PMID: 33408405 PMCID: PMC7799492 DOI: 10.1038/s41592-020-01020-3] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 04/08/2020] [Accepted: 11/12/2020] [Indexed: 11/08/2022]
Abstract
Many experimental and bioinformatics approaches have been developed to characterize the human T cell receptor (TCR) repertoire. However, the unknown functional relevance of TCR profiling hinders unbiased interpretation of the biology of T cells. To address this inadequacy, we developed tessa, a tool to integrate TCRs with gene expression of T cells to estimate the effect that TCRs confer on the phenotypes of T cells. Tessa leveraged techniques combining single-cell RNA-sequencing with TCR sequencing. We validated tessa and showed its superiority over existing approaches that investigate only the TCR sequences. With tessa, we demonstrated that TCR similarity constrains the phenotypes of T cells to be similar and dictates a gradient in antigen targeting efficiency of T cell clonotypes with convergent TCRs. We showed this constraint could predict a functional dichotomization of T cells postimmunotherapy treatment and is weakened in tumor contexts.
Affiliation(s)
- Ze Zhang
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Danyi Xiong
- Department of Statistical Science, Southern Methodist University, Dallas, TX, USA
- Xinlei Wang
- Department of Statistical Science, Southern Methodist University, Dallas, TX, USA
- Hongyu Liu
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Tao Wang
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA.
- Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX, USA.
|
44
|
Greiff V, Yaari G, Cowell LG. Mining adaptive immune receptor repertoires for biological and clinical information using machine learning. Curr Opin Syst Biol 2020. [DOI: 10.1016/j.coisb.2020.10.010] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/19/2022]
|
45
|
Lee CH, Salio M, Napolitani G, Ogg G, Simmons A, Koohy H. Predicting Cross-Reactivity and Antigen Specificity of T Cell Receptors. Front Immunol 2020; 11:565096. [PMID: 33193332 PMCID: PMC7642207 DOI: 10.3389/fimmu.2020.565096] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 05/23/2020] [Accepted: 09/07/2020] [Indexed: 12/13/2022]
Abstract
Adaptive immune recognition is mediated by specific interactions between heterodimeric T cell receptors (TCRs) and their cognate peptide-MHC (pMHC) ligands, and the methods to accurately predict TCR:pMHC interaction would have profound clinical, therapeutic and pharmaceutical applications. Herein, we review recent developments in predicting cross-reactivity and antigen specificity of TCR recognition. We discuss current experimental and computational approaches to investigate cross-reactivity and antigen-specificity of TCRs and highlight how integrating kinetic, biophysical and structural features may offer valuable insights in modeling immunogenicity. We further underscore the close inter-relationship of these two interconnected notions and the need to investigate each in the light of the other for a better understanding of T cell responsiveness for the effective clinical applications.
Affiliation(s)
- Chloe H. Lee
- MRC Human Immunology Unit, Medical Research Council (MRC) Weatherall Institute of Molecular Medicine (WIMM), John Radcliffe Hospital, University of Oxford, Oxford, United Kingdom
- MRC WIMM Centre for Computational Biology, Medical Research Council (MRC) Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, University of Oxford, Oxford, United Kingdom
- Mariolina Salio
- MRC Human Immunology Unit, Medical Research Council (MRC) Weatherall Institute of Molecular Medicine (WIMM), John Radcliffe Hospital, University of Oxford, Oxford, United Kingdom
- Giorgio Napolitani
- MRC Human Immunology Unit, Medical Research Council (MRC) Weatherall Institute of Molecular Medicine (WIMM), John Radcliffe Hospital, University of Oxford, Oxford, United Kingdom
- Graham Ogg
- MRC Human Immunology Unit, Medical Research Council (MRC) Weatherall Institute of Molecular Medicine (WIMM), John Radcliffe Hospital, University of Oxford, Oxford, United Kingdom
- Alison Simmons
- MRC Human Immunology Unit, Medical Research Council (MRC) Weatherall Institute of Molecular Medicine (WIMM), John Radcliffe Hospital, University of Oxford, Oxford, United Kingdom
- Translational Gastroenterology Unit, John Radcliffe Hospital, Oxford, United Kingdom
- Hashem Koohy
- MRC Human Immunology Unit, Medical Research Council (MRC) Weatherall Institute of Molecular Medicine (WIMM), John Radcliffe Hospital, University of Oxford, Oxford, United Kingdom
- MRC WIMM Centre for Computational Biology, Medical Research Council (MRC) Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, University of Oxford, Oxford, United Kingdom
|
46
|
T-cell repertoire analysis and metrics of diversity and clonality. Curr Opin Biotechnol 2020; 65:284-295. [PMID: 32889231 DOI: 10.1016/j.copbio.2020.07.010] [Citation(s) in RCA: 68] [Impact Index Per Article: 17.0] [Received: 07/01/2020] [Accepted: 07/22/2020] [Indexed: 12/12/2022]
Abstract
The recent developments of high-throughput bulk and single-cell sequencing technologies accelerated the understanding of the complexity of immune repertoire dynamics combined to transcriptomics. Also, profiling of cellular repertoires in health or disease requires statistical metrics to capture clonal diversity characterized by clones frequency, repertoire richness and convergence. Here we present the common technologies of bulk and single-cell sequencing of T-cell receptors (TCRs), discuss current knowledge regarding computational tools clustering and predicting specificity of TCR repertoires based on shared structural motifs and review main indices for repertoire diversity and convergence analyses. These tools represent potential biomarkers to decipher the fitness of immune repertoires in diseased or treated patients but also the presages and promises of computational approaches to revolutionize personalized immunotherapy.
|
47
|
Beshnova D, Ye J, Onabolu O, Moon B, Zheng W, Fu YX, Brugarolas J, Lea J, Li B. De novo prediction of cancer-associated T cell receptors for noninvasive cancer detection. Sci Transl Med 2020; 12:eaaz3738. [PMID: 32817363 PMCID: PMC7887928 DOI: 10.1126/scitranslmed.aaz3738] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 09/05/2019] [Revised: 03/05/2020] [Accepted: 07/21/2020] [Indexed: 01/21/2023]
Abstract
The adaptive immune system recognizes tumor antigens at an early stage to eradicate cancer cells. This process is accompanied by systemic proliferation of the tumor antigen-specific T lymphocytes. While detection of asymptomatic early-stage cancers is challenging due to small tumor size and limited somatic alterations, tracking peripheral T cell repertoire changes may provide an attractive solution to cancer diagnosis. Here, we developed a deep learning method called DeepCAT to enable de novo prediction of cancer-associated T cell receptors (TCRs). We validated DeepCAT using cancer-specific or non-cancer TCRs obtained from multiple major histocompatibility complex I (MHC-I) multimer-sorting experiments and demonstrated its prediction power for TCRs specific to cancer antigens. We blindly applied DeepCAT to distinguish over 250 patients with cancer from over 600 healthy individuals using blood TCR sequences and observed high prediction accuracy, with area under the curve (AUC) ≥ 0.95 for multiple early-stage cancers. This work sets the stage for using the peripheral blood TCR repertoire for noninvasive cancer detection.
Affiliation(s)
- Daria Beshnova
- Lyda Hill Department of Bioinformatics, UT Southwestern Medical Center, Dallas, TX 75390, USA
- Jianfeng Ye
- Lyda Hill Department of Bioinformatics, UT Southwestern Medical Center, Dallas, TX 75390, USA
- Oreoluwa Onabolu
- Department of Internal Medicine, UT Southwestern Medical Center, Dallas, TX 75390, USA
- Benjamin Moon
- Department of Pathology, UT Southwestern Medical Center, Dallas, TX 75390, USA
- Wenxin Zheng
- Department of Obstetrics and Gynecology, UT Southwestern Medical Center, Dallas, TX 75390, USA
- Yang-Xin Fu
- Department of Pathology, UT Southwestern Medical Center, Dallas, TX 75390, USA
- Department of Immunology, UT Southwestern Medical Center, Dallas, TX 75390, USA
- James Brugarolas
- Department of Internal Medicine, UT Southwestern Medical Center, Dallas, TX 75390, USA
- Jayanthi Lea
- Department of Obstetrics and Gynecology, UT Southwestern Medical Center, Dallas, TX 75390, USA
- Bo Li
- Lyda Hill Department of Bioinformatics, UT Southwestern Medical Center, Dallas, TX 75390, USA.
- Department of Immunology, UT Southwestern Medical Center, Dallas, TX 75390, USA
|
48
|
Immunogenomics of colorectal adenocarcinoma: Survival distinctions represented by immune receptor, CDR3 chemical features and high expression of BTN gene family members. Cancer Treat Res Commun 2020; 24:100196. [PMID: 32769037 DOI: 10.1016/j.ctarc.2020.100196] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/08/2020] [Revised: 07/08/2020] [Accepted: 07/11/2020] [Indexed: 11/24/2022]
Abstract
Immunogenomics studies of colon cancer have lagged behind other cancer types, such as melanoma and lung cancer, potentially limiting immunotherapy approaches to colon cancer, also less common than in the cases of melanoma and lung cancer. Here we applied an extensively benchmarked algorithm for retrieving immune receptor recombination sequencing reads from colon cancer exomes available via The Cancer Genome Atlas. Assessment of the complementarity determining region-3 chemical features represented by the reads revealed associations of distinct chemical features with better or worse survival rates, for both T-cell and B-cell receptor recombination reads. A follow up assessment of immune gene expression correlations with the recovery of the recombination reads revealed a consistent association of high level expression of BTN gene family members and better survival rates. Overall, these approaches provide several striking consistencies connecting immunogenomics features with colon cancer survival rates, potentially providing a basis for guiding immuno-therapy applications.
|
49
|
Abstract
T cells respond to threats in an antigen-specific manner using T cell receptors (TCRs) that recognize short peptide antigens presented on major histocompatibility complex (MHC) proteins. The TCR-peptide-MHC interaction mediated between a T cell and its target cell dictates its function and thereby influences its role in disease. A lack of approaches for antigen discovery has limited the fundamental understanding of the antigenic landscape of the overall T cell response. Recent advances in high-throughput sequencing, mass cytometry, microfluidics and computational biology have led to a surge in approaches to address the challenge of T cell antigen discovery. Here, we summarize the scope of this challenge, discuss in depth the recent exciting work and highlight the outstanding questions and remaining technical hurdles in this field.
|
50
|
Ostmeyer J, Lucas E, Christley S, Lea J, Monson N, Tiro J, Cowell LG. Biophysicochemical motifs in T cell receptor sequences as a potential biomarker for high-grade serous ovarian carcinoma. PLoS One 2020; 15:e0229569. [PMID: 32134923 PMCID: PMC7058380 DOI: 10.1371/journal.pone.0229569] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 11/19/2019] [Accepted: 02/09/2020] [Indexed: 11/18/2022]
Abstract
We previously showed, in a pilot study with publicly available data, that T cell receptor (TCR) repertoires from tumor infiltrating lymphocytes (TILs) could be distinguished from adjacent healthy tissue repertoires by the presence of TCRs bearing specific, biophysicochemical motifs in their antigen binding regions. We hypothesized that such motifs might allow development of a novel approach to cancer detection. The motifs were cancer specific and achieved high classification accuracy: we found distinct motifs for breast versus colorectal cancer-associated repertoires, and the colorectal cancer motif achieved 93% accuracy, while the breast cancer motif achieved 94% accuracy. In the current study, we sought to determine whether such motifs exist for ovarian cancer, a cancer type for which detection methods are urgently needed. We made two significant advances over the prior work. First, the prior study used patient-matched TILs and healthy repertoires, collecting healthy tissue adjacent to the tumors. The current study collected TILs from patients with high-grade serous ovarian carcinoma (HGSOC) and healthy ovary repertoires from cancer-free women undergoing hysterectomy/salpingo-oophorectomy for benign disease. Thus, the classification task is distinguishing women with cancer from women without cancer. Second, in the prior study, classification accuracy was measured by patient-hold-out cross-validation on the training data. In the current study, classification accuracy was additionally assessed on an independent cohort not used during model development to establish the generalizability of the motif to unseen data. Classification accuracy was 95% by patient-hold-out cross-validation on the training set and 80% when the model was applied to the blinded test set. 
The results on the blinded test set demonstrate a biophysicochemical TCR motif found overwhelmingly in women with HGSOC but rarely in women with healthy ovaries, strengthening the proposal that cancer detection approaches might benefit from incorporation of TCR motif-based biomarkers. Furthermore, these results call for studies on large cohorts to establish higher classification accuracies, as well as for studies in other cancer types.
Affiliation(s)
- Jared Ostmeyer
- Department of Population and Data Sciences, UT Southwestern Medical Center, Dallas, TX, United States of America
- Elena Lucas
- Department of Pathology, UT Southwestern Medical Center, Dallas, TX, United States of America
- Scott Christley
- Department of Population and Data Sciences, UT Southwestern Medical Center, Dallas, TX, United States of America
- Jayanthi Lea
- Department of Obstetrics and Gynecology, UT Southwestern Medical Center, Dallas, TX, United States of America
- Nancy Monson
- Department of Neurology and Neurotherapeutics, Department of Immunology, UT Southwestern Medical Center, Dallas, TX, United States of America
- Jasmin Tiro
- Department of Population and Data Sciences, UT Southwestern Medical Center, Dallas, TX, United States of America
- Lindsay G. Cowell
- Department of Population and Data Sciences, Department of Immunology, UT Southwestern Medical Center, Dallas, TX, United States of America
|