1. Liu J, Zhou S, Zang M, Liu C, Liu T, Wang Q. Multiple instance learning method based on convolutional neural network and self-attention for early cancer detection. Comput Methods Biomech Biomed Engin 2024:1-16. [PMID: 39644499 DOI: 10.1080/10255842.2024.2436909] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2024] [Revised: 10/07/2024] [Accepted: 11/27/2024] [Indexed: 12/09/2024]
Abstract
Early cancer detection using T-cell receptor sequencing (TCR-seq) and multiple instance learning methods has shown significant effectiveness. We introduce a multiple instance learning method based on convolutional neural networks and self-attention (MICA). First, MICA preprocesses TCR-seq using word vectors and then extracts features using convolutional neural networks. Second, MICA uses an enhanced self-attention mechanism to extract relational features of instances. Finally, MICA can extract the crucial TCR-seq. After cross-validation, MICA achieves an area under the curve (AUC) of 0.911 and 0.946 on the lung and thyroid cancer datasets, which are 7.1% and 2.1% higher than other methods, respectively.
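The pipeline described in this abstract (per-sequence features, self-attention across the instances of a bag, then a bag-level prediction) can be illustrated with a small sketch. This is not the MICA implementation: it assumes each TCR has already been turned into a feature vector (the word-vector and CNN steps are omitted), uses identity projections for the attention, and the names `self_attention`, `bag_score`, and `scorer` are hypothetical.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def self_attention(instances):
    # Scaled dot-product self-attention with query = key = value = the
    # instance features (an assumption; real models learn projections).
    d = len(instances[0])
    mixed = []
    for q in instances:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in instances])
        mixed.append([sum(w * v[j] for w, v in zip(weights, instances))
                      for j in range(d)])
    return mixed

def bag_score(instances, scorer):
    # Score each context-aware instance, then attention-pool the scores
    # into a single bag-level probability. `scorer` is a hypothetical
    # weight vector standing in for a trained classifier head.
    logits = [dot(x, scorer) for x in self_attention(instances)]
    pool = softmax(logits)  # per-instance importance within the bag
    bag_logit = sum(p * l for p, l in zip(pool, logits))
    return 1.0 / (1.0 + math.exp(-bag_logit)), pool

if __name__ == "__main__":
    bag = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]  # three toy instances
    prob, importance = bag_score(bag, [2.0, -2.0])
    print(prob, importance)
```

The returned `pool` weights are one simple way to rank instances, which loosely mirrors the abstract's final step of extracting the crucial TCR sequences.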
Affiliation(s)
- Junjiang Liu
  - School of Information and Electrical Engineering, Ludong University, Shandong, China
- Shusen Zhou
  - School of Information and Electrical Engineering, Ludong University, Shandong, China
- Mujun Zang
  - School of Information and Electrical Engineering, Ludong University, Shandong, China
- Chanjuan Liu
  - School of Information and Electrical Engineering, Ludong University, Shandong, China
- Tong Liu
  - School of Information and Electrical Engineering, Ludong University, Shandong, China
- Qingjun Wang
  - School of Information and Electrical Engineering, Ludong University, Shandong, China
2. Liu Z, Zhou Y, Lu J, Gong T, Ibáñez E, Cifuentes A, Lu W. Microfluidic biosensors for biomarker detection in body fluids: a key approach for early cancer diagnosis. Biomark Res 2024; 12:153. [PMID: 39639411 PMCID: PMC11622463 DOI: 10.1186/s40364-024-00697-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2024] [Accepted: 11/22/2024] [Indexed: 12/07/2024] [Open Access]
Abstract
Early detection of cancer significantly improves patient outcomes, with biomarkers offering a promising avenue for earlier and more precise diagnoses. Microfluidic biosensors have emerged as a powerful tool for detecting these biomarkers in body fluids, providing enhanced sensitivity, specificity, and rapid analysis. This review focuses on recent advances in microfluidic biosensors from 2018 to 2024, detailing their operational principles, fabrication techniques, and integration with nanotechnology for cancer biomarker detection. Additionally, we have reviewed recent innovations in several aspects of microfluidic biosensors, such as novel detection technologies, nanomaterials and novel microfluidic chip structures, which significantly enhance detection capabilities. We highlight key biomarkers pertinent to early cancer detection and explore how these innovations in biosensor technology contribute to the evolving landscape of personalized medicine. We further explore how these technologies could be incorporated into clinical cancer diagnostic workflows to improve early detection and treatment outcomes. These innovations could help enable more precise and personalized cancer diagnostics. In addition, this review addresses several important issues such as enhancing the scalability and sensitivity of these biosensors in clinical settings and points out future possibilities of combining artificial intelligence diagnostics with microfluidic biosensors to optimize their practical applications. This overview aims to guide future research and clinical applications by addressing current challenges and identifying opportunities for further development in the field of biomarker research.
Affiliation(s)
- Zhiting Liu
  - School of Medicine and Health, Harbin Institute of Technology, 92 Xidazhi Street, Nangang District, Harbin, 150001, China
  - National and Local Joint Engineering Laboratory for Synthesis Transformation and Separation of Extreme Environmental Nutrients, 92 Xidazhi Street, Nangang District, Harbin, 150001, China
- Yingyu Zhou
  - School of Medicine and Health, Harbin Institute of Technology, 92 Xidazhi Street, Nangang District, Harbin, 150001, China
  - Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, Henan, China
  - National and Local Joint Engineering Laboratory for Synthesis Transformation and Separation of Extreme Environmental Nutrients, 92 Xidazhi Street, Nangang District, Harbin, 150001, China
- Jia Lu
  - School of Mechatronics Engineering, Harbin Institute of Technology, 92 Xidazhi Street, Nangang District, Harbin, 150001, China
  - Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, Henan, China
- Ting Gong
  - School of Medicine and Health, Harbin Institute of Technology, 92 Xidazhi Street, Nangang District, Harbin, 150001, China
  - National and Local Joint Engineering Laboratory for Synthesis Transformation and Separation of Extreme Environmental Nutrients, 92 Xidazhi Street, Nangang District, Harbin, 150001, China
- Elena Ibáñez
  - Laboratory of Foodomics, Institute of Food Science Research, CIAL, CSIC, Nicolás Cabrera 9, Madrid, 28049, Spain
- Alejandro Cifuentes
  - Laboratory of Foodomics, Institute of Food Science Research, CIAL, CSIC, Nicolás Cabrera 9, Madrid, 28049, Spain
- Weihong Lu
  - School of Medicine and Health, Harbin Institute of Technology, 92 Xidazhi Street, Nangang District, Harbin, 150001, China
  - Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, Henan, China
  - National and Local Joint Engineering Laboratory for Synthesis Transformation and Separation of Extreme Environmental Nutrients, 92 Xidazhi Street, Nangang District, Harbin, 150001, China
3. Tayebi Z, Ali S, Patterson M. TCellR2Vec: efficient feature selection for TCR sequences for cancer classification. PeerJ Comput Sci 2024; 10:e2239. [PMID: 39650499 PMCID: PMC11622898 DOI: 10.7717/peerj-cs.2239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Accepted: 07/14/2024] [Indexed: 12/11/2024]
Abstract
Cancer remains one of the leading causes of death globally. New immunotherapies that harness the patient's immune system to fight cancer show promise, but their development requires analyzing the diversity of immune cells called T-cells. T-cells have receptors that recognize and bind to cancer cells. Sequencing these T-cell receptors can provide insights into the immune response, but extracting useful information is challenging. In this study, we propose a new computational method, TCellR2Vec, to select key features from T-cell receptor sequences for classifying different cancer types. We extracted features like amino acid composition, charge, and diversity measures and combined them with other sequence embedding techniques. For our experiments, we used a dataset of over 50,000 T-cell receptor sequences from five cancer types, which showed that TCellR2Vec improved classification accuracy and efficiency over baseline methods. These results demonstrate TCellR2Vec's ability to capture informative aspects of complex T-cell receptor sequences. By improving computational analysis of the immune response, TCellR2Vec could aid the development of personalized immunotherapies tailored to each patient's T-cells. This has important implications for creating more effective cancer treatments based on the individual's immune system.
Affiliation(s)
- Zahra Tayebi
  - Computer Science, Georgia State University, Atlanta, GA, United States of America
- Sarwan Ali
  - Computer Science, Georgia State University, Atlanta, GA, United States of America
- Murray Patterson
  - Computer Science, Georgia State University, Atlanta, GA, United States of America
4. Huang Q, Zhu J. Regulatory T cell-based therapy in type 1 diabetes: Latest breakthroughs and evidence. Int Immunopharmacol 2024; 140:112724. [PMID: 39098233 DOI: 10.1016/j.intimp.2024.112724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2024] [Revised: 07/10/2024] [Accepted: 07/16/2024] [Indexed: 08/06/2024]
Abstract
Autoimmune diseases (ADs) are among the most significant health complications, with their incidence rising in recent years. Type 1 diabetes (T1D), an AD, targets the insulin-producing β cells in the pancreas, leading to chronic insulin deficiency in genetically susceptible individuals. Regulatory immune cells, particularly regulatory T-cells (Tregs), have been shown to play a crucial role in the pathogenesis of diabetes by modulating immune responses. In diabetic patients, Tregs often exhibit diminished effectiveness due to various factors, such as instability in forkhead box P3 (Foxp3) expression or abnormal production of the proinflammatory cytokine interferon-gamma (IFN-γ) by autoreactive T-cells. Consequently, Tregs represent a potential therapeutic target for diabetes treatment. Building on the successful clinical outcomes of chimeric antigen receptor (CAR) T-cell therapy in cancer treatment, particularly in leukemias, the concept of designing and utilizing CAR Tregs for ADs has emerged. This review summarizes the findings on Treg targeting in T1D and discusses the benefits and limitations of this treatment approach for patients suffering from T1D.
Affiliation(s)
- Qiongxiao Huang
  - Center for Reproductive Medicine, Department of Reproductive Endocrinology, Zhejiang Provincial People's Hospital (Affiliated People's Hospital, Hangzhou Medical College), Hangzhou, Zhejiang 310014, China
- Jing Zhu
  - Center for Reproductive Medicine, Department of Reproductive Endocrinology, Zhejiang Provincial People's Hospital (Affiliated People's Hospital, Hangzhou Medical College), Hangzhou, Zhejiang 310014, China
5. Zhang M, Cheng Q, Wei Z, Xu J, Wu S, Xu N, Zhao C, Yu L, Feng W. BertTCR: a Bert-based deep learning framework for predicting cancer-related immune status based on T cell receptor repertoire. Brief Bioinform 2024; 25:bbae420. [PMID: 39177262 PMCID: PMC11342255 DOI: 10.1093/bib/bbae420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2024] [Revised: 07/24/2024] [Accepted: 08/08/2024] [Indexed: 08/24/2024] [Open Access]
Abstract
The T cell receptor (TCR) repertoire is pivotal to the human immune system, and understanding its nuances can significantly enhance our ability to forecast cancer-related immune responses. However, existing methods often overlook the intra- and inter-sequence interactions of T cell receptors (TCRs), limiting the development of sequence-based cancer-related immune status predictions. To address this challenge, we propose BertTCR, an innovative deep learning framework designed to predict cancer-related immune status using TCRs. BertTCR combines a pre-trained protein large language model with deep learning architectures, enabling it to extract deeper contextual information from TCRs. Compared to three state-of-the-art sequence-based methods, BertTCR improves the AUC on an external validation set for thyroid cancer detection by 21 percentage points. Additionally, this model was trained on over 2000 publicly available TCR libraries covering 17 types of cancer and healthy samples, and it has been validated on multiple public external datasets for its ability to distinguish cancer patients from healthy individuals. Furthermore, BertTCR can accurately classify various cancer types and healthy individuals. Overall, BertTCR is an advanced method for cancer-related immune status forecasting based on TCRs, offering promising potential for a wide range of immune status prediction tasks.
Affiliation(s)
- Min Zhang
  - College of Intelligent Systems Science and Engineering, Harbin Engineering University, No. 145 Nantong Street, Nangang District, Harbin, 150001, China
- Qi Cheng
  - College of Intelligent Systems Science and Engineering, Harbin Engineering University, No. 145 Nantong Street, Nangang District, Harbin, 150001, China
- Zhenyu Wei
  - College of Intelligent Systems Science and Engineering, Harbin Engineering University, No. 145 Nantong Street, Nangang District, Harbin, 150001, China
- Jiayu Xu
  - College of Intelligent Systems Science and Engineering, Harbin Engineering University, No. 145 Nantong Street, Nangang District, Harbin, 150001, China
- Shiwei Wu
  - College of Intelligent Systems Science and Engineering, Harbin Engineering University, No. 145 Nantong Street, Nangang District, Harbin, 150001, China
- Nan Xu
  - Institute of Biomedical Engineering and Technology, Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, No. 500 Dongchuan Road, Shanghai, 200241, China
  - Shanghai Unicar-Therapy Bio-medicine Technology Co., Ltd, No. 1525 Minqiang Road, Shanghai, 201612, China
- Chengkui Zhao
  - College of Intelligent Systems Science and Engineering, Harbin Engineering University, No. 145 Nantong Street, Nangang District, Harbin, 150001, China
  - Shanghai Unicar-Therapy Bio-medicine Technology Co., Ltd, No. 1525 Minqiang Road, Shanghai, 201612, China
- Lei Yu
  - Institute of Biomedical Engineering and Technology, Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, No. 500 Dongchuan Road, Shanghai, 200241, China
  - Shanghai Unicar-Therapy Bio-medicine Technology Co., Ltd, No. 1525 Minqiang Road, Shanghai, 201612, China
- Weixing Feng
  - College of Intelligent Systems Science and Engineering, Harbin Engineering University, No. 145 Nantong Street, Nangang District, Harbin, 150001, China
6. Yu X, Pan M, Ye J, Hathaway CA, Tworoger SS, Lea J, Li B. Quantifiable TCR repertoire changes in prediagnostic blood specimens among patients with high-grade ovarian cancer. Cell Rep Med 2024; 5:101612. [PMID: 38878776 PMCID: PMC11293308 DOI: 10.1016/j.xcrm.2024.101612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 04/16/2024] [Accepted: 05/20/2024] [Indexed: 06/25/2024]
Abstract
High-grade ovarian cancer (HGOC) is a major cause of death in women. Early detection of HGOC usually leads to a cure, yet it remains a clinical challenge, with over 90% of HGOCs diagnosed at advanced stages. This is mainly because conventional biomarkers are not sensitive enough to detect the microscopic yet metastatic early lesions. In this study, we sequence the blood T cell receptor (TCR) repertoires of 466 patients with ovarian cancer and controls and systematically investigate the immune repertoire signatures in HGOCs. We observe quantifiable changes of selected TCRs in HGOCs that are reproducible in multiple independent cohorts. Importantly, these changes are stronger during stage I. Using pre-diagnostic patient blood samples from the Nurses' Health Study, we confirm that HGOC signals can be detected in the blood TCR repertoire up to 4 years preceding conventional diagnosis. Our findings may provide the basis for future immune-based HGOC early detection criteria.
Affiliation(s)
- Xuexin Yu
  - Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Mingyao Pan
  - Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Jianfeng Ye
  - Department of Neuroscience, UT Southwestern Medical Center, Dallas, TX 75390, USA
- Shelley S Tworoger
  - Knight Cancer Institute and Division of Oncological Sciences, Oregon Health & Science University, Portland, OR 97239, USA
- Jayanthi Lea
  - Department of Gynecology, UT Southwestern Medical Center, Dallas, TX 75390, USA
- Bo Li
  - Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
7. Cai Y, Luo M, Yang W, Xu C, Wang P, Xue G, Jin X, Cheng R, Que J, Zhou W, Pang B, Xu S, Li Y, Jiang Q, Xu Z. The Deep Learning Framework iCanTCR Enables Early Cancer Detection Using the T-cell Receptor Repertoire in Peripheral Blood. Cancer Res 2024; 84:1915-1928. [PMID: 38536129 DOI: 10.1158/0008-5472.can-23-0860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2023] [Revised: 07/20/2023] [Accepted: 03/19/2024] [Indexed: 06/05/2024]
Abstract
T cells recognize tumor antigens and initiate an anticancer immune response in the very early stages of tumor development, and the antigen specificity of T cells is determined by the T-cell receptor (TCR). Therefore, monitoring changes in the TCR repertoire in peripheral blood may offer a strategy to detect various cancers at a relatively early stage. Here, we developed the deep learning framework iCanTCR to identify patients with cancer based on the TCR repertoire. The iCanTCR framework uses TCRβ sequences from an individual as an input and outputs the predicted cancer probability. The model was trained on over 2,000 publicly available TCR repertoires from 11 types of cancer and healthy controls. Analysis of several additional publicly available datasets validated the ability of iCanTCR to distinguish patients with cancer from noncancer individuals and demonstrated the capability of iCanTCR for the accurate classification of multiple cancers. Importantly, iCanTCR precisely identified individuals with early-stage cancer with an AUC of 86%. Altogether, this work provides a liquid biopsy approach to capture immune signals from peripheral blood for noninvasive cancer diagnosis. SIGNIFICANCE: Development of a deep learning-based method for multicancer detection using the TCR repertoire in the peripheral blood establishes the potential of evaluating circulating immune signals for noninvasive early cancer detection.
Affiliation(s)
- Yideng Cai
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Meng Luo
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Wenyi Yang
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Chang Xu
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Pingping Wang
  - School for Interdisciplinary Medicine and Engineering, Harbin Medical University, Harbin, China
- Guangfu Xue
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Xiyun Jin
  - School for Interdisciplinary Medicine and Engineering, Harbin Medical University, Harbin, China
- Rui Cheng
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Jinhao Que
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Wenyang Zhou
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Boran Pang
  - Center for Difficult and Complicated Abdominal Surgery, Shanghai Tenth People's Hospital, Tongji University School of Medicine, Shanghai, China
- Shouping Xu
  - Department of Breast Cancer, Harbin Medical University Cancer Hospital, Harbin, China
- Yu Li
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Qinghua Jiang
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
  - School for Interdisciplinary Medicine and Engineering, Harbin Medical University, Harbin, China
- Zhaochun Xu
  - School for Interdisciplinary Medicine and Engineering, Harbin Medical University, Harbin, China
8. Zaslavsky ME, Craig E, Michuda JK, Sehgal N, Ram-Mohan N, Lee JY, Nguyen KD, Hoh RA, Pham TD, Röltgen K, Lam B, Parsons ES, Macwana SR, DeJager W, Drapeau EM, Roskin KM, Cunningham-Rundles C, Moody MA, Haynes BF, Goldman JD, Heath JR, Nadeau KC, Pinsky BA, Blish CA, Hensley SE, Jensen K, Meyer E, Balboni I, Utz PJ, Merrill JT, Guthridge JM, James JA, Yang S, Tibshirani R, Kundaje A, Boyd SD. Disease diagnostics using machine learning of immune receptors. bioRxiv 2024:2022.04.26.489314. [PMID: 35547855 PMCID: PMC9094102 DOI: 10.1101/2022.04.26.489314] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 11/25/2022]
Abstract
Clinical diagnosis typically incorporates physical examination, patient history, and various laboratory tests and imaging studies, but makes limited use of the human immune system's own record of antigen exposures encoded by receptors on B cells and T cells. We analyzed immune receptor datasets from 593 individuals to develop MAchine Learning for Immunological Diagnosis (Mal-ID), an interpretive framework to screen for multiple illnesses simultaneously or precisely test for one condition. This approach detects specific infections, autoimmune disorders, vaccine responses, and disease severity differences. Human-interpretable features of the model recapitulate known immune responses to SARS-CoV-2, Influenza, and HIV, highlight antigen-specific receptors, and reveal distinct characteristics of Systemic Lupus Erythematosus and Type-1 Diabetes autoreactivity. This analysis framework has broad potential for scientific and clinical interpretation of human immune responses.
9. Qian X, Yang G, Li F, Zhang X, Zhu X, Lai X, Xiao X, Wang T, Wang J. DeepLION2: deep multi-instance contrastive learning framework enhancing the prediction of cancer-associated T cell receptors by attention strategy on motifs. Front Immunol 2024; 15:1345586. [PMID: 38515756 PMCID: PMC10956474 DOI: 10.3389/fimmu.2024.1345586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 02/19/2024] [Indexed: 03/23/2024] [Open Access]
Abstract
Introduction: T cell receptor (TCR) repertoires provide valuable insights into complex human diseases, including cancers. Recent advancements in immune sequencing technology have significantly improved our understanding of TCR repertoire. Some computational methods have been devised to identify cancer-associated TCRs and enable cancer detection using TCR sequencing data. However, the existing methods are often limited by their inadequate consideration of the correlations among TCRs within a repertoire, hindering the identification of crucial TCRs. Additionally, the sparsity of cancer-associated TCR distribution presents a challenge in accurate prediction. Methods: To address these issues, we presented DeepLION2, an innovative deep multi-instance contrastive learning framework specifically designed to enhance cancer-associated TCR prediction. DeepLION2 leveraged content-based sparse self-attention, focusing on the top k related TCRs for each TCR, to effectively model inter-TCR correlations. Furthermore, it adopted a contrastive learning strategy for bootstrapping parameter updates of the attention matrix, preventing the model from fixating on non-cancer-associated TCRs. Results: Extensive experimentation on diverse patient cohorts, encompassing over ten cancer types, demonstrated that DeepLION2 significantly outperformed current state-of-the-art methods in terms of accuracy, sensitivity, specificity, Matthews correlation coefficient, and area under the curve (AUC). Notably, DeepLION2 achieved impressive AUC values of 0.933, 0.880, and 0.763 on thyroid, lung, and gastrointestinal cancer cohorts, respectively. Furthermore, it effectively identified cancer-associated TCRs along with their key motifs, highlighting the amino acids that play a crucial role in TCR-peptide binding. Conclusion: These compelling results underscore DeepLION2's potential for enhancing cancer detection and facilitating personalized cancer immunotherapy.
DeepLION2 is publicly available on GitHub, at https://github.com/Bioinformatics7181/DeepLION2, for academic use only.
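The "top k related TCRs for each TCR" idea in the Methods can be illustrated with a minimal sketch of top-k sparse self-attention. It is not taken from the DeepLION2 repository: it assumes each TCR is already a feature vector, uses the raw features as queries, keys, and values, and the function name `topk_sparse_attention` is hypothetical.

```python
import math

def topk_sparse_attention(features, k):
    # For every instance, keep only its k highest-scoring partners and
    # renormalise the softmax over that subset; all other instances get
    # zero attention weight. Identity projections are an assumption.
    d = len(features[0])
    out = []
    for q in features:
        scores = [sum(a * b for a, b in zip(q, key)) / math.sqrt(d)
                  for key in features]
        keep = sorted(range(len(scores)),
                      key=lambda i: scores[i], reverse=True)[:k]
        m = max(scores[i] for i in keep)
        exps = {i: math.exp(scores[i] - m) for i in keep}
        total = sum(exps.values())
        out.append([sum(exps[i] / total * features[i][j] for i in keep)
                    for j in range(d)])
    return out

if __name__ == "__main__":
    feats = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]  # three toy TCR vectors
    print(topk_sparse_attention(feats, k=2))
```

With `k` equal to the number of instances this reduces to ordinary dense self-attention; smaller `k` limits each TCR to its closest neighbours, which is the sparsity the abstract refers to.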
Affiliation(s)
- Xinyang Qian
  - School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
  - Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
- Guang Yang
  - Department of Clinical Oncology, The Second Affiliated Hospital of Air Force Medical University, Xi’an, China
- Fan Li
  - School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
  - Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
- Xuanping Zhang
  - School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
  - Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
- Xiaoyan Zhu
  - School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
  - Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
- Xin Lai
  - School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
  - Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
- Xiao Xiao
  - Genomics Institute, Geneplus-Shenzhen, Shenzhen, China
- Tao Wang
  - Department of Thoracic Surgery, The Second Affiliated Hospital of Air Force Medical University, Xi’an, China
- Jiayin Wang
  - School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
  - Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
10. Zhang Y, Zhou X, Zhong Y, Chen X, Li Z, Li R, Qin P, Wang S, Yin J, Liu S, Jiang M, Yu Q, Hou Y, Liu S, Wu L. Pan-cancer scRNA-seq analysis reveals immunological and diagnostic significance of the peripheral blood mononuclear cells. Hum Mol Genet 2024; 33:342-354. [PMID: 37944069 DOI: 10.1093/hmg/ddad187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Revised: 12/02/2023] [Accepted: 10/19/2023] [Indexed: 11/12/2023] [Open Access]
Abstract
Peripheral blood mononuclear cells (PBMCs) reflect systemic immune response during cancer progression. However, a comprehensive understanding of the composition and function of PBMCs in cancer patients is lacking, and the potential of these features to assist cancer diagnosis is also unclear. Here, the compositional and status differences between cancer patients and healthy donors in PBMCs were investigated by single-cell RNA sequencing (scRNA-seq), involving 262,025 PBMCs from 68 cancer samples and 14 healthy samples. We observed an enhanced activation and differentiation of most immune subsets in cancer patients, along with reduction of naïve T cells, expansion of macrophages, impairment of NK cells and myeloid cells, as well as tumor promotion and immunosuppression. Based on characteristics including differential cell type abundances and/or hub genes identified from weighted gene co-expression network analysis (WGCNA) modules of each major cell type, we applied logistic regression to construct cancer diagnosis models. Furthermore, we found that the above models can distinguish cancer patients and healthy donors with high sensitivity. Our study provided new insights into using the features of PBMCs in non-invasive cancer diagnosis.
Affiliation(s)
- Yuanhang Zhang
  - College of Life Sciences, University of Chinese Academy of Sciences, Yuquan Road, Shijingshan District, Beijing 100049, China
  - BGI Research, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Xiaorui Zhou
  - College of Life Sciences, University of Chinese Academy of Sciences, Yuquan Road, Shijingshan District, Beijing 100049, China
  - BGI Research, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Yu Zhong
  - BGI Research, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Xi Chen
  - BGI Research, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Zeyu Li
  - College of Life Sciences, University of Chinese Academy of Sciences, Yuquan Road, Shijingshan District, Beijing 100049, China
  - BGI Research, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Rui Li
  - BGI Research, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Pengfei Qin
  - BGI Research, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Shanshan Wang
  - BGI Research, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Jianhua Yin
  - BGI Research, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Shang Liu
  - BGI Research, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Miaomiao Jiang
  - BGI Research, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Qichao Yu
  - College of Life Sciences, University of Chinese Academy of Sciences, Yuquan Road, Shijingshan District, Beijing 100049, China
  - BGI Research, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Yong Hou
  - BGI Research, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Shiping Liu
  - BGI Research, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Liang Wu
  - BGI Research, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
  - JFL-BGI STOmics Center, Jinfeng Laboratory, Gaoteng Avenue, Jiulongpo District, Chongqing 401329, China
11. Chen C, Liu Y, Yao J, Wang K, Zhang M, Shi F, Tian Y, Gao L, Ying Y, Pan Q, Wang H, Wu J, Qi X, Wang Y, Xu D. Deep learning approaches for differentiating thyroid nodules with calcification: a two-center study. BMC Cancer 2023; 23:1139. [PMID: 37996814 PMCID: PMC10668439 DOI: 10.1186/s12885-023-11456-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Accepted: 09/27/2023] [Indexed: 11/25/2023] [Open Access]
Abstract
BACKGROUND: Calcification is a common phenomenon in both benign and malignant thyroid nodules. However, the clinical significance of calcification remains unclear. Therefore, we explored a more objective method for distinguishing between benign and malignant thyroid calcified nodules. METHODS: This retrospective study, conducted at two centers, involved a total of 631 thyroid nodules, all of which were pathologically confirmed. Ultrasound image sets were employed for analysis. The primary evaluation index was the area under the receiver operating characteristic curve (AUROC). We compared the diagnostic performance of deep learning (DL) methods with that of radiologists and determined whether DL could enhance the diagnostic capabilities of radiologists. RESULTS: The Xception classification model exhibited the highest performance, achieving an AUROC of up to 0.970, followed by the DenseNet169 model, which attained an AUROC of up to 0.959. Notably, both DL models outperformed radiologists (P < 0.05). The success of the Xception model can be attributed to its incorporation of depthwise separable convolution, which effectively reduces the model's parameter count. This feature enables the model to capture features more effectively during the feature extraction process, resulting in superior performance, particularly when dealing with limited data. CONCLUSIONS: This study conclusively demonstrated that DL outperformed radiologists in differentiating between benign and malignant calcified thyroid nodules. Additionally, the diagnostic capabilities of radiologists could be enhanced with the aid of DL.
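The parameter saving that the Results attribute to Xception's separable convolutions can be checked with simple arithmetic. The sketch below assumes the usual depthwise separable layout (one k×k filter per input channel followed by a 1×1 pointwise convolution) and ignores bias terms; the layer sizes in the example are illustrative and do not come from the study.

```python
def standard_conv_params(k, c_in, c_out):
    # A regular convolution learns one k x k x c_in filter per output channel.
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    # Depthwise step: one k x k filter per input channel.
    depthwise = k * k * c_in
    # Pointwise step: a 1 x 1 convolution mixing c_in channels into c_out.
    pointwise = c_in * c_out
    return depthwise + pointwise

if __name__ == "__main__":
    k, c_in, c_out = 3, 128, 256  # illustrative layer sizes
    print(standard_conv_params(k, c_in, c_out))   # 294912
    print(separable_conv_params(k, c_in, c_out))  # 33920
```

For this example layer the separable form needs roughly 8.7 times fewer weights, which is the kind of reduction the abstract links to better behaviour on limited data.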
Affiliation(s)
- Chen Chen
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, China
- Wenling Big Data and Artificial Intelligence Institute in Medicine, Taizhou, 317502, China
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), Taizhou, 317502, China
| | - Yuanzhen Liu
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, China
- Wenling Big Data and Artificial Intelligence Institute in Medicine, Taizhou, 317502, China
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), Taizhou, 317502, China
| | - Jincao Yao
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, China
- Zhejiang Provincial Research Center for Cancer Intelligent Diagnosis and Molecular Technology, Hangzhou, 310022, China
- Key Laboratory of Head & Neck Cancer Translational Research of Zhejiang Province, Hangzhou, 310022, China
| | - Kai Wang
- Department of Ultrasound, The Affiliated Dongyang Hospital of Wenzhou Medical University, Dongyang, 317502, China
| | - Maoliang Zhang
- Department of Ultrasound, The Affiliated Dongyang Hospital of Wenzhou Medical University, Dongyang, 317502, China
| | - Fang Shi
- Capacity Building and Continuing Education Center of National Health Commission, Beijing, 100098, China
| | - Yuan Tian
- Capacity Building and Continuing Education Center of National Health Commission, Beijing, 100098, China
| | - Lu Gao
- Capacity Building and Continuing Education Center of National Health Commission, Beijing, 100098, China
| | - Yajun Ying
- Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), Taizhou, 317502, China
| | - Qianmeng Pan
- Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), Taizhou, 317502, China
| | - Hui Wang
- Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), Taizhou, 317502, China
| | - Jinxin Wu
- Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), Taizhou, 317502, China
| | - Xiaoqing Qi
- Department of Ultrasound, Hangzhou Ninth People's Hospital, Hangzhou, 311225, China
| | - Yifan Wang
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, China.
- Wenling Big Data and Artificial Intelligence Institute in Medicine, Taizhou, 317502, China.
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), Taizhou, 317502, China.
| | - Dong Xu
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, China.
- Wenling Big Data and Artificial Intelligence Institute in Medicine, Taizhou, 317502, China.
- Taizhou Key Laboratory of Minimally Invasive Interventional Therapy & Artificial Intelligence, Taizhou Campus of Zhejiang Cancer Hospital (Taizhou Cancer Hospital), Taizhou, 317502, China.
| |
|
12
|
Shen HY, Xu JL, Zhu Z, Xu HP, Liang MX, Xu D, Chen WQ, Tang JH, Fang Z, Zhang J. Integration of bioinformatics and machine learning strategies identifies APM-related gene signatures to predict clinical outcomes and therapeutic responses for breast cancer patients. Neoplasia 2023; 45:100942. [PMID: 37839160 PMCID: PMC10587768 DOI: 10.1016/j.neo.2023.100942] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Accepted: 10/10/2023] [Indexed: 10/17/2023]
Abstract
BACKGROUND Tumor antigenicity and efficiency of antigen presentation jointly influence tumor immunogenicity, which largely determines the effectiveness of immune checkpoint blockade (ICB). However, the role of altered antigen processing and presentation machinery (APM) in breast cancer (BRCA) has not been fully elucidated. METHODS A series of bioinformatic analyses and machine learning strategies were performed to construct APM-related gene signatures to guide personalized treatment for BRCA patients. A single-sample gene set enrichment analysis (ssGSEA) algorithm and weighted gene co-expression network analysis (WGCNA) were combined to screen for BRCA-specific APM-related genes. The non-negative matrix factorization (NMF) algorithm was used to divide the cohort into different clusters and the fgsea algorithm was applied to investigate the altered signaling pathways. Random survival forest (RSF) and the least absolute shrinkage and selection operator (Lasso) Cox regression analysis were combined to construct an APM-related risk score (APMrs) signature to predict overall survival. Furthermore, a nomogram and decision tree were generated to improve predictive accuracy and risk stratification for individual patients. Based on the Tumor Immune Dysfunction and Exclusion (TIDE) method, random forest (RF) and Lasso logistic regression models were combined to establish an APM-related immunotherapeutic response score (APMis). Finally, immune infiltration, immunomodulators, mutational patterns, and potentially applicable drugs were comprehensively analyzed in different APM-related risk groups. IHC staining was used to assess the expression of APM-related genes in clinical samples. RESULTS In this study, APMrs and APMis showed favorable performances in risk stratification and therapeutic prediction for BRCA patients. APMrs exhibited more powerful prognostic capacity and more accurate survival prediction than conventional clinicopathological features. APMrs was closely associated with distinct mutational patterns, immune cell infiltration and immunomodulator expression. Furthermore, the two APM-related gene signatures were independently validated in external cohorts with prognosis or immunotherapeutic responses. Potentially applicable drugs and targets were mined in the APMrs-high group. APM-related genes were further validated in our in-house samples. CONCLUSION The APM-related gene signatures established in our study could improve the personalized assessment of survival risk and guide ICB decision-making for BRCA patients.
Affiliation(s)
- Hong-Yu Shen
- Gusu School, The Affiliated Suzhou Hospital of Nanjing Medical University, Nanjing Medical University, Suzhou, China; Department of General Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
| | - Jia-Lin Xu
- Gusu School, The Affiliated Suzhou Hospital of Nanjing Medical University, Nanjing Medical University, Suzhou, China
| | - Zhen Zhu
- Department of General Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
| | - Hai-Ping Xu
- Department of General Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
| | - Ming-Xing Liang
- Department of General Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
| | - Di Xu
- Department of General Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
| | - Wen-Quan Chen
- Department of General Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
| | - Jin-Hai Tang
- Gusu School, The Affiliated Suzhou Hospital of Nanjing Medical University, Nanjing Medical University, Suzhou, China; Department of General Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China.
| | - Zheng Fang
- Department of General Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China.
| | - Jian Zhang
- Department of General Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China.
| |
|
13
|
Nagano Y, Chain B. tidytcells: standardizer for TR/MH nomenclature. Front Immunol 2023; 14:1276106. [PMID: 37954585 PMCID: PMC10634431 DOI: 10.3389/fimmu.2023.1276106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 10/09/2023] [Indexed: 11/14/2023] Open
Abstract
T cell receptors (TR) underpin the diversity and specificity of T cell activity. As such, TR repertoire data is valuable both as an adaptive immune biomarker and as a way to identify candidate therapeutic TR. Analysis of TR repertoires relies heavily on computational analysis, and therefore it is of vital importance that the data is standardized and computer-readable. However, in practice, the usage of different abbreviations and non-standard nomenclature in different datasets makes this data pre-processing non-trivial. tidytcells is a lightweight, platform-independent Python package that provides easy-to-use standardization tools specifically designed for TR nomenclature. The software is open-sourced under the MIT license and is available to install from the Python Package Index (PyPI). At the time of publishing, tidytcells is on version 2.0.0.
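To illustrate why this pre-processing is non-trivial, here is a toy normalizer for TR gene symbols. It is a sketch only: it does not use or reproduce the tidytcells API or its rule set, the function name `normalize_tr_symbol` is hypothetical, and the two rules it applies (rewriting a `TCR` prefix to `TR` and dropping zero padding) are assumptions about the kinds of variation seen across datasets.

```python
import re


def normalize_tr_symbol(raw):
    """Toy normalizer: uppercase, 'TCR' prefix -> 'TR', strip zero padding."""
    symbol = raw.strip().upper()
    # Keep any allele designation (the part after '*') untouched.
    symbol, _, allele = symbol.partition("*")
    # Some datasets write 'TCRBV...' where IMGT-style symbols use 'TRBV...'.
    if symbol.startswith("TCR"):
        symbol = "TR" + symbol[3:]
    # Remove zero padding in gene numbers, e.g. '07-02' -> '7-2'.
    symbol = re.sub(r"(?<![0-9])0+(?=[0-9])", "", symbol)
    return f"{symbol}*{allele}" if allele else symbol


print(normalize_tr_symbol(" tcrbv07-02*01 "))
print(normalize_tr_symbol("TRBV10-3"))
```

A real standardizer also has to validate the result against a reference list of known genes and alleles, which these string rules do not attempt.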
Affiliation(s)
- Yuta Nagano
- Division of Medicine, Faculty of Medical Sciences, University College London (UCL), London, United Kingdom
| | - Benjamin Chain
- Division of Infection and Immunity, Faculty of Medical Sciences, University College London (UCL), London, United Kingdom
| |
|
14
|
Yu X, Ye J, Hathaway CA, Tworoger S, Lea J, Li B. Quantifiable TCR repertoire changes in pre-diagnostic blood specimens among high-grade ovarian cancer patients. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.10.12.562056. [PMID: 37905034 PMCID: PMC10614767 DOI: 10.1101/2023.10.12.562056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
High grade serous ovarian cancer (HGOC) is a major cause of death in women. Early detection of HGOC usually leads to a cure, yet it remains a clinical challenge, with over 90% of HGOCs diagnosed at advanced stages. This is mainly because conventional biomarkers are not sensitive enough to detect the microscopic yet metastatic early HGOC lesions. In this study, we sequenced the blood T cell receptor (TCR) repertoires of 466 ovarian cancer patients and controls, and systematically investigated the immune repertoire signatures in HGOCs. We observed quantifiable changes of selected TCRs in HGOCs that are reproducible in multiple independent cohorts. Importantly, these changes are stronger during stage I. Using pre-diagnostic patient blood samples from the Nurses' Health Study, we confirmed that HGOC signals can be detected in the blood TCR repertoire up to 4 years preceding conventional diagnosis. Our findings may provide the basis of an immune-based HGOC early detection criterion.
Affiliation(s)
- Xuexin Yu
- Department of Pathology and Laboratory Medicine, Children’s Hospital of Philadelphia
- Department of Pathology, Perelman School of Medicine, University of Pennsylvania
| | - Jianfeng Ye
- Department of Neuroscience, UT Southwestern Medical Center
| | | | | | - Jayanthi Lea
- Department of Gynecology, UT Southwestern Medical Center
| | - Bo Li
- Department of Pathology and Laboratory Medicine, Children’s Hospital of Philadelphia
- Department of Pathology, Perelman School of Medicine, University of Pennsylvania
| |
|
15
|
Liu Y, Liang Y, Li Q, Li Q. Comprehensive analysis of circulating cell-free RNAs in blood for diagnosing non-small cell lung cancer. Comput Struct Biotechnol J 2023; 21:4238-4251. [PMID: 37692082 PMCID: PMC10491804 DOI: 10.1016/j.csbj.2023.08.029] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/31/2023] [Revised: 08/26/2023] [Accepted: 08/27/2023] [Indexed: 09/12/2023] Open
Abstract
Early screening and detection of non-small cell lung cancer (NSCLC) is crucial due to the significantly low survival rate in advanced stages. Blood-based liquid biopsy is a non-invasive test to assist disease diagnosis, and cell-free RNA is one of the promising biomarkers in blood. However, the disease-related signatures have not been explored completely for most cell-free RNA transcriptome sequencing (cfRNA-Seq) datasets. To address this gap, we developed a comprehensive cfRNA-Seq pipeline for data analysis and constructed a machine learning model to facilitate noninvasive early diagnosis of NSCLC. Our study identified differential mRNAs, lncRNAs and miRNAs from cfRNA-Seq that are significantly associated with the development and progression of lung cancer. The classifier based on gene expression signatures achieved an area under the curve (AUC) of up to 0.9, indicating high specificity and sensitivity in both cross-validation and an independent test. Furthermore, the analysis of the T cell and B cell immune repertoire extracted from cfRNA-Seq has provided insights into the immune status of cancer patients, while the microbiome analysis has revealed distinct bacterial and viral profiles between NSCLC and normal samples. In our future work, we aim to validate the existence of cancer-associated T cell receptors (TCR)/B cell receptors (BCR) and microorganisms, and subsequently integrate all identified signatures into a diagnostic model to improve the prediction accuracy. This study not only provides a comprehensive analysis pipeline for cfRNA-Seq datasets but also highlights the potential of cfRNAs as promising biomarkers and models for early NSCLC diagnosis, emphasizing their importance in clinical settings.
Affiliation(s)
| | | | - Qiyan Li
- Department of Laboratory Medicine, The Eighth Affiliated Hospital, Sun Yat-Sen University, Shenzhen, China
| | - Qingjiao Li
- Department of Laboratory Medicine, The Eighth Affiliated Hospital, Sun Yat-Sen University, Shenzhen, China
| |
|
16
|
Liu S, Bradley P, Sun W. Neural Network Models for Sequence-Based TCR and HLA Association Prediction. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.05.25.542327. [PMID: 37293077 PMCID: PMC10245990 DOI: 10.1101/2023.05.25.542327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
T cells rely on their T cell receptors (TCRs) to recognize foreign antigens presented by human leukocyte antigen (HLA) proteins. TCRs contain a record of an individual's past immune activities, and some TCRs are observed only in individuals with certain HLA alleles. As a result, characterising TCRs requires a thorough understanding of TCR-HLA associations. To this end, we propose a neural network method named Deep learning Prediction of TCR-HLA association (DePTH) to predict TCR-HLA associations based on their amino acid sequences. We show that DePTH can be used to quantify the functional similarities of HLA alleles, and that these HLA similarities are associated with the survival outcomes of cancer patients who received immune checkpoint blockade treatment.
Affiliation(s)
- Si Liu
- Public Health Science Division, Fred Hutchinson Cancer Center, Seattle, USA
| | - Philip Bradley
- Public Health Science Division, Fred Hutchinson Cancer Center, Seattle, USA
- Herbold Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, USA
- Basic Sciences Division, Fred Hutchinson Cancer Center, Seattle, USA
| | - Wei Sun
- Public Health Science Division, Fred Hutchinson Cancer Center, Seattle, USA
- Basic Sciences Division, Fred Hutchinson Cancer Center, Seattle, USA
- Department of Biostatistics, University of Washington, Seattle, USA
- Department of Biostatistics, University of North Carolina, Chapel Hill, USA
| |
|
17
|
Luo Y, Deng X, Liao W, Huang Y, Lu C. Prognostic value of autophagy-related genes based on single-cell RNA-sequencing in colorectal cancer. Front Genet 2023; 14:1109683. [PMID: 37065476 PMCID: PMC10097963 DOI: 10.3389/fgene.2023.1109683] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2022] [Accepted: 03/21/2023] [Indexed: 03/31/2023] Open
Abstract
Background: Colorectal cancer (CRC) is the second most common cancer in China. Autophagy plays an important role in the initiation and development of CRC. Here, we assessed the prognostic value and potential functions of autophagy-related genes (ARGs) through an integrated analysis of single-cell RNA sequencing (scRNA-seq) data from the Gene Expression Omnibus (GEO) and RNA sequencing (RNA-seq) data from The Cancer Genome Atlas (TCGA). Methods: We analyzed scRNA-seq data from GEO using various single-cell techniques, including cell clustering and identification of differentially expressed genes (DEGs) in different cell types. Additionally, we performed gene set variation analysis (GSVA). The differentially expressed ARGs among different cell types and those between CRC and normal tissues were identified using TCGA RNA-seq data, and the hub ARGs were screened. Finally, a prognostic model based on the hub ARGs was constructed and validated, patients with CRC in the TCGA datasets were divided into high- and low-risk groups based on their risk score, and immune cell infiltration and drug sensitivity analyses between the two groups were performed. Results: We obtained single-cell expression profiles of 16,270 cells, and clustered them into seven types of cells. GSVA revealed that the DEGs among the seven types of cells were enriched in many signaling pathways associated with cancer development. We screened 55 differentially expressed ARGs, and identified 11 hub ARGs. Our prognostic model revealed that the 11 hub ARGs, including CTSB, ITGA6, and S100A8, had a good predictive ability. Moreover, the immune cell infiltrations in CRC tissues were different between the two groups, and the hub ARGs were significantly correlated with the enrichment of immune cell infiltration. The drug sensitivity analysis revealed that the patients in the two risk groups differed in their response to anti-cancer drugs. Conclusion: We developed a novel prognostic 11-hub ARG risk model, and these hubs may act as potential therapeutic targets for CRC.
Affiliation(s)
- Yuqi Luo
- Department of Gastrointestinal and Hepatobiliary Surgery, Shenzhen Longhua District Central Hospital, Shenzhen, Guangdong, China
- *Correspondence: Yuqi Luo,
| | - Xuesong Deng
- Department of Hepatobiliary Surgery, Shenzhen Second People’s Hospital, The First Affiliated Hospital of Shenzhen University, Shenzhen, Guangdong, China
| | - Weihua Liao
- Department of Radiology, Guangzhou Nansha District Maternal and Child Health Hospital, Guangzhou, Guangdong, China
| | - Yiwen Huang
- Department of Emergency, Nansha Hospital, Guangzhou First People’s Hospital, Guangzhou, Guangdong, China
| | - Caijie Lu
- Department of Gastrointestinal and Hepatobiliary Surgery, Shenzhen Longhua District Central Hospital, Shenzhen, Guangdong, China
| |
|
18
|
Jiang Y, Li SC. Deep autoregressive generative models capture the intrinsics embedded in T-cell receptor repertoires. Brief Bioinform 2023; 24:7031156. [PMID: 36752378 DOI: 10.1093/bib/bbad038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Revised: 01/07/2023] [Accepted: 01/18/2023] [Indexed: 02/09/2023] Open
Abstract
T-cell receptors (TCRs) play an essential role in the adaptive immune system. Probabilistic models for TCR repertoires can help decipher the underlying complex sequence patterns and provide novel insights into understanding the adaptive immune system. In this work, we develop TCRpeg, a deep autoregressive generative model to unravel the sequence patterns of TCR repertoires. TCRpeg largely outperforms state-of-the-art methods in estimating the probability distribution of a TCR repertoire, boosting the average accuracy from 0.672 to 0.906 measured by the Pearson correlation coefficient. Furthermore, with promising performance in probability inference, TCRpeg improves on a range of TCR-related tasks: profiling TCR repertoire probabilistically, classifying antigen-specific TCRs, validating previously discovered TCR motifs, generating novel TCRs and augmenting TCR data. Our results and analysis highlight the flexibility and capacity of TCRpeg to extract TCR sequence information, providing a novel approach for deciphering complex immunogenomic repertoires.
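An autoregressive model scores a sequence as a product of per-residue conditional probabilities, each conditioned on the residues before it. The sketch below shows that factorization with the simplest possible context, one preceding residue, and add-one smoothing. It is not TCRpeg, which is a deep model; the function names and the three training sequences are made up for illustration.

```python
import math
from collections import Counter

START, END = "^", "$"
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def fit_bigram(sequences):
    """Count residue-to-residue transitions, including start and end markers."""
    counts = Counter()
    for seq in sequences:
        padded = START + seq + END
        counts.update(zip(padded, padded[1:]))
    return counts


def log_prob(seq, counts, alphabet=AMINO_ACIDS):
    """log p(seq) = sum_i log p(x_i | x_{i-1}), with add-one smoothing."""
    padded = START + seq + END
    outcomes = list(alphabet) + [END]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        context_total = sum(counts[(prev, a)] for a in outcomes)
        total += math.log((counts[(prev, cur)] + 1) / (context_total + len(outcomes)))
    return total


# Made-up CDR3-like training sequences.
counts = fit_bigram(["CASSL", "CASSF", "CASRL"])
print(log_prob("CASSL", counts), log_prob("LSSAC", counts))
```

A sequence that follows the transitions seen in training receives a higher log-probability than a scrambled one. A deep autoregressive model replaces the count table with a neural network that conditions on the whole prefix.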
Affiliation(s)
- Yuepeng Jiang
- Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong
| | - Shuai Cheng Li
- Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong
| |
|
19
|
Frank ML, Lu K, Erdogan C, Han Y, Hu J, Wang T, Heymach JV, Zhang J, Reuben A. T-Cell Receptor Repertoire Sequencing in the Era of Cancer Immunotherapy. Clin Cancer Res 2023; 29:994-1008. [PMID: 36413126 PMCID: PMC10011887 DOI: 10.1158/1078-0432.ccr-22-2469] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 08/09/2022] [Revised: 10/07/2022] [Accepted: 11/14/2022] [Indexed: 11/23/2022]
Abstract
T cells are integral components of the adaptive immune system, and their responses are mediated by unique T-cell receptors (TCR) that recognize specific antigens from a variety of biological contexts. As a result, analyzing the T-cell repertoire offers a better understanding of immune responses and of diseases like cancer. Next-generation sequencing technologies have greatly enabled the high-throughput analysis of the TCR repertoire. On the basis of our extensive experience in the field from the past decade, we provide an overview of TCR sequencing, from the initial library preparation steps to sequencing and analysis methods and finally to functional validation techniques. With regard to data analysis, we detail important TCR repertoire metrics and present several computational tools for predicting antigen specificity. Finally, we highlight important applications of TCR sequencing and repertoire analysis to understanding tumor biology and developing cancer immunotherapies.
Affiliation(s)
- Meredith L. Frank
- Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas
- The University of Texas MD Anderson Cancer Center UT Health Houston Graduate School of Biomedical Sciences, Houston, Texas
| | - Kaylene Lu
- Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas
- The University of Texas MD Anderson Cancer Center UT Health Houston Graduate School of Biomedical Sciences, Houston, Texas
- Department of Cancer Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas
| | - Can Erdogan
- Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas
- Rice University, Houston, Texas
| | - Yi Han
- Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, University of Texas Southwestern Medical Center, Dallas, Texas
| | - Jian Hu
- The University of Texas MD Anderson Cancer Center UT Health Houston Graduate School of Biomedical Sciences, Houston, Texas
- Department of Cancer Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas
| | - Tao Wang
- Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, University of Texas Southwestern Medical Center, Dallas, Texas
- Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, Texas
| | - John V. Heymach
- Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas
- The University of Texas MD Anderson Cancer Center UT Health Houston Graduate School of Biomedical Sciences, Houston, Texas
| | - Jianjun Zhang
- Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas
- The University of Texas MD Anderson Cancer Center UT Health Houston Graduate School of Biomedical Sciences, Houston, Texas
- Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, Texas
| | - Alexandre Reuben
- Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas
- The University of Texas MD Anderson Cancer Center UT Health Houston Graduate School of Biomedical Sciences, Houston, Texas
| |
|
20
|
From patterns to patients: Advances in clinical machine learning for cancer diagnosis, prognosis, and treatment. Cell 2023; 186:1772-1791. [PMID: 36905928 DOI: 10.1016/j.cell.2023.01.035] [Citation(s) in RCA: 150] [Impact Index Per Article: 75.0] [Received: 11/04/2022] [Revised: 01/10/2023] [Accepted: 01/26/2023] [Indexed: 03/12/2023]
Abstract
Machine learning (ML) is increasingly used in clinical oncology to diagnose cancers, predict patient outcomes, and inform treatment planning. Here, we review recent applications of ML across the clinical oncology workflow. We review how these techniques are applied to medical imaging and to molecular data obtained from liquid and solid tumor biopsies for cancer diagnosis, prognosis, and treatment design. We discuss key considerations in developing ML for the distinct challenges posed by imaging and molecular data. Finally, we examine ML models approved for cancer-related patient usage by regulatory agencies and discuss approaches to improve the clinical usefulness of ML.
|
21
|
Akerman O, Isakov H, Levi R, Psevkin V, Louzoun Y. Counting is almost all you need. Front Immunol 2023; 13:1031011. [PMID: 36741395 PMCID: PMC9896581 DOI: 10.3389/fimmu.2022.1031011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2022] [Accepted: 12/27/2022] [Indexed: 01/21/2023] Open
Abstract
The immune memory repertoire encodes the history of present and past infections and immunological attributes of the individual. As such, multiple methods were proposed to use T-cell receptor (TCR) repertoires to detect disease history. We here show that the counting method outperforms two leading algorithms. We then show that the counting can be further improved using a novel attention model to weigh the different TCRs. The attention model is based on the projection of TCRs using a Variational AutoEncoder (VAE). Both the counting and the attention algorithms outperform current leading algorithms in predicting whether the host had CMV and which HLA alleles the host carries. As an intermediate solution between the complex attention model and the very simple counting model, we propose a new Graph Convolutional Network approach that obtains the accuracy of the attention model and the simplicity of the counting model. The code for the models used in the paper is provided at: https://github.com/louzounlab/CountingIsAlmostAllYouNeed.
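The abstract does not spell out the counting method, so the following is a sketch of one plausible reading rather than the authors' implementation (their code is at the URL above). It assumes that a repertoire is scored by how many class-associated TCR sequences it contains, and that a sequence counts as associated if it occurs in at least `min_pos` positive training repertoires and in no negative one. The function names, the threshold, and the sequences are all hypothetical.

```python
from collections import Counter


def select_associated(repertoires, labels, min_pos=2):
    """Pick sequences seen in >= min_pos positive repertoires and in no negative one."""
    pos_counts, neg_counts = Counter(), Counter()
    for rep, y in zip(repertoires, labels):
        # set(rep) so each repertoire contributes a sequence at most once.
        (pos_counts if y == 1 else neg_counts).update(set(rep))
    return {s for s, c in pos_counts.items() if c >= min_pos and neg_counts[s] == 0}


def count_score(repertoire, associated):
    """Number of distinct associated sequences present in the repertoire."""
    return len(set(repertoire) & associated)


# Made-up training repertoires (1 = positive class, 0 = negative class).
train = [
    ["CASSLG", "CASSPG", "CASRDT"],
    ["CASSLG", "CASSPG", "CASSYE"],
    ["CASRDT", "CASSQE"],
    ["CASSYE"],
]
labels = [1, 1, 0, 0]
associated = select_associated(train, labels)
print(sorted(associated), count_score(["CASSLG", "CASSQE"], associated))
```

A classifier then thresholds the score, or feeds it to a simple model together with the repertoire size.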
Affiliation(s)
- Ofek Akerman
- Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
- Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel
| | - Haim Isakov
- Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
| | - Reut Levi
- Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
| | - Vladimir Psevkin
- Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
| | - Yoram Louzoun
- Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
| |
|
22
|
Kanduri C, Scheffer L, Pavlović M, Rand KD, Chernigovskaya M, Pirvandy O, Yaari G, Greiff V, Sandve GK. simAIRR: simulation of adaptive immune repertoires with realistic receptor sequence sharing for benchmarking of immune state prediction methods. Gigascience 2022; 12:giad074. [PMID: 37848619 PMCID: PMC10580376 DOI: 10.1093/gigascience/giad074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Revised: 07/20/2023] [Accepted: 08/29/2023] [Indexed: 10/19/2023] Open
Abstract
BACKGROUND Machine learning (ML) has gained significant attention for classifying immune states in adaptive immune receptor repertoires (AIRRs) to support the advancement of immunodiagnostics and therapeutics. Simulated data are crucial for the rigorous benchmarking of AIRR-ML methods. Existing approaches to generating synthetic benchmarking datasets result in the generation of naive repertoires missing the key feature of many shared receptor sequences (selected for common antigens) found in antigen-experienced repertoires. RESULTS We demonstrate that a common approach to generating simulated AIRR benchmark datasets can introduce biases, which may be exploited for undesired shortcut learning by certain ML methods. To mitigate undesirable access to true signals in simulated AIRR datasets, we devised a simulation strategy (simAIRR) that constructs antigen-experienced-like repertoires with a realistic overlap of receptor sequences. simAIRR can be used for constructing AIRR-level benchmarks based on a range of assumptions (or experimental data sources) for what constitutes receptor-level immune signals. This includes the possibility of making or not making any prior assumptions regarding the similarity or commonality of immune state-associated sequences that will be used as true signals. We demonstrate the real-world realism of our proposed simulation approach by showing that basic ML strategies perform similarly on simAIRR-generated and real-world experimental AIRR datasets. CONCLUSIONS This study sheds light on the potential shortcut learning opportunities for ML methods that can arise with the state-of-the-art way of simulating AIRR datasets. simAIRR is available as a Python package: https://github.com/KanduriC/simAIRR.
Affiliation(s)
- Chakravarthi Kanduri
- Centre for Bioinformatics, Department of Informatics, University of Oslo, 0373 Oslo, Norway
- UiORealArt Convergence Environment, University of Oslo, 0373 Oslo, Norway
| | - Lonneke Scheffer
- Centre for Bioinformatics, Department of Informatics, University of Oslo, 0373 Oslo, Norway
| | - Milena Pavlović
- Centre for Bioinformatics, Department of Informatics, University of Oslo, 0373 Oslo, Norway
- UiORealArt Convergence Environment, University of Oslo, 0373 Oslo, Norway
| | - Knut Dagestad Rand
- Centre for Bioinformatics, Department of Informatics, University of Oslo, 0373 Oslo, Norway
| | - Maria Chernigovskaya
- Department of Immunology and Oslo University Hospital, University of Oslo, 0373 Oslo, Norway
| | - Oz Pirvandy
- Faculty of Engineering, Bar-Ilan University, 5290002, Israel
| | - Gur Yaari
- Faculty of Engineering, Bar-Ilan University, 5290002, Israel
| | - Victor Greiff
- Department of Immunology and Oslo University Hospital, University of Oslo, 0373 Oslo, Norway
| | - Geir K Sandve
- Centre for Bioinformatics, Department of Informatics, University of Oslo, 0373 Oslo, Norway
- UiORealArt Convergence Environment, University of Oslo, 0373 Oslo, Norway
| |
|
23
|
Li R, Ferdinand JR, Loudon KW, Bowyer GS, Laidlaw S, Muyas F, Mamanova L, Neves JB, Bolt L, Fasouli ES, Lawson ARJ, Young MD, Hooks Y, Oliver TRW, Butler TM, Armitage JN, Aho T, Riddick ACP, Gnanapragasam V, Welsh SJ, Meyer KB, Warren AY, Tran MGB, Stewart GD, Cortés-Ciriano I, Behjati S, Clatworthy MR, Campbell PJ, Teichmann SA, Mitchell TJ. Mapping single-cell transcriptomes in the intra-tumoral and associated territories of kidney cancer. Cancer Cell 2022; 40:1583-1599.e10. [PMID: 36423636 PMCID: PMC9767677 DOI: 10.1016/j.ccell.2022.11.001] [Citation(s) in RCA: 56] [Impact Index Per Article: 18.7] [Received: 05/20/2022] [Revised: 08/12/2022] [Accepted: 11/04/2022] [Indexed: 11/24/2022]
Abstract
Tumor behavior is intricately dependent on the oncogenic properties of cancer cells and their multi-cellular interactions. To understand these dependencies within the wider microenvironment, we studied over 270,000 single-cell transcriptomes and 100 microdissected whole exomes from 12 patients with kidney tumors, prior to validation using spatial transcriptomics. Tissues were sampled from multiple regions of the tumor core, the tumor-normal interface, normal surrounding tissues, and peripheral blood. We find that the tissue-type location of CD8+ T cell clonotypes largely defines their exhaustion state with intra-tumoral spatial heterogeneity that is not well explained by somatic heterogeneity. De novo mutation calling from single-cell RNA-sequencing data allows us to broadly infer the clonality of stromal cells and lineage-trace myeloid cell development. We report six conserved meta-programs that distinguish tumor cell function, and find an epithelial-mesenchymal transition meta-program highly enriched at the tumor-normal interface that co-localizes with IL1B-expressing macrophages, offering a potential therapeutic target.
Affiliation(s)
- Ruoyan Li
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - John R Ferdinand
- Molecular Immunity Unit, Department of Medicine, University of Cambridge, Cambridge CB2 0QQ, UK
| | - Kevin W Loudon
- Molecular Immunity Unit, Department of Medicine, University of Cambridge, Cambridge CB2 0QQ, UK; Cambridge University Hospitals NHS Foundation Trust and NIHR Cambridge Biomedical Research Centre, Cambridge CB2 0QQ, UK
| | - Georgina S Bowyer
- Molecular Immunity Unit, Department of Medicine, University of Cambridge, Cambridge CB2 0QQ, UK
| | - Sean Laidlaw
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Francesc Muyas
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Lira Mamanova
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Joana B Neves
- UCL Division of Surgery and Interventional Science, Royal Free Hospital, London NW3 2PS, UK; Specialist Centre for Kidney Cancer, Royal Free Hospital, London NW3 2PS, UK
| | - Liam Bolt
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Eirini S Fasouli
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Andrew R J Lawson
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Matthew D Young
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Yvette Hooks
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Thomas R W Oliver
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK; Cambridge University Hospitals NHS Foundation Trust and NIHR Cambridge Biomedical Research Centre, Cambridge CB2 0QQ, UK
| | - Timothy M Butler
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - James N Armitage
- Cambridge University Hospitals NHS Foundation Trust and NIHR Cambridge Biomedical Research Centre, Cambridge CB2 0QQ, UK
| | - Tev Aho
- Cambridge University Hospitals NHS Foundation Trust and NIHR Cambridge Biomedical Research Centre, Cambridge CB2 0QQ, UK
| | - Antony C P Riddick
- Cambridge University Hospitals NHS Foundation Trust and NIHR Cambridge Biomedical Research Centre, Cambridge CB2 0QQ, UK
| | - Vincent Gnanapragasam
- Cambridge University Hospitals NHS Foundation Trust and NIHR Cambridge Biomedical Research Centre, Cambridge CB2 0QQ, UK; Department of Surgery, University of Cambridge, Cambridge CB2 0QQ, UK
| | - Sarah J Welsh
- Cambridge University Hospitals NHS Foundation Trust and NIHR Cambridge Biomedical Research Centre, Cambridge CB2 0QQ, UK
| | - Kerstin B Meyer
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Anne Y Warren
- Cambridge University Hospitals NHS Foundation Trust and NIHR Cambridge Biomedical Research Centre, Cambridge CB2 0QQ, UK
| | - Maxine G B Tran
- UCL Division of Surgery and Interventional Science, Royal Free Hospital, London NW3 2PS, UK; Specialist Centre for Kidney Cancer, Royal Free Hospital, London NW3 2PS, UK
| | - Grant D Stewart
- Cambridge University Hospitals NHS Foundation Trust and NIHR Cambridge Biomedical Research Centre, Cambridge CB2 0QQ, UK; Department of Surgery, University of Cambridge, Cambridge CB2 0QQ, UK
| | - Isidro Cortés-Ciriano
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Sam Behjati
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK; Cambridge University Hospitals NHS Foundation Trust and NIHR Cambridge Biomedical Research Centre, Cambridge CB2 0QQ, UK
| | - Menna R Clatworthy
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK; Molecular Immunity Unit, Department of Medicine, University of Cambridge, Cambridge CB2 0QQ, UK; Cambridge University Hospitals NHS Foundation Trust and NIHR Cambridge Biomedical Research Centre, Cambridge CB2 0QQ, UK
| | - Peter J Campbell
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Sarah A Teichmann
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK; Department of Physics, Cavendish Laboratory, JJ Thomson Avenue, Cambridge CB3 0HE, UK.
| | - Thomas J Mitchell
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK; Cambridge University Hospitals NHS Foundation Trust and NIHR Cambridge Biomedical Research Centre, Cambridge CB2 0QQ, UK; Department of Surgery, University of Cambridge, Cambridge CB2 0QQ, UK.
| |
|
24
|
Multiple instance neural networks based on sparse attention for cancer detection using T-cell receptor sequences. BMC Bioinformatics 2022; 23:469. [PMID: 36348271 PMCID: PMC9644450 DOI: 10.1186/s12859-022-05012-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/18/2022] [Accepted: 10/26/2022] [Indexed: 11/11/2022] Open
Abstract
Early detection of cancers has been much explored due to its paramount importance in biomedical fields. Among the different types of data used to address this problem, studies based on T cell receptors (TCRs) have recently come into the spotlight due to the growing appreciation of the role of the host immune system in tumor biology. However, the one-to-many correspondence between a patient and multiple TCR sequences prevents researchers from simply adopting classical statistical or machine learning methods. Recent work has modeled this type of data in the context of multiple instance learning (MIL). Despite the novel application of MIL to cancer detection using TCR sequences and its adequate performance in several tumor types, there is still room for improvement, especially for certain cancer types. Furthermore, explainable neural network models have not been fully investigated for this application. In this article, we propose multiple instance neural networks based on sparse attention (MINN-SA) to enhance both performance and explainability in cancer detection. The sparse attention structure drops out uninformative instances in each bag, achieving both interpretability and better predictive performance in combination with the skip connection. Our experiments show that MINN-SA yields the highest average area under the ROC curve across 10 different types of cancer, compared to existing MIL approaches. Moreover, we observe from the estimated attention weights that MINN-SA can identify the TCRs that are specific for tumor antigens in the same T cell repertoire.
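The idea of sparse attention dropping uninformative instances from a bag can be illustrated with a minimal sketch. It assumes sparsemax as the sparsifying transform and a single linear scoring vector `w`; neither is confirmed by the abstract, the function names are hypothetical, and the skip connection mentioned above is not modeled. It is an illustration of the general technique, not the MINN-SA implementation.

```python
import numpy as np

def sparsemax(scores):
    """Euclidean projection of a 1-D score vector onto the probability simplex.

    Unlike softmax, the result can contain exact zeros, so low-scoring
    instances receive no attention at all.
    """
    z = np.asarray(scores, dtype=float)
    z_sorted = np.sort(z)[::-1]              # scores in descending order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1.0 + k * z_sorted > cumsum    # True for a prefix of the sorted scores
    k_max = k[support][-1]                   # size of the support set
    tau = (cumsum[k_max - 1] - 1.0) / k_max  # threshold subtracted from every score
    return np.maximum(z - tau, 0.0)

def sparse_attention_pool(instances, w):
    """Pool a bag of instance embeddings into one bag-level vector.

    instances: (n_instances, dim) array, e.g. one row per embedded TCR sequence.
    w: (dim,) scoring vector (hypothetical; a trained model would learn it).
    """
    scores = instances @ w
    attention = sparsemax(scores)            # sparse weights that sum to 1
    bag_vector = attention @ instances       # attention-weighted average
    return bag_vector, attention

# Toy bag with three instances; the third scores low and is dropped.
bag = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
w = np.array([0.5, 0.4])
bag_vector, attention = sparse_attention_pool(bag, w)
```

Instances whose attention weight is exactly zero contribute nothing to the bag representation, which is what makes the non-zero weights readable as a shortlist of candidate informative sequences.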
|
25
|
Jiang P, Sinha S, Aldape K, Hannenhalli S, Sahinalp C, Ruppin E. Big data in basic and translational cancer research. Nat Rev Cancer 2022; 22:625-639. [PMID: 36064595 PMCID: PMC9443637 DOI: 10.1038/s41568-022-00502-0] [Citation(s) in RCA: 92] [Impact Index Per Article: 30.7] [Accepted: 07/26/2022] [Indexed: 02/07/2023]
Abstract
Historically, the primary focus of cancer research has been molecular and clinical studies of a few essential pathways and genes. Recent years have seen the rapid accumulation of large-scale cancer omics data catalysed by breakthroughs in high-throughput technologies. This fast data growth has given rise to an evolving concept of 'big data' in cancer, whose analysis demands large computational resources and can potentially bring novel insights into essential questions. Indeed, the combination of big data, bioinformatics and artificial intelligence has led to notable advances in our basic understanding of cancer biology and to translational advancements. Further advances will require a concerted effort among data scientists, clinicians, biologists and policymakers. Here, we review the current state of the art and future challenges for harnessing big data to advance cancer research and treatment.
Affiliation(s)
- Peng Jiang
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
| | - Sanju Sinha
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Kenneth Aldape
- Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Sridhar Hannenhalli
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Cenk Sahinalp
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Eytan Ruppin
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
| |
|
26
|
Yu X, Lin W, Spirtos A, Wang Y, Chen H, Ye J, Parker J, Liu CC, Wang Y, Quinn G, Zhou F, Chambers SK, Lewis C, Lea J, Li B, Zheng W. Dissection of transcriptome dysregulation and immune characterization in women with germline BRCA1 mutation at single-cell resolution. BMC Med 2022; 20:283. [PMID: 36076202 PMCID: PMC9461201 DOI: 10.1186/s12916-022-02489-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/01/2022] [Accepted: 07/19/2022] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND: High-grade serous carcinoma (HGSC) is the most frequent and lethal type of ovarian cancer. It has been proposed that tubal secretory cells are the origin of ovarian HGSC in women with familial BRCA1/2 mutations. However, the molecular changes underlying malignant transformation remain unknown.
METHOD: We performed single-cell RNA and T cell receptor sequencing of tubal fimbriated ends from 3 BRCA1 germline mutation carriers (BRCA1 carriers) and 3 normal controls with no high-risk history (non-BRCA1 carriers).
RESULTS: Exploring the transcriptomes of 19,008 cells, predominantly from BRCA1+ samples, we identified 5 major cell populations in the fallopian tubal mucosae. The secretory cells of BRCA1+ samples had differentially expressed genes involved in tumor growth and regulation, chemokine signaling, and antigen presentation compared to the wild-type BRCA1 controls. There are several novel findings in this study. First, a subset of the fallopian tubal secretory cells from one BRCA1 carrier exhibited an epithelial-to-mesenchymal transition (EMT) phenotype, which was also present in the mucosal fibroblasts. Second, we identified a previously unreported phenotypic split of the EMT secretory cells with distinct evolutionary endpoints. Third, we observed increased clonal expansion among the CD8+ T cell population from BRCA1+ carriers. Among those clonally expanded CD8+ T cells, PD-1 was significantly increased in tubal mucosae of BRCA1+ patients compared with that of normal controls, indicating that T cell exhaustion may occur before the development of any premalignant or malignant lesions.
CONCLUSION: These results indicate that EMT and immune evasion in normal-looking tubal mucosae may represent early events leading to the development of HGSC in women with BRCA1 germline mutation. Our findings provide a probable molecular mechanism explaining why some, but not all, women with BRCA1 germline mutation present with early development and rapid dissemination of HGSC.
Affiliation(s)
- Xuexin Yu
- Lyda Hill Department of Bioinformatics, UT Southwestern Medical Center, Dallas, TX, USA
| | - Wanrun Lin
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Alexandra Spirtos
- Department of Obstetrics and Gynecology, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Yan Wang
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Hao Chen
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Jianfeng Ye
- Lyda Hill Department of Bioinformatics, UT Southwestern Medical Center, Dallas, TX, USA
| | - Jessica Parker
- Department of Obstetrics and Gynecology, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Present address: Department of Obstetrics and Gynecology, Indiana University, Indianapolis, IN, USA
| | - Ci Ci Liu
- Department of Obstetrics and Gynecology, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Present address: Department of Obstetrics and Gynecology, University of Rochester Medical Center, Rochester, NY, USA
| | - Yiying Wang
- Department of Obstetrics and Gynecology, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Gabriella Quinn
- Lyda Hill Department of Bioinformatics, UT Southwestern Medical Center, Dallas, TX, USA
| | - Feng Zhou
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Present address: Department of Pathology, Women's Hospital, School of Medicine, Zhejiang University, Hangzhou, China
| | - Setsuko K Chambers
- Department of Obstetrics and Gynecology, The University of Arizona Cancer Center, University of Arizona, Tucson, AZ, USA
| | - Cheryl Lewis
- Harold C. Simmons Comprehensive Cancer Center, UT Southwestern Medical Center, Dallas, TX, USA
| | - Jayanthi Lea
- Department of Obstetrics and Gynecology, University of Texas Southwestern Medical Center, Dallas, TX, USA.
- Present address: Department of Obstetrics and Gynecology, University of Rochester Medical Center, Rochester, NY, USA.
| | - Bo Li
- Lyda Hill Department of Bioinformatics, UT Southwestern Medical Center, Dallas, TX, USA.
- Harold C. Simmons Comprehensive Cancer Center, UT Southwestern Medical Center, Dallas, TX, USA.
- Department of Immunology, UT Southwestern Medical Center, Dallas, TX, USA.
| | - Wenxin Zheng
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX, USA.
- Department of Obstetrics and Gynecology, University of Texas Southwestern Medical Center, Dallas, TX, USA.
- Harold C. Simmons Comprehensive Cancer Center, UT Southwestern Medical Center, Dallas, TX, USA.
| |
|
27
|
Ji F, Chen L, Chen Z, Luo B, Wang Y, Lan X. TCR repertoire and transcriptional signatures of circulating tumour-associated T cells facilitate effective non-invasive cancer detection. Clin Transl Med 2022; 12:e853. [PMID: 36134717 PMCID: PMC9494610 DOI: 10.1002/ctm2.853] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/18/2022] [Revised: 04/11/2022] [Accepted: 04/15/2022] [Indexed: 11/10/2022] Open
Affiliation(s)
- Fansen Ji
- Tsinghua-Peking Center for Life Sciences, MOE Key Laboratory of Tsinghua University, Beijing, China; School of Medicine, Tsinghua University, Beijing, China
| | - Lin Chen
- School of Medicine, Tsinghua University, Beijing, China; General Surgery Department, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
| | - Zhizhuo Chen
- School of Life Science, Tsinghua University, Beijing, China
| | - Bin Luo
- General Surgery Department, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
| | - Yongwang Wang
- Department of Anesthesiology, Affiliated Hospital of Guilin Medical University, Guilin, China
| | - Xun Lan
- Tsinghua-Peking Center for Life Sciences, MOE Key Laboratory of Tsinghua University, Beijing, China; School of Medicine, Tsinghua University, Beijing, China
| |
|
28
|
Liang J, Li ZW, Yue CT, Hu Z, Cheng H, Liu ZX, Guo WF. Multi-modal optimization to identify personalized biomarkers for disease prediction of individual patients with cancer. Brief Bioinform 2022; 23:6647504. [PMID: 35858208 DOI: 10.1093/bib/bbac254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2022] [Revised: 05/16/2022] [Accepted: 05/31/2022] [Indexed: 11/14/2022] Open
Abstract
Finding personalized biomarkers for disease prediction of patients with cancer remains a massive challenge in precision medicine. Most methods focus on one subnetwork or module as a network biomarker; however, this ignores the early warning capabilities of other modules with different configurations of biomarkers (i.e. multi-modal personalized biomarkers). Identifying such modules would not only predict disease but also provide effective therapeutic drug target information for individual patients. To solve this problem, we developed a novel model (denoted multi-modal personalized dynamic network biomarkers (MMPDNB)) based on a multi-modal optimization mechanism and personalized dynamic network biomarker (PDNB) theory, which can provide multiple modules of personalized biomarkers and unveil their multi-modal properties. Using the genomics data of patients with breast or lung cancer from The Cancer Genome Atlas database, we validated the effectiveness of the MMPDNB model. The experimental results showed that compared with other advanced methods, MMPDNB can more effectively predict the critical state with the highest early warning signal score during cancer development. Furthermore, MMPDNB more significantly identified PDNBs containing driver and biomarker genes specific to cancer tissues. More importantly, we validated the biological significance of multi-modal PDNBs, which could provide effective drug targets of individual patients as well as markers for predicting early warning signals of the critical disease state. In conclusion, multi-modal optimization is an effective method to identify PDNBs and offers a new perspective for understanding tumor heterogeneity in cancer precision medicine.
Affiliation(s)
- Jing Liang
- School of Electrical Engineering, Zhengzhou University, Zhengzhou 450001, China
| | - Zong-Wei Li
- School of Electrical Engineering, Zhengzhou University, Zhengzhou 450001, China
| | - Cai-Tong Yue
- School of Electrical Engineering, Zhengzhou University, Zhengzhou 450001, China
| | - Zhuo Hu
- School of Electrical Engineering, Zhengzhou University, Zhengzhou 450001, China
| | - Han Cheng
- School of Life Sciences, Zhengzhou University, Zhengzhou 450001, China
| | - Ze-Xian Liu
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Wei-Feng Guo
- School of Electrical Engineering, Zhengzhou University, Zhengzhou 450001, China; State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| |
|
29
|
Song L, Ouyang Z, Cohen D, Cao Y, Altreuter J, Bai G, Hu X, Livak KJ, Li H, Tang M, Li B, Shirley Liu X. Comprehensive Characterizations of Immune Receptor Repertoire in Tumors and Cancer Immunotherapy Studies. Cancer Immunol Res 2022; 10:788-799. [PMID: 35605261 PMCID: PMC9299271 DOI: 10.1158/2326-6066.cir-21-0965] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 11/11/2021] [Revised: 03/17/2022] [Accepted: 05/20/2022] [Indexed: 01/03/2023]
Abstract
We applied our computational algorithm TRUST4 to assemble immune receptor (T-cell receptor/B-cell receptor) repertoires from approximately 12,000 RNA sequencing samples from The Cancer Genome Atlas and seven immunotherapy studies. From over 35 million assembled complete complementarity-determining region 3 sequences, we observed that CCL5 and MZB1 are the genes whose expression correlates most positively with T-cell clonal expansion and B-cell clonal expansion, respectively. We analyzed amino acid evolution during B-cell receptor somatic hypermutation and identified tyrosine as the preferred residue. We found that IgG1+IgG3 antibodies together with FcRn were associated with complement-dependent cytotoxicity and antibody-dependent cellular cytotoxicity or phagocytosis. In addition to B-cell infiltration, we discovered that B-cell clonal expansion and IgG1+IgG3 antibodies are also correlated with better patient outcomes. Finally, we created a website, VisualizIRR, for users to interactively explore and visualize the immune repertoires in this study. See related Spotlight by Liu and Han, p. 786.
Affiliation(s)
- Li Song
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
- Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Zhangyi Ouyang
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Experimental Hematology and Biochemistry, Beijing Institute of Radiation Medicine, Beijing, China
| | - David Cohen
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Yang Cao
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
- College of Life Sciences, Sichuan University, Chengdu, Sichuan, China
| | - Jennifer Altreuter
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Gali Bai
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Xihao Hu
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
- Current affiliation: GV20 Therapeutics, Cambridge, MA, USA
| | - Kenneth J. Livak
- Department of Medical, Dana-Farber Cancer Institute, Boston, MA, USA
- Translational Immunogenomics Lab, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Heng Li
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Ming Tang
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Bo Li
- Lyda Hill Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - X. Shirley Liu
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
- Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
- Current affiliation: GV20 Therapeutics, Cambridge, MA, USA
| |
|
30
|
Chen Y, Ye Z, Zhang Y, Xie W, Chen Q, Lan C, Yang X, Zeng H, Zhu Y, Ma C, Tang H, Wang Q, Guan J, Chen S, Li F, Yang W, Yan H, Yu X, Zhang Z. A Deep Learning Model for Accurate Diagnosis of Infection Using Antibody Repertoires. J Immunol 2022; 208:2675-2685. [PMID: 35606050 DOI: 10.4049/jimmunol.2200063] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/20/2022] [Accepted: 04/11/2022] [Indexed: 06/15/2023]
Abstract
The adaptive immune receptor repertoire consists of the entire set of an individual's BCRs and TCRs and is believed to contain a record of prior immune responses and the potential for future immunity. Analyses of TCR repertoires via deep learning (DL) methods have successfully diagnosed cancers and infectious diseases, including coronavirus disease 2019. However, few studies have used DL to analyze BCR repertoires. In this study, we collected IgG H chain Ab repertoires from 276 healthy control subjects and 326 patients with various infections. We then extracted a comprehensive feature set consisting of 10 subsets of repertoire-level features and 160 sequence-level features and tested whether these features can distinguish between infected individuals and healthy control subjects. Finally, we developed an ensemble DL model, namely, DL method for infection diagnosis (https://github.com/chenyuan0510/DeepID), and used this model to differentiate between the infected and healthy individuals. Four subsets of repertoire-level features and four sequence-level features were selected because of their excellent predictive performance. The DL method for infection diagnosis outperformed traditional machine learning methods in distinguishing between healthy and infected samples (area under the curve = 0.9883) and achieved a multiclassification accuracy of 0.9104. We also observed differences between the healthy and infected groups in V gene usage, clonal expansion, the complexity of reads within clones, the physical properties in the α region, and the local flexibility of the CDR3 amino acid sequence. Our results suggest that the Ab repertoire is a promising biomarker for the diagnosis of various infections.
Affiliation(s)
- Yuan Chen
- Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Zhiming Ye
- Guangdong-Hong Kong Joint Laboratory on Immunological and Genetic Kidney Diseases, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Division of Nephrology, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Yanfang Zhang
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
| | - Wenxi Xie
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
| | - Qingyun Chen
- Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Guangdong-Hong Kong Joint Laboratory on Immunological and Genetic Kidney Diseases, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Chunhong Lan
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
| | - Xiujia Yang
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
| | - Huikun Zeng
- Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Yan Zhu
- Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Cuiyu Ma
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
| | - Haipei Tang
- Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Qilong Wang
- Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Junjie Guan
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
| | - Sen Chen
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
| | - Fenxiang Li
- Department of Infectious Disease Control and Prevention, Center for Disease Control and Prevention of Southern Theatre Command, Guangzhou, China
| | - Wei Yang
- Department of Pathology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
| | - Huacheng Yan
- Department of Infectious Disease Control and Prevention, Center for Disease Control and Prevention of Southern Theatre Command, Guangzhou, China
| | - Xueqing Yu
- Guangdong-Hong Kong Joint Laboratory on Immunological and Genetic Kidney Diseases, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Division of Nephrology, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Zhenhai Zhang
- Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- State Key Laboratory of Organ Failure Research, Division of Nephrology, Southern Medical University, Guangzhou, China
- Key Laboratory of Mental Health of the Ministry of Education, Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-Inspired Intelligence, Southern Medical University, Guangzhou, China
| |
|
31
|
Glazer N, Akerman O, Louzoun Y. Naive and memory T cells TCR-HLA-binding prediction. Oxf Open Immunol 2022; 3:iqac001. [PMID: 36846560 PMCID: PMC9914496 DOI: 10.1093/oxfimm/iqac001] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/05/2022] [Revised: 05/01/2022] [Accepted: 05/17/2022] [Indexed: 11/12/2022] Open
Abstract
T cells recognize antigens through the interaction of their T cell receptor (TCR) with a peptide-major histocompatibility complex (pMHC) molecule. Following thymic positive selection, TCRs in peripheral naive T cells are expected to bind MHC alleles of the host. Peripheral clonal selection is expected to further increase the frequency of antigen-specific TCRs that bind to the host MHC alleles. To check for a systematic preference for MHC-binding T cells in TCR repertoires, we developed Natural Language Processing-based methods to predict TCR-MHC binding independently of the peptide presented for Class I MHC alleles. We trained a classifier on published TCR-pMHC binding pairs and obtained a high area under the curve (AUC) of over 0.90 on the test set. However, when applied to TCR repertoires, the accuracy of the classifier dropped. We thus developed a two-stage prediction model, based on large-scale naive and memory TCR repertoires, denoted TCR HLA-binding predictor (CLAIRE). Since each host carries multiple human leukocyte antigen (HLA) alleles, we first computed whether a TCR on a CD8 T cell binds an MHC from any of the host Class I HLA alleles. We then performed an iteration, in which we predicted the binding with the most probable allele from the first round. We show that this classifier is more precise for memory than for naive cells. Moreover, it can be transferred between datasets. Finally, we developed a CD4-CD8 T cell classifier to apply CLAIRE to unsorted bulk sequencing datasets and showed high AUCs of 0.96 and 0.90 on large datasets. CLAIRE is available on GitHub at https://github.com/louzounlab/CLAIRE and as a server at https://claire.math.biu.ac.il/Home.
Affiliation(s)
- Neta Glazer
- Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
| | - Ofek Akerman
- Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
| | - Yoram Louzoun
- Correspondence address. Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel. E-mail:
| |
|
32
|
Kanduri C, Pavlović M, Scheffer L, Motwani K, Chernigovskaya M, Greiff V, Sandve GK. Profiling the baseline performance and limits of machine learning models for adaptive immune receptor repertoire classification. Gigascience 2022; 11:giac046. [PMID: 35639633 PMCID: PMC9154052 DOI: 10.1093/gigascience/giac046] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/17/2021] [Revised: 12/23/2021] [Accepted: 04/08/2022] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND: Machine learning (ML) methodology development for the classification of immune states in adaptive immune receptor repertoires (AIRRs) has seen a recent surge of interest. However, so far, there does not exist a systematic evaluation of scenarios where classical ML methods (such as penalized logistic regression) already perform adequately for AIRR classification. This hinders investigative reorientation to those scenarios where method development of more sophisticated ML approaches may be required. RESULTS: To identify those scenarios where a baseline ML method is able to perform well for AIRR classification, we generated a collection of synthetic AIRR benchmark data sets encompassing a wide range of data set architecture-associated and immune state-associated sequence patterns (signal) complexity. We trained ≈1,700 ML models with varying assumptions regarding immune signal on ≈1,000 data sets with a total of ≈250,000 AIRRs containing ≈46 billion TCRβ CDR3 amino acid sequences, thereby surpassing the sample sizes of current state-of-the-art AIRR-ML setups by two orders of magnitude. We found that L1-penalized logistic regression achieved high prediction accuracy even when the immune signal occurs only in 1 out of 50,000 AIR sequences. CONCLUSIONS: We provide a reference benchmark to guide new AIRR-ML classification methodology by (i) identifying those scenarios characterized by immune signal and data set complexity, where baseline methods already achieve high prediction accuracy, and (ii) facilitating realistic expectations of the performance of AIRR-ML models given training data set properties and assumptions. Our study serves as a template for defining specialized AIRR benchmark data sets for comprehensive benchmarking of AIRR-ML methods.
Affiliation(s)
- Chakravarthi Kanduri
- Centre for Bioinformatics, Department of Informatics, University of Oslo, Oslo 0373, Norway
| | - Milena Pavlović
- Centre for Bioinformatics, Department of Informatics, University of Oslo, Oslo 0373, Norway
| | - Lonneke Scheffer
- Centre for Bioinformatics, Department of Informatics, University of Oslo, Oslo 0373, Norway
| | - Keshav Motwani
- Department of Pathology, Immunology and Laboratory Medicine, University of Florida, FL 32610, USA
| | - Maria Chernigovskaya
- Department of Immunology and Oslo University Hospital, University of Oslo, Oslo, 0372, Norway
| | - Victor Greiff
- Department of Immunology and Oslo University Hospital, University of Oslo, Oslo, 0372, Norway
| | - Geir K Sandve
- Centre for Bioinformatics, Department of Informatics, University of Oslo, Oslo 0373, Norway
| |
|
33
|
Abdollahi S, Lin PC, Chiang JH. DiaDeL: An Accurate Deep Learning-Based Model With Mutational Signatures for Predicting Metastasis Stage and Cancer Types. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1336-1343. [PMID: 34570707 DOI: 10.1109/tcbb.2021.3115504] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
Mutational signatures help identify cancer-associated genes that are involved in tumorigenesis pathways. Hence, these pathways guide precision medicine approaches to find appropriate drugs and treatments. The pattern of mutations varies in different cancer types. Some mutations dysregulate protein function so that their accumulation is responsible for cancer development and might be associated with different cancer types. Therefore, mutations as a feature set can be used as an informative candidate to distinguish various cancer types. There are several options for representing mutations. One might employ binary values to mark mutated regions. Another potential method for extracting features is utilizing mutation interpreters. In this study, we investigate the trinucleotide mutational pattern of each cancer type. Moreover, we extract salient NMF-based mutational signatures across various cancer types. Then, we identify cancer-associated genes of a target cancer based on its salient signatures. We evaluate the cancer-associated genes using survival and gene expression analysis in different stages of cancer. Furthermore, we introduce DiaDeL, which is a deep learning-based binary classifier. The DiaDeL model uses mutational signatures as input features and distinguishes one cancer type from the others. Our proposed model outperforms six state-of-the-art methods with an accuracy of 0.824 and an AUC of 0.88. The source code is available at https://github.com/sabdollahi/DiaDeL.
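The "trinucleotide mutational pattern" mentioned in this abstract can be illustrated with a small sketch. It assumes the common convention of reporting each single-base substitution on the pyrimidine strand together with its two flanking bases (6 substitution classes × 16 contexts = 96 channels). The function names and the toy mutations are invented for illustration and are not taken from the DiaDeL code; the NMF step that would factor many such spectra into signatures is not shown.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def trinucleotide_channel(context, alt):
    """Map a mutation to a pyrimidine-centred channel label such as 'A[C>T]G'.

    context: three reference bases with the mutated base in the middle.
    alt: the alternate base, given on the same strand as the context.
    """
    ref = context[1]
    if ref in "AG":  # purine reference: report the event on the opposite strand
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
        ref = context[1]
    return f"{context[0]}[{ref}>{alt}]{context[2]}"


def mutation_spectrum(mutations):
    """Count mutations per channel and return relative frequencies."""
    counts = Counter(trinucleotide_channel(ctx, alt) for ctx, alt in mutations)
    total = sum(counts.values())
    return {channel: n / total for channel, n in counts.items()}


# Toy input (hypothetical): (reference context, alternate base) pairs.
toy_mutations = [("ACG", "T"), ("CGT", "A"), ("TCA", "G"), ("TGA", "C")]
spectrum = mutation_spectrum(toy_mutations)
```

In the toy input, the second and fourth mutations are the reverse-strand views of the first and third, so the four events collapse into two channels.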
|
34
|
Gao Q, Zeng Q, Wang Z, Li C, Xu Y, Cui P, Zhu X, Lu H, Wang G, Cai S, Wang J, Fan J. Start of an era: circulating cell-free DNA for early detection of cancers. Innovation (N Y) 2022; 3:100259. [PMID: 35647572 PMCID: PMC9133648 DOI: 10.1016/j.xinn.2022.100259] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 02/14/2022] [Accepted: 05/02/2022] [Indexed: 11/29/2022] Open
Abstract
Effective screening modalities are currently available for only a small subset of cancers, and they generally have suboptimal performance with complicated procedures. Therefore, there is an urgent need to develop simple, accurate, and non-invasive methods for early detection of cancers. Genetic and epigenetic alterations in plasma circulating cell-free DNA (cfDNA) have shown the potential to revolutionize methods of early detection of cancers and facilitate subsequent diagnosis to improve survival of patients. The medical interest in cfDNA assays has been inspired by emerging single- and multi-cancer early detection studies. This review summarizes current technological and clinical advances, in the hope of providing insights into the development and applications of cfDNA assays in various cancers and clinical scenarios. The key phases of clinical development of biomarkers are highlighted, and the future developments of cfDNA-based liquid biopsies in early detection of cancers are outlined. It is hoped that this study can boost the potential integration of cfDNA-based early detection of cancers into the current clinical workflow. Liquid biopsy, characterized by minimal invasiveness and user friendliness, can identify multiple cancers at the early stage and localize the tissue of origin. The state-of-the-art technology facilitates the application of circulating cell-free DNA (cfDNA) assays in the early detection of cancers. cfDNA assays are expected to be integrated into the clinical workflow after technological refinement and clinical trial validation. The development and application strategies of cfDNA assays in various cancers and clinical scenarios can vary, and the harm-and-benefit should be balanced carefully.
Affiliation(s)
- Qiang Gao
- Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Key Laboratory of Carcinogenesis and Cancer Invasion of Ministry of Education, Fudan University, Shanghai 200032, China
- Key Laboratory of Medical Epigenetics and Metabolism, Institutes of Biomedical Sciences, Fudan University, Shanghai 200032, China
| | - Qiang Zeng
- Health Management Institute, The Second Medical Center & National Clinical Research Center for Geriatric Diseases, Chinese PLA General Hospital, Beijing 100853, China
| | - Zhijie Wang
- State Key Laboratory of Molecular Oncology, Department of Medical Oncology, National Cancer Center/Cancer Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100006, China
| | | | - Yu Xu
- Burning Rock Biotech, Guangzhou 510320, China
| | - Peng Cui
- Burning Rock Biotech, Guangzhou 510320, China
| | - Xin Zhu
- Burning Rock Biotech, Guangzhou 510320, China
| | - Huafei Lu
- Burning Rock Biotech, Guangzhou 510320, China
| | | | - Shangli Cai
- Burning Rock Biotech, Guangzhou 510320, China
- Corresponding author
| | - Jie Wang
- State Key Laboratory of Molecular Oncology, Department of Medical Oncology, National Cancer Center/Cancer Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100006, China
- Corresponding author
| | - Jia Fan
- Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Key Laboratory of Carcinogenesis and Cancer Invasion of Ministry of Education, Fudan University, Shanghai 200032, China
- Key Laboratory of Medical Epigenetics and Metabolism, Institutes of Biomedical Sciences, Fudan University, Shanghai 200032, China
- Corresponding author
| |
|
35
|
Xu Y, Qian X, Zhang X, Lai X, Liu Y, Wang J. DeepLION: Deep Multi-Instance Learning Improves the Prediction of Cancer-Associated T Cell Receptors for Accurate Cancer Detection. Front Genet 2022; 13:860510. [PMID: 35601486 PMCID: PMC9121378 DOI: 10.3389/fgene.2022.860510] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/23/2022] [Accepted: 02/23/2022] [Indexed: 01/21/2023] Open
Abstract
Recent studies highlight the potential of T cell receptor (TCR) repertoires in accurately detecting cancers via noninvasive sampling. Unfortunately, due to the complicated associations among cancer antigens and the possible induced T cell responses, the practical strategy for identifying cancer-associated TCRs is currently computational prediction based on TCR repertoire data. Several state-of-the-art methods were proposed in the past year or two; however, the prediction algorithms were still weakened by two major issues. First, to simplify computation, the algorithms tend to decompose the original TCR sequences into fixed-length amino acid fragments, whereas the lengths of cancer-associated motifs are suggested to vary. Second, the correlations among TCRs in the same repertoire, which existing methods often ignore, should be further considered. We here developed a deep multi-instance learning method, named DeepLION, to improve the prediction of cancer-associated TCRs by considering these issues. First, DeepLION introduced a deep learning framework with alternative convolution filters and 1-max pooling operations to handle amino acid fragments of different lengths. Then, the multi-instance learning framework modeled the TCR correlations and assigned adjusted weights to each TCR sequence during prediction. To validate the performance of DeepLION, we conducted a series of experiments on several cohorts of patients from nine cancer types. Compared to the existing methods, DeepLION achieved, on most of the cohorts, higher prediction accuracies, sensitivities, specificities, and areas under the curve (AUCs), where the AUC reached 0.97 and 0.90 for the thyroid and lung cancer cohorts, respectively. Thus, DeepLION may further support the detection of cancers from TCR repertoire data.
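The combination of convolution filters of several widths with 1-max pooling, as described in this abstract, can be sketched in a few lines. The sketch below is not DeepLION's implementation: it assumes a plain one-hot encoding, uses untrained random filters as placeholders, and all function names are hypothetical. It only shows why the pooled feature vector has the same length for sequences of any length (provided each sequence is at least as long as the widest filter).

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(sequence):
    """Encode an amino acid string as a list of 20-dimensional one-hot rows."""
    return [[1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS] for aa in sequence]


def conv_one_max(encoded, kernel):
    """Slide one filter along the sequence and keep only its largest response."""
    width = len(kernel)
    responses = []
    for start in range(len(encoded) - width + 1):
        window = encoded[start:start + width]
        responses.append(sum(
            w * x
            for kernel_row, row in zip(kernel, window)
            for w, x in zip(kernel_row, row)
        ))
    return max(responses)  # requires len(sequence) >= filter width


def make_filters(widths, per_width, seed=0):
    """Create random placeholder filters; a real model would learn these."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1, 1) for _ in AMINO_ACIDS] for _ in range(width)]
        for width in widths
        for _ in range(per_width)
    ]


def featurize(sequence, filters):
    """One pooled value per filter, independent of sequence length."""
    encoded = one_hot(sequence)
    return [conv_one_max(encoded, kernel) for kernel in filters]


filters = make_filters(widths=(2, 3, 4), per_width=2)
short_features = featurize("CASSLGQY", filters)
long_features = featurize("CASSPGTGGNQPQHF", filters)

# A hand-built width-2 filter that responds to the motif "GQ" wherever it occurs.
motif_filter = one_hot("GQ")
with_motif = conv_one_max(one_hot("CASSLGQY"), motif_filter)
without_motif = conv_one_max(one_hot("CASSLGAY"), motif_filter)
```

Both sequences yield six features (three widths, two filters each), and the hand-built motif filter responds more strongly to the sequence that contains the motif, regardless of its position.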
DeepLION is publicly available on GitHub, at https://github.com/Bioinformatics7181/DeepLION, for academic usage only.
Affiliation(s)
- Ying Xu
- Department of Computer Science and Technology, School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, China
- Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
| | - Xinyang Qian
- Department of Computer Science and Technology, School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, China
- Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
| | - Xuanping Zhang
- Department of Computer Science and Technology, School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, China
- Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
| | - Xin Lai
- Department of Computer Science and Technology, School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, China
- Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
| | - Yuqian Liu
- Department of Computer Science and Technology, School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, China
- Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
| | - Jiayin Wang
- Department of Computer Science and Technology, School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, China
- Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
- *Correspondence: Jiayin Wang,
| |
|
36
|
Crosby D, Bhatia S, Brindle KM, Coussens LM, Dive C, Emberton M, Esener S, Fitzgerald RC, Gambhir SS, Kuhn P, Rebbeck TR, Balasubramanian S. Early detection of cancer. Science 2022; 375:eaay9040. [PMID: 35298272 DOI: 10.1126/science.aay9040] [Citation(s) in RCA: 350] [Impact Index Per Article: 116.7] [Indexed: 12/11/2022]
Abstract
Survival improves when cancer is detected early. However, ~50% of cancers are at an advanced stage when diagnosed. Early detection of cancer or precancerous change allows early intervention to try to slow or prevent cancer development and lethality. To achieve early detection of all cancers, numerous challenges must be overcome. It is vital to better understand who is at greatest risk of developing cancer. We also need to elucidate the biology and trajectory of precancer and early cancer to identify consequential disease that requires intervention. Insights must be translated into sensitive and specific early detection technologies and be appropriately evaluated to support practical clinical implementation. Interdisciplinary collaboration is key; advances in technology and biological understanding highlight that it is time to accelerate early detection research and transform cancer survival.
Affiliation(s)
| | - Sangeeta Bhatia
- Marble Center for Cancer Nanomedicine, Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Kevin M Brindle
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Lisa M Coussens
- Cell, Developmental and Cancer Biology, Oregon Health and Science University, Portland, OR, USA
- Knight Cancer Institute, Oregon Health and Science University, Portland, OR, USA
| | - Caroline Dive
- Cancer Research UK Lung Cancer Centre of Excellence at the University of Manchester and University College London, University of Manchester, Manchester, UK
- CRUK Manchester Institute Cancer Biomarker Centre, University of Manchester, Manchester, UK
| | - Mark Emberton
- Division of Surgery and Interventional Science, University College London, London, UK
| | - Sadik Esener
- Knight Cancer Institute, Oregon Health and Science University, Portland, OR, USA
- Department of Biomedical Engineering, School of Medicine, Oregon Health and Science University, Portland, OR, USA
- Cancer Early Detection Advanced Research Center, Oregon Health and Science University, Portland, OR, USA
| | - Rebecca C Fitzgerald
- Medical Research Council (MRC) Cancer Unit, Hutchison/MRC Research Centre, University of Cambridge, Cambridge, UK
| | - Sanjiv S Gambhir
- Department of Radiology, Molecular Imaging Program at Stanford, Stanford University, Stanford, CA, USA
| | - Peter Kuhn
- USC Michelson Center Convergent Science Institute in Cancer, University of Southern California, Los Angeles, CA, USA
| | - Timothy R Rebbeck
- Division of Population Science, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Shankar Balasubramanian
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
| |
|
37
|
Wang Z, Zhong Y, Zhang Z, Zhou K, Huang Z, Yu H, Liu L, Liu S, Yang H, Zhou J, Fan J, Wu L, Sun Y. Characteristics and Clinical Significance of T-Cell Receptor Repertoire in Hepatocellular Carcinoma. Front Immunol 2022; 13:847263. [PMID: 35371059 PMCID: PMC8965762 DOI: 10.3389/fimmu.2022.847263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2022] [Accepted: 02/09/2022] [Indexed: 11/13/2022] Open
Abstract
Several studies have demonstrated that the T-cell receptor (TCR) repertoire is associated with prognosis and immune therapy response in several types of cancer. However, the comprehensive features of the TCR repertoire in tumor-infiltrating and circulating T cells, as well as its clinical significance for diagnosis in hepatocellular carcinoma (HCC) patients, are still unknown. In this study, we collected paired tumor/peritumoral tissues and peripheral blood samples from 58 patients with HCC and subjected them to high-throughput TCR sequencing to comprehensively analyze the characteristics of the TCR repertoire and the clinical significance of peripheral TCR sequences. By exploring the abundance and diversity of TCR repertoires, we observed significantly higher TCR diversity in peripheral blood than in tumoral and peritumoral tissues, while tumoral and peritumoral tissues showed similar TCR diversity. A substantial difference in the usage frequencies of several Vβ and Jβ genes and TCRβ VJ pairings was found among the three types of tissues. Moreover, we reveal that HCC patients have a unique TCR repertoire profile in peripheral blood in contrast to healthy individuals. We further establish an HCC diagnostic model based on TCRβ VJ pairing usage in peripheral blood, which yields a best-fit area under the curve (AUC) of 0.9746 ± 0.0481 (sensitivity = 0.9675 ± 0.0603, specificity = 0.9998 ± 0.0007, average of 100 repeats) in the test set. Our study describes the characteristics of the tissue-infiltrating and circulating T-cell repertoires in patients with HCC and shows the potential of using circulating TCR sequences as a biomarker for the non-invasive diagnosis of patients with HCC.
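Two quantities named in this abstract, repertoire diversity and TCRβ VJ pairing usage, can be computed from a clonotype table as in the minimal sketch below. The abstract does not say which diversity index the authors used, so Shannon entropy is an assumption here; the function names and the toy counts are likewise invented for illustration.

```python
import math
from collections import Counter


def shannon_diversity(clone_counts):
    """Shannon entropy (natural log) of clonotype frequencies."""
    total = sum(clone_counts)
    return -sum((c / total) * math.log(c / total) for c in clone_counts if c > 0)


def vj_pairing_usage(clonotypes):
    """Fraction of reads carried by each (V gene, J gene) pairing.

    clonotypes: iterable of (v_gene, j_gene, read_count) tuples.
    """
    usage = Counter()
    for v_gene, j_gene, count in clonotypes:
        usage[(v_gene, j_gene)] += count
    total = sum(usage.values())
    return {pair: n / total for pair, n in usage.items()}


# Toy repertoire with invented read counts: three clonotypes, two VJ pairings.
repertoire = [
    ("TRBV20-1", "TRBJ2-7", 50),
    ("TRBV20-1", "TRBJ2-7", 25),
    ("TRBV5-1", "TRBJ1-1", 25),
]
diversity = shannon_diversity([count for _, _, count in repertoire])
usage = vj_pairing_usage(repertoire)
```

A usage dictionary of this kind, computed per sample, is the sort of feature vector a VJ-pairing-based diagnostic classifier could be trained on; the classifier itself is outside the scope of this sketch.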
Affiliation(s)
- Zifei Wang
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Beijing Genomics Institute at Shenzhen, Shenzhen, China
- Zhong-Hua Precision Medical Center, Zhongshan Hospital, Fudan University-BGI, Shanghai, China
| | - Yu Zhong
- Beijing Genomics Institute at Shenzhen, Shenzhen, China
- Zhong-Hua Precision Medical Center, Zhongshan Hospital, Fudan University-BGI, Shanghai, China
| | - Zefan Zhang
- Zhong-Hua Precision Medical Center, Zhongshan Hospital, Fudan University-BGI, Shanghai, China
- Department of Liver Surgery & Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University, Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Shanghai, China
| | - Kaiqian Zhou
- Zhong-Hua Precision Medical Center, Zhongshan Hospital, Fudan University-BGI, Shanghai, China
- Department of Liver Surgery & Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University, Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Shanghai, China
| | - Zhihao Huang
- Beijing Genomics Institute at Shenzhen, Shenzhen, China
| | - Hao Yu
- Beijing Genomics Institute at Shenzhen, Shenzhen, China
| | - Longqi Liu
- Beijing Genomics Institute at Shenzhen, Shenzhen, China
- Shenzhen Key Laboratory of Single-Cell Omics, BGI-Shenzhen, Shenzhen, China
| | - Shiping Liu
- Beijing Genomics Institute at Shenzhen, Shenzhen, China
- Shenzhen Key Laboratory of Single-Cell Omics, BGI-Shenzhen, Shenzhen, China
| | - Huanming Yang
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
| | - Jian Zhou
- Department of Liver Surgery & Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University, Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Shanghai, China
| | - Jia Fan
- Department of Liver Surgery & Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University, Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Shanghai, China
| | - Liang Wu
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Beijing Genomics Institute at Shenzhen, Shenzhen, China
- Zhong-Hua Precision Medical Center, Zhongshan Hospital, Fudan University-BGI, Shanghai, China
- Shenzhen Key Laboratory of Single-Cell Omics, BGI-Shenzhen, Shenzhen, China
| | - Yunfan Sun
- Zhong-Hua Precision Medical Center, Zhongshan Hospital, Fudan University-BGI, Shanghai, China
- Department of Liver Surgery & Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University, Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Shanghai, China
| |
|
38
|
Pauken KE, Lagattuta KA, Lu BY, Lucca LE, Daud AI, Hafler DA, Kluger HM, Raychaudhuri S, Sharpe AH. TCR-sequencing in cancer and autoimmunity: barcodes and beyond. Trends Immunol 2022; 43:180-194. [PMID: 35090787 PMCID: PMC8882139 DOI: 10.1016/j.it.2022.01.002] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 11/29/2021] [Revised: 01/04/2022] [Accepted: 01/04/2022] [Indexed: 01/21/2023]
Abstract
The T cell receptor (TCR) endows T cells with antigen specificity and is central to nearly all aspects of T cell function. Each naïve T cell has a unique TCR sequence that is stably maintained during cell division. In this way, the TCR serves as a molecular barcode that tracks processes such as migration, differentiation, and proliferation of T cells. Recent technological advances have enabled sequencing of the TCR from single cells alongside deep molecular phenotypes on an unprecedented scale. In this review, we discuss strengths and limitations of TCR sequences as molecular barcodes and their application to study immune responses following Programmed Death-1 (PD-1) blockade in cancer. Additionally, we consider applications of TCR data beyond use as a barcode.
Affiliation(s)
- Kristen E Pauken
- Department of Immunology, Blavatnik Institute, Harvard Medical School, Boston, MA, USA; Evergrande Center for Immunological Diseases, Harvard Medical School and Brigham and Women's Hospital, Boston, MA, USA.
| | - Kaitlyn A Lagattuta
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA; Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Benjamin Y Lu
- Department of Neurology and Department of Immunobiology, Yale School of Medicine, New Haven, CT, USA; Department of Medicine, Yale School of Medicine, New Haven, CT, USA
| | - Liliana E Lucca
- Department of Neurology and Department of Immunobiology, Yale School of Medicine, New Haven, CT, USA
| | - Adil I Daud
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - David A Hafler
- Department of Neurology and Department of Immunobiology, Yale School of Medicine, New Haven, CT, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Harriet M Kluger
- Department of Medicine, Yale School of Medicine, New Haven, CT, USA
| | - Soumya Raychaudhuri
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA; Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Centre for Genetics and Genomics Versus Arthritis, Manchester Academic Health Science Centre, University of Manchester, Manchester M13 9PL, UK
| | - Arlene H Sharpe
- Department of Immunology, Blavatnik Institute, Harvard Medical School, Boston, MA, USA; Evergrande Center for Immunological Diseases, Harvard Medical School and Brigham and Women's Hospital, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA.
| |
|
39
|
Wong C, Li B. AutoCAT: automated cancer-associated TCRs discovery from TCR-seq data. Bioinformatics 2022; 38:589-591. [PMID: 34529039 DOI: 10.1093/bioinformatics/btab661] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/01/2021] [Revised: 08/19/2021] [Accepted: 09/13/2021] [Indexed: 02/03/2023] Open
Abstract
SUMMARY: T cells participate directly in the body's immune response to cancer, allowing immunotherapy treatments to effectively recognize and target cancer cells. We previously developed DeepCAT to demonstrate that T cells serve as a biomarker of immune response in cancer patients and can be utilized as a diagnostic tool to differentiate healthy and cancer patient samples. However, DeepCAT's reliance on tumor bulk RNA-seq samples as training data limited its further performance improvement. Here, we benchmarked a new approach, AutoCAT, to predict tumor-associated TCRs from targeted TCR-seq data as a new form of input for DeepCAT, and observed the same level of predictive accuracy. AVAILABILITY AND IMPLEMENTATION: Source code is freely available at https://github.com/cew88/AutoCAT, and data are available at 10.5281/zenodo.5176884. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Christina Wong
- Lyda Hill Department of Bioinformatics, UT Southwestern Medical Center, Dallas, TX 75390, USA
| | - Bo Li
- Lyda Hill Department of Bioinformatics, UT Southwestern Medical Center, Dallas, TX 75390, USA
| |
|
40
|
Nakayama M, Michels AW. Using the T Cell Receptor as a Biomarker in Type 1 Diabetes. Front Immunol 2021; 12:777788. [PMID: 34868047 PMCID: PMC8635517 DOI: 10.3389/fimmu.2021.777788] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/15/2021] [Accepted: 10/26/2021] [Indexed: 12/20/2022] Open
Abstract
T cell receptors (TCRs) are unique markers that define antigen specificity for a given T cell. With the evolution of sequencing and computational analysis technologies, TCRs are now prime candidates for the development of next-generation non-cell based T cell biomarkers, which provide a surrogate measure to assess the presence of antigen-specific T cells. Type 1 diabetes (T1D), the immune-mediated form of diabetes, is a prototypical organ specific autoimmune disease in which T cells play a pivotal role in targeting pancreatic insulin-producing beta cells. While the disease is now predictable by measuring autoantibodies in the peripheral blood directed to beta cell proteins, there is an urgent need to develop T cell markers that recapitulate T cell activity in the pancreas and can be a measure of disease activity. This review focuses on the potential and challenges of developing TCR biomarkers for T1D. We summarize current knowledge about TCR repertoires and clonotypes specific for T1D and discuss challenges that are unique for autoimmune diabetes. Ultimately, the integration of large TCR datasets produced from individuals with and without T1D along with computational 'big data' analysis will facilitate the development of TCRs as potentially powerful biomarkers in the development of T1D.
Affiliation(s)
- Maki Nakayama
- Barbara Davis Center for Childhood Diabetes, University of Colorado School of Medicine, Aurora, CO, United States; Department of Pediatrics, University of Colorado School of Medicine, Aurora, CO, United States; Department of Immunology and Microbiology, University of Colorado School of Medicine, Aurora, CO, United States
| | - Aaron W Michels
- Barbara Davis Center for Childhood Diabetes, University of Colorado School of Medicine, Aurora, CO, United States; Department of Pediatrics, University of Colorado School of Medicine, Aurora, CO, United States; Department of Immunology and Microbiology, University of Colorado School of Medicine, Aurora, CO, United States; Department of Medicine, University of Colorado School of Medicine, Aurora, CO, United States
| |
Collapse
|
41
|
The immuneML ecosystem for machine learning analysis of adaptive immune receptor repertoires. Nat Mach Intell 2021. [DOI: 10.1038/s42256-021-00413-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/11/2022]
|
42
|
GIANA allows computationally-efficient TCR clustering and multi-disease repertoire classification by isometric transformation. Nat Commun 2021; 12:4699. [PMID: 34349111 PMCID: PMC8339063 DOI: 10.1038/s41467-021-25006-7] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 02/26/2021] [Accepted: 07/19/2021] [Indexed: 01/18/2023] Open
Abstract
Similarity in T-cell receptor (TCR) sequences implies shared antigen specificity between receptors, and could be used to discover novel therapeutic targets. However, existing methods that cluster T-cell receptor sequences by similarity are computationally inefficient, making them impractical to use on the ever-expanding datasets of the immune repertoire. Here, we developed GIANA (Geometric Isometry-based TCR AligNment Algorithm), a computationally efficient tool for this task that provides the same level of clustering specificity as TCRdist at 600 times its speed, and without sacrificing accuracy. GIANA also allows the rapid query of large reference cohorts within minutes. Using GIANA to cluster large-scale TCR datasets provides candidate disease-specific receptors, and provides a new solution to repertoire classification. Querying unseen TCR-seq samples against an existing reference differentiates samples from patients across various cohorts associated with cancer, infectious and autoimmune disease. Our results demonstrate how GIANA could be used as the basis for a TCR-based non-invasive multi-disease diagnostic platform.
Editor's summary: Grouping T-cell receptors (TCRs) by sequence similarity could lead to new immunological insights. Here, the authors propose a tool that allows the rapid clustering of millions of TCR sequences, identifying TCRs potentially associated with the response to cancer, infectious and autoimmune diseases.
|
43
|
Wang C, Shi M, Zhang L, Ji J, Xie R, Wu C, Guo X, Yang Y, Zhou W, Peng C, Zhang H, Yuan F, Zhang J. Identification of KRAS G12V associated clonal neoantigens and immune microenvironment in long-term survival of pancreatic adenocarcinoma. Cancer Immunol Immunother 2021; 71:491-504. [PMID: 34255132 PMCID: PMC8783870 DOI: 10.1007/s00262-021-03012-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/18/2021] [Accepted: 07/06/2021] [Indexed: 11/06/2022]
Abstract
Objective: To investigate the molecular characteristics in tumor immune microenvironment that affect long-term survival of patients with pancreatic adenocarcinoma (PAAD). Methods: The tumor related genetic features of a female PAAD patient (over 13-year survival) who suffered from multiple recurrences and metastases, and six operations over one decade were investigated deeply. Genomic features and immune microenvironment signatures of her primary lesion as well as six metastatic tumors at different time-points were characterized. Results: High-frequency clonal neoantigenic mutations identified in these specimens revealed the significant associations between clonal neoantigens with her prognosis after each surgery. Meanwhile, the TCGA and ICGC databases were employed to analyse the function of KRAS G12V in pancreatic cancer. Conclusions: The genomic analysis of clonal neoantigens combined with tumor immune microenvironment could promote the understandings of personalized prognostic evaluation and the stratification of resected PAAD individuals with better outcome. Supplementary Information: The online version contains supplementary material available at 10.1007/s00262-021-03012-4.
Affiliation(s)
- Chao Wang
  - Department of Oncology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, No. 197 Ruijin er Road, Shanghai, 200025, China; State Key Laboratory of Oncogenes and Related Genes, Shanghai Jiao Tong University, Shanghai, 200032, China
- Min Shi
  - Department of Oncology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, No. 197 Ruijin er Road, Shanghai, 200025, China
- Lei Zhang
  - Genecast Biotechnology Co., Ltd, Wuxi City, 214104, Jiangsu, China
- Jun Ji
  - Shanghai Institute of Digestive Surgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, No. 197 Ruijin er Road, Shanghai, 200025, China
- Ruyan Xie
  - VIP Health Center, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, No. 197 Ruijin er Road, Shanghai, 200025, China
- Chao Wu
  - VIP Health Center, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, No. 197 Ruijin er Road, Shanghai, 200025, China
- Xianchao Guo
  - Genecast Biotechnology Co., Ltd, Wuxi City, 214104, Jiangsu, China
- Ying Yang
  - Genecast Biotechnology Co., Ltd, Wuxi City, 214104, Jiangsu, China
- Wei Zhou
  - Genecast Biotechnology Co., Ltd, Wuxi City, 214104, Jiangsu, China
- Chenhong Peng
  - Department of Surgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, No. 197 Ruijin er Road, Shanghai, 200025, China
- Henghui Zhang
  - Genecast Biotechnology Co., Ltd, Wuxi City, 214104, Jiangsu, China
- Fei Yuan
  - Department of Pathology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, No. 197 Ruijin er Road, Shanghai, 200025, China
- Jun Zhang
  - Department of Oncology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, No. 197 Ruijin er Road, Shanghai, 200025, China; State Key Laboratory of Oncogenes and Related Genes, Shanghai Jiao Tong University, Shanghai, 200032, China
|
44
|
Xiong D, Zhang Z, Wang T, Wang X. A comparative study of multiple instance learning methods for cancer detection using T-cell receptor sequences. Comput Struct Biotechnol J 2021; 19:3255-3268. [PMID: 34141144 PMCID: PMC8192570 DOI: 10.1016/j.csbj.2021.05.038] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 02/28/2021] [Revised: 05/12/2021] [Accepted: 05/20/2021] [Indexed: 11/02/2022] Open
Abstract
As a branch of machine learning, multiple instance learning (MIL) learns from a collection of labeled bags, each containing a set of instances. The learning process is weakly supervised due to ambiguous instance labels. Since its emergence, MIL has been applied to solve various problems including content-based image retrieval, object tracking/detection, and computer-aided diagnosis. In biomedical research, the use of MIL has been focused on medical image analysis and molecule activity prediction. We review and apply 16 methods to investigate the applicability of MIL to a novel biomedical application, cancer detection using T-cell receptor (TCR) sequences. This important application can be a viable approach for large-scale cancer screening, as TCRs can be easily profiled from a subject's peripheral blood. We consider two feasible data-generating mechanisms, and for the purpose of performance evaluation, we simulate data under each mechanism, where we vary potentially important factors to mimic realistic situations. We also apply the methods to sequencing data of ten cancer types from The Cancer Genome Atlas, as an early proof of concept for distinguishing tumor patients from healthy individuals via TCR sequencing of peripheral blood. We find that, provided an appropriate MIL method is used, satisfactory performance, with an area under the receiver operating characteristic curve above 80%, can be achieved for five of the ten cancers. Based on our numerical results, we make suggestions about selection of a proper method and avoidance of any method with poor performance. We further point out directions of future research and identify a pressing need for new MIL methodologies that offer improved performance (for some cancer types) and more explainable outcomes.
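The bag-and-instance setup described in this abstract can be illustrated with a minimal sketch. It assumes the classic MIL rule that a bag is positive if at least one of its instances scores as positive (max pooling over instance scores). That rule, the linear scorer, and all names and numbers below are illustrative assumptions and are not taken from the paper, which compares 16 different methods.

```python
from typing import List, Sequence

# Hypothetical representation: one bag = all instances (feature vectors)
# from one subject; only the bag carries a label, the instances do not.
Bag = List[Sequence[float]]


def instance_score(x: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Linear score for a single instance (assumed scorer, for illustration only)."""
    return sum(w * v for w, v in zip(weights, x)) + bias


def bag_score(bag: Bag, weights: Sequence[float], bias: float = 0.0) -> float:
    """Max pooling: the bag score is the score of its highest-scoring instance."""
    return max(instance_score(x, weights, bias) for x in bag)


def predict_bag(bag: Bag, weights: Sequence[float], bias: float = 0.0) -> int:
    """Label the bag positive (1) if any instance scores above zero, else negative (0)."""
    return 1 if bag_score(bag, weights, bias) > 0 else 0


# Toy data: two-dimensional instance features, made up for this example.
weights = [1.0, -1.0]
bag_with_signal = [[0.1, 0.9], [0.8, 0.2]]     # second instance scores above zero
bag_without_signal = [[0.1, 0.9], [0.2, 0.5]]  # no instance scores above zero

print(predict_bag(bag_with_signal, weights))
print(predict_bag(bag_without_signal, weights))
```

In the TCR setting a bag would correspond to one subject's repertoire and an instance to one encoded TCR sequence; how sequences are encoded and how instance scores are pooled differs between methods.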
Affiliation(s)
- Danyi Xiong
  - Department of Statistical Science, Southern Methodist University, 3225 Daniel Avenue, Dallas 75275, TX, USA
  - Department of Population and Data Sciences, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas 75390, TX, USA
- Ze Zhang
  - Department of Population and Data Sciences, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas 75390, TX, USA
- Tao Wang
  - Department of Population and Data Sciences, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas 75390, TX, USA
- Xinlei Wang
  - Department of Statistical Science, Southern Methodist University, 3225 Daniel Avenue, Dallas 75275, TX, USA
|
45
|
Shoukat MS, Foers AD, Woodmansey S, Evans SC, Fowler A, Soilleux EJ. Use of machine learning to identify a T cell response to SARS-CoV-2. Cell Rep Med 2021; 2:100192. [PMID: 33495756 PMCID: PMC7816879 DOI: 10.1016/j.xcrm.2021.100192] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 10/02/2020] [Revised: 12/08/2020] [Accepted: 01/12/2021] [Indexed: 12/29/2022]
Abstract
The identification of SARS-CoV-2-specific T cell receptor (TCR) sequences is critical for understanding T cell responses to SARS-CoV-2. Accordingly, we reanalyze publicly available data from SARS-CoV-2-recovered patients who had low-severity disease (n = 17) and SARS-CoV-2 infection-naive (control) individuals (n = 39). Applying a machine learning approach to TCR beta (TRB) repertoire data, we can classify patient/control samples with a training sensitivity, specificity, and accuracy of 88.2%, 100%, and 96.4% and a testing sensitivity, specificity, and accuracy of 82.4%, 97.4%, and 92.9%, respectively. Interestingly, the same machine learning approach cannot separate SARS-CoV-2 recovered from SARS-CoV-2 infection-naive individual samples on the basis of B cell receptor (immunoglobulin heavy chain; IGH) repertoire data, suggesting that the T cell response to SARS-CoV-2 may be more stereotyped and longer lived. Following validation in larger cohorts, our method may be useful in detecting protective immunity acquired through natural infection or in determining the longevity of vaccine-induced immunity.
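As a consistency check on the percentages reported in this abstract, the sketch below recomputes sensitivity, specificity, and accuracy from confusion-matrix counts. The counts are an assumption: they are back-calculated from the reported percentages and the stated group sizes (17 recovered, 39 naive) and do not appear in the abstract itself.

```python
def rates(tp: int, fn: int, tn: int, fp: int) -> tuple:
    """Return (sensitivity, specificity, accuracy) as percentages, one decimal place."""
    sensitivity = tp / (tp + fn)                # true positives / all patients
    specificity = tn / (tn + fp)                # true negatives / all controls
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # correct calls / all samples
    return tuple(round(100 * x, 1) for x in (sensitivity, specificity, accuracy))


# Assumed counts (inferred, not reported): 17 patients and 39 controls in total.
training = rates(tp=15, fn=2, tn=39, fp=0)
testing = rates(tp=14, fn=3, tn=38, fp=1)

print(training)  # (88.2, 100.0, 96.4)
print(testing)   # (82.4, 97.4, 92.9)
```

Under these assumed counts, the testing figures correspond to three missed patients and one misclassified control out of 56 samples.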
Affiliation(s)
- M. Saad Shoukat
  - Department of Pathology, University of Cambridge, Cambridge, UK
- Andrew D. Foers
  - Department of Pathology, University of Cambridge, Cambridge, UK
- Anna Fowler
  - Department of Health Data Science, Institute of Population Health, University of Liverpool, Liverpool, UK
|
46
|
Greiff V, Yaari G, Cowell LG. Mining adaptive immune receptor repertoires for biological and clinical information using machine learning. Curr Opin Syst Biol 2020. [DOI: 10.1016/j.coisb.2020.10.010] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 12/19/2022]
|