1
Yu Q, Shen X, Yi L, Liang M, Li G, Guan Z, Wu X, Castel H, Hu B, Yin P, Zhang W. Fragment-Fusion Transformer: Deep Learning-Based Discretization Method for Continuous Single-Cell Raman Spectral Analysis. ACS Sens 2024. [PMID: 38934798 DOI: 10.1021/acssensors.4c00149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/28/2024]
Abstract
Raman spectroscopy has become an important single-cell analysis tool for monitoring biochemical changes at the cellular level. However, Raman spectral data, typically presented as continuous data with high-dimensional characteristics, is distinct from discrete sequences, which limits the application of deep learning-based algorithms in data analysis due to the lack of discretization. Herein, a model called fragment-fusion transformer is proposed, which integrates the discrete fragmentation of continuous spectra based on their intrinsic characteristics with the extraction of intrafragment features and the fusion of interfragment features. The model integrates the intrinsic feature-based fragmentation of spectra with transformer, constructing the fragment transformer block for feature extraction within fragments. Interfragment information is combined through the pyramid design structure to improve the model's receptive field and fully exploit the spectral properties. During the pyramidal fusion process, the information gain of the final extracted features in the spectrum has been enhanced by a factor of 9.24 compared to the feature extraction stage within the fragment, and the information entropy has been enhanced by a factor of 13. The fragment-fusion transformer achieved a spectral recognition accuracy of 94.5%, which is 4% higher compared to the method without fragmentation and fusion processes on the test set of cell Raman spectroscopy identification experiments. In comparison to common spectral classification models such as KNN, SVM, logistic regression, and CNN, fragment-fusion transformer has achieved 4.4% higher accuracy than the best-performing CNN model. Fragment-fusion transformer method has the potential to serve as a general framework for discretization in the field of continuous spectral data analysis and as a research tool for analyzing the intrinsic information within spectra.
Affiliation(s)
- Qiang Yu
  - Hangzhou Institute of Technology, Xidian University, Hangzhou, Zhejiang 311200, China
  - School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710126, China
- Xiaokun Shen
  - School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710126, China
- LangLang Yi
  - School of Life Science and Technology, Xidian University, Xi'an, Shaanxi 710126, China
- Minghui Liang
  - School of Life Science and Technology, Xidian University, Xi'an, Shaanxi 710126, China
- Guoqian Li
  - School of Life Science and Technology, Xidian University, Xi'an, Shaanxi 710126, China
- Zhihui Guan
  - School of Life Science and Technology, Xidian University, Xi'an, Shaanxi 710126, China
- Xiaoyao Wu
  - School of Mathematics and Physics Science and Engineering, Hebei University of Engineering, Handan, Hebei 056038, China
- Helene Castel
  - Institute of Research and Biomedical Innovation, University of Rouen Normandie, Mont-Saint-Aignan, 76821, France
- Bo Hu
  - School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710126, China
  - School of Mathematics and Physics Science and Engineering, Hebei University of Engineering, Handan, Hebei 056038, China
  - Xi'an Intelligent Precision Diagnosis and Treatment International Science and Technology Cooperation Base, Xi'an, Shaanxi 710126, China
- Pengju Yin
  - School of Mathematics and Physics Science and Engineering, Hebei University of Engineering, Handan, Hebei 056038, China
- Wenbo Zhang
  - School of Electronic Engineering, Xidian University, Xi'an, Shaanxi 710126, China
2
Yang Q, Xu L, Dong W, Li X, Wang K, Dong S, Zhang X, Yang T, Jiang F, Zhang B, Luo G, Gao X, Wang G. HLAIImaster: a deep learning method with adaptive domain knowledge predicts HLA II neoepitope immunogenic responses. Brief Bioinform 2024; 25:bbae302. [PMID: 38920343 PMCID: PMC11200192 DOI: 10.1093/bib/bbae302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Revised: 05/20/2024] [Accepted: 06/11/2024] [Indexed: 06/27/2024] Open
Abstract
While significant strides have been made in predicting neoepitopes that trigger autologous CD4+ T cell responses, accurately identifying the antigen presentation by human leukocyte antigen (HLA) class II molecules remains a challenge. This identification is critical for developing vaccines and cancer immunotherapies. Current prediction methods are limited, primarily due to a lack of high-quality training epitope datasets and algorithmic constraints. To predict the exogenous HLA class II-restricted peptides across most of the human population, we utilized the mass spectrometry data to profile >223 000 eluted ligands over HLA-DR, -DQ, and -DP alleles. Here, by integrating these data with peptide processing and gene expression, we introduce HLAIImaster, an attention-based deep learning framework with adaptive domain knowledge for predicting neoepitope immunogenicity. Leveraging diverse biological characteristics and our enhanced deep learning framework, HLAIImaster is significantly improved against existing tools in terms of positive predictive value across various neoantigen studies. Robust domain knowledge learning accurately identifies neoepitope immunogenicity, bridging the gap between neoantigen biology and the clinical setting and paving the way for future neoantigen-based therapies to provide greater clinical benefit. In summary, we present a comprehensive exploitation of the immunogenic neoepitope repertoire of cancers, facilitating the effective development of "just-in-time" personalized vaccines.
Affiliation(s)
- Qiang Yang
  - School of Medicine and Health, Harbin Institute of Technology, Yikuang Street, Harbin 150000, China
- Long Xu
  - School of Computer Science and Technology, Harbin Institute of Technology, West Dazhi Street, Harbin 150001, China
- Weihe Dong
  - College of Computer and Control Engineering, Northeast Forestry University, Hexing Road, Harbin 150004, China
- Xiaokun Li
  - School of Computer Science and Technology, Harbin Institute of Technology, West Dazhi Street, Harbin 150001, China
  - School of Computer Science and Technology, Heilongjiang University, Xuefu Road, Harbin 150080, China
  - Postdoctoral Program of Heilongjiang Hengxun Technology Co., Ltd., Xuefu Road, Harbin 150090, China
  - Shandong Hengxun Technology Co., Ltd., Miaoling Road, Qingdao 266100, China
- Kuanquan Wang
  - School of Computer Science and Technology, Harbin Institute of Technology, West Dazhi Street, Harbin 150001, China
- Suyu Dong
  - College of Computer and Control Engineering, Northeast Forestry University, Hexing Road, Harbin 150004, China
- Xianyu Zhang
  - Department of Breast Surgery, Harbin Medical University Cancer Hospital, Haping Road, Harbin 150081, China
- Tiansong Yang
  - Department of Rehabilitation, The First Affiliated Hospital of Heilongjiang University of Traditional Chinese Medicine, and Traditional Chinese Medicine Informatics Key Laboratory of Heilongjiang Province, Heping Road, Harbin 150040, China
- Feng Jiang
  - School of Medicine and Health, Harbin Institute of Technology, Yikuang Street, Harbin 150000, China
- Bin Zhang
  - Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal 23955, Saudi Arabia
- Gongning Luo
  - Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal 23955, Saudi Arabia
- Xin Gao
  - Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal 23955, Saudi Arabia
- Guohua Wang
  - College of Computer and Control Engineering, Northeast Forestry University, Hexing Road, Harbin 150004, China
3
Wu Y, Shao W, Yan M, Wang Y, Xu P, Huang G, Li X, Gregory BD, Yang J, Wang H, Yu X. Transfer learning enables identification of multiple types of RNA modifications using nanopore direct RNA sequencing. Nat Commun 2024; 15:4049. [PMID: 38744925 PMCID: PMC11094168 DOI: 10.1038/s41467-024-48437-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Accepted: 04/26/2024] [Indexed: 05/16/2024] Open
Abstract
Nanopore direct RNA sequencing (DRS) has emerged as a powerful tool for RNA modification identification. However, concurrently detecting multiple types of modifications in a single DRS sample remains a challenge. Here, we develop TandemMod, a transferable deep learning framework capable of detecting multiple types of RNA modifications in single DRS data. To train high-performance TandemMod models, we generate in vitro epitranscriptome datasets from cDNA libraries, containing thousands of transcripts labeled with various types of RNA modifications. We validate the performance of TandemMod on both in vitro transcripts and in vivo human cell lines, confirming its high accuracy for profiling m6A and m5C modification sites. Furthermore, we perform transfer learning for identifying other modifications such as m7G, Ψ, and inosine, significantly reducing training data size and running time without compromising performance. Finally, we apply TandemMod to identify 3 types of RNA modifications in rice grown in different environments, demonstrating its applicability across species and conditions. In summary, we provide a resource with ground-truth labels that can serve as benchmark datasets for nanopore-based modification identification methods, and TandemMod for identifying diverse RNA modifications using a single DRS sample.
Affiliation(s)
- You Wu
  - Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Wenna Shao
  - Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Mengxiao Yan
  - Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, Shanghai, 201602, China
- Yuqin Wang
  - Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, Shanghai, 201602, China
- Pengfei Xu
  - Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Guoqiang Huang
  - Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Xiaofei Li
  - Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Brian D Gregory
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Jun Yang
  - Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, Shanghai, 201602, China
  - Chenshan Scientific Research Center of CAS Center for Excellence in Molecular Plant Sciences, Shanghai, 201602, China
- Hongxia Wang
  - Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, Shanghai, 201602, China
  - Chenshan Scientific Research Center of CAS Center for Excellence in Molecular Plant Sciences, Shanghai, 201602, China
- Xiang Yu
  - Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
4
Kawakita S, Shen A, Chao CC, Wang Z, Cheng S, Li B, Jiang C. An integrated database of experimentally validated major histocompatibility complex epitopes for antigen-specific cancer therapy. Antib Ther 2024; 7:177-186. [PMID: 38933532 PMCID: PMC11200702 DOI: 10.1093/abt/tbae011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Revised: 04/18/2024] [Indexed: 06/28/2024] Open
Abstract
Cancer immunotherapy represents a paradigm shift in oncology, offering a superior anti-tumor efficacy and the potential for durable remission. The success of personalized vaccines and cell therapies hinges on the identification of immunogenic epitopes capable of eliciting an effective immune response. Current limitations in the availability of immunogenic epitopes restrict the broader application of such therapies. A critical criterion for serving as potential cancer antigens is their ability to stably bind to the major histocompatibility complex (MHC) for presentation on the surface of tumor cells. To address this, we have developed a comprehensive database of MHC epitopes, experimentally validated for their MHC binding and cell surface presentation. Our database catalogs 451 065 MHC peptide epitopes, each with experimental evidence for MHC binding, along with detailed information on human leukocyte antigen allele specificity, source peptides, and references to original studies. We also provide the grand average of hydropathy scores and predicted immunogenicity for the epitopes. The database (MHCepitopes) has been made available on the web and can be accessed at https://github.com/jcm1201/MHCepitopes.git. By consolidating empirical data from various sources coupled with calculated immunogenicity and hydropathy values, our database offers a robust resource for selecting actionable tumor antigens and advancing the design of antigen-specific cancer immunotherapies. It streamlines the process of identifying promising immunotherapeutic targets, potentially expediting the development of effective antigen-based cancer immunotherapies.
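The grand average of hydropathy (GRAVY) score mentioned in this abstract is conventionally the mean Kyte-Doolittle hydropathy value over all residues of a peptide. The sketch below illustrates that conventional definition only; the abstract does not say how the database authors computed their scores, and the function name `gravy` and the example peptides are hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(peptide: str) -> float:
    """Return the mean Kyte-Doolittle hydropathy of a peptide (hypothetical helper)."""
    residues = peptide.strip().upper()
    if not residues:
        raise ValueError("peptide must not be empty")
    # Reject anything outside the 20 standard one-letter codes.
    unknown = set(residues) - KYTE_DOOLITTLE.keys()
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


if __name__ == "__main__":
    # Illustrative sequences, not taken from the database.
    for seq in ("ILV", "KRDE"):
        print(seq, round(gravy(seq), 3))
```

Positive scores indicate a predominantly hydrophobic peptide and negative scores a hydrophilic one, which is why the score is a common descriptor when screening candidate epitopes.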
Affiliation(s)
- Satoru Kawakita
  - Department of Precision Medicine, Terasaki Institute for Biomedical Innovation, Los Angeles, CA 90024, United States
- Aidan Shen
  - Department of Precision Medicine, Terasaki Institute for Biomedical Innovation, Los Angeles, CA 90024, United States
- Cheng-Chi Chao
  - Department of Pipeline Development, Biomap, Inc., Palo Alto, CA 94303, United States
- Zhaohui Wang
  - Department of Precision Medicine, Terasaki Institute for Biomedical Innovation, Los Angeles, CA 90024, United States
- Siliangyu Cheng
  - Quantitative and Computational Biology Department, University of Southern California, Los Angeles, CA 90089, United States
- Bingbing Li
  - Autonomy Research Center for STEAHM (ARCS), California State University Northridge, Northridge, CA 91324, United States
- Chongming Jiang
  - Department of Precision Medicine, Terasaki Institute for Biomedical Innovation, Los Angeles, CA 90024, United States
5
Zhang L, Song W, Zhu T, Liu Y, Chen W, Cao Y. ConvNeXt-MHC: improving MHC-peptide affinity prediction by structure-derived degenerate coding and the ConvNeXt model. Brief Bioinform 2024; 25:bbae133. [PMID: 38561979 PMCID: PMC10985285 DOI: 10.1093/bib/bbae133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Revised: 02/11/2024] [Accepted: 03/02/2024] [Indexed: 04/04/2024] Open
Abstract
Peptide binding to major histocompatibility complex (MHC) proteins plays a critical role in T-cell recognition and the specificity of the immune response. Experimental validation of such peptides is extremely resource-intensive. As a result, accurate computational prediction of binding peptides is highly important, particularly in the context of cancer immunotherapy applications, such as the identification of neoantigens. In recent years, there has been a significant need to continually improve the existing prediction methods to meet the demands of this field. We developed ConvNeXt-MHC, a method for predicting MHC-I-peptide binding affinity. It introduces a degenerate encoding approach to enhance well-established panspecific methods and integrates transfer learning and semi-supervised learning methods into the cutting-edge deep learning framework ConvNeXt. Comprehensive benchmark results demonstrate that ConvNeXt-MHC outperforms state-of-the-art methods in terms of accuracy. We expect that ConvNeXt-MHC will help us foster new discoveries in the field of immunoinformatics in the distant future. We constructed a user-friendly website at http://www.combio-lezhang.online/predict/, where users can access our data and application.
Affiliation(s)
- Le Zhang
  - College of Computer Science, Sichuan University, Chengdu 610065, China
- Wenkai Song
  - College of Computer Science, Sichuan University, Chengdu 610065, China
- Tinghao Zhu
  - College of Computer Science, Sichuan University, Chengdu 610065, China
  - Nuclear Power Institute of China, Chengdu 610213, China
- Yang Liu
  - Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, No. 29 Wangjiang Road, Chengdu 610065, China
- Wei Chen
  - Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
- Yang Cao
  - Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, No. 29 Wangjiang Road, Chengdu 610065, China
6
O'Brien H, Salm M, Morton LT, Szukszto M, O'Farrell F, Boulton C, Becker PD, Samuels Y, Swanton C, Mansour MR, Reker Hadrup S, Quezada SA. Breaking the performance ceiling for neoantigen immunogenicity prediction. Nat Cancer 2023; 4:1618-1621. [PMID: 38102360 DOI: 10.1038/s43018-023-00675-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/17/2023]
Affiliation(s)
- Max Salm
  - Achilles Therapeutics Ltd, London, UK
- Maciej Szukszto
  - Cancer Immunology Unit, Research Department of Haematology, University College London Cancer Institute, London, UK
- Charlotte Boulton
  - Cancer Immunology Unit, Research Department of Haematology, University College London Cancer Institute, London, UK
- Marc R Mansour
  - Cancer Immunology Unit, Research Department of Haematology, University College London Cancer Institute, London, UK
- Sergio A Quezada
  - Achilles Therapeutics Ltd, London, UK
  - Cancer Immunology Unit, Research Department of Haematology, University College London Cancer Institute, London, UK
7
Mariuzza RA, Wu D, Pierce BG. Structural basis for T cell recognition of cancer neoantigens and implications for predicting neoepitope immunogenicity. Front Immunol 2023; 14:1303304. [PMID: 38045695 PMCID: PMC10693334 DOI: 10.3389/fimmu.2023.1303304] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/27/2023] [Accepted: 11/03/2023] [Indexed: 12/05/2023] Open
Abstract
Adoptive cell therapy (ACT) with tumor-specific T cells has been shown to mediate durable cancer regression. Tumor-specific T cells are also the basis of other therapies, notably cancer vaccines. The main targets of tumor-specific T cells are neoantigens resulting from mutations in self-antigens over the course of malignant transformation. The detection of neoantigens presents a major challenge to T cells because of their high structural similarity to self-antigens, and the need to avoid autoimmunity. How different a neoantigen must be from its wild-type parent for it to induce a T cell response is poorly understood. Here we review recent structural and biophysical studies of T cell receptor (TCR) recognition of shared cancer neoantigens derived from oncogenes, including p53 R175H, KRAS G12D, KRAS G12V, HHAT p8F, and PIK3CA H1047L. These studies have revealed that, in some cases, the oncogenic mutation improves antigen presentation by strengthening peptide-MHC binding. In other cases, the mutation is detected by direct interactions with TCR, or by energetically driven or other indirect strategies not requiring direct TCR contacts with the mutation. We also review antibodies designed to recognize peptide-MHC on cell surfaces (TCR-mimic antibodies) as an alternative to TCRs for targeting cancer neoantigens. Finally, we review recent computational advances in this area, including efforts to predict neoepitope immunogenicity and how these efforts may be advanced by structural information on peptide-MHC binding and peptide-MHC recognition by TCRs.
Affiliation(s)
- Roy A. Mariuzza
  - W.M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD, United States
  - Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD, United States
- Daichao Wu
  - Laboratory of Structural Immunology, Department of Hepatopancreatobiliary Surgery, The First Affiliated Hospital, Hengyang Medical School, University of South China, Hengyang, Hunan, China
- Brian G. Pierce
  - W.M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD, United States
  - Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD, United States