1
Gondal MN, Shah SUR, Chinnaiyan AM, Cieslik M. A systematic overview of single-cell transcriptomics databases, their use cases, and limitations. Frontiers in Bioinformatics 2024; 4:1417428. [PMID: 39040140 PMCID: PMC11260681 DOI: 10.3389/fbinf.2024.1417428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2024] [Accepted: 06/11/2024] [Indexed: 07/24/2024] [Open Access]
Abstract
Rapid advancements in high-throughput single-cell RNA-seq (scRNA-seq) technologies and experimental protocols have led to the generation of vast amounts of transcriptomic data that populates several online databases and repositories. Here, we systematically examined large-scale scRNA-seq databases, categorizing them based on their scope and purpose such as general, tissue-specific databases, disease-specific databases, cancer-focused databases, and cell type-focused databases. Next, we discuss the technical and methodological challenges associated with curating large-scale scRNA-seq databases, along with current computational solutions. We argue that understanding scRNA-seq databases, including their limitations and assumptions, is crucial for effectively utilizing this data to make robust discoveries and identify novel biological insights. Such platforms can help bridge the gap between computational and wet lab scientists through user-friendly web-based interfaces needed for democratizing access to single-cell data. These platforms would facilitate interdisciplinary research, enabling researchers from various disciplines to collaborate effectively. This review underscores the importance of leveraging computational approaches to unravel the complexities of single-cell data and offers a promising direction for future research in the field.
Affiliation(s)
- Mahnoor N. Gondal
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
  - Michigan Center for Translational Pathology, University of Michigan, Ann Arbor, MI, United States
- Saad Ur Rehman Shah
  - Gies College of Business, University of Illinois Business College, Champaign, IL, United States
- Arul M. Chinnaiyan
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
  - Michigan Center for Translational Pathology, University of Michigan, Ann Arbor, MI, United States
  - Department of Pathology, University of Michigan, Ann Arbor, MI, United States
  - Department of Urology, University of Michigan, Ann Arbor, MI, United States
  - Howard Hughes Medical Institute, Ann Arbor, MI, United States
  - University of Michigan Rogel Cancer Center, Ann Arbor, MI, United States
- Marcin Cieslik
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
  - Michigan Center for Translational Pathology, University of Michigan, Ann Arbor, MI, United States
  - Department of Pathology, University of Michigan, Ann Arbor, MI, United States
  - University of Michigan Rogel Cancer Center, Ann Arbor, MI, United States
2
Ma C, Hao Y, Shi B, Wu Z, Jin D, Yu X, Jin B. Unveiling mitochondrial and ribosomal gene deregulation and tumor microenvironment dynamics in acute myeloid leukemia. Cancer Gene Ther 2024; 31:1034-1048. [PMID: 38806621 DOI: 10.1038/s41417-024-00788-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Revised: 05/15/2024] [Accepted: 05/21/2024] [Indexed: 05/30/2024]
Abstract
Acute myeloid leukemia (AML) is a malignant clonal hematopoietic disease with a poor prognosis. Understanding the interaction between leukemic cells and the tumor microenvironment (TME) can help predict the prognosis of leukemia and guide its treatment. Re-analyzing the scRNA-seq data from the CSC and G20 cohorts, using a Python-based pipeline including machine-learning-based scVI-tools, recapitulated the distinct hierarchical structure within the samples of AML patients. Weighted correlation network analysis (WGCNA) was conducted to construct a weighted gene co-expression network and to identify gene modules primarily focusing on hematopoietic stem cells (HSCs), multipotent progenitors (MPPs), and natural killer (NK) cells. The analysis revealed significant deregulation in gene modules associated with aerobic respiration and ribosomal/cytoplasmic translation. Cell-cell communications were elucidated by the CellChat package, revealing an imbalance of activating and inhibitory immune signaling pathways. The intersection of genes upregulated in leukemic HSCs and MPPs as well as in NKG2A-high NK cells was used to construct prognostic models. Normal Cox and artificial neural network models based on 10 genes were developed. The study reveals the deregulation of mitochondrial and ribosomal genes in AML patients and suggests the co-occurrence of stimulatory and inhibitory factors in the AML TME.
Affiliation(s)
- Chao Ma
  - Institute of Cancer Stem Cell, Dalian Medical University, West Section Lvshun South Road, Dalian, 116044, Liaoning, China
- Yuchao Hao
  - Department of Hematology, The Second Hospital of Dalian Medical University, West Section Lvshun South Road, Dalian, 116027, Liaoning, China
- Bo Shi
  - Institute of Cancer Stem Cell, Dalian Medical University, West Section Lvshun South Road, Dalian, 116044, Liaoning, China
- Zheng Wu
  - Institute of Cancer Stem Cell, Dalian Medical University, West Section Lvshun South Road, Dalian, 116044, Liaoning, China
- Di Jin
  - Institute of Cancer Stem Cell, Dalian Medical University, West Section Lvshun South Road, Dalian, 116044, Liaoning, China
- Xiao Yu
  - NHC Key Laboratory of Pneumoconiosis, The First Hospital of Shanxi Medical University, South Jiefang Road, Taiyuan, 030001, Shanxi, China
- Bilian Jin
  - Institute of Cancer Stem Cell, Dalian Medical University, West Section Lvshun South Road, Dalian, 116044, Liaoning, China
3
Sganzerla Martinez G, Garduno A, Toloue Ostadgavahi A, Hewins B, Dutt M, Kumar A, Martin-Loeches I, Kelvin DJ. Identification of Marker Genes in Infectious Diseases from ScRNA-seq Data Using Interpretable Machine Learning. Int J Mol Sci 2024; 25:5920. [PMID: 38892107 PMCID: PMC11172967 DOI: 10.3390/ijms25115920] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Revised: 05/24/2024] [Accepted: 05/25/2024] [Indexed: 06/21/2024] [Open Access]
Abstract
A common result of infection is an abnormal immune response, which may be detrimental to the host. To control the infection, the immune system might undergo regulation, therefore producing an excess of either pro-inflammatory or anti-inflammatory pathways that can lead to widespread inflammation, tissue damage, and organ failure. A dysregulated immune response can manifest as changes in differentiated immune cell populations and concentrations of circulating biomarkers. To propose an early diagnostic system that enables differentiation and identifies the severity of immune-dysregulated syndromes, we built an artificial intelligence tool that uses input data from single-cell RNA sequencing. In our results, single-cell transcriptomics successfully distinguished between mild and severe sepsis and COVID-19 infections. Moreover, by interpreting the decision patterns of our classification system, we identified that different immune cells upregulating or downregulating the expression of the genes CD3, CD14, CD16, FOSB, S100A12, and TCRγδ can accurately differentiate between different degrees of infection. Our research has identified genes of significance that effectively distinguish between infections, offering promising prospects as diagnostic markers and providing potential targets for therapeutic intervention.
Affiliation(s)
- Gustavo Sganzerla Martinez
  - Microbiology and Immunology, Dalhousie University, Halifax, NS B3H 4H7, Canada
  - Department of Pediatrics, Izaak Walton Killam (IWK) Health Center, Canadian Center for Vaccinology, Halifax, NS B3H 4H7, Canada
  - Department of Immunology, Shantou University Medical College, Shantou 512025, China
- Alexis Garduno
  - Department of Clinical Medicine, Trinity College Dublin, D08 NHY1 Dublin, Ireland
  - Department of Intensive Care Medicine, St. James’s Hospital, D08 NHY1 Dublin, Ireland
- Ali Toloue Ostadgavahi
  - Microbiology and Immunology, Dalhousie University, Halifax, NS B3H 4H7, Canada
  - Department of Pediatrics, Izaak Walton Killam (IWK) Health Center, Canadian Center for Vaccinology, Halifax, NS B3H 4H7, Canada
  - Department of Immunology, Shantou University Medical College, Shantou 512025, China
- Benjamin Hewins
  - Microbiology and Immunology, Dalhousie University, Halifax, NS B3H 4H7, Canada
  - Department of Pediatrics, Izaak Walton Killam (IWK) Health Center, Canadian Center for Vaccinology, Halifax, NS B3H 4H7, Canada
  - Department of Immunology, Shantou University Medical College, Shantou 512025, China
- Mansi Dutt
  - Microbiology and Immunology, Dalhousie University, Halifax, NS B3H 4H7, Canada
  - Department of Pediatrics, Izaak Walton Killam (IWK) Health Center, Canadian Center for Vaccinology, Halifax, NS B3H 4H7, Canada
  - Department of Immunology, Shantou University Medical College, Shantou 512025, China
- Anuj Kumar
  - Microbiology and Immunology, Dalhousie University, Halifax, NS B3H 4H7, Canada
  - Department of Pediatrics, Izaak Walton Killam (IWK) Health Center, Canadian Center for Vaccinology, Halifax, NS B3H 4H7, Canada
  - Department of Immunology, Shantou University Medical College, Shantou 512025, China
- Ignacio Martin-Loeches
  - Department of Clinical Medicine, Trinity College Dublin, D08 NHY1 Dublin, Ireland
  - Department of Intensive Care Medicine, St. James’s Hospital, D08 NHY1 Dublin, Ireland
  - Multidisciplinary Intensive Care Research Organization (MICRO), St. James’s Hospital, D08 NHY1 Dublin, Ireland
- David J. Kelvin
  - Microbiology and Immunology, Dalhousie University, Halifax, NS B3H 4H7, Canada
  - Department of Pediatrics, Izaak Walton Killam (IWK) Health Center, Canadian Center for Vaccinology, Halifax, NS B3H 4H7, Canada
  - Department of Immunology, Shantou University Medical College, Shantou 512025, China
4
Oloruntoba A, Ingvar Å, Sashindranath M, Anthony O, Abbott L, Guitera P, Caccetta T, Janda M, Soyer HP, Mar V. Examining labelling guidelines for AI-based software as a medical device: A review and analysis of dermatology mobile applications in Australia. Australas J Dermatol 2024. [PMID: 38693690 DOI: 10.1111/ajd.14269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 02/26/2024] [Accepted: 04/01/2024] [Indexed: 05/03/2024]
Abstract
In recent years, there has been a surge in the development of AI-based Software as a Medical Device (SaMD), particularly in visual specialties such as dermatology. In Australia, the Therapeutic Goods Administration (TGA) regulates AI-based SaMD to ensure its safe use. Proper labelling of these devices is crucial to ensure that healthcare professionals and the general public understand how to use them and interpret results accurately. However, guidelines for labelling AI-based SaMD in dermatology are lacking, which may result in products failing to provide essential information about algorithm development and performance metrics. This review examines existing labelling guidelines for AI-based SaMD across visual medical specialties, with a specific focus on dermatology. Common recommendations for labelling are identified and applied to currently available dermatology AI-based SaMD mobile applications to determine usage of these labels. Of the 21 AI-based SaMD mobile applications identified, none fully comply with common labelling recommendations. Results highlight the need for standardized labelling guidelines. Ensuring transparency and accessibility of information is essential for the safe integration of AI into health care and preventing potential risks associated with inaccurate clinical decisions.
Affiliation(s)
- Åsa Ingvar
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
  - Victorian Melanoma Service, Alfred Health, Melbourne, Victoria, Australia
  - Department of Dermatology, Skåne University Hospital, Lund University, Lund, Sweden
  - Department of Clinical Sciences, Skåne University Hospital, Lund University, Lund, Sweden
- Maithili Sashindranath
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
- Ojochonu Anthony
  - Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Victoria, Australia
- Lisa Abbott
  - Melanoma Institute Australia, The University of Sydney, Sydney, New South Wales, Australia
- Pascale Guitera
  - Faculty of Medicine and Health, The University of Sydney, Sydney, New South Wales, Australia
  - Sydney Melanoma Diagnostic Centre, Royal Prince Alfred Hospital, Camperdown, New South Wales, Australia
  - Perth Dermatology Clinic, Perth, Western Australia, Australia
- Tony Caccetta
  - Perth Dermatology Clinic, Perth, Western Australia, Australia
- Monika Janda
  - Dermatology Research Centre, Frazer Institute, The University of Queensland, Brisbane, Queensland, Australia
- H Peter Soyer
  - Dermatology Research Centre, Frazer Institute, The University of Queensland, Brisbane, Queensland, Australia
- Victoria Mar
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
  - Victorian Melanoma Service, Alfred Health, Melbourne, Victoria, Australia
5
Gondal MN, Shah SUR, Chinnaiyan AM, Cieslik M. A Systematic Overview of Single-Cell Transcriptomics Databases, their Use cases, and Limitations. arXiv 2024:arXiv:2404.10545v1. [PMID: 38699169 PMCID: PMC11065044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]
Abstract
Rapid advancements in high-throughput single-cell RNA-seq (scRNA-seq) technologies and experimental protocols have led to the generation of vast amounts of genomic data that populates several online databases and repositories. Here, we systematically examined large-scale scRNA-seq databases, categorizing them based on their scope and purpose such as general, tissue-specific databases, disease-specific databases, cancer-focused databases, and cell type-focused databases. Next, we discuss the technical and methodological challenges associated with curating large-scale scRNA-seq databases, along with current computational solutions. We argue that understanding scRNA-seq databases, including their limitations and assumptions, is crucial for effectively utilizing this data to make robust discoveries and identify novel biological insights. Furthermore, we propose that bridging the gap between computational and wet lab scientists through user-friendly web-based platforms is needed for democratizing access to single-cell data. These platforms would facilitate interdisciplinary research, enabling researchers from various disciplines to collaborate effectively. This review underscores the importance of leveraging computational approaches to unravel the complexities of single-cell data and offers a promising direction for future research in the field.
Affiliation(s)
- Mahnoor N. Gondal
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI USA
  - Michigan Center for Translational Pathology, University of Michigan, Ann Arbor, MI USA
- Saad Ur Rehman Shah
  - Gies College of Business, University of Illinois Business College, Champaign, IL USA
- Arul M. Chinnaiyan
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI USA
  - Michigan Center for Translational Pathology, University of Michigan, Ann Arbor, MI USA
  - Department of Pathology, University of Michigan, Ann Arbor, MI USA
  - Department of Urology, University of Michigan, Ann Arbor, MI USA
  - Howard Hughes Medical Institute, Ann Arbor, MI USA
  - University of Michigan Rogel Cancer Center, Ann Arbor, MI USA
- Marcin Cieslik
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI USA
  - Michigan Center for Translational Pathology, University of Michigan, Ann Arbor, MI USA
  - Department of Pathology, University of Michigan, Ann Arbor, MI USA
  - University of Michigan Rogel Cancer Center, Ann Arbor, MI USA
6
Ma Q, Shen Y, Guo W, Feng K, Huang T, Cai Y. Machine Learning Reveals Impacts of Smoking on Gene Profiles of Different Cell Types in Lung. Life (Basel) 2024; 14:502. [PMID: 38672772 PMCID: PMC11051039 DOI: 10.3390/life14040502] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 04/03/2024] [Accepted: 04/10/2024] [Indexed: 04/28/2024] [Open Access]
Abstract
Smoking significantly elevates the risk of lung diseases such as chronic obstructive pulmonary disease (COPD) and lung cancer. This risk is attributed to the harmful chemicals in tobacco smoke that damage lung tissue and impair lung function. Current research on the impact of smoking on gene expression in specific lung cells is limited. This study addresses this gap by analyzing gene expression profiles at the single-cell level from 43,539 lung endothelial cells, 234,349 lung epithelial cells, 189,843 lung immune cells, and 16,031 lung stromal cells using advanced machine learning techniques. The data, categorized by different lung cell types, were classified into three smoking states: active smoker, former smoker, and never smoker. Each cell sample encompassed 28,024 feature genes. Employing an incremental feature selection method within a computational framework, several specific genes have been identified as potential markers of smoking status in different lung cell types. These include B2M, EEF1A1, and TPT1 in lung endothelial cells; FTL and MT-ATP8 in lung epithelial cells; HLA-B and HLA-C in lung immune cells; and HSP90B1 and LCN2 in lung stroma cells. Additionally, this study developed quantitative rules for representing the gene expression patterns related to smoking. This research highlights the potential of machine learning in oncology, enhancing our molecular understanding of smoking's harm and laying the groundwork for future mechanism-based studies.
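As a rough illustration of the incremental feature selection idea mentioned in this abstract, the sketch below walks down a pre-ranked feature list, evaluates a classifier on the top-k features for growing k, and keeps the smallest k with the best held-out accuracy. It is not the authors' pipeline: the nearest-centroid classifier, the toy data, the feature ranking, and every function name are assumptions made for illustration only.

```python
def nearest_centroid_accuracy(train_X, train_y, test_X, test_y, features):
    """Held-out accuracy of a nearest-centroid classifier using only `features`."""
    sums, counts = {}, {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, f in enumerate(features):
            acc[j] += row[f]
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

    def squared_distance(row, label):
        return sum((row[f] - centroids[label][j]) ** 2 for j, f in enumerate(features))

    correct = 0
    for row, label in zip(test_X, test_y):
        predicted = min(centroids, key=lambda lab: squared_distance(row, lab))
        correct += predicted == label
    return correct / len(test_y)


def incremental_feature_selection(ranked, train_X, train_y, test_X, test_y):
    """Return (best feature prefix, accuracy for every prefix length)."""
    best_k, best_acc, curve = 0, -1.0, []
    for k in range(1, len(ranked) + 1):
        acc = nearest_centroid_accuracy(train_X, train_y, test_X, test_y, ranked[:k])
        curve.append(acc)
        if acc > best_acc:  # strict '>' keeps the smallest best-performing prefix
            best_k, best_acc = k, acc
    return ranked[:best_k], curve


# Hypothetical toy data: columns are three made-up features, rows are cells.
# Feature 0 separates the classes, feature 1 is constant, and feature 2 carries
# a training-set signal that does not transfer to the held-out cells.
train_X = [[5.0, 1.0, 0.0], [6.0, 1.0, 0.0], [1.0, 1.0, 10.0], [0.0, 1.0, 10.0]]
train_y = ["smoker", "smoker", "never", "never"]
test_X = [[5.0, 1.0, 10.0], [1.0, 1.0, 0.0]]
test_y = ["smoker", "never"]
ranked = [0, 1, 2]  # assumed output of some upstream feature-ranking step

selected, curve = incremental_feature_selection(ranked, train_X, train_y, test_X, test_y)
print(selected, curve)
```

In practice the ranking and the classifier would come from whatever methods a study actually uses, and accuracy would normally be estimated by cross-validation rather than a single split.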
Affiliation(s)
- Qinglan Ma
  - School of Life Sciences, Shanghai University, Shanghai 200444, China
- Yulong Shen
  - Department of Radiotherapy, Strategic Support Force Medical Center, Beijing 100101, China
- Wei Guo
  - Key Laboratory of Stem Cell Biology, Shanghai Jiao Tong University School of Medicine (SJTUSM) & Shanghai Institutes for Biological Sciences (SIBS), Chinese Academy of Sciences (CAS), Shanghai 200030, China
- Kaiyan Feng
  - Department of Computer Science, Guangdong AIB Polytechnic College, Guangzhou 510507, China
- Tao Huang
  - Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
  - CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Yudong Cai
  - School of Life Sciences, Shanghai University, Shanghai 200444, China
7
Waheed I, Ali A, Tabassum H, Khatoon N, Lai WF, Zhou X. Lipid-based nanoparticles as drug delivery carriers for cancer therapy. Front Oncol 2024; 14:1296091. [PMID: 38660132 PMCID: PMC11040677 DOI: 10.3389/fonc.2024.1296091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Accepted: 02/22/2024] [Indexed: 04/26/2024] [Open Access]
Abstract
Cancer is a severe disease that causes death in every country of the world. A nano-based drug delivery approach is the best alternative, directly targeting tumor cells with improved cellular drug uptake. Different types of nanoparticle-based drug carriers have been advanced for the treatment of cancer, and many substances have been investigated as drug carriers to increase the therapeutic effectiveness and safety of cancer therapy. Lipid-based nanoparticles (LBNPs) have attracted significant interest recently. These natural biomolecules, an alternative to other polymers, are frequently used in medicine due to their amphipathic properties. Lipid nanoparticles typically provide a variety of benefits, including biocompatibility and biodegradability. This review covers different classes of LBNPs, including their characterization and different synthesis technologies. It discusses the most significant advancements in lipid nanoparticle technology and their use in drug administration. Moreover, the review emphasizes the applications of lipid nanoparticles in different types of cancer treatment.
Affiliation(s)
- Ibtesam Waheed
  - Institute of Comparative Medicine, College of Veterinary Medicine, Yangzhou University, Yangzhou, China
- Anwar Ali
  - Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Kowloon, Hong Kong SAR, China
  - Department of Biochemical and Biotechnological Sciences, School of Precision Medicine, University of Campania, Naples, Italy
- Huma Tabassum
  - Institute of Social and Cultural Studies, Department of Public Health, University of the Punjab, Lahore, Pakistan
- Narjis Khatoon
  - Department of Biotechnology, Lahore College for Women University, Lahore, Pakistan
- Wing-Fu Lai
  - Department of Applied Biology and Chemical Technology, Hong Kong Polytechnic University, Kowloon, Hong Kong SAR, China
  - School of Food Science and Nutrition, University of Leeds, Leeds, United Kingdom
- Xin Zhou
  - Institute of Comparative Medicine, College of Veterinary Medicine, Yangzhou University, Yangzhou, China
8
Wang Q, Song JJ, Zhang F. Feature-weight based measurement of cancerous transcriptome using cohort-wide and sample-specific information. Cell Oncol (Dordr) 2024; 47:711-715. [PMID: 37814075 DOI: 10.1007/s13402-023-00879-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/18/2023] [Indexed: 10/11/2023] [Open Access]
Abstract
Identifying cancerous samples or cells using transcriptomic data is critical for cancer-related basic research, early diagnosis, and targeted therapy. However, the high transcriptional heterogeneity of cancers still hinders accurate recognition of the cancerous transcriptome using bulk, single-cell, or spatial RNA-seq data. Here, we present a novel method named FWP (Feature Weight Pro) that helps measure the cancerous transcriptome using transcriptomic data. The workflow of FWP is, first, to calculate feature weights using the training dataset, and then, for each sample in the testing dataset, to calculate the feature-weight based final score by combining the cohort-wide and sample-specific information. Those two types of information are utilized through conducting weighted principal component analysis and calculating correlation perturbations. The effectiveness and superiority of FWP over other methods are shown by using bulk, single-cell, and spatial RNA-seq data of multiple cancer types. In addition, the high robustness and efficiency of FWP are also demonstrated by using different numbers of features and cells, respectively. FWP is available at https://github.com/jumphone/fwp.
Affiliation(s)
- Qilu Wang
  - School of Computer Science and Technology, Xinjiang University, Urumqi, 830017, China
- Jiaoyang Jessie Song
  - Division of Arts and Sciences, New York University Shanghai, Shanghai, 200124, China
- Feng Zhang
  - Department of Histoembryology, Genetics and Developmental Biology, Shanghai Key Laboratory of Reproductive Medicine, Key Laboratory of Cell Differentiation and Apoptosis of Chinese Ministry of Education, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
9
Ren Z, Ren Y, Liu P, Xu H. Cytokine expression patterns: A single-cell RNA sequencing and machine learning based roadmap for cancer classification. Comput Biol Chem 2024; 109:108025. [PMID: 38335854 DOI: 10.1016/j.compbiolchem.2024.108025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 12/22/2023] [Accepted: 02/02/2024] [Indexed: 02/12/2024]
Abstract
Cytokines are small protein molecules with potent immunoregulatory properties and are known to be essential components of the tumor immune microenvironment (TIME). While some cytokines are known to be universally upregulated in TIME, the unique cytokine expression patterns have not been fully resolved in specific types of cancers. To address this challenge, we develop a TIME single-cell RNA sequencing (scRNA-seq) dataset, which is designed to study cytokine expression patterns for precise cancer classification. The dataset, including 39 cancers, is constructed by integrating 684 tumor scRNA-seq samples from multiple public repositories. After screening and processing, the dataset retains only the expression data of immune cells. With a machine learning classification model, unique cytokine expression patterns are identified for various cancer categories and applied for the first time to cancer classification, with an accuracy of 78.01%. Our method will not only boost the understanding of cancer-type-specific immune modulations in TIME but also serve as a crucial reference for future diagnostic and therapeutic research in cancer immunity.
Affiliation(s)
- Zhixiang Ren
  - Peng Cheng Laboratory, Shenzhen, Guangdong Province 518055, China
- Yiming Ren
  - Peng Cheng Laboratory, Shenzhen, Guangdong Province 518055, China
- Pengfei Liu
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, Guangdong Province 528406, China
- Huan Xu
  - School of Public Health, Anhui University of Science and Technology, Hefei, Anhui Province 231131, China
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism and Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
10
Zhong Z, Hou J, Yao Z, Dong L, Liu F, Yue J, Wu T, Zheng J, Ouyang G, Yang C, Song J. Domain generalization enables general cancer cell annotation in single-cell and spatial transcriptomics. Nat Commun 2024; 15:1929. [PMID: 38431724 PMCID: PMC10908802 DOI: 10.1038/s41467-024-46413-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Accepted: 02/09/2024] [Indexed: 03/05/2024] [Open Access]
Abstract
Single-cell and spatial transcriptome sequencing, two recently optimized transcriptome sequencing methods, are increasingly used to study cancer and related diseases. Cell annotation, particularly for malignant cell annotation, is essential and crucial for in-depth analyses in these studies. However, current algorithms lack accuracy and generalization, making it difficult to consistently and rapidly infer malignant cells from pan-cancer data. To address this issue, we present Cancer-Finder, a domain generalization-based deep-learning algorithm that can rapidly identify malignant cells in single-cell data with an average accuracy of 95.16%. More importantly, by replacing the single-cell training data with spatial transcriptomic datasets, Cancer-Finder can accurately identify malignant spots on spatial slides. Applying Cancer-Finder to 5 clear cell renal cell carcinoma spatial transcriptomic samples, Cancer-Finder demonstrates a good ability to identify malignant spots and identifies a gene signature consisting of 10 genes that are significantly co-localized and enriched at the tumor-normal interface and have a strong correlation with the prognosis of clear cell renal cell carcinoma patients. In conclusion, Cancer-Finder is an efficient and extensible tool for malignant cell annotation.
Affiliation(s)
- Zhixing Zhong
  - Institute of Artificial Intelligence, Department of Chemical Biology, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361102, China
  - Institute of Molecular Medicine, Department of Urology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China
- Junchen Hou
  - School of Pharmaceutical Sciences, State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Xiamen University, Xiamen, 361102, China
- Zhixian Yao
  - Institute of Molecular Medicine, Department of Urology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China
- Lei Dong
  - Department of Pathology, Ruijin Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200025, China
- Feng Liu
  - School of Computing and Information Systems, The University of Melbourne, Carlton, Melbourne, VIC, 3053, Australia
- Junqiu Yue
  - Department of Pathology, Hubei Cancer Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China
- Tiantian Wu
  - School of Pharmaceutical Sciences, State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Xiamen University, Xiamen, 361102, China
- Junhua Zheng
  - Institute of Molecular Medicine, Department of Urology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China
- Gaoliang Ouyang
  - School of Pharmaceutical Sciences, State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Xiamen University, Xiamen, 361102, China
- Chaoyong Yang
  - Institute of Artificial Intelligence, Department of Chemical Biology, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361102, China
  - Institute of Molecular Medicine, Department of Urology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China
  - Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen, 361005, China
- Jia Song
  - Institute of Molecular Medicine, Department of Urology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China
11
Maan H, Zhang L, Yu C, Geuenich MJ, Campbell KR, Wang B. Characterizing the impacts of dataset imbalance on single-cell data integration. Nat Biotechnol 2024:10.1038/s41587-023-02097-9. [PMID: 38429430 DOI: 10.1038/s41587-023-02097-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Accepted: 12/13/2023] [Indexed: 03/03/2024]
Abstract
Computational methods for integrating single-cell transcriptomic data from multiple samples and conditions do not generally account for imbalances in the cell types measured in different datasets. In this study, we examined how differences in the cell types present, the number of cells per cell type and the cell type proportions across samples affect downstream analyses after integration. The Iniquitate pipeline assesses the robustness of integration results after perturbing the degree of imbalance between datasets. Benchmarking of five state-of-the-art single-cell RNA sequencing integration techniques in 2,600 integration experiments indicates that sample imbalance has substantial impacts on downstream analyses and the biological interpretation of integration results. Imbalance perturbation led to statistically significant variation in unsupervised clustering, cell type classification, differential expression and marker gene annotation, query-to-reference mapping and trajectory inference. We quantified the impacts of imbalance through two newly introduced properties: aggregate cell type support and minimum cell type center distance. To better characterize and mitigate impacts of imbalance, we introduce balanced clustering metrics and imbalanced integration guidelines for integration method users.
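The imbalance perturbations described in this abstract (changing which cell types are present and how many cells each contributes) can be illustrated with a short Python sketch. This is not the Iniquitate pipeline; the function name `perturb_cell_type` and its parameters are hypothetical and chosen only for illustration. It downsamples (or, with a fraction of zero, ablates) one cell type in one batch and returns the indices of the cells that remain.

```python
import numpy as np

def perturb_cell_type(batch, cell_type, target_batch, target_type,
                      keep_fraction, seed=0):
    """Return sorted indices of cells kept after downsampling one cell type
    in one batch. keep_fraction=0.0 removes (ablates) that cell type from
    the batch; keep_fraction=1.0 leaves the data unchanged.

    Hypothetical helper for illustration, not part of any published tool.
    """
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must lie in [0, 1]")
    batch = np.asarray(batch)
    cell_type = np.asarray(cell_type)
    rng = np.random.default_rng(seed)

    # Cells that belong to the batch / cell type combination being perturbed.
    in_target = (batch == target_batch) & (cell_type == target_type)
    target_idx = np.flatnonzero(in_target)

    # Randomly keep the requested fraction of the targeted cells.
    n_keep = int(round(keep_fraction * target_idx.size))
    kept_target = rng.permutation(target_idx)[:n_keep]

    # All untouched cells are kept as they are.
    keep = np.concatenate([np.flatnonzero(~in_target), kept_target])
    return np.sort(keep)
```

Running an integration method on the original cells and again on the cells selected by the returned indices, then comparing downstream results, would be one way to probe sensitivity to imbalance under these assumptions.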
Affiliation(s)
- Hassaan Maan
- Peter Munk Cardiac Centre, University Health Network, Toronto, Ontario, Canada.
- Vector Institute, Toronto, Ontario, Canada.
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada.
| | - Lin Zhang
- Peter Munk Cardiac Centre, University Health Network, Toronto, Ontario, Canada
- Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, British Columbia, Canada
| | - Chengxin Yu
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Lunenfeld-Tanenbaum Research Institute, Toronto, Ontario, Canada
| | - Michael J Geuenich
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Lunenfeld-Tanenbaum Research Institute, Toronto, Ontario, Canada
| | - Kieran R Campbell
- Vector Institute, Toronto, Ontario, Canada.
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.
- Lunenfeld-Tanenbaum Research Institute, Toronto, Ontario, Canada.
- Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada.
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada.
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada.
| | - Bo Wang
- Peter Munk Cardiac Centre, University Health Network, Toronto, Ontario, Canada.
- Vector Institute, Toronto, Ontario, Canada.
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada.
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada.
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario, Canada.
|
12
|
Mehrotra S, Sharma S, Pandey RK. A journey from omics to clinicomics in solid cancers: Success stories and challenges. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2024; 139:89-139. [PMID: 38448145 DOI: 10.1016/bs.apcsb.2023.11.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2024]
Abstract
The word 'cancer' encompasses a heterogeneous group of distinct disease types characterized by a spectrum of pathological features, genetic alterations and response to therapies. According to the World Health Organization, cancer is the second leading cause of death worldwide, responsible for one in six deaths, and hence imposes a significant burden on global healthcare systems. High-throughput omics technologies, combined with advanced imaging tools, have revolutionized our ability to interrogate the molecular landscape of tumors and have provided an unprecedented understanding of the disease. Yet, there is a gap between basic research discoveries and their translation into clinically meaningful therapies for improving patient care. To bridge this gap, there is a need to analyse the vast amounts of high-dimensional data from multi-omics platforms. The integration of multi-omics data with clinical information like patient history, histological examination and imaging has led to the novel concept of clinicomics and may expedite the bench-to-bedside transition in cancer. The journey from omics to clinicomics has gained momentum with the development of radiomics, which involves extracting quantitative features from medical imaging data with the help of deep learning and artificial intelligence (AI) tools. These features capture detailed information about the tumor's shape, texture, intensity, and spatial distribution. Together, the related fields of multiomics, translational bioinformatics, radiomics and clinicomics may provide evidence-based recommendations tailored to the individual cancer patient's molecular profile and clinical characteristics. In this chapter, we summarize multiomics studies in solid cancers with a specific focus on breast cancer. We also review machine learning and AI-based algorithms and their use in cancer diagnosis, subtyping, prognosis and predicting treatment resistance and relapse.
|
13
|
Tan JK, Awuah WA, Roy S, Ferreira T, Ahluwalia A, Guggilapu S, Javed M, Asyura MMAZ, Adebusoye FT, Ramamoorthy K, Paoletti E, Abdul-Rahman T, Prykhodko O, Ovechkin D. Exploring the advances of single-cell RNA sequencing in thyroid cancer: a narrative review. Med Oncol 2023; 41:27. [PMID: 38129369 PMCID: PMC10739406 DOI: 10.1007/s12032-023-02260-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Accepted: 11/16/2023] [Indexed: 12/23/2023]
Abstract
Thyroid cancer, a prevalent form of endocrine malignancy, has witnessed a substantial increase in occurrence in recent decades. To gain a comprehensive understanding of thyroid cancer at the single-cell level, this narrative review evaluates the applications of single-cell RNA sequencing (scRNA-seq) in thyroid cancer research. ScRNA-seq has revolutionised the identification and characterisation of distinct cell subpopulations, cell-to-cell communications, and receptor interactions, revealing unprecedented heterogeneity and shedding light on novel biomarkers for therapeutic discovery. These findings aid in the construction of predictive models of disease prognosis and therapeutic efficacy. Altogether, scRNA-seq has deepened our understanding of the immunology of the tumour microenvironment, informing future studies in the development of effective personalised treatment for patients. Challenges and limitations of scRNA-seq, such as technical biases, financial barriers, and ethical concerns, are discussed. Advancements in computational methods, the advent of artificial intelligence (AI), machine learning (ML), and deep learning (DL), and the importance of single-cell data sharing and collaborative efforts are highlighted. Future directions of scRNA-seq in thyroid cancer research include investigating intra-tumoral heterogeneity, integrating with other omics technologies, exploring the non-coding RNA landscape, and studying rare subtypes. Overall, scRNA-seq has transformed thyroid cancer research and holds immense potential for advancing personalised therapies and improving patient outcomes. Efforts to make this technology more accessible and cost-effective will be crucial to ensuring its widespread utilisation in healthcare.
Affiliation(s)
| | | | - Sakshi Roy
- School of Medicine, Queen's University Belfast, Belfast, UK
| | - Tomas Ferreira
- School of Clinical Medicine, University of Cambridge, Cambridge, UK
| | | | - Saibaba Guggilapu
- Faculty of Medicine, Bangalore Medical College and Research Institute, Bengaluru, India
| | - Mahnoor Javed
- School of Medicine, The University of Nottingham, Nottingham, NG7 2UH, UK
| | | | | | | | - Emma Paoletti
- Faculty of Medicine, University of Manchester, Manchester, M13 9WJ, UK
| | | | - Olha Prykhodko
- Faculty of Medicine, Sumy State University, Sumy, Ukraine
| | - Denys Ovechkin
- Faculty of Medicine, Sumy State University, Sumy, Ukraine
|
14
|
Basher ARMA, Hallinan C, Lee K. Heterogeneity-Preserving Discriminative Feature Selection for Subtype Discovery. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.05.14.540686. [PMID: 38187596 PMCID: PMC10769187 DOI: 10.1101/2023.05.14.540686] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
The discovery of subtypes is pivotal for disease diagnosis and targeted therapy, considering the diverse responses of different cells or patients to specific treatments. Exploring the heterogeneity within disease or cell states provides insights into disease progression mechanisms and cell differentiation. The advent of high-throughput technologies has enabled the generation and analysis of various molecular data types, such as single-cell RNA-seq, proteomic, and imaging datasets, at large scales. While presenting opportunities for subtype discovery, these datasets pose challenges in finding relevant signatures due to their high dimensionality. Feature selection, a crucial step in the analysis pipeline, involves choosing signatures that reduce the feature size for more efficient downstream computational analysis. Numerous existing methods focus on selecting signatures that differentiate known diseases or cell states, yet they often fall short in identifying features that preserve heterogeneity and reveal subtypes. To identify features that can capture the diversity within each class while also maintaining the discrimination of known disease states, we employed deep metric learning-based feature embedding to conduct a detailed exploration of the statistical properties of features essential in preserving heterogeneity. Our analysis revealed that features with a significant difference in interquartile range (IQR) between classes possess crucial subtype information. Guided by this insight, we developed a robust statistical method, termed PHet (Preserving Heterogeneity) that performs iterative subsampling differential analysis of IQR and Fisher's method between classes, identifying a minimal set of heterogeneity-preserving discriminative features to optimize subtype clustering quality. 
Validation using public single-cell RNA-seq and microarray datasets showcased PHet's effectiveness in preserving sample heterogeneity while maintaining discrimination of known disease/cell states, surpassing the performance of previous outlier-based methods. Furthermore, analysis of a single-cell RNA-seq dataset from mouse tracheal epithelial cells revealed, through PHet-based features, the presence of two distinct basal cell subtypes undergoing differentiation toward a luminal secretory phenotype. Notably, one of these subtypes exhibited high expression of BPIFA1. Interestingly, previous studies have linked BPIFA1 secretion to the emergence of secretory cells during mucociliary differentiation of airway epithelial cells. PHet successfully pinpointed the basal cell subtype associated with this phenomenon, a distinction that pre-annotated markers and dispersion-based features failed to make due to their admixed feature expression profiles. These findings underscore the potential of our method to deepen our understanding of the mechanisms underlying diseases and cell differentiation and contribute significantly to personalized medicine.
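The central observation of this abstract, that a feature whose interquartile range (IQR) differs between classes may carry subtype information, can be sketched in a few lines of Python. This is an illustration of the IQR-difference idea only, not the PHet method: the iterative subsampling and Fisher's method steps are not reproduced, and the function names are hypothetical.

```python
import numpy as np

def iqr(values):
    """Per-feature interquartile range of a (samples x features) array."""
    q75, q25 = np.percentile(values, [75, 25], axis=0)
    return q75 - q25

def iqr_difference(X, labels):
    """Absolute difference in per-feature IQR between two classes.

    X      : array of shape (n_samples, n_features)
    labels : length n_samples, containing exactly two distinct class labels
    Hypothetical helper for illustration; larger scores flag features whose
    spread, rather than location, differs between the classes.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    if classes.size != 2:
        raise ValueError("exactly two classes are expected")
    first = X[labels == classes[0]]
    second = X[labels == classes[1]]
    return np.abs(iqr(first) - iqr(second))
```

A feature that is merely shifted between classes (same spread, different mean) scores zero here, whereas a feature that is much more dispersed in one class scores highly. That contrast is why a spread-based score can complement a conventional differential-expression test.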
Affiliation(s)
- Abdur Rahman M. A. Basher
- Vascular Biology Program, Boston Children’s Hospital, Boston, MA 02115, USA
- Department of Surgery, Harvard Medical School, Boston, MA 02115, USA
| | - Caleb Hallinan
- Vascular Biology Program, Boston Children’s Hospital, Boston, MA 02115, USA
| | - Kwonmoo Lee
- Vascular Biology Program, Boston Children’s Hospital, Boston, MA 02115, USA
- Department of Surgery, Harvard Medical School, Boston, MA 02115, USA
|
15
|
Paas-Oliveros E, Hernández-Lemus E, de Anda-Jáuregui G. Computational single cell oncology: state of the art. Front Genet 2023; 14:1256991. [PMID: 38028624 PMCID: PMC10663273 DOI: 10.3389/fgene.2023.1256991] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/11/2023] [Accepted: 10/24/2023] [Indexed: 12/01/2023] Open
Abstract
Single cell computational analysis has emerged as a powerful tool in the field of oncology, enabling researchers to decipher the complex cellular heterogeneity that characterizes cancer. By leveraging computational algorithms and bioinformatics approaches, this methodology provides insights into the underlying genetic, epigenetic and transcriptomic variations among individual cancer cells. In this paper, we present a comprehensive overview of single cell computational analysis in oncology, discussing the key computational techniques employed for data processing, analysis, and interpretation. We explore the challenges associated with single cell data, including data quality control, normalization, dimensionality reduction, clustering, and trajectory inference. Furthermore, we highlight the applications of single cell computational analysis, including the identification of novel cell states, the characterization of tumor subtypes, the discovery of biomarkers, and the prediction of therapy response. Finally, we address the future directions and potential advancements in the field, including the development of machine learning and deep learning approaches for single cell analysis. Overall, this paper aims to provide a roadmap for researchers interested in leveraging computational methods to unlock the full potential of single cell analysis in understanding cancer biology with the goal of advancing precision oncology. For this purpose, we also include a notebook that instructs on how to apply the recommended tools in the Preprocessing and Quality Control section.
Affiliation(s)
- Ernesto Paas-Oliveros
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
| | - Enrique Hernández-Lemus
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
- Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
| | - Guillermo de Anda-Jáuregui
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
- Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Investigadores por Mexico, Conahcyt, Mexico City, Mexico
|
16
|
Aran D. Single-Cell RNA Sequencing for Studying Human Cancers. Annu Rev Biomed Data Sci 2023; 6:1-22. [PMID: 37040737 DOI: 10.1146/annurev-biodatasci-020722-091857] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/13/2023]
Abstract
Since the first publication a decade ago describing the use of single-cell RNA sequencing (scRNA-seq) in the context of cancer, over 200 datasets and thousands of scRNA-seq studies have been published in cancer biology. scRNA-seq technologies have been applied across dozens of cancer types and a diverse array of study designs to improve our understanding of tumor biology, the tumor microenvironment, and therapeutic responses, and scRNA-seq is on the verge of being used to improve decision-making in the clinic. Computational methodologies and analytical pipelines are key in facilitating scRNA-seq research. Numerous computational methods utilizing the most advanced tools in data science have been developed to extract meaningful insights. Here, we review the advancements in cancer biology gained by scRNA-seq and discuss the computational challenges of the technology that are specific to cancer research.
Affiliation(s)
- Dvir Aran
- Faculty of Biology, The Taub Faculty of Computer Science, and Lorry I. Lokey Interdisciplinary Center for Life Sciences and Engineering, Technion-Israel Institute of Technology, Haifa, Israel
|
17
|
Speranza E. Understanding virus-host interactions in tissues. Nat Microbiol 2023; 8:1397-1407. [PMID: 37488255 DOI: 10.1038/s41564-023-01434-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/15/2022] [Accepted: 06/20/2023] [Indexed: 07/26/2023]
Abstract
Although virus-host interactions are usually studied in a single cell type using in vitro assays in immortalized cell lines or isolated cell populations, it is important to remember that what is happening inside one infected cell does not translate to understanding how an infected cell behaves in a tissue, organ or whole organism. Infections occur in complex tissue environments, which contain a host of factors that can alter the course of the infection, including immune cells, non-immune cells and extracellular-matrix components. These factors affect how the host responds to the virus and form the basis of the protective response. To understand virus infection, tools are needed that can profile the tissue environment. This Review highlights methods to study virus-host interactions in the infection microenvironment.
Affiliation(s)
- Emily Speranza
- Cleveland Clinic Lerner Research Institute, Port Saint Lucie, FL, USA.
|
18
|
Yang T, Yan Q, Long R, Liu Z, Wang X. PreCanCell: An ensemble learning algorithm for predicting cancer and non-cancer cells from single-cell transcriptomes. Comput Struct Biotechnol J 2023; 21:3604-3614. [PMID: 37501705 PMCID: PMC10371765 DOI: 10.1016/j.csbj.2023.07.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/24/2023] [Revised: 06/25/2023] [Accepted: 07/08/2023] [Indexed: 07/29/2023] Open
Abstract
We propose PreCanCell, a novel algorithm for predicting malignant and non-malignant cells from single-cell transcriptomes. PreCanCell first identifies the genes that are differentially expressed (DEGs) between malignant and non-malignant cells in all five of a set of single-cell transcriptome datasets, each associated with a common cancer type. The five cancer types are renal cell carcinoma (RCC), head and neck squamous cell carcinoma (HNSCC), melanoma, lung adenocarcinoma (LUAD), and breast cancer (BC). With each of the five datasets as the training set and the DEGs as the features, a single cell is classified as malignant or non-malignant by k-NN (k = 5). Finally, the single cell is determined as malignant or non-malignant by the majority vote of the five k-NN classification results. We tested the predictive performance of PreCanCell in 19 single-cell datasets, and reported classification accuracy, sensitivity, specificity, balanced accuracy (the average of sensitivity and specificity) and the area under the receiver operating characteristic curve (AUROC). In all these datasets, PreCanCell achieved accuracy, sensitivity, specificity, balanced accuracy and AUROC above 0.8. Finally, we compared the predictive performance of PreCanCell with that of seven other algorithms: CHETAH, SciBet, SCINA, scmap-cell, scmap-cluster, SingleR, and ikarus. Compared to these algorithms, PreCanCell displays the advantages of higher accuracy and simpler implementation. We have developed an R package for the PreCanCell algorithm, which is available at https://github.com/WangX-Lab/PreCanCell.
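The two-level voting scheme described in this abstract (a k-NN vote with k = 5 inside each training dataset, followed by a majority vote across the five datasets) and the balanced accuracy it reports can be sketched in Python. This is a minimal illustration under stated assumptions and not the PreCanCell R package: feature selection (the DEGs) is assumed to have been done already, Euclidean distance is an assumption, labels are coded 1 for malignant and 0 for non-malignant, and all function names are hypothetical.

```python
import numpy as np

def knn_predict(train_X, train_y, query_X, k=5):
    """Label each query cell by majority vote among its k nearest
    training cells (Euclidean distance is assumed here)."""
    train_X = np.asarray(train_X, dtype=float)
    train_y = np.asarray(train_y)
    query_X = np.asarray(query_X, dtype=float)
    # Pairwise distances, shape (n_query, n_train).
    dists = np.linalg.norm(query_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest]                      # shape (n_query, k)
    # Malignant (1) if more than half of the k neighbours are malignant.
    return (votes.sum(axis=1) * 2 > k).astype(int)

def ensemble_predict(training_sets, query_X, k=5):
    """Majority vote across one k-NN classifier per training dataset.

    training_sets: list of (X, y) pairs sharing the same feature columns.
    """
    preds = np.stack([knn_predict(X, y, query_X, k) for X, y in training_sets])
    return (preds.sum(axis=0) * 2 > len(training_sets)).astype(int)

def balanced_accuracy(y_true, y_pred):
    """Average of sensitivity and specificity, as defined in the abstract."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    sensitivity = np.mean(y_pred[y_true == 1] == 1)
    specificity = np.mean(y_pred[y_true == 0] == 0)
    return (sensitivity + specificity) / 2
```

With an odd k and an odd number of training datasets, neither vote can tie, which is one practical reason to choose five in both places.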
Affiliation(s)
- Tao Yang
- Biomedical Informatics Research Lab, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing 211198, China
- Cancer Genomics Research Center, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing 211198, China
- Big Data Research Institute, China Pharmaceutical University, Nanjing 211198, China
| | - Qiyu Yan
- Biomedical Informatics Research Lab, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing 211198, China
- Cancer Genomics Research Center, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing 211198, China
- Big Data Research Institute, China Pharmaceutical University, Nanjing 211198, China
| | - Rongzhuo Long
- Biomedical Informatics Research Lab, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing 211198, China
- Cancer Genomics Research Center, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing 211198, China
- Big Data Research Institute, China Pharmaceutical University, Nanjing 211198, China
| | - Zhixian Liu
- Jiangsu Cancer Hospital, Jiangsu Institute of Cancer Research, The Affiliated Cancer Hospital of Nanjing Medical University, Nanjing, Jiangsu Province, China
| | - Xiaosheng Wang
- Biomedical Informatics Research Lab, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing 211198, China
- Cancer Genomics Research Center, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing 211198, China
- Big Data Research Institute, China Pharmaceutical University, Nanjing 211198, China
|
19
|
Li T, Li Y, Zhu X, He Y, Wu Y, Ying T, Xie Z. Artificial intelligence in cancer immunotherapy: Applications in neoantigen recognition, antibody design and immunotherapy response prediction. Semin Cancer Biol 2023; 91:50-69. [PMID: 36870459 DOI: 10.1016/j.semcancer.2023.02.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/10/2022] [Revised: 02/13/2023] [Accepted: 02/28/2023] [Indexed: 03/06/2023]
Abstract
Cancer immunotherapy is a method of controlling and eliminating tumors by reactivating the body's cancer-immunity cycle and restoring its antitumor immune response. The increased availability of data, combined with advancements in high-performance computing and innovative artificial intelligence (AI) technology, has resulted in a rise in the use of AI in oncology research. State-of-the-art AI models for functional classification and prediction in immunotherapy research are increasingly used to support laboratory-based experiments. This review offers a glimpse of the current AI applications in immunotherapy, including neoantigen recognition, antibody design, and prediction of immunotherapy response. Advancing in this direction will result in more robust predictive models for developing better targets, drugs, and treatments, and these advancements will eventually make their way into the clinical setting, pushing AI forward in the field of precision oncology.
Affiliation(s)
- Tong Li
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Yupeng Li
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Xiaoyi Zhu
- MOE/NHC Key Laboratory of Medical Molecular Virology, Shanghai Institute of Infectious Disease and Biosecurity, School of Basic Medical Sciences, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Engineering Research Center for Synthetic Immunology, Shanghai, China
| | - Yao He
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Yanling Wu
- MOE/NHC Key Laboratory of Medical Molecular Virology, Shanghai Institute of Infectious Disease and Biosecurity, School of Basic Medical Sciences, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Engineering Research Center for Synthetic Immunology, Shanghai, China
| | - Tianlei Ying
- MOE/NHC Key Laboratory of Medical Molecular Virology, Shanghai Institute of Infectious Disease and Biosecurity, School of Basic Medical Sciences, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Engineering Research Center for Synthetic Immunology, Shanghai, China.
| | - Zhi Xie
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China; Center for Precision Medicine, Sun Yat-sen University, Guangzhou, China.
|
20
|
Lei W, Yuan M, Long M, Zhang T, Huang YE, Liu H, Jiang W. scDR: Predicting Drug Response at Single-Cell Resolution. Genes (Basel) 2023; 14:genes14020268. [PMID: 36833194 PMCID: PMC9957092 DOI: 10.3390/genes14020268] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/19/2022] [Revised: 01/09/2023] [Accepted: 01/18/2023] [Indexed: 01/22/2023] Open
Abstract
Heterogeneity exists both between and within tumors, which might lead to different drug responses. Therefore, it is extremely important to clarify the drug response at single-cell resolution. Here, we propose a precise single-cell drug response (scDR) prediction method for single-cell RNA sequencing (scRNA-seq) data. We calculated a drug-response score (DRS) for each cell by integrating drug-response genes (DRGs) and gene expression in scRNA-seq data. Then, scDR was validated through internal and external transcriptomics data from bulk RNA-seq and scRNA-seq of cell lines or patient tissues. In addition, scDR could be used to predict prognoses for BLCA, PAAD, and STAD tumor samples. Next, comparison with the existing method using 53,502 cells from 198 cancer cell lines showed the higher accuracy of scDR. Finally, we identified an intrinsically resistant cell subgroup in melanoma and explored possible mechanisms, such as cell cycle activation, by applying scDR to time-series scRNA-seq data of dabrafenib treatment. Altogether, scDR is a credible method for drug response prediction at single-cell resolution and is helpful for exploring drug resistance mechanisms.
Affiliation(s)
- Wanyue Lei
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
| | - Mengqin Yuan
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
| | - Min Long
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
| | - Tao Zhang
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
| | - Yu-e Huang
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
| | - Haizhou Liu
- College of Automation, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
- Correspondence: (H.L.); (W.J.)
| | - Wei Jiang
- Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
- Correspondence: (H.L.); (W.J.)
|
21
|
Jihad M, Yet İ. Multiomics Integration at Single-Cell Resolution Using Bayesian Networks: A Case Study in Hepatocellular Carcinoma. OMICS : A JOURNAL OF INTEGRATIVE BIOLOGY 2023; 27:24-33. [PMID: 36602810 DOI: 10.1089/omi.2022.0170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2023]
Abstract
Multiomics data integration is one of the leading frontiers of complex disease research and integrative biology. The advances in single-cell sequencing technologies offer yet another crucial dimension in multiomics research. Single-cell studies enable the study and integration of multiomics data simultaneously in the same cell. We report in this study multiomics data integration at single-cell resolution using Bayesian networks (BNs) in a case study of hepatocellular carcinoma (HCC). A BN encodes the conditional dependencies/independencies of variables using a graphical model with an accompanying joint probability. RNA-seq and Reduced Representation Bisulfite Sequencing data were analyzed separately, and copy number variations were estimated by the hidden Markov model method. Several BN models were constructed to reveal omics' causal and associational relationships. These methods were subjected to a validation study using an independent data set. We show the heterogeneity of the multiple cellular layers of HCC at single-cell omics resolution by identifying best-fitted BN models of 295 genes. We also provide novel insights into the multiomics mechanistic relationships in the human leukocyte antigen class I genes in HCC. To the best of our knowledge, this is the first study to focus on integrating omics data using a machine learning algorithm, BNs, at the single-cell resolution using a case study of HCC.
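The "accompanying joint probability" of a Bayesian network mentioned in this abstract has a standard form: it factorizes over the nodes of the directed acyclic graph, each conditioned on its parents. The notation below is the usual textbook one, not taken from the study, and the three-variable example (copy number C, methylation M, expression E, with edges C → M, C → E and M → E) is a hypothetical structure chosen only to show how the product is read off a graph.

```latex
% General factorization of a Bayesian network over variables X_1, ..., X_n,
% where Pa(X_i) denotes the parents of X_i in the directed acyclic graph.
P(X_1, \dots, X_n) \;=\; \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)

% Hypothetical three-node example with edges C -> M, C -> E, M -> E:
P(C, M, E) \;=\; P(C)\, P(M \mid C)\, P(E \mid C, M)
```

Removing an edge from the graph removes the corresponding variable from a conditioning set, which is how the structure encodes conditional independencies.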
Affiliation(s)
- Muntadher Jihad
- Department of Bioinformatics, Graduate School of Health Sciences, Hacettepe University, Ankara, Turkey
| | - İdil Yet
- Department of Bioinformatics, Graduate School of Health Sciences, Hacettepe University, Ankara, Turkey
|
22
|
Lin PC, Tsai YS, Yeh YM, Shen MR. Cutting-Edge AI Technologies Meet Precision Medicine to Improve Cancer Care. Biomolecules 2022; 12:biom12081133. [PMID: 36009026 PMCID: PMC9405970 DOI: 10.3390/biom12081133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2022] [Revised: 08/11/2022] [Accepted: 08/15/2022] [Indexed: 11/18/2022] Open
Abstract
To provide precision medicine for better cancer care, researchers must work on clinical patient data, such as electronic medical records, physiological measurements, biochemistry, computerized tomography scans, digital pathology, and the genetic landscape of cancer tissue. To interpret big biodata in cancer genomics, an operational flow based on artificial intelligence (AI) models and medical management platforms with high-performance computing must be set up for precision cancer genomics in clinical practice. To work in the fast-evolving fields of patient care, clinical diagnostics, and therapeutic services, clinicians must understand the fundamentals of the AI tool approach. Therefore, the present article covers the following five themes: (i) computational prediction of pathogenic variants of cancer susceptibility genes; (ii) AI model for mutational analysis; (iii) single-cell genomics and computational biology; (iv) text mining for identifying gene targets in cancer; and (v) the NVIDIA graphics processing units, DRAGEN field programmable gate arrays systems and AI medical cloud platforms in clinical next-generation sequencing laboratories. Based on AI medical platforms and visualization, large amounts of clinical biodata can be rapidly copied and understood using an AI pipeline. The use of innovative AI technologies can deliver more accurate and rapid cancer therapy targets.
Affiliation(s)
- Peng-Chan Lin
- Department of Oncology, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan 704, Taiwan
- Department of Genomic Medicine, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan 704, Taiwan
| | - Yi-Shan Tsai
- Department of Medical Imaging, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan 704, Taiwan
| | - Yu-Min Yeh
- Department of Oncology, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan 704, Taiwan
| | - Meng-Ru Shen
- Institute of Clinical Medicine, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan 704, Taiwan
- Department of Obstetrics and Gynecology, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan 704, Taiwan
- Department of Pharmacology, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan 704, Taiwan
- Correspondence: Tel.: +886-6-235-3535
|