1
Li R, Shi F, Song L, Yu Z. scGAL: unmask tumor clonal substructure by jointly analyzing independent single-cell copy number and scRNA-seq data. BMC Genomics 2024; 25:393. [PMID: 38649804 PMCID: PMC11034052 DOI: 10.1186/s12864-024-10319-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Accepted: 04/17/2024] [Indexed: 04/25/2024]
Abstract
BACKGROUND: Accurately deciphering clonal copy number substructure can provide insights into the evolutionary mechanisms of cancer, and clustering single-cell copy number profiles has become an effective means of unmasking intra-tumor heterogeneity (ITH). However, copy numbers inferred from single-cell DNA sequencing (scDNA-seq) data are error-prone because of technical confounding factors such as amplification bias and allele dropout, which makes it difficult to identify the ITH precisely. RESULTS: We introduce a hybrid model called scGAL to infer clonal copy number substructure. It combines an autoencoder with a generative adversarial network to jointly analyze independent single-cell copy number profiles and gene expression data from the same cell line. Under an adversarial learning framework, scGAL exploits complementary information from gene expression data to mitigate the effects of noise in copy number data, and learns latent representations of scDNA-seq cells for accurate inference of the ITH. Evaluation results on three real cancer datasets suggest that scGAL is able to accurately infer clonal architecture and surpasses other similar methods. In addition, assessment of scGAL on various simulated datasets demonstrates its high robustness to changes in data size and distribution. scGAL can be accessed at https://github.com/zhyu-lab/scgal. CONCLUSIONS: Joint analysis of independent single-cell copy number and gene expression data from the same cell line can effectively exploit complementary information from the individual omics, and thus gives a more refined indication of clonal copy number substructure.
Affiliation(s)
- Ruixiang Li
  - School of Information Engineering, Ningxia University, Yinchuan, 750021, China
- Fangyuan Shi
  - School of Information Engineering, Ningxia University, Yinchuan, 750021, China
  - Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-founded by Ningxia Municipality and Ministry of Education, Ningxia University, Yinchuan, 750021, China
- Lijuan Song
  - School of Information Engineering, Ningxia University, Yinchuan, 750021, China
  - Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-founded by Ningxia Municipality and Ministry of Education, Ningxia University, Yinchuan, 750021, China
- Zhenhua Yu
  - School of Information Engineering, Ningxia University, Yinchuan, 750021, China
  - Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-founded by Ningxia Municipality and Ministry of Education, Ningxia University, Yinchuan, 750021, China
2
Choi JM, Park C, Chae H. moSCminer: a cell subtype classification framework based on the attention neural network integrating the single-cell multi-omics dataset on the cloud. PeerJ 2024; 12:e17006. [PMID: 38426141 PMCID: PMC10903350 DOI: 10.7717/peerj.17006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Accepted: 02/05/2024] [Indexed: 03/02/2024]
Abstract
Single-cell omics sequencing has advanced rapidly, enabling the quantification of diverse omics profiles at single-cell resolution. To support comprehensive biological insights, such as cellular differentiation trajectories, precise annotation of cell subtypes is essential. Conventional methods involve clustering cells and manually assigning subtypes based on canonical markers, a labor-intensive and expert-dependent process. Hence, an automated computational prediction framework is crucial. While several classification frameworks for predicting cell subtypes from single-cell RNA sequencing datasets exist, these methods rely solely on single-omics data and so offer insights at a single molecular level. They often miss inter-omic correlations and a holistic understanding of cellular processes. To address this, the integration of multi-omics datasets from individual cells is essential for accurate subtype annotation. This article introduces moSCminer, a novel framework for classifying cell subtypes that uses single-cell multi-omics sequencing datasets through an attention-based neural network operating at the omics level. By integrating three distinct omics datasets (gene expression, DNA methylation, and DNA accessibility) while accounting for their biological relationships, moSCminer learns the relative significance of each omics feature. It then transforms this knowledge into a novel representation for cell subtype classification. Comparative evaluations against standard machine learning-based classifiers demonstrate moSCminer's superior performance, consistently achieving the highest average performance on real datasets. The efficacy of multi-omics integration is further corroborated through an in-depth analysis of the omics-level attention module, which identifies potential markers for cell subtype annotation. To enhance accessibility and scalability, moSCminer is available as a user-friendly web-based platform connected to a cloud system, publicly accessible at http://203.252.206.118:5568. Notably, this study is the first to integrate three single-cell multi-omics datasets for cell subtype identification.
Affiliation(s)
- Joung Min Choi
  - Department of Computer Science, Virginia Polytechnic Institute and State University (Virginia Tech), Blacksburg, Virginia, United States
- Chaelin Park
  - Division of Computer Science, Sookmyung Women’s University, Seoul, South Korea
- Heejoon Chae
  - Division of Computer Science, Sookmyung Women’s University, Seoul, South Korea
3
Xu C, Wang Y, Wei D, Li W, Qian Y, Pan X, Lei D. [Advances of spatial omics in the individualized diagnosis and treatment of head and neck cancer]. Lin Chuang Er Bi Yan Hou Tou Jing Wai Ke Za Zhi (Journal of Clinical Otorhinolaryngology, Head, and Neck Surgery) 2023; 37:729-733;739. [PMID: 37830120 PMCID: PMC10722126 DOI: 10.13201/j.issn.2096-7993.2023.09.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Indexed: 10/14/2023]
Abstract
Spatial omics has become a research hotspot in biotechnology following single-cell sequencing, and it compensates for the inability of single-cell sequencing to capture the spatial distribution of cells. Spatial omics mainly studies the relative positions of cells in tissue samples to reveal how the spatial distribution of cells affects disease. In recent years, spatial omics has brought new progress in the study of pathogenesis, target exploration, drug development, and many other aspects of head and neck tumors. This paper summarizes the latest progress of spatial omics in the diagnosis and treatment of head and neck cancer.
Affiliation(s)
- Chenyang Xu (徐晨阳)
  - Department of Otorhinolaryngology, Qilu Hospital of Shandong University, National Health Commission Key Laboratory of Otorhinolaryngology (Shandong University), Jinan, 250012, China
- Yin Wang (王寅)
  - Department of Otorhinolaryngology, Qilu Hospital of Shandong University, National Health Commission Key Laboratory of Otorhinolaryngology (Shandong University), Jinan, 250012, China
- Dongmin Wei (魏东敏)
  - Department of Otorhinolaryngology, Qilu Hospital of Shandong University, National Health Commission Key Laboratory of Otorhinolaryngology (Shandong University), Jinan, 250012, China
- Wenming Li (李文明)
  - Department of Otorhinolaryngology, Qilu Hospital of Shandong University, National Health Commission Key Laboratory of Otorhinolaryngology (Shandong University), Jinan, 250012, China
- Ye Qian (钱晔)
  - Department of Otorhinolaryngology, Qilu Hospital of Shandong University, National Health Commission Key Laboratory of Otorhinolaryngology (Shandong University), Jinan, 250012, China
- Xinliang Pan (潘新良)
  - Department of Otorhinolaryngology, Qilu Hospital of Shandong University, National Health Commission Key Laboratory of Otorhinolaryngology (Shandong University), Jinan, 250012, China
- Dapeng Lei (雷大鹏)
  - Department of Otorhinolaryngology, Qilu Hospital of Shandong University, National Health Commission Key Laboratory of Otorhinolaryngology (Shandong University), Jinan, 250012, China
4
Flores JE, Claborne DM, Weller ZD, Webb-Robertson BJM, Waters KM, Bramer LM. Missing data in multi-omics integration: Recent advances through artificial intelligence. Front Artif Intell 2023; 6:1098308. [PMID: 36844425 PMCID: PMC9949722 DOI: 10.3389/frai.2023.1098308] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 11/14/2022] [Accepted: 01/23/2023] [Indexed: 02/11/2023]
Abstract
Biological systems function through complex interactions between various 'omics (biomolecules), and a more complete understanding of these systems is only possible through an integrated, multi-omic perspective. This has created a need for integration approaches that can capture the complex, often non-linear, interactions that define these biological systems and that are adapted to the challenges of combining heterogeneous data across 'omic views. A principal challenge to multi-omic integration is missing data, because not all biomolecules are measured in all samples. Due to cost, instrument sensitivity, or other experimental factors, data for a biological sample may be missing for one or more 'omic technologies. Recent methodological developments in artificial intelligence and statistical learning have greatly facilitated the analysis of multi-omics data; however, many of these techniques assume access to completely observed data. A subset of these methods incorporates mechanisms for handling partially observed samples, and these methods are the focus of this review. We describe recently developed approaches, noting their primary use cases and highlighting each method's approach to handling missing data. We additionally provide an overview of the more traditional missing data workflows and their limitations, and we discuss potential avenues for further development as well as how the missing data issue and its current solutions may generalize beyond the multi-omics context.
Affiliation(s)
- Javier E. Flores
  - Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States
- Daniel M. Claborne
  - Pacific Northwest National Laboratory, Artificial Intelligence and Data Analytics Division, National Security Directorate, Richland, WA, United States
- Zachary D. Weller
  - Pacific Northwest National Laboratory, Artificial Intelligence and Data Analytics Division, National Security Directorate, Richland, WA, United States
- Bobbie-Jo M. Webb-Robertson
  - Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States
- Katrina M. Waters
  - Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States
- Lisa M. Bramer
  - Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States