1
Song J, Xie D, Wei X, Liu B, Yao F, Ye W. A cuproptosis-related lncRNAs signature predicts prognosis and reveals pivotal interactions between immune cells in colon cancer. Heliyon 2024; 10:e34586. [PMID: 39114018 PMCID: PMC11305305 DOI: 10.1016/j.heliyon.2024.e34586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2023] [Revised: 07/11/2024] [Accepted: 07/11/2024] [Indexed: 08/10/2024]
Abstract
Copper-mediated cell death presents distinct pathways from established apoptosis processes, suggesting alternative therapeutic approaches for colon cancer. Our research aims to develop a predictive framework utilizing long-noncoding RNAs (lncRNAs) related to cuproptosis to predict colon cancer outcomes while examining immune interactions and intercellular signaling. We obtained colon cancer-related human mRNA expression profiles and clinical information from the Cancer Genome Atlas repository. To isolate lncRNAs involved in cuproptosis, we applied Cox proportional hazards modeling alongside the least absolute shrinkage and selection operator technique. We elucidated the underlying mechanisms by examining the tumor mutational burden, the extent of immune cell penetration, and intercellular communication dynamics. Based on the model, drugs were predicted and validated with cytological experiments. A 13 lncRNA-cuproptosis-associated risk model was constructed. Two colon cancer cell lines were used to validate the predicted representative mRNAs with high correlation coefficients with copper-induced cell death. Survival enhancement in the low-risk cohort was evidenced by the trends in Kaplan-Meier survival estimates. Analysis of immune cell infiltration suggested that survival was induced by the increased infiltration of naïve CD4+ T cells and a reduction of M2 macrophages within the low-risk faction. Decreased infiltration of naïve B cells, resting NK cells, and M0 macrophages was significantly associated with better overall survival. Combined single-cell analysis suggested that CCL5-ACKR1, CCL2-ACKR1, and CCL5-CCR1 pathways play key roles in mediating intercellular dialogues among immune constituents within the neoplastic microhabitat. We identified three drugs with a high sensitivity in the high-risk group. 
In summary, this discovery establishes the possibility of using 13 cuproptosis-associated lncRNAs as a risk model to assess the prognosis, unravel the immune mechanisms and cell communication, and improve treatment options, which may provide a new idea for treating colon cancer.
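The Kaplan-Meier comparison mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `kaplan_meier` and the toy follow-up data are hypothetical, and the sketch only shows how the product-limit estimate is formed from follow-up times and event indicators.

```python
def kaplan_meier(durations, events):
    """Product-limit survival estimate at each distinct event time.

    durations: follow-up time for each patient
    events: 1 if death was observed, 0 if the patient was censored
    Returns a list of (time, survival_probability) tuples.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    deaths = {}   # observed deaths per time point
    leaving = {}  # patients leaving the risk set per time point (death or censoring)
    for t, e in zip(durations, events):
        leaving[t] = leaving.get(t, 0) + 1
        if e:
            deaths[t] = deaths.get(t, 0) + 1

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving[t]
    return curve


# Hypothetical toy cohort: four patients, one censored at time 2.
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
# toy_curve == [(1, 0.75), (3, 0.375), (4, 0.0)]
```

In a risk-model setting such as the one described, a curve like this would be computed separately for the high-risk and low-risk groups and the two curves compared.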
Affiliation(s)
- Jingru Song
  - Department of Gastroenterology, Hangzhou TCM Hospital Affiliated to Zhejiang Chinese Medical University, Hangzhou, 310007, Zhejiang, China
- Dong Xie
  - Shanghai University of Traditional Chinese Medicine, Shanghai, 201203, China
- Xia Wei
  - Department of Gastroenterology, Hangzhou TCM Hospital Affiliated to Zhejiang Chinese Medical University, Hangzhou, 310007, Zhejiang, China
- Binbin Liu
  - Department of Gastroenterology, Hangzhou TCM Hospital Affiliated to Zhejiang Chinese Medical University, Hangzhou, 310007, Zhejiang, China
- Fang Yao
  - Department of Gastroenterology, Hangzhou TCM Hospital Affiliated to Zhejiang Chinese Medical University, Hangzhou, 310007, Zhejiang, China
- Wei Ye
  - Department of Gastroenterology, Hangzhou TCM Hospital Affiliated to Zhejiang Chinese Medical University, Hangzhou, 310007, Zhejiang, China
2
Ciortan M, Defrance M. GNN-based embedding for clustering scRNA-seq data. Bioinformatics 2022; 38:1037-1044. [PMID: 34850828 DOI: 10.1093/bioinformatics/btab787] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 06/16/2021] [Revised: 10/15/2021] [Accepted: 11/15/2021] [Indexed: 02/03/2023]
Abstract
MOTIVATION Single-cell RNA sequencing (scRNA-seq) provides transcriptomic profiling for individual cells, allowing researchers to study the heterogeneity of tissues, recognize rare cell identities and discover new cellular subtypes. Clustering analysis is usually used to predict cell class assignments and infer cell identities. However, the high sparsity of scRNA-seq data, accentuated by dropout events, generates challenges that have motivated the development of numerous dedicated clustering methods. Nevertheless, there is still no consensus on the best performing method. RESULTS graph-sc is a new method leveraging a graph autoencoder network to create embeddings for scRNA-seq cell data. While this work analyzes the performance of clustering the embeddings with various clustering algorithms, other downstream tasks can also be performed. A broad experimental study has been performed on both simulated and scRNA-seq datasets. The results indicate that although there is no consistently best method across all the analyzed datasets, graph-sc compares favorably to competing techniques across all types of datasets. Furthermore, the proposed method is stable across consecutive runs, robust to input down-sampling, generally insensitive to changes in the network architecture or training parameters and more computationally efficient than other competing methods based on neural networks. Modeling the data as a graph provides increased flexibility to define custom features characterizing the genes, the cells and their interactions. Moreover, external data (e.g. gene network) can easily be integrated into the graph and used seamlessly under the same optimization task. AVAILABILITY AND IMPLEMENTATION https://github.com/ciortanmadalina/graph-sc. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Madalina Ciortan
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles, Brussels, Belgium
- Matthieu Defrance
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles, Brussels, Belgium
4
Ciortan M, Defrance M. Contrastive self-supervised clustering of scRNA-seq data. BMC Bioinformatics 2021; 22:280. [PMID: 34044773 PMCID: PMC8157426 DOI: 10.1186/s12859-021-04210-8] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 03/11/2021] [Accepted: 05/10/2021] [Indexed: 12/12/2022]
Abstract
BACKGROUND Single-cell RNA sequencing (scRNA-seq) has emerged as a main strategy to study transcriptional activity at the cellular level. Clustering analysis is routinely performed on scRNA-seq data to explore, recognize or discover underlying cell identities. The high dimensionality of scRNA-seq data and its significant sparsity accentuated by frequent dropout events, introducing false zero count observations, make the clustering analysis computationally challenging. Even though multiple scRNA-seq clustering techniques have been proposed, there is no consensus on the best performing approach. On a parallel research track, self-supervised contrastive learning recently achieved state-of-the-art results on image clustering and, subsequently, image classification. RESULTS We propose contrastive-sc, a new unsupervised learning method for scRNA-seq data that performs cell clustering. The method consists of two consecutive phases: first, an artificial neural network learns an embedding for each cell through a representation training phase. The embedding is then clustered in the second phase with a general clustering algorithm (i.e. KMeans or Leiden community detection). The proposed representation training phase is a new adaptation of the self-supervised contrastive learning framework, initially proposed for image processing, to scRNA-seq data. contrastive-sc has been compared with ten state-of-the-art techniques. A broad experimental study has been conducted on both simulated and real-world datasets, assessing multiple external and internal clustering performance metrics (i.e. ARI, NMI, Silhouette, Calinski scores). Our experimental analysis shows that contrastive-sc compares favorably with state-of-the-art methods on both simulated and real-world datasets. CONCLUSION On average, our method identifies well-defined clusters in close agreement with ground truth annotations.
Our method is computationally efficient, being fast to train and having a limited memory footprint. contrastive-sc maintains good performance when only a fraction of input cells is provided and is robust to changes in hyperparameters or network architecture. The decoupling between the creation of the embedding and the clustering phase allows the flexibility to choose a suitable clustering algorithm (i.e. KMeans when the number of expected clusters is known, Leiden otherwise) or to integrate the embedding with other existing techniques.
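Agreement with ground truth annotations, as assessed by external metrics such as ARI, can be sketched in a few lines. The following is an illustrative, standard-library implementation of the adjusted Rand index, not code from contrastive-sc; the function name `adjusted_rand_index` and the toy labels are assumptions made for this example.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two flat labelings of the same cells."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Contingency counts: cells per (true cluster, predicted cluster) pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of cell pairs placed together under each view.
    together_both = sum(comb(c, 2) for c in pair_counts.values())
    together_true = sum(comb(c, 2) for c in true_counts.values())
    together_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = together_true * together_pred / comb(n, 2)
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. one cluster in both labelings
    return (together_both - expected) / (maximum - expected)


# Hypothetical toy labels: cluster names differ but the partition is identical.
score = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# score == 1.0
```

Because the index depends only on which cells are grouped together, it is unaffected by how a clustering algorithm such as KMeans or Leiden happens to number its clusters.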
Affiliation(s)
- Madalina Ciortan
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles, Brussels, Belgium
- Matthieu Defrance
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles, Brussels, Belgium
5
Hoek A, Maibach K, Özmen E, Vazquez-Armendariz AI, Mengel JP, Hain T, Herold S, Goesmann A. WASP: a versatile, web-accessible single cell RNA-Seq processing platform. BMC Genomics 2021; 22:195. [PMID: 33736596 PMCID: PMC7977290 DOI: 10.1186/s12864-021-07469-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/02/2020] [Accepted: 02/23/2021] [Indexed: 11/16/2022]
Abstract
Background The technology of single cell RNA sequencing (scRNA-seq) has gained massively in popularity as it allows unprecedented insights into cellular heterogeneity as well as identification and characterization of (sub-)cellular populations. Furthermore, scRNA-seq is almost ubiquitously applicable in medical and biological research. However, these new opportunities are accompanied by additional challenges for researchers regarding data analysis, as advanced technical expertise is required in using bioinformatic software. Results Here we present WASP, a software for the processing of Drop-Seq-based scRNA-Seq data. Our software facilitates the initial processing of raw reads generated with the ddSEQ or 10x protocol and generates demultiplexed gene expression matrices including quality metrics. The processing pipeline is realized as a Snakemake workflow, while an R Shiny application is provided for interactive result visualization. WASP supports comprehensive analysis of gene expression matrices, including detection of differentially expressed genes, clustering of cellular populations and interactive graphical visualization of the results. The R Shiny application can be used with gene expression matrices generated by the WASP pipeline, as well as with externally provided data from other sources. Conclusions With WASP we provide an intuitive and easy-to-use tool to process and explore scRNA-seq data. To the best of our knowledge, it is currently the only freely available software package that combines pre- and post-processing of ddSEQ- and 10x-based data. Due to its modular design, it is possible to use any gene expression matrix with WASP’s post-processing R Shiny application. To simplify usage, WASP is provided as a Docker container. Alternatively, pre-processing can be accomplished via Conda, and a standalone version for Windows is available for post-processing, requiring only a web browser. 
Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-07469-6.
Affiliation(s)
- Andreas Hoek
  - Bioinformatics and Systems Biology, Justus Liebig University Giessen, 35392, Giessen, Germany
- Katharina Maibach
  - Bioinformatics and Systems Biology, Justus Liebig University Giessen, 35392, Giessen, Germany; Algorithmic Bioinformatics, Justus Liebig University Giessen, 35392, Giessen, Germany
- Ebru Özmen
  - Bioinformatics and Systems Biology, Justus Liebig University Giessen, 35392, Giessen, Germany
- Ana Ivonne Vazquez-Armendariz
  - Department of Internal Medicine II, and Cardio-Pulmonary Institute (CPI), Universities of Giessen and Marburg Lung Center (UGMLC), Member of the German Center for Lung Research (DZL) and The Institute of Lung Health (ILH), 35392, Giessen, Germany
- Jan Philipp Mengel
  - Institute of Medical Microbiology, Justus Liebig University Giessen, 35392, Giessen, Germany
- Torsten Hain
  - Institute of Medical Microbiology, Justus Liebig University Giessen, 35392, Giessen, Germany; Center for Infection Research (DZIF), Justus-Liebig-University Giessen, Partner Site Giessen-Marburg-Langen, 35392, Giessen, Germany
- Susanne Herold
  - Department of Internal Medicine II, and Cardio-Pulmonary Institute (CPI), Universities of Giessen and Marburg Lung Center (UGMLC), Member of the German Center for Lung Research (DZL) and The Institute of Lung Health (ILH), 35392, Giessen, Germany
- Alexander Goesmann
  - Bioinformatics and Systems Biology, Justus Liebig University Giessen, 35392, Giessen, Germany; Center for Infection Research (DZIF), Justus-Liebig-University Giessen, Partner Site Giessen-Marburg-Langen, 35392, Giessen, Germany