1
|
Sun F, Li H, Sun D, Fu S, Gu L, Shao X, Wang Q, Dong X, Duan B, Xing F, Wu J, Xiao M, Zhao F, Han JDJ, Liu Q, Fan X, Li C, Wang C, Shi T. Single-cell omics: experimental workflow, data analyses and applications. SCIENCE CHINA. LIFE SCIENCES 2024:10.1007/s11427-023-2561-0. [PMID: 39060615 DOI: 10.1007/s11427-023-2561-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 04/18/2024] [Indexed: 07/28/2024]
Abstract
Cells are the fundamental units of biological systems and exhibit unique development trajectories and molecular features. Our exploration of how the genomes orchestrate the formation and maintenance of each cell, and control the cellular phenotypes of various organisms, is both captivating and intricate. Since the inception of the first single-cell RNA technology, technologies related to single-cell sequencing have experienced rapid advancements in recent years. These technologies have expanded horizontally to include single-cell genome, epigenome, proteome, and metabolome, while vertically, they have progressed to integrate multiple omics data and incorporate additional information such as spatial scRNA-seq and CRISPR screening. Single-cell omics represent a groundbreaking advancement in the biomedical field, offering profound insights into the understanding of complex diseases, including cancers. Here, we comprehensively summarize recent advances in single-cell omics technologies, with a specific focus on the methodology section. This overview aims to guide researchers in selecting appropriate methods for single-cell sequencing and related data analysis.
Affiliation(s)
- Fengying Sun
- Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China
| | - Haoyan Li
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Dongqing Sun
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
| | - Shaliu Fu
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China
| | - Lei Gu
- Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
| | - Xin Shao
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314103, China
| | - Qinqin Wang
- Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
| | - Xin Dong
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
| | - Bin Duan
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China
| | - Feiyang Xing
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
| | - Jun Wu
- Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China
| | - Minmin Xiao
- Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China.
| | - Fangqing Zhao
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, 100101, China.
| | - Jing-Dong J Han
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Center for Quantitative Biology (CQB), Peking University, Beijing, 100871, China.
| | - Qi Liu
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China.
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China.
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China.
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China.
| | - Xiaohui Fan
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China.
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314103, China.
- Zhejiang Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou, 310006, China.
| | - Chen Li
- Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
| | - Chenfei Wang
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China.
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China.
| | - Tieliu Shi
- Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China.
- Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China.
- Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, School of Statistics, East China Normal University, Shanghai, 200062, China.
|
2
|
Chen Z, Wang C, Huang S, Shi Y, Xi R. Directly selecting cell-type marker genes for single-cell clustering analyses. CELL REPORTS METHODS 2024; 4:100810. [PMID: 38981475 PMCID: PMC11294843 DOI: 10.1016/j.crmeth.2024.100810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Revised: 03/16/2024] [Accepted: 06/12/2024] [Indexed: 07/11/2024]
Abstract
In single-cell RNA sequencing (scRNA-seq) studies, cell types and their marker genes are often identified by clustering and differentially expressed gene (DEG) analysis. A common practice is to select genes using surrogate criteria such as variance and deviance, then cluster cells using the selected genes and detect markers by DEG analysis assuming known cell types. The surrogate criteria can miss important genes or select unimportant genes, while DEG analysis has the selection-bias problem. We present Festem, a statistical method for the direct selection of cell-type markers for downstream clustering. Festem distinguishes marker genes with heterogeneous distribution across cells that are cluster informative. Simulation and scRNA-seq applications demonstrate that Festem can sensitively select markers with high precision and enables the identification of cell types often missed by other methods. In a large intrahepatic cholangiocarcinoma dataset, we identify diverse CD8+ T cell types and potential prognostic marker genes.
Affiliation(s)
- Zihao Chen
- School of Mathematical Sciences and Center for Statistical Science, Peking University, Beijing 100871, China
| | - Changhu Wang
- School of Mathematical Sciences and Center for Statistical Science, Peking University, Beijing 100871, China
| | - Siyuan Huang
- Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
| | - Yang Shi
- BeiGene (Beijing) Co., Ltd., Beijing 100871, China
| | - Ruibin Xi
- School of Mathematical Sciences and Center for Statistical Science, Peking University, Beijing 100871, China.
|
3
|
Li R, Shi F, Song L, Yu Z. scGAL: unmask tumor clonal substructure by jointly analyzing independent single-cell copy number and scRNA-seq data. BMC Genomics 2024; 25:393. [PMID: 38649804 PMCID: PMC11034052 DOI: 10.1186/s12864-024-10319-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Accepted: 04/17/2024] [Indexed: 04/25/2024] Open
Abstract
BACKGROUND Accurately deciphering clonal copy number substructure can provide insights into the evolutionary mechanism of cancer, and clustering single-cell copy number profiles has become an effective means to unmask intra-tumor heterogeneity (ITH). However, copy numbers inferred from single-cell DNA sequencing (scDNA-seq) data are error-prone due to technically confounding factors such as amplification bias and allele-dropout, and this makes it difficult to precisely identify the ITH. RESULTS We introduce a hybrid model called scGAL to infer clonal copy number substructure. It combines an autoencoder with a generative adversarial network to jointly analyze independent single-cell copy number profiles and gene expression data from the same cell line. Under an adversarial learning framework, scGAL exploits complementary information from gene expression data to relieve the effects of noise in copy number data, and learns latent representations of scDNA-seq cells for accurate inference of the ITH. Evaluation results on three real cancer datasets suggest scGAL is able to accurately infer clonal architecture and surpasses other similar methods. In addition, assessment of scGAL on various simulated datasets demonstrates its high robustness to changes in data size and distribution. scGAL can be accessed at: https://github.com/zhyu-lab/scgal. CONCLUSIONS Joint analysis of independent single-cell copy number and gene expression data from the same cell line can effectively exploit complementary information from individual omics, and thus gives a more refined indication of clonal copy number substructure.
Affiliation(s)
- Ruixiang Li
- School of Information Engineering, Ningxia University, Yinchuan, 750021, China
| | - Fangyuan Shi
- School of Information Engineering, Ningxia University, Yinchuan, 750021, China
- Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-founded by Ningxia Municipality and Ministry of Education, Ningxia University, Yinchuan, 750021, China
| | - Lijuan Song
- School of Information Engineering, Ningxia University, Yinchuan, 750021, China
- Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-founded by Ningxia Municipality and Ministry of Education, Ningxia University, Yinchuan, 750021, China
| | - Zhenhua Yu
- School of Information Engineering, Ningxia University, Yinchuan, 750021, China.
- Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-founded by Ningxia Municipality and Ministry of Education, Ningxia University, Yinchuan, 750021, China.
|
4
|
Liu M, Lu J, Yu C, Zhao J, Wang L, Hu Y, Chen L, Han R, Liu Y, Sun M, Wei G, Wu S. Differentiation Potential of Hypodifferentiated Subsets of Nephrogenic Rests and Its Relationship to Prognosis in Wilms Tumor. Fetal Pediatr Pathol 2024; 43:123-139. [PMID: 38217324 DOI: 10.1080/15513815.2024.2303081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Accepted: 01/02/2024] [Indexed: 01/15/2024]
Abstract
Background Wilms tumor (WT) is highly curable, although anaplastic histology or relapse imparts a worse prognosis. Nephrogenic rests (NR) associated with a high risk of developing WT are abnormally retained embryonic kidney precursor cells. Methods After pseudo-time analysis using single-cell RNA sequencing (scRNA-seq) data, we generated and validated a WT differentiation-related gene (WTDRG) signature to predict overall survival (OS) in children with a poor OS. Results A differentiation trajectory from NR to WT was identified and showed that hypodifferentiated subsets of NR could differentiate into WT. Classification of WT children with anaplastic histology or relapse based on the expression patterns of WTDRGs suggested that patients with relatively high levels of hypodifferentiated NR presented a poorer prognosis. A WTDRG-based risk model and a clinically applicable nomogram were developed. Conclusions These findings may inform oncogenesis of WT and interventions directed toward poor prognosis in WT children of anaplastic histology or relapse.
Affiliation(s)
- Maolin Liu
- Department of Urology, Chongqing Key Laboratory of Pediatrics, Chongqing Key Laboratory of Children Urogenital Development and Tissue Engineering, Ministry of Education Key Laboratory of Child Development and Disorders, National Clinical Research Center for Child Health and Disorders, Children's Hospital of Chongqing Medical University, Chongqing, China
| | - Jiandong Lu
- Department of Urology, Chongqing Key Laboratory of Pediatrics, Chongqing Key Laboratory of Children Urogenital Development and Tissue Engineering, Ministry of Education Key Laboratory of Child Development and Disorders, National Clinical Research Center for Child Health and Disorders, Children's Hospital of Chongqing Medical University, Chongqing, China
| | - Chengjun Yu
- Department of Urology, Chongqing Key Laboratory of Pediatrics, Chongqing Key Laboratory of Children Urogenital Development and Tissue Engineering, Ministry of Education Key Laboratory of Child Development and Disorders, National Clinical Research Center for Child Health and Disorders, Children's Hospital of Chongqing Medical University, Chongqing, China
| | - Jie Zhao
- Department of Urology, Chongqing Key Laboratory of Pediatrics, Chongqing Key Laboratory of Children Urogenital Development and Tissue Engineering, Ministry of Education Key Laboratory of Child Development and Disorders, National Clinical Research Center for Child Health and Disorders, Children's Hospital of Chongqing Medical University, Chongqing, China
| | - Ling Wang
- Department of Urology, Chongqing Key Laboratory of Pediatrics, Chongqing Key Laboratory of Children Urogenital Development and Tissue Engineering, Ministry of Education Key Laboratory of Child Development and Disorders, National Clinical Research Center for Child Health and Disorders, Children's Hospital of Chongqing Medical University, Chongqing, China
| | - Yang Hu
- Department of Urology, Chongqing Key Laboratory of Pediatrics, Chongqing Key Laboratory of Children Urogenital Development and Tissue Engineering, Ministry of Education Key Laboratory of Child Development and Disorders, National Clinical Research Center for Child Health and Disorders, Children's Hospital of Chongqing Medical University, Chongqing, China
| | - Long Chen
- Department of Urology, Chongqing Key Laboratory of Pediatrics, Chongqing Key Laboratory of Children Urogenital Development and Tissue Engineering, Ministry of Education Key Laboratory of Child Development and Disorders, National Clinical Research Center for Child Health and Disorders, Children's Hospital of Chongqing Medical University, Chongqing, China
| | - Rong Han
- Department of Urology, Chongqing Key Laboratory of Pediatrics, Chongqing Key Laboratory of Children Urogenital Development and Tissue Engineering, Ministry of Education Key Laboratory of Child Development and Disorders, National Clinical Research Center for Child Health and Disorders, Children's Hospital of Chongqing Medical University, Chongqing, China
| | - Yan Liu
- Department of Urology, Chongqing Key Laboratory of Pediatrics, Chongqing Key Laboratory of Children Urogenital Development and Tissue Engineering, Ministry of Education Key Laboratory of Child Development and Disorders, National Clinical Research Center for Child Health and Disorders, Children's Hospital of Chongqing Medical University, Chongqing, China
| | - Miao Sun
- Department of Urology, Chongqing Key Laboratory of Pediatrics, Chongqing Key Laboratory of Children Urogenital Development and Tissue Engineering, Ministry of Education Key Laboratory of Child Development and Disorders, National Clinical Research Center for Child Health and Disorders, Children's Hospital of Chongqing Medical University, Chongqing, China
| | - Guanghui Wei
- Department of Urology, Chongqing Key Laboratory of Pediatrics, Chongqing Key Laboratory of Children Urogenital Development and Tissue Engineering, Ministry of Education Key Laboratory of Child Development and Disorders, National Clinical Research Center for Child Health and Disorders, Children's Hospital of Chongqing Medical University, Chongqing, China
| | - Shengde Wu
- Department of Urology, Chongqing Key Laboratory of Pediatrics, Chongqing Key Laboratory of Children Urogenital Development and Tissue Engineering, Ministry of Education Key Laboratory of Child Development and Disorders, National Clinical Research Center for Child Health and Disorders, Children's Hospital of Chongqing Medical University, Chongqing, China
|
5
|
Shi Y, Zhu R. Analysis of damage-associated molecular patterns in amyotrophic lateral sclerosis based on ScRNA-seq and bulk RNA-seq data. Front Neurosci 2023; 17:1259742. [PMID: 37942135 PMCID: PMC10628000 DOI: 10.3389/fnins.2023.1259742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2023] [Accepted: 10/11/2023] [Indexed: 11/10/2023] Open
Abstract
Background Amyotrophic Lateral Sclerosis (ALS) is a devastating neurodegenerative disorder characterized by the progressive loss of motor neurons. Despite extensive research, the exact etiology of ALS remains elusive. Emerging evidence highlights the critical role of the immune system in ALS pathogenesis and progression. Damage-Associated Molecular Patterns (DAMPs) are endogenous molecules released by stressed or damaged cells, acting as danger signals and activating immune responses. However, their specific involvement in ALS remains unclear. Methods We obtained single-cell RNA sequencing (scRNA-seq) data of ALS from the primary motor cortex in the Gene Expression Omnibus (GEO) database. To better understand genes associated with DAMPs, we performed analyses on cell-cell communication and trajectory. The abundance of immune-infiltrating cells was assessed using the single-sample Gene Set Enrichment Analysis (ssGSEA) method. We performed univariate Cox analysis to construct the risk model and utilized the least absolute shrinkage and selection operator (LASSO) analysis. Finally, we identified potential small molecule drugs targeting ALS by screening the Connectivity Map database (CMap) and confirmed their potential through molecular docking analysis. Results Our study annotated 10 cell types, with the expression of genes related to DAMPs predominantly observed in microglia. Analysis of intercellular communication revealed 12 ligand-receptor pairs in the pathways associated with DAMPs, where microglial cells acted as ligands. Among these pairs, the SPP1-CD44 pair demonstrated the greatest contribution. Furthermore, trajectory analysis demonstrated distinct differentiation fates of different microglial states. Additionally, we constructed a risk model incorporating four genes (TRPM2, ROCK1, HSP90AA1, and HSPA4). The validity of the risk model was supported by multivariate analysis. Moreover, external validation from dataset GSE112681 confirmed the predictive power of the model, which yielded consistent results with datasets GSE112676 and GSE112680. Lastly, the molecular docking analysis suggested that five compounds, namely mead-acid, nifedipine, nifekalant, androstenol, and hydrastine, hold promise as potential candidates for the treatment of ALS. Conclusion Taken together, our study demonstrated that DAMP entities were predominantly observed in microglial cells within the context of ALS. The utilization of a prognostic risk model can accurately predict ALS patient survival. Additionally, genes related to DAMPs may present viable drug targets for ALS therapy.
Affiliation(s)
| | - Ruixia Zhu
- Department of Neurology, The First Affiliated Hospital of China Medical University, Shenyang, China
|
6
|
Zhao X, Zhang M, Jia Y, Liu W, Li S, Gao C, Zhang L, Ni B, Ruan Z, Dong R. Featured immune characteristics of COVID-19 and systemic lupus erythematosus revealed by multidimensional integrated analyses. Inflamm Res 2023; 72:1877-1894. [PMID: 37725104 DOI: 10.1007/s00011-023-01791-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2023] [Revised: 08/16/2023] [Accepted: 08/19/2023] [Indexed: 09/21/2023] Open
Abstract
BACKGROUND Coronavirus disease 2019 (COVID-19) shares similar immune characteristics with autoimmune diseases like systemic lupus erythematosus (SLE). However, such associations have not yet been investigated at the single-cell level. METHODS We integrated and analyzed RNA sequencing results from different patients and normal controls from the GEO database and identified subsets of immune cells that might be involved in the pathogenesis of SLE and COVID-19. We also disentangled the characteristic alterations in cell and molecular subset proportions as well as gene expression patterns in SLE patients compared with COVID-19 patients. RESULTS Key immune characteristic genes (such as CXCL10 and RACK1) and multiple immune-related pathways (such as the coronavirus disease-COVID-19, T-cell receptor signaling, and MIF-related signaling pathways) were identified. We also highlighted the differences in peripheral blood mononuclear cells (PBMCs) between SLE and COVID-19 patients. Moreover, we provided an opportunity to comprehensively probe underlying B-cell‒cell communication with multiple ligand-receptor pairs (MIF-CD74+CXCR4, MIF-CD74+CD44) and the differentiation trajectory of B-cell clusters that is deemed to promote cell state transitions in COVID-19 and SLE. CONCLUSIONS Our results demonstrate the immune response differences and immune characteristic similarities, such as the cytokine storm, between COVID-19 and SLE, which might pivotally function in the pathogenesis of the two diseases and provide potential intervention targets for both diseases.
Affiliation(s)
- Xingwang Zhao
- Department of Dermatology, Southwest Hospital, Army Medical University (Third Military Medical University), Chongqing, 400038, China
| | - Mengjie Zhang
- Department of Pathophysiology, College of High Altitude Military Medicine, Army Medical University (Third Military Medical University), Chongqing, 400038, China
| | - Yuying Jia
- Department of Dermatology, The 901th Hospital of the Joint Logistics Support Force of PLA, Affiliated to Anhui Medical University, Hefei, Anhui, China
- Division of Life Sciences and Medicine, Dermatology Department of the First Affiliated Hospital of USTC, University of Science and Technology of China, Hefei, 230001, Anhui, People's Republic of China
| | - Wenying Liu
- Department of Dermatology, Southwest Hospital, Army Medical University (Third Military Medical University), Chongqing, 400038, China
| | - Shifei Li
- Department of Dermatology, Southwest Hospital, Army Medical University (Third Military Medical University), Chongqing, 400038, China
| | - Cuie Gao
- Department of Dermatology, Southwest Hospital, Army Medical University (Third Military Medical University), Chongqing, 400038, China
| | - Lian Zhang
- Department of Dermatology, Southwest Hospital, Army Medical University (Third Military Medical University), Chongqing, 400038, China
| | - Bing Ni
- Department of Pathophysiology, College of High Altitude Military Medicine, Army Medical University (Third Military Medical University), Chongqing, 400038, China
| | - Zhihua Ruan
- Department of Oncology and Southwest Cancer Center, Southwest Hospital, Army Medical University (Third Military Medical University), Chongqing, 400038, China.
| | - Rui Dong
- Department of Pathophysiology, College of High Altitude Military Medicine, Army Medical University (Third Military Medical University), Chongqing, 400038, China.
- Chongqing International Institute for Immunology, Chongqing, 401320, China.
|
7
|
Zhang C, Hong X, Yu H, Xu H, Qiu X, Cai W, Hocher B, Dai W, Tang D, Liu D, Dai Y. Gene regulatory network study of rheumatoid arthritis in single-cell chromatin landscapes of peripheral blood mononuclear cells. Mod Rheumatol 2023; 33:739-750. [PMID: 35796437 DOI: 10.1093/mr/roac072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2022] [Revised: 05/02/2022] [Accepted: 06/23/2022] [Indexed: 11/15/2022]
Abstract
OBJECTIVES Assays for transposase-accessible chromatin with single-cell sequencing (scATAC-seq) contribute to the progress in epigenetic studies. The purpose of our project was to discover the transcription factors (TFs) that were involved in the pathogenesis of rheumatoid arthritis (RA) at a single-cell resolution using epigenetic technology. METHODS Nuclei suspensions were extracted from peripheral blood mononuclear cells of seven RA patients and seven natural controls for library construction. Subsequently, scATAC-seq was performed to generate a high-resolution map of active regulatory DNA for bioinformatics analysis. RESULTS We obtained 22 accessible chromatin patterns. Then, 10 key TFs were found to be involved in RA pathogenesis by regulating the activity of mitogen-activated protein kinase. Consequently, two genes (PTPRC and SPAG9) regulated by 10 key TFs were found, which may be associated with RA disease pathogenesis, and these TFs were obviously enriched in RA patients (P < .05, fold change value > 1.2). With further quantitative polymerase chain reaction validation on PTPRC and SPAG9 in monocytes, we found differential expression of these two genes, which were regulated by eight TFs [ZNF384, HNF1B, DMRTA2, MEF2A, NFE2L1, CREB3L4 (var. 2), FOSL2::JUNB (var. 2), and MEF2B], showing highly accessible binding sites in RA patients. CONCLUSIONS These findings demonstrate the value of using scATAC-seq to reveal transcriptional regulatory variation in RA-derived peripheral blood mononuclear cells, providing insights into therapy from an epigenetic perspective.
Affiliation(s)
- Cantong Zhang
- The Second Clinical Medical College, Jinan University (Shenzhen People's Hospital), Shenzhen, Guangdong, China
| | - Xiaoping Hong
- The Second Clinical Medical College, Jinan University (Shenzhen People's Hospital), Shenzhen, Guangdong, China
| | - Haiyan Yu
- The Second Clinical Medical College, Jinan University (Shenzhen People's Hospital), Shenzhen, Guangdong, China
| | - Huixuan Xu
- The Second Clinical Medical College, Jinan University (Shenzhen People's Hospital), Shenzhen, Guangdong, China
| | - Xiaofen Qiu
- The Second Clinical Medical College, Jinan University (Shenzhen People's Hospital), Shenzhen, Guangdong, China
| | - Wanxia Cai
- The Second Clinical Medical College, Jinan University (Shenzhen People's Hospital), Shenzhen, Guangdong, China
| | - Berthold Hocher
- Fifth Department of Medicine (Nephrology/Endocrinology/Rheumatology), University Medical Centre Mannheim, University of Heidelberg, Germany
| | - Weier Dai
- College of Natural Science, University of Texas at Austin, Austin, TX, USA
| | - Donge Tang
- The Second Clinical Medical College, Jinan University (Shenzhen People's Hospital), Shenzhen, Guangdong, China
| | - Dongzhou Liu
- The Second Clinical Medical College, Jinan University (Shenzhen People's Hospital), Shenzhen, Guangdong, China
| | - Yong Dai
- The Second Clinical Medical College, Jinan University (Shenzhen People's Hospital), Shenzhen, Guangdong, China
|
8
|
Zhao S, Ye B, Chi H, Cheng C, Liu J. Identification of peripheral blood immune infiltration signatures and construction of monocyte-associated signatures in ovarian cancer and Alzheimer's disease using single-cell sequencing. Heliyon 2023; 9:e17454. [PMID: 37449151 PMCID: PMC10336450 DOI: 10.1016/j.heliyon.2023.e17454] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 02/27/2023] [Revised: 06/12/2023] [Accepted: 06/18/2023] [Indexed: 07/18/2023] Open
Abstract
Background: Ovarian cancer (OC) is a common tumor of the female reproductive system, while Alzheimer's disease (AD) is a prevalent neurodegenerative disease that primarily affects cognitive function in the elderly. Monocytes are immune cells in the blood that can enter tissues and transform into macrophages, thus participating in immune and inflammatory responses. Overall, monocytes may play an important role in Alzheimer's disease and ovarian cancer. Methods: The CIBERSORT algorithm results indicate a potential crucial role of monocytes/macrophages in OC and AD. To identify monocyte marker genes, single-cell RNA-seq data of peripheral blood mononuclear cells (PBMCs) from OC and AD patients were analyzed. Enrichment analysis of various cell subpopulations was performed using the "irGSEA" R package. The estimation of cell cycle was conducted with the "tricycle" R package, and intercellular communication networks were analyzed using "CellChat". For 134 monocyte-associated genes (MRGs), bulk RNA-seq data from two diseased tissues were obtained. Cox regression analysis was employed to develop risk models, categorizing patients into high-risk (HR) and low-risk (LR) groups. The model's accuracy was validated using an external GEO cohort. The different risk groups were evaluated in terms of immune cell infiltration, mutational status, signaling pathways, immune checkpoint expression, and immunotherapy. To identify characteristic MRGs in AD, two machine learning algorithms, namely random forest and support vector machine (SVM), were utilized. Results: Based on Cox regression analysis, a risk model consisting of seven genes was developed in OC, indicating a better prognosis for patients in the LR group. The LR group had a higher tumor mutation burden, immune cell infiltration abundance, and immune checkpoint expression. The results of the TIDE algorithm and the IMvigor210 cohort showed that the LR group was more likely to benefit from immunotherapy. Finally, ZFP36L1 and AP1S2 were identified as characteristic MRGs affecting OC and AD progression. Conclusion: The risk profile containing seven genes identified in this study may help further guide clinical management and targeted therapy for OC. ZFP36L1 and AP1S2 may serve as biomarkers and new therapeutic targets for patients with OC and AD.
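The risk-model step in this abstract (a Cox-derived gene signature used to split patients into HR and LR groups) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the gene names, coefficients, and expression values are hypothetical, and the median cutoff is a common convention that the abstract does not state.

```python
from statistics import median

def risk_scores(expression, coefficients):
    """Linear risk score per patient: sum of coefficient * expression over signature genes."""
    return {
        patient: sum(coefficients[gene] * values[gene] for gene in coefficients)
        for patient, values in expression.items()
    }

def stratify(scores):
    """Label patients above the median score as HR and the rest as LR (assumed cutoff)."""
    cutoff = median(scores.values())
    return {patient: ("HR" if score > cutoff else "LR") for patient, score in scores.items()}

# Hypothetical coefficients, e.g. as a fitted Cox model might return them.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}

# Hypothetical normalized expression values for four patients.
expression = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "P2": {"GENE_A": 4.0, "GENE_B": 0.0},
    "P3": {"GENE_A": 0.0, "GENE_B": 4.0},
    "P4": {"GENE_A": 6.0, "GENE_B": 0.0},
}

scores = risk_scores(expression, coefficients)
groups = stratify(scores)
```

In practice the coefficients would come from fitting a Cox proportional hazards model on survival data; this sketch covers only the scoring and grouping that follow.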
Affiliation(s)
- Songyun Zhao
- Department of Neurosurgery, Wuxi People's Hospital Affiliated to Nanjing Medical University, Wuxi, 214000, China
| | - Bicheng Ye
- School of Clinical Medicine, Yangzhou Polytechnic College, Yangzhou, 225000, China
| | - Hao Chi
- Clinical Medical College, Southwest Medical University, Luzhou, 646000, China
| | - Chao Cheng
- Department of Neurosurgery, Wuxi People's Hospital Affiliated to Nanjing Medical University, Wuxi, 214000, China
| | - Jinhui Liu
- Department of Gynecology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, 210000, China
| |
|
9
|
He S, Ding Y, Ji Z, Yuan B, Chen J, Ren W. HOPX is a tumor-suppressive biomarker that corresponds to T cell infiltration in skin cutaneous melanoma. Cancer Cell Int 2023; 23:122. [PMID: 37344870 DOI: 10.1186/s12935-023-02962-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/05/2023] [Accepted: 06/02/2023] [Indexed: 06/23/2023] Open
Abstract
BACKGROUND: Skin cutaneous melanoma (SKCM) is the most threatening type of skin cancer. Approximately 55,000 people lose their lives every year due to SKCM, illustrating that it seriously threatens human life and health. Homeodomain-only protein homeobox (HOPX) is the smallest member of the homeodomain family and is widely expressed in a variety of tissues. HOPX is involved in regulating the homeostasis of hematopoietic stem cells and is closely related to the development of tumors such as breast cancer, nasopharyngeal carcinoma, and head and neck squamous cell carcinoma. However, its function in SKCM is unclear, and further studies are needed. METHODS: We used the R language to construct ROC (Receiver-Operating Characteristic) curves, KM (Kaplan‒Meier) curves and nomograms based on databases such as the TCGA and GEO to analyze the diagnostic and prognostic value of HOPX in SKCM patients. Enrichment analysis, immune scoring, GSVA (Gene Set Variation Analysis), and single-cell sequencing were used to verify the association between HOPX expression and immune infiltration. In vitro experiments were performed using A375 cells for phenotypic validation. Transcriptome sequencing was performed to further analyze HOPX-related genes and their signaling pathways. RESULTS: Compared to normal cells, SKCM cells had low HOPX expression (p < 0.001). Patients with high HOPX expression had a better prognosis (p < 0.01), and the marker had good diagnostic efficacy (AUC = 0.744). GO/KEGG (Gene Ontology/Kyoto Encyclopedia of Genes and Genomes) analysis, GSVA and single-cell sequencing analysis showed that HOPX expression is associated with immune processes and high enrichment of T cells and could serve as an immune checkpoint in SKCM. Furthermore, cellular assays verified that HOPX inhibits the proliferation, migration and invasion of A375 cells and promotes apoptosis and S-phase arrest. Interestingly, tumor drug sensitivity analysis revealed that HOPX also plays an important role in reducing clinical drug resistance. CONCLUSION: These findings suggest that HOPX is a blocker of SKCM progression that inhibits the proliferation of SKCM cells and promotes apoptosis. Furthermore, it may be a new diagnostic and prognostic indicator and a novel target for immunotherapy in SKCM patients.
Affiliation(s)
- Song He
- Department of Laboratory Animals, College of Animal Sciences, Jilin University, Changchun, 130062, Jilin, P.R. China
| | - Yu Ding
- Department of Laboratory Animals, College of Animal Sciences, Jilin University, Changchun, 130062, Jilin, P.R. China
| | - Zhonghao Ji
- Department of Laboratory Animals, College of Animal Sciences, Jilin University, Changchun, 130062, Jilin, P.R. China
- Department of Basic Medicine, Changzhi Medical College, Changzhi, 046000, Shanxi, P.R. China
| | - Bao Yuan
- Department of Laboratory Animals, College of Animal Sciences, Jilin University, Changchun, 130062, Jilin, P.R. China
| | - Jian Chen
- Department of Laboratory Animals, College of Animal Sciences, Jilin University, Changchun, 130062, Jilin, P.R. China.
| | - Wenzhi Ren
- Department of Laboratory Animals, College of Animal Sciences, Jilin University, Changchun, 130062, Jilin, P.R. China.
| |
|
10
|
Tian Z, Li X, Jiang D. Analysis of immunogenic cell death in atherosclerosis based on scRNA-seq and bulk RNA-seq data. Int Immunopharmacol 2023; 119:110130. [PMID: 37075670 DOI: 10.1016/j.intimp.2023.110130] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/07/2022] [Revised: 03/22/2023] [Accepted: 03/29/2023] [Indexed: 04/21/2023]
Abstract
BACKGROUND: Regulated cell death plays a very important role in atherosclerosis (AS). Despite a large number of studies, there is a lack of literature on immunogenic cell death (ICD) in AS. METHOD: Carotid atherosclerotic plaque single-cell RNA (scRNA) sequencing data were analyzed to define involved cells and determine their transcriptomic characteristics. Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis, CIBERSORT, ESTIMATE and ssGSEA (single-sample Gene Set Enrichment Analysis), consensus clustering analysis, random forest (RF), Decision Curve Analysis (DCA), and the Drug-Gene Interaction and DrugBank databases were applied for bulk sequencing data. All data were downloaded from Gene Expression Omnibus (GEO). RESULT: mDCs and CTLs correlated significantly with AS occurrence and development (χ²(mDCs) = 48.333, P < 0.001; χ²(CTL) = 130.56, P < 0.001). In total, 21 differentially expressed genes were obtained for the bulk transcriptome; KEGG enrichment analysis results were similar to those for differentially expressed genes in endothelial cells. Eleven genes with a gene importance score > 1.5 were obtained in the training set and validated in the test set, resulting in 8 differentially expressed genes for ICD. A model to predict occurrence of AS and 56 drugs that may be used to treat AS were obtained with these 8 genes. CONCLUSION: Immunogenic cell death occurs mainly in endothelial cells in AS. ICD maintains chronic inflammation in AS and plays a crucial role in its occurrence and development. ICD-related genes may become drug-targeted genes for AS treatment.
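The association statistics reported in this abstract are chi-square values for cell-type counts across conditions. As a rough illustration of how such a statistic is computed, the sketch below evaluates Pearson's chi-square on a contingency table, without continuity correction. The counts are hypothetical and are not taken from the study.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical counts: rows = plaque vs. control, columns = cell type of interest vs. all other cells.
example_table = [[10, 20], [30, 40]]
example_statistic = chi_square_statistic(example_table)
```

For a 2 x 2 table the statistic has one degree of freedom; converting it to a P value needs the chi-square distribution, which a statistics library would normally provide.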
Affiliation(s)
- Zemin Tian
- Department of Vascular and Thyroid Surgery, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning 110001, China
| | - Xinyang Li
- Department of Vascular and Thyroid Surgery, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning 110001, China
| | - Delong Jiang
- Department of Vascular and Thyroid Surgery, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning 110001, China.
| |
|
11
|
Mallick K, Chakraborty S, Mallik S, Bandyopadhyay S. A scalable unsupervised learning of scRNAseq data detects rare cells through integration of structure-preserving embedding, clustering and outlier detection. Brief Bioinform 2023; 24:bbad125. [PMID: 37185897 DOI: 10.1093/bib/bbad125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2022] [Revised: 02/06/2023] [Accepted: 02/24/2023] [Indexed: 05/17/2023] Open
Abstract
Single-cell RNA-seq analysis has become a powerful tool to analyse the transcriptomes of individual cells. In turn, it has fostered the possibility of screening thousands of single cells in parallel. Thus, contrary to the traditional bulk measurements that only paint a macroscopic picture, gene measurements at the cell level aid researchers in studying different tissues and organs at various stages. However, accurate clustering methods for such high-dimensional data remain exiguous and a persistent challenge in this domain. Of late, several methods and techniques have been promulgated to address this issue. In this article, we propose a novel framework for clustering large-scale single-cell data and subsequently identifying the rare-cell sub-populations. To handle such sparse, high-dimensional data, we leverage PaCMAP (Pairwise Controlled Manifold Approximation), a feature extraction algorithm that preserves both the local and the global structures of the data, and a Gaussian Mixture Model to cluster single-cell data. Subsequently, we exploit Edited Nearest Neighbours sampling and Isolation Forest/One-class Support Vector Machine to identify rare-cell sub-populations. The performance of the proposed method is validated using publicly available datasets with varying numbers of cell types and rare-cell sub-populations. On several benchmark datasets, the proposed method outperforms the existing state-of-the-art methods. The proposed method successfully identifies cell types that constitute populations ranging from 0.1 to 8% with F1-scores of 0.91 ± 0.09. The source code is available at https://github.com/scrab017/RarPG.
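The abstract evaluates rare-cell detection with F1-scores. The sketch below shows how an F1-score could be computed for one dataset when the true and predicted rare cells are given as sets of cell identifiers. The identifiers are hypothetical, and this is not code from the linked repository.

```python
def f1_score(true_rare, predicted_rare):
    """F1 = harmonic mean of precision and recall for rare-cell calls (sets of cell IDs)."""
    true_positives = len(true_rare & predicted_rare)
    if true_positives == 0:
        return 0.0  # also covers empty prediction or empty truth sets
    precision = true_positives / len(predicted_rare)
    recall = true_positives / len(true_rare)
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: four true rare cells, four predicted, three in common.
true_rare = {"cell_1", "cell_2", "cell_3", "cell_4"}
predicted_rare = {"cell_1", "cell_2", "cell_3", "cell_9"}
example_f1 = f1_score(true_rare, predicted_rare)
```

Because rare cells form a small minority, F1 is more informative here than plain accuracy, which would stay high even if every rare cell were missed.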
Affiliation(s)
- Koushik Mallick
- Computer Science and Engineering, RCC Institute of Information Technology, Canal South Road, 700015, West Bengal, India
| | - Sikim Chakraborty
- Centre for Economy and Growth, Observer Research Foundation, Rouse Avenue, New Delhi, 110002, Delhi, India
| | - Saurav Mallik
- Department of Environmental Health, Harvard T H Chan School of Public Health, 677 Huntington Ave, 02115, MA, USA
| | - Sanghamitra Bandyopadhyay
- Machine Intelligence Unit, Indian Statistical Institute, Barrackpore Trunk Rd., 700108, West Bengal, India
| |
|
12
|
Single-cell Sequence Analysis Combined with Multiple Machine Learning to Identify Markers in Sepsis Patients: LILRA5. Inflammation 2023:10.1007/s10753-023-01803-8. [PMID: 36920635 DOI: 10.1007/s10753-023-01803-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2022] [Revised: 03/04/2023] [Accepted: 03/06/2023] [Indexed: 03/16/2023]
Abstract
Sepsis is a disease with a very high mortality rate, mainly involving an immune-dysregulated response due to bacterial infection. Most studies are currently limited to the whole blood transcriptome level; however, at the single-cell level, a great deal remains unknown about specific cell subsets and disease markers. We obtained single-cell sequencing data from 29 peripheral blood samples, comprising 66,283 cells from 10 confirmed sepsis samples and 19 healthy samples. Cells related to the sepsis phenotype were identified and characterized by the "Scissor" method. The regulatory relationships of sepsis-related phenotype cells in the cellular communication network were clarified using the "CellChat" method. The least absolute shrinkage and selection operator (LASSO), support vector machine (SVM), and random forest (RF) were used to identify sepsis signature genes of diagnostic value. External validation was performed using multiple datasets from the GEO database (GSE28750, GSE185263, GSE57065) and 40 clinical samples. A Bayesian algorithm was used to calculate the regulatory network of LILRA5 co-expressed genes. The stability of atenolol-targeting LILRA5 was determined by molecular docking techniques. Ultimately, action trajectory and survival analyses demonstrate the effectiveness of atenolol-targeted LILRA5 in treating patients with sepsis. We successfully identified 1215 healthy phenotypic cells and 462 sepsis phenotypic cells. We focused on 447 monocytes of the sepsis phenotype. Among the cellular communications, there were a large number of differences between these cells and other immune cells showing a significant inflammatory phenotype compared to the healthy phenotypic cells. Together, the three machine learning algorithms identified the LILRA5 marker gene in sepsis patients, and validation results from multiple external datasets as well as real-world clinical samples demonstrated the robust diagnostic performance of LILRA5. The AUC values of LILRA5 in the external datasets GSE28750, GSE185263, and GSE57065 reached 0.875, 0.940, and 0.980, respectively. Bayesian networks identified a large number of unknown regulatory relationships for LILRA5 co-expression. Molecular docking results demonstrated the possibility of atenolol targeting LILRA5 for the treatment of sepsis. Behavioral trajectory analysis and survival analysis demonstrate that atenolol has a desirable therapeutic effect. LILRA5 is a marker gene in sepsis patients, and atenolol can stably target LILRA5.
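The diagnostic performance quoted above is summarized as AUC. As a minimal sketch of what that number measures, the function below computes ROC AUC as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The scores and labels are hypothetical, standing in for a marker's expression in septic (1) and healthy (0) samples.

```python
def roc_auc(scores, labels):
    """ROC AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical marker expression values and disease labels.
example_scores = [0.9, 0.4, 0.6, 0.1]
example_labels = [1, 1, 0, 0]
example_auc = roc_auc(example_scores, example_labels)
```

The pairwise form is quadratic in sample count, which is fine for illustration; rank-based implementations in statistics libraries give the same value more efficiently.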
|
13
|
Yao J, Liu T, Zhao Q, Ji Y, Bai J, Wang H, Yao R, Zhou X, Chen Y, Xu J. Genetic landscape and immune mechanism of monocytes associated with the progression of acute-on-chronic liver failure. Hepatol Int 2023; 17:676-688. [PMID: 36626090 DOI: 10.1007/s12072-022-10472-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/18/2022] [Accepted: 12/18/2022] [Indexed: 01/11/2023]
Abstract
OBJECTIVE: Acute-on-chronic liver failure (ACLF) has a high prevalence and short-term mortality. Monocytes play an important role in the development of ACLF. However, the monocyte subpopulations with unique features and functions in ACLF and associated with disease progression remain poorly understood. We investigated the specific monocyte subpopulations associated with ACLF progression and their roles in inflammatory responses using single-cell RNA sequencing (scRNA-seq). METHODS: We performed scRNA-seq on 17,310 circulating monocytes from healthy controls and ACLF patients and genetically defined their subpopulations to characterize specific monocyte subpopulations associated with ACLF progression. RESULTS: Five monocyte subpopulations were obtained, including pro-inflammatory monocytes, CD16 monocytes, HLA monocytes, megakaryocyte-like monocytes, and NK-like monocytes. Comparisons of the monocytes between ACLF patients and healthy controls showed that the pro-inflammatory monocytes had the most significant gene changes, among which the expression of genes related to inflammatory responses and cell metabolism was significantly increased while that of genes related to cell cycle progression was significantly decreased. Furthermore, compared with the ACLF survival group, the ACLF death group had significantly higher expression of pro-inflammatory cytokines (e.g., IL-6) and their receptors, chemokines (e.g., CCL4 and CCL5), and inflammation-inducing factors (e.g., HES4). Additionally, validation using scRNA-seq and flow cytometry revealed the presence of a cell type-specific transcriptional signature of pro-inflammatory monocytes, THBS1, whose production might reflect the disease progression and poor prognosis. CONCLUSIONS: We present the accurate classification, molecular markers, and signaling pathways of monocytes associated with ACLF progression. Therapies targeting pro-inflammatory monocytes may be a promising approach for blocking ACLF progression.
Affiliation(s)
- Jia Yao
- Department of Gastroenterology, Shanxi Bethune Hospital, Shanxi Academy of Medical Sciences, Tongji Shanxi Hospital, Third Hospital of Shanxi Medical University, Taiyuan, 030032, China
| | - Tian Liu
- Department of Gastroenterology, Shanxi Bethune Hospital, Shanxi Academy of Medical Sciences, Tongji Shanxi Hospital, Third Hospital of Shanxi Medical University, Taiyuan, 030032, China
| | - Qiang Zhao
- Department of Gastroenterology, Shanxi Bethune Hospital, Shanxi Academy of Medical Sciences, Tongji Shanxi Hospital, Third Hospital of Shanxi Medical University, Taiyuan, 030032, China
| | - Yaqiu Ji
- Department of Biochemistry and Molecular Biology, School of Basic Medicine, Shanxi Medical University, Taiyuan, 030001, China
| | - Jinjia Bai
- Department of Gastroenterology, Shanxi Bethune Hospital, Shanxi Academy of Medical Sciences, Tongji Shanxi Hospital, Third Hospital of Shanxi Medical University, Taiyuan, 030032, China
| | - Han Wang
- Department of Gastroenterology, Shanxi Bethune Hospital, Shanxi Academy of Medical Sciences, Tongji Shanxi Hospital, Third Hospital of Shanxi Medical University, Taiyuan, 030032, China
| | - Ruoyu Yao
- Department of Gastroenterology, Shanxi Bethune Hospital, Shanxi Academy of Medical Sciences, Tongji Shanxi Hospital, Third Hospital of Shanxi Medical University, Taiyuan, 030032, China
| | - Xiaoshuang Zhou
- Department of Nephrology, The Affiliated People's Hospital of Shanxi Medical University, Taiyuan, 030032, China.
| | - Yu Chen
- Fourth Department of Liver Disease (Difficult and Complicated Liver Diseases and Artificial Liver Center), Beijing You'an Hospital Affiliated to Capital Medical University, Beijing, 100069, China.
| | - Jun Xu
- The First Hospital of Shanxi Medical University, No. 85 Jiefang South Road, Yingze District, Taiyuan, 030032, Shanxi, China.
| |
|
14
|
Poonia S, Goel A, Chawla S, Bhattacharya N, Rai P, Lee YF, Yap YS, West J, Bhagat AA, Tayal J, Mehta A, Ahuja G, Majumdar A, Ramalingam N, Sengupta D. Marker-free characterization of full-length transcriptomes of single live circulating tumor cells. Genome Res 2023; 33:80-95. [PMID: 36414416 PMCID: PMC9977151 DOI: 10.1101/gr.276600.122] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/16/2022] [Accepted: 11/10/2022] [Indexed: 11/23/2022]
Abstract
The identification and characterization of circulating tumor cells (CTCs) are important for gaining insights into the biology of metastatic cancers, monitoring disease progression, and medical management of the disease. The limiting factor in the enrichment of purified CTC populations is their sparse availability, heterogeneity, and altered phenotypes relative to the primary tumor. Intensive research both at the technical and molecular fronts led to the development of assays that ease CTC detection and identification from peripheral blood. Most CTC detection methods based on single-cell RNA sequencing (scRNA-seq) use a mix of size selection, marker-based white blood cell (WBC) depletion, and antibodies targeting tumor-associated antigens. However, the majority of these methods either miss out on atypical CTCs or suffer from WBC contamination. We present unCTC, an R package for unbiased identification and characterization of CTCs from single-cell transcriptomic data. unCTC features many standard and novel computational and statistical modules for various analyses. These include a novel method of scRNA-seq clustering, named deep dictionary learning using k-means clustering cost (DDLK), expression-based copy number variation (CNV) inference, and combinatorial, marker-based verification of the malignant phenotypes. DDLK enables robust segregation of CTCs and WBCs in the pathway space, as opposed to the gene expression space. We validated the utility of unCTC on scRNA-seq profiles of breast CTCs from six patients, captured and profiled using an integrated ClearCell FX and Polaris workflow that works by the principles of size-based separation of CTCs and marker-based WBC depletion.
Affiliation(s)
- Sarita Poonia
- Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), New Delhi 110020, India
| | - Anurag Goel
- Department of Computer Science and Engineering, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), New Delhi 110020, India
- Department of Computer Science and Engineering, Delhi Technological University, New Delhi 110042, India
| | - Smriti Chawla
- Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), New Delhi 110020, India
| | - Namrata Bhattacharya
- Department of Computer Science and Engineering, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), New Delhi 110020, India
| | - Priyadarshini Rai
- Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), New Delhi 110020, India
| | - Yi Fang Lee
- Biolidics Limited, Singapore 118257, Singapore
| | - Yoon Sim Yap
- National Cancer Centre Singapore, Singapore 169610, Singapore
| | - Jay West
- Fluidigm Corporation, South San Francisco, California 94080, USA
| | | | - Juhi Tayal
- Department of Research, Rajiv Gandhi Cancer Institute and Research Centre-Delhi (RGCIRC-Delhi), New Delhi 110085, India
| | - Anurag Mehta
- Department of Laboratory Services and Molecular Diagnostics, Rajiv Gandhi Cancer Institute and Research Centre-Delhi (RGCIRC-Delhi), New Delhi 110085, India
| | - Gaurav Ahuja
- Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), New Delhi 110020, India
| | - Angshul Majumdar
- Department of Computer Science and Engineering, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), New Delhi 110020, India
- Centre for Artificial Intelligence, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), New Delhi 110020, India
- Department of Electronics & Communications Engineering, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), New Delhi 110020, India
| | | | - Debarka Sengupta
- Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), New Delhi 110020, India
- Department of Computer Science and Engineering, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), New Delhi 110020, India
- Centre for Artificial Intelligence, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), New Delhi 110020, India
| |
|
15
|
Tian Z, Zhang P, Li X, Jiang D. Analysis of immunogenic cell death in ascending thoracic aortic aneurysms based on single-cell sequencing data. Front Immunol 2023; 14:1087978. [PMID: 37207221 PMCID: PMC10191229 DOI: 10.3389/fimmu.2023.1087978] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2022] [Accepted: 04/14/2023] [Indexed: 05/21/2023] Open
Abstract
Background: At present, research on immunogenic cell death (ICD) is mainly associated with cancer therapy. Little is known about the role of ICD in cardiovascular disease, especially in ascending thoracic aortic aneurysms (ATAA). Method: ATAA single-cell RNA (scRNA) sequencing data were analyzed to identify the involved cell types and determine their transcriptomic characteristics. The chi-square test, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses, Gene Set Enrichment Analysis (GSEA), and CellChat for cell-to-cell communication analysis were applied to data from the Gene Expression Omnibus (GEO) database. Result: A total of 10 cell types were identified, namely, monocytes, macrophages, CD4 T/NK (CD4+ T cells and natural killer T cells), mast cells, B/Plasma B cells, fibroblasts, endothelial cells, cytotoxic T cells (CD8+ T cells, CTLs), vascular smooth muscle cells (vSMCs), and mature dendritic cells (mDCs). A large number of inflammation-related pathways were present in the GSEA results. A large number of ICD-related pathways were found in the KEGG enrichment analysis of differentially expressed genes in endothelial cells. The number of mDCs and CTLs in the ATAA group was significantly different from that in the control group. A total of 44 pathway networks were obtained, of which 9 were associated with ICD in endothelial cells (CCL, CXCL, ANNEXIN, CD40, IL1, IL6, TNF, IFN-II, GALECTIN). The most important ligand-receptor pair by which endothelial cells act on CD4 T/NK cells, CTLs and mDCs is CXCL12-CXCR4. The most important ligand-receptor pair by which endothelial cells act on monocytes and macrophages is ANXA1-FPR1. The most important ligand-receptor pair by which CD4 T/NK cells and CTLs act on endothelial cells is CCL5-ACKR1. The most important ligand-receptor pair by which myeloid cells (macrophages, monocytes and mDCs) act on endothelial cells is CXCL8-ACKR1. Moreover, vSMCs and fibroblasts mainly promote inflammatory responses through the MIF signaling pathway. Conclusion: ICD is present in ATAA and plays an important role in the development of ATAA. The target cells of ICD may be mainly endothelial cells, in which the aortic endothelial cell ACKR1 receptor can not only promote T-cell infiltration through the CCL5 ligand but also promote myeloid cell infiltration through the CXCL8 ligand. ACKR1 and CXCL12 may become target genes for ATAA drug therapy in the future.
Affiliation(s)
- Zemin Tian
- Department of Vascular and Thyroid Surgery, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning, China
| | - Peng Zhang
- Department of Neurology, The First Affiliated Hospital of Kunming Medical University, Kunming, Yunnan, China
| | - Xinyang Li
- Department of Vascular and Thyroid Surgery, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning, China
- *Correspondence: Delong Jiang; Xinyang Li
| | - Delong Jiang
- Department of Vascular and Thyroid Surgery, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning, China
- *Correspondence: Delong Jiang; Xinyang Li
| |
|
16
|
Teng X, Mou DC, Li HF, Jiao L, Wu SS, Pi JK, Wang Y, Zhu ML, Tang M, Liu Y. SIGIRR deficiency contributes to CD4 T cell abnormalities by facilitating the IL1/C/EBPβ/TNF-α signaling axis in rheumatoid arthritis. Mol Med 2022; 28:135. [DOI: 10.1186/s10020-022-00563-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2022] [Accepted: 10/28/2022] [Indexed: 11/19/2022] Open
Abstract
Background
Rheumatoid arthritis (RA) is a complex autoimmune disease with multiple etiological factors, among which aberrant memory CD4 T cell activation plays a key role in the initiation and perpetuation of the disease. SIGIRR (single immunoglobulin IL-1R-related receptor), a member of the IL-1 receptor (ILR) family, acts as a negative regulator of ILR and Toll-like receptor (TLR) downstream signaling pathways and inflammation. The aim of this study was to investigate the potential roles of SIGIRR on memory CD4 T cells in RA and the underlying cellular and molecular mechanisms.
Methods
Single-cell transcriptomics and bulk RNA sequencing data were integrated to predict SIGIRR gene distribution on different immune cell types of human PBMCs. Flow cytometry was employed to determine the differential expression of SIGIRR on memory CD4 T cells between the healthy and RA cohorts. A Spearman correlation study was used to determine the relationship between the percentage of SIGIRR+ memory CD4 T cells and RA disease activity. An antigen-induced arthritis (AIA) mouse model and CD4 T cell transfer experiments were used to investigate the effect of SIGIRR deficiency on the development of arthritis in vivo. Overexpression of SIGIRR in memory CD4 T cells derived from human PBMCs or mouse spleens was utilized to confirm the roles of SIGIRR in the intracellular cytokine production of memory CD4 T cells. Immunoblots and RNA interference were employed to understand the molecular mechanism by which SIGIRR regulates TNF-α production in CD4 T cells.
Results
SIGIRR was preferentially expressed by human memory CD4 T cells, as revealed by single-cell RNA sequencing. SIGIRR expression was substantially reduced in RA patient-derived memory CD4 T cells, which was inversely associated with RA disease activity and related to enhanced TNF-α production. SIGIRR-deficient mice were more susceptible to antigen-induced arthritis (AIA), which was attributed to unleashed TNF-α production in memory CD4 T cells, confirmed by decreased TNF-α production resulting from ectopic expression of SIGIRR. Mechanistically, SIGIRR regulates the IL-1/C/EBPβ/TNF-α signaling axis, as established by experimental evidence and cis-acting factor bioinformatics analysis.
Conclusion
Taken together, SIGIRR deficiency in memory CD4 T cells in RA raises the possibility that receptor induction can target key abnormalities in T cells and represents a potentially novel strategy for immunomodulatory therapy.
|
17
|
Chen M, Jia S, Xue M, Huang H, Xu Z, Yang D, Zhu W, Song Q. Dual-Stream Subspace Clustering Network for revealing gene targets in Alzheimer's disease. Comput Biol Med 2022; 151:106305. [PMID: 36401971 DOI: 10.1016/j.compbiomed.2022.106305] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/30/2022] [Revised: 11/02/2022] [Accepted: 11/06/2022] [Indexed: 11/13/2022]
Abstract
The rapid development of scRNA-seq technology in recent years has enabled us to capture high-throughput gene expression profiles at single-cell resolution, reveal the heterogeneity of complex cell populations, and greatly advance our understanding of the underlying mechanisms in human diseases. Traditional methods for gene co-expression clustering are limited in their ability to discover effective gene groups in scRNA-seq data. In this paper, we propose a novel gene clustering method based on convolutional neural networks called Dual-Stream Subspace Clustering Network (DS-SCNet). DS-SCNet can accurately identify important gene clusters from large-scale single-cell RNA-seq data and provide useful information for downstream analysis. Based on the simulated datasets, DS-SCNet successfully clusters genes into different groups and outperforms mainstream gene clustering methods, such as DBSCAN and DESC, across different evaluation metrics. To explore the biological insights of our proposed method, we applied it to real scRNA-seq data of patients with Alzheimer's disease (AD). DS-SCNet analyzed the single-cell RNA-seq data with 10,850 genes, and accurately identified 8 optimal clusters from 6673 cells. Enrichment analysis of these gene clusters revealed functional signaling pathways including the ILS signaling, the Rho GTPase signaling, and hemostasis pathways. Further analysis of gene regulatory networks identified new hub genes such as ELF4 as important regulators of AD, which indicates that DS-SCNet contributes to the discovery and understanding of the pathogenesis in Alzheimer's disease.
Affiliation(s)
- Minghan Chen
  - Department of Computer Science, Wake Forest University, Winston-Salem, NC, USA
- Shishen Jia
  - School of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang, China
- Mengfan Xue
  - School of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang, China; Zhejiang Lab, Hangzhou, Zhejiang, China
- Ziang Xu
  - Department of Computer Science, Wake Forest University, Winston-Salem, NC, USA
- Defu Yang
  - Zhejiang Lab, Hangzhou, Zhejiang, China
- Wentao Zhu
  - Zhejiang Lab, Hangzhou, Zhejiang, China
- Qianqian Song
  - Center for Cancer Genomics and Precision Oncology, Wake Forest Baptist Comprehensive Cancer Center, Wake Forest Baptist Medical Center, Winston Salem, NC, USA; Department of Cancer Biology, Wake Forest School of Medicine, Winston Salem, NC, USA
|
18
|
Watson ER, Mora A, Taherian Fard A, Mar JC. How does the structure of data impact cell-cell similarity? Evaluating how structural properties influence the performance of proximity metrics in single cell RNA-seq data. Brief Bioinform 2022; 23:bbac387. [PMID: 36151725 PMCID: PMC9677483 DOI: 10.1093/bib/bbac387] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/09/2022] [Revised: 07/26/2022] [Accepted: 08/11/2022] [Indexed: 12/14/2022] Open
Abstract
Accurately identifying cell populations is paramount to the quality of downstream analyses and overall interpretations of single-cell RNA-seq (scRNA-seq) datasets but remains a challenge. The quality of single-cell clustering depends on the proximity metric used to generate cell-to-cell distances. Accordingly, proximity metrics have been benchmarked for scRNA-seq clustering, typically with results averaged across datasets to identify the highest-performing metric. However, the 'best-performing' metric varies between studies, with the performance differing significantly between datasets. This suggests that the unique structural properties of an scRNA-seq dataset, specific to the biological system under study, have a substantial impact on proximity metric performance. Previous benchmarking studies have not factored these structural properties into their evaluations. To address this gap, we developed a framework for the in-depth evaluation of the performance of 17 proximity metrics with respect to core structural properties of scRNA-seq data, including sparsity, dimensionality, cell-population distribution and rarity. We find that clustering performance can be improved substantially by the selection of an appropriate proximity metric and neighbourhood size for the structural properties of a dataset, in addition to performing suitable pre-processing and dimensionality reduction. Furthermore, popular metrics such as Euclidean and Manhattan distance performed poorly in comparison to several less commonly applied metrics, suggesting that the default metric for many scRNA-seq methods should be re-evaluated. Our findings highlight the critical nature of tailoring scRNA-seq analysis pipelines to the dataset under study and provide practical guidance for researchers looking to optimize cell-similarity search for the structural properties of their own data.
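The idea of scoring proximity metrics by how well they recover known populations can be illustrated with a small sketch. This is not the authors' framework: the toy count data, the four metrics shown, the neighbourhood size `k=10`, and the helper names `pairwise_distances` and `knn_label_agreement` are all assumptions made for illustration.

```python
import numpy as np

METRICS = ("euclidean", "manhattan", "cosine", "correlation")

def pairwise_distances(X, metric):
    """Return an (n, n) matrix of distances between the rows (cells) of X."""
    if metric == "euclidean":
        diff = X[:, None, :] - X[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=-1))
    if metric == "manhattan":
        diff = X[:, None, :] - X[None, :, :]
        return np.abs(diff).sum(axis=-1)
    if metric == "cosine":
        norms = np.linalg.norm(X, axis=1, keepdims=True)
        norms[norms == 0] = 1.0  # avoid division by zero for empty cells
        U = X / norms
        return 1.0 - U @ U.T
    if metric == "correlation":
        # Pearson correlation distance is cosine distance on row-centred data.
        return pairwise_distances(X - X.mean(axis=1, keepdims=True), "cosine")
    raise ValueError(f"unknown metric: {metric}")

def knn_label_agreement(D, labels, k):
    """Average fraction of each cell's k nearest neighbours that share its label."""
    D = D.copy()
    np.fill_diagonal(D, np.inf)  # a cell is not its own neighbour
    nn = np.argsort(D, axis=1)[:, :k]
    return float((labels[nn] == labels[:, None]).mean())

# Hypothetical toy data: two populations, 40 marker genes, variable sequencing depth.
rng = np.random.default_rng(0)
n_cells, n_genes = 60, 200
labels = np.repeat([0, 1], n_cells // 2)
profile = np.ones((2, n_genes))
profile[1, :40] *= 4.0
depth = rng.uniform(0.5, 2.0, size=n_cells)
counts = rng.poisson(profile[labels] * depth[:, None]).astype(float)

scores = {m: knn_label_agreement(pairwise_distances(counts, m), labels, k=10)
          for m in METRICS}
for metric, score in scores.items():
    print(f"{metric:12s} {score:.3f}")
```

Which metric scores highest here depends on the simulated depth variation and sparsity, which is the point the abstract makes about dataset structure; the sketch says nothing about the 17 metrics or the datasets evaluated in the paper.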
Affiliation(s)
- Ebony Rose Watson
  - Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, QLD, Australia
- Ariane Mora
  - School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD, Australia
- Atefeh Taherian Fard
  - Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, QLD, Australia
- Jessica Cara Mar
  - Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, QLD, Australia
|
19
|
Pei Y, Wei Y, Peng B, Wang M, Xu W, Chen Z, Ke X, Rong L. Combining single-cell RNA sequencing of peripheral blood mononuclear cells and exosomal transcriptome to reveal the cellular and genetic profiles in COPD. Respir Res 2022; 23:260. [PMID: 36127695 PMCID: PMC9490964 DOI: 10.1186/s12931-022-02182-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/03/2022] [Accepted: 09/09/2022] [Indexed: 11/30/2022] Open
Abstract
Background: It has been a long-held consensus that immune reactions primarily mediate the pathology of chronic obstructive pulmonary disease (COPD), and that exosomes may participate in immune regulation in COPD. However, the relationship between exosomes and peripheral immune status in patients with COPD remains unclear.
Methods: In this study, we sequenced plasma exosomes and performed single-cell RNA sequencing on peripheral blood mononuclear cells (PBMCs) from patients with COPD and healthy controls. Finally, we constructed competing endogenous RNA (ceRNA) and protein–protein interaction (PPI) networks to delineate the interactions between PBMCs and exosomes within COPD.
Results: We identified 135 mRNAs, 132 lncRNAs, and 359 circRNAs from exosomes that were differentially expressed in six patients with COPD compared with four healthy controls. Functional enrichment analyses revealed that many of these differentially expressed RNAs were involved in immune responses, including defense against viral infection and cytokine–cytokine receptor interaction. We also identified 18 distinct cell clusters of PBMCs in one patient and one control by using an unsupervised cluster analysis called uniform manifold approximation and projection (UMAP). According to the resulting cell identification, it was likely that the proportions of monocytes, dendritic cells, and natural killer cells increased in the COPD patient we tested, while the proportions of B cells, CD4+ T cells, and naïve CD8+ T cells declined. Notably, CD8+ T effector memory CD45RA+ (Temra) cell and CD8+ effector memory T (Tem) cell levels were elevated in the patient with COPD; these cells were marked by a lower capacity to differentiate due to their terminal differentiation state and a lower reactive capacity to viral pathogens.
Conclusions: We generated exosomal RNA profiling and single-cell transcriptomic profiling of PBMCs in COPD, described a possible connection between impaired immune function and COPD development, and finally determined the possible role of exosomes in mediating local and systemic immune reactions.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12931-022-02182-8.
Affiliation(s)
- Yanli Pei
  - Respiratory Medicine Department, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
- Yuxi Wei
  - Peking Union Medical College (PUMC), PUMC and Chinese Academy of Medical Sciences, Beijing, China
- Boshizhang Peng
  - Peking Union Medical College (PUMC), PUMC and Chinese Academy of Medical Sciences, Beijing, China
- Mengqi Wang
  - Peking Union Medical College (PUMC), PUMC and Chinese Academy of Medical Sciences, Beijing, China
- Wei Xu
  - Respiratory Medicine Department, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
- Zhe Chen
  - Laboratory of Cough, Affiliated Kunshan Hospital of Jiangsu University, Suzhou, Jiangsu, China
- Xindi Ke
  - Peking Union Medical College (PUMC), PUMC and Chinese Academy of Medical Sciences, Beijing, China
- Lei Rong
  - Respiratory Medicine Department, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
|
20
|
Clustering by measuring local direction centrality for data with heterogeneous density and weak connectivity. Nat Commun 2022; 13:5455. [PMID: 36114209 PMCID: PMC9481560 DOI: 10.1038/s41467-022-33136-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2020] [Accepted: 09/05/2022] [Indexed: 11/30/2022] Open
Abstract
Clustering is a powerful machine learning method for discovering similar patterns according to the proximity of elements in feature space. It is widely used in computer science, bioscience, geoscience, and economics. Although state-of-the-art partition-based and connectivity-based clustering methods have been developed, weak connectivity and heterogeneous density in data impede their effectiveness. In this work, we propose a boundary-seeking Clustering algorithm using the local Direction Centrality (CDC). It adopts a density-independent metric based on the distribution of K-nearest neighbors (KNNs) to distinguish between internal and boundary points. The boundary points generate enclosed cages to bind the connections of internal points, thereby preventing cross-cluster connections and separating weakly-connected clusters. We demonstrate the validity of CDC by detecting complex structured clusters in challenging synthetic datasets, identifying cell types from single-cell RNA sequencing (scRNA-seq) and mass cytometry (CyTOF) data, recognizing speakers on voice corpuses, and testing on various types of real-world benchmarks. Here the authors propose a local direction centrality clustering algorithm that copes with heterogeneous density and weak connectivity issues.
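A density-independent way to separate internal from boundary points using only the directions of the K nearest neighbours might look like the sketch below. It assumes 2D data and measures how evenly the neighbours surround a point by the variance of the angular gaps between consecutive neighbour directions; the function name `direction_centrality`, the grid example, and `k=8` are illustrative assumptions, and the cage-building and connection steps of CDC are not shown.

```python
import numpy as np

def direction_centrality(points, k):
    """Variance of angular gaps between each point's k nearest neighbours (2D only).

    Low values: neighbours surround the point evenly (internal point).
    High values: neighbours lie to one side (boundary point).
    """
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # exclude the point itself
    knn = np.argsort(dist, axis=1)[:, :k]

    scores = np.empty(len(points))
    for i in range(len(points)):
        vec = points[knn[i]] - points[i]
        angles = np.sort(np.arctan2(vec[:, 1], vec[:, 0]))
        # k gaps around the full circle, including the wrap-around gap.
        gaps = np.diff(np.append(angles, angles[0] + 2 * np.pi))
        scores[i] = np.var(gaps)
    return scores

# Illustrative data: a 7 x 7 regular grid of points.
xs, ys = np.meshgrid(np.arange(7), np.arange(7))
grid = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)
dcm = direction_centrality(grid, k=8)

centre, corner = 24, 0  # grid positions (3, 3) and (0, 0)
print("centre:", dcm[centre], "corner:", dcm[corner])
```

Because the score depends only on neighbour directions and not on neighbour distances, rescaling a dense or sparse region leaves it unchanged, which is the property the abstract describes as density-independent.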
|
21
|
Liang L, Sun J, Teng T, Chen L, Li Z, Zhang Z, Gao Y, Zhang W. Expression Profile of Inflammation Response Genes and Potential Regulatory Mechanisms in Dilated Cardiomyopathy. OXIDATIVE MEDICINE AND CELLULAR LONGEVITY 2022; 2022:1051652. [PMID: 36035223 PMCID: PMC9402291 DOI: 10.1155/2022/1051652] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/24/2022] [Revised: 07/22/2022] [Accepted: 08/01/2022] [Indexed: 11/18/2022]
Abstract
Background: The inflammatory response is important in dilated cardiomyopathy (DCM). However, the expression of inflammatory response genes (IRGs) and regulatory mechanisms in DCM has not been well characterized.
Methods: We analyzed 27,665 cells of a single-cell RNA sequencing dataset of four DCM samples and two healthy controls (HC). IRGs among differentially expressed genes (DEGs) of active cell clusters were screened from the Molecular Signatures Database (MSigDB). The bulk sequencing dataset of 166 DCM patients and 166 HC was analyzed to explore the common IRGs. The biological functions of the IRGs were analyzed according to Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses. IRG-related transcription factors (TFs) were determined using the TRRUST database. The protein-protein interaction (PPI) network was constructed using the STRING database. Then, we established the noncoding RNA (ncRNA) regulatory network based on the StarBase database. Finally, the potential drugs that target IRGs were explored using the Drug Gene Interaction Database (DGIdb).
Results: The proportions of dendritic cells (DCs), B cells, NK cells, and T cells were increased in DCM patients, whereas monocytes were decreased. DCs expressed more IRGs in DCM. The GO and KEGG analyses indicated that the functional characteristics of active cells mainly focused on the immune response. Thirty-nine IRGs were commonly expressed among active cell cluster DEGs, bulk RNA DEGs, and inflammatory response-related genes. ETS1 plays an important role in regulation of IRG expression. The competing endogenous RNA regulatory network showed the relationship between ncRNA and IRGs. A Sankey diagram showed that arachidonate 5-lipoxygenase (ALOX5) played a major role in regulation between TFs and potential drugs.
Conclusion: DCs infiltrate into the myocardium and contribute to the immune response in DCM. The transcription factor ETS1 plays an important role in regulation of IRGs. Moreover, ALOX5 may be a potential therapeutic target for DCM.
Affiliation(s)
- Lifeng Liang
  - Department of Cardiology, Tianjin Medical University General Hospital, Tianjin, China
- Jiayi Sun
  - Department of Cardiology, Tianjin Medical University General Hospital, Tianjin, China
- Tianming Teng
  - Department of Cardiology, Tianjin Medical University General Hospital, Tianjin, China
- Lizhu Chen
  - Department of Cardiology, Tianjin Medical University General Hospital, Tianjin, China
- Zejian Li
  - Department of Cardiology, Tianjin Medical University General Hospital, Tianjin, China
- Zhen Zhang
  - Department of Cardiology, Tianjin Medical University General Hospital, Tianjin, China
- Yannan Gao
  - Department of Cardiology, Tianjin Medical University General Hospital, Tianjin, China
- Wenjuan Zhang
  - Department of Cardiology, Tianjin Medical University General Hospital, Tianjin, China
|
22
|
scWizard: a web-based automated tool for classifying and annotating single cells and downstream analysis of single-cell RNA-seq data in cancers. Comput Struct Biotechnol J 2022; 20:4902-4909. [PMID: 36147672 PMCID: PMC9474308 DOI: 10.1016/j.csbj.2022.08.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2022] [Revised: 07/27/2022] [Accepted: 08/12/2022] [Indexed: 11/22/2022] Open
Abstract
scWizard provides a comprehensive analysis pipeline with integration strategies for cancer scRNA-seq data. scWizard classifies 47 cell subtypes within the TME using a hierarchical model based on a deep neural network. scWizard annotates cell subtypes within the TME with higher accuracy than five other methods. The scWizard package is a point-and-click tool for researchers without proficient programming skills.
The growing number of single-cell RNA-seq (scRNA-Seq) datasets allows the characterization of cell types across various cancer types. However, there is still a lack of effective tools to integrate the various analyses of single cells, especially for fine annotation of cell subtypes within the tumor microenvironment (TME). We developed scWizard, a point-and-click tool that packages an automated process including our cell annotation method based on deep neural network learning and 11 downstream analysis methods. scWizard used 113,976 cells across 13 cancer types as a built-in reference dataset for training the hierarchical model, enabling it to automatically classify and annotate 7 major cell types and 47 cell subtypes in the TME. scWizard provides a built-in pre-training set that users can choose flexibly, and annotates subtypes of tumor-derived T-lymphocytes/natural killer cells (T/NK) and myeloid cells from different cancer types with higher accuracy than five existing methods. scWizard shows good robustness in three independent cancer datasets, with an accuracy of 0.98 in annotating major cell types, 0.85 in annotating myeloid cell subtypes and 0.79 in annotating T/NK cell subtypes, indicating the wide applicability of scWizard to different cell types in cancers. Finally, the automatic analysis and visualization functions of scWizard are presented using the intrahepatic cholangiocarcinoma (ICC) scRNA-Seq dataset as a case. scWizard focuses on decoding the TME, covers various analysis flows for cancer scRNA-Seq studies, and provides an easy-to-use tool with a user-friendly interface for a wide range of researchers, to further accelerate biological discovery in cancer research.
|
23
|
Liu Y, Dong Y, Wu X, Wang X, Niu J. Identification of Immune Microenvironment Changes and the Expression of Immune-Related Genes in Liver Cirrhosis. Front Immunol 2022; 13:918445. [PMID: 35903097 PMCID: PMC9315064 DOI: 10.3389/fimmu.2022.918445] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 04/12/2022] [Accepted: 06/20/2022] [Indexed: 11/13/2022] Open
Abstract
Liver inflammation and the immune response have been recognized as critical contributors to cirrhosis pathogenesis. Immunity-related genes (IRGs) play an essential role in immune cell infiltration and immune reactions; however, the changes in the immune microenvironment and the expression of IRGs involved in cirrhosis remain unclear. CD45+ liver cell single-cell RNA (scRNA) sequencing data (GSE136103) from patients with cirrhosis were analyzed. The clusters were identified as known cell types through marker genes according to previous studies. GO and KEGG analyses among differentially expressed genes (DEGs) were performed. DEGs were screened to identify IRGs based on the ImmPort database. The protein-protein interaction (PPI) network of IRGs was generated using the STRING database. IRGs activity was calculated using the AUCell package. RNA microarray expression data (GSE45050) of cirrhosis were analyzed to confirm common IRGs and IRGs activity. Relevant regulatory transcription factors (TFs) were identified from the Human TFDB database. A total of ten clusters were obtained. CD8+ T cells and NK cells were significantly decreased in patients with cirrhosis, while CD4+ T memory cells were increased. Enrichment analyses showed that the DEGs focused on the regulation of immune cell activation and differentiation, NK-cell mediated cytotoxicity, and antigen processing and presentation. Four common TFs, IRF8, NR4A2, IKZF3, and REL were expressed in both the NK cluster and the DEGs of liver tissues. In conclusion, we proposed that the reduction of the CD8+ T cell cluster and NK cells, as well as the infiltration of CD4+ memory T cells, contributed to immune microenvironment changes in cirrhosis. IRF8, NR4A2, IKZF3, and REL may be involved in the transcriptional regulation of NK cells in liver fibrosis. The identified DEGs, IRGs, and pathways may serve critical roles in the development and progression of liver fibrosis.
Affiliation(s)
- Yuwei Liu
  - Department of Hepatology, Center of Infectious Diseases and Pathogen Biology, The First Hospital of Jilin University, Changchun, China
  - Key Laboratory of Zoonosis Research, Ministry of Education, The First Hospital of Jilin University, Changchun, China
  - Key Laboratory of Organ Transplantation, Ministry of Education, The First Hospital of Jilin University, Changchun, China
- Yutong Dong
  - Department of Hepatology, Center of Infectious Diseases and Pathogen Biology, The First Hospital of Jilin University, Changchun, China
  - Key Laboratory of Zoonosis Research, Ministry of Education, The First Hospital of Jilin University, Changchun, China
  - Key Laboratory of Organ Transplantation, Ministry of Education, The First Hospital of Jilin University, Changchun, China
- Xiaojing Wu
  - Department of Hepatology, Center of Infectious Diseases and Pathogen Biology, The First Hospital of Jilin University, Changchun, China
  - Key Laboratory of Zoonosis Research, Ministry of Education, The First Hospital of Jilin University, Changchun, China
  - Key Laboratory of Organ Transplantation, Ministry of Education, The First Hospital of Jilin University, Changchun, China
- Xiaomei Wang
  - Department of Hepatology, Center of Infectious Diseases and Pathogen Biology, The First Hospital of Jilin University, Changchun, China
  - Key Laboratory of Zoonosis Research, Ministry of Education, The First Hospital of Jilin University, Changchun, China
  - Key Laboratory of Organ Transplantation, Ministry of Education, The First Hospital of Jilin University, Changchun, China
  - *Correspondence: Junqi Niu; Xiaomei Wang
- Junqi Niu
  - Department of Hepatology, Center of Infectious Diseases and Pathogen Biology, The First Hospital of Jilin University, Changchun, China
  - Key Laboratory of Zoonosis Research, Ministry of Education, The First Hospital of Jilin University, Changchun, China
  - Key Laboratory of Organ Transplantation, Ministry of Education, The First Hospital of Jilin University, Changchun, China
  - *Correspondence: Junqi Niu; Xiaomei Wang
|
24
|
Ellis D, Wu D, Datta S. SAREV: A review on statistical analytics of single-cell RNA sequencing data. WILEY INTERDISCIPLINARY REVIEWS. COMPUTATIONAL STATISTICS 2022; 14:e1558. [PMID: 36034329 PMCID: PMC9400796 DOI: 10.1002/wics.1558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2020] [Accepted: 04/09/2021] [Indexed: 06/15/2023]
Abstract
Due to the development of next-generation RNA sequencing (NGS) technologies, there has been tremendous progress in research involving determining the role of genomics, transcriptomics and epigenomics in complex biological systems. However, scientists have realized that information obtained using earlier technology, frequently called 'bulk RNA-seq' data, provides information averaged across all the cells present in a tissue. Relatively newly developed single cell (scRNA-seq) technology allows us to provide transcriptomic information at a single-cell resolution. Nevertheless, these high-resolution data have their own complex natures and demand novel statistical data analysis methods to provide effective and highly accurate results on complex biological systems. In this review, we cover many such recently developed statistical methods for researchers wanting to pursue scRNA-seq statistical and computational research as well as scientific research about these existing methods and free software tools available for their generated data. This review is certainly not exhaustive due to page limitations. We have tried to cover the popular methods starting from quality control to the downstream analysis of finding differentially expressed genes and concluding with a brief description of network analysis.
Affiliation(s)
- Dorothy Ellis
  - Department of Biostatistics, University of Florida, School of Public Health and Health Professions, Gainesville, FL
- Dongyuan Wu
  - Department of Biostatistics, University of Florida, School of Public Health and Health Professions, Gainesville, FL
- Susmita Datta
  - Department of Biostatistics, University of Florida, School of Public Health and Health Professions, Gainesville, FL
|
25
|
Jiang H, Huang Y, Li Q. Spectral clustering of single cells using Siamese nerual network combined with improved affinity matrix. Brief Bioinform 2022; 23:6567703. [PMID: 35419595 DOI: 10.1093/bib/bbac113] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/13/2022] [Revised: 03/02/2022] [Accepted: 03/08/2022] [Indexed: 11/14/2022] Open
Abstract
The development of single-cell RNA sequencing (scRNA-seq) has overcome limitations of bulk sequencing techniques in analyzing cell heterogeneity and diversity. Detecting clusters of cells is a key step in the analysis of scRNA-seq data. However, high dimensionality and imbalances in the number of cells of different subtypes are ubiquitous in real scRNA-seq data sets, which poses a huge challenge to single-cell-type detection. We propose a meta-learning-based model, SiaClust, which combines a Siamese Convolutional Neural Network (CNN) with improved spectral clustering, to achieve scRNA-seq cell type detection. Specifically, with the help of the constrained Sigmoid kernel, the raw high-dimensional data are mapped to a low-dimensional space, and the Siamese CNN learns the differences between the cell types in the low-dimensional feature space. The similarity matrix learned by the Siamese CNN is used in combination with improved spectral clustering and t-distributed Stochastic Neighbor Embedding (t-SNE) for visualization. SiaClust highlights the differences between cell types by comparing the similarity of the samples while blurring the differences within cell types, which makes it better at processing high-dimensional and imbalanced data. In an extensive evaluation on data generated from nine different species and tissues through different scRNA-seq protocols, SiaClust significantly improves clustering accuracy relative to state-of-the-art single-cell clustering models. More importantly, SiaClust accurately locates the exact sites of dropout genes, and is more flexible with data size and cell type.
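The step in which a similarity matrix is handed to spectral clustering can be sketched generically. The block below is plain normalized spectral bipartitioning on a given affinity matrix, not SiaClust's improved variant; the hand-built block affinity stands in for a learned similarity matrix, and the function name `spectral_bipartition` is an assumption.

```python
import numpy as np

def spectral_bipartition(affinity):
    """Split items into two groups using the sign of the Fiedler vector.

    affinity: symmetric (n, n) matrix of non-negative similarities.
    """
    degree = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}
    laplacian = np.eye(len(affinity)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]                       # second-smallest eigenvalue
    return (fiedler > 0).astype(int)

# Hypothetical affinity: two groups of three cells, strong within, weak between.
n = 6
W = np.full((n, n), 0.1)
W[:3, :3] = 1.0
W[3:, 3:] = 1.0

groups = spectral_bipartition(W)
print(groups)
```

For more than two clusters, the usual extension is to embed each item with the first few eigenvectors and run k-means on the rows; that step is omitted here to keep the sketch dependency-free.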
Affiliation(s)
- Hanjing Jiang
  - Key Laboratory of Image Information Processing and Intelligent Control of Education Ministry of China, Institute of Artificial Intelligence, School of Artificial Intelligence and Automation, 430074, Wuhan, China
- Yabing Huang
  - Renmin Hospital of Wuhan University, Department of Pathology, 430060, Wuhan, China
- Qianpeng Li
  - Chinese Academy of Sciences, Institute of Automation, 100190, Beijing, China
|
26
|
Rai P, Sengupta D, Majumdar A. SelfE: Gene Selection via Self-Expression for Single-Cell Data. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:624-632. [PMID: 32750851 DOI: 10.1109/tcbb.2020.2997326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Single-cell RNA sequencing has proved advantageous in discerning molecular heterogeneity among seemingly similar cells in a tissue. Due to the paucity of starting RNA, a large fraction of transcripts fail to amplify during the polymerase chain reaction cycle. This is compounded by trivial biological noise such as variability in cell-cycle-specific genes. As a result, the expression matrix obtained from a single-cell study is highly sparse, with a large number of missing values. This hinders downstream analysis of single-cell expression data. It has been observed that feature engineering significantly improves the analysis outcomes. Feature extraction methods such as principal component analysis and zero-inflated factor analysis have been shown to be useful for subsequent steps of data analysis including clustering. However, little or no visible effort has gone into developing feature selection techniques, which offer transparency for the analyst's consumption. We propose SelfE, a novel l2,0-minimization algorithm that determines an optimal subset of feature vectors that preserves the subspace structures observed in the data. We compared SelfE with the commonly used feature selection methods for single-cell expression data analysis.
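One plausible way to write a self-expression objective with an l2,0 constraint is given below. The abstract does not state the formulation, so every symbol is an assumption: $X$ is the cells-by-genes expression matrix with $g$ genes, $C$ is a coefficient matrix, and $k$ is the number of genes to keep.

```latex
\min_{C \in \mathbb{R}^{g \times g}} \; \lVert X - XC \rVert_F^2
\quad \text{subject to} \quad \lVert C \rVert_{2,0} \le k,
\qquad
\lVert C \rVert_{2,0} = \bigl|\{\, i : \lVert C_{i,:} \rVert_2 \neq 0 \,\}\bigr|
```

Under this reading, each gene is reconstructed as a linear combination of other genes, and the constraint allows at most $k$ non-zero rows of $C$; the genes indexed by those rows form the selected subset.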
|
27
|
Shenoy SR, Dey B. Funding for cancer research by an Indian funding agency, DBT. J Biosci 2021. [DOI: 10.1007/s12038-020-00121-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
28
|
Lu Y, Li K, Hu Y, Wang X. Expression of Immune Related Genes and Possible Regulatory Mechanisms in Alzheimer's Disease. Front Immunol 2021; 12:768966. [PMID: 34804058 PMCID: PMC8602845 DOI: 10.3389/fimmu.2021.768966] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 09/01/2021] [Accepted: 10/25/2021] [Indexed: 11/13/2022] Open
Abstract
Immune infiltration of peripheral natural killer (NK) cells in the brain has been observed in Alzheimer's disease (AD). Immunity-related genes (IRGs) play an essential role in immune infiltration; however, the expression of IRGs and possible regulatory mechanisms involved in AD remain unclear. The peripheral blood mononuclear cells (PBMCs) single-cell RNA (scRNA) sequencing data from patients with AD were analyzed and PBMCs obtained from the ImmPort database were screened for cluster marker genes. IRG activity was calculated using the AUCell package. A bulk sequencing dataset of AD brain tissues was analyzed to explore common IRGs between PBMCs and the brain. Relevant regulatory transcription factors (TFs) were identified from the Human TFDB database. The protein-protein interaction network of key TFs were generated using the STRING database. Eight clusters were identified, including memory CD4 T, NKT, NK, B, DC, CD8 T cells, and platelets. NK cells were significantly decreased in patients with AD, while CD4 T cells were increased. NK and DC cells exhibited the highest IRG activity. GO and KEGG analyses of the scRNA and bulk sequencing data showed that the DEGs focused on the immune response. Seventy common IRGs were found in both peripheral NK cells and the brain. Seventeen TFs were associated with IRG expression, and the PPI network indicated that STAT3, IRF1, and REL were the hub TFs. In conclusion, we propose that peripheral NK cells may infiltrate the brain and contribute to neuroinflammatory changes in AD through bioinformatic analysis of scRNA and bulk sequencing data. Moreover, STAT3 may be involved in the transcriptional regulation of IRGs in NK cells.
Affiliation(s)
- Yanjun Lu
  - Department of Laboratory Medicine, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Ke Li
  - Department of Blood Transfusion, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Yu Hu
  - Institute of Pathology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Xiong Wang
  - Department of Laboratory Medicine, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
|
29
|
Fang Q, Su D, Ng W, Feng J. An Effective Biclustering-Based Framework for Identifying Cell Subpopulations From scRNA-seq Data. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:2249-2260. [PMID: 32167906 DOI: 10.1109/tcbb.2020.2979717] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/10/2023]
Abstract
The advent of single-cell RNA sequencing (scRNA-seq) techniques opens up new opportunities for studying the cell-specific changes in the transcriptomic data. An important research problem related to scRNA-seq data analysis is to identify cell subpopulations with distinct functions. However, the expression profiles of individual cells are usually measured over tens of thousands of genes, and it remains a difficult problem to effectively cluster the cells based on the high-dimensional profiles. An additional challenge is that the scRNA-seq data are often noisy and sometimes extremely sparse due to technical limitations and sampling deficiencies. In this paper, we propose a biclustering-based framework called DivBiclust that effectively identifies the cell subpopulations based on the high-dimensional noisy scRNA-seq data. Compared with nine state-of-the-art methods, DivBiclust excels in identifying cell subpopulations with high accuracy, as evidenced by our experiments on ten real scRNA-seq datasets with different sizes and diverse dropout rates. The supplemental materials of DivBiclust, including the source code, data, and a supplementary document, are available at https://www.github.com/Qiong-Fang/DivBiclust.
|
30
|
Liu Z, Huang Y, Liang W, Bai J, Feng H, Fang Z, Tian G, Zhu Y, Zhang H, Wang Y, Liu A, Chen Y. Cascaded filter deterministic lateral displacement microchips for isolation and molecular analysis of circulating tumor cells and fusion cells. LAB ON A CHIP 2021; 21:2881-2891. [PMID: 34219135 DOI: 10.1039/d1lc00360g] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Indexed: 05/27/2023]
Abstract
Precise isolation and analysis of circulating tumor cells (CTCs) from blood samples offer considerable potential for cancer research and personalized treatment. Currently available CTC isolation approaches still struggle to achieve cell isolation with both high separation efficiency and high purity using a simple strategy, which limits the use of captured CTCs for downstream analyses. Here, we present a filter deterministic lateral displacement concept to achieve one-step and label-free CTC isolation with high throughput. Unlike conventional deterministic lateral displacement (DLD) devices, the proposed method uses a hydrodynamic cell sorting design by incorporating a filtration concept into a DLD structure, and enables high-throughput and clog-free isolation by a cascaded microfluidic design. The cascaded filter-DLD (CFD) design demonstrated enhanced performance for size-based cell separation, and achieved high separation efficiency (>96%), high cell purity (WBC removal rate 99.995%), high cell viability (>98%) and high processing rate (1 mL min⁻¹). Samples from lung cancer patients were analyzed using the CFD-Chip; CTCs and tumor cell-leukocyte fusion cells were efficiently collected, and changes in CTC levels were used for treatment response monitoring. The CFD-Chip platform isolated CTCs with good viability, enabling direct downstream analysis with single-cell RNA sequencing. Transcriptome analysis of enriched CTCs identified new subtypes of CTCs such as tumor cell-leukocyte fusion cells, providing insights into cancer diagnostics and therapeutics.
Affiliation(s)
- Zongbin Liu
- Shenzhen Zigzag Biotechnology Co., Ltd., Shenzhen, 518107, China.
| | - Yuqing Huang
- CAS Key Laboratory of Health Informatics, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
| | - Wenli Liang
- Tumor Department, Shenzhen Second People's Hospital, The First Affiliated Hospital of Shenzhen University, Shenzhen, 518035, China.
| | - Jing Bai
- Shenzhen Zigzag Biotechnology Co., Ltd., Shenzhen, 518107, China.
| | - Hongtao Feng
- CAS Key Laboratory of Health Informatics, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
| | - Zhihao Fang
- CAS Key Laboratory of Health Informatics, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
| | - Geng Tian
- Tumor Department, Shenzhen Second People's Hospital, The First Affiliated Hospital of Shenzhen University, Shenzhen, 518035, China.
| | - Yanjuan Zhu
- Department of Oncology, Guangdong Provincial Hospital of Traditional Chinese Medicine, The Second Clinical Medical College of Guangzhou University of Chinese Medicine, Guangzhou, 510120, China
| | - Haibo Zhang
- Department of Oncology, Guangdong Provincial Hospital of Traditional Chinese Medicine, The Second Clinical Medical College of Guangzhou University of Chinese Medicine, Guangzhou, 510120, China
| | - Yuanxiang Wang
- Department of Cardiothoracic Surgery, Shenzhen Children's Hospital, Shenzhen, 518038, China
| | - Aixue Liu
- Tumor Department, Shenzhen Second People's Hospital, The First Affiliated Hospital of Shenzhen University, Shenzhen, 518035, China.
| | - Yan Chen
- CAS Key Laboratory of Health Informatics, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
| |
|
31
|
Qiu X, Yu H, Wu H, Hu Z, Zhou J, Lin H, Xue W, Cai W, Chen J, Yan Q, Dai W, Yang M, Tang D, Dai Y. Single-cell chromatin accessibility landscape of human umbilical cord blood in trisomy 18 syndrome. Hum Genomics 2021; 15:40. [PMID: 34193281 PMCID: PMC8246660 DOI: 10.1186/s40246-021-00338-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2021] [Accepted: 05/29/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Trisomy 18 syndrome (Edwards syndrome, ES) is a type of aneuploidy caused by the presence of an extra chromosome 18. Aneuploidy is the leading cause of early pregnancy loss, intellectual disability, and multiple congenital anomalies. Research on trisomy 18 is progressing slowly, and the molecular characteristics of the disease mechanism and phenotype are still largely unclear. RESULTS In this study, we used the commercial Chromium platform (10× Genomics) to perform sc-ATAC-seq to measure chromatin accessibility in 11,611 single umbilical cord blood cells derived from one trisomy 18 syndrome patient and one healthy donor. We obtained 13 distinct major clusters of cells and identified them as 6 human umbilical cord blood mononuclear cell types using an analysis tool. Compared with the NC group, the ES group had a lower ratio of T cells to NK cells, the ratio of the monocyte/DC cell population did not change significantly, and the ratio of B cell nuclear progenitor and megakaryocyte erythroid cells was higher. The differential genes of ME-0 are enriched in the human T cell leukemia virus 1 infection pathway, and the differential peak genes of ME-1 are enriched in the apoptosis pathway. We found that CCNB2 and MCM3, which have been reported to be essential components of the cell cycle and chromatin, may be vital to the development of trisomy 18. CONCLUSIONS We have identified 6 cell populations in cord blood. Disorder in megakaryocyte erythroid cells implicates trisomy 18 in perturbing fetal hematopoiesis. The master differential regulatory pathway in the ME-0 cell population involves human T cell leukemia virus 1 infection, a pathway that is dysregulated in patients with trisomy 18 and may increase their risk of leukemia. CCNB2 and MCM3 in progenitor cells may be vital to the development of trisomy 18. CCNB2 and MCM3, which have been reported to be essential components of the cell cycle and chromatin, may be related to chromosomal abnormalities in trisomy 18.
Affiliation(s)
- Xiaofen Qiu
- Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, The First Affiliated Hospital of Southern University of Science and Technology, The Second Clinical Medical College of Jinan University, Shenzhen People's Hospital, Shenzhen, Guangdong, 518020, People's Republic of China.,Guangxi Key Laboratory of Metabolic Diseases Research, Department of Clinical Laboratory of Guilin, No. 924 Hospital, 541002, Guilin, Guangxi, People's Republic of China.,College of Life Science, Guangxi Normal University, Guilin, Guangxi, 541004, People's Republic of China
| | - Haiyan Yu
- Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, The First Affiliated Hospital of Southern University of Science and Technology, The Second Clinical Medical College of Jinan University, Shenzhen People's Hospital, Shenzhen, Guangdong, 518020, People's Republic of China
| | - Hongwei Wu
- Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, The First Affiliated Hospital of Southern University of Science and Technology, The Second Clinical Medical College of Jinan University, Shenzhen People's Hospital, Shenzhen, Guangdong, 518020, People's Republic of China
| | - Zhiyang Hu
- Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, The First Affiliated Hospital of Southern University of Science and Technology, The Second Clinical Medical College of Jinan University, Shenzhen People's Hospital, Shenzhen, Guangdong, 518020, People's Republic of China
| | - Jun Zhou
- Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, The First Affiliated Hospital of Southern University of Science and Technology, The Second Clinical Medical College of Jinan University, Shenzhen People's Hospital, Shenzhen, Guangdong, 518020, People's Republic of China
| | - Hua Lin
- Guangxi Key Laboratory of Metabolic Diseases Research, Department of Clinical Laboratory of Guilin, No. 924 Hospital, 541002, Guilin, Guangxi, People's Republic of China
| | - Wen Xue
- Guangxi Key Laboratory of Metabolic Diseases Research, Department of Clinical Laboratory of Guilin, No. 924 Hospital, 541002, Guilin, Guangxi, People's Republic of China
| | - Wanxia Cai
- Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, The First Affiliated Hospital of Southern University of Science and Technology, The Second Clinical Medical College of Jinan University, Shenzhen People's Hospital, Shenzhen, Guangdong, 518020, People's Republic of China
| | - Jiejing Chen
- Guangxi Key Laboratory of Metabolic Diseases Research, Department of Clinical Laboratory of Guilin, No. 924 Hospital, 541002, Guilin, Guangxi, People's Republic of China
| | - Qiang Yan
- Guangxi Key Laboratory of Metabolic Diseases Research, Department of Clinical Laboratory of Guilin, No. 924 Hospital, 541002, Guilin, Guangxi, People's Republic of China
| | - Weier Dai
- College of Natural Science, University of Texas at Austin, Austin, TX, 78712, USA
| | - Ming Yang
- Guangxi Key Laboratory of Metabolic Diseases Research, Department of Clinical Laboratory of Guilin, No. 924 Hospital, 541002, Guilin, Guangxi, People's Republic of China
| | - Donge Tang
- Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, The First Affiliated Hospital of Southern University of Science and Technology, The Second Clinical Medical College of Jinan University, Shenzhen People's Hospital, Shenzhen, Guangdong, 518020, People's Republic of China.
| | - Yong Dai
- Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, The First Affiliated Hospital of Southern University of Science and Technology, The Second Clinical Medical College of Jinan University, Shenzhen People's Hospital, Shenzhen, Guangdong, 518020, People's Republic of China.,Guangxi Key Laboratory of Metabolic Diseases Research, Department of Clinical Laboratory of Guilin, No. 924 Hospital, 541002, Guilin, Guangxi, People's Republic of China.
| |
|
32
|
Yu H, Hong X, Wu H, Zheng F, Zeng Z, Dai W, Yin L, Liu D, Tang D, Dai Y. The Chromatin Accessibility Landscape of Peripheral Blood Mononuclear Cells in Patients With Systemic Lupus Erythematosus at Single-Cell Resolution. Front Immunol 2021; 12:641886. [PMID: 34084162 PMCID: PMC8168536 DOI: 10.3389/fimmu.2021.641886] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/15/2020] [Accepted: 04/14/2021] [Indexed: 12/12/2022] Open
Abstract
Objective Systemic lupus erythematosus (SLE) is a complex autoimmune disease, and various immune cells are involved in the initiation, progression, and regulation of SLE. Our goal was to reveal the chromatin accessibility landscape of peripheral blood mononuclear cells (PBMCs) in SLE patients at single-cell resolution and identify the transcription factors (TFs) that may drive abnormal immune responses. Methods The assay for transposase accessible chromatin in single-cell sequencing (scATAC-seq) method was applied to map the landscape of active regulatory DNA in immune cells from SLE patients at single-cell resolution, followed by clustering, peak annotation and motif analysis of PBMCs in SLE. Results Peripheral blood mononuclear cells were robustly clustered based on their types without using antibodies. We identified twenty patterns of TF activation that drive abnormal immune responses in SLE patients. Then, we observed ten genes that were highly associated with SLE pathogenesis by altering T cell activity. Finally, we found 12 key TFs regulating the above six genes (CD83, ELF4, ITPKB, RAB27A, RUNX3, and ZMIZ1) that may be related to SLE disease pathogenesis and were significantly enriched in SLE patients (p <0.05, FC >2). With qPCR experiments on CD83, ELF4, RUNX3, and ZMIZ1 in B cells, we observed a significant difference in the expression of genes (ELF4, RUNX3, and ZMIZ1), which were regulated by seven TFs (EWSR1-FLI1, MAF, MAFA, NFIB, NR2C2 (var. 2), TBX4, and TBX5). Meanwhile, the seven TFs showed highly accessible binding sites in SLE patients. Conclusions These results confirm the importance of using single-cell sequencing to uncover the real features of immune cells in SLE patients, reveal key TFs in SLE-PBMCs, and provide foundational insights relevant for epigenetic therapy.
Affiliation(s)
- Haiyan Yu
- Department of Clinical Medical Research Center, The Second Clinical Medical College, Jinan University (Shenzhen People's Hospital), Shenzhen, China.,The First Affiliated Hospital, Jinan University, Guangzhou, China
| | - Xiaoping Hong
- Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, Shenzhen Engineering Research Center of Autoimmune Disease, The Second Clinical Medical College of Jinan University (Shenzhen People's Hospital), Shenzhen, China
| | - Hongwei Wu
- Department of Nephrology, The First Affiliated Hospital of Jinan University, Guangzhou, Guangdong, China
| | - Fengping Zheng
- Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, Shenzhen Engineering Research Center of Autoimmune Disease, The Second Clinical Medical College of Jinan University (Shenzhen People's Hospital), Shenzhen, China
| | - Zhipeng Zeng
- Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, Shenzhen Engineering Research Center of Autoimmune Disease, The Second Clinical Medical College of Jinan University (Shenzhen People's Hospital), Shenzhen, China
| | - Weier Dai
- College of Natural Science, University of Texas at Austin, Austin, TX, United States
| | - Lianghong Yin
- Department of Nephrology, The First Affiliated Hospital of Jinan University, Guangzhou, Guangdong, China
| | - Dongzhou Liu
- Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, Shenzhen Engineering Research Center of Autoimmune Disease, The Second Clinical Medical College of Jinan University (Shenzhen People's Hospital), Shenzhen, China
| | - Donge Tang
- Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, Shenzhen Engineering Research Center of Autoimmune Disease, The Second Clinical Medical College of Jinan University (Shenzhen People's Hospital), Shenzhen, China
| | - Yong Dai
- Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, Shenzhen Engineering Research Center of Autoimmune Disease, The Second Clinical Medical College of Jinan University (Shenzhen People's Hospital), Shenzhen, China
| |
|
33
|
Adil A, Kumar V, Jan AT, Asger M. Single-Cell Transcriptomics: Current Methods and Challenges in Data Acquisition and Analysis. Front Neurosci 2021; 15:591122. [PMID: 33967674 PMCID: PMC8100238 DOI: 10.3389/fnins.2021.591122] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 08/03/2020] [Accepted: 03/19/2021] [Indexed: 11/17/2022] Open
Abstract
Rapid cost drops and advancements in next-generation sequencing have made profiling of cells at the individual level a conventional practice in scientific laboratories worldwide. Single-cell transcriptomics [single-cell RNA sequencing (SC-RNA-seq)] has immense potential for uncovering the novel basis of human life. The well-known heterogeneity of cells at the individual level can be better studied by single-cell transcriptomics. Proper downstream analysis of these data will provide new insights to the scientific community. However, due to low starting materials, SC-RNA-seq data face various computational challenges: normalization, differential gene expression analysis, dimensionality reduction, etc. Additionally, new methods like 10× Chromium can profile millions of cells in parallel, which creates a considerable amount of data. Thus, single-cell data handling is another big challenge. This paper reviews single-cell sequencing methods, library preparation, and data generation. We highlight some of the main computational challenges that need to be addressed by introducing new bioinformatics algorithms and tools for analysis. We also present single-cell transcriptomics data as a big data problem.
Affiliation(s)
- Asif Adil
- Department of Computer Sciences, Baba Ghulam Shah Badshah University, Rajouri, India
| | - Vijay Kumar
- Department of Biotechnology, Yeungnam University, Gyeongsan, South Korea
| | - Arif Tasleem Jan
- School of Biosciences and Biotechnology, Baba Ghulam Shah Badshah University, Rajouri, India
| | - Mohammed Asger
- Department of Computer Sciences, Baba Ghulam Shah Badshah University, Rajouri, India
| |
|
34
|
Gupta K, Mohanty SK, Mittal A, Kalra S, Kumar S, Mishra T, Ahuja J, Sengupta D, Ahuja G. The Cellular basis of loss of smell in 2019-nCoV-infected individuals. Brief Bioinform 2021; 22:873-881. [PMID: 32810867 PMCID: PMC7462334 DOI: 10.1093/bib/bbaa168] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 03/30/2020] [Revised: 06/10/2020] [Accepted: 07/05/2020] [Indexed: 12/28/2022] Open
Abstract
A prominent clinical symptom of 2019-novel coronavirus (nCoV) infection is hyposmia/anosmia (decrease or loss of sense of smell), along with general symptoms such as fatigue, shortness of breath, fever and cough. The identity of the cell lineages that underpin the infection-associated loss of olfaction could be critical for the clinical management of 2019-nCoV-infected individuals. Recent research has confirmed the role of angiotensin-converting enzyme 2 (ACE2) and transmembrane protease serine 2 (TMPRSS2) as key host-specific cellular moieties responsible for the cellular entry of the virus. Accordingly, the ongoing medical examinations and the autopsy reports of the deceased individuals indicate that organs/tissues with high expression levels of ACE2, TMPRSS2 and other putative viral entry-associated genes are most vulnerable to the infection. We studied if anosmia in 2019-nCoV-infected individuals can be explained by the expression patterns associated with these host-specific moieties across the known olfactory epithelial cell types, identified from a recently published single-cell expression study. Our findings underscore selective expression of these viral entry-associated genes in a subset of sustentacular cells (SUSs), Bowman's gland cells (BGCs) and stem cells of the olfactory epithelium. Co-expression analysis of ACE2 and TMPRSS2 and protein-protein interaction among the host and viral proteins elected regulatory cytoskeleton protein-enriched SUSs as the most vulnerable cell type of the olfactory epithelium. Furthermore, expression, structural and docking analyses of ACE2 revealed the potential risk of olfactory dysfunction in four additional mammalian species, revealing an evolutionarily conserved infection susceptibility. In summary, our findings provide a plausible cellular basis for the loss of smell in 2019-nCoV-infected patients.
Affiliation(s)
- Krishan Gupta
- Indraprastha Institute of Information Technology, Delhi
| | | | | | | | - Suvendu Kumar
- Indraprastha Institute of Information Technology, Delhi
| | - Tripti Mishra
- Indraprastha Institute of Information Technology, Delhi
| | - Jatin Ahuja
- Indraprastha Institute of Information Technology, Delhi
| | | | - Gaurav Ahuja
- Indraprastha Institute of Information Technology, Delhi
| |
|
35
|
Abstract
Motivation Single-cell RNA-sequencing has grown massively in scale since its inception, presenting substantial analytic and computational challenges. Even simple downstream analyses, such as dimensionality reduction and clustering, require days of runtime and hundreds of gigabytes of memory for today’s largest datasets. In addition, current methods often favor common cell types, and miss salient biological features captured by small cell populations. Results Here we present Hopper, a single-cell toolkit that both speeds up the analysis of single-cell datasets and highlights their transcriptional diversity by intelligent subsampling, or sketching. Hopper realizes the optimal polynomial-time approximation of the Hausdorff distance between the full and downsampled dataset, ensuring that each cell is well-represented by some cell in the sample. Unlike prior sketching methods, Hopper adds points iteratively and allows for additional sampling from regions of interest, enabling fast and targeted multi-resolution analyses. In a dataset of over 1.3 million mouse brain cells, Hopper detects a cluster of just 64 macrophages expressing inflammatory genes (0.004% of the full dataset) from a Hopper sketch containing just 5000 cells, and several other small but biologically interesting immune cell populations invisible to analysis of the full data. On an even larger dataset consisting of ∼2 million developing mouse organ cells, we show Hopper’s even representation of important cell types in small sketches, in contrast with prior sketching methods. We also introduce Treehopper, which uses spatial partitioning to speed up Hopper by orders of magnitude with minimal loss in performance. By condensing transcriptional information encoded in large datasets, Hopper and Treehopper grant the individual user with a laptop the analytic capabilities of a large consortium. Availability and implementation The code for Hopper is available at https://github.com/bendemeo/hopper. 
In addition, we have provided sketches of many of the largest single-cell datasets, available at http://hopper.csail.mit.edu.
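The iterative, distance-driven subsampling described in this abstract can be illustrated with a greedy farthest-first traversal, a classical procedure that repeatedly adds the point currently worst-represented by the sample. This is only a minimal sketch: that it corresponds to what Hopper implements is an assumption, and the function name `farthest_first_sketch`, its parameters, and the use of Euclidean distance are hypothetical choices made for illustration.

```python
import math
import random


def farthest_first_sketch(points, k, seed=0):
    """Pick k point indices by greedy farthest-first traversal.

    Hypothetical helper for illustration; not taken from the Hopper code base.
    Returns (chosen_indices, covering_radius), where covering_radius is the
    largest distance from any point to its nearest chosen point, i.e. the
    Hausdorff distance between the full set and the sketch.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    rng = random.Random(seed)
    chosen = [rng.randrange(len(points))]
    # min_dist[i] = distance from point i to its nearest chosen point so far
    min_dist = [math.dist(p, points[chosen[0]]) for p in points]
    while len(chosen) < k:
        # Add the point that the current sketch represents worst.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return chosen, max(min_dist)
```

Because each new point is the one farthest from the current sample, a small, distant group of cells is picked up early instead of being drowned out by abundant cell types, which is the behaviour the abstract attributes to sketching. Each added point costs one pass over the data, so the sketch above runs in O(n·k) distance evaluations.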
Affiliation(s)
- Benjamin DeMeo
- Department of Bioinformatics, Harvard University, Cambridge, MA 02138, USA.,Computer Science and Artificial Intelligence Laboratory
| | - Bonnie Berger
- Computer Science and Artificial Intelligence Laboratory.,Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| |
|
36
|
Liang Z, Li M, Zheng R, Tian Y, Yan X, Chen J, Wu FX, Wang J. SSRE: Cell Type Detection Based on Sparse Subspace Representation and Similarity Enhancement. Genomics Proteomics Bioinformatics 2021; 19:282-291. [PMID: 33647482 PMCID: PMC8602764 DOI: 10.1016/j.gpb.2020.09.004] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 10/30/2019] [Revised: 08/13/2020] [Accepted: 10/29/2020] [Indexed: 11/25/2022]
Abstract
Accurate identification of cell types from single-cell RNA sequencing (scRNA-seq) data plays a critical role in a variety of scRNA-seq analysis studies. This task corresponds to solving an unsupervised clustering problem, in which the similarity measurement between cells affects the result significantly. Although many approaches for cell type identification have been proposed, the accuracy still needs to be improved. In this study, we propose a novel single-cell clustering framework based on similarity learning, called SSRE. SSRE models the relationships between cells based on a subspace assumption and generates a sparse representation of the cell-to-cell similarity. The sparse representation retains the most similar neighbors for each cell. In addition, three classical pairwise similarities are incorporated with a gene selection and enhancement strategy to further improve the effectiveness of SSRE. Tested on ten real scRNA-seq datasets and five simulated datasets, SSRE achieved superior performance in most cases compared to several state-of-the-art single-cell clustering methods. SSRE can also be extended to visualization of scRNA-seq data and identification of differentially expressed genes. The MATLAB and Python implementations of SSRE are available at https://github.com/CSUBioGroup/SSRE.
Affiliation(s)
- Zhenlan Liang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Min Li
- School of Computer Science and Engineering, Central South University, Changsha 410083, China.
| | - Ruiqing Zheng
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Yu Tian
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Xuhua Yan
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Jin Chen
- College of Medicine, University of Kentucky, Lexington, KY 40536, USA
| | - Fang-Xiang Wu
- Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada
| | - Jianxin Wang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| |
|
37
|
Do VH, Rojas Ringeling F, Canzar S. Linear-time cluster ensembles of large-scale single-cell RNA-seq and multimodal data. Genome Res 2021; 31:677-688. [PMID: 33627473 PMCID: PMC8015854 DOI: 10.1101/gr.267906.120] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/26/2020] [Accepted: 02/19/2021] [Indexed: 12/25/2022]
Abstract
A fundamental task in single-cell RNA-seq (scRNA-seq) analysis is the identification of transcriptionally distinct groups of cells. Numerous methods have been proposed for this problem, with a recent focus on methods for the cluster analysis of ultralarge scRNA-seq data sets produced by droplet-based sequencing technologies. Most existing methods rely on a sampling step to bridge the gap between algorithm scalability and volume of the data. Ignoring large parts of the data, however, often yields inaccurate groupings of cells and risks overlooking rare cell types. We propose a method, Specter, that adopts and extends recent algorithmic advances in (fast) spectral clustering. In contrast to methods that cluster a (random) subsample of the data, we adopt the idea of landmarks that are used to create a sparse representation of the full data from which a spectral embedding can then be computed in linear time. We exploit Specter's speed in a cluster ensemble scheme that achieves a substantial improvement in accuracy over existing methods and identifies rare cell types with high sensitivity. Its linear-time complexity allows Specter to scale to millions of cells and leads to fast computation times in practice. Furthermore, on CITE-seq data that simultaneously measure gene and protein marker expression, we show that Specter is able to use multimodal omics measurements to resolve subtle transcriptomic differences between subpopulations of cells.
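The landmark idea mentioned in this abstract can be illustrated as follows: every cell is described only by its affinities to a small set of landmark cells, which yields a sparse n × p matrix instead of a dense n × n similarity matrix. The sketch covers this first step only and omits the spectral embedding and the ensemble scheme. Random landmark selection, the Gaussian weighting, and the names `landmark_representation`, `n_landmarks` and `r` are assumptions made for illustration, not details taken from Specter.

```python
import math
import random


def landmark_representation(cells, n_landmarks, r=3, seed=0):
    """Describe each cell by weights on its r nearest landmarks.

    Hypothetical helper for illustration. Returns (landmarks, rows), where
    rows[i] is a dict {landmark_index: weight} with exactly r entries that
    sum to 1. Cost is O(n * n_landmarks), i.e. linear in the number of cells
    for a fixed number of landmarks.
    """
    if not 0 < r <= n_landmarks <= len(cells):
        raise ValueError("need 0 < r <= n_landmarks <= len(cells)")
    rng = random.Random(seed)
    landmarks = rng.sample(cells, n_landmarks)  # assumption: random landmarks
    rows = []
    for cell in cells:
        # Distances to all landmarks; keep only the r closest ones.
        nearest = sorted(
            (math.dist(cell, lm), j) for j, lm in enumerate(landmarks)
        )[:r]
        # Local bandwidth: distance to the r-th nearest landmark.
        sigma = nearest[-1][0] or 1.0
        weights = {j: math.exp(-((d / sigma) ** 2)) for d, j in nearest}
        total = sum(weights.values())
        rows.append({j: w / total for j, w in weights.items()})
    return landmarks, rows
```

In a landmark-based spectral method, the sparse rows would then be assembled into a matrix whose singular vectors provide the embedding. Because that matrix has only r non-zero entries per cell, the overall cost stays linear in the number of cells.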
Affiliation(s)
- Van Hoan Do
- Gene Center, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
| | | | - Stefan Canzar
- Gene Center, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
| |
|
38
|
Hong X, Meng S, Tang D, Wang T, Ding L, Yu H, Li H, Liu D, Dai Y, Yang M. Single-Cell RNA Sequencing Reveals the Expansion of Cytotoxic CD4+ T Lymphocytes and a Landscape of Immune Cells in Primary Sjögren's Syndrome. Front Immunol 2021; 11:594658. [PMID: 33603736 PMCID: PMC7884617 DOI: 10.3389/fimmu.2020.594658] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Received: 08/14/2020] [Accepted: 12/14/2020] [Indexed: 12/31/2022] Open
Abstract
Objective Primary Sjögren’s syndrome (pSS) is a systemic autoimmune disease, and its pathogenetic mechanism is far from being understood. In this study, we aimed to explore the cellular and molecular mechanisms that lead to pathogenesis of this disease. Methods We applied single-cell RNA sequencing (scRNA-seq) to 57,288 peripheral blood mononuclear cells (PBMCs) from five patients with pSS and five healthy controls. The immune cell subsets and susceptibility genes involved in the pathogenesis of pSS were analyzed. Flow cytometry was performed to verify the results of scRNA-seq. Results We identified two subpopulations that were significantly expanded in pSS patients. One, which highly expresses cytotoxicity genes, was named CD4+ cytotoxic T lymphocytes (CD4+ CTLs); the other, which highly expresses a T cell receptor (TCR) variable gene, was named CD4+ TRAV13-2+ T cells. Flow cytometry showed that the percentage of CD4+ CTLs (identified by CD4+ and GZMB+ staining) among total T cells was significantly higher in 10 patients with pSS than in 10 healthy controls (P = 0.008). The expression level of IL-1β in macrophages, TCL1A in B cells, as well as interferon (IFN) response genes in most cell subsets was upregulated in the patients with pSS. Susceptibility genes including HLA-DRB5, CTLA4, and AQP3 were highly expressed in patients with pSS. Conclusions Our data revealed disease-specific immune cell subsets and provided some potential new targets of pSS. Specific expansion of CD4+ CTLs may be involved in the pathogenesis of pSS, which might give valuable insights for therapeutic interventions of pSS.
Affiliation(s)
- Xiaoping Hong
- Department of Rheumatology and Immunology, Southern Medical University, Nanfang Hospital, Guangzhou, China.,Department of Rheumatology and Immunology, Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, Shenzhen People's Hospital (The Second Clinical Medical College of Jinan University, The First Affiliated Hospital Southern University of Science and Technology), Shenzhen, China
| | - Shuhui Meng
- Department of Rheumatology and Immunology, Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, Shenzhen People's Hospital (The Second Clinical Medical College of Jinan University, The First Affiliated Hospital Southern University of Science and Technology), Shenzhen, China
| | - Donge Tang
- Department of Rheumatology and Immunology, Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, Shenzhen People's Hospital (The Second Clinical Medical College of Jinan University, The First Affiliated Hospital Southern University of Science and Technology), Shenzhen, China
| | - Tingting Wang
- Department of Rheumatology and Immunology, Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, Shenzhen People's Hospital (The Second Clinical Medical College of Jinan University, The First Affiliated Hospital Southern University of Science and Technology), Shenzhen, China
| | - Liping Ding
- Department of Rheumatology and Immunology, Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, Shenzhen People's Hospital (The Second Clinical Medical College of Jinan University, The First Affiliated Hospital Southern University of Science and Technology), Shenzhen, China
| | - Haiyan Yu
- Department of Rheumatology and Immunology, Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, Shenzhen People's Hospital (The Second Clinical Medical College of Jinan University, The First Affiliated Hospital Southern University of Science and Technology), Shenzhen, China
| | - Heng Li
- Department of Rheumatology and Immunology, Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, Shenzhen People's Hospital (The Second Clinical Medical College of Jinan University, The First Affiliated Hospital Southern University of Science and Technology), Shenzhen, China
| | - Dongzhou Liu
- Department of Rheumatology and Immunology, Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, Shenzhen People's Hospital (The Second Clinical Medical College of Jinan University, The First Affiliated Hospital Southern University of Science and Technology), Shenzhen, China
| | - Yong Dai
- Department of Rheumatology and Immunology, Department of Clinical Medical Research Center, Guangdong Provincial Engineering Research Center of Autoimmune Disease Precision Medicine, Shenzhen People's Hospital (The Second Clinical Medical College of Jinan University, The First Affiliated Hospital Southern University of Science and Technology), Shenzhen, China
| | - Min Yang
- Department of Rheumatology and Immunology, Southern Medical University, Nanfang Hospital, Guangzhou, China
| |
|
39
|
Wang HY, Zhao JP, Zheng CH. SUSCC: Secondary Construction of Feature Space based on UMAP for Rapid and Accurate Clustering Large-scale Single Cell RNA-seq Data. Interdiscip Sci 2021; 13:83-90. [PMID: 33475958 DOI: 10.1007/s12539-020-00411-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/07/2020] [Revised: 12/08/2020] [Accepted: 12/19/2020] [Indexed: 10/22/2022]
Abstract
Clustering is a common method for identifying cell types in single-cell analysis, but the increasing size of scRNA-seq datasets poses challenges for single-cell clustering. There is therefore an urgent need for a faster and more accurate clustering method for large-scale scRNA-seq data. In this paper, we propose a new method for single-cell clustering. First, a count matrix is constructed through normalization and gene filtration. Second, the raw data of the gene expression matrix are projected onto a feature space built by a secondary construction of feature space based on UMAP (Uniform Manifold Approximation and Projection). Third, the low-dimensional matrix in the feature space is randomly divided, according to a certain proportion, into two sub-matrices, one for clustering and one for classification. Finally, one subset is clustered by the k-means algorithm, and the other subset is then classified by the k-nearest neighbor algorithm based on the clustering results. Experimental results show that our method can cluster scRNA-seq datasets effectively.
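The split-then-classify step described in this abstract can be illustrated with a minimal sketch. It assumes the low-dimensional embedding has already been computed (the UMAP-based feature construction is not reproduced here), and every function name and parameter below (`kmeans`, `knn_classify`, `split_cluster_classify`, `cluster_fraction`, `n_neighbors`) is hypothetical rather than taken from the SUSCC software.

```python
import math
import random
from collections import Counter


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: random initial centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def knn_classify(train, train_labels, queries, n_neighbors=3):
    """Label each query by majority vote among its nearest training points."""
    out = []
    for q in queries:
        nearest = sorted(range(len(train)),
                         key=lambda i: math.dist(q, train[i]))[:n_neighbors]
        votes = Counter(train_labels[i] for i in nearest)
        out.append(votes.most_common(1)[0][0])
    return out


def split_cluster_classify(embedding, k, cluster_fraction=0.5, seed=0):
    """Cluster a random subset with k-means, then label the rest with kNN."""
    rng = random.Random(seed)
    order = list(range(len(embedding)))
    rng.shuffle(order)
    n_cluster = max(k, int(len(order) * cluster_fraction))
    cluster_idx, classify_idx = order[:n_cluster], order[n_cluster:]
    cluster_pts = [embedding[i] for i in cluster_idx]
    cluster_labels = kmeans(cluster_pts, k, seed=seed)
    rest_labels = knn_classify(cluster_pts, cluster_labels,
                               [embedding[i] for i in classify_idx])
    labels = [None] * len(embedding)
    for i, lab in zip(cluster_idx, cluster_labels):
        labels[i] = lab
    for i, lab in zip(classify_idx, rest_labels):
        labels[i] = lab
    return labels
```

Only the clustered subset pays the iterative k-means cost; the remaining cells need a single nearest-neighbor lookup each, which is presumably where the speed-up on large datasets comes from.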
Affiliation(s)
- Hai-Yun Wang
- College of Mathematics and System Sciences, Xinjiang University, Urumqi, China
| | - Jian-Ping Zhao
- College of Mathematics and System Sciences, Xinjiang University, Urumqi, China; Institute of Mathematics and Physics, Xinjiang University, Urumqi, China.
| | - Chun-Hou Zheng
- College of Mathematics and System Sciences, Xinjiang University, Urumqi, China; College of Computer Science and Technology, Anhui University, Hefei, China.
| |
|
40
|
Xie K, Huang Y, Zeng F, Liu Z, Chen T. scAIDE: clustering of large-scale single-cell RNA-seq data reveals putative and rare cell types. NAR Genom Bioinform 2020; 2:lqaa082. [PMID: 33575628 PMCID: PMC7671411 DOI: 10.1093/nargab/lqaa082] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 05/09/2020] [Revised: 08/20/2020] [Accepted: 09/18/2020] [Indexed: 02/07/2023] Open
Abstract
Recent advancements in both single-cell RNA-sequencing technology and computational resources facilitate the study of cell types on global populations. Up to millions of cells can now be sequenced in one experiment; thus, accurate and efficient computational methods are needed to provide clustering and post-analysis of assigning putative and rare cell types. Here, we present a novel unsupervised deep learning clustering framework that is robust and highly scalable. To overcome the high level of noise, scAIDE first incorporates an autoencoder-imputation network with a distance-preserved embedding network (AIDE) to learn a good representation of data, and then applies a random projection hashing based k-means algorithm to accommodate the detection of rare cell types. We analyzed a 1.3 million neural cell dataset within 30 min, obtaining 64 clusters which were mapped to 19 putative cell types. In particular, we further identified three different neural stem cell developmental trajectories in these clusters. We also classified two subpopulations of malignant cells in a small glioblastoma dataset using scAIDE. We anticipate that scAIDE would provide a more in-depth understanding of cell development and diseases.
Affiliation(s)
- Kaikun Xie
- Institute for Artificial Intelligence, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
- Tsinghua-Fuzhou Institute of Digital Technology, Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
| | - Yu Huang
- Institute for Artificial Intelligence, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
- Tsinghua-Fuzhou Institute of Digital Technology, Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
| | - Feng Zeng
- Department of Automation, Xiamen University, Xiamen 361005, China
- National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361005, China
| | - Zehua Liu
- Center for Computational and Integrative Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
- Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | - Ting Chen
- Institute for Artificial Intelligence, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
- Tsinghua-Fuzhou Institute of Digital Technology, Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
| |
|
41
|
Gene-regulatory network analysis of ankylosing spondylitis with a single-cell chromatin accessible assay. Sci Rep 2020; 10:19411. [PMID: 33173081 PMCID: PMC7655814 DOI: 10.1038/s41598-020-76574-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 05/11/2020] [Accepted: 10/21/2020] [Indexed: 02/06/2023] Open
Abstract
A detailed understanding of the gene-regulatory network in ankylosing spondylitis (AS) is vital for elucidating the mechanisms of AS pathogenesis. The single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) is a suitable method for revealing such networks. Thus, scATAC-seq was applied to define the landscape of active regulatory DNA in AS. As a result, there was a significant change in the percentage of CD8+ T cells in PBMCs, and 37 differentially accessible transcription factor (TF) motifs were identified. T cells, monocytes-1 and dendritic cells were found to be crucial for the IL-17 signaling pathway and TNF signaling pathway, since they had 73 potential target genes regulated by 8 TF motifs with decreased accessibility in AS. Moreover, natural killer cells were involved in AS by increasing the accessibility of TF motifs TEAD1 and JUN to induce cytokine-cytokine receptor interactions. In addition, CD4+ T cells and CD8+ T cells may be vital for altering host immune functions through increasing the accessibility of TF motifs NR1H4 and OLIG (OLIG1 and OLIG2), respectively. These results explain clear gene regulatory variation in PBMCs from AS patients, providing a foundational framework for the study of personal regulomes and delivering insights into epigenetic therapy.
|
42
|
Genetic landscape and autoimmunity of monocytes in developing Vogt-Koyanagi-Harada disease. Proc Natl Acad Sci U S A 2020; 117:25712-25721. [PMID: 32989127 DOI: 10.1073/pnas.2002476117] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Indexed: 12/27/2022] Open
Abstract
Vogt-Koyanagi-Harada (VKH) disease is a systemic autoimmune disorder affecting multiple organs, including the eyes, skin, and central nervous system. It is known that monocytes significantly contribute to the development of autoimmune disease. However, the subset heterogeneity with unique functions and signatures in human circulating monocytes and the identity of disease-specific monocytic populations remain largely unknown. Here, we employed an advanced single-cell RNA sequencing technology to systematically analyze 11,259 human circulating monocytes and genetically defined their subpopulations. We constructed a precise atlas of human blood monocytes, identified six subpopulations (the S100A12, HLA, CD16, proinflammatory, megakaryocyte-like, and NK-like monocyte subsets), and uncovered two previously unidentified subsets: the HLA and megakaryocyte-like monocyte subsets. Relative to healthy individuals, cellular composition, gene expression signatures, and activation states were markedly altered in VKH patients through cell type-specific programs, especially in the CD16 and proinflammatory monocyte subpopulations. Notably, we discovered a disease-relevant subgroup, proinflammatory monocytes, which showed a discriminative gene expression signature indicative of inflammation, antiviral activity, and pathologic activation, and which converted into a pathologic activation state, implicating active inflammation during VKH disease. Additionally, we found a cell type-specific transcriptional signature of proinflammatory monocytes, ISG15, whose production might reflect the treatment response. Taken together, in this study, we present discoveries on accurate classification, molecular markers, and signaling pathways for VKH disease-associated monocytes. Therapeutically targeting this proinflammatory monocyte subpopulation would provide an attractive approach for treating VKH, as well as other autoimmune diseases.
|
43
|
Hie B, Peters J, Nyquist SK, Shalek AK, Berger B, Bryson BD. Computational Methods for Single-Cell RNA Sequencing. Annu Rev Biomed Data Sci 2020. [DOI: 10.1146/annurev-biodatasci-012220-100601] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Indexed: 11/09/2022]
Abstract
Single-cell RNA sequencing (scRNA-seq) has provided a high-dimensional catalog of millions of cells across species and diseases. These data have spurred the development of hundreds of computational tools to derive novel biological insights. Here, we outline the components of scRNA-seq analytical pipelines and the computational methods that underlie these steps. We describe available methods, highlight well-executed benchmarking studies, and identify opportunities for additional benchmarking studies and computational methods. As the biochemical approaches for single-cell omics advance, we propose coupled development of robust analytical pipelines suited for the challenges that new data present and principled selection of analytical methods that are suited for the biological questions to be addressed.
Affiliation(s)
- Brian Hie
- Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
| | - Joshua Peters
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Ragon Institute of MGH, MIT, and Harvard, Cambridge, Massachusetts 02139, USA
| | - Sarah K. Nyquist
- Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Ragon Institute of MGH, MIT, and Harvard, Cambridge, Massachusetts 02139, USA
- Program in Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
| | - Alex K. Shalek
- Ragon Institute of MGH, MIT, and Harvard, Cambridge, Massachusetts 02139, USA
- Department of Chemistry, Institute for Medical Engineering & Science (IMES), and Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
| | - Bonnie Berger
- Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
| | - Bryan D. Bryson
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Ragon Institute of MGH, MIT, and Harvard, Cambridge, Massachusetts 02139, USA
| |
|
44
|
Niebler S, Müller A, Hankeln T, Schmidt B. RainDrop: Rapid activation matrix computation for droplet-based single-cell RNA-seq reads. BMC Bioinformatics 2020; 21:274. [PMID: 32611394 PMCID: PMC7329424 DOI: 10.1186/s12859-020-03593-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/16/2020] [Accepted: 06/09/2020] [Indexed: 12/19/2022] Open
Abstract
Background: Obtaining data from single-cell transcriptomic sequencing allows for the investigation of cell-specific gene expression patterns, which could not be addressed a few years ago. With the advancement of droplet-based protocols the number of studied cells continues to increase rapidly. This establishes the need for software tools for efficient processing of the produced large-scale datasets. We address this need by presenting RainDrop for fast gene-cell count matrix computation from single-cell RNA-seq data produced by 10x Genomics Chromium technology. Results: RainDrop can process single-cell transcriptomic datasets consisting of 784 million reads sequenced from around 8,000 cells in less than 40 minutes on a standard workstation. It significantly outperforms the established Cell Ranger pipeline and the recently introduced Alevin tool in terms of runtime by a maximal (average) speedup of 30.4 (22.6) and 3.5 (2.4), respectively, while keeping high agreements of the generated results. Conclusions: RainDrop is a software tool, written in C++, for highly efficient processing of large-scale droplet-based single-cell RNA-seq datasets on standard workstations. It is available at https://gitlab.rlp.net/stnieble/raindrop.
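To make concrete what a gene-cell count matrix is, the following minimal sketch counts distinct UMIs per (cell barcode, gene) pair. It is not RainDrop's algorithm, which the abstract does not describe. It assumes the reads have already been assigned to genes and arrive as `(barcode, umi, gene)` tuples, and the function name `count_matrix` is hypothetical.

```python
from collections import defaultdict


def count_matrix(records):
    """Build a gene-by-cell matrix of distinct UMI counts.

    `records` is an iterable of (cell_barcode, umi, gene) tuples; this
    input format is an assumption made for illustration.
    """
    umis = defaultdict(set)
    for barcode, umi, gene in records:
        # A set collapses duplicate reads that share the same UMI.
        umis[(barcode, gene)].add(umi)
    barcodes = sorted({b for b, _ in umis})
    genes = sorted({g for _, g in umis})
    # Rows are genes, columns are cell barcodes.
    matrix = [[len(umis.get((b, g), ())) for b in barcodes] for g in genes]
    return genes, barcodes, matrix
```

For example, two reads from the same cell and gene that carry the same UMI contribute a single count, while a third read with a different UMI adds a second.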
Affiliation(s)
- Stefan Niebler
- Department of Computer Science, Johannes Gutenberg University, Mainz, 55099, Germany
| | - André Müller
- Department of Computer Science, Johannes Gutenberg University, Mainz, 55099, Germany
| | - Thomas Hankeln
- Molecular Genetics and Genome Analysis, Institute of Organismal and Molecular Evolution, Johannes Gutenberg University, Mainz, 55099, Germany
| | - Bertil Schmidt
- Department of Computer Science, Johannes Gutenberg University, Mainz, 55099, Germany.
| |
|
45
|
Do VH, Elbassioni K, Canzar S. Sphetcher: Spherical Thresholding Improves Sketching of Single-Cell Transcriptomic Heterogeneity. iScience 2020; 23:101126. [PMID: 32438285 PMCID: PMC7235285 DOI: 10.1016/j.isci.2020.101126] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/20/2020] [Revised: 04/20/2020] [Accepted: 04/28/2020] [Indexed: 12/11/2022] Open
Abstract
The massive size of single-cell RNA sequencing datasets often exceeds the capability of current computational analysis methods to solve routine tasks such as detection of cell types. Recently, geometric sketching was introduced as an alternative to uniform subsampling. It selects a subset of cells (the sketch) that evenly covers the transcriptomic space occupied by the original dataset, to accelerate downstream analyses and highlight rare cell types. Here, we propose the algorithm Sphetcher, which makes use of the thresholding technique to efficiently pick representative cells within spheres (as opposed to the typically used equal-sized boxes) that cover the entire transcriptomic space. We show that the spherical sketch computed by Sphetcher constitutes a more accurate representation of the original transcriptomic landscape. Our optimization scheme allows the inclusion of fairness aspects that can encode prior biological or experimental knowledge. We show how fair sampling can inform the inference of the trajectory of human skeletal muscle myoblast differentiation. Sphetcher distils large-scale scRNA-seq data down to a small selection of cells. Spheres of small radius around selected cells cover the original transcriptomic space. Selection enhances and accelerates downstream analysis such as trajectory inference. Sphetcher can leverage existing annotation of known cell types.
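The covering idea in this abstract (spheres of a given radius around selected cells cover all cells) can be illustrated with a simple greedy selection. This is a stand-in for illustration only and is not Sphetcher's thresholding algorithm; the function name `greedy_sphere_sketch` and its parameters are hypothetical.

```python
import math


def greedy_sphere_sketch(cells, radius):
    """Return indices of representatives whose spheres cover every cell.

    A cell becomes a representative only if it lies farther than `radius`
    from all representatives chosen so far, so each cell ends up within
    `radius` of at least one selected cell.
    """
    sketch = []
    for idx, cell in enumerate(cells):
        if all(math.dist(cell, cells[j]) > radius for j in sketch):
            sketch.append(idx)
    return sketch


# Usage: two tight groups of cells reduce to one representative each.
cells = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.2)]
representatives = greedy_sphere_sketch(cells, radius=1.0)
```

Because the number of representatives depends on how much space the data occupy rather than on how many cells fall in each region, a sparse rare population keeps a representative even when a dense population contributes most of the cells.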
Affiliation(s)
- Van Hoan Do
- Gene Center, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
| | - Khaled Elbassioni
- Khalifa University of Science and Technology, P.O. Box: 127788, Abu Dhabi, UAE
| | - Stefan Canzar
- Gene Center, Ludwig-Maximilians-Universität München, 81377 Munich, Germany.
| |
|
46
|
Cai Y, Dai Y, Wang Y, Yang Q, Guo J, Wei C, Chen W, Huang H, Zhu J, Zhang C, Zheng W, Wen Z, Liu H, Zhang M, Xing S, Jin Q, Feng CG, Chen X. Single-cell transcriptomics of blood reveals a natural killer cell subset depletion in tuberculosis. EBioMedicine 2020; 53:102686. [PMID: 32114394 PMCID: PMC7047188 DOI: 10.1016/j.ebiom.2020.102686] [Citation(s) in RCA: 86] [Impact Index Per Article: 21.5] [Received: 09/13/2019] [Revised: 02/09/2020] [Accepted: 02/10/2020] [Indexed: 12/14/2022] Open
Abstract
Background: Tuberculosis (TB) continues to be a critical global health problem, claiming millions of lives each year. Certain circulating cell subsets are thought to differentially modulate the host immune response towards Mycobacterium tuberculosis (Mtb) infection, but the nature and function of these subsets are unclear. Methods: Peripheral blood mononuclear cells (PBMC) were isolated from healthy controls (HC) and from individuals with latent tuberculosis infection (LTBI) or active tuberculosis (TB), and then subjected to single-cell RNA sequencing (scRNA-seq) using the 10x Genomics platform. Unsupervised clustering of the cells based on gene expression profiles was performed using the Seurat package, and the results were passed to tSNE for cluster visualization. Flow cytometry was used to validate the subsets identified by scRNA-seq. Findings: Cluster analysis based on differential gene expression revealed both known and novel markers for all main PBMC cell types and delineated 29 cell subsets. By comparing the scRNA-seq datasets from HC, LTBI and TB, we found that infection changes the frequency of immune-cell subsets in TB. Specifically, we observed gradual depletion of a natural killer (NK) cell subset (CD3-CD7+GZMB+) from HC, to LTBI and TB. We further verified the depletion of the CD3-CD7+GZMB+ subset in TB and found an increase in the frequency of this subset after anti-TB treatment. Finally, we confirmed that changes in the frequency of this subset can distinguish patients with TB from LTBI and HC. Interpretation: We propose that the frequency of CD3-CD7+GZMB+ in peripheral blood could be used as a novel biomarker for distinguishing TB from LTBI and HC. Fund: The study was supported by Natural Science Foundation of China (81770013, 81525016, 81772145, 81871255 and 91942315), National Science and Technology Major Project (2017ZX10201301), Science and Technology Project of Shenzhen (JCYJ20170412101048337) and Guangdong Provincial Key Laboratory of Regional Immunity and Diseases (2019B030301009).
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Affiliation(s)
- Yi Cai
- Guangdong Key Laboratory of Regional Immunity and Diseases, Department of Pathogen Biology, Shenzhen University School of Medicine, Shenzhen 518000, China
| | - Youchao Dai
- Guangdong Key Laboratory of Regional Immunity and Diseases, Department of Pathogen Biology, Shenzhen University School of Medicine, Shenzhen 518000, China; Research Institute of Infectious Diseases, Guangzhou Eighth People's Hospital, Guangzhou Medical University, Guangzhou 510000, China
| | - Yejun Wang
- Guangdong Key Laboratory of Regional Immunity and Diseases, Department of Pathogen Biology, Shenzhen University School of Medicine, Shenzhen 518000, China
| | - Qianqing Yang
- Guangdong Key Lab for Diagnosis &Treatment of Emerging Infectious Diseases, Shenzhen Third People's Hospital, Southern University of Science and Technology, Shenzhen 518000, China
| | - Jiubiao Guo
- Guangdong Key Laboratory of Regional Immunity and Diseases, Department of Pathogen Biology, Shenzhen University School of Medicine, Shenzhen 518000, China
| | - Cailing Wei
- Guangdong Key Lab for Diagnosis &Treatment of Emerging Infectious Diseases, Shenzhen Third People's Hospital, Southern University of Science and Technology, Shenzhen 518000, China
| | - Weixin Chen
- Guangdong Key Lab for Diagnosis &Treatment of Emerging Infectious Diseases, Shenzhen Third People's Hospital, Southern University of Science and Technology, Shenzhen 518000, China
| | - Huanping Huang
- Guangdong Key Laboratory of Regional Immunity and Diseases, Department of Pathogen Biology, Shenzhen University School of Medicine, Shenzhen 518000, China
| | - Jialou Zhu
- Guangdong Key Laboratory of Regional Immunity and Diseases, Department of Pathogen Biology, Shenzhen University School of Medicine, Shenzhen 518000, China
| | - Chi Zhang
- Shenzhen University General Hospital, Shenzhen University School of Medicine, Shenzhen 518000, China
| | - Weidong Zheng
- Shenzhen University General Hospital, Shenzhen University School of Medicine, Shenzhen 518000, China
| | - Zhihua Wen
- Yuebei Second People's Hospital, Shaoguan 512000, China
| | - Haiying Liu
- The MOH Key Laboratory of Systems Biology of Pathogens, Institute of Pathogen Biology, and Centre for Tuberculosis, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100176, China
| | - Mingxia Zhang
- Guangdong Key Lab for Diagnosis &Treatment of Emerging Infectious Diseases, Shenzhen Third People's Hospital, Southern University of Science and Technology, Shenzhen 518000, China
| | - Shaojun Xing
- Guangdong Key Laboratory of Regional Immunity and Diseases, Department of Pathogen Biology, Shenzhen University School of Medicine, Shenzhen 518000, China
| | - Qi Jin
- The MOH Key Laboratory of Systems Biology of Pathogens, Institute of Pathogen Biology, and Centre for Tuberculosis, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100176, China
| | - Carl G Feng
- Guangdong Key Laboratory of Regional Immunity and Diseases, Department of Pathogen Biology, Shenzhen University School of Medicine, Shenzhen 518000, China; Department of Infectious Diseases and Immunology, Sydney Medical School, the University of Sydney, Sydney, NSW 2006, Australia
| | - Xinchun Chen
- Guangdong Key Laboratory of Regional Immunity and Diseases, Department of Pathogen Biology, Shenzhen University School of Medicine, Shenzhen 518000, China.
| |
|
47
|
Lähnemann D, Köster J, Szczurek E, McCarthy DJ, Hicks SC, Robinson MD, Vallejos CA, Campbell KR, Beerenwinkel N, Mahfouz A, Pinello L, Skums P, Stamatakis A, Attolini CSO, Aparicio S, Baaijens J, Balvert M, Barbanson BD, Cappuccio A, Corleone G, Dutilh BE, Florescu M, Guryev V, Holmer R, Jahn K, Lobo TJ, Keizer EM, Khatri I, Kielbasa SM, Korbel JO, Kozlov AM, Kuo TH, Lelieveldt BP, Mandoiu II, Marioni JC, Marschall T, Mölder F, Niknejad A, Rączkowska A, Reinders M, Ridder JD, Saliba AE, Somarakis A, Stegle O, Theis FJ, Yang H, Zelikovsky A, McHardy AC, Raphael BJ, Shah SP, Schönhuth A. Eleven grand challenges in single-cell data science. Genome Biol 2020; 21:31. [PMID: 32033589 PMCID: PMC7007675 DOI: 10.1186/s13059-020-1926-6] [Citation(s) in RCA: 576] [Impact Index Per Article: 144.0] [Received: 08/02/2019] [Accepted: 01/02/2020] [Indexed: 02/08/2023] Open
Abstract
The recent boom in microfluidics and combinatorial indexing strategies, combined with low sequencing costs, has empowered single-cell sequencing technology. Thousands, or even millions, of cells analyzed in a single experiment amount to a data revolution in single-cell biology and pose unique data science problems. Here, we outline eleven challenges that will be central to bringing this emerging field of single-cell data science forward. For each challenge, we highlight motivating research questions, review prior work, and formulate open problems. This compendium is for established researchers, newcomers, and students alike, highlighting interesting and rewarding problems for the coming years.
Affiliation(s)
- David Lähnemann
- Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
- Department of Paediatric Oncology, Haematology and Immunology, Medical Faculty, Heinrich Heine University, University Hospital, Düsseldorf, Germany
- Computational Biology of Infection Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
| | - Johannes Köster
- Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
- Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, USA
| | - Ewa Szczurek
- Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warszawa, Poland
| | - Davis J. McCarthy
- Bioinformatics and Cellular Genomics, St Vincent’s Institute of Medical Research, Fitzroy, Australia
- Melbourne Integrative Genomics, School of BioSciences–School of Mathematics & Statistics, Faculty of Science, University of Melbourne, Melbourne, Australia
| | - Stephanie C. Hicks
- Department of Biostatistics, Johns Hopkins University, Baltimore, MD USA
| | - Mark D. Robinson
- Institute of Molecular Life Sciences and SIB Swiss Institute of Bioinformatics, University of Zürich, Zürich, Switzerland
| | - Catalina A. Vallejos
- MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, UK
- The Alan Turing Institute, British Library, London, UK
| | - Kieran R. Campbell
- Department of Statistics, University of British Columbia, Vancouver, Canada
- Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada
- Data Science Institute, University of British Columbia, Vancouver, Canada
| | - Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Ahmed Mahfouz
- Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
- Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Delft, The Netherlands
| | - Luca Pinello
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital Research Institute, Charlestown, USA
- Department of Pathology, Harvard Medical School, Boston, USA
- Broad Institute of Harvard and MIT, Cambridge, MA USA
| | - Pavel Skums
- Department of Computer Science, Georgia State University, Atlanta, USA
| | - Alexandros Stamatakis
- Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
| | | | - Samuel Aparicio
- Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, Canada
| | - Jasmijn Baaijens
- Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
| | - Marleen Balvert
- Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
- Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, The Netherlands
| | - Buys de Barbanson
- Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
- Oncode Institute, Utrecht, The Netherlands
- Quantitative biology, Hubrecht Institute, Utrecht, The Netherlands
| | - Antonio Cappuccio
- Institute for Advanced Study, University of Amsterdam, Amsterdam, The Netherlands
| | - Giacomo Corleone
- Department of Surgery and Cancer, The Imperial Centre for Translational and Experimental Medicine, Imperial College London, London, UK
| | - Bas E. Dutilh
- Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, The Netherlands
- Centre for Molecular and Biomolecular Informatics, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Maria Florescu
- Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
- Oncode Institute, Utrecht, The Netherlands
- Quantitative biology, Hubrecht Institute, Utrecht, The Netherlands
| | - Victor Guryev
- European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | - Rens Holmer
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
| | - Katharina Jahn
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Thamar Jessurun Lobo
- European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | - Emma M. Keizer
- Biometris, Wageningen University & Research, Wageningen, The Netherlands
| | - Indu Khatri
- Department of Immunohematology and Blood Transfusion, Leiden University Medical Center, Leiden, The Netherlands
| | - Szymon M. Kielbasa
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
| | - Jan O. Korbel
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Alexey M. Kozlov
- Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
| | - Tzu-Hao Kuo
- Computational Biology of Infection Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
| | - Boudewijn P.F. Lelieveldt
- PRB lab, Delft University of Technology, Delft, The Netherlands
- Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
| | - Ion I. Mandoiu
- Computer Science & Engineering Department, University of Connecticut, Storrs, USA
| | - John C. Marioni
- Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
| | - Tobias Marschall
- Center for Bioinformatics, Saarland University, Saarbrücken, Germany
- Max Planck Institute for Informatics, Saarbrücken, Germany
| | - Felix Mölder
- Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
- Institute of Pathology, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
| | - Amir Niknejad
- Computation molecular design, Zuse Institute Berlin, Berlin, Germany
- Mathematics Department, Mount Saint Vincent, New York, USA
| | - Alicja Rączkowska
- Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warszawa, Poland
| | - Marcel Reinders
- Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
- Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Delft, The Netherlands
| | - Jeroen de Ridder
- Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
- Oncode Institute, Utrecht, The Netherlands
| | - Antoine-Emmanuel Saliba
- Helmholtz Institute for RNA-based Infection Research, Helmholtz-Center for Infection Research, Würzburg, Germany
| | - Antonios Somarakis
- Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
| | - Oliver Stegle
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center–DKFZ, Heidelberg, Germany
| | - Fabian J. Theis
- Institute of Computational Biology, Helmholtz Zentrum München–German Research Center for Environmental Health, Neuherberg, Germany
| | - Huan Yang
- Division of Drug Discovery and Safety, Leiden Academic Center for Drug Research–LACDR–Leiden University, Leiden, The Netherlands
| | - Alex Zelikovsky
- Department of Computer Science, Georgia State University, Atlanta, USA
- The Laboratory of Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia
| | - Alice C. McHardy
- Computational Biology of Infection Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
| | | | - Sohrab P. Shah
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, USA
| | - Alexander Schönhuth
- Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
- Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, The Netherlands
| |
|
48
|
Shao X, Liao J, Lu X, Xue R, Ai N, Fan X. scCATCH: Automatic Annotation on Cell Types of Clusters from Single-Cell RNA Sequencing Data. iScience 2020; 23:100882. [PMID: 32062421 PMCID: PMC7031312 DOI: 10.1016/j.isci.2020.100882] [Citation(s) in RCA: 158] [Impact Index Per Article: 39.5] [Received: 09/25/2019] [Revised: 12/26/2019] [Accepted: 01/29/2020] [Indexed: 12/02/2022]
Abstract
Recent advancements in single-cell RNA sequencing (scRNA-seq) have facilitated the classification of thousands of cells through transcriptome profiling, wherein accurate cell type identification is critical for mechanistic studies. In most current analysis protocols, cell type-based cluster annotation is performed manually and relies heavily on prior knowledge, resulting in poor replicability of cell type annotation. This study aimed to introduce a single-cell Cluster-based Automatic Annotation Toolkit for Cellular Heterogeneity (scCATCH, https://github.com/ZJUFanLab/scCATCH). Using three benchmark datasets, the feasibility of the evidence-based scoring and tissue-specific cellular annotation strategies was demonstrated by high concordance among cell types, and scCATCH outperformed Seurat, a popular method for marker gene identification, and cell-based annotation methods. Furthermore, scCATCH accurately annotated 67%–100% (average, 83%) of clusters in six published scRNA-seq datasets originating from various tissues. The present results show that scCATCH accurately revealed cell identities with high reproducibility, thus potentially providing insights into mechanisms underlying disease pathogenesis and progression.
Highlights:
- Construction of a comprehensive tissue-specific reference database of cell markers
- Paired comparisons to identify potential marker genes for clusters to ensure accuracy
- Evidence-based scoring and annotation for clustered cells from scRNA-seq data
- Accurate and replicable annotation on cell types of clusters without prior knowledge
Affiliation(s)
- Xin Shao
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Jie Liao
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Xiaoyan Lu
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Rui Xue
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Ni Ai
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Xiaohui Fan
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| |
|
49
|
Sun S, Zhu J, Ma Y, Zhou X. Accuracy, robustness and scalability of dimensionality reduction methods for single-cell RNA-seq analysis. Genome Biol 2019; 20:269. [PMID: 31823809 PMCID: PMC6902413 DOI: 10.1186/s13059-019-1898-6] [Citation(s) in RCA: 108] [Impact Index Per Article: 21.6] [Received: 05/16/2019] [Accepted: 11/22/2019] [Indexed: 01/01/2023]
Abstract
BACKGROUND: Dimensionality reduction is an indispensable analytic component for many areas of single-cell RNA sequencing (scRNA-seq) data analysis. Proper dimensionality reduction can allow for effective noise removal and facilitate many downstream analyses that include cell clustering and lineage reconstruction. Unfortunately, despite the critical importance of dimensionality reduction in scRNA-seq analysis and the vast number of dimensionality reduction methods developed for scRNA-seq studies, few comprehensive comparison studies have been performed to evaluate the effectiveness of different dimensionality reduction methods in scRNA-seq.
RESULTS: We aim to fill this critical knowledge gap by providing a comparative evaluation of a variety of commonly used dimensionality reduction methods for scRNA-seq studies. Specifically, we compare 18 different dimensionality reduction methods on 30 publicly available scRNA-seq datasets that cover a range of sequencing techniques and sample sizes. We evaluate the performance of different dimensionality reduction methods for neighborhood preserving in terms of their ability to recover features of the original expression matrix, and for cell clustering and lineage reconstruction in terms of their accuracy and robustness. We also evaluate the computational scalability of different dimensionality reduction methods by recording their computational cost.
CONCLUSIONS: Based on the comprehensive evaluation results, we provide important guidelines for choosing dimensionality reduction methods for scRNA-seq data analysis. We also provide all analysis scripts used in the present study at www.xzlab.org/reproduce.html.
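The "neighborhood preserving" criterion named in the abstract can be illustrated with a small sketch. The scoring below (mean Jaccard overlap between each cell's k nearest neighbors before and after reduction) is an assumption chosen for illustration, since the abstract does not state the exact metric; the function names and toy data are hypothetical.

```python
import math

def knn(points, k):
    """For each point, return the index set of its k nearest neighbors (Euclidean)."""
    neighbors = []
    for i, p in enumerate(points):
        # Sort all other points by distance; ties are broken by index.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append({j for _, j in dists[:k]})
    return neighbors

def neighborhood_preservation(original, reduced, k):
    """Mean Jaccard overlap of k-NN sets in the original and the reduced space.

    Illustrative score only (an assumption, not necessarily the paper's metric).
    """
    nn_orig = knn(original, k)
    nn_red = knn(reduced, k)
    scores = [len(a & b) / len(a | b) for a, b in zip(nn_orig, nn_red)]
    return sum(scores) / len(scores)

# Toy data: 4 "cells" measured on 3 "genes"; the third gene is constant.
original = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 5.0, 1.0), (1.0, 5.0, 1.0)]
# Dropping the constant gene keeps every pairwise distance unchanged.
reduced_good = [p[:2] for p in original]
# Keeping only the first gene collapses the separation along the second gene.
reduced_bad = [p[:1] for p in original]

score_good = neighborhood_preservation(original, reduced_good, k=1)
score_bad = neighborhood_preservation(original, reduced_bad, k=1)
```

A score near 1 means the reduced representation keeps each cell's local neighborhood; a score near 0 means neighbors were reshuffled, which is the kind of distortion that can mislead downstream clustering.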
Affiliation(s)
- Shiquan Sun
- School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi, 710072, People's Republic of China
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Jiaqiang Zhu
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Ying Ma
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
| |
|
50
|
Sinha D, Sinha P, Saha R, Bandyopadhyay S, Sengupta D. Improved dropClust R package with integrative analysis support for scRNA-seq data. Bioinformatics 2019; 36:btz823. [PMID: 31693086 DOI: 10.1093/bioinformatics/btz823] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/05/2019] [Revised: 09/16/2019] [Accepted: 10/25/2019] [Indexed: 11/13/2022]
Abstract
SUMMARY: DropClust leverages Locality Sensitive Hashing (LSH) to speed up clustering of large-scale single-cell expression data. Here we present the improved dropClust, a complete R package that is fast, interoperable and minimally resource intensive. The new dropClust features a novel batch effect removal algorithm that allows integrative analysis of single-cell RNA-seq (scRNA-seq) datasets.
AVAILABILITY AND IMPLEMENTATION: dropClust is freely available at https://github.com/debsin/dropClust as an R package. A lightweight online version of dropClust is available at https://debsinha.shinyapps.io/dropClust/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
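To make the idea of Locality Sensitive Hashing concrete, here is a minimal sketch of one common LSH family, random-hyperplane (sign) hashing. This is not the dropClust implementation, and the abstract does not say which hashing scheme the package uses; the scheme, function names, and toy vectors are all assumptions for illustration.

```python
import random
from collections import defaultdict

def make_hyperplanes(n_planes, dim, seed=0):
    """Draw n_planes random Gaussian normal vectors of length dim (seeded)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_planes)]

def lsh_signature(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        int(sum(v * h for v, h in zip(vector, plane)) >= 0.0)
        for plane in hyperplanes
    )

def build_buckets(vectors, hyperplanes):
    """Group vector indices by signature; only bucket-mates are compared later."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        buckets[lsh_signature(vec, hyperplanes)].append(idx)
    return buckets

# Toy expression profiles: cell 1 points in the same direction as cell 0,
# while cell 2 points in the opposite direction.
cells = [(1.0, 2.0, 3.0), (2.0, 4.0, 6.0), (-1.0, -2.0, -3.0)]
planes = make_hyperplanes(n_planes=8, dim=3, seed=0)
buckets = build_buckets(cells, planes)
```

Vectors pointing in similar directions tend to share a signature, so a neighbor search only needs to scan within a bucket instead of across all cells. That restriction is the source of the speed-up that LSH-based clustering relies on.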
Affiliation(s)
- Debajyoti Sinha
- SyMeC Data Center, Indian Statistical Institute, Kolkata, India
- Department of Computer Science & Engineering, University of Calcutta, Kolkata, India
| | - Pradyumn Sinha
- Department of Computer Science & Engineering, Delhi Technological University, Delhi, India
| | - Ritwik Saha
- Department of Computer Science & Engineering, Delhi Technological University, Delhi, India
| | | | - Debarka Sengupta
- Department of Computer Science & Engineering, Department of Computational Biology, Center for Artificial Intelligence, Indraprastha Institute of Information Technology, New Delhi, India
| |
|