Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zhang Z, Cui F, Wang C, Zhao L, Zou Q. Goals and approaches for each processing step for single-cell RNA sequencing data. Brief Bioinform 2020;22:6034054. [PMID: 33316046 DOI: 10.1093/bib/bbaa314] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2020] [Revised: 10/10/2020] [Accepted: 10/16/2020] [Indexed: 12/12/2022] Open

Cited by Other Article(s)

Sun Y, Pan Z, Wang Z, Wang H, Wei L, Cui F, Zou Q, Zhang Z. Single-cell transcriptome analysis reveals immune microenvironment changes and insights into the transition from DCIS to IDC with associated prognostic genes. J Transl Med 2024;22:894. [PMID: 39363164 PMCID: PMC11448450 DOI: 10.1186/s12967-024-05706-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2024] [Accepted: 09/25/2024] [Indexed: 10/05/2024] Open

Chen M, Zou Q, Qi R, Ding Y. PseU-KeMRF: A Novel Method for Identifying RNA Pseudouridine Sites. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2024;21:1423-1435. [PMID: 38625768 DOI: 10.1109/tcbb.2024.3389094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/18/2024]

Abstract

Pseudouridine is a type of abundant RNA modification that is seen in many different animals and is crucial for a variety of biological functions. Accurately identifying pseudouridine sites within the RNA sequence is vital for the subsequent study of various biological mechanisms of pseudouridine. However, the use of traditional experimental methods faces certain challenges. The development of fast and convenient computational methods is necessary to accurately identify pseudouridine sites from RNA sequence information. To address this, we introduce a novel pseudouridine site prediction model called PseU-KeMRF, which can identify pseudouridine sites in three species, H. sapiens, S. cerevisiae, and M. musculus. Through comprehensive analysis, we selected four RNA coding schemes, including binary feature, position-specific trinucleotide propensity based on single strand (PSTNPss), nucleotide chemical property (NCP) and pseudo k-tuple composition (PseKNC). Then the support vector machine-recursive feature elimination (SVM-RFE) method was used for feature selection and the feature subset was optimized. Finally, the best feature subsets are input into the kernel based on multinomial random forests (KeMRF) classifier for cross-validation and independent testing. As a new classification method, compared with the traditional random forest, KeMRF not only improves the node splitting process of decision tree construction based on multinomial distribution, but also combines the easy to interpret kernel method for prediction, which makes the classification performance better. Our results indicate superior predictive performance of PseU-KeMRF over other existing models, which can prove that PseU-KeMRF is a highly competitive predictive model that can successfully identify pseudouridine sites in RNA sequences.

Yan C, Zhu Y, Chen M, Yang K, Cui F, Zou Q, Zhang Z. Integration tools for scRNA-seq data and spatial transcriptomics sequencing data. Brief Funct Genomics 2024;23:295-302. [PMID: 38267084 DOI: 10.1093/bfgp/elae002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2023] [Revised: 09/26/2023] [Accepted: 01/03/2024] [Indexed: 01/26/2024] Open

Derisoud E, Jiang H, Zhao A, Chavatte-Palmer P, Deng Q. Revealing the molecular landscape of human placenta: a systematic review and meta-analysis of single-cell RNA sequencing studies. Hum Reprod Update 2024;30:410-441. [PMID: 38478759 PMCID: PMC11215163 DOI: 10.1093/humupd/dmae006] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2023] [Revised: 02/12/2024] [Indexed: 07/02/2024] Open

Abstract

BACKGROUND

With increasing significance of developmental programming effects associated with placental dysfunction, more investigations are devoted to improving the characterization and understanding of placental signatures in health and disease. The placenta is a transitory but dynamic organ adapting to the shifting demands of fetal development and available resources of the maternal supply throughout pregnancy. Trophoblasts (cytotrophoblasts, syncytiotrophoblasts, and extravillous trophoblasts) are placental-specific cell types responsible for the main placental exchanges and adaptations. Transcriptomic studies with single-cell resolution have led to advances in understanding the placenta's role in health and disease. These studies, however, often show discrepancies in characterization of the different placental cell types.

OBJECTIVE AND RATIONALE

We aim to review the knowledge regarding placental structure and function gained from the use of single-cell RNA sequencing (scRNAseq), followed by comparing cell-type-specific genes, highlighting their similarities and differences. Moreover, we intend to identify consensus marker genes for the various trophoblast cell types across studies. Finally, we will discuss the contributions and potential applications of scRNAseq in studying pregnancy-related diseases.

SEARCH METHODS

We conducted a comprehensive systematic literature review to identify different cell types and their functions at the human maternal-fetal interface, focusing on all original scRNAseq studies on placentas published before March 2023 and published reviews (total of 28 studies identified) using PubMed search. Our approach involved curating cell types and subtypes that had previously been defined using scRNAseq and comparing the genes used as markers or identified as potential new markers. Next, we reanalyzed expression matrices from the six available scRNAseq raw datasets with cell annotations (four from first trimester and two at term), using Wilcoxon rank-sum tests to compare gene expression among studies and annotate trophoblast cell markers in both first trimester and term placentas. Furthermore, we integrated scRNAseq raw data available from 18 healthy first trimester and nine term placentas, and performed clustering and differential gene expression analysis. We further compared markers obtained with the analysis of annotated and raw datasets with the literature to obtain a common signature gene list for major placental cell types.

OUTCOMES

Variations in the sampling site, gestational age, fetal sex, and subsequent sequencing and analysis methods were observed between the studies. Although their proportions varied, the three trophoblast types were consistently identified across all scRNAseq studies, unlike other non-trophoblast cell types. Notably, no marker genes were shared by all studies for any of the investigated cell types. Moreover, most of the newly defined markers in one study were not observed in other studies. These discrepancies were confirmed by our analysis on trophoblast cell types, where hundreds of potential marker genes were identified in each study but with little overlap across studies. From 35 461 and 23 378 cells of high quality in the first trimester and term placentas, respectively, we obtained major placental cell types, including perivascular cells that previously had not been identified in the first trimester. Importantly, our meta-analysis provides marker genes for major placental cell types based on our extensive curation.

WIDER IMPLICATIONS

This review and meta-analysis emphasizes the need for establishing a consensus for annotating placental cell types from scRNAseq data. The marker genes identified here can be deployed for defining human placental cell types, thereby facilitating and improving the reproducibility of trophoblast cell annotation.
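
The Wilcoxon rank-sum comparison mentioned under SEARCH METHODS can be illustrated with a minimal sketch. It is not the review's actual pipeline: the function names and the toy expression values are hypothetical, and the p-value uses a plain normal approximation without tie or continuity correction, which is an assumption made to keep the example short.

```python
import math


def rank_with_ties(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, assumed)."""
    n1, n2 = len(x), len(y)
    ranks = rank_with_ties(list(x) + list(y))
    w = sum(ranks[:n1])  # rank sum of the first group
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p


# Hypothetical expression of one gene in two annotated cell groups.
group_a = [5, 6, 7, 8, 9]
group_b = [0, 1, 2, 3, 4]
z, p = rank_sum_test(group_a, group_b)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A positive z with a small p-value would suggest higher expression in the first group, which is the kind of evidence used when nominating a candidate marker gene. In practice a tested library routine would be preferable to this hand-written version.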

Sun Y, Kong L, Huang J, Deng H, Bian X, Li X, Cui F, Dou L, Cao C, Zou Q, Zhang Z. A comprehensive survey of dimensionality reduction and clustering methods for single-cell and spatial transcriptomics data. Brief Funct Genomics 2024:elae023. [PMID: 38860675 DOI: 10.1093/bfgp/elae023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/26/2023] [Revised: 02/29/2024] [Accepted: 05/27/2024] [Indexed: 06/12/2024] Open

Wang H, Liu Z, Ma X. Learning Consistency and Specificity of Cells From Single-Cell Multi-Omic Data. IEEE J Biomed Health Inform 2024;28:3134-3145. [PMID: 38709615 DOI: 10.1109/jbhi.2024.3370868] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/08/2024]

Duan H, Zhang Y, Qiu H, Fu X, Liu C, Zang X, Xu A, Wu Z, Li X, Zhang Q, Zhang Z, Cui F. Machine learning-based prediction model for distant metastasis of breast cancer. Comput Biol Med 2024;169:107943. [PMID: 38211382 DOI: 10.1016/j.compbiomed.2024.107943] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2023] [Revised: 12/10/2023] [Accepted: 01/01/2024] [Indexed: 01/13/2024]

Jiang J, Pei H, Li J, Li M, Zou Q, Lv Z. FEOpti-ACVP: identification of novel anti-coronavirus peptide sequences based on feature engineering and optimization. Brief Bioinform 2024;25:bbae037. [PMID: 38366802 PMCID: PMC10939380 DOI: 10.1093/bib/bbae037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2023] [Revised: 12/27/2023] [Accepted: 01/17/2024] [Indexed: 02/18/2024] Open

Ding Y, Zhou H, Zou Q, Yuan L. Identification of drug-side effect association via correntropy-loss based matrix factorization with neural tangent kernel. Methods 2023;219:73-81. [PMID: 37783242 DOI: 10.1016/j.ymeth.2023.09.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2023] [Revised: 09/18/2023] [Accepted: 09/20/2023] [Indexed: 10/04/2023] Open

Shi Q, Chen X, Zhang Z. Decoding Human Biology and Disease Using Single-cell Omics Technologies. GENOMICS, PROTEOMICS & BIOINFORMATICS 2023;21:926-949. [PMID: 37739168 PMCID: PMC10928380 DOI: 10.1016/j.gpb.2023.06.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/19/2023] [Revised: 05/22/2023] [Accepted: 06/08/2023] [Indexed: 09/24/2023]

Yu W, Wang C, Shang Z, Tian J. Unveiling novel insights in prostate cancer through single-cell RNA sequencing. Front Oncol 2023;13:1224913. [PMID: 37746302 PMCID: PMC10514910 DOI: 10.3389/fonc.2023.1224913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2023] [Accepted: 08/15/2023] [Indexed: 09/26/2023] Open

Fan R, Ding Y, Zou Q, Yuan L. Multi-view local hyperplane nearest neighbor model based on independence criterion for identifying vesicular transport proteins. Int J Biol Macromol 2023;247:125774. [PMID: 37437677 DOI: 10.1016/j.ijbiomac.2023.125774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2023] [Revised: 06/30/2023] [Accepted: 07/07/2023] [Indexed: 07/14/2023]

Qian Y, Shang T, Guo F, Wang C, Cui Z, Ding Y, Wu H. Identification of DNA-binding protein based multiple kernel model. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023;20:13149-13170. [PMID: 37501482 DOI: 10.3934/mbe.2023586] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/29/2023]

Jiao L, Ren Y, Wang L, Gao C, Wang S, Song T. MulCNN: An efficient and accurate deep learning method based on gene embedding for cell type identification in single-cell RNA-seq data. Front Genet 2023;14:1179859. [PMID: 37082202 PMCID: PMC10110861 DOI: 10.3389/fgene.2023.1179859] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/05/2023] [Accepted: 03/27/2023] [Indexed: 04/07/2023] Open

Zhang J, Liu X, Huang Z, Wu C, Zhang F, Han A, Stalin A, Lu S, Guo S, Huang J, Liu P, Shi R, Zhai Y, Chen M, Zhou W, Bai M, Wu J. T cell-related prognostic risk model and tumor immune environment modulation in lung adenocarcinoma based on single-cell and bulk RNA sequencing. Comput Biol Med 2023;152:106460. [PMID: 36565482 DOI: 10.1016/j.compbiomed.2022.106460] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2022] [Revised: 12/06/2022] [Accepted: 12/19/2022] [Indexed: 12/24/2022]

Abstract

BACKGROUND

T cells are present in all stages of tumor formation and play an important role in the tumor microenvironment. We aimed to explore the expression profile of T cell marker genes, construct a prognostic risk model based on these genes in lung adenocarcinoma (LUAD), and investigate the link between this risk model and the immunotherapy response.

METHODS

We obtained the single-cell sequencing data of LUAD from the literature, and screened out 6 tissue biopsy samples, including 32,108 cells from patients with non-small cell lung cancer, to identify T cell marker genes in LUAD. Combined with TCGA database, a prognostic risk model based on T-cell marker gene was constructed, and the data from GEO database was used for verification. We also investigated the association between this risk model and immunotherapy response.

RESULTS

Based on scRNA-seq data 1839 T-cell marker genes were identified, after which a risk model consisting of 9 gene signatures for prognosis was constructed in combination with the TCGA dataset. This risk model divided patients into high-risk and low-risk groups based on overall survival. The multivariate analysis demonstrated that the risk model was an independent prognostic factor. Analysis of immune profiles showed that high-risk groups presented discriminative immune-cell infiltrations and immune-suppressive states. Risk scores of the model were closely correlated with Linoleic acid metabolism, intestinal immune network for IgA production and drug metabolism cytochrome P450.

CONCLUSION

Our study proposed a novel prognostic risk model based on T cell marker genes for LUAD patients. The prognostic risk model may accurately predict the survival of LUAD patients as well as treatment outcomes, and the high-risk population shows a distinct immune cell infiltration and immunosuppression state.

Affiliation(s)

Jingyuan Zhang School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, 100029, China
Xinkui Liu School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, 100029, China
Zhihong Huang School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, 100029, China
Chao Wu School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, 100029, China
Fanqin Zhang School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, 100029, China
Aiqing Han School of Management, Beijing University of Chinese Medicine, Beijing, 100029, China
Antony Stalin Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 610054, China
Shan Lu School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, 100029, China
Siyu Guo School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, 100029, China
Jiaqi Huang School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, 100029, China
Pengyun Liu School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, 100029, China
Rui Shi School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, 100029, China
Yiyan Zhai School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, 100029, China
Meilin Chen School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, 100029, China
Wei Zhou Pharmacy Department, China-Japan Friendship Hospital, Beijing, 100029, China.
Meirong Bai Key Laboratory of Mongolian Medicine Research and Development Engineering, Ministry of Education, Tongliao, 028000, China.
Jiarui Wu School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, 100029, China.

Ao C, Jiao S, Wang Y, Yu L, Zou Q. Biological Sequence Classification: A Review on Data and General Methods. RESEARCH (WASHINGTON, D.C.) 2022;2022:0011. [PMID: 39285948 PMCID: PMC11404319 DOI: 10.34133/research.0011] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 07/25/2022] [Accepted: 10/25/2022] [Indexed: 09/19/2024]

Zeng L, Yang K, Zhang T, Zhu X, Hao W, Chen H, Ge J. Research progress of single-cell transcriptome sequencing in autoimmune diseases and autoinflammatory disease: A review. J Autoimmun 2022;133:102919. [PMID: 36242821 DOI: 10.1016/j.jaut.2022.102919] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/13/2022] [Revised: 09/16/2022] [Accepted: 09/19/2022] [Indexed: 12/07/2022]

Abstract

Autoimmunity refers to the phenomenon that the body's immune system produces antibodies or sensitized lymphocytes to its own tissues to cause an immune response. Immune disorders caused by autoimmunity can mediate autoimmune diseases. Autoimmune diseases have complicated pathogenesis due to the many types of cells involved, and the mechanism is still unclear. The emergence of single-cell research technology can solve the problem that ordinary transcriptome technology cannot be accurate to cell type. It provides unbiased results through independent analysis of cells in tissues and provides more mRNA information for identifying cell subpopulations, which provides a novel approach to study disruption of immune tolerance and disturbance of pro-inflammatory pathways on a cellular basis. It may fundamentally change the understanding of molecular pathways in the pathogenesis of autoimmune diseases and develop targeted drugs. Single-cell transcriptome sequencing (scRNA-seq) has been widely applied in autoimmune diseases, which provides a powerful tool for demonstrating the cellular heterogeneity of tissues involved in various immune inflammations, identifying pathogenic cell populations, and revealing the mechanism of disease occurrence and development. This review describes the principles of scRNA-seq, introduces common sequencing platforms and practical procedures, and focuses on the progress of scRNA-seq in 41 autoimmune diseases, which include 9 systemic autoimmune diseases and autoinflammatory diseases (rheumatoid arthritis, systemic lupus erythematosus, etc.) and 32 organ-specific autoimmune diseases (5 Skin diseases, 3 Nervous system diseases, 4 Eye diseases, 2 Respiratory system diseases, 2 Circulatory system diseases, 6 Liver, Gallbladder and Pancreas diseases, 2 Gastrointestinal system diseases, 3 Muscle, Bones and joint diseases, 3 Urinary system diseases, 2 Reproductive system diseases). This review also prospects the molecular mechanism targets of autoimmune diseases from the multi-molecular level and multi-dimensional analysis combined with single-cell multi-omics sequencing technology (such as scRNA-seq, Single cell ATAC-seq and single cell immune group library sequencing), which provides a reference for further exploring the pathogenesis and marker screening of autoimmune diseases and autoimmune inflammatory diseases in the future.

Zhao S, Zhang L, Liu X. AE-TPGG: a novel autoencoder-based approach for single-cell RNA-seq data imputation and dimensionality reduction. FRONTIERS OF COMPUTER SCIENCE 2022;17:173902. [PMID: 36320820 PMCID: PMC9607720 DOI: 10.1007/s11704-022-2011-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/07/2022] [Accepted: 04/15/2022] [Indexed: 06/16/2023]

Wang R, Peng G, Tam PPL, Jing N. Integration of computational analysis and spatial transcriptomics in single-cell study. GENOMICS, PROTEOMICS & BIOINFORMATICS 2022:S1672-0229(22)00084-5. [PMID: 35901961 PMCID: PMC10372908 DOI: 10.1016/j.gpb.2022.06.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/04/2020] [Revised: 06/08/2022] [Accepted: 06/19/2022] [Indexed: 04/08/2023]

Liu G, Li M, Wang H, Lin S, Xu J, Li R, Tang M, Li C. D3K: The Dissimilarity-Density-Dynamic Radius K-means Clustering Algorithm for scRNA-Seq Data. Front Genet 2022;13:912711. [PMID: 35846121 PMCID: PMC9284269 DOI: 10.3389/fgene.2022.912711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2022] [Accepted: 04/25/2022] [Indexed: 12/02/2022] Open

Zhang Z, Cui F, Su W, Dou L, Xu A, Cao C, Zou Q. webSCST: an interactive web application for single-cell RNA-sequencing data and spatial transcriptomic data integration. Bioinformatics 2022;38:3488-3489. [PMID: 35604082 DOI: 10.1093/bioinformatics/btac350] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2022] [Revised: 04/26/2022] [Accepted: 05/18/2022] [Indexed: 11/14/2022] Open

Jia Q, Chu H, Jin Z, Long H, Zhu B. High-throughput single-cell sequencing in cancer research. Signal Transduct Target Ther 2022;7:145. [PMID: 35504878 PMCID: PMC9065032 DOI: 10.1038/s41392-022-00990-4] [Citation(s) in RCA: 43] [Impact Index Per Article: 21.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/12/2022] [Revised: 03/23/2022] [Accepted: 04/08/2022] [Indexed: 12/22/2022] Open

Gan S, Deng H, Qiu Y, Alshahrani M, Liu S. DSAE-Impute: Learning Discriminative Stacked Autoencoders for Imputing Single-cell RNA-seq Data. Curr Bioinform 2022. [DOI: 10.2174/1574893617666220330151024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]

Abstract

Aims: In this research, we aim to propose an accurate deep learning method to impute the missing values in scRNA-seq data. DSAE-Impute employs stacked autoencoders to capture gene expression characteristics in the original missing data and combines the discriminative correlation matrix between cells to capture global expression features during the training process, so as to accurately predict missing values.

Background: Due to the limited amount of mRNA in single-cell, there are always many missing values in scRNA-seq data, which makes it impossible to accurately quantify the expression of single-cell RNA. The dropout phenomenon makes it impossible to detect the truly expressed genes in some cells, which greatly affects the downstream analysis on scRNA-seq data, such as cell cluster analysis and cell development trajectories.

Method: We propose a novel deep learning model based on the discriminative stacked autoencoders to impute the missing values in scRNA-seq data, named DSAE-Impute. DSAE-Impute embeds the discriminative cell similarity to perfect the feature representation of stacked autoencoders, and comprehensively learns the scRNA-seq data expression pattern through layer-by-layer training to achieve accurate imputation.

Result: We have systematically evaluated the performance of DSAE-Impute in the simulation and real datasets. The experimental results demonstrate that DSAE-Impute significantly improves downstream analysis, and its imputation results are more accurate compared with other state-of-the-art imputation methods.

Conclusion: Extensive experiments show that compared with other state-of-the-art methods, the imputation results of DSAE-Impute on simulated and real datasets are more accurate and helpful for downstream analysis.

Lall S, Ray S, Bandyopadhyay S. A copula based topology preserving graph convolution network for clustering of single-cell RNA-seq data. PLoS Comput Biol 2022;18:e1009600. [PMID: 35271564 PMCID: PMC8979455 DOI: 10.1371/journal.pcbi.1009600] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2021] [Revised: 04/04/2022] [Accepted: 01/27/2022] [Indexed: 11/18/2022] Open

Abstract

Annotation of cells in single-cell clustering requires a homogeneous grouping of cell populations. There are various issues in single cell sequencing that affect homogeneous grouping (clustering) of cells, such as small amount of starting RNA, limited per-cell sequenced reads, cell-to-cell variability due to cell-cycle, cellular morphology, and variable reagent concentrations. Moreover, single cell data is susceptible to technical noise, which affects the quality of genes (or features) selected/extracted prior to clustering.

Here we introduce sc-CGconv (copula based graph convolution network for single clustering), a stepwise robust unsupervised feature extraction and clustering approach that formulates and aggregates cell–cell relationships using copula correlation (Ccor), followed by a graph convolution network based clustering approach. sc-CGconv formulates a cell-cell graph using Ccor that is learned by a graph-based artificial intelligence model, graph convolution network. The learned representation (low dimensional embedding) is utilized for cell clustering. sc-CGconv features the following advantages. a. sc-CGconv works with substantially smaller sample sizes to identify homogeneous clusters. b. sc-CGconv can model the expression co-variability of a large number of genes, thereby outperforming state-of-the-art gene selection/extraction methods for clustering. c. sc-CGconv preserves the cell-to-cell variability within the selected gene set by constructing a cell-cell graph through copula correlation measure. d. sc-CGconv provides a topology-preserving embedding of cells in low dimensional space.

One of the important aspects of single cell downstream analysis is to classify cells into subpopulations. This immediately leads to clustering of cells into homogeneous groups, which faces many issues due to (i) small amount of starting RNA, (ii) cell-to-cell variability, (iii) technical noise incorporated within the single cell sequencing technology, and (iv) unavailability of discriminating selected/extracted genes (features) in the preprocessing step of downstream analysis. We proposed sc-CGconv, a stepwise feature extraction and clustering framework, which leverages the advantages of copula and graph convolution networks in the single-cell analysis domain. sc-CGconv outperforms the state-of-the-art feature selection/extraction methods in the preprocessing steps, performs well with small sample size data, can preserve the cell-to-cell variability within the extracted features, and provides a topology-preserving embedding of cells in low dimensional space. sc-CGconv therefore successfully addresses the above-mentioned key challenges.

Leote AC, Wu X, Beyer A. Regulatory network-based imputation of dropouts in single-cell RNA sequencing data. PLoS Comput Biol 2022;18:e1009849. [PMID: 35176023 PMCID: PMC8890719 DOI: 10.1371/journal.pcbi.1009849] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2021] [Revised: 03/02/2022] [Accepted: 01/18/2022] [Indexed: 01/07/2023] Open

Abstract

Single-cell RNA sequencing (scRNA-seq) methods are typically unable to quantify the expression levels of all genes in a cell, creating a need for the computational prediction of missing values (‘dropout imputation’). Most existing dropout imputation methods are limited in the sense that they exclusively use the scRNA-seq dataset at hand and do not exploit external gene-gene relationship information. Further, it is unknown if all genes equally benefit from imputation or which imputation method works best for a given gene. Here, we show that a transcriptional regulatory network learned from external, independent gene expression data improves dropout imputation. Using a variety of human scRNA-seq datasets we demonstrate that our network-based approach outperforms published state-of-the-art methods. The network-based approach performs particularly well for lowly expressed genes, including cell-type-specific transcriptional regulators. Further, the cell-to-cell variation of 11.3% to 48.8% of the genes could not be adequately imputed by any of the methods that we tested. In those cases gene expression levels were best predicted by the mean expression across all cells, i.e. assuming no measurable expression variation between cells. These findings suggest that different imputation methods are optimal for different genes. We thus implemented an R-package called ADImpute (available via Bioconductor https://bioconductor.org/packages/release/bioc/html/ADImpute.html) that automatically determines the best imputation method for each gene in a dataset. Our work represents a paradigm shift by demonstrating that there is no single best imputation method. Instead, we propose that imputation should maximally exploit external information and be adapted to gene-specific features, such as expression level and expression variation across cells.
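
The idea of choosing an imputation method per gene can be sketched as follows. This is not the ADImpute implementation or its API: all function names are hypothetical, the single-regulator line fit is only a stand-in for a network-based predictor, and the held-out-cell scoring is an assumed selection rule. The mean-expression baseline is the one element taken directly from the abstract.

```python
from statistics import mean


def mean_baseline(train_x, train_y, test_x):
    """Predict the gene's mean training expression for every held-out cell."""
    m = mean(train_y)
    return [m for _ in test_x]


def regulator_fit(train_x, train_y, test_x):
    """Predict from one regulator gene via a least-squares line.

    Hypothetical stand-in for a regulatory-network model.
    """
    mx, my = mean(train_x), mean(train_y)
    sxx = sum((a - mx) ** 2 for a in train_x)
    if sxx == 0:  # regulator is constant: fall back to the mean
        return [my for _ in test_x]
    slope = sum((a - mx) * (b - my) for a, b in zip(train_x, train_y)) / sxx
    return [my + slope * (a - mx) for a in test_x]


def mse(pred, truth):
    """Mean squared error between predictions and held-out values."""
    return mean([(p - t) ** 2 for p, t in zip(pred, truth)])


def best_method(regulator, target, holdout):
    """Return (name, errors) for the method with the lowest held-out error."""
    train = [i for i in range(len(target)) if i not in holdout]
    train_x = [regulator[i] for i in train]
    train_y = [target[i] for i in train]
    test_x = [regulator[i] for i in holdout]
    truth = [target[i] for i in holdout]
    methods = {"mean": mean_baseline, "regulator": regulator_fit}
    errors = {name: mse(fn(train_x, train_y, test_x), truth)
              for name, fn in methods.items()}
    return min(errors, key=errors.get), errors


# Toy data (hypothetical): one regulator and two target genes across six cells.
regulator = [1, 2, 3, 4, 5, 6]
gene_a = [2, 4, 6, 8, 10, 12]  # tracks the regulator closely
gene_b = [3, 7, 4, 6, 5, 5]    # only loosely related to the regulator
for name, gene in [("gene_a", gene_a), ("gene_b", gene_b)]:
    choice, _ = best_method(regulator, gene, holdout=[1, 4])
    print(name, "->", choice)
```

Scoring each candidate on observed values that were deliberately hidden lets the choice differ from gene to gene, in the spirit of the abstract's conclusion that no single method is best for all genes.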

Cui F, Zhang Z, Cao C, Zou Q, Chen D, Su X. Protein-DNA/RNA interactions: Machine intelligence tools and approaches in the era of artificial intelligence and big data. Proteomics 2022;22:e2100197. [PMID: 35112474 DOI: 10.1002/pmic.202100197] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2021] [Revised: 01/02/2022] [Accepted: 01/17/2022] [Indexed: 11/09/2022]

Cell Heterogeneity Analysis in Single-Cell RNA-seq Data Using Mixture Exponential Graph and Markov Random Field Model. BIOMED RESEARCH INTERNATIONAL 2021;2021:9919080. [PMID: 34095314 PMCID: PMC8164540 DOI: 10.1155/2021/9919080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/15/2021] [Accepted: 04/30/2021] [Indexed: 11/18/2022]

Zhang Z, Cui F, Lin C, Zhao L, Wang C, Zou Q. Critical downstream analysis steps for single-cell RNA sequencing data. Brief Bioinform 2021;22:6210064. [PMID: 33822873 DOI: 10.1093/bib/bbab105] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2021] [Revised: 02/20/2021] [Accepted: 03/09/2021] [Indexed: 12/13/2022] Open