Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Yang Q, Xu Z, Zhou W, Wang P, Jiang Q, Juan L. An interpretable single-cell RNA sequencing data clustering method based on latent Dirichlet allocation. Brief Bioinform 2023;24:bbad199. [PMID: 37225419 PMCID: PMC10359080 DOI: 10.1093/bib/bbad199] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/26/2023] [Revised: 05/04/2023] [Accepted: 05/08/2023] [Indexed: 05/26/2023] Open

Cited by Other Article(s)

Rebboah E, Rezaie N, Williams BA, Weimer AK, Shi M, Yang X, Liang HY, Dionne LA, Reese F, Trout D, Jou J, Youngworth I, Reinholdt L, Morabito S, Snyder MP, Wold BJ, Mortazavi A. The ENCODE mouse postnatal developmental time course identifies regulatory programs of cell types and cell states. bioRxiv 2024:2024.06.12.598567. [PMID: 38915583 PMCID: PMC11195270 DOI: 10.1101/2024.06.12.598567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]

Affiliation(s)

Elisabeth Rebboah: Developmental and Cell Biology, University of California Irvine, Irvine, USA; Center for Complex Biological Systems, University of California Irvine, Irvine, USA
Narges Rezaie: Developmental and Cell Biology, University of California Irvine, Irvine, USA; Center for Complex Biological Systems, University of California Irvine, Irvine, USA
Brian A. Williams: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, USA
Annika K. Weimer: Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, USA
Minyi Shi: Department of Next Generation Sequencing and Microchemistry, Proteomics and Lipidomics, Genentech, San Francisco, USA
Xinqiong Yang: Department of Genetics, Stanford University School of Medicine, Palo Alto, USA
Heidi Yahan Liang: Developmental and Cell Biology, University of California Irvine, Irvine, USA
Louise A. Dionne: The Jackson Laboratory, Bar Harbor, USA
Fairlie Reese: Developmental and Cell Biology, University of California Irvine, Irvine, USA
Diane Trout: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, USA
Jennifer Jou: Department of Genetics, Stanford University School of Medicine, Palo Alto, USA
Ingrid Youngworth: Department of Genetics, Stanford University School of Medicine, Palo Alto, USA
Laura Reinholdt: The Jackson Laboratory, Bar Harbor, USA
Samuel Morabito: Developmental and Cell Biology, University of California Irvine, Irvine, USA; Center for Complex Biological Systems, University of California Irvine, Irvine, USA
Michael P. Snyder: Department of Genetics, Stanford University School of Medicine, Palo Alto, USA
Barbara J. Wold: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, USA
Ali Mortazavi: Developmental and Cell Biology, University of California Irvine, Irvine, USA; Center for Complex Biological Systems, University of California Irvine, Irvine, USA

Tiong KL, Luzhbin D, Yeang CH. Assessing transcriptomic heterogeneity of single-cell RNASeq data by bulk-level gene expression data. BMC Bioinformatics 2024;25:209. [PMID: 38867193 PMCID: PMC11167951 DOI: 10.1186/s12859-024-05825-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 06/03/2024] [Indexed: 06/14/2024] Open

Abstract

BACKGROUND

Single-cell RNA sequencing (sc-RNASeq) data illuminate transcriptomic heterogeneity but also carry a high level of noise, abundant missing entries, and sometimes inadequate or absent cell type annotations. Bulk-level gene expression data lack direct information on cell population composition but are more robust and complete and are often better annotated. We propose a modeling framework that integrates bulk-level and single-cell RNASeq data to address the deficiencies and leverage the strengths of each data type, enabling a more comprehensive inference of their transcriptomic heterogeneity. In contrast to the standard approaches of factorizing the bulk-level data with one algorithm and (for some methods) treating single-cell RNASeq data as references to decompose bulk-level data, we employed multiple deconvolution algorithms to factorize the bulk-level data, constructed probabilistic graphical models of cell-level gene expression from the decomposition outcomes, and compared the log-likelihood scores of these models on single-cell data. We term this framework backward deconvolution because inference proceeds from coarse-grained bulk-level data to fine-grained single-cell data. As the abundant missing entries in sc-RNASeq data have a significant effect on log-likelihood scores, we also developed a criterion for including or excluding zero entries in the log-likelihood score computation.

RESULTS

We selected nine deconvolution algorithms and validated backward deconvolution on five datasets. In the in-silico mixtures of mouse sc-RNASeq data, the log-likelihood scores of the deconvolution algorithms were strongly anticorrelated with their errors in mixture coefficients and cell type-specific gene expression signatures. In the true bulk-level mouse data, the sample mixture coefficients were unknown, but the log-likelihood scores were strongly correlated with the accuracy rates of inferred cell types. In the data from autism spectrum disorder (ASD) and normal controls, we found that ASD brains possessed higher fractions of astrocytes and lower fractions of NRGN-expressing neurons than normal controls. In datasets of breast cancer and low-grade gliomas (LGG), we compared the log-likelihood scores of three simple hypotheses about the gene expression patterns of the cell types underlying the tumor subtypes. The model in which tumors of each subtype were dominated by one cell type consistently outperformed an alternative model in which each cell type had elevated expression in one gene group and tumors were mixtures of those cell types. The superiority of the former model is also supported by comparing the real breast cancer sc-RNASeq clusters with those generated from simulated sc-RNASeq data.

CONCLUSIONS

The results indicate that backward deconvolution serves as a sensible model selection tool for deconvolution algorithms and facilitates discerning hypotheses about cell type compositions underlying heterogeneous specimens such as tumors.
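The model-comparison idea in this abstract (score several candidate decompositions by their log-likelihood on single-cell data, with a choice of whether zero entries count) can be illustrated with a minimal sketch. This is not the authors' implementation: the Poisson likelihood, the function names `poisson_loglik` and `score_model`, the toy counts, and the `include_zeros` flag are all assumptions made for illustration, and the abstract does not state the actual criterion for handling zeros.

```python
import numpy as np

def poisson_loglik(counts, rates, include_zeros=True):
    """Poisson log-likelihood of one cell, up to the model-independent log(x!) term.

    Assumption: a Poisson model with strictly positive rates stands in for the
    probabilistic graphical models described in the abstract.
    """
    counts = np.asarray(counts, dtype=float)
    rates = np.asarray(rates, dtype=float)
    ll = counts * np.log(rates) - rates
    if not include_zeros:
        # Hypothetical switch: drop zero entries, which may be dropouts.
        ll = ll[counts > 0]
    return float(ll.sum())

def score_model(cells, signatures, include_zeros=True):
    """Assign each cell to its best-fitting signature and sum the log-likelihoods."""
    return sum(
        max(poisson_loglik(cell, sig, include_zeros) for sig in signatures)
        for cell in cells
    )

# Toy single-cell counts (2 cells x 3 genes); values are made up.
cells = np.array([[10, 1, 0], [0, 2, 9]])

# Two hypothetical candidate decompositions (cell-type signatures per model).
two_type_model = np.array([[10.0, 1.0, 0.5], [0.5, 2.0, 9.0]])
one_type_model = np.array([[4.0, 4.0, 4.0]])

scores = {
    "two_type": score_model(cells, two_type_model),
    "one_type": score_model(cells, one_type_model),
}
best = max(scores, key=scores.get)
print(best, scores)
```

In this toy case the two-signature model scores higher because each cell matches one of its signatures closely; the ranking is the same with `include_zeros=False`, although that need not hold for real, sparse data.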

Peng L, Gao P, Xiong W, Li Z, Chen X. Identifying potential ligand-receptor interactions based on gradient boosted neural network and interpretable boosting machine for intercellular communication analysis. Comput Biol Med 2024;171:108110. [PMID: 38367445 DOI: 10.1016/j.compbiomed.2024.108110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 01/24/2024] [Accepted: 02/04/2024] [Indexed: 02/19/2024]

Abstract

Cell-cell communication is essential to many key biological processes. Intercellular communication is generally mediated by ligand-receptor interactions (LRIs). Thus, building a comprehensive and high-quality LRI resource can significantly improve intercellular communication analysis. Meanwhile, because a "gold standard" dataset is lacking, evaluating LRI-mediated intercellular communication results remains a challenge. Here, we introduce CellGiQ, a high-confidence LRI prediction framework for intercellular communication analysis. High-confidence LRIs are first inferred by LRI feature extraction with BioTriangle, LRI selection using LightGBM, and LRI classification based on an ensemble of a gradient boosted neural network and an interpretable boosting machine. Subsequently, known and identified high-confidence LRIs are filtered by combining single-cell RNA sequencing (scRNA-seq) data and further applied to intercellular communication inference through a quartile scoring strategy. To validate the predictions, CellGiQ used several evaluation strategies: in terms of AUC and AUPR, it surpassed six competing LRI prediction models on four LRI datasets; through Venn diagrams and molecular docking, its predicted LRIs were validated by five other popular intercellular communication inference methods; based on the overlapping LRIs, it achieved a high Jaccard index with six other state-of-the-art intercellular communication prediction tools within human HNSCC tissues; and by comparison with classical models and literature retrieval, its inferred HNSCC-related intercellular communication results were further validated. The novelty of this study lies in identifying high-confidence LRIs with machine learning and in designing several LRI validation approaches, providing a reference for computational LRI prediction. CellGiQ provides an open-source and useful tool to decompose LRI-mediated intercellular communication at single-cell resolution. CellGiQ is freely available at https://github.com/plhhnu/CellGiQ.
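The Jaccard index mentioned in this abstract measures the overlap between two sets as the size of their intersection divided by the size of their union. A minimal sketch follows, assuming each tool's output is a set of (ligand, receptor) pairs; the function name `jaccard_index` and the placeholder pair names are hypothetical and are not taken from CellGiQ or its results.

```python
def jaccard_index(pairs_a, pairs_b):
    """Jaccard index |A & B| / |A | B| of two collections of ligand-receptor pairs."""
    a, b = set(pairs_a), set(pairs_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets have overlap 0.
        return 0.0
    return len(a & b) / len(union)

# Placeholder predictions from two hypothetical tools (not real results).
tool_a = {("L1", "R1"), ("L2", "R2"), ("L3", "R3")}
tool_b = {("L2", "R2"), ("L3", "R3"), ("L4", "R4")}

overlap = jaccard_index(tool_a, tool_b)
print(overlap)  # 2 shared pairs out of 4 distinct pairs -> 0.5
```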

Rezaie N, Rebboah E, Williams BA, Liang HY, Reese F, Balderrama-Gutierrez G, Dionne LA, Reinholdt L, Trout D, Wold BJ, Mortazavi A. Identification of robust cellular programs using reproducible LDA that impact sex-specific disease progression in different genotypes of a mouse model of AD. bioRxiv 2024:2024.02.26.582178. [PMID: 38464087 PMCID: PMC10925135 DOI: 10.1101/2024.02.26.582178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]

Zhang H, Lu X, Lu B, Gullo G, Chen L. Measuring the composition of the tumor microenvironment with transcriptome analysis: past, present and future. Future Oncol 2024. [PMID: 38362731 DOI: 10.2217/fon-2023-0658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Accepted: 01/24/2024] [Indexed: 02/17/2024] Open

Zhou S, Li Y, Wu W, Li L. scMMT: a multi-use deep learning approach for cell annotation, protein prediction and embedding in single-cell RNA-seq data. Brief Bioinform 2024;25:bbad523. [PMID: 38300515 PMCID: PMC10833085 DOI: 10.1093/bib/bbad523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Revised: 11/27/2023] [Accepted: 12/19/2023] [Indexed: 02/02/2024] Open

Zhang H, Lu X, Lu B, Chen L. scGEM: Unveiling the Nested Tree-Structured Gene Co-Expressing Modules in Single Cell Transcriptome Data. Cancers (Basel) 2023;15:4277. [PMID: 37686554 PMCID: PMC10486867 DOI: 10.3390/cancers15174277] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 08/22/2023] [Accepted: 08/25/2023] [Indexed: 09/10/2023] Open