Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zhu L, Lei J, Devlin B, Roeder K. A unified statistical framework for single cell and bulk RNA sequencing data. Ann Appl Stat 2018;12:609-632. [PMID: 30174778 DOI: 10.1214/17-aoas1110] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Indexed: 11/19/2022]

Cited by Other Article(s)

Goss K, Horwitz EM. Single-cell multiomics to advance cell therapy. Cytotherapy 2025;27:137-145. [PMID: 39530970 DOI: 10.1016/j.jcyt.2024.10.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2024] [Revised: 10/21/2024] [Accepted: 10/21/2024] [Indexed: 11/16/2024]

Sharifitabar M, Kazempour S, Razavian J, Sajedi S, Solhjoo S, Zare H. A deep neural network to de-noise single-cell RNA sequencing data. bioRxiv 2024:2024.11.20.624552. [PMID: 39605470 PMCID: PMC11601639 DOI: 10.1101/2024.11.20.624552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2024]

Özden F, Minary P. Learning to quantify uncertainty in off-target activity for CRISPR guide RNAs. Nucleic Acids Res 2024;52:e87. [PMID: 39275984 PMCID: PMC11472043 DOI: 10.1093/nar/gkae759] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 08/07/2024] [Accepted: 08/23/2024] [Indexed: 09/16/2024]

Tiong KL, Luzhbin D, Yeang CH. Assessing transcriptomic heterogeneity of single-cell RNASeq data by bulk-level gene expression data. BMC Bioinformatics 2024;25:209. [PMID: 38867193 PMCID: PMC11167951 DOI: 10.1186/s12859-024-05825-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 06/03/2024] [Indexed: 06/14/2024]

Abstract

BACKGROUND

Single-cell RNA sequencing (sc-RNASeq) data illuminate transcriptomic heterogeneity but also carry a high level of noise, abundant missing entries, and sometimes inadequate or no cell type annotations. Bulk-level gene expression data lack direct information about cell population composition but are more robust and complete and are often better annotated. We propose a modeling framework that integrates bulk-level and single-cell RNASeq data to address the deficiencies and leverage the strengths of each data type, enabling a more comprehensive inference of their transcriptomic heterogeneity. In contrast to the standard approach of factorizing the bulk-level data with one algorithm and (for some methods) treating single-cell RNASeq data as references to decompose bulk-level data, we employed multiple deconvolution algorithms to factorize the bulk-level data, constructed probabilistic graphical models of cell-level gene expression from the decomposition outcomes, and compared the log-likelihood scores of these models on single-cell data. We term this framework backward deconvolution because inference operates from coarse-grained bulk-level data to fine-grained single-cell data. As the abundant missing entries in sc-RNASeq data have a significant effect on log-likelihood scores, we also developed a criterion for including or excluding zero entries in log-likelihood score computation.

RESULTS

We selected nine deconvolution algorithms and validated backward deconvolution on five datasets. In in-silico mixtures of mouse sc-RNASeq data, the log-likelihood scores of the deconvolution algorithms were strongly anticorrelated with their errors in mixture coefficients and cell-type-specific gene expression signatures. In the true bulk-level mouse data, the sample mixture coefficients were unknown, but the log-likelihood scores were strongly correlated with the accuracy rates of inferred cell types. In data from autism spectrum disorder (ASD) cases and normal controls, we found that ASD brains possessed higher fractions of astrocytes and lower fractions of NRGN-expressing neurons than normal controls. In datasets of breast cancer and low-grade gliomas (LGG), we compared the log-likelihood scores of three simple hypotheses about the gene expression patterns of the cell types underlying the tumor subtypes. The model in which tumors of each subtype were dominated by one cell type consistently outperformed an alternative model in which each cell type had elevated expression in one gene group and tumors were mixtures of those cell types. Superiority of the former model is also supported by comparing the real breast cancer sc-RNASeq clusters with those generated from simulated sc-RNASeq data.

CONCLUSIONS

The results indicate that backward deconvolution serves as a sensible model selection tool for deconvolution algorithms and facilitates discerning hypotheses about cell type compositions underlying heterogeneous specimens such as tumors.
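The model-selection idea in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract does not specify the probabilistic graphical model, so the code below assumes a simple Gaussian mixture with one component per cell type, and the names `cell_log_likelihood`, `rank_candidates`, `algo_A`, `algo_B`, and the `skip_zeros` flag are hypothetical. In particular, `skip_zeros` is a crude stand-in for the paper's criterion on zero entries, which is not described here.

```python
import math

def cell_log_likelihood(cell, signatures, weights, sigma=1.0, skip_zeros=False):
    """Log-likelihood of one cell under an assumed Gaussian mixture.

    signatures: one expression vector per cell type (from a deconvolution output)
    weights:    mixture proportions of the cell types (should sum to 1)
    skip_zeros: if True, genes observed as exactly 0 in this cell are ignored
    """
    kept = [g for g, x in enumerate(cell) if not (skip_zeros and x == 0)]
    log_norm = -0.5 * len(kept) * math.log(2 * math.pi * sigma ** 2)
    comps = []
    for sig, w in zip(signatures, weights):
        sq_err = sum((cell[g] - sig[g]) ** 2 for g in kept)
        comps.append(math.log(w) + log_norm - 0.5 * sq_err / sigma ** 2)
    top = max(comps)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(c - top) for c in comps))

def rank_candidates(cells, candidates, **kwargs):
    """Score each candidate deconvolution on single-cell data.

    candidates maps a name to (signatures, weights).
    Returns (names sorted best first, dict of total log-likelihoods).
    """
    scores = {
        name: sum(cell_log_likelihood(c, sigs, wts, **kwargs) for c in cells)
        for name, (sigs, wts) in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True), scores

# Toy single-cell data: two cells of each of two types, three genes.
cells = [[9.8, 0.1, 0.0], [10.2, 0.0, 0.3], [0.0, 9.9, 0.2], [0.1, 10.1, 0.0]]

# Two hypothetical deconvolution outputs obtained from bulk data.
candidates = {
    "algo_A": ([[10.0, 0.0, 0.0], [0.0, 10.0, 0.0]], [0.5, 0.5]),
    "algo_B": ([[5.0, 5.0, 0.0], [0.0, 5.0, 5.0]], [0.5, 0.5]),
}

ranking, scores = rank_candidates(cells, candidates)
print(ranking, scores)
```

In this toy setting the candidate whose signatures match the simulated cells receives the higher total log-likelihood, which mirrors the abstract's use of log-likelihood scores to compare deconvolution algorithms.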

Wang L, Hong C, Song J, Yao J. CTEC: a cross-tabulation ensemble clustering approach for single-cell RNA sequencing data analysis. Bioinformatics 2024;40:btae130. [PMID: 38552307 PMCID: PMC10985676 DOI: 10.1093/bioinformatics/btae130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 02/11/2024] [Indexed: 04/04/2024]

Wusiman D, Li W, Guo L, Huang Z, Zhang Y, Zhang X, Zhao X, Li L, An Z, Li Z, Ying J, An C. Comprehensive analysis of single-cell and bulk RNA-sequencing data identifies B cell marker genes signature that predicts prognosis and analysis of immune checkpoints expression in head and neck squamous cell carcinoma. Heliyon 2023;9:e22656. [PMID: 38125461 PMCID: PMC10731009 DOI: 10.1016/j.heliyon.2023.e22656] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 11/13/2023] [Accepted: 11/16/2023] [Indexed: 12/23/2023]

Affiliation(s)

Dilinaer Wusiman: Department of Head and Neck Surgery, National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Wenbin Li: Department of Pathology, National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Lei Guo: Department of Pathology, National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Zehao Huang: Department of Head and Neck Surgery, National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Yi Zhang: Department of Pathology, National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Xiwei Zhang: Department of Head and Neck Surgery, National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Xiaohui Zhao: Department of Head and Neck Surgery, National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Lin Li: Department of Pathology, National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Zhaohong An: Department of Head and Neck Surgery, National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Zhengjiang Li: Department of Head and Neck Surgery, National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Jianming Ying: Department of Pathology, National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
Changming An: Department of Head and Neck Surgery, National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China

van den Oord EJCG, Aberg KA. Fine-grained cell-type specific association studies with human bulk brain data using a large single-nucleus RNA sequencing based reference panel. Sci Rep 2023;13:13004. [PMID: 37563216 PMCID: PMC10415334 DOI: 10.1038/s41598-023-39864-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/25/2022] [Accepted: 08/01/2023] [Indexed: 08/12/2023]

Pan Y, Landis JT, Moorad R, Wu D, Marron JS, Dittmer DP. The Poisson distribution model fits UMI-based single-cell RNA-sequencing data. BMC Bioinformatics 2023;24:256. [PMID: 37330471 PMCID: PMC10276395 DOI: 10.1186/s12859-023-05349-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Accepted: 05/24/2023] [Indexed: 06/19/2023]

Jee DJ, Kong Y, Chun H. Deep Nonnegative Matrix Factorization Using a Variational Autoencoder With Application to Single-Cell RNA Sequencing Data. IEEE/ACM Trans Comput Biol Bioinform 2023;20:883-893. [PMID: 35511832 DOI: 10.1109/tcbb.2022.3172723] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]

Karikomi M, Zhou P, Nie Q. DURIAN: an integrative deconvolution and imputation method for robust signaling analysis of single-cell transcriptomics data. Brief Bioinform 2022;23:6609525. [PMID: 35709795 PMCID: PMC9294432 DOI: 10.1093/bib/bbac223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2022] [Revised: 04/29/2022] [Accepted: 05/11/2022] [Indexed: 01/31/2023]

Ni Z, Zheng X, Zheng X, Zou X. scLRTD: A Novel Low Rank Tensor Decomposition Method for Imputing Missing Values in Single-Cell Multi-Omics Sequencing Data. IEEE/ACM Trans Comput Biol Bioinform 2022;19:1144-1153. [PMID: 32960767 DOI: 10.1109/tcbb.2020.3025804] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/11/2023]

Jiang R, Sun T, Song D, Li JJ. Statistics or biology: the zero-inflation controversy about scRNA-seq data. Genome Biol 2022;23:31. [PMID: 35063006 PMCID: PMC8783472 DOI: 10.1186/s13059-022-02601-5] [Citation(s) in RCA: 178] [Impact Index Per Article: 59.3] [Received: 09/27/2021] [Accepted: 01/04/2022] [Indexed: 12/13/2022]

Bartlett TE, Jia P, Chandna S, Roy S. Inference of tissue relative proportions of the breast epithelial cell types luminal progenitor, basal, and luminal mature. Sci Rep 2021;11:23702. [PMID: 34880407 PMCID: PMC8655091 DOI: 10.1038/s41598-021-03161-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/15/2021] [Accepted: 11/26/2021] [Indexed: 12/15/2022]

Wang J, Roeder K, Devlin B. Bayesian estimation of cell type-specific gene expression with prior derived from single-cell data. Genome Res 2021;31:1807-1818. [PMID: 33837133 PMCID: PMC8494232 DOI: 10.1101/gr.268722.120] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Received: 08/05/2020] [Accepted: 03/31/2021] [Indexed: 11/25/2022]

Yang Y, Li G, Xie Y, Wang L, Lagler TM, Yang Y, Liu J, Qian L, Li Y. iSMNN: batch effect correction for single-cell RNA-seq data via iterative supervised mutual nearest neighbor refinement. Brief Bioinform 2021;22:bbab122. [PMID: 33839756 PMCID: PMC8579191 DOI: 10.1093/bib/bbab122] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/17/2020] [Revised: 02/26/2021] [Accepted: 03/15/2021] [Indexed: 01/23/2023]

Patruno L, Maspero D, Craighero F, Angaroni F, Antoniotti M, Graudenzi A. A review of computational strategies for denoising and imputation of single-cell transcriptomic data. Brief Bioinform 2021;22:bbaa222. [PMID: 33003202 DOI: 10.1093/bib/bbaa222] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 05/26/2020] [Revised: 08/07/2020] [Accepted: 08/19/2020] [Indexed: 12/18/2022]

Cui Y, Zhang S, Liang Y, Wang X, Ferraro TN, Chen Y. Consensus clustering of single-cell RNA-seq data by enhancing network affinity. Brief Bioinform 2021;22:6308199. [PMID: 34160582 PMCID: PMC8574980 DOI: 10.1093/bib/bbab236] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 02/04/2021] [Revised: 05/29/2021] [Accepted: 06/01/2021] [Indexed: 12/18/2022]

Sarkar A, Stephens M. Separating measurement and expression models clarifies confusion in single-cell RNA sequencing analysis. Nat Genet 2021;53:770-777. [PMID: 34031584 PMCID: PMC8370014 DOI: 10.1038/s41588-021-00873-4] [Citation(s) in RCA: 109] [Impact Index Per Article: 27.3] [Received: 04/24/2020] [Accepted: 04/22/2021] [Indexed: 01/21/2023]

Kong Y, Kozik A, Nakatsu CH, Jones-Hall YL, Chun H. A zero-inflated non-negative matrix factorization for the deconvolution of mixed signals of biological data. Int J Biostat 2021;18:203-218. [PMID: 33783171 DOI: 10.1515/ijb-2020-0039] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/30/2020] [Accepted: 02/23/2021] [Indexed: 12/18/2022]

Sánchez JA, Gil-Martinez AL, Cisterna A, García-Ruíz S, Gómez-Pascual A, Reynolds RH, Nalls M, Hardy J, Ryten M, Botía JA. Modeling multifunctionality of genes with secondary gene co-expression networks in human brain provides novel disease insights. Bioinformatics 2021;37:2905-2911. [PMID: 33734320 PMCID: PMC8479669 DOI: 10.1093/bioinformatics/btab175] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/08/2020] [Revised: 02/14/2021] [Accepted: 03/16/2021] [Indexed: 02/02/2023]

Sokolowski DJ, Faykoo-Martinez M, Erdman L, Hou H, Chan C, Zhu H, Holmes MM, Goldenberg A, Wilson MD. Single-cell mapper (scMappR): using scRNA-seq to infer the cell-type specificities of differentially expressed genes. NAR Genom Bioinform 2021;3:lqab011. [PMID: 33655208 PMCID: PMC7902236 DOI: 10.1093/nargab/lqab011] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 07/24/2020] [Revised: 12/23/2020] [Accepted: 02/04/2021] [Indexed: 12/11/2022]

Dumitrascu B, Villar S, Mixon DG, Engelhardt BE. Optimal marker gene selection for cell type discrimination in single cell analyses. Nat Commun 2021;12:1186. [PMID: 33608535 PMCID: PMC7895823 DOI: 10.1038/s41467-021-21453-4] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 04/18/2019] [Accepted: 01/27/2021] [Indexed: 11/17/2022]

Dong M, Thennavan A, Urrutia E, Li Y, Perou CM, Zou F, Jiang Y. SCDC: bulk gene expression deconvolution by multiple single-cell RNA sequencing references. Brief Bioinform 2021;22:416-427. [PMID: 31925417 PMCID: PMC7820884 DOI: 10.1093/bib/bbz166] [Citation(s) in RCA: 147] [Impact Index Per Article: 36.8] [Received: 09/03/2019] [Revised: 11/04/2019] [Accepted: 12/02/2019] [Indexed: 12/14/2022]

Zeng P, Wangwu J, Lin Z. Coupled co-clustering-based unsupervised transfer learning for the integrative analysis of single-cell genomic data. Brief Bioinform 2020;22:6024740. [PMID: 33279962 DOI: 10.1093/bib/bbaa347] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 08/16/2020] [Revised: 10/29/2020] [Accepted: 10/30/2020] [Indexed: 12/11/2022]

Camerlenghi F, Dumitrascu B, Ferrari F, Engelhardt BE, Favaro S. Nonparametric Bayesian multiarmed bandits for single-cell experiment design. Ann Appl Stat 2020. [DOI: 10.1214/20-aoas1370] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/19/2022]

Xu J, Cai L, Liao B, Zhu W, Yang J. CMF-Impute: an accurate imputation tool for single-cell RNA-seq data. Bioinformatics 2020;36:3139-3147. [PMID: 32073612 DOI: 10.1093/bioinformatics/btaa109] [Citation(s) in RCA: 73] [Impact Index Per Article: 14.6] [Received: 11/08/2019] [Revised: 01/16/2020] [Indexed: 12/31/2022]

Silverman JD, Roche K, Mukherjee S, David LA. Naught all zeros in sequence count data are the same. Comput Struct Biotechnol J 2020;18:2789-2798. [PMID: 33101615 PMCID: PMC7568192 DOI: 10.1016/j.csbj.2020.09.014] [Citation(s) in RCA: 81] [Impact Index Per Article: 16.2] [Received: 06/26/2020] [Revised: 09/09/2020] [Accepted: 09/10/2020] [Indexed: 12/21/2022]

Sun B, Chen L. Quantile regression for challenging cases of eQTL mapping. Brief Bioinform 2020;21:1756-1765. [PMID: 31688892 PMCID: PMC7673343 DOI: 10.1093/bib/bbz097] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/13/2019] [Revised: 06/24/2019] [Accepted: 07/06/2019] [Indexed: 11/13/2022]

Gan L, Vinci G, Allen GI. Correlation Imputation in Single cell RNA-seq using Auxiliary Information and Ensemble Learning. ACM-BCB: ACM Conference on Bioinformatics, Computational Biology and Biomedicine 2020;2020. [PMID: 34278382 DOI: 10.1145/3388440.3412462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]

Tao Y, Lei H, Lee AV, Ma J, Schwartz R. Neural Network Deconvolution Method for Resolving Pathway-Level Progression of Tumor Clonal Expression Programs With Application to Breast Cancer Brain Metastases. Front Physiol 2020;11:1055. [PMID: 33013452 PMCID: PMC7499245 DOI: 10.3389/fphys.2020.01055] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 01/31/2020] [Accepted: 07/31/2020] [Indexed: 02/03/2023]

Zhang S, Yang L, Yang J, Lin Z, Ng MK. Dimensionality reduction for single cell RNA sequencing data using constrained robust non-negative matrix factorization. NAR Genom Bioinform 2020;2:lqaa064. [PMID: 33575614 PMCID: PMC7671375 DOI: 10.1093/nargab/lqaa064] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 01/20/2020] [Revised: 08/10/2020] [Accepted: 08/19/2020] [Indexed: 12/22/2022]

Li S, Crawford FW, Gerstein MB. Using sigLASSO to optimize cancer mutation signatures jointly with sampling likelihood. Nat Commun 2020;11:3575. [PMID: 32681003 PMCID: PMC7368050 DOI: 10.1038/s41467-020-17388-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 01/07/2020] [Accepted: 06/22/2020] [Indexed: 11/08/2022]

Chowdhury HA, Bhattacharyya DK, Kalita JK. (Differential) Co-Expression Analysis of Gene Expression: A Survey of Best Practices. IEEE/ACM Trans Comput Biol Bioinform 2020;17:1154-1173. [PMID: 30668502 DOI: 10.1109/tcbb.2019.2893170] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 06/09/2023]

Tao Y, Lei H, Fu X, Lee AV, Ma J, Schwartz R. Robust and accurate deconvolution of tumor populations uncovers evolutionary mechanisms of breast cancer metastasis. Bioinformatics 2020;36:i407-i416. [PMID: 32657393 PMCID: PMC7355293 DOI: 10.1093/bioinformatics/btaa396] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 01/04/2023]

Zhang L, Zhang S. Comparison of Computational Methods for Imputing Single-Cell RNA-Sequencing Data. IEEE/ACM Trans Comput Biol Bioinform 2020;17:376-389. [PMID: 29994128 DOI: 10.1109/tcbb.2018.2848633] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Indexed: 06/08/2023]

Lähnemann D, Köster J, Szczurek E, McCarthy DJ, Hicks SC, Robinson MD, Vallejos CA, Campbell KR, Beerenwinkel N, Mahfouz A, Pinello L, Skums P, Stamatakis A, Attolini CSO, Aparicio S, Baaijens J, Balvert M, Barbanson BD, Cappuccio A, Corleone G, Dutilh BE, Florescu M, Guryev V, Holmer R, Jahn K, Lobo TJ, Keizer EM, Khatri I, Kielbasa SM, Korbel JO, Kozlov AM, Kuo TH, Lelieveldt BP, Mandoiu II, Marioni JC, Marschall T, Mölder F, Niknejad A, Rączkowska A, Reinders M, Ridder JD, Saliba AE, Somarakis A, Stegle O, Theis FJ, Yang H, Zelikovsky A, McHardy AC, Raphael BJ, Shah SP, Schönhuth A. Eleven grand challenges in single-cell data science. Genome Biol 2020;21:31. [PMID: 32033589 PMCID: PMC7007675 DOI: 10.1186/s13059-020-1926-6] [Citation(s) in RCA: 690] [Impact Index Per Article: 138.0] [Received: 08/02/2019] [Accepted: 01/02/2020] [Indexed: 02/08/2023]

Affiliation(s)

David Lähnemann: Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany; Department of Paediatric Oncology, Haematology and Immunology, Medical Faculty, Heinrich Heine University, University Hospital, Düsseldorf, Germany; Computational Biology of Infection Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
Johannes Köster: Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany; Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, USA
Ewa Szczurek: Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warszawa, Poland
Davis J. McCarthy: Bioinformatics and Cellular Genomics, St Vincent’s Institute of Medical Research, Fitzroy, Australia; Melbourne Integrative Genomics, School of BioSciences–School of Mathematics & Statistics, Faculty of Science, University of Melbourne, Melbourne, Australia
Stephanie C. Hicks: Department of Biostatistics, Johns Hopkins University, Baltimore, MD USA
Mark D. Robinson: Institute of Molecular Life Sciences and SIB Swiss Institute of Bioinformatics, University of Zürich, Zürich, Switzerland
Catalina A. Vallejos: MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, UK; The Alan Turing Institute, British Library, London, UK
Kieran R. Campbell: Department of Statistics, University of British Columbia, Vancouver, Canada; Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada; Data Science Institute, University of British Columbia, Vancouver, Canada
Niko Beerenwinkel: Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
Ahmed Mahfouz: Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands; Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Delft, The Netherlands
Luca Pinello: Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital Research Institute, Charlestown, USA; Department of Pathology, Harvard Medical School, Boston, USA; Broad Institute of Harvard and MIT, Cambridge, MA USA
Pavel Skums: Department of Computer Science, Georgia State University, Atlanta, USA
Alexandros Stamatakis: Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany; Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
Camille Stephan-Otto Attolini: Institute for Research in Biomedicine, The Barcelona Institute of Science and Technology, Barcelona, Spain
Samuel Aparicio: Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada; Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, Canada
Jasmijn Baaijens: Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
Marleen Balvert: Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands; Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, The Netherlands
Buys de Barbanson: Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands; Oncode Institute, Utrecht, The Netherlands; Quantitative biology, Hubrecht Institute, Utrecht, The Netherlands
Antonio Cappuccio: Institute for Advanced Study, University of Amsterdam, Amsterdam, The Netherlands
Giacomo Corleone: Department of Surgery and Cancer, The Imperial Centre for Translational and Experimental Medicine, Imperial College London, London, UK
Bas E. Dutilh: Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, The Netherlands; Centre for Molecular and Biomolecular Informatics, Radboud University Medical Center, Nijmegen, The Netherlands
Maria Florescu: Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands; Oncode Institute, Utrecht, The Netherlands; Quantitative biology, Hubrecht Institute, Utrecht, The Netherlands
Victor Guryev: European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
Rens Holmer: Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
Katharina Jahn: Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
Thamar Jessurun Lobo: European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
Emma M. Keizer: Biometris, Wageningen University & Research, Wageningen, The Netherlands
Indu Khatri: Department of Immunohematology and Blood Transfusion, Leiden University Medical Center, Leiden, The Netherlands
Szymon M. Kielbasa: Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
Jan O. Korbel: Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
Alexey M. Kozlov: Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
Tzu-Hao Kuo: Computational Biology of Infection Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
Boudewijn P.F. Lelieveldt: PRB lab, Delft University of Technology, Delft, The Netherlands; Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
Ion I. Mandoiu: Computer Science & Engineering Department, University of Connecticut, Storrs, USA
John C. Marioni: Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK; European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
Tobias Marschall: Center for Bioinformatics, Saarland University, Saarbrücken, Germany; Max Planck Institute for Informatics, Saarbrücken, Germany
Felix Mölder: Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany; Institute of Pathology, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
Amir Niknejad: Computation molecular design, Zuse Institute Berlin, Berlin, Germany; Mathematics Department, Mount Saint Vincent, New York, USA
Alicja Rączkowska: Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warszawa, Poland
Marcel Reinders: Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands; Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Delft, The Netherlands
Jeroen de Ridder: Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands; Oncode Institute, Utrecht, The Netherlands
Antoine-Emmanuel Saliba: Helmholtz Institute for RNA-based Infection Research, Helmholtz-Center for Infection Research, Würzburg, Germany
Antonios Somarakis: Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
Oliver Stegle: Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany; European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK; Division of Computational Genomics and Systems Genetics, German Cancer Research Center–DKFZ, Heidelberg, Germany
Fabian J. Theis: Institute of Computational Biology, Helmholtz Zentrum München–German Research Center for Environmental Health, Neuherberg, Germany
Huan Yang: Division of Drug Discovery and Safety, Leiden Academic Center for Drug Research–LACDR–Leiden University, Leiden, The Netherlands
Alex Zelikovsky: Department of Computer Science, Georgia State University, Atlanta, USA; The Laboratory of Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia
Alice C. McHardy: Computational Biology of Infection Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
Benjamin J. Raphael: Department of Computer Science, Princeton University, Princeton, USA
Sohrab P. Shah: Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, USA
Alexander Schönhuth Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, The Netherlands

Lin Z, Zamanighomi M, Daley T, Ma S, Wong WH. Model-Based Approach to the Joint Analysis of Single-Cell Data on Chromatin Accessibility and Gene Expression. Stat Sci 2020. [DOI: 10.1214/19-sts714] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/19/2022]

Elyanow R, Dumitrascu B, Engelhardt BE, Raphael BJ. netNMF-sc: leveraging gene-gene interactions for imputation and dimensionality reduction in single-cell expression analysis. Genome Res 2020;30:195-204. [PMID: 31992614 PMCID: PMC7050525 DOI: 10.1101/gr.251603.119] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 04/18/2019] [Accepted: 11/19/2019] [Indexed: 02/06/2023]

Jambusaria A, Hong Z, Zhang L, Srivastava S, Jana A, Toth PT, Dai Y, Malik AB, Rehman J. Endothelial heterogeneity across distinct vascular beds during homeostasis and inflammation. eLife 2020;9:51413. [PMID: 31944177 PMCID: PMC7002042 DOI: 10.7554/elife.51413] [Citation(s) in RCA: 205] [Impact Index Per Article: 41.0] [Received: 08/27/2019] [Accepted: 01/15/2020] [Indexed: 12/18/2022] Open

Abstract

Blood vessels are lined by endothelial cells engaged in distinct organ-specific functions but little is known about their characteristic gene expression profiles. RNA-Sequencing of the brain, lung, and heart endothelial translatome identified specific pathways, transporters and cell-surface markers expressed in the endothelium of each organ, which can be visualized at http://www.rehmanlab.org/ribo. We found that endothelial cells express genes typically found in the surrounding tissues such as synaptic vesicle genes in the brain endothelium and cardiac contractile genes in the heart endothelium. Complementary analysis of endothelial single cell RNA-Seq data identified the molecular signatures shared across the endothelial translatome and single cell transcriptomes. The tissue-specific heterogeneity of the endothelium is maintained during systemic in vivo inflammatory injury as evidenced by the distinct responses to inflammatory stimulation. Our study defines endothelial heterogeneity and plasticity and provides a molecular framework to understand organ-specific vascular disease mechanisms and therapeutic targeting of individual vascular beds.

Blood vessels supply nutrients, oxygen and other key molecules to all of the organs in the body. Cells lining the blood vessels, called endothelial cells, regulate which molecules pass from the blood to the organs they supply. For example, brain endothelial cells prevent toxic molecules from getting into the brain, and lung endothelial cells allow immune cells into the lungs to fight off bacteria or viruses.

Determining which genes are switched on in the endothelial cells of major organs might allow scientists to determine what endothelial cells do in the brain, heart, and lung, and how they differ; or help scientists deliver drugs to a particular organ. If endothelial cells from different organs switch on different groups of genes, each of these groups of genes can be thought of as a ‘genetic signature’ that identifies endothelial cells from a specific organ.

Now, Jambusaria et al. show that brain, heart, and lung endothelial cells have distinct genetic signatures. The experiments used mice that had been genetically modified to have tags on their endothelial cells. These tags made it possible to isolate RNA – a molecule similar to DNA that contains the information about which genes are active – from endothelial cells without separating the cells from their tissue of origin. Next, RNA from endothelial cells in the heart, brain and lung was sequenced and analyzed.

The results show that each endothelial cell type has a distinct genetic signature under normal conditions and infection-like conditions. Unexpectedly, the experiments also showed that genes that were thought to only be switched on in the cells of specific tissues are also on in the endothelial cells lining the blood vessels of the tissue. For example, genes switched on in brain cells are also active in brain endothelial cells, and genes allowing heart muscle cells to pump are also on in the endothelial cells of the heart blood vessels.

The endothelial cell genetic signatures identified by Jambusaria et al. can be used as “postal codes” to target drugs to a specific organ via the endothelial cells that feed it. It might also be possible to use these genetic signatures to build organ-specific blood vessels from stem cells in the laboratory. Future work will try to answer why endothelial cells serving the heart and brain use genes from these organs.

Droplet scRNA-seq is not zero-inflated. Nat Biotechnol 2020;38:147-150. [DOI: 10.1038/s41587-019-0379-5] [Citation(s) in RCA: 196] [Impact Index Per Article: 39.2] [Indexed: 12/31/2022]

Mardis ER. The Impact of Next-Generation Sequencing on Cancer Genomics: From Discovery to Clinic. Cold Spring Harb Perspect Med 2019;9:cshperspect.a036269. [PMID: 30397020 DOI: 10.1101/cshperspect.a036269] [Citation(s) in RCA: 61] [Impact Index Per Article: 10.2] [Indexed: 11/24/2022]

Mercatelli D, Ray F, Giorgi FM. Pan-Cancer and Single-Cell Modeling of Genomic Alterations Through Gene Expression. Front Genet 2019;10:671. [PMID: 31379928 PMCID: PMC6657420 DOI: 10.3389/fgene.2019.00671] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 03/01/2019] [Accepted: 06/27/2019] [Indexed: 12/27/2022] Open

Ye W, Ji G, Ye P, Long Y, Xiao X, Li S, Su Y, Wu X. scNPF: an integrative framework assisted by network propagation and network fusion for preprocessing of single-cell RNA-seq data. BMC Genomics 2019;20:347. [PMID: 31068142 PMCID: PMC6505295 DOI: 10.1186/s12864-019-5747-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 12/21/2018] [Accepted: 04/29/2019] [Indexed: 12/15/2022] Open

Abstract

Background

Single-cell RNA-sequencing (scRNA-seq) is fast becoming a powerful tool for profiling genome-scale transcriptomes of individual cells and capturing transcriptome-wide cell-to-cell variability. However, scRNA-seq technologies suffer from high levels of technical noise and variability, hindering reliable quantification of lowly and moderately expressed genes. Since most downstream analyses on scRNA-seq, such as cell type clustering and differential expression analysis, rely on the gene-cell expression matrix, preprocessing of scRNA-seq data is a critical preliminary step in the analysis of scRNA-seq data.

Results

We presented scNPF, an integrative scRNA-seq preprocessing framework assisted by network propagation and network fusion, for recovering gene expression loss, correcting gene expression measurements, and learning similarities between cells. scNPF leverages the context-specific topology inherent in the given data and the prior knowledge derived from publicly available molecular gene-gene interaction networks to augment gene-gene relationships in a data-driven manner. We have demonstrated the great potential of scNPF in scRNA-seq preprocessing for accurately recovering gene expression values and learning cell similarity networks. Comprehensive evaluation of scNPF across a wide spectrum of scRNA-seq data sets showed that scNPF achieved comparable or higher performance than the competing approaches according to various metrics of internal validation and clustering accuracy. We have made scNPF an easy-to-use R package, which can be used as a versatile preprocessing plug-in for most existing scRNA-seq analysis pipelines or tools.

Conclusions

scNPF is a universal tool for preprocessing of scRNA-seq data, which jointly incorporates the global topology of prior interaction networks and the context-specific information encapsulated in the scRNA-seq data to capture both shared and complementary knowledge from diverse data sources. scNPF could be used to recover gene signatures and learn cell-to-cell similarities from emerging scRNA-seq data to facilitate downstream analyses such as dimension reduction, cell type clustering, and visualization.

Electronic supplementary material

The online version of this article (10.1186/s12864-019-5747-5) contains supplementary material, which is available to authorized users.

Hicks SC, Townes FW, Teng M, Irizarry RA. Missing data and technical variability in single-cell RNA-sequencing experiments. Biostatistics 2018;19:562-578. [PMID: 29121214 PMCID: PMC6215955 DOI: 10.1093/biostatistics/kxx053] [Citation(s) in RCA: 324] [Impact Index Per Article: 46.3] [Received: 05/03/2017] [Accepted: 09/13/2017] [Indexed: 12/26/2022] Open

Abstract

Until recently, high-throughput gene expression technology, such as RNA-Sequencing (RNA-seq) required hundreds of thousands of cells to produce reliable measurements. Recent technical advances permit genome-wide gene expression measurement at the single-cell level. Single-cell RNA-Seq (scRNA-seq) is the most widely used and numerous publications are based on data produced with this technology. However, RNA-seq and scRNA-seq data are markedly different. In particular, unlike RNA-seq, the majority of reported expression levels in scRNA-seq are zeros, which could be either biologically-driven, genes not expressing RNA at the time of measurement, or technically-driven, genes expressing RNA, but not at a sufficient level to be detected by sequencing technology. Another difference is that the proportion of genes reporting the expression level to be zero varies substantially across single cells compared to RNA-seq samples. However, it remains unclear to what extent this cell-to-cell variation is being driven by technical rather than biological variation. Furthermore, while systematic errors, including batch effects, have been widely reported as a major challenge in high-throughput technologies, these issues have received minimal attention in published studies based on scRNA-seq technology. Here, we use an assessment experiment to examine data from published studies and demonstrate that systematic errors can explain a substantial percentage of observed cell-to-cell expression variability. Specifically, we present evidence that some of these reported zeros are driven by technical variation by demonstrating that scRNA-seq produces more zeros than expected and that this bias is greater for lower expressed genes. In addition, this missing data problem is exacerbated by the fact that this technical variation varies cell-to-cell. Then, we show how this technical cell-to-cell variability can be confused with novel biological results. Finally, we demonstrate and discuss how batch-effects and confounded experiments can intensify the problem.
