Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Abdelaal T, Michielsen L, Cats D, Hoogduin D, Mei H, Reinders MJT, Mahfouz A. A comparison of automatic cell identification methods for single-cell RNA sequencing data. Genome Biol 2019;20:194. [PMID: 31500660 PMCID: PMC6734286 DOI: 10.1186/s13059-019-1795-z] [Citation(s) in RCA: 311] [Impact Index Per Article: 62.2] [Received: 05/18/2019] [Accepted: 08/17/2019] [Indexed: 12/21/2022]


Number

Cited by Other Article(s)

151

Galdos FX, Xu S, Goodyer WR, Duan L, Huang YV, Lee S, Zhu H, Lee C, Wei N, Lee D, Wu SM. devCellPy is a machine learning-enabled pipeline for automated annotation of complex multilayered single-cell transcriptomic data. Nat Commun 2022;13:5271. [PMID: 36071107 PMCID: PMC9452519 DOI: 10.1038/s41467-022-33045-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 11/14/2021] [Accepted: 08/31/2022] [Indexed: 11/09/2022]

152

Zheng H, Wang S, Li X, Hu H. INSISTC: Incorporating network structure information for single-cell type classification. Genomics 2022;114:110480. [PMID: 36075505 DOI: 10.1016/j.ygeno.2022.110480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2022] [Revised: 08/30/2022] [Accepted: 09/04/2022] [Indexed: 11/27/2022]

153

Contrastive learning enables rapid mapping to multimodal single-cell atlas of multimillion scale. NAT MACH INTELL 2022. [DOI: 10.1038/s42256-022-00518-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]

154

Ma WF, Turner AW, Gancayco C, Wong D, Song Y, Mosquera JV, Auguste G, Hodonsky CJ, Prabhakar A, Ekiz HA, van der Laan SW, Miller CL. PlaqView 2.0: A comprehensive web portal for cardiovascular single-cell genomics. Front Cardiovasc Med 2022;9:969421. [PMID: 36003902 PMCID: PMC9393487 DOI: 10.3389/fcvm.2022.969421] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/15/2022] [Accepted: 07/21/2022] [Indexed: 11/13/2022]

Affiliation(s)

Wei Feng Ma Medical Scientist Training Program, University of Virginia, Charlottesville, VA, United States Center for Public Health Genomics, University of Virginia, Charlottesville, VA, United States
Adam W. Turner Center for Public Health Genomics, University of Virginia, Charlottesville, VA, United States
Christina Gancayco Research Computing, University of Virginia, Charlottesville, VA, United States
Doris Wong Center for Public Health Genomics, University of Virginia, Charlottesville, VA, United States Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA, United States
Yipei Song Center for Public Health Genomics, University of Virginia, Charlottesville, VA, United States Department of Computer Engineering, University of Virginia, Charlottesville, VA, United States
Jose Verdezoto Mosquera Center for Public Health Genomics, University of Virginia, Charlottesville, VA, United States Research Computing, University of Virginia, Charlottesville, VA, United States
Gaëlle Auguste Center for Public Health Genomics, University of Virginia, Charlottesville, VA, United States
Chani J. Hodonsky Center for Public Health Genomics, University of Virginia, Charlottesville, VA, United States
Ajay Prabhakar Center for Public Health Genomics, University of Virginia, Charlottesville, VA, United States
H. Atakan Ekiz Department of Molecular Biology and Genetics, Izmir Institute of Technology, Gülbahçe, Turkey
Sander W. van der Laan Central Diagnostics Laboratory, Division Laboratories, Pharmacy, and Biomedical Genetics, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
Clint L. Miller Center for Public Health Genomics, University of Virginia, Charlottesville, VA, United States Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA, United States Department of Public Health Sciences, University of Virginia, Charlottesville, VA, United States


155

scWizard: a web-based automated tool for classifying and annotating single cells and downstream analysis of single-cell RNA-seq data in cancers. Comput Struct Biotechnol J 2022;20:4902-4909. [PMID: 36147672 PMCID: PMC9474308 DOI: 10.1016/j.csbj.2022.08.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2022] [Revised: 07/27/2022] [Accepted: 08/12/2022] [Indexed: 11/22/2022]

Abstract

• scWizard provides a comprehensive analysis pipeline for integration strategies of cancer scRNA-seq data.
• scWizard enables classification of 47 cell subtypes within the TME using a hierarchical deep neural network model.
• scWizard annotates cell subtypes within the TME with higher accuracy than five other methods.
• scWizard is packaged as a point-and-click tool for researchers without proficient programming skills.

The growing number of single-cell RNA-seq (scRNA-Seq) datasets allows the characterization of cell types across various cancer types. However, there is still a lack of effective tools that integrate the various analyses of single cells, especially for fine annotation of cell subtypes within the tumor microenvironment (TME). We developed scWizard, a point-and-click tool that packages an automated process including our cell annotation method based on deep neural network learning and 11 downstream analysis methods. scWizard used 113,976 cells across 13 cancer types as a built-in reference dataset for training the hierarchical model, enabling automated classification and annotation of 7 major cell types and 47 cell subtypes in the TME. scWizard provides a built-in pre-training set that users can choose flexibly, and gives higher accuracy than five existing methods when annotating subtypes of tumor-derived T-lymphocytes/natural killer cells (T/NK) and myeloid cells from different cancer types. scWizard shows good robustness on three independent cancer datasets, with an accuracy of 0.98 in annotating major cell types, 0.85 in annotating myeloid cell subtypes and 0.79 in annotating T/NK cell subtypes, indicating the wide applicability of scWizard to different cell types in cancers. Finally, the automatic analysis and visualization functions of scWizard are demonstrated using an intrahepatic cholangiocarcinoma (ICC) scRNA-Seq dataset as a case study. scWizard focuses on decoding the TME and covers various analysis flows for cancer scRNA-Seq studies, providing an easy-to-use tool and a user-friendly interface to further accelerate biological discovery in cancer research.


156

Hou W, Ji Z. Palo: spatially aware color palette optimization for single-cell and spatial data. Bioinformatics 2022;38:3654-3656. [PMID: 35642896 PMCID: PMC9272793 DOI: 10.1093/bioinformatics/btac368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2022] [Revised: 05/18/2022] [Accepted: 05/26/2022] [Indexed: 11/15/2022]

157

Ellis D, Wu D, Datta S. SAREV: A review on statistical analytics of single-cell RNA sequencing data. WILEY INTERDISCIPLINARY REVIEWS. COMPUTATIONAL STATISTICS 2022;14:e1558. [PMID: 36034329 PMCID: PMC9400796 DOI: 10.1002/wics.1558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2020] [Accepted: 04/09/2021] [Indexed: 06/15/2023]

158

Chen Z, Goldwasser J, Tuckman P, Liu J, Zhang J, Gerstein M. Forest Fire Clustering for single-cell sequencing combines iterative label propagation with parallelized Monte Carlo simulations. Nat Commun 2022;13:3538. [PMID: 35725981 PMCID: PMC9209427 DOI: 10.1038/s41467-022-31107-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/14/2022] [Accepted: 06/06/2022] [Indexed: 11/09/2022]

159

Zandavi SM, Koch FC, Vijayan A, Zanini F, Mora F, Ortega D, Vafaee F. Disentangling single-cell omics representation with a power spectral density-based feature extraction. Nucleic Acids Res 2022;50:5482-5492. [PMID: 35639509 PMCID: PMC9178020 DOI: 10.1093/nar/gkac436] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/21/2021] [Revised: 04/26/2022] [Accepted: 05/10/2022] [Indexed: 12/13/2022]

160

Li J, Chen S, Pan X, Yuan Y, Shen HB. Cell clustering for spatial transcriptomics data with graph neural networks. NATURE COMPUTATIONAL SCIENCE 2022;2:399-408. [PMID: 38177586 DOI: 10.1038/s43588-022-00266-5] [Citation(s) in RCA: 44] [Impact Index Per Article: 22.0] [Received: 10/18/2021] [Accepted: 05/19/2022] [Indexed: 01/06/2024]

161

Single-cell views of the Plasmodium life cycle. Trends Parasitol 2022;38:748-757. [DOI: 10.1016/j.pt.2022.05.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/24/2022] [Revised: 05/16/2022] [Accepted: 05/17/2022] [Indexed: 02/08/2023]

162

Dohmen J, Baranovskii A, Ronen J, Uyar B, Franke V, Akalin A. Identifying tumor cells at the single-cell level using machine learning. Genome Biol 2022;23:123. [PMID: 35637521 PMCID: PMC9150321 DOI: 10.1186/s13059-022-02683-1] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 11/23/2021] [Accepted: 05/06/2022] [Indexed: 12/15/2022]

163

Kumar S, Song M. Overcoming biases in causal inference of molecular interactions. Bioinformatics 2022;38:2818-2825. [PMID: 35561208 DOI: 10.1093/bioinformatics/btac206] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/07/2021] [Revised: 02/03/2022] [Accepted: 04/04/2022] [Indexed: 11/13/2022]

164

Storrs EP, Zhou DC, Wendl MC, Wyczalkowski MA, Karpova A, Wang LB, Li Y, Southard-Smith A, Jayasinghe RG, Yao L, Liu R, Wu Y, Terekhanova NV, Zhu H, Herndon JM, Puram S, Chen F, Gillanders WE, Fields RC, Ding L. Pollock: fishing for cell states. BIOINFORMATICS ADVANCES 2022;2:vbac028. [PMID: 35603231 PMCID: PMC9115775 DOI: 10.1093/bioadv/vbac028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2021] [Revised: 04/06/2022] [Accepted: 05/10/2022] [Indexed: 11/24/2022]

Affiliation(s)

Erik P Storrs Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
Daniel Cui Zhou Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
Michael C Wendl Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
Matthew A Wyczalkowski Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
Alla Karpova Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
Liang-Bo Wang Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
Yize Li Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
Austin Southard-Smith Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
Reyka G Jayasinghe Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
Lijun Yao Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
Ruiyang Liu Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
Yige Wu Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
Nadezhda V Terekhanova Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
Houxiang Zhu Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
John M Herndon Department of Surgery, Washington University in St. Louis, St. Louis, MO 63110, USA; Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63110, USA
Sid Puram Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA
Feng Chen Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA
William E Gillanders Department of Surgery, Washington University in St. Louis, St. Louis, MO 63110, USA; Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63110, USA
Ryan C Fields Department of Surgery, Washington University in St. Louis, St. Louis, MO 63110, USA; Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63110, USA
Li Ding Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA; Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63110, USA. To whom correspondence should be addressed.


165

Domínguez Conde C, Xu C, Jarvis LB, Rainbow DB, Wells SB, Gomes T, Howlett SK, Suchanek O, Polanski K, King HW, Mamanova L, Huang N, Szabo PA, Richardson L, Bolt L, Fasouli ES, Mahbubani KT, Prete M, Tuck L, Richoz N, Tuong ZK, Campos L, Mousa HS, Needham EJ, Pritchard S, Li T, Elmentaite R, Park J, Rahmani E, Chen D, Menon DK, Bayraktar OA, James LK, Meyer KB, Yosef N, Clatworthy MR, Sims PA, Farber DL, Saeb-Parsy K, Jones JL, Teichmann SA. Cross-tissue immune cell analysis reveals tissue-specific features in humans. Science 2022;376:eabl5197. [PMID: 35549406 PMCID: PMC7612735 DOI: 10.1126/science.abl5197] [Citation(s) in RCA: 289] [Impact Index Per Article: 144.5] [Indexed: 02/02/2023]

Affiliation(s)

C Domínguez Conde Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
C Xu Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
LB Jarvis Department of Clinical Neurosciences, University of Cambridge
DB Rainbow Department of Clinical Neurosciences, University of Cambridge
SB Wells Department of Systems Biology, Columbia University Irving Medical Center
T Gomes Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
SK Howlett Department of Clinical Neurosciences, University of Cambridge
O Suchanek Molecular Immunity Unit, Department of Medicine, University of Cambridge, Cambridge, UK
K Polanski Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
HW King Centre for Immunobiology, Blizard Institute, Queen Mary University of London, London, UK
L Mamanova Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
N Huang Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
PA Szabo Department of Microbiology and Immunology, Columbia University Irving Medical Center
L Richardson Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
L Bolt Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
ES Fasouli Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
KT Mahbubani Department of Surgery, University of Cambridge and NIHR Cambridge Biomedical Research Centre, Cambridge, UK
M Prete Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
L Tuck Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
N Richoz Molecular Immunity Unit, Department of Medicine, University of Cambridge, Cambridge, UK
ZK Tuong Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK Molecular Immunity Unit, Department of Medicine, University of Cambridge, Cambridge, UK
L Campos Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK West Suffolk Hospital NHS Trust, Bury Saint Edmunds, UK
HS Mousa Department of Clinical Neurosciences, University of Cambridge
EJ Needham Department of Clinical Neurosciences, University of Cambridge
S Pritchard Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
T Li Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
R Elmentaite Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
J Park Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
E Rahmani Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
D Chen Department of Systems Biology, Columbia University Irving Medical Center
DK Menon Department of Anaesthesia, University of Cambridge, Cambridge, UK
OA Bayraktar Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
LK James Centre for Immunobiology, Blizard Institute, Queen Mary University of London, London, UK
KB Meyer Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
N Yosef Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA Chan Zuckerberg Biohub, San Francisco, CA, USA Ragon Institute of MGH, MIT and Harvard, Cambridge, MA, USA
MR Clatworthy Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK Molecular Immunity Unit, Department of Medicine, University of Cambridge, Cambridge, UK
PA Sims Department of Systems Biology, Columbia University Irving Medical Center
DL Farber Department of Microbiology and Immunology, Columbia University Irving Medical Center
K Saeb-Parsy Department of Surgery, University of Cambridge and NIHR Cambridge Biomedical Research Centre, Cambridge, UK
JL Jones Department of Clinical Neurosciences, University of Cambridge
SA Teichmann Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK Theory of Condensed Matter, Cavendish Laboratory, Department of Physics, University of Cambridge, JJ Thomson Ave, Cambridge CB3 0HE, UK


166

Zhang Y, Zhang F, Wang Z, Wu S, Tian W. scMAGIC: accurately annotating single cells using two rounds of reference-based classification. Nucleic Acids Res 2022;50:e43. [PMID: 34986249 PMCID: PMC9071478 DOI: 10.1093/nar/gkab1275] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 08/20/2021] [Revised: 11/08/2021] [Accepted: 12/14/2021] [Indexed: 11/21/2022]

167

Zeng Y, Wei Z, Zhong F, Pan Z, Lu Y, Yang Y. A parameter-free deep embedded clustering method for single-cell RNA-seq data. Brief Bioinform 2022;23:6582003. [PMID: 35524494 DOI: 10.1093/bib/bbac172] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/21/2021] [Revised: 03/25/2022] [Accepted: 04/18/2022] [Indexed: 11/12/2022]

168

Hosseini N, Mehrabian A, Mostafavi H. Modeling climate change effects on spatial distribution of wild Aegilops L. (Poaceae) toward food security management and biodiversity conservation in Iran. INTEGRATED ENVIRONMENTAL ASSESSMENT AND MANAGEMENT 2022;18:697-708. [PMID: 34617662 DOI: 10.1002/ieam.4531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2021] [Revised: 09/14/2021] [Accepted: 09/28/2021] [Indexed: 06/13/2023]

169

Abondio P, De Intinis C, da Silva Gonçalves Vianez Júnior JL, Pace L. SINGLE CELL MULTIOMIC APPROACHES TO DISENTANGLE T CELL HETEROGENEITY. Immunol Lett 2022;246:37-51. [DOI: 10.1016/j.imlet.2022.04.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2021] [Revised: 04/16/2022] [Accepted: 04/26/2022] [Indexed: 11/29/2022]

170

Bridges K, Miller-Jensen K. Mapping and Validation of scRNA-Seq-Derived Cell-Cell Communication Networks in the Tumor Microenvironment. Front Immunol 2022;13:885267. [PMID: 35572582 PMCID: PMC9096838 DOI: 10.3389/fimmu.2022.885267] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 02/27/2022] [Accepted: 03/25/2022] [Indexed: 01/25/2023]

171

CASSL: A cell-type annotation method for single cell transcriptomics data using semi-supervised learning. APPL INTELL 2022. [DOI: 10.1007/s10489-022-03440-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]

172

Upadhyay P, Ray S. A Regularized Multi-Task Learning Approach for Cell Type Detection in Single-Cell RNA Sequencing Data. Front Genet 2022;13:788832. [PMID: 35495159 PMCID: PMC9043858 DOI: 10.3389/fgene.2022.788832] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/03/2021] [Accepted: 02/16/2022] [Indexed: 11/29/2022]

173

Jiang H, Huang Y, Li Q. Spectral clustering of single cells using Siamese nerual network combined with improved affinity matrix. Brief Bioinform 2022;23:6567703. [PMID: 35419595 DOI: 10.1093/bib/bbac113] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/13/2022] [Revised: 03/02/2022] [Accepted: 03/08/2022] [Indexed: 11/14/2022]

174

Heydari AA, Davalos OA, Zhao L, Hoyer KK, Sindi SS. ACTIVA: realistic single-cell RNA-seq generation with automatic cell-type identification using introspective variational autoencoders. Bioinformatics 2022;38:2194-2201. [PMID: 35179571 PMCID: PMC9004654 DOI: 10.1093/bioinformatics/btac095] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/24/2021] [Revised: 01/19/2022] [Accepted: 02/15/2022] [Indexed: 02/04/2023]

Abstract

MOTIVATION

Single-cell RNA sequencing (scRNAseq) technologies allow for measurements of gene expression at a single-cell resolution. This provides researchers with a tremendous advantage for detecting heterogeneity, delineating cellular maps or identifying rare subpopulations. However, a critical complication remains: the low number of single-cell observations due to limitations by rarity of subpopulation, tissue degradation or cost. This absence of sufficient data may cause inaccuracy or irreproducibility of downstream analysis. In this work, we present Automated Cell-Type-informed Introspective Variational Autoencoder (ACTIVA): a novel framework for generating realistic synthetic data using a single-stream adversarial variational autoencoder conditioned with cell-type information. Within a single framework, ACTIVA can enlarge existing datasets and generate specific subpopulations on demand, as opposed to two separate models [such as single-cell GAN (scGAN) and conditional scGAN (cscGAN)]. Data generation and augmentation with ACTIVA can enhance scRNAseq pipelines and analysis, such as benchmarking new algorithms, studying the accuracy of classifiers and detecting marker genes. ACTIVA will facilitate analysis of smaller datasets, potentially reducing the number of patients and animals necessary in initial studies.

RESULTS

We train and evaluate models on multiple public scRNAseq datasets. In comparison to GAN-based models (scGAN and cscGAN), we demonstrate that ACTIVA generates cells that are more realistic, harder for classifiers to identify as synthetic, and better in pair-wise correlation between genes. Data augmentation with ACTIVA significantly improves classification of rare subtypes (more than 45% improvement compared with not augmenting and 4% better than cscGAN), all while reducing run-time by an order of magnitude in comparison to both models.

AVAILABILITY AND IMPLEMENTATION

The codes and datasets are hosted on Zenodo (https://doi.org/10.5281/zenodo.5879639). Tutorials are available at https://github.com/SindiLab/ACTIVA.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.


175

Yin Q, Liu Q, Fu Z, Zeng W, Zhang B, Zhang X, Jiang R, Lv H. scGraph: a graph neural network-based approach to automatically identify cell types. Bioinformatics 2022;38:2996-3003. [PMID: 35394015 DOI: 10.1093/bioinformatics/btac199] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/22/2021] [Revised: 12/13/2021] [Accepted: 04/07/2020] [Indexed: 11/13/2022]

Affiliation(s)

Qijin Yin Ministry of Education Key Laboratory of Bioinformatics, Research Department of Bioinformatics at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China
Qiao Liu Department of Statistics, Stanford University, Stanford, CA 94305
Zhuoran Fu Ministry of Education Key Laboratory of Bioinformatics, Research Department of Bioinformatics at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China
Wanwen Zeng Department of Statistics, Stanford University, Stanford, CA 94305; College of Software, Nankai University, Tianjin, 300350, China
Boheng Zhang Ministry of Education Key Laboratory of Bioinformatics, Research Department of Bioinformatics at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China
Xuegong Zhang Ministry of Education Key Laboratory of Bioinformatics, Research Department of Bioinformatics at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China
Rui Jiang Ministry of Education Key Laboratory of Bioinformatics, Research Department of Bioinformatics at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China
Hairong Lv Ministry of Education Key Laboratory of Bioinformatics, Research Department of Bioinformatics at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China; Fuzhou Institute of Data Technology, Changle, Fuzhou, 350200, China


176

Kong W, Fu YC, Holloway EM, Garipler G, Yang X, Mazzoni EO, Morris SA. Capybara: A computational tool to measure cell identity and fate transitions. Cell Stem Cell 2022;29:635-649.e11. [PMID: 35354062 PMCID: PMC9040453 DOI: 10.1016/j.stem.2022.03.001] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 12/20/2021] [Revised: 02/18/2022] [Accepted: 03/03/2022] [Indexed: 01/14/2023]

Affiliation(s)

Wenjun Kong Department of Developmental Biology, Washington University School of Medicine in St. Louis, 660 S. Euclid Avenue, Campus Box 8103, St. Louis, MO 63110, USA; Department of Genetics, Washington University School of Medicine in St. Louis, 660 S. Euclid Avenue, Campus Box 8103, St. Louis, MO 63110, USA; Center of Regenerative Medicine, Washington University School of Medicine in St. Louis, 660 S. Euclid Avenue, Campus Box 8103, St. Louis, MO 63110, USA
Yuheng C Fu Department of Developmental Biology, Washington University School of Medicine in St. Louis, 660 S. Euclid Avenue, Campus Box 8103, St. Louis, MO 63110, USA; Department of Genetics, Washington University School of Medicine in St. Louis, 660 S. Euclid Avenue, Campus Box 8103, St. Louis, MO 63110, USA; Center of Regenerative Medicine, Washington University School of Medicine in St. Louis, 660 S. Euclid Avenue, Campus Box 8103, St. Louis, MO 63110, USA
Emily M Holloway Department of Developmental Biology, Washington University School of Medicine in St. Louis, 660 S. Euclid Avenue, Campus Box 8103, St. Louis, MO 63110, USA; Department of Genetics, Washington University School of Medicine in St. Louis, 660 S. Euclid Avenue, Campus Box 8103, St. Louis, MO 63110, USA; Center of Regenerative Medicine, Washington University School of Medicine in St. Louis, 660 S. Euclid Avenue, Campus Box 8103, St. Louis, MO 63110, USA
Görkem Garipler Department of Biology, New York University, New York, NY 10003, USA
Xue Yang Department of Developmental Biology, Washington University School of Medicine in St. Louis, 660 S. Euclid Avenue, Campus Box 8103, St. Louis, MO 63110, USA; Department of Genetics, Washington University School of Medicine in St. Louis, 660 S. Euclid Avenue, Campus Box 8103, St. Louis, MO 63110, USA; Center of Regenerative Medicine, Washington University School of Medicine in St. Louis, 660 S. Euclid Avenue, Campus Box 8103, St. Louis, MO 63110, USA
Esteban O Mazzoni Department of Biology, New York University, New York, NY 10003, USA
Samantha A Morris Department of Developmental Biology, Washington University School of Medicine in St. Louis, 660 S. Euclid Avenue, Campus Box 8103, St. Louis, MO 63110, USA; Department of Genetics, Washington University School of Medicine in St. Louis, 660 S. Euclid Avenue, Campus Box 8103, St. Louis, MO 63110, USA; Center of Regenerative Medicine, Washington University School of Medicine in St. Louis, 660 S. Euclid Avenue, Campus Box 8103, St. Louis, MO 63110, USA.

177

Xu W, He H, Guo Z, Li W. Evaluation of machine learning models on protein level inference from prioritized RNA features. Brief Bioinform 2022;23:6555405. [PMID: 35352096 DOI: 10.1093/bib/bbac091] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2021] [Revised: 02/16/2022] [Accepted: 02/23/2022] [Indexed: 11/12/2022] Open

178

Zeng Z, Li Y, Li Y, Luo Y. Statistical and machine learning methods for spatially resolved transcriptomics data analysis. Genome Biol 2022;23:83. [PMID: 35337374 PMCID: PMC8951701 DOI: 10.1186/s13059-022-02653-7] [Citation(s) in RCA: 55] [Impact Index Per Article: 27.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2021] [Accepted: 03/15/2022] [Indexed: 01/28/2023] Open

179

Cao X, Xing L, Majd E, He H, Gu J, Zhang X. A Systematic Evaluation of Supervised Machine Learning Algorithms for Cell Phenotype Classification Using Single-Cell RNA Sequencing Data. Front Genet 2022;13:836798. [PMID: 35281805 PMCID: PMC8905542 DOI: 10.3389/fgene.2022.836798] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2021] [Accepted: 01/18/2022] [Indexed: 11/13/2022] Open

Abstract

The new technology of single-cell RNA sequencing (scRNA-seq) can yield valuable insights into gene expression and give critical information about the cellular compositions of complex tissues. In recent years, vast numbers of scRNA-seq datasets have been generated and made publicly available, and this has enabled researchers to train supervised machine learning models for predicting or classifying various cell-level phenotypes. This has led to the development of many new methods for analyzing scRNA-seq data. Despite the popularity of such applications, there has as yet been no systematic investigation of the performance of these supervised algorithms using predictors from various sizes of scRNA-seq datasets. In this study, 13 popular supervised machine learning algorithms for cell phenotype classification were evaluated using published real and simulated datasets with diverse cell sizes. This benchmark comprises two parts. In the first, real datasets were used to assess the computing speed and cell phenotype classification performance of popular supervised algorithms. The classification performances were evaluated using the area under the receiver operating characteristic curve, F1-score, Precision, Recall, and false-positive rate. In the second part, we evaluated gene-selection performance using published simulated datasets with a known list of real genes. The results showed that ElasticNet with interactions performed the best for small and medium-sized datasets. The NaiveBayes classifier was found to be another appropriate method for medium-sized datasets. With large datasets, the performance of the XGBoost algorithm was found to be excellent. Ensemble algorithms were not found to be significantly superior to individual machine learning methods. Including interactions in the ElasticNet algorithm caused a significant performance improvement for small datasets. 
The linear discriminant analysis algorithm was found to be the best choice when speed is critical; it is the fastest method, it can scale to handle large sample sizes, and its performance is not much worse than the top performers.
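
The evaluation metrics named in this abstract (area under the ROC curve, F1-score, precision, recall and false-positive rate) can be illustrated for the binary case with a minimal sketch. The function names `binary_metrics` and `roc_auc` are hypothetical and are not taken from the cited study, which may have computed these quantities differently (for example per class or averaged over classes).

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Precision, recall, F1 and false-positive rate for 0/1 labels (illustrative helper)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    # Guard against empty denominators by returning 0.0 in that case.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "fpr": fpr}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as one half. This pairwise form is O(n_pos * n_neg) and is meant
    for illustration only.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    y_true = [1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 1, 1, 0]
    scores = [0.9, 0.4, 0.2, 0.6, 0.8, 0.1]
    print(binary_metrics(y_true, y_pred))
    print(roc_auc(y_true, scores))
```

For multi-class cell phenotype labels, one common choice is to apply the same definitions one class at a time (one-vs-rest) and average the results; whether the cited benchmark did so is not stated in the abstract.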

180

Sun X, Lin X, Li Z, Wu H. A comprehensive comparison of supervised and unsupervised methods for cell type identification in single-cell RNA-seq. Brief Bioinform 2022;23:6502554. [PMID: 35021202 PMCID: PMC8921620 DOI: 10.1093/bib/bbab567] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2021] [Revised: 11/19/2021] [Accepted: 12/11/2021] [Indexed: 01/26/2023] Open

Abstract

The cell type identification is among the most important tasks in single-cell RNA-sequencing (scRNA-seq) analysis. Many in silico methods have been developed and can be roughly categorized as either supervised or unsupervised. In this study, we investigated the performances of 8 supervised and 10 unsupervised cell type identification methods using 14 public scRNA-seq datasets of different tissues, sequencing protocols and species. We investigated the impacts of a number of factors, including total amount of cells, number of cell types, sequencing depth, batch effects, reference bias, cell population imbalance, unknown/novel cell type, and computational efficiency and scalability. Instead of merely comparing individual methods, we focused on factors' impacts on the general category of supervised and unsupervised methods. We found that in most scenarios, the supervised methods outperformed the unsupervised methods, except for the identification of unknown cell types. This is particularly true when the supervised methods use a reference dataset with high informational sufficiency, low complexity and high similarity to the query dataset. However, such outperformance could be undermined by some undesired dataset properties investigated in this study, which lead to uninformative and biased reference datasets. In these scenarios, unsupervised methods could be comparable to supervised methods. Our study not only explained the cell typing methods' behaviors under different experimental settings but also provided a general guideline for the choice of method according to the scientific goal and dataset properties. Finally, our evaluation workflow is implemented as a modularized R pipeline that allows future evaluation of new methods. Availability: All the source codes are available at https://github.com/xsun28/scRNAIdent.

181

Ianevski A, Giri AK, Aittokallio T. Fully-automated and ultra-fast cell-type identification using specific marker combinations from single-cell transcriptomic data. Nat Commun 2022;13:1246. [PMID: 35273156 PMCID: PMC8913782 DOI: 10.1038/s41467-022-28803-w] [Citation(s) in RCA: 185] [Impact Index Per Article: 92.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2021] [Accepted: 02/03/2022] [Indexed: 12/29/2022] Open

182

Andreatta M, Berenstein AJ, Carmona SJ. scGate: marker-based purification of cell types from heterogeneous single-cell RNA-seq datasets. Bioinformatics 2022;38:2642-2644. [PMID: 35258562 PMCID: PMC9048671 DOI: 10.1093/bioinformatics/btac141] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2021] [Revised: 02/21/2022] [Accepted: 03/04/2022] [Indexed: 01/22/2023] Open

183

Li D, Velazquez JJ, Ding J, Hislop J, Ebrahimkhani MR, Bar-Joseph Z. TraSig: inferring cell-cell interactions from pseudotime ordering of scRNA-Seq data. Genome Biol 2022;23:73. [PMID: 35255944 PMCID: PMC8900372 DOI: 10.1186/s13059-022-02629-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2021] [Accepted: 02/09/2022] [Indexed: 02/08/2023] Open

184

Goyal M, Serrano G, Argemi J, Shomorony I, Hernaez M, Ochoa I. JIND: Joint Integration and Discrimination for Automated Single-Cell Annotation. Bioinformatics 2022;38:2488-2495. [PMID: 35253844 PMCID: PMC9278043 DOI: 10.1093/bioinformatics/btac140] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/01/2021] [Revised: 02/24/2022] [Accepted: 03/03/2022] [Indexed: 11/12/2022] Open

Abstract

MOTIVATION

An important step in the transcriptomic analysis of individual cells involves manually determining the cellular identities. To ease this labor-intensive annotation of cell-types, there has been a growing interest in automated cell annotation, which can be achieved by training classification algorithms on previously annotated datasets. Existing pipelines employ dataset integration methods in order to remove potential batch effects between source (annotated) and target (unannotated) datasets. However, the integration and classification steps are usually independent of each other and performed by different tools. We propose JIND, a neural-network-based framework for automated cell-type identification that performs integration in a space suitably chosen to facilitate cell classification. To account for batch effects, JIND performs a novel asymmetric alignment in which unseen cells are mapped onto the previously learned latent space, avoiding the need of retraining the classification model for new datasets. JIND also learns cell-type-specific confidence thresholds to identify cells that cannot be reliably classified.

RESULTS

We show on several batched datasets that the joint approach to integration and classification of JIND outperforms in accuracy existing pipelines, and a smaller fraction of cells is rejected as unlabeled as a result of the cell-specific confidence thresholds. Moreover, we investigate cells misclassified by JIND and provide evidence suggesting that they could be due to outliers in the annotated datasets or errors in the original approach used for annotation of the target batch.

AVAILABILITY

Implementation for JIND is available at https://github.com/mohit1997/JIND and at https://doi.org/10.5281/zenodo.6246322.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
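
The rejection step mentioned in this abstract, in which a cell is left unlabeled when its prediction does not clear a cell-type-specific confidence threshold, can be sketched as follows. This is only an illustration of applying per-class thresholds: the function name `assign_with_rejection`, the label `"Unassigned"` and the threshold values are assumptions, and the sketch does not reproduce how JIND learns its thresholds or its network.

```python
from typing import Dict


def assign_with_rejection(
    probs: Dict[str, float],
    thresholds: Dict[str, float],
    unassigned: str = "Unassigned",
) -> str:
    """Return the most probable cell type, or `unassigned` if it misses its own threshold."""
    # Pick the class with the highest predicted probability.
    best = max(probs, key=probs.get)
    # Each class has its own cut-off; a prediction below it is rejected.
    return best if probs[best] >= thresholds[best] else unassigned


if __name__ == "__main__":
    thresholds = {"T cell": 0.7, "B cell": 0.5}  # hypothetical values
    print(assign_with_rejection({"T cell": 0.55, "B cell": 0.45}, thresholds))
    print(assign_with_rejection({"T cell": 0.20, "B cell": 0.80}, thresholds))
```

Using one threshold per class rather than a single global cut-off lets classes that are harder to separate be rejected more conservatively than easy ones.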

185

Zhang R, Luo Y, Ma J, Zhang M, Wang S. scPretrain: multi-task self-supervised learning for cell-type classification. Bioinformatics 2022;38:1607-1614. [PMID: 34999749 DOI: 10.1093/bioinformatics/btac007] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2021] [Revised: 12/25/2021] [Accepted: 01/04/2022] [Indexed: 02/03/2023] Open

Abstract

MOTIVATION

Rapidly generated scRNA-seq datasets enable us to understand cellular differences and the function of each individual cell at single-cell resolution. Cell-type classification, which aims at characterizing and labeling groups of cells according to their gene expression, is one of the most important steps for single-cell analysis. To facilitate the manual curation process, supervised learning methods have been used to automatically classify cells. Most of the existing supervised learning approaches only utilize annotated cells in the training step while ignoring the more abundant unannotated cells. In this article, we proposed scPretrain, a multi-task self-supervised learning approach that jointly considers annotated and unannotated cells for cell-type classification. scPretrain consists of a pre-training step and a fine-tuning step. In the pre-training step, scPretrain uses a multi-task learning framework to train a feature extraction encoder based on each dataset's pseudo-labels, where only unannotated cells are used. In the fine-tuning step, scPretrain fine-tunes this feature extraction encoder using the limited annotated cells in a new dataset.

RESULTS

We evaluated scPretrain on 60 diverse datasets from different technologies, species and organs, and obtained a significant improvement on both cell-type classification and cell clustering. Moreover, the representations obtained by scPretrain in the pre-training step also enhanced the performance of conventional classifiers, such as random forest, logistic regression and support-vector machines. scPretrain is able to effectively utilize the massive amount of unlabeled data and be applied to annotating increasingly generated scRNA-seq datasets.

AVAILABILITY AND IMPLEMENTATION

The data and code underlying this article are available in scPretrain: Multi-task self-supervised learning for cell type classification, at https://github.com/ruiyi-zhang/scPretrain and https://zenodo.org/record/5802306.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

186

Xu Y, Baumgart SJ, Stegmann CM, Hayat S. MACA: marker-based automatic cell-type annotation for single-cell expression data. Bioinformatics 2022;38:1756-1760. [PMID: 34935911 DOI: 10.1093/bioinformatics/btab840] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2021] [Revised: 10/07/2021] [Accepted: 12/17/2021] [Indexed: 02/03/2023] Open

187

Lin L, Shi W, Ye J, Li J. Multi-source single-cell data integration by MAW barycenter for Gaussian mixture models. Biometrics 2022. [DOI: 10.1111/biom.13630] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2021] [Accepted: 01/29/2022] [Indexed: 11/26/2022]

188

Li H, Qu L, Yang Y, Zhang H, Li X, Zhang X. Single-cell Transcriptomic Architecture Unraveling the Complexity of Tumor Heterogeneity in Distal Cholangiocarcinoma. Cell Mol Gastroenterol Hepatol 2022;13:1592-1609.e9. [PMID: 35219893 PMCID: PMC9043309 DOI: 10.1016/j.jcmgh.2022.02.014] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/16/2021] [Revised: 02/17/2022] [Accepted: 02/17/2022] [Indexed: 01/03/2023]

189

Wilson SB, Howden SE, Vanslambrouck JM, Dorison A, Alquicira-Hernandez J, Powell JE, Little MH. DevKidCC allows for robust classification and direct comparisons of kidney organoid datasets. Genome Med 2022. [PMID: 35189942 DOI: 10.1101/2021.01.20.427346] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/09/2023] Open

Abstract

BACKGROUND

While single-cell transcriptional profiling has greatly increased our capacity to interrogate biology, accurate cell classification within and between datasets is a key challenge. This is particularly so in pluripotent stem cell-derived organoids which represent a model of a developmental system. Here, clustering algorithms and selected marker genes can fail to accurately classify cellular identity while variation in analyses makes it difficult to meaningfully compare datasets. Kidney organoids provide a valuable resource to understand kidney development and disease. However, direct comparison of relative cellular composition between protocols has proved challenging. Hence, an unbiased approach for classifying cell identity is required.

METHODS

The R package, scPred, was trained on multiple single cell RNA-seq datasets of human fetal kidney. A hierarchical model classified cellular subtypes into nephron, stroma and ureteric epithelial elements. This model, provided in the R package DevKidCC (github.com/KidneyRegeneration/DevKidCC), was then used to predict relative cell identity within published kidney organoid datasets generated using distinct cell lines and differentiation protocols, interrogating the impact of such variations. The package contains custom functions for the display of differential gene expression within cellular subtypes.

RESULTS

DevKidCC was used to directly compare between distinct kidney organoid protocols, identifying differences in relative proportions of cell types at all hierarchical levels of the model and highlighting variations in stromal and unassigned cell types, nephron progenitor prevalence and relative maturation of individual epithelial segments. Of note, DevKidCC was able to distinguish distal nephron from ureteric epithelium, cell types with overlapping profiles that have previously confounded analyses. When applied to a variation in protocol via the addition of retinoic acid, DevKidCC identified a consequential depletion of nephron progenitors.

CONCLUSIONS

The application of DevKidCC to kidney organoids reproducibly classifies component cellular identity within distinct single-cell datasets. The application of the tool is summarised in an interactive Shiny application, as are examples of the utility of in-built functions for data presentation. This tool will enable the consistent and rapid comparison of kidney organoid protocols, driving improvements in patterning to kidney endpoints and validating new approaches.

190

Wilson SB, Howden SE, Vanslambrouck JM, Dorison A, Alquicira-Hernandez J, Powell JE, Little MH. DevKidCC allows for robust classification and direct comparisons of kidney organoid datasets. Genome Med 2022;14:19. [PMID: 35189942 PMCID: PMC8862535 DOI: 10.1186/s13073-022-01023-z] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2021] [Accepted: 02/08/2022] [Indexed: 12/20/2022] Open

Abstract

Background

Methods

The R package, scPred, was trained on multiple single cell RNA-seq datasets of human fetal kidney. A hierarchical model classified cellular subtypes into nephron, stroma and ureteric epithelial elements. This model, provided in the R package DevKidCC (github.com/KidneyRegeneration/DevKidCC), was then used to predict relative cell identity within published kidney organoid datasets generated using distinct cell lines and differentiation protocols, interrogating the impact of such variations. The package contains custom functions for the display of differential gene expression within cellular subtypes.

Results

Conclusions

Supplementary Information

The online version contains supplementary material available at 10.1186/s13073-022-01023-z.

191

Chen X, Chen S, Song S, Gao Z, Hou L, Zhang X, Lv H, Jiang R. Cell type annotation of single-cell chromatin accessibility data via supervised Bayesian embedding. Nat Mach Intell 2022. [DOI: 10.1038/s42256-021-00432-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]

192

Tang H, Yu X, Liu R, Zeng T. Vec2image: an explainable artificial intelligence model for the feature representation and classification of high-dimensional biological data by vector-to-image conversion. Brief Bioinform 2022;23:6518046. [PMID: 35106553 PMCID: PMC8921615 DOI: 10.1093/bib/bbab584] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2021] [Revised: 12/06/2021] [Accepted: 12/20/2021] [Indexed: 01/05/2023] Open

Abstract

Feature representation and discriminative learning are proven models and technologies in artificial intelligence fields; however, major challenges for machine learning on large biological datasets are learning an effective model with mechanistical explanation on the model determination and prediction. To satisfy such demands, we developed Vec2image, an explainable convolutional neural network framework for characterizing the feature engineering, feature selection and classifier training that is mainly based on the collaboration of principal component coordinate conversion, deep residual neural networks and embedded k-nearest neighbor representation on pseudo images of high-dimensional biological data, where the pseudo images represent feature measurements and feature associations simultaneously. Vec2image has achieved better performance compared with other popular methods and illustrated its efficiency on feature selection in cell marker identification from tissue-specific single-cell datasets. In particular, in a case study on type 2 diabetes (T2D) by multiple human islet scRNA-seq datasets, Vec2image first displayed robust performance on T2D classification model building across different datasets, then a specific Vec2image model was trained to accurately recognize the cell state and efficiently rank feature genes relevant to T2D which uncovered potential T2D cellular pathogenesis; and next the cell activity changes, cell composition imbalances and cell–cell communication dysfunctions were associated to our finding T2D feature genes from both population-shared and individual-specific perspectives. Collectively, Vec2image is a new and efficient explainable artificial intelligence methodology that can be widely applied in human-readable classification and prediction on the basis of pseudo image representation of biological deep sequencing data.

193

Amblard E, Bac J, Chervov A, Soumelis V, Zinovyev A. Hubness reduction improves clustering and trajectory inference in single-cell transcriptomic data. Bioinformatics 2022;38:1045-1051. [PMID: 34871374 DOI: 10.1093/bioinformatics/btab795] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2021] [Revised: 11/05/2021] [Accepted: 11/17/2021] [Indexed: 02/03/2023] Open

Abstract

MOTIVATION

Single-cell RNA-seq (scRNAseq) datasets are characterized by large ambient dimensionality, and their analyses can be affected by various manifestations of the dimensionality curse. One of these manifestations is the hubness phenomenon, i.e. existence of data points with surprisingly large incoming connectivity degree in the datapoint neighbourhood graph. Conventional approach to dampen the unwanted effects of high dimension consists in applying drastic dimensionality reduction. It remains unexplored if this step can be avoided thus retaining more information than contained in the low-dimensional projections, by correcting directly hubness.

RESULTS

We investigated hubness in scRNAseq data. We show that hub cells do not represent any visible technical or biological bias. The effect of various hubness reduction methods is investigated with respect to the clustering, trajectory inference and visualization tasks in scRNAseq datasets. We show that hubness reduction generates neighbourhood graphs with properties more suitable for applying machine learning methods; and that it outperforms other state-of-the-art methods for improving neighbourhood graphs. As a consequence, clustering, trajectory inference and visualization perform better, especially for datasets characterized by large intrinsic dimensionality. Hubness is an important phenomenon characterizing data point neighbourhood graphs computed for various types of sequencing datasets. Reducing hubness can be beneficial for the analysis of scRNAseq data with large intrinsic dimensionality in which case it can be an alternative to drastic dimensionality reduction.

AVAILABILITY AND IMPLEMENTATION

The code used to analyze the datasets and produce the figures of this article is available from https://github.com/sysbio-curie/schubness.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
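
The hubness phenomenon described in this abstract, where some data points have an unusually large incoming degree in the k-nearest-neighbour graph, can be made concrete with a small sketch that counts how often each point appears in the neighbour lists of the others (often called the k-occurrence). The function name `k_occurrence`, the Euclidean metric and the toy coordinates are assumptions for illustration; this is not the code from the cited repository, and it only measures hubness rather than reducing it.

```python
import math
from typing import List, Sequence


def k_occurrence(points: Sequence[Sequence[float]], k: int) -> List[int]:
    """Count, for each point, how many other points list it among their k nearest neighbours."""
    n = len(points)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of points")
    counts = [0] * n
    for i in range(n):
        # Rank all other points by Euclidean distance to point i (brute force, O(n^2 log n)).
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        # Every point in i's neighbour list receives one incoming edge.
        for j in others[:k]:
            counts[j] += 1
    return counts


if __name__ == "__main__":
    # Toy layout: one central point surrounded by four outer points.
    pts = [(0.0, 0.0), (1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
    print(k_occurrence(pts, k=1))
```

In the toy layout, every outer point has the centre as its nearest neighbour, so the centre collects most of the incoming edges and behaves like a hub. A strongly right-skewed distribution of these counts is the usual sign of hubness in high-dimensional data.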

194

Teng H, Yuan Y, Bar-Joseph Z. Clustering spatial transcriptomics data. Bioinformatics 2022;38:997-1004. [PMID: 34623423 PMCID: PMC8796363 DOI: 10.1093/bioinformatics/btab704] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2021] [Revised: 08/28/2021] [Accepted: 10/06/2021] [Indexed: 02/04/2023] Open

195

Liu B, Li Y, Zhang L. Analysis and Visualization of Spatial Transcriptomic Data. Front Genet 2022;12:785290. [PMID: 35154244 PMCID: PMC8829434 DOI: 10.3389/fgene.2021.785290] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2021] [Accepted: 12/24/2021] [Indexed: 12/21/2022] Open

196

Li J, Sheng Q, Shyr Y, Liu Q. scMRMA: single cell multiresolution marker-based annotation. Nucleic Acids Res 2022;50:e7. [PMID: 34648021 PMCID: PMC8789072 DOI: 10.1093/nar/gkab931] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2021] [Revised: 09/10/2021] [Accepted: 09/28/2021] [Indexed: 01/22/2023] Open

197

Lin Y, Wu TY, Wan S, Yang JYH, Wong WH, Wang YXR. scJoint integrates atlas-scale single-cell RNA-seq and ATAC-seq data with transfer learning. Nat Biotechnol 2022;40:703-710. [DOI: 10.1038/s41587-021-01161-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/05/2021] [Accepted: 11/16/2021] [Indexed: 12/11/2022]

198

Flores M, Liu Z, Zhang T, Hasib MM, Chiu YC, Ye Z, Paniagua K, Jo S, Zhang J, Gao SJ, Jin YF, Chen Y, Huang Y. Deep learning tackles single-cell analysis-a survey of deep learning for scRNA-seq analysis. Brief Bioinform 2022;23:bbab531. [PMID: 34929734 PMCID: PMC8769926 DOI: 10.1093/bib/bbab531] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2021] [Revised: 11/15/2021] [Accepted: 11/16/2021] [Indexed: 12/17/2022] Open

Affiliation(s)

Mario Flores Department of Electrical and Computer Engineering, the University of Texas at San Antonio, San Antonio, TX 78249, USA
Zhentao Liu Department of Electrical and Computer Engineering, the University of Texas at San Antonio, San Antonio, TX 78249, USA
Tinghe Zhang Department of Electrical and Computer Engineering, the University of Texas at San Antonio, San Antonio, TX 78249, USA
Md Musaddaqui Hasib Department of Electrical and Computer Engineering, the University of Texas at San Antonio, San Antonio, TX 78249, USA
Yu-Chiao Chiu Greehey Children’s Cancer Research Institute, University of Texas Health San Antonio, San Antonio, TX 78229, USA
Zhenqing Ye Greehey Children’s Cancer Research Institute, University of Texas Health San Antonio, San Antonio, TX 78229, USA; Department of Population Health Sciences, University of Texas Health San Antonio, San Antonio, TX 78229, USA
Karla Paniagua Department of Electrical and Computer Engineering, the University of Texas at San Antonio, San Antonio, TX 78249, USA
Sumin Jo Department of Electrical and Computer Engineering, the University of Texas at San Antonio, San Antonio, TX 78249, USA
Jianqiu Zhang Department of Electrical and Computer Engineering, the University of Texas at San Antonio, San Antonio, TX 78249, USA
Shou-Jiang Gao Department of Microbiology and Molecular Genetics, University of Pittsburgh, Pittsburgh, Pennsylvania, PA 15232, USA; UPMC Hillman Cancer Center, University of Pittsburgh, PA 15232, USA
Yu-Fang Jin Department of Electrical and Computer Engineering, the University of Texas at San Antonio, San Antonio, TX 78249, USA
Yidong Chen Greehey Children’s Cancer Research Institute, University of Texas Health San Antonio, San Antonio, TX 78229, USA; Department of Population Health Sciences, University of Texas Health San Antonio, San Antonio, TX 78229, USA
Yufei Huang Department of Medicine, School of Medicine, University of Pittsburgh, PA 15232, USA; UPMC Hillman Cancer Center, University of Pittsburgh, PA 15232, USA

199

Watson ER, Taherian Fard A, Mar JC. Computational Methods for Single-Cell Imaging and Omics Data Integration. Front Mol Biosci 2022;8:768106. [PMID: 35111809 PMCID: PMC8801747 DOI: 10.3389/fmolb.2021.768106] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2021] [Accepted: 11/29/2021] [Indexed: 12/12/2022] Open

200

Nguyen V, Griss J. scAnnotatR: framework to accurately classify cell types in single-cell RNA-sequencing data. BMC Bioinformatics 2022;23:44. [PMID: 35038984 PMCID: PMC8762856 DOI: 10.1186/s12859-022-04574-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2021] [Accepted: 01/11/2022] [Indexed: 12/02/2022] Open