Yan J, Ma M, Yu Z. bmVAE: a variational autoencoder method for clustering single-cell mutation data. Bioinformatics 2022;39:6881080. [PMID: 36478203 PMCID: PMC9825778 DOI: 10.1093/bioinformatics/btac790]
[Received: 07/12/2022] [Revised: 10/26/2022] [Accepted: 12/06/2022] [Indexed: 12/13/2022]
Abstract
MOTIVATION
Genetic intra-tumor heterogeneity (ITH) characterizes the differences in genomic variations between tumor clones, and accurately unmasking ITH is important for personalized cancer therapy. Single-cell DNA sequencing has emerged as a powerful means of deciphering the underlying ITH from the point mutations of single cells. However, detecting tumor clones from single-cell mutation data remains challenging because the data are error-prone and discrete.
RESULTS
We introduce bmVAE, a bioinformatics tool that learns low-dimensional latent representations of single cells with a variational autoencoder and then clusters the cells into subpopulations in the latent space. bmVAE takes single-cell binary mutation data as input and outputs the inferred cell subpopulations along with their genotypes. To achieve this, the bmVAE framework consists of three modules: dimensionality reduction, cell clustering and genotype estimation. We assess the method on various synthetic datasets, simulated under varying false negative rate, data size and data heterogeneity, and further demonstrate its effectiveness on two real datasets. The results suggest that bmVAE is highly effective in reasoning about ITH and performs competitively with existing methods.
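The three-module data flow described above (dimensionality reduction, cell clustering, genotype estimation) can be illustrated with a toy sketch. This is not the bmVAE implementation: the encoder is replaced by a crude block-averaging placeholder rather than a variational autoencoder, the clustering step is assumed to be a deterministic k-means, and genotype estimation is assumed to be a per-cluster majority vote. All function names and the toy matrix are hypothetical.

```python
# Toy sketch of a reduce -> cluster -> genotype pipeline on binary mutation data.
# Every step is a stand-in chosen for illustration, NOT the bmVAE method itself.

def embed(cells, dim=2):
    """Placeholder for the VAE encoder: average contiguous blocks of mutation
    columns into `dim` coordinates (assumes dim <= number of mutations)."""
    n = len(cells[0])
    bounds = [(d * n // dim, (d + 1) * n // dim) for d in range(dim)]
    return [[sum(row[s:e]) / (e - s) for s, e in bounds] for row in cells]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20):
    """Deterministic k-means (assumed clustering step): farthest-point
    initialisation followed by a fixed number of Lloyd iterations."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def estimate_genotypes(cells, labels, k):
    """Assumed genotype estimation: a mutation is called present in a cluster
    when more than half of the cluster's cells carry it."""
    genotypes = []
    for j in range(k):
        members = [c for c, lab in zip(cells, labels) if lab == j]
        genotypes.append([int(2 * sum(col) > len(members)) for col in zip(*members)])
    return genotypes

# Hypothetical input: 6 cells x 4 mutations, two clones, two false negatives.
cells = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],  # false negative at mutation 1
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],  # false negative at mutation 2
]

latent = embed(cells, dim=2)
labels = kmeans(latent, k=2)
genotypes = estimate_genotypes(cells, labels, k=2)
print(labels)     # [0, 0, 0, 1, 1, 1]
print(genotypes)  # [[1, 1, 0, 0], [0, 0, 1, 1]]
```

In this toy case the majority vote recovers both clone genotypes despite the two dropped mutations, which is the kind of error correction the genotype estimation module is meant to provide. The number of clusters is fixed by hand here; how bmVAE chooses it is not stated in the abstract.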
AVAILABILITY AND IMPLEMENTATION
bmVAE is freely available at https://github.com/zhyu-lab/bmvae.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.