Chan TE, Stumpf MPH, Babtie AC. Gene Regulatory Network Inference from Single-Cell Data Using Multivariate Information Measures. Cell Syst 2017;5:251-267.e3. [PMID: 28957658 PMCID: PMC5624513 DOI: 10.1016/j.cels.2017.08.014]
[Citation(s) in RCA: 283] [Impact Index Per Article: 56.6] [Received: 10/10/2016] [Revised: 04/26/2017] [Accepted: 08/24/2017] [Indexed: 12/03/2022]
Abstract
While single-cell gene expression experiments present new challenges for data processing, the cell-to-cell variability observed also reveals statistical relationships that can be used by information theory. Here, we use multivariate information theory to explore the statistical dependencies between triplets of genes in single-cell gene expression datasets. We develop PIDC, a fast, efficient algorithm that uses partial information decomposition (PID) to identify regulatory relationships between genes. We thoroughly evaluate the performance of our algorithm and demonstrate that the higher-order information captured by PIDC allows it to outperform pairwise mutual information-based algorithms when recovering true relationships present in simulated data. We also infer gene regulatory networks from three experimental single-cell datasets and illustrate how network context, choices made during analysis, and sources of variability affect network inference. PIDC tutorials and open-source software for estimating PID are available. PIDC should facilitate the identification of putative functional relationships and mechanistic hypotheses from single-cell transcriptomic data.
PIDC infers gene regulatory networks from single-cell transcriptomic data
Multivariate information measures and context in PIDC improve network inference
Heterogeneity in single-cell data carries information about gene-gene interactions
Fast, efficient, open-source software is made freely available