1
Luo S, Germain PL, Robinson MD, von Meyenn F. Benchmarking computational methods for single-cell chromatin data analysis. Genome Biol 2024; 25:225. [PMID: 39152456 PMCID: PMC11328424 DOI: 10.1186/s13059-024-03356-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Accepted: 07/29/2024] [Indexed: 08/19/2024] Open
Abstract
BACKGROUND Single-cell chromatin accessibility assays, such as scATAC-seq, are increasingly employed in individual and joint multi-omic profiling of single cells. As the accumulation of scATAC-seq and multi-omics datasets continues, challenges in analyzing such sparse, noisy, and high-dimensional data become pressing. Specifically, one challenge relates to optimizing the processing of chromatin-level measurements and efficiently extracting information to discern cellular heterogeneity. This is of critical importance, since the identification of cell types is a fundamental step in current single-cell data analysis practices. RESULTS We benchmark 8 feature engineering pipelines derived from 5 recent methods to assess their ability to discover and discriminate cell types. By using 10 metrics calculated at the cell embedding, shared nearest neighbor graph, or partition levels, we evaluate the performance of each method at different data processing stages. This comprehensive approach allows us to thoroughly understand the strengths and weaknesses of each method and the influence of parameter selection. CONCLUSIONS Our analysis provides guidelines for choosing analysis methods for different datasets. Overall, feature aggregation, SnapATAC, and SnapATAC2 outperform latent semantic indexing-based methods. For datasets with complex cell-type structures, SnapATAC and SnapATAC2 are preferred. With large datasets, SnapATAC2 and ArchR are most scalable.
Affiliation(s)
- Siyuan Luo
- Laboratory of Nutrition and Metabolic Epigenetics, Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland
- Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- Pierre-Luc Germain
- Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
- Institute for Neuroscience, Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland
- Mark D Robinson
- Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland.
- SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland.
- Ferdinand von Meyenn
- Laboratory of Nutrition and Metabolic Epigenetics, Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland.
2
Rachid Zaim S, Pebworth MP, McGrath I, Okada L, Weiss M, Reading J, Czartoski JL, Torgerson TR, McElrath MJ, Bumol TF, Skene PJ, Li XJ. MOCHA's advanced statistical modeling of scATAC-seq data enables functional genomic inference in large human cohorts. Nat Commun 2024; 15:6828. [PMID: 39122670 PMCID: PMC11316085 DOI: 10.1038/s41467-024-50612-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Accepted: 07/13/2024] [Indexed: 08/12/2024] Open
Abstract
Single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) is being increasingly used to study gene regulation. However, major analytical gaps limit its utility in studying gene regulatory programs in complex diseases. In response, MOCHA (Model-based single cell Open CHromatin Analysis) presents major advances over existing analysis tools, including: 1) improving identification of sample-specific open chromatin, 2) statistical modeling of technical drop-out with zero-inflated methods, 3) mitigation of false positives in single cell analysis, 4) identification of alternative transcription-starting-site regulation, and 5) modules for inferring temporal gene regulatory networks from longitudinal data. These advances, in addition to open chromatin analyses, provide a robust framework after quality control and cell labeling to study gene regulatory programs in human disease. We benchmark MOCHA with four state-of-the-art tools to demonstrate its advances. We also construct cross-sectional and longitudinal gene regulatory networks, identifying potential mechanisms of COVID-19 response. MOCHA provides researchers with a robust analytical tool for functional genomic inference from scATAC-seq data.
Affiliation(s)
- Lauren Okada
- Allen Institute for Immunology, Seattle, WA, USA
- Morgan Weiss
- Allen Institute for Immunology, Seattle, WA, USA
- Julie L Czartoski
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- M Juliana McElrath
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Xiao-Jun Li
- Allen Institute for Immunology, Seattle, WA, USA.
3
Sun F, Li H, Sun D, Fu S, Gu L, Shao X, Wang Q, Dong X, Duan B, Xing F, Wu J, Xiao M, Zhao F, Han JDJ, Liu Q, Fan X, Li C, Wang C, Shi T. Single-cell omics: experimental workflow, data analyses and applications. Sci China Life Sci 2024. [PMID: 39060615 DOI: 10.1007/s11427-023-2561-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 04/18/2024] [Indexed: 07/28/2024]
Abstract
Cells are the fundamental units of biological systems and exhibit unique development trajectories and molecular features. Our exploration of how the genomes orchestrate the formation and maintenance of each cell, and control the cellular phenotypes of various organisms, is both captivating and intricate. Since the inception of the first single-cell RNA technology, technologies related to single-cell sequencing have experienced rapid advancements in recent years. These technologies have expanded horizontally to include single-cell genome, epigenome, proteome, and metabolome, while vertically, they have progressed to integrate multiple omics data and incorporate additional information such as spatial scRNA-seq and CRISPR screening. Single-cell omics represent a groundbreaking advancement in the biomedical field, offering profound insights into the understanding of complex diseases, including cancers. Here, we comprehensively summarize recent advances in single-cell omics technologies, with a specific focus on the methodology section. This overview aims to guide researchers in selecting appropriate methods for single-cell sequencing and related data analysis.
Affiliation(s)
- Fengying Sun
- Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China
- Haoyan Li
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- Dongqing Sun
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Shaliu Fu
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China
- Lei Gu
- Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Xin Shao
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314103, China
- Qinqin Wang
- Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Xin Dong
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Bin Duan
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China
- Feiyang Xing
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Jun Wu
- Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China
- Minmin Xiao
- Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China.
- Fangqing Zhao
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, 100101, China.
- Jing-Dong J Han
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Center for Quantitative Biology (CQB), Peking University, Beijing, 100871, China.
- Qi Liu
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China.
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China.
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China.
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China.
- Xiaohui Fan
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China.
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314103, China.
- Zhejiang Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou, 310006, China.
- Chen Li
- Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
- Chenfei Wang
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China.
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China.
- Tieliu Shi
- Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China.
- Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China.
- Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, School of Statistics, East China Normal University, Shanghai, 200062, China.
4
Guan PY, Lee JS, Wang L, Lin KZ, Mei W, Chen L, Jiang Y. Destin2: Integrative and cross-modality analysis of single-cell chromatin accessibility data. Front Genet 2023; 14:1089936. [PMID: 36873935 PMCID: PMC9981783 DOI: 10.3389/fgene.2023.1089936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Accepted: 02/06/2023] [Indexed: 02/19/2023] Open
Abstract
We propose Destin2, a novel statistical and computational method for cross-modality dimension reduction, clustering, and trajectory reconstruction for single-cell ATAC-seq data. The framework integrates cellular-level epigenomic profiles from peak accessibility, motif deviation score, and pseudo-gene activity and learns a shared manifold using the multimodal input, followed by clustering and/or trajectory inference. We apply Destin2 to real scATAC-seq datasets with both discretized cell types and transient cell states and carry out benchmarking studies against existing methods based on unimodal analyses. Using cell-type labels transferred with high confidence from unmatched single-cell RNA sequencing data, we adopt four performance assessment metrics and demonstrate how Destin2 corroborates and improves upon existing methods. Using single-cell RNA and ATAC multiomic data, we further exemplify how Destin2's cross-modality integrative analyses preserve true cell-cell similarities using the matched cell pairs as ground truths. Destin2 is compiled as a freely available R package at https://github.com/yuchaojiang/Destin2.
Affiliation(s)
- Peter Y Guan
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Jin Seok Lee
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Lihao Wang
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Kevin Z Lin
- Department of Statistics, University of Pennsylvania, Philadelphia, PA, United States
- Wenwen Mei
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Li Chen
- Department of Biostatistics, University of Florida, Gainesville, FL, United States
- Yuchao Jiang
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
5
Preissl S, Gaulton KJ, Ren B. Characterizing cis-regulatory elements using single-cell epigenomics. Nat Rev Genet 2023; 24:21-43. [PMID: 35840754 PMCID: PMC9771884 DOI: 10.1038/s41576-022-00509-1] [Citation(s) in RCA: 65] [Impact Index Per Article: 65.0] [Accepted: 05/24/2022] [Indexed: 12/24/2022]
Abstract
Cell type-specific gene expression patterns and dynamics during development or in disease are controlled by cis-regulatory elements (CREs), such as promoters and enhancers. Distinct classes of CREs can be characterized by their epigenomic features, including DNA methylation, chromatin accessibility, combinations of histone modifications and conformation of local chromatin. Tremendous progress has been made in cataloguing CREs in the human genome using bulk transcriptomic and epigenomic methods. However, single-cell epigenomic and multi-omic technologies have the potential to provide deeper insight into cell type-specific gene regulatory programmes as well as into how they change during development, in response to environmental cues and through disease pathogenesis. Here, we highlight recent advances in single-cell epigenomic methods and analytical tools and discuss their readiness for human tissue profiling.
Affiliation(s)
- Sebastian Preissl
- Center for Epigenomics, University of California San Diego, La Jolla, CA, USA.
- Institute of Experimental and Clinical Pharmacology and Toxicology, Faculty of Medicine, University of Freiburg, Freiburg, Germany.
- Kyle J Gaulton
- Department of Paediatrics, Paediatric Diabetes Research Center, University of California San Diego, La Jolla, CA, USA.
- Bing Ren
- Center for Epigenomics, University of California San Diego, La Jolla, CA, USA.
- Department of Cellular and Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, CA, USA.
- Ludwig Institute for Cancer Research, La Jolla, CA, USA.
6
Albrecht S, Andreani T, Andrade-Navarro MA, Fontaine JF. Single-cell specific and interpretable machine learning models for sparse scChIP-seq data imputation. PLoS One 2022; 17:e0270043. [PMID: 35776722 PMCID: PMC9249201 DOI: 10.1371/journal.pone.0270043] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/17/2021] [Accepted: 06/02/2022] [Indexed: 11/19/2022] Open
Abstract
MOTIVATION Single-cell Chromatin ImmunoPrecipitation DNA-Sequencing (scChIP-seq) analysis is challenging due to data sparsity. A high degree of sparsity in biological high-throughput single-cell data is generally handled with imputation methods that complete the data, but specific methods for scChIP-seq are lacking. We present SIMPA, a scChIP-seq data imputation method leveraging predictive information within bulk data from the ENCODE project to impute missing protein-DNA interacting regions of target histone marks or transcription factors. RESULTS Imputations using machine learning models trained for each single cell, each ChIP protein target, and each genomic region accurately preserve cell type clustering and improve pathway-related gene identification on real human data. Results on bulk data simulating single cells show that the imputations are single-cell specific as the imputed profiles are closer to the simulated cell than to other cells related to the same ChIP protein target and the same cell type. Simulations also show that 100 input genomic regions are already enough to train single-cell specific models for the imputation of thousands of undetected regions. Furthermore, SIMPA enables the interpretation of machine learning models by revealing interaction sites of a given single cell that are most important for the imputation model trained for a specific genomic region. The corresponding feature importance values derived from promoter-interaction profiles of H3K4me3, an activating histone mark, highly correlate with co-expression of genes that are present within the cell-type specific pathways in 2 real human and mouse datasets. SIMPA's interpretable imputation method allows users to gain a deep understanding of individual cells and, consequently, of sparse scChIP-seq datasets. AVAILABILITY AND IMPLEMENTATION Our interpretable imputation algorithm was implemented in Python and is available at https://github.com/salbrec/SIMPA.
Affiliation(s)
- Steffen Albrecht
- Institute of Organismic and Molecular Evolution (iOME), Faculty of Biology, Johannes Gutenberg University Mainz, Mainz, Germany
- Tommaso Andreani
- Institute of Organismic and Molecular Evolution (iOME), Faculty of Biology, Johannes Gutenberg University Mainz, Mainz, Germany
- Institute of Molecular Biology, Mainz, Germany
- Miguel A. Andrade-Navarro
- Institute of Organismic and Molecular Evolution (iOME), Faculty of Biology, Johannes Gutenberg University Mainz, Mainz, Germany
- Jean Fred Fontaine
- Institute of Organismic and Molecular Evolution (iOME), Faculty of Biology, Johannes Gutenberg University Mainz, Mainz, Germany
7
Wang M, Song WM, Ming C, Wang Q, Zhou X, Xu P, Krek A, Yoon Y, Ho L, Orr ME, Yuan GC, Zhang B. Guidelines for bioinformatics of single-cell sequencing data analysis in Alzheimer's disease: review, recommendation, implementation and application. Mol Neurodegener 2022; 17:17. [PMID: 35236372 PMCID: PMC8889402 DOI: 10.1186/s13024-022-00517-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Received: 07/04/2021] [Accepted: 01/18/2022] [Indexed: 12/13/2022] Open
Abstract
Alzheimer's disease (AD) is the most common form of dementia, characterized by progressive cognitive impairment and neurodegeneration. Extensive clinical and genomic studies have revealed biomarkers, risk factors, pathways, and targets of AD in the past decade. However, the exact molecular basis of AD development and progression remains elusive. The emerging single-cell sequencing technology can potentially provide cell-level insights into the disease. Here we systematically review the state-of-the-art bioinformatics approaches to analyze single-cell sequencing data and their applications to AD in 14 major directions, including 1) quality control and normalization, 2) dimension reduction and feature extraction, 3) cell clustering analysis, 4) cell type inference and annotation, 5) differential expression, 6) trajectory inference, 7) copy number variation analysis, 8) integration of single-cell multi-omics, 9) epigenomic analysis, 10) gene network inference, 11) prioritization of cell subpopulations, 12) integrative analysis of human and mouse sc-RNA-seq data, 13) spatial transcriptomics, and 14) comparison of single cell AD mouse model studies and single cell human AD studies. We also address challenges in using human postmortem and mouse tissues and outline future developments in single cell sequencing data analysis. Importantly, we have implemented our recommended workflow for each major analytic direction and applied them to a large single nucleus RNA-sequencing (snRNA-seq) dataset in AD. Key analytic results are reported while the scripts and the data are shared with the research community through GitHub. In summary, this comprehensive review provides insights into various approaches to analyze single cell sequencing data and offers specific guidelines for study design and a variety of analytic directions. The review and the accompanying software tools will serve as a valuable resource for studying cellular and molecular mechanisms of AD, other diseases, or biological systems at the single cell level.
Affiliation(s)
- Minghui Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Won-min Song
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Chen Ming
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Qian Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Xianxiao Zhou
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Peng Xu
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Azra Krek
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029 USA
- Yonejung Yoon
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Lap Ho
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Miranda E. Orr
- Department of Internal Medicine, Section of Gerontology and Geriatric Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina USA
- Sticht Center for Healthy Aging and Alzheimer’s Prevention, Wake Forest School of Medicine, Winston-Salem, North Carolina USA
- Guo-Cheng Yuan
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029 USA
- Bin Zhang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
8
Kusi M, Zand M, Lin LL, Chen M, Lopez A, Lin CL, Wang CM, Lucio ND, Kirma NB, Ruan J, Huang THM, Mitsuya K. 2-Hydroxyglutarate destabilizes chromatin regulatory landscape and lineage fidelity to promote cellular heterogeneity. Cell Rep 2022; 38:110220. [PMID: 35021081 PMCID: PMC8811753 DOI: 10.1016/j.celrep.2021.110220] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 03/03/2021] [Revised: 09/23/2021] [Accepted: 12/15/2021] [Indexed: 02/07/2023] Open
Abstract
The epigenome delineates lineage-specific transcriptional programs and restricts cell plasticity to prevent non-physiological cell fate transitions. Although cell diversification fosters tumor evolution and therapy resistance, upstream mechanisms that regulate the stability and plasticity of the cancer epigenome remain elusive. Here we show that 2-hydroxyglutarate (2HG) not only suppresses DNA repair but also mediates the high-plasticity chromatin landscape. A combination of single-cell epigenomics and multi-omics approaches demonstrates that 2HG disarranges otherwise well-preserved stable nucleosome positioning and promotes cell-to-cell variability. 2HG induces loss of motif accessibility to the luminal-defining transcriptional factors FOXA1, FOXP1, and GATA3 and a shift from luminal to basal-like gene expression. Breast tumors with high 2HG exhibit enhanced heterogeneity with undifferentiated epigenomic signatures linked to adverse prognosis. Further, ascorbate-2-phosphate (A2P) eradicates heterogeneity and impairs growth of high 2HG-producing breast cancer cells. These findings suggest 2HG as a key determinant of cancer plasticity and provide a rational strategy to counteract tumor cell evolution. Kusi et al. show that the oncometabolite 2-hydroxyglutarate (2HG) initiates cell-level epigenome fluctuations in the chromatin regulatory landscape, accompanied by loss of lineage fidelity. Breast tumors with high 2HG accumulation exhibit enhanced cellular heterogeneity with undifferentiated stem-like epigenomic signatures. The findings suggest metabolic derangement as a molecular origin of breast cancer heterogeneity.
Affiliation(s)
- Meena Kusi
- Department of Molecular Medicine, University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, TX 78229, USA
- Maryam Zand
- Department of Computer Science, University of Texas at San Antonio, San Antonio, TX 78249, USA
- Li-Ling Lin
- Department of Molecular Medicine, University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, TX 78229, USA
- Meizhen Chen
- Department of Molecular Medicine, University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, TX 78229, USA
- Anthony Lopez
- Department of Molecular Medicine, University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, TX 78229, USA
- Chun-Lin Lin
- Department of Molecular Medicine, University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, TX 78229, USA
- Chiou-Miin Wang
- Department of Molecular Medicine, University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, TX 78229, USA
- Nicholas D Lucio
- Department of Molecular Medicine, University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, TX 78229, USA
- Nameer B Kirma
- Department of Molecular Medicine, University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, TX 78229, USA
| | - Jianhua Ruan
- Department of Computer Science, University of Texas at San Antonio, San Antonio, TX 78249, USA
| | - Tim H-M Huang
- Department of Molecular Medicine, University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, TX 78229, USA.
| | - Kohzoh Mitsuya
- Department of Molecular Medicine, University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, TX 78229, USA.
| |
|
9
|
Stuart T, Srivastava A, Madad S, Lareau CA, Satija R. Single-cell chromatin state analysis with Signac. Nat Methods 2021; 18:1333-1341. [PMID: 34725479 PMCID: PMC9255697 DOI: 10.1038/s41592-021-01282-5] [Citation(s) in RCA: 549] [Impact Index Per Article: 183.0] [Received: 11/17/2020] [Accepted: 08/27/2021] [Indexed: 11/08/2022]
Abstract
The recent development of experimental methods for measuring chromatin state at single-cell resolution has created a need for computational tools capable of analyzing these datasets. Here we developed Signac, a comprehensive toolkit for the analysis of single-cell chromatin data. Signac enables an end-to-end analysis of single-cell chromatin data, including peak calling, quantification, quality control, dimension reduction, clustering, integration with single-cell gene expression datasets, DNA motif analysis and interactive visualization. Through its seamless compatibility with the Seurat package, Signac facilitates the analysis of diverse multimodal single-cell chromatin data, including datasets that co-assay DNA accessibility with gene expression, protein abundance and mitochondrial genotype. We demonstrate scaling of the Signac framework to analyze datasets containing over 700,000 cells.
Affiliation(s)
- Tim Stuart
- New York Genome Center, New York City, NY, USA.
- Center for Genomics and Systems Biology, New York University, New York City, NY, USA.
| | - Avi Srivastava
- New York Genome Center, New York City, NY, USA
- Center for Genomics and Systems Biology, New York University, New York City, NY, USA
| | - Shaista Madad
- New York Genome Center, New York City, NY, USA
- Center for Genomics and Systems Biology, New York University, New York City, NY, USA
| | - Caleb A Lareau
- Department of Genetics and Pathology, Stanford University, Stanford, CA, USA
| | - Rahul Satija
- New York Genome Center, New York City, NY, USA.
- Center for Genomics and Systems Biology, New York University, New York City, NY, USA.
| |
|
10
|
Asada K, Takasawa K, Machino H, Takahashi S, Shinkai N, Bolatkan A, Kobayashi K, Komatsu M, Kaneko S, Okamoto K, Hamamoto R. Single-Cell Analysis Using Machine Learning Techniques and Its Application to Medical Research. Biomedicines 2021; 9:biomedicines9111513. [PMID: 34829742 PMCID: PMC8614827 DOI: 10.3390/biomedicines9111513] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/17/2021] [Revised: 10/06/2021] [Accepted: 10/19/2021] [Indexed: 01/14/2023] Open
Abstract
In recent years, the diversity of cancer cells in tumor tissues as a result of intratumor heterogeneity has attracted attention. In particular, the development of single-cell analysis technology has made a significant contribution to the field; technologies that are centered on single-cell RNA sequencing (scRNA-seq) have been reported to analyze cancer constituent cells, identify cell groups responsible for therapeutic resistance, and analyze gene signatures of resistant cell groups. However, although single-cell analysis is a powerful tool, various issues have been reported, including batch effects and transcriptional noise due to gene expression variation and mRNA degradation. To overcome these issues, machine learning techniques are currently being introduced for single-cell analysis, and promising results are being reported. In addition, machine learning has also been used in various ways for single-cell analysis, such as single-cell assay of transposase accessible chromatin sequencing (ATAC-seq), chromatin immunoprecipitation sequencing (ChIP-seq) analysis, and multi-omics analysis; thus, it contributes to a deeper understanding of the characteristics of human diseases, especially cancer, and supports clinical applications. In this review, we present a comprehensive introduction to the implementation of machine learning techniques in medical research for single-cell analysis, and discuss their usefulness and future potential.
Affiliation(s)
- Ken Asada
- Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan; (K.T.); (H.M.); (S.T.); (N.S.); (A.B.); (M.K.)
- Correspondence: (K.A.); (R.H.); Tel.: +81-3-3547-5271 (R.H.)
| | - Ken Takasawa
- Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan; (K.T.); (H.M.); (S.T.); (N.S.); (A.B.); (M.K.)
| | - Hidenori Machino
- Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan; (K.T.); (H.M.); (S.T.); (N.S.); (A.B.); (M.K.)
| | - Satoshi Takahashi
- Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan; (K.T.); (H.M.); (S.T.); (N.S.); (A.B.); (M.K.)
| | - Norio Shinkai
- Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan; (K.T.); (H.M.); (S.T.); (N.S.); (A.B.); (M.K.)
- Department of NCC Cancer Science, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, 1-5-45 Yushima, Bunkyo-ku, Tokyo 113-8510, Japan
| | - Amina Bolatkan
- Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan; (K.T.); (H.M.); (S.T.); (N.S.); (A.B.); (M.K.)
- Division of Medical AI Research and Development, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan; (K.K.); (S.K.)
| | - Kazuma Kobayashi
- Division of Medical AI Research and Development, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan; (K.K.); (S.K.)
| | - Masaaki Komatsu
- Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan; (K.T.); (H.M.); (S.T.); (N.S.); (A.B.); (M.K.)
| | - Syuzo Kaneko
- Division of Medical AI Research and Development, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan; (K.K.); (S.K.)
| | - Koji Okamoto
- Division of Cancer Differentiation, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan;
| | - Ryuji Hamamoto
- Department of NCC Cancer Science, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, 1-5-45 Yushima, Bunkyo-ku, Tokyo 113-8510, Japan
- Division of Medical AI Research and Development, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan; (K.K.); (S.K.)
- Correspondence: (K.A.); (R.H.); Tel.: +81-3-3547-5271 (R.H.)
| |
|
11
|
Danese A, Richter ML, Chaichoompu K, Fischer DS, Theis FJ, Colomé-Tatché M. EpiScanpy: integrated single-cell epigenomic analysis. Nat Commun 2021; 12:5228. [PMID: 34471111 PMCID: PMC8410937 DOI: 10.1038/s41467-021-25131-3] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Received: 08/30/2020] [Accepted: 07/22/2021] [Indexed: 11/14/2022] Open
Abstract
EpiScanpy is a toolkit for the analysis of single-cell epigenomic data, namely single-cell DNA methylation and single-cell ATAC-seq data. To address the modality specific challenges from epigenomics data, epiScanpy quantifies the epigenome using multiple feature space constructions and builds a nearest neighbour graph using epigenomic distance between cells. EpiScanpy makes the many existing scRNA-seq workflows from scanpy available to large-scale single-cell data from other -omics modalities, including methods for common clustering, dimension reduction, cell type identification and trajectory learning techniques, as well as an atlas integration tool for scATAC-seq datasets. The toolkit also features numerous useful downstream functions, such as differential methylation and differential openness calling, mapping epigenomic features of interest to their nearest gene, or constructing gene activity matrices using chromatin openness. We successfully benchmark epiScanpy against other scATAC-seq analysis tools and show its outperformance at discriminating cell types.
Affiliation(s)
- Anna Danese
- Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
| | - Maria L Richter
- Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
| | - Kridsadakorn Chaichoompu
- Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
| | - David S Fischer
- Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
| | - Fabian J Theis
- Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany.
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany.
- Department of Mathematics, Technical University of Munich, Garching, Germany.
| | - Maria Colomé-Tatché
- Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany.
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany.
- Biomedical Center (BMC), Physiological Chemistry, Faculty of Medicine, LMU Munich, Planegg-Martinsried, Germany.
| |
|
12
|
Yu F, Sankaran VG, Yuan GC. CUT&RUNTools 2.0: a pipeline for single-cell and bulk-level CUT&RUN and CUT&Tag data analysis. Bioinformatics 2021; 38:252-254. [PMID: 34244724 PMCID: PMC8696090 DOI: 10.1093/bioinformatics/btab507] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 02/10/2021] [Revised: 07/01/2021] [Accepted: 07/07/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION Genome-wide profiling of transcription factor binding and chromatin states is a widely-used approach for mechanistic understanding of gene regulation. Recent technology development has enabled such profiling at single-cell resolution. However, an end-to-end computational pipeline for analyzing such data is still lacking. RESULTS Here, we have developed a flexible pipeline for analysis and visualization of single-cell CUT&Tag and CUT&RUN data, which provides functions for sequence alignment, quality control, dimensionality reduction, cell clustering, data aggregation and visualization. Furthermore, it is also seamlessly integrated with the functions in original CUT&RUNTools for population-level analyses. As such, this provides a valuable toolbox for the community. AVAILABILITY AND IMPLEMENTATION https://github.com/fl-yu/CUT-RUNTools-2.0. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Fulong Yu
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Division of Hematology/Oncology, Boston Children’s Hospital, Boston, MA 02115, USA; Department of Pediatrics, Harvard Medical School, Boston, MA 02115, USA; Program in Medical & Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02115, USA
| | - Vijay G Sankaran
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Division of Hematology/Oncology, Boston Children’s Hospital, Boston, MA 02115, USA; Department of Pediatrics, Harvard Medical School, Boston, MA 02115, USA; Program in Medical & Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02115, USA
| | | |
|
13
|
Gautam V, Mittal A, Kalra S, Mohanty SK, Gupta K, Rani K, Naidu S, Mishra T, Sengupta D, Ahuja G. EcTracker: Tracking and elucidating ectopic expression leveraging large-scale scRNA-seq studies. Brief Bioinform 2021; 22:6309926. [PMID: 34184038 DOI: 10.1093/bib/bbab237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2021] [Revised: 05/31/2021] [Accepted: 06/01/2021] [Indexed: 11/13/2022] Open
Abstract
Dramatic genomic alterations, either inducible or in a pathological state, dismantle the core regulatory networks, leading to the activation of normally silent genes. Despite possessing immense therapeutic potential, accurate detection of these transcripts is an ever-challenging task, as it requires prior knowledge of the physiological gene expression levels. Here, we introduce EcTracker, an R-/Shiny-based single-cell data analysis web server that bestows a plethora of functionalities that collectively enable the quantitative and qualitative assessments of bona fide cell types or tissue-specific transcripts and, conversely, the ectopically expressed genes in the single-cell ribonucleic acid sequencing datasets. Moreover, it also allows regulon analysis to identify the key transcriptional factors regulating the user-selected gene signatures. To demonstrate the EcTracker functionality, we reanalyzed the CRISPR interference (CRISPRi) dataset of the human embryonic stem cells differentiated into endoderm lineage and identified the prominent enrichment of a specific gene signature in the SMAD2 knockout cells whose identity was ambiguous in the original study. The key distinguishing features of EcTracker lie within its processing speed, availability of multiple add-on modules, interactive graphical user interface and comprehensiveness. In summary, EcTracker provides an easy-to-perform, integrative and end-to-end single-cell data analysis platform that allows decoding of cellular identities, identification of ectopically expressed genes and their regulatory networks, and therefore, collectively imparts a novel dimension for analyzing single-cell datasets.
Affiliation(s)
- Vishakha Gautam
- Indraprastha Institute of Information Technology, Delhi, India
| | - Aayushi Mittal
- Indraprastha Institute of Information Technology, Delhi, India
| | - Siddhant Kalra
- Indraprastha Institute of Information Technology, Delhi, India
| | | | - Krishan Gupta
- Indraprastha Institute of Information Technology, Delhi, India
| | - Komal Rani
- Indraprastha Institute of Information Technology, Delhi, India
| | - Srivatsava Naidu
- Department of Biomedical Engineering, Indian Institute of Technology Ropar, India
| | | | - Debarka Sengupta
- Department of Computational Biology and Department of Computer Science at the Indraprastha Institute of Information Technology, India
| | - Gaurav Ahuja
- Department of Computational Biology at the Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), India
| |
|
14
|
Nussbaum DP, Rushing CN, Sun Z, Yerokun BA, Worni M, Saunders RS, McClellan MB, Niedzwiecki D, Greenup RA, Blazer DG. Hospital-level compliance with the commission on cancer's quality of care measures and the association with patient survival. Cancer Med 2021; 10:3533-3544. [PMID: 33943026 PMCID: PMC8178497 DOI: 10.1002/cam4.3875] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/14/2020] [Revised: 03/08/2021] [Accepted: 03/13/2021] [Indexed: 11/23/2022] Open
Abstract
Background Quality measurement has become a priority for national healthcare reform, and valid measures are necessary to discriminate hospital performance and support value‐based healthcare delivery. The Commission on Cancer (CoC) is the largest cancer‐specific accreditor of hospital quality in the United States and has implemented Quality of Care Measures to evaluate cancer care delivery. However, none has been formally tested as a valid metric for assessing hospital performance based on actual patient outcomes. Methods Eligibility and compliance with the Quality of Care Measures are reported within the National Cancer Database, which also captures data for robust patient‐level risk adjustment. Hospital‐level compliance was calculated for the core measures, and the association with patient survival was tested using Cox regression. Results Seven hundred sixty‐eight thousand nine hundred sixty‐nine unique cancer cases were included from 1323 facilities. Increasing hospital‐level compliance was associated with improved survival for only two measures, including a 35% reduced risk of mortality for the gastric cancer measure G15RLN (HR 0.65, 95% CI 0.58–0.72) and a 19% reduced risk of mortality for the colon cancer measure 12RLN (HR 0.81, 95% CI 0.77–0.85). For the lung cancer measure LNoSurg, increasing compliance was paradoxically associated with an increased risk of mortality (HR 1.14, 95% CI 1.08–1.20). For the remaining measures, hospital‐level compliance demonstrated no consistent association with patient survival. Conclusion Hospital‐level compliance with the CoC’s Quality of Care Measures is not uniformly aligned with patient survival. In their current form, these measures do not reliably discriminate hospital performance and are limited as a tool for value‐based healthcare delivery.
Affiliation(s)
| | - Christel N Rushing
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
| | - Zhifei Sun
- Department of Surgery, Duke University, Durham, NC, USA
| | | | - Mathias Worni
- Department of Visceral Surgery, Clarunis, University Centre for Gastrointestinal and Liver Diseases, St. Clara Hospital and University Hospital, Basel, Switzerland; Swiss Institute for Translational and Entrepreneurial Medicine, Bern, Switzerland
| | - Robert S Saunders
- Duke University, Robert J. Margolis Center for Health Policy, Durham, NC, USA
| | - Mark B McClellan
- Duke University, Robert J. Margolis Center for Health Policy, Durham, NC, USA
| | - Donna Niedzwiecki
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
| | - Rachel A Greenup
- Department of Surgery and Population Health Sciences, Duke University, Duke Cancer Institute, Durham, NC, USA
| | - Dan G Blazer
- Department of Surgery, Duke University, Duke Cancer Institute, Durham, NC, USA
| |
|
15
|
Rai MF, Wu CL, Capellini TD, Guilak F, Dicks AR, Muthuirulan P, Grandi F, Bhutani N, Westendorf JJ. Single Cell Omics for Musculoskeletal Research. Curr Osteoporos Rep 2021; 19:131-140. [PMID: 33559841 PMCID: PMC8743139 DOI: 10.1007/s11914-021-00662-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 01/19/2021] [Indexed: 02/04/2023]
Abstract
PURPOSE OF REVIEW The ability to analyze the molecular events occurring within individual cells as opposed to populations of cells is revolutionizing our understanding of musculoskeletal tissue development and disease. Single cell studies have the great potential of identifying cellular subpopulations that work in a synchronized fashion to regenerate and repair damaged tissues during normal homeostasis. In addition, such studies can elucidate how these processes break down in disease as well as identify cellular subpopulations that drive the disease. This review highlights three emerging technologies: single cell RNA sequencing (scRNA-seq), Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq), and Cytometry by Time-Of-Flight (CyTOF) mass cytometry. RECENT FINDINGS Technological and bioinformatic tools to analyze the transcriptome, epigenome, and proteome at the individual cell level have advanced rapidly making data collection relatively easy; however, understanding how to access and interpret the data remains a challenge for many scientists. It is, therefore, of paramount significance to educate the musculoskeletal community on how single cell technologies can be used to answer research questions and advance translation. This article summarizes talks given during a workshop on "Single Cell Omics" at the 2020 annual meeting of the Orthopedic Research Society. Studies that applied scRNA-seq, ATAC-seq, and CyTOF mass cytometry to cartilage development and osteoarthritis are reviewed. This body of work shows how these cutting-edge tools can advance our understanding of the cellular heterogeneity and trajectories of lineage specification during development and disease.
Affiliation(s)
- Muhammad Farooq Rai
- Department of Orthopaedic Surgery, Washington University, St. Louis, MO, USA
| | - Chia-Lung Wu
- Department of Orthopaedic Surgery, Washington University and Shriners Hospitals for Children, St. Louis, MO, USA
| | - Terence D Capellini
- Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
| | - Farshid Guilak
- Department of Orthopaedic Surgery, Washington University and Shriners Hospitals for Children, St. Louis, MO, USA
| | - Amanda R Dicks
- Department of Orthopaedic Surgery, Washington University and Shriners Hospitals for Children, St. Louis, MO, USA
| | | | - Fiorella Grandi
- Department of Orthopedic Surgery, Stanford University, Stanford, CA, USA
| | - Nidhi Bhutani
- Department of Orthopedic Surgery, Stanford University, Stanford, CA, USA
| | | |
|
16
|
Navidi Z, Zhang L, Wang B. simATAC: a single-cell ATAC-seq simulation framework. Genome Biol 2021; 22:74. [PMID: 33663563 PMCID: PMC7934446 DOI: 10.1186/s13059-021-02270-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/27/2020] [Accepted: 01/13/2021] [Indexed: 12/21/2022] Open
Abstract
Single-cell assay for transposase-accessible chromatin sequencing (scATAC-seq) identifies regulated chromatin accessibility modules at the single-cell resolution. Robust evaluation is critical to the development of scATAC-seq pipelines, which calls for reproducible datasets for benchmarking. We hereby present the simATAC framework, an R package that generates scATAC-seq count matrices that highly resemble real scATAC-seq datasets in library size, sparsity, and chromatin accessibility signals. simATAC deploys statistical models derived from analyzing 90 real scATAC-seq cell groups. simATAC provides a robust and systematic approach to generate in silico scATAC-seq samples with known cell labels for assessing analytical pipelines.
Affiliation(s)
- Zeinab Navidi
- Peter Munk Cardiac Centre, University Health Network, Toronto, Canada
| | - Lin Zhang
- Department of Statistical Sciences, University of Toronto, Toronto, Canada
| | - Bo Wang
- Peter Munk Cardiac Centre, University Health Network, Toronto, Canada; Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Canada; Department of Computer Science, University of Toronto, Toronto, Canada; Vector Institute, Toronto, Canada.
| |
|
17
|
Granja JM, Corces MR, Pierce SE, Bagdatli ST, Choudhry H, Chang HY, Greenleaf WJ. ArchR is a scalable software package for integrative single-cell chromatin accessibility analysis. Nat Genet 2021; 53:403-411. [PMID: 33633365 PMCID: PMC8012210 DOI: 10.1038/s41588-021-00790-6] [Citation(s) in RCA: 550] [Impact Index Per Article: 183.3] [Received: 05/27/2020] [Accepted: 01/19/2021] [Indexed: 12/26/2022]
Abstract
The advent of single-cell chromatin accessibility profiling has accelerated the ability to map gene regulatory landscapes but has outpaced the development of scalable software to rapidly extract biological meaning from these data. Here we present a software suite for single-cell analysis of regulatory chromatin in R (ArchR; https://www.archrproject.com/) that enables fast and comprehensive analysis of single-cell chromatin accessibility data. ArchR provides an intuitive, user-focused interface for complex single-cell analyses, including doublet removal, single-cell clustering and cell type identification, unified peak set generation, cellular trajectory identification, DNA element-to-gene linkage, transcription factor footprinting, mRNA expression level prediction from chromatin accessibility and multi-omic integration with single-cell RNA sequencing (scRNA-seq). Enabling the analysis of over 1.2 million single cells within 8 h on a standard Unix laptop, ArchR is a comprehensive software suite for end-to-end analysis of single-cell chromatin accessibility that will accelerate the understanding of gene regulation at the resolution of individual cells. ArchR is a software suite that enables efficient and end-to-end analysis of single-cell chromatin accessibility data (scATAC-seq).
Affiliation(s)
- Jeffrey M Granja
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA; Program in Biophysics, Stanford University, Stanford, CA, USA; Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA, USA.
| | - M Ryan Corces
- Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA, USA; Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA; Gladstone Institute of Neurological Disease, Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
| | - Sarah E Pierce
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA; Program in Cancer Biology, Stanford University School of Medicine, Stanford, CA, USA
| | - S Tansu Bagdatli
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Hani Choudhry
- Department of Biochemistry, Faculty of Science, Cancer and Mutagenesis Unit, King Fahd Center for Medical Research, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Howard Y Chang
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA; Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA, USA; Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA.
| | - William J Greenleaf
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA; Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA, USA; Department of Applied Physics, Stanford University, Stanford, CA, USA; Chan Zuckerberg Biohub, San Francisco, CA, USA.
| |
|
18
|
Sinha S, Satpathy AT, Zhou W, Ji H, Stratton JA, Jaffer A, Bahlis N, Morrissy S, Biernaskie JA. Profiling Chromatin Accessibility at Single-cell Resolution. Genomics Proteomics & Bioinformatics 2021; 19:172-190. [PMID: 33581341 PMCID: PMC8602754 DOI: 10.1016/j.gpb.2020.06.010] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/06/2019] [Revised: 03/04/2020] [Accepted: 08/15/2020] [Indexed: 01/22/2023]
Abstract
How distinct transcriptional programs are enacted to generate cellular heterogeneity and plasticity, and enable complex fate decisions are important open questions. One key regulator is the cell’s epigenome state that drives distinct transcriptional programs by regulating chromatin accessibility. Genome-wide chromatin accessibility measurements can impart insights into regulatory sequences (in)accessible to DNA-binding proteins at a single-cell resolution. This review outlines molecular methods and bioinformatic tools for capturing cell-to-cell chromatin variation using single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) in a scalable fashion. It also covers joint profiling of chromatin with transcriptome/proteome measurements, computational strategies to integrate multi-omic measurements, and predictive bioinformatic tools to infer chromatin accessibility from single-cell transcriptomic datasets. Methodological refinements that increase power for cell discovery through robust chromatin coverage and integrate measurements from multiple modalities will further expand our understanding of gene regulation during homeostasis and disease.
Affiliation(s)
- Sarthak Sinha
- Department of Comparative Biology & Experimental Medicine, Faculty of Veterinary Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada.
| | - Ansuman T Satpathy
- Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Weiqiang Zhou
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD 21205, USA
| | - Hongkai Ji
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD 21205, USA
| | - Jo A Stratton
- Department of Comparative Biology & Experimental Medicine, Faculty of Veterinary Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada; Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada; Hotchkiss Brain Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
| | - Arzina Jaffer
- Department of Comparative Biology & Experimental Medicine, Faculty of Veterinary Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada
| | - Nizar Bahlis
- Arnie Charbonneau Cancer Institute, University of Calgary, Calgary, AB T2N 4Z6, Canada
| | - Sorana Morrissy
- Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada; Arnie Charbonneau Cancer Institute, University of Calgary, Calgary, AB T2N 4Z6, Canada; Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada
| | - Jeff A Biernaskie
- Department of Comparative Biology & Experimental Medicine, Faculty of Veterinary Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada; Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada; Hotchkiss Brain Institute, University of Calgary, Calgary, AB T2N 4N1, Canada.
| |
|
19
|
Minnoye L, Marinov GK, Krausgruber T, Pan L, Marand AP, Secchia S, Greenleaf WJ, Furlong EEM, Zhao K, Schmitz RJ, Bock C, Aerts S. Chromatin accessibility profiling methods. Nature Reviews. Methods Primers 2021; 1:10. [PMID: 38410680 PMCID: PMC10895463 DOI: 10.1038/s43586-020-00008-9] [Citation(s) in RCA: 66] [Impact Index Per Article: 22.0] [Accepted: 12/01/2020] [Indexed: 02/06/2023]
Abstract
Chromatin accessibility, or the physical access to chromatinized DNA, is a widely studied characteristic of the eukaryotic genome. As active regulatory DNA elements are generally 'accessible', the genome-wide profiling of chromatin accessibility can be used to identify candidate regulatory genomic regions in a tissue or cell type. Multiple biochemical methods have been developed to profile chromatin accessibility, both in bulk and at the single-cell level. Depending on the method, enzymatic cleavage, transposition or DNA methyltransferases are used, followed by high-throughput sequencing, providing a view of genome-wide chromatin accessibility. In this Primer, we discuss these biochemical methods, as well as bioinformatics tools for analysing and interpreting the generated data, and insights into the key regulators underlying developmental, evolutionary and disease processes. We outline standards for data quality, reproducibility and deposition used by the genomics community. Although chromatin accessibility profiling is invaluable to study gene regulation, alone it provides only a partial view of this complex process. Orthogonal assays facilitate the interpretation of accessible regions with respect to enhancer-promoter proximity, functional transcription factor binding and regulatory function. We envision that technological improvements including single-molecule, multi-omics and spatial methods will bring further insight into the secrets of genome regulation.
Affiliation(s)
- Liesbeth Minnoye
- Center for Brain & Disease Research, VIB-KU Leuven, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| | | | - Thomas Krausgruber
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
| | - Lixia Pan
- Laboratory of Epigenome Biology, Systems Biology Center, Division of Intramural Research, National Heart, Lung and Blood Institute, NIH, Bethesda, MD, USA
| | | | - Stefano Secchia
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany
| | | | - Eileen E M Furlong
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany
| | - Keji Zhao
- Laboratory of Epigenome Biology, Systems Biology Center, Division of Intramural Research, National Heart, Lung and Blood Institute, NIH, Bethesda, MD, USA
| | | | - Christoph Bock
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Institute of Artificial Intelligence and Decision Support, Center for Medical Statistics, Informatics, and Intelligent Systems, Medical University of Vienna, Vienna, Austria
| | - Stein Aerts
- Center for Brain & Disease Research, VIB-KU Leuven, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
|
20
|
Understanding the epigenetic landscape and cellular architecture of childhood brain tumors. Neurochem Int 2020; 144:104940. [PMID: 33333210 DOI: 10.1016/j.neuint.2020.104940] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/17/2020] [Accepted: 12/12/2020] [Indexed: 11/22/2022]
Abstract
Pediatric brain tumors are the leading cancer-related cause of death in children and adolescents in the United States, affecting on average 1 in 2000 children per year. Recent advances in cancer genomics have led to profound discoveries about the underlying molecular biology and ontogeny of these tumors. In particular, these studies have revealed epigenetic dysregulation to be one of the main hallmarks of pediatric brain tumorigenesis. In this review, we will highlight a number of important recent findings about the nature of this dysregulation in different types of pediatric brain tumors as well as examine their implications for preclinical research and clinical practice. Specifically, we discuss the emergence of methylation signatures as tools for tumor stratification/classification while also highlighting the importance of mutations that directly affect the epigenome and clarifying their impact on risk stratification and pediatric brain tumor biology. We then incorporate recent advances in our understanding of pediatric brain tumor cellular architecture and emphasize the link between epigenetic dysregulation and the "stalled" development seen in many of these malignant neoplasms. Lastly, we explore recent work investigating the use of these mutated epigenomic regulators as therapeutic targets and extrapolate their utility in overcoming this "stalling" to halt tumor growth.
|
21
|
Prompsy P, Kirchmeier P, Marsolier J, Deloger M, Servant N, Vallot C. Interactive analysis of single-cell epigenomic landscapes with ChromSCape. Nat Commun 2020; 11:5702. [PMID: 33177523 PMCID: PMC7658988 DOI: 10.1038/s41467-020-19542-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 02/14/2020] [Accepted: 10/21/2020] [Indexed: 01/05/2023]
Abstract
Chromatin modifications orchestrate the dynamic regulation of gene expression during development and in disease. Bulk approaches have characterized the wide repertoire of histone modifications across cell types, detailing their role in shaping cell identity. However, these population-based methods do not capture cell-to-cell heterogeneity of chromatin landscapes, limiting our appreciation of the role of chromatin in dynamic biological processes. Recent technological developments enable the mapping of histone marks at single-cell resolution, opening up perspectives to characterize the heterogeneity of chromatin marks in complex biological systems over time. Yet, existing tools used to analyze bulk histone modifications profiles are not fit for the low coverage and sparsity of single-cell epigenomic datasets. Here, we present ChromSCape, a user-friendly interactive Shiny/R application distributed as a Bioconductor package, that processes single-cell epigenomic data to assist the biological interpretation of chromatin landscapes within cell populations. ChromSCape analyses the distribution of repressive and active histone modifications as well as chromatin accessibility landscapes from single-cell datasets. Using ChromSCape, we deconvolve chromatin landscapes within the tumor micro-environment, identifying distinct H3K27me3 landscapes associated with cell identity and breast tumor subtype.
Affiliation(s)
- Pacôme Prompsy
- CNRS UMR3244, Institut Curie, PSL Research University, 26 rue d'Ulm, 75005, Paris, France.
- Translational Research Department, Institut Curie, PSL Research University, 26 rue d'Ulm, 75005, Paris, France.
| | - Pia Kirchmeier
- CNRS UMR3244, Institut Curie, PSL Research University, 26 rue d'Ulm, 75005, Paris, France
- Translational Research Department, Institut Curie, PSL Research University, 26 rue d'Ulm, 75005, Paris, France
| | - Justine Marsolier
- CNRS UMR3244, Institut Curie, PSL Research University, 26 rue d'Ulm, 75005, Paris, France
- Translational Research Department, Institut Curie, PSL Research University, 26 rue d'Ulm, 75005, Paris, France
| | - Marc Deloger
- INSERM U900, Institut Curie, PSL Research University, Mines ParisTech, 26 rue d'Ulm, 75005, Paris, France
| | - Nicolas Servant
- INSERM U900, Institut Curie, PSL Research University, Mines ParisTech, 26 rue d'Ulm, 75005, Paris, France
| | - Céline Vallot
- CNRS UMR3244, Institut Curie, PSL Research University, 26 rue d'Ulm, 75005, Paris, France.
- Translational Research Department, Institut Curie, PSL Research University, 26 rue d'Ulm, 75005, Paris, France.
|
22
|
Ji Z, Zhou W, Hou W, Ji H. Single-cell ATAC-seq signal extraction and enhancement with SCATE. Genome Biol 2020; 21:161. [PMID: 32620137 PMCID: PMC7333383 DOI: 10.1186/s13059-020-02075-3] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 10/10/2019] [Accepted: 06/15/2020] [Indexed: 01/25/2023]
Abstract
Single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq) is the state-of-the-art technology for analyzing genome-wide regulatory landscapes in single cells. Single-cell ATAC-seq data are sparse and noisy, and analyzing such data is challenging. Existing computational methods cannot accurately reconstruct activities of individual cis-regulatory elements (CREs) in individual cells or rare cell subpopulations. We present a new statistical framework, SCATE, that adaptively integrates information from co-activated CREs, similar cells, and publicly available regulome data to substantially increase the accuracy for estimating activities of individual CREs. We demonstrate that SCATE can be used to better reconstruct the regulatory landscape of a heterogeneous sample.
Affiliation(s)
- Zhicheng Ji
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 North Wolfe Street, Baltimore, MD, 21205 USA
| | - Weiqiang Zhou
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 North Wolfe Street, Baltimore, MD, 21205 USA
| | - Wenpin Hou
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 North Wolfe Street, Baltimore, MD, 21205 USA
| | - Hongkai Ji
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 North Wolfe Street, Baltimore, MD, 21205 USA
|
23
|
Baek S, Lee I. Single-cell ATAC sequencing analysis: From data preprocessing to hypothesis generation. Comput Struct Biotechnol J 2020; 18:1429-1439. [PMID: 32637041 PMCID: PMC7327298 DOI: 10.1016/j.csbj.2020.06.012] [Citation(s) in RCA: 67] [Impact Index Per Article: 16.8] [Received: 02/29/2020] [Revised: 06/03/2020] [Accepted: 06/07/2020] [Indexed: 12/21/2022]
Abstract
Most genetic variations associated with human complex traits are located in non-coding genomic regions. Therefore, understanding the genotype-to-phenotype axis requires a comprehensive catalog of functional non-coding genomic elements, most of which are involved in epigenetic regulation of gene expression. Genome-wide maps of open chromatin regions can facilitate functional analysis of cis- and trans-regulatory elements via their connections with trait-associated sequence variants. Currently, Assay for Transposase Accessible Chromatin with high-throughput sequencing (ATAC-seq) is considered the most accessible and cost-effective strategy for genome-wide profiling of chromatin accessibility. Single-cell ATAC-seq (scATAC-seq) technology has also been developed to study cell type-specific chromatin accessibility in tissue samples containing a heterogeneous cellular population. However, due to the intrinsic nature of scATAC-seq data, which are highly noisy and sparse, accurately extracting biological signals and devising effective biological hypotheses are difficult. To overcome such limitations in scATAC-seq data analysis, new methods and software tools have been developed over the past few years. Nevertheless, there is no consensus yet on best practice for scATAC-seq data analysis. In this review, we discuss scATAC-seq technology and data analysis methods, ranging from preprocessing to downstream analysis, along with an up-to-date list of published studies that involved the application of this method. We expect this review will provide a guideline for successful data generation and analysis methods using appropriate software tools and databases for the study of chromatin accessibility at single-cell resolution.
Affiliation(s)
- Seungbyn Baek
- Department of Biotechnology, College of Life Science & Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Insuk Lee
- Department of Biotechnology, College of Life Science & Biotechnology, Yonsei University, Seoul 03722, Korea
- Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul 03722, Korea
|
24
|
Urrutia E, Chen L, Zhou H, Jiang Y. Destin: toolkit for single-cell analysis of chromatin accessibility. Bioinformatics 2019; 35:3818-3820. [PMID: 30821321 PMCID: PMC6761983 DOI: 10.1093/bioinformatics/btz141] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 11/04/2018] [Revised: 02/10/2019] [Accepted: 02/25/2019] [Indexed: 01/25/2023]
Abstract
SUMMARY Single-cell assay of transposase-accessible chromatin followed by sequencing (scATAC-seq) is an emerging new technology for the study of gene regulation with single-cell resolution. The data from scATAC-seq are unique: sparse, binary and highly variable even within the same cell type. As such, neither methods developed for bulk ATAC-seq nor single-cell RNA-seq data are appropriate. Here, we present Destin, a bioinformatic and statistical framework for comprehensive scATAC-seq data analysis. Destin performs cell-type clustering via weighted principal component analysis, weighting accessible chromatin regions by existing genomic annotations and publicly available regulomic datasets. The weights and additional tuning parameters are determined via model-based likelihood. We evaluated the performance of Destin using downsampled bulk ATAC-seq data of purified samples and scATAC-seq data from seven diverse experiments. Compared to existing methods, Destin was shown to outperform across all datasets and platforms. For demonstration, we further applied Destin to 2088 adult mouse forebrain cells and identified cell-type-specific association of previously reported schizophrenia GWAS loci. AVAILABILITY AND IMPLEMENTATION Destin toolkit is freely available as an R package at https://github.com/urrutiag/destin. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
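The weighted principal component analysis mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the Destin implementation (Destin is an R package that tunes its weights by model-based likelihood); here the function name `weighted_pca`, the toy matrix, and the per-region weights are all hypothetical, and the weights are simply supplied by hand to show how down-weighting a region removes its influence on the embedding.

```python
import numpy as np

def weighted_pca(accessibility, region_weights, n_components=2):
    """Column-weighted PCA: scale each region (column) by a weight, then
    project cells (rows) onto the top principal components via SVD.

    Illustrative only; the weights here are hand-picked, not learned.
    """
    x = np.asarray(accessibility, dtype=float)
    w = np.asarray(region_weights, dtype=float)
    if x.ndim != 2 or w.shape != (x.shape[1],):
        raise ValueError("need one weight per region (column)")
    centered = x - x.mean(axis=0, keepdims=True)
    weighted = centered * w  # broadcast: column j is multiplied by w[j]
    u, s, _ = np.linalg.svd(weighted, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Toy binary cell-by-region matrix (4 cells x 3 regions, made-up values).
cells = np.array([
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 1],
    [0, 1, 0],
])
# Hypothetical weights: the third region is treated as uninformative.
weights = np.array([1.0, 1.0, 0.0])
embedding = weighted_pca(cells, weights, n_components=2)
```

With a weight of zero on the third region, cells that differ only in that region land on the same point of the embedding, which is the intended effect of weighting regions by prior annotation.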
Affiliation(s)
- Eugene Urrutia
- Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA
| | - Li Chen
- Department of Health Outcomes Research and Policy, Harrison School of Pharmacy, Auburn University, Auburn, AL, USA
| | - Haibo Zhou
- Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA
| | - Yuchao Jiang
- Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA
- Department of Genetics, School of Medicine, University of North Carolina, Chapel Hill, NC, USA
- Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC, USA
|
25
|
Franzén O, Björkegren JLM. alona: a web server for single-cell RNA-seq analysis. Bioinformatics 2020; 36:3910-3912. [PMID: 32324845 PMCID: PMC7320629 DOI: 10.1093/bioinformatics/btaa269] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 02/13/2020] [Revised: 03/27/2020] [Accepted: 04/16/2020] [Indexed: 01/01/2023]
Abstract
SUMMARY Single-cell RNA sequencing (scRNA-seq) is a technology to measure gene expression in single cells. It has enabled discovery of new cell types and established cell type atlases of tissues and organs. The widespread adoption of scRNA-seq has created a need for user-friendly software for data analysis. We have developed a web server, alona, that incorporates several of the most popular single-cell analysis algorithms into a flexible pipeline. alona can perform quality filtering, normalization, batch correction, clustering, cell type annotation and differential gene expression analysis. Data are visualized in the web browser using an interface based on JavaScript, allowing the user to query genes of interest and visualize the cluster structure. alona accepts a compressed gene expression matrix and identifies cell clusters with a graph-based clustering strategy. Cell types are identified from a comprehensive collection of marker genes or by specifying a custom set of marker genes. AVAILABILITY AND IMPLEMENTATION The service runs at https://alona.panglaodb.se and the Python package can be downloaded from https://oscar-franzen.github.io/adobo/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Oscar Franzén
- Department of Medicine, Integrated Cardio Metabolic Centre, Karolinska Institutet, Huddinge 14157, Sweden
| | - Johan L M Björkegren
- Department of Medicine, Integrated Cardio Metabolic Centre, Karolinska Institutet, Huddinge 14157, Sweden
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
|
26
|
Smith JP, Sheffield NC. Analytical Approaches for ATAC-seq Data Analysis. Curr Protoc Hum Genet 2020; 106:e101. [PMID: 32543102 PMCID: PMC8191135 DOI: 10.1002/cphg.101] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/17/2022]
Abstract
ATAC-seq, the assay for transposase-accessible chromatin using sequencing, is a quick and efficient approach to investigating the chromatin accessibility landscape. Investigating chromatin accessibility has broad utility for answering many biological questions, such as mapping nucleosomes, identifying transcription factor binding sites, and measuring differential activity of DNA regulatory elements. Because the ATAC-seq protocol is both simple and relatively inexpensive, there has been a rapid increase in the availability of chromatin accessibility data. Furthermore, advances in ATAC-seq protocols are rapidly extending its breadth to additional experimental conditions, cell types, and species. Accompanying the increase in data, there has also been an explosion of new tools and analytical approaches for analyzing it. Here, we explain the fundamentals of ATAC-seq data processing, summarize common analysis approaches, and review computational tools to provide recommendations for different research questions. This primer provides a starting point and a reference for analysis of ATAC-seq data. © 2020 Wiley Periodicals LLC.
Affiliation(s)
- Jason P. Smith
- Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia
- Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, Virginia
| | - Nathan C. Sheffield
- Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia
- Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, Virginia
- Department of Public Health Sciences, University of Virginia, Charlottesville, Virginia
- Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia
|
27
|
Yu W, Uzun Y, Zhu Q, Chen C, Tan K. scATAC-pro: a comprehensive workbench for single-cell chromatin accessibility sequencing data. Genome Biol 2020; 21:94. [PMID: 32312293 PMCID: PMC7169039 DOI: 10.1186/s13059-020-02008-0] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 11/16/2019] [Accepted: 04/02/2020] [Indexed: 02/08/2023]
Abstract
Single-cell chromatin accessibility sequencing has become a powerful technology for understanding epigenetic heterogeneity of complex tissues. However, there is a lack of open-source software for comprehensive processing, analysis, and visualization of such data generated using all existing experimental protocols. Here, we present scATAC-pro for quality assessment, analysis, and visualization of single-cell chromatin accessibility sequencing data. scATAC-pro computes a range of quality control metrics for several key steps of experimental protocols, with a flexible choice of methods. It generates summary reports for both quality assessment and downstream analysis. scATAC-pro is available at https://github.com/tanlabcode/scATAC-pro.
Affiliation(s)
- Wenbao Yu
- Center for Childhood Cancer Research, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Department of Biomedical and Health Informatics, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
| | - Yasin Uzun
- Center for Childhood Cancer Research, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Department of Biomedical and Health Informatics, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
| | - Qin Zhu
- Genomics and Computational Biology Graduate Group, University of Pennsylvania, Philadelphia, PA, USA
| | - Changya Chen
- Center for Childhood Cancer Research, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Department of Biomedical and Health Informatics, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
| | - Kai Tan
- Center for Childhood Cancer Research, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA.
- Department of Biomedical and Health Informatics, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA.
- Genomics and Computational Biology Graduate Group, University of Pennsylvania, Philadelphia, PA, USA.
- Department of Pediatrics, University of Pennsylvania, Philadelphia, PA, 19104, USA.
|
28
|
Ryan GE, Farley EK. Functional genomic approaches to elucidate the role of enhancers during development. Wiley Interdiscip Rev Syst Biol Med 2020; 12:e1467. [PMID: 31808313 PMCID: PMC7027484 DOI: 10.1002/wsbm.1467] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/07/2019] [Revised: 10/02/2019] [Accepted: 10/11/2019] [Indexed: 12/22/2022]
Abstract
Successful development depends on the precise tissue-specific regulation of genes by enhancers, genetic elements that act as switches to control when and where genes are expressed. Because enhancers are critical for development, and the majority of disease-associated mutations reside within enhancers, it is essential to understand which sequences within enhancers are important for function. Advances in sequencing technology have enabled the rapid generation of genomic data that predict putative active enhancers, but functionally validating these sequences at scale remains a fundamental challenge. Herein, we discuss the power of genome-wide strategies used to identify candidate enhancers, and also highlight limitations and misconceptions that have arisen from these data. We discuss the use of massively parallel reporter assays to test enhancers for function at scale. We also review recent advances in our ability to study gene regulation during development, including CRISPR-based tools to manipulate genomes and single-cell transcriptomics to finely map gene expression. Finally, we look ahead to a synthesis of complementary genomic approaches that will advance our understanding of enhancer function during development. This article is categorized under: Physiology > Mammalian Physiology in Health and Disease Developmental Biology > Developmental Processes in Health and Disease Laboratory Methods and Technologies > Genetic/Genomic Methods.
Affiliation(s)
- Genevieve E. Ryan
- Department of Medicine, University of California, San Diego, California
- Division of Biological Sciences, Department of Medicine, University of California, San Diego, California
| | - Emma K. Farley
- Department of Medicine, University of California, San Diego, California
- Division of Biological Sciences, Department of Medicine, University of California, San Diego, California
|
29
|
Chen H, Lareau C, Andreani T, Vinyard ME, Garcia SP, Clement K, Andrade-Navarro MA, Buenrostro JD, Pinello L. Assessment of computational methods for the analysis of single-cell ATAC-seq data. Genome Biol 2019; 20:241. [PMID: 31739806 PMCID: PMC6859644 DOI: 10.1186/s13059-019-1854-5] [Citation(s) in RCA: 172] [Impact Index Per Article: 34.4] [Received: 06/04/2019] [Accepted: 10/03/2019] [Indexed: 12/12/2022]
Abstract
BACKGROUND Recent innovations in single-cell Assay for Transposase Accessible Chromatin using sequencing (scATAC-seq) enable profiling of the epigenetic landscape of thousands of individual cells. scATAC-seq data analysis presents unique methodological challenges. scATAC-seq experiments sample DNA, which, due to low copy numbers (diploid in humans), leads to inherent data sparsity (1-10% of peaks detected per cell) compared to transcriptomic (scRNA-seq) data (10-45% of expressed genes detected per cell). Such challenges in data generation emphasize the need for informative features to assess cell heterogeneity at the chromatin level. RESULTS We present a benchmarking framework that is applied to 10 computational methods for scATAC-seq on 13 synthetic and real datasets from different assays, profiling cell types from diverse tissues and organisms. Methods for processing and featurizing scATAC-seq data were compared by their ability to discriminate cell types when combined with common unsupervised clustering approaches. We rank evaluated methods and discuss computational challenges associated with scATAC-seq analysis including inherently sparse data, determination of features, peak calling, the effects of sequencing coverage and noise, and clustering performance. Running times and memory requirements are also discussed. CONCLUSIONS This reference summary of scATAC-seq methods offers recommendations for best practices with consideration for both the non-expert user and the methods developer. Despite variation across methods and datasets, SnapATAC, Cusanovich2018, and cisTopic outperform other methods in separating cell populations of different coverages and noise levels in both synthetic and real datasets. Notably, SnapATAC is the only method able to analyze a large dataset (> 80,000 cells).
Affiliation(s)
- Huidong Chen
- Molecular Pathology Unit, Massachusetts General Hospital Research Institute, Charlestown, MA, 02129, USA
- Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, 02129, USA
- Department of Pathology, Harvard Medical School, Boston, MA, 02115, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
| | - Caleb Lareau
- Molecular Pathology Unit, Massachusetts General Hospital Research Institute, Charlestown, MA, 02129, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA, 02138, USA
| | - Tommaso Andreani
- Molecular Pathology Unit, Massachusetts General Hospital Research Institute, Charlestown, MA, 02129, USA
- Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, 02129, USA
- Department of Pathology, Harvard Medical School, Boston, MA, 02115, USA
- Faculty of Biology, Computational Biology and Data Mining Lab, Johannes Gutenberg University of Mainz, 55128, Mainz, Germany
| | - Michael E Vinyard
- Molecular Pathology Unit, Massachusetts General Hospital Research Institute, Charlestown, MA, 02129, USA
- Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, 02129, USA
- Department of Pathology, Harvard Medical School, Boston, MA, 02115, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, 02142, USA
| | - Sara P Garcia
- Molecular Pathology Unit, Massachusetts General Hospital Research Institute, Charlestown, MA, 02129, USA
| | - Kendell Clement
- Molecular Pathology Unit, Massachusetts General Hospital Research Institute, Charlestown, MA, 02129, USA
- Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, 02129, USA
- Department of Pathology, Harvard Medical School, Boston, MA, 02115, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
| | - Miguel A Andrade-Navarro
- Faculty of Biology, Computational Biology and Data Mining Lab, Johannes Gutenberg University of Mainz, 55128, Mainz, Germany
| | - Jason D Buenrostro
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA, 02138, USA
| | - Luca Pinello
- Molecular Pathology Unit, Massachusetts General Hospital Research Institute, Charlestown, MA, 02129, USA.
- Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, 02129, USA.
- Department of Pathology, Harvard Medical School, Boston, MA, 02115, USA.
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA.
|
30
|
Zhou W, Ji Z, Fang W, Ji H. Global prediction of chromatin accessibility using small-cell-number and single-cell RNA-seq. Nucleic Acids Res 2019; 47:e121. [PMID: 31428792 PMCID: PMC6821224 DOI: 10.1093/nar/gkz716] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 12/26/2018] [Revised: 07/20/2019] [Accepted: 08/11/2019] [Indexed: 11/13/2022]
Abstract
Conventional high-throughput genomic technologies for mapping regulatory element activities in bulk samples such as ChIP-seq, DNase-seq and FAIRE-seq cannot analyze samples with small numbers of cells. The recently developed low-input and single-cell regulome mapping technologies such as ATAC-seq and single-cell ATAC-seq (scATAC-seq) allow analyses of small-cell-number and single-cell samples, but their signals remain highly discrete or noisy. Compared to these regulome mapping technologies, transcriptome profiling by RNA-seq is more widely used. Transcriptome data in single-cell and small-cell-number samples are more continuous and often less noisy. Here, we show that one can globally predict chromatin accessibility and infer regulatory element activities using RNA-seq. Genome-wide chromatin accessibility predicted by RNA-seq from 30 cells can offer better accuracy than ATAC-seq from 500 cells. Predictions based on single-cell RNA-seq (scRNA-seq) can more accurately reconstruct bulk chromatin accessibility than using scATAC-seq. Integrating ATAC-seq with predictions from RNA-seq increases the power and value of both methods. Thus, transcriptome-based prediction provides a new tool for decoding gene regulatory circuitry in samples with limited cell numbers.
Affiliation(s)
- Weiqiang Zhou
- Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, 615 North Wolfe Street, Baltimore, MD 21205, USA
| | - Zhicheng Ji
- Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, 615 North Wolfe Street, Baltimore, MD 21205, USA
| | - Weixiang Fang
- Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, 615 North Wolfe Street, Baltimore, MD 21205, USA
| | - Hongkai Ji
- Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, 615 North Wolfe Street, Baltimore, MD 21205, USA
|
31
|
Li R, Quon G. scBFA: modeling detection patterns to mitigate technical noise in large-scale single-cell genomics data. Genome Biol 2019; 20:193. [PMID: 31500668 PMCID: PMC6734238 DOI: 10.1186/s13059-019-1806-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 03/11/2019] [Accepted: 08/28/2019] [Indexed: 12/13/2022]
Abstract
Technical variation in feature measurements, such as gene expression and locus accessibility, is a key challenge of large-scale single-cell genomic datasets. We show that this technical variation in both scRNA-seq and scATAC-seq datasets can be mitigated by analyzing feature detection patterns alone and ignoring feature quantification measurements. This result holds when datasets have low detection noise relative to quantification noise. We demonstrate state-of-the-art performance of detection pattern models using our new framework, scBFA, for both cell type identification and trajectory inference. Performance gains can also be realized in one line of R code in existing pipelines.
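The idea of analyzing detection patterns alone can be shown with a small Python sketch. It only demonstrates the binarization step, not the scBFA factor model itself (which is an R package); the helper name `detection_pattern` and the toy count matrix are assumptions made for illustration.

```python
import numpy as np

def detection_pattern(counts):
    """Reduce a cell-by-feature count matrix to 0/1 detection calls.

    A feature counts as detected in a cell if at least one read or
    fragment was observed; the magnitude of the count is discarded.
    """
    return (np.asarray(counts) > 0).astype(np.int8)

# Toy counts: 4 cells x 5 features (made-up numbers). Cells 0 and 1 detect
# the same features but at very different depths.
counts = np.array([
    [0, 3, 0, 12, 1],
    [0, 1, 0, 40, 2],
    [5, 0, 7, 0, 0],
    [1, 0, 2, 0, 0],
])
detected = detection_pattern(counts)

# After binarization the depth difference between cells 0 and 1 is gone.
same_pattern = bool(np.array_equal(detected[0], detected[1]))
```

Any downstream dimensionality reduction would then be run on `detected` rather than on `counts`; which reduction to use is left open here.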
Affiliation(s)
- Ruoxin Li
- Graduate Group in Biostatistics, University of California, Davis, Davis, CA, USA
- Genome Center, University of California, Davis, Davis, CA, USA
- Gerald Quon
- Graduate Group in Biostatistics, University of California, Davis, Davis, CA, USA
- Genome Center, University of California, Davis, Davis, CA, USA
- Department of Molecular and Cellular Biology, University of California, Davis, Davis, CA, USA
32. Baker SM, Rogerson C, Hayes A, Sharrocks AD, Rattray M. Classifying cells with Scasat, a single-cell ATAC-seq analysis tool. Nucleic Acids Res 2019; 47:e10. [PMID: 30335168 PMCID: PMC6344856 DOI: 10.1093/nar/gky950] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 04/04/2018] [Accepted: 10/04/2018] [Indexed: 01/22/2023] [Open Access]
Abstract
ATAC-seq is a recently developed method to identify the areas of open chromatin in a cell. These regions usually correspond to active regulatory elements and their location profile is unique to a given cell type. When done at single-cell resolution, ATAC-seq provides an insight into the cell-to-cell variability that emerges from otherwise identical DNA sequences by identifying the variability in the genomic location of open chromatin sites in each of the cells. This paper presents Scasat (single-cell ATAC-seq analysis tool), a complete pipeline to process scATAC-seq data with simple steps. Scasat treats the data as binary and applies statistical methods that are especially suitable for binary data. The pipeline is developed in a Jupyter notebook environment that holds the executable code along with the necessary description and results. It is robust, flexible, interactive and easy to extend. Within Scasat we developed a novel differential accessibility analysis method based on information gain to identify the peaks that are unique to a cell. The results from Scasat showed that open chromatin locations corresponding to potential regulatory elements can account for cellular heterogeneity and can identify regulatory regions that separate cells from a complex population.
Affiliation(s)
- Syed Murtuza Baker
- Faculty of Biology, Medicine and Health, The University of Manchester, Manchester M13 9PL, UK
- Connor Rogerson
- Faculty of Biology, Medicine and Health, The University of Manchester, Manchester M13 9PL, UK
- Andrew Hayes
- Faculty of Biology, Medicine and Health, The University of Manchester, Manchester M13 9PL, UK
- Andrew D Sharrocks
- Faculty of Biology, Medicine and Health, The University of Manchester, Manchester M13 9PL, UK
- Manchester Academic Health Science Centre (MAHSC), Core Technology Facility, The University of Manchester, Manchester M13 9NT, UK
- Magnus Rattray
- Faculty of Biology, Medicine and Health, The University of Manchester, Manchester M13 9PL, UK
33. Bravo González-Blas C, Minnoye L, Papasokrati D, Aibar S, Hulselmans G, Christiaens V, Davie K, Wouters J, Aerts S. cisTopic: cis-regulatory topic modeling on single-cell ATAC-seq data. Nat Methods 2019; 16:397-400. [PMID: 30962623 DOI: 10.1038/s41592-019-0367-1] [Citation(s) in RCA: 227] [Impact Index Per Article: 45.4] [Received: 10/03/2018] [Accepted: 02/28/2019] [Indexed: 12/17/2022]
Abstract
We present cisTopic, a probabilistic framework used to simultaneously discover coaccessible enhancers and stable cell states from sparse single-cell epigenomics data (http://github.com/aertslab/cistopic). Using a compendium of single-cell ATAC-seq datasets from differentiating hematopoietic cells, brain and transcription factor perturbations, we demonstrate that topic modeling can be exploited for robust identification of cell types, enhancers and relevant transcription factors. cisTopic provides insight into the mechanisms underlying regulatory heterogeneity in cell populations.
Affiliation(s)
- Carmen Bravo González-Blas
- VIB Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
- Liesbeth Minnoye
- VIB Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
- Dafni Papasokrati
- VIB Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
- Sara Aibar
- VIB Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
- Gert Hulselmans
- VIB Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
- Valerie Christiaens
- VIB Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
- Kristofer Davie
- VIB Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
- Jasper Wouters
- VIB Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
- Stein Aerts
- VIB Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
34. Dahl JA, Gilfillan GD. How low can you go? Pushing the limits of low-input ChIP-seq. Brief Funct Genomics 2019; 17:89-95. [PMID: 29087438 DOI: 10.1093/bfgp/elx037] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/13/2022] [Open Access]
Abstract
In the past decade, chromatin immunoprecipitation sequencing (ChIP-seq) has emerged as the dominant technique for those wishing to perform genome-wide protein:DNA profiling. Owing to the tissue- and cell-type-specific nature of epigenetic marks, the field has been driven towards obtaining data from ever-lower cell numbers. In this review, we focus on the methodological developments that have lowered input requirements and the biological findings they have enabled, as we strive towards the ultimate goal of robust single-cell ChIP-seq.
35. Zhu X, Wolfgruber TK, Tasato A, Arisdakessian C, Garmire DG, Garmire LX. Granatum: a graphical single-cell RNA-Seq analysis pipeline for genomics scientists. Genome Med 2017; 9:108. [PMID: 29202807 PMCID: PMC5716224 DOI: 10.1186/s13073-017-0492-3] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 08/07/2017] [Accepted: 11/07/2017] [Indexed: 01/30/2023] [Open Access]
Abstract
BACKGROUND: Single-cell RNA sequencing (scRNA-Seq) is an increasingly popular platform to study heterogeneity at the single-cell level. Computational methods to process scRNA-Seq data are not very accessible to bench scientists, as they require substantial bioinformatics skills.
RESULTS: We have developed Granatum, a web-based scRNA-Seq analysis pipeline to make analysis more broadly accessible to researchers. Without a single line of programming code, users can click through the pipeline, setting parameters and visualizing results via the interactive graphical interface. Granatum conveniently walks users through various steps of scRNA-Seq analysis. It has a comprehensive list of modules, including plate merging and batch-effect removal, outlier-sample removal, gene-expression normalization, imputation, gene filtering, cell clustering, differential gene expression analysis, pathway/ontology enrichment analysis, protein network interaction visualization, and pseudo-time cell series construction.
CONCLUSIONS: Granatum enables broad adoption of scRNA-Seq technology by empowering bench scientists with an easy-to-use graphical interface for scRNA-Seq data analysis. The package is freely available for research use at http://garmiregroup.org/granatum/app.
Affiliation(s)
- Xun Zhu
- Graduate Program in Molecular Biology and Bioengineering, University of Hawaii at Manoa, Honolulu, HI, 96816, USA
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
- Thomas K Wolfgruber
- Graduate Program in Molecular Biology and Bioengineering, University of Hawaii at Manoa, Honolulu, HI, 96816, USA
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
- Austin Tasato
- Department of Electrical Engineering, University of Hawaii at Manoa, Honolulu, HI, 96816, USA
- Cédric Arisdakessian
- Graduate Program in Molecular Biology and Bioengineering, University of Hawaii at Manoa, Honolulu, HI, 96816, USA
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
- David G Garmire
- Department of Electrical Engineering, University of Hawaii at Manoa, Honolulu, HI, 96816, USA
- Lana X Garmire
- Graduate Program in Molecular Biology and Bioengineering, University of Hawaii at Manoa, Honolulu, HI, 96816, USA
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
|