1
Katebi A, Chen X, Ramirez D, Li S, Lu M. Data-driven modeling of core gene regulatory network underlying leukemogenesis in IDH mutant AML. NPJ Syst Biol Appl 2024; 10:38. [PMID: 38594351 PMCID: PMC11003984 DOI: 10.1038/s41540-024-00366-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Accepted: 03/29/2024] [Indexed: 04/11/2024] Open
Abstract
Acute myeloid leukemia (AML) is characterized by uncontrolled proliferation of poorly differentiated myeloid cells, with a heterogenous mutational landscape. Mutations in IDH1 and IDH2 are found in 20% of the AML cases. Although much effort has been made to identify genes associated with leukemogenesis, the regulatory mechanism of AML state transition is still not fully understood. To alleviate this issue, here we develop a new computational approach that integrates genomic data from diverse sources, including gene expression and ATAC-seq datasets, curated gene regulatory interaction databases, and mathematical modeling to establish models of context-specific core gene regulatory networks (GRNs) for a mechanistic understanding of tumorigenesis of AML with IDH mutations. The approach adopts a new optimization procedure to identify the top network according to its accuracy in capturing gene expression states and its flexibility to allow sufficient control of state transitions. From GRN modeling, we identify key regulators associated with the function of IDH mutations, such as DNA methyltransferase DNMT1, and network destabilizers, such as E2F1. The constructed core regulatory network and outcomes of in-silico network perturbations are supported by survival data from AML patients. We expect that the combined bioinformatics and systems-biology modeling approach will be generally applicable to elucidate the gene regulation of disease progression.
Affiliation(s)
- Ataur Katebi
  - Department of Bioengineering, Northeastern University, Boston, MA, USA
  - Center for Theoretical Biological Physics, Northeastern University, Boston, MA, USA
- Xiaowen Chen
  - Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Daniel Ramirez
  - Department of Bioengineering, Northeastern University, Boston, MA, USA
  - Center for Theoretical Biological Physics, Northeastern University, Boston, MA, USA
- Sheng Li
  - Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
  - Department of Computer Science & Engineering, University of Connecticut, Storrs, CT, USA
  - The Jackson Laboratory Cancer Center, Bar Harbor, ME, USA
- Mingyang Lu
  - Department of Bioengineering, Northeastern University, Boston, MA, USA
  - Center for Theoretical Biological Physics, Northeastern University, Boston, MA, USA
2
Katebi A, Chen X, Li S, Lu M. Data-driven modeling of core gene regulatory network underlying leukemogenesis in IDH mutant AML. bioRxiv 2023:2023.07.29.551111. [PMID: 37577526 PMCID: PMC10418072 DOI: 10.1101/2023.07.29.551111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023]
Abstract
Acute myeloid leukemia (AML) is characterized by uncontrolled proliferation of poorly differentiated myeloid cells, with a heterogenous mutational landscape. Mutations in IDH1 and IDH2 are found in 20% of the AML cases. Although much effort has been made to identify genes associated with leukemogenesis, the regulatory mechanism of AML state transition is still not fully understood. To alleviate this issue, here we develop a new computational approach that integrates genomic data from diverse sources, including gene expression and ATAC-seq datasets, curated gene regulatory interaction databases, and mathematical modeling to establish models of context-specific core gene regulatory networks (GRNs) for a mechanistic understanding of tumorigenesis of AML with IDH mutations. The approach adopts a novel optimization procedure to identify the optimal network according to its accuracy in capturing gene expression states and its flexibility to allow sufficient control of state transitions. From GRN modeling, we identify key regulators associated with the function of IDH mutations, such as DNA methyltransferase DNMT1, and network destabilizers, such as E2F1. The constructed core regulatory network and outcomes of in-silico network perturbations are supported by survival data from AML patients. We expect that the combined bioinformatics and systems-biology modeling approach will be generally applicable to elucidate the gene regulation of disease progression.
Affiliation(s)
- Ataur Katebi
  - Department of Bioengineering, Northeastern University, Boston, MA, USA
  - Center for Theoretical Biological Physics, Northeastern University, Boston, MA, USA
- Xiaowen Chen
  - Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Sheng Li
  - Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
  - Department of Computer Science & Engineering, University of Connecticut, Storrs, CT, USA
- Mingyang Lu
  - Department of Bioengineering, Northeastern University, Boston, MA, USA
  - Center for Theoretical Biological Physics, Northeastern University, Boston, MA, USA
3
Zhang S, Wang S. ATAC-DEA: A Web-Based ATAC-Seq Data Differential Peak and Annotation Analysis Application. J Comput Biol 2023; 30:337-345. [PMID: 36656543 DOI: 10.1089/cmb.2022.0033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/20/2023] Open
Abstract
Assay for transposase-accessible chromatin sequencing (ATAC-seq) has become one of the most widely used sequencing methods in studies of gene regulation, aiming to identify open chromatin sites and decipher how chromatin accessibility regulates gene expression. However, due to a lack of programming experience or minimal bioinformatics training, it is difficult for biologists to fully explore and interpret ATAC-seq results. Despite several available programs or websites that allow researchers to analyze and visualize ATAC-seq data, several limitations exist. ATAC-seq data differential expression analysis (ATAC-DEA), a web application that facilitates the exploration and visualization of differential peak analysis and annotation from ATAC-seq data, was developed (www.atac-dea.xyz:3838/ATAC-DEA). ATAC-DEA uses DiffBind and ChIPpeakAnno to process differential peak and annotation analysis results. ATAC-DEA has five features: (1) runs on a web server; (2) processes three files into one small file, which is used as the input for ATAC-DEA; (3) availability of various downloadable plots; (4) multifactor analysis and customized contrast model; and (5) annotates individual, overlapped, and differential peaks. It provides an easy-to-use user interface (UI) design for users to explore the data and modify the parameters interactively based on experimental purposes. ATAC-DEA allows biologists to generate user-friendly visual results from ATAC-seq downstream analysis.
Affiliation(s)
- Shilong Zhang
  - School of Life Sciences, Northwestern Polytechnical University, Xi'an, Shaanxi, China
- Sufang Wang
  - School of Life Sciences, Northwestern Polytechnical University, Xi'an, Shaanxi, China
4
Jiang S, Chen Y, Han S, Lv L, Li L. Next-Generation Sequencing Applications for the Study of Fungal Pathogens. Microorganisms 2022; 10:microorganisms10101882. [PMID: 36296159 PMCID: PMC9609632 DOI: 10.3390/microorganisms10101882] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 07/29/2022] [Revised: 09/13/2022] [Accepted: 09/15/2022] [Indexed: 11/16/2022] Open
Abstract
Next-generation sequencing (NGS) has become a widely used technology in biological research. NGS applications for clinical pathogen detection have become vital technologies. It is increasingly common to perform fast, accurate, and specific detection of clinical specimens using NGS. Pathogenic fungi with high virulence and drug resistance cause life-threatening clinical infections. NGS has had a significant biotechnological impact on detecting bacteria and viruses but is not equally applicable to fungi. There is a particularly urgent clinical need to use NGS to help identify fungi causing infections and prevent negative impacts. This review summarizes current research on NGS applications for fungi and offers a visual method of fungal detection. With the development of NGS and solutions for overcoming sequencing limitations, we suggest clinicians test specimens as soon as possible when encountering infections of unknown cause, suspected infections in vital organs, or rapidly progressive disease.
Affiliation(s)
- Shiman Jiang
  - State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Centre for Infectious Diseases, Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, 79 Qingchun Rd., Hangzhou 310003, China
- Yanfei Chen
  - State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Centre for Infectious Diseases, Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, 79 Qingchun Rd., Hangzhou 310003, China
- Shengyi Han
  - State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Centre for Infectious Diseases, Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, 79 Qingchun Rd., Hangzhou 310003, China
- Longxian Lv
  - State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Centre for Infectious Diseases, Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, 79 Qingchun Rd., Hangzhou 310003, China
- Lanjuan Li
  - State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Centre for Infectious Diseases, Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, 79 Qingchun Rd., Hangzhou 310003, China
  - Jinan Microecological Biomedicine Shandong Laboratory, Jinan 250021, China
  - Correspondence: Tel.: +86-0571-8723-6458
5
Hu S, Wang X, Wang T, Wang L, Liu L, Ren W, Liu X, Zhang W, Liao W, Liao Z, Zou R, Zhang X. Differential enrichment of H3K9me3 in intrahepatic cholangiocarcinoma. BMC Med Genomics 2022; 15:185. [PMID: 36028818 PMCID: PMC9414128 DOI: 10.1186/s12920-022-01338-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2022] [Accepted: 08/23/2022] [Indexed: 12/05/2022] Open
Abstract
Background: Intrahepatic cholangiocarcinoma (ICC) is a malignant tumor that poses a serious threat to human health. Histone 3 lysine 9 trimethylation (H3K9me3) is a post-translational modification involved in regulating a broad range of biological processes and has been considered a potential therapeutic target in several types of cancer. However, research on H3K9me3 histone modification profiles in ICC patients is limited. Methods: In this study, we applied the ChIP-seq technique to investigate the effect of H3K9me3 on ICC. An anti-H3K9me3 antibody was used for ChIP-seq in an ICC cell line (RBE) and a normal cell line (HIBEpic). MACS2, a peak-calling tool, was then used to identify the peaks recorded in the RBE and HIBEpic cell lines. Gene expression, mutation, and clinical data were downloaded from the TCGA and cBioPortal databases. Results: H3K9me3 exhibited abnormal methylation and influenced abnormal gene expression in patients with ICC. The Wnt/β-catenin signaling pathway (also known simply as the WNT signaling pathway) was enriched in H3K9me3-regulated genes. Conclusions: We are the first to report that H3K9me3 may play an important role in the progression of ICC. This finding advances the understanding of the epigenetic molecular mechanisms of ICC. Supplementary Information: The online version contains supplementary material available at 10.1186/s12920-022-01338-1.
Affiliation(s)
- Sheng Hu
  - Department of Hepatobiliary Surgery, The Second Affiliated Hospital of Kunming Medical University, No. 374, Dianmain Road, Kunming, China
- Xuejun Wang
  - Department of Hepatobiliary Surgery, The Second Affiliated Hospital of Kunming Medical University, No. 374, Dianmain Road, Kunming, China
- Tao Wang
  - Department of Hepatobiliary Surgery, The Second Affiliated Hospital of Kunming Medical University, No. 374, Dianmain Road, Kunming, China
- Lianmin Wang
  - Department of Hepatobiliary Surgery, The Second Affiliated Hospital of Kunming Medical University, No. 374, Dianmain Road, Kunming, China
- Lixin Liu
  - Department of Hepatobiliary Surgery, The Second Affiliated Hospital of Kunming Medical University, No. 374, Dianmain Road, Kunming, China
- Wenjun Ren
  - Department of Cardiovascular Surgery, The First People's Hospital of Yunnan Province, Kunming, China
  - Department of Thoracic Surgery, The Second Affiliated Hospital of Kunming Medical University, Kunming, China
- Xiaoyong Liu
  - Department of Cardiology, The Second Affiliated Hospital of Kunming Medical University, Kunming, China
- Weihan Zhang
  - Department of Hepatobiliary Surgery, The Second Affiliated Hospital of Kunming Medical University, No. 374, Dianmain Road, Kunming, China
- Weiran Liao
  - Department of Hepatobiliary Surgery, The Second Affiliated Hospital of Kunming Medical University, No. 374, Dianmain Road, Kunming, China
- Zhoujun Liao
  - Department of Hepatobiliary Surgery, The Second Affiliated Hospital of Kunming Medical University, No. 374, Dianmain Road, Kunming, China
- Renchao Zou
  - Department of Hepatobiliary Surgery, The Second Affiliated Hospital of Kunming Medical University, No. 374, Dianmain Road, Kunming, China
- Xiaowen Zhang
  - Department of Hepatobiliary Surgery, The Second Affiliated Hospital of Kunming Medical University, No. 374, Dianmain Road, Kunming, China
6
Luo L, Gribskov M, Wang S. Bibliometric review of ATAC-Seq and its application in gene expression. Brief Bioinform 2022; 23:6543486. [PMID: 35255493 PMCID: PMC9116206 DOI: 10.1093/bib/bbac061] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 11/17/2021] [Revised: 02/06/2022] [Accepted: 02/09/2022] [Indexed: 11/30/2022] Open
Abstract
With recent advances in high-throughput next-generation sequencing, it is possible to describe the regulation and expression of genes at multiple levels. An assay for transposase-accessible chromatin using sequencing (ATAC-seq), which uses Tn5 transposase to sequence protein-free binding regions of the genome, can be combined with chromatin immunoprecipitation coupled with deep sequencing (ChIP-seq) and ribonucleic acid sequencing (RNA-seq) to provide a detailed description of gene expression. Here, we reviewed the literature on ATAC-seq and described the characteristics of ATAC-seq publications. We then briefly introduced the principles of RNA-seq, ChIP-seq and ATAC-seq, focusing on the main features of the techniques. We built a phylogenetic tree from species that had been previously studied by using ATAC-seq. Studies of Mus musculus and Homo sapiens account for approximately 90% of the total ATAC-seq data, while other species are still in the process of accumulating data. We summarized the findings from human diseases and other species, illustrating the cutting-edge discoveries and the role of multi-omics data analysis in current research. Moreover, we collected and compared ATAC-seq analysis pipelines, which allowed biological researchers who lack programming skills to better analyze and explore ATAC-seq data. Through this review, it is clear that multi-omics analysis and single-cell sequencing technology will become the mainstream approach in future research.
Affiliation(s)
- Liheng Luo
  - School of Life Sciences, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
- Michael Gribskov
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Sufang Wang
  - School of Life Sciences, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
7
Multi-omics strategies for personalized and predictive medicine: past, current, and future translational opportunities. Emerg Top Life Sci 2022; 6:215-225. [PMID: 35234253 DOI: 10.1042/etls20210244] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/08/2021] [Revised: 02/13/2022] [Accepted: 02/21/2022] [Indexed: 12/12/2022]
Abstract
Precision medicine is driven by the paradigm shift of empowering clinicians to predict the most appropriate course of action for patients with complex diseases and to improve routine medical and public health practice. It promotes integrating collective and individualized clinical data with patient-specific multi-omics data to develop therapeutic strategies and a knowledge base for predictive and personalized medicine in diverse populations. This study is based on the hypothesis that understanding a patient's metabolomics and genetic make-up in conjunction with clinical data will substantially help determine predisposition, diagnostic, prognostic, and predictive biomarkers and optimal paths for providing personalized care for diverse and targeted chronic, acute, and infectious diseases. This study briefly reviews significant emerging and recently reported multi-omics and translational approaches aimed at facilitating the implementation of precision medicine. Furthermore, it discusses current grand challenges and the future need for a Findable, Accessible, Intelligent, and Reproducible (FAIR) approach to accelerate diagnostic and preventive care delivery strategies beyond traditional symptom-driven, disease-causal medical practice.
8
Ahmed Z. Precision medicine with multi-omics strategies, deep phenotyping, and predictive analysis. Progress in Molecular Biology and Translational Science 2022; 190:101-125. [DOI: 10.1016/bs.pmbts.2022.02.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/13/2022]
9
Smith JP, Corces MR, Xu J, Reuter VP, Chang HY, Sheffield NC. PEPATAC: an optimized pipeline for ATAC-seq data analysis with serial alignments. NAR Genom Bioinform 2021; 3:lqab101. [PMID: 34859208 PMCID: PMC8632735 DOI: 10.1093/nargab/lqab101] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 02/08/2021] [Revised: 09/30/2021] [Accepted: 11/15/2021] [Indexed: 12/18/2022] Open
Abstract
As chromatin accessibility data from ATAC-seq experiments continues to expand, there is continuing need for standardized analysis pipelines. Here, we present PEPATAC, an ATAC-seq pipeline that is easily applied to ATAC-seq projects of any size, from one-off experiments to large-scale sequencing projects. PEPATAC leverages unique features of ATAC-seq data to optimize for speed and accuracy, and it provides several unique analytical approaches. Output includes convenient quality control plots, summary statistics, and a variety of generally useful data formats to set the groundwork for subsequent project-specific data analysis. Downstream analysis is simplified by a standard definition format, modularity of components, and metadata APIs in R and Python. It is restartable, fault-tolerant, and can be run on local hardware, using any cluster resource manager, or in provided Linux containers. We also demonstrate the advantage of aligning to the mitochondrial genome serially, which improves the accuracy of alignment statistics and quality control metrics. PEPATAC is a robust and portable first step for any ATAC-seq project. BSD2-licensed code and documentation are available at https://pepatac.databio.org.
Affiliation(s)
- Jason P Smith
  - Center for Public Health Genomics, University of Virginia, VA 22908, USA
  - Department of Biochemistry and Molecular Genetics, University of Virginia, VA 22908, USA
- M Ryan Corces
  - Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA 94304, USA
- Jin Xu
  - Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA 94304, USA
- Vincent P Reuter
  - Genomics and Computational Biology Graduate Group, University of Pennsylvania, PA 19087, USA
- Howard Y Chang
  - Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA 94304, USA
- Nathan C Sheffield
  - Center for Public Health Genomics, University of Virginia, VA 22908, USA
  - Department of Biochemistry and Molecular Genetics, University of Virginia, VA 22908, USA
  - Department of Public Health Sciences, University of Virginia, VA 22908, USA
  - Department of Biomedical Engineering, University of Virginia, VA 22908, USA
10
Ahmed Z, Renart EG, Zeeshan S. Genomics pipelines to investigate susceptibility in whole genome and exome sequenced data for variant discovery, annotation, prediction and genotyping. PeerJ 2021; 9:e11724. [PMID: 34395068 PMCID: PMC8320519 DOI: 10.7717/peerj.11724] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/29/2021] [Accepted: 06/14/2021] [Indexed: 12/12/2022] Open
Abstract
Over the last few decades, genomics has been leading toward an audacious future and changing our views about conducting biomedical research, studying diseases, and understanding diversity in our society across the human species. Whole genome and exome sequencing (WGS/WES) are two of the most popular next-generation sequencing (NGS) methodologies currently used to detect genetic variations of clinical significance. Investigating WGS/WES data for variant discovery and genotyping is based on the nexus of different data analytic applications. Although several bioinformatics applications have been developed, many of them freely available and published, timely finding and interpreting genetic variants remain challenging tasks for diagnostic laboratories and clinicians. In this study, we are interested in understanding, evaluating, and reporting the current state of solutions available to process NGS data of variable lengths and types for the identification of variants, alleles, and haplotypes. Within this scope, we consulted high-quality peer-reviewed literature published in the last 10 years. We focused on the standalone and networked bioinformatics applications proposed to efficiently process WGS and WES data and to support downstream analysis for gene-variant discovery, annotation, prediction, and interpretation. We discuss our findings in this manuscript, which include but are not limited to the set of operations, workflow, data handling, involved tools, technologies and algorithms, and limitations of the assessed applications.
Affiliation(s)
- Zeeshan Ahmed
  - Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
  - Department of Medicine, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
- Eduard Gibert Renart
  - Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
- Saman Zeeshan
  - Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
11
Lu RJH, Liu YT, Huang CW, Yen MR, Lin CY, Chen PY. ATACgraph: Profiling Genome-Wide Chromatin Accessibility From ATAC-seq. Front Genet 2021; 11:618478. [PMID: 33584814 PMCID: PMC7874078 DOI: 10.3389/fgene.2020.618478] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/17/2020] [Accepted: 12/11/2020] [Indexed: 11/13/2022] Open
Abstract
Assay for transposase-accessible chromatin using sequencing data (ATAC-seq) is an efficient and precise method for revealing chromatin accessibility across the genome. Most of the current ATAC-seq tools follow chromatin immunoprecipitation sequencing (ChIP-seq) strategies that do not consider ATAC-seq-specific properties. To incorporate specific ATAC-seq quality control and the underlying biology of chromatin accessibility, we developed a bioinformatics software named ATACgraph for analyzing and visualizing ATAC-seq data. ATACgraph profiles accessible chromatin regions and provides ATAC-seq-specific information including definitions of nucleosome-free regions (NFRs) and nucleosome-occupied regions. ATACgraph also allows identification of differentially accessible regions between two ATAC-seq datasets. ATACgraph incorporates the docker image with the Galaxy platform to provide an intuitive user experience via the graphical interface. Without tedious installation processes on a local machine or cloud, users can analyze data through activated websites using pre-designed workflows or customized pipelines composed of ATACgraph modules. Overall, ATACgraph is an effective tool designed for ATAC-seq for biologists with minimal bioinformatics knowledge to analyze chromatin accessibility. ATACgraph can be run on any ATAC-seq data with no limit to specific genomes. As validation, we demonstrated ATACgraph on human genome to showcase its functions for ATAC-seq interpretation. This software is publicly accessible and can be downloaded at https://github.com/RitataLU/ATACgraph.
Affiliation(s)
- Rita Jui-Hsien Lu
  - Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan
  - Department of Medicine, Washington University in St. Louis, St. Louis, MO, United States
- Yen-Ting Liu
  - Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan
- Chih Wei Huang
  - Institute of Information Science, Academia Sinica, Taipei, Taiwan
- Ming-Ren Yen
  - Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan
- Chung-Yen Lin
  - Institute of Information Science, Academia Sinica, Taipei, Taiwan
- Pao-Yang Chen
  - Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan
12
Hand2 Selectively Reorganizes Chromatin Accessibility to Induce Pacemaker-like Transcriptional Reprogramming. Cell Rep 2020; 27:2354-2369.e7. [PMID: 31116981 PMCID: PMC6657359 DOI: 10.1016/j.celrep.2019.04.077] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 12/21/2018] [Revised: 02/25/2019] [Accepted: 04/17/2019] [Indexed: 01/01/2023] Open
Abstract
Gata4, Hand2, Mef2c, and Tbx5 (GHMT) can reprogram transduced fibroblasts into induced pacemaker-like myocytes (iPMs), but the underlying mechanisms remain obscure. Here, we explore the role of Hand2 in iPM formation by using a combination of transcriptome, genome, and biochemical assays. We found many shared transcriptional signatures between iPMs and the endogenous sinoatrial node (SAN), yet key regulatory networks remain missing. We demonstrate that Hand2 augments chromatin accessibility at loci involved in sarcomere organization, electrical coupling, and membrane depolarization. Focusing on an established cardiac Hand2 cistrome, we observe selective reorganization of chromatin accessibility to promote pacemaker-specific gene expression. Moreover, we identify a Hand2 cardiac subtype diversity (CSD) domain through biochemical analysis of the N terminus. By integrating our RNA-seq and ATAC-seq datasets, we highlight desmosome organization as a hallmark feature of iPM formation. Collectively, our results illuminate Hand2-dependent mechanisms that may guide future efforts to rationally improve iPM formation. Gata4, Hand2, Mef2c, and Tbx5 can reprogram fibroblasts into cardiomyocyte-like cells, including induced pacemakers (iPMs). Fernandez-Perez et al. show that Hand2 coordinates this process by influencing chromatin accessibility and gene expression in fibroblasts undergoing iPM lineage conversion. These insights could eventually inform the production of superior replacement cells.
13
Smith JP, Sheffield NC. Analytical Approaches for ATAC-seq Data Analysis. Current Protocols in Human Genetics 2020; 106:e101. [PMID: 32543102 PMCID: PMC8191135 DOI: 10.1002/cphg.101] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/17/2022]
Abstract
ATAC-seq, the assay for transposase-accessible chromatin using sequencing, is a quick and efficient approach to investigating the chromatin accessibility landscape. Investigating chromatin accessibility has broad utility for answering many biological questions, such as mapping nucleosomes, identifying transcription factor binding sites, and measuring differential activity of DNA regulatory elements. Because the ATAC-seq protocol is both simple and relatively inexpensive, there has been a rapid increase in the availability of chromatin accessibility data. Furthermore, advances in ATAC-seq protocols are rapidly extending its breadth to additional experimental conditions, cell types, and species. Accompanying the increase in data, there has also been an explosion of new tools and analytical approaches for analyzing it. Here, we explain the fundamentals of ATAC-seq data processing, summarize common analysis approaches, and review computational tools to provide recommendations for different research questions. This primer provides a starting point and a reference for analysis of ATAC-seq data. © 2020 Wiley Periodicals LLC.
Affiliation(s)
- Jason P. Smith
  - Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia
  - Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, Virginia
- Nathan C. Sheffield
  - Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia
  - Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, Virginia
  - Department of Public Health Sciences, University of Virginia, Charlottesville, Virginia
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia
14
Reske JJ, Wilson MR, Chandler RL. ATAC-seq normalization method can significantly affect differential accessibility analysis and interpretation. Epigenetics Chromatin 2020; 13:22. [PMID: 32321567 PMCID: PMC7178746 DOI: 10.1186/s13072-020-00342-y] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 12/31/2019] [Accepted: 04/11/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Chromatin dysregulation is associated with developmental disorders and cancer. Numerous methods for measuring genome-wide chromatin accessibility have been developed in the genomic era to interrogate the function of chromatin regulators. A recent technique which has gained widespread use due to speed and low input requirements with native chromatin is the Assay for Transposase-Accessible Chromatin, or ATAC-seq. Biologists have since used this method to compare chromatin accessibility between two cellular conditions. However, approaches for calculating differential accessibility can yield conflicting results, and little emphasis is placed on choice of normalization method during differential ATAC-seq analysis, especially when global chromatin alterations might be expected.
RESULTS: Using an in vivo ATAC-seq data set generated in our recent report, we observed differences in chromatin accessibility patterns depending on the data normalization method used to calculate differential accessibility. This observation was further verified on published ATAC-seq data from yeast. We propose a generalized workflow for differential accessibility analysis using ATAC-seq data. We further show this workflow identifies sites of differential chromatin accessibility that correlate with gene expression and is sensitive to differential analysis using negative controls.
CONCLUSIONS: We argue that researchers should systematically compare multiple normalization methods before continuing with differential accessibility analysis. ATAC-seq users should be aware of the interpretations of potential bias within experimental data and the assumptions of the normalization method implemented.
Affiliation(s)
- Jake J Reske
  - Department of Obstetrics, Gynecology and Reproductive Biology, College of Human Medicine, Michigan State University, Grand Rapids, MI, 49503, USA
- Mike R Wilson
  - Department of Obstetrics, Gynecology and Reproductive Biology, College of Human Medicine, Michigan State University, Grand Rapids, MI, 49503, USA
- Ronald L Chandler
  - Department of Obstetrics, Gynecology and Reproductive Biology, College of Human Medicine, Michigan State University, Grand Rapids, MI, 49503, USA
  - Center for Epigenetics, Van Andel Research Institute, Grand Rapids, MI, 49503, USA
|
15
|
Zeeshan S, Xiong R, Liang BT, Ahmed Z. 100 Years of evolving gene-disease complexities and scientific debutants. Brief Bioinform 2019; 21:885-905. [PMID: 30972412 DOI: 10.1093/bib/bbz038] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 01/09/2019] [Revised: 03/06/2019] [Accepted: 03/08/2019] [Indexed: 12/22/2022]
Abstract
It has been over 100 years since the word 'gene' came into use, and the concept has progressively evolved in several scientific directions. Successive technological advancements have revolutionized the field of genomics, for example triple code development, gene number proposition, genetic mapping, data banks, gene-disease maps, catalogs of human genes and genetic disorders, CRISPR/Cas9, big data and next generation sequencing. In this manuscript, we present the progress of genomics from pea plant genetics to the human genome project and highlight the molecular, technical and computational developments. Studying the genome and epigenome has revealed the fundamentals of the development and progression of human diseases, which include chromosomal, monogenic, multifactorial and mitochondrial diseases. The World Health Organization has classified, standardized and maintained all human diseases, while many academic and commercial online systems share information about genes and link them to associated diseases. To efficiently fathom the wealth of this biological data, there is a crucial need to generate appropriate gene annotation repositories and resources. Our focus is on how many gene-disease databases are available worldwide and which sources are authentic, regularly updated and recommended for research and clinical purposes. In this manuscript, we discuss and compare 43 such databases and bioinformatics applications, which enable users to connect, explore and, if possible, download gene-disease data.
Affiliation(s)
- Saman Zeeshan
  - The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA
- Ruoyun Xiong
  - Department of Genetics and Genome Sciences, School of Medicine, University of Connecticut Health Center, Farmington Ave, Farmington, CT, USA
- Bruce T Liang
  - Department of Genetics and Genome Sciences, School of Medicine, University of Connecticut Health Center, Farmington Ave, Farmington, CT, USA
  - Pat and Jim Calhoun Cardiology Center, School of Medicine, University of Connecticut Health Center, Farmington Ave, Farmington, CT, USA
- Zeeshan Ahmed
  - Department of Genetics and Genome Sciences, School of Medicine, University of Connecticut Health Center, Farmington Ave, Farmington, CT, USA
|
16
|
Bhattacharyya S, Sathe AA, Bhakta M, Xing C, Munshi NV. PAN-INTACT enables direct isolation of lineage-specific nuclei from fibrous tissues. PLoS One 2019; 14:e0214677. [PMID: 30939177 PMCID: PMC6445515 DOI: 10.1371/journal.pone.0214677] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 12/05/2018] [Accepted: 03/18/2019] [Indexed: 12/27/2022]
Abstract
Recent studies have highlighted the extraordinary cell type diversity that exists within mammalian organs, yet the molecular drivers of such heterogeneity remain elusive. To address this issue, much attention has been focused on profiling the transcriptome and epigenome of individual cell types. However, standard cell type isolation methods based on surface or fluorescent markers remain problematic for cells residing within organs with significant connective tissue. Since the nucleus contains both genomic and transcriptomic information, the isolation of nuclei tagged in specific cell types (INTACT) method provides an attractive solution. Although INTACT has been successfully applied to plants, flies, zebrafish, frogs, and mouse brain and adipose tissue, broad use across mammalian organs remains challenging. Here we describe the PAN-INTACT method, which can be used to isolate cell type specific nuclei from fibrous mouse organs, which are particularly problematic. As a proof-of-concept, we demonstrate successful isolation of cell type-specific nuclei from the mouse heart, which contains substantial connective tissue and harbors multiple cell types, including cardiomyocytes, fibroblasts, endothelial cells, and epicardial cells. Compared to established techniques, PAN-INTACT allows more rapid isolation of cardiac nuclei to facilitate downstream applications. We show cell type-specific isolation of nuclei from the hearts of Nkx2-5Cre/+; R26Sun1-2xsf-GFP-6xmyc/+ mice, which we confirm by expression of lineage markers. Furthermore, we perform Assay for Transposase Accessible Chromatin (ATAC)-Seq to provide high-fidelity chromatin accessibility maps of Nkx2-5+ nuclei. To extend the applicability of PAN-INTACT, we also demonstrate successful isolation of Wt1+ podocytes from adult kidney. Taken together, our data suggest that PAN-INTACT is broadly applicable for profiling the transcriptional and epigenetic landscape of specific cell types. 
Thus, we envision that our method can be used to systematically probe mechanistic details of cell type-specific functions within individual organs of intact mice.
Affiliation(s)
- Samadrita Bhattacharyya
  - Department of Internal Medicine, Division of Cardiology, UT Southwestern Medical Center, Dallas, Texas, United States of America
- Adwait A. Sathe
  - McDermott Center for Human Growth and Development, UT Southwestern Medical Center, Dallas, Texas, United States of America
- Minoti Bhakta
  - Department of Internal Medicine, Division of Cardiology, UT Southwestern Medical Center, Dallas, Texas, United States of America
- Chao Xing
  - McDermott Center for Human Growth and Development, UT Southwestern Medical Center, Dallas, Texas, United States of America
- Nikhil V. Munshi
  - Department of Internal Medicine, Division of Cardiology, UT Southwestern Medical Center, Dallas, Texas, United States of America
  - McDermott Center for Human Growth and Development, UT Southwestern Medical Center, Dallas, Texas, United States of America
  - Department of Molecular Biology, UT Southwestern Medical Center, Dallas, Texas, United States of America
  - Hamon Center for Regenerative Science and Medicine, Dallas, Texas, United States of America
|
17
|
Divate M, Cheung E. GUAVA: A Graphical User Interface for the Analysis and Visualization of ATAC-seq Data. Front Genet 2018; 9:250. [PMID: 30065749 PMCID: PMC6056626 DOI: 10.3389/fgene.2018.00250] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 02/15/2018] [Accepted: 06/25/2018] [Indexed: 01/11/2023]
Abstract
Assay for Transposase Accessible Chromatin with high-throughput sequencing (ATAC-seq) is a powerful genomic technology that is used for the global mapping and analysis of open chromatin regions. However, for users to process and analyze such data they either have to use a number of complicated bioinformatic tools or attempt to use the currently available ATAC-seq analysis software, which is not very user friendly and lacks visualization of the ATAC-seq results. Because of these issues, biologists with minimal bioinformatics background who wish to process and analyze their own ATAC-seq data by themselves will find these tasks difficult and ultimately will need to seek help from bioinformatics experts. Moreover, none of the available tools provides a complete solution for ATAC-seq data analysis. Therefore, to enable non-programming researchers to analyze ATAC-seq data on their own, we developed a tool called Graphical User Interface for the Analysis and Visualization of ATAC-seq data (GUAVA). GUAVA is a standalone software that provides users with a seamless solution from beginning to end including adapter trimming, read mapping, the identification and differential analysis of ATAC-seq peaks, functional annotation, and the visualization of ATAC-seq results. We believe GUAVA will be a highly useful and time-saving tool for analyzing ATAC-seq data for biologists with minimal or no bioinformatics background. Since GUAVA can also operate through the command line, it can easily be integrated into existing pipelines, thus providing flexibility to users with computational experience.
Affiliation(s)
- Mayur Divate
  - Faculty of Health Sciences, University of Macau, Macau, Macau
- Edwin Cheung
  - Faculty of Health Sciences, University of Macau, Macau, Macau
|
18
|
Ou J, Liu H, Yu J, Kelliher MA, Castilla LH, Lawson ND, Zhu LJ. ATACseqQC: a Bioconductor package for post-alignment quality assessment of ATAC-seq data. BMC Genomics 2018; 19:169. [PMID: 29490630 PMCID: PMC5831847 DOI: 10.1186/s12864-018-4559-3] [Citation(s) in RCA: 120] [Impact Index Per Article: 20.0] [Received: 10/27/2017] [Accepted: 02/20/2018] [Indexed: 12/31/2022]
Abstract
BACKGROUND: ATAC-seq (Assays for Transposase-Accessible Chromatin using sequencing) is a recently developed technique for genome-wide analysis of chromatin accessibility. Compared to earlier methods for assaying chromatin accessibility, ATAC-seq is faster and easier to perform, does not require cross-linking, has higher signal to noise ratio, and can be performed on small cell numbers. However, to ensure a successful ATAC-seq experiment, step-by-step quality assurance processes, including both wet lab quality control and in silico quality assessment, are essential. While several tools have been developed or adopted for assessing read quality, identifying nucleosome occupancy and accessible regions from ATAC-seq data, none of the tools provide a comprehensive set of functionalities for preprocessing and quality assessment of aligned ATAC-seq datasets.
RESULTS: We have developed a Bioconductor package, ATACseqQC, for easily generating various diagnostic plots to help researchers quickly assess the quality of their ATAC-seq data. In addition, this package contains functions to preprocess aligned ATAC-seq data for subsequent peak calling. Here we demonstrate the utilities of our package using 25 publicly available ATAC-seq datasets from four studies. We also provide guidelines on what the diagnostic plots should look like for an ideal ATAC-seq dataset.
CONCLUSIONS: This software package has been used successfully for preprocessing and assessing several in-house and public ATAC-seq datasets. Diagnostic plots generated by this package will facilitate the quality assessment of ATAC-seq data, and help researchers to evaluate their own ATAC-seq experiments as well as select high-quality ATAC-seq datasets from public repositories such as GEO to avoid generating hypotheses or drawing conclusions from low-quality ATAC-seq experiments.
The software, source code, and documentation are freely available as a Bioconductor package at https://bioconductor.org/packages/release/bioc/html/ATACseqQC.html .
Affiliation(s)
- Jianhong Ou
  - Department of Cell Biology, Duke University Medical Center, Durham, NC 27710 USA
- Haibo Liu
  - Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, 364 Plantation Street, Worcester, MA 01605 USA
- Jun Yu
  - Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, 364 Plantation Street, Worcester, MA 01605 USA
- Michelle A. Kelliher
  - Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, 364 Plantation Street, Worcester, MA 01605 USA
- Lucio H. Castilla
  - Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, 364 Plantation Street, Worcester, MA 01605 USA
- Nathan D. Lawson
  - Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, 364 Plantation Street, Worcester, MA 01605 USA
- Lihua Julie Zhu
  - Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, 364 Plantation Street, Worcester, MA 01605 USA
  - Department of Molecular Medicine, Program in Bioinformatics and Integrative Biology, Worcester, MA 01655 USA
|