1
Liu H, Ma W. DiffGR: Detecting Differentially Interacting Genomic Regions from Hi-C Contact Maps. Genomics, Proteomics & Bioinformatics 2024; 22:qzae028. [PMID: 39222712 DOI: 10.1093/gpbjnl/qzae028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Revised: 06/19/2023] [Accepted: 10/01/2023] [Indexed: 09/04/2024]
Abstract
Recent advances in high-throughput chromosome conformation capture (Hi-C) techniques have allowed us to map genome-wide chromatin interactions and uncover higher-order chromatin structures, thereby shedding light on the principles of genome architecture and functions. However, statistical methods for detecting changes in large-scale chromatin organization such as topologically associating domains (TADs) are still lacking. Here, we proposed a new statistical method, DiffGR, for detecting differentially interacting genomic regions at the TAD level between Hi-C contact maps. We utilized the stratum-adjusted correlation coefficient to measure similarity of local TAD regions. We then developed a nonparametric approach to identify statistically significant changes of genomic interacting regions. Through simulation studies, we demonstrated that DiffGR can robustly and effectively discover differential genomic regions under various conditions. Furthermore, we successfully revealed cell type-specific changes in genomic interacting regions in both human and mouse Hi-C datasets, and illustrated that DiffGR yielded consistent and advantageous results compared with state-of-the-art differential TAD detection methods. The DiffGR R package is published under the GNU General Public License (GPL) ≥ 2 license and is publicly available at https://github.com/wmalab/DiffGR.
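The stratum-adjusted correlation coefficient mentioned in the abstract can be illustrated with a minimal sketch. This is not the DiffGR implementation: it assumes two square contact matrices of equal size, omits the smoothing and variance-stabilizing steps used in practice, and weights each stratum (diagonal offset) by its size times the product of the two standard deviations. The function name and the `max_offset` parameter are hypothetical.

```python
import numpy as np

def stratum_adjusted_correlation(a, b, max_offset=None):
    """Simplified stratum-adjusted correlation between two contact matrices.

    Assumption: a and b are square arrays of the same shape (for example,
    the sub-matrices covering one TAD in two Hi-C samples).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1] or a.shape != b.shape:
        raise ValueError("expected two square matrices of the same shape")
    n = a.shape[0]
    if max_offset is None:
        max_offset = n - 1
    num = 0.0
    den = 0.0
    for k in range(min(max_offset, n - 1) + 1):
        # Stratum k holds all contacts between bins k apart.
        x = np.diagonal(a, offset=k)
        y = np.diagonal(b, offset=k)
        if x.size < 2:
            continue  # correlation is undefined for a single point
        sx, sy = x.std(), y.std()
        if sx == 0 or sy == 0:
            continue  # constant stratum carries no correlation signal
        rho = np.corrcoef(x, y)[0, 1]
        weight = x.size * sx * sy  # simplified stratum weight (assumption)
        num += weight * rho
        den += weight
    return num / den if den > 0 else float("nan")
```

Under these assumptions, a value near 1 suggests that the two maps have similar distance-stratified contact patterns in the region, and lower values suggest divergence. Assessing significance would need a separate step, such as the nonparametric approach the abstract describes.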
Affiliation(s)
- Huiling Liu
  - Department of Statistics, University of California Riverside, Riverside, CA 92521, USA
- Wenxiu Ma
  - Department of Statistics, University of California Riverside, Riverside, CA 92521, USA
2
Choppavarapu L, Fang K, Liu T, Jin VX. Hi-C profiling in tissues reveals 3D chromatin-regulated breast tumor heterogeneity and tumor-specific looping-mediated biological pathways. bioRxiv 2024:2024.03.13.584872. [PMID: 38559097 PMCID: PMC10979939 DOI: 10.1101/2024.03.13.584872] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Current knowledge of three-dimensional (3D) chromatin regulation in normal and disease states has mostly been accumulated through Hi-C profiling in in vitro cell culture systems. The limitations of such systems include failing to recapitulate disease-specific physiological properties and often lacking a clinically relevant disease microenvironment. In this study, we conduct tissue-specific Hi-C profiling in a pilot cohort of 12 breast tissues comprising two normal tissues (NTs) and ten ER+ breast tumor tissues (TTs), including five primary tumors (PTs) and five tamoxifen-treated recurrent tumors (RTs). We find largely preserved compartments, highly heterogeneous topologically associated domains (TADs), and intensively variable chromatin loops among breast tumors, demonstrating 3D chromatin-regulated breast tumor heterogeneity. Further cross-examination identifies RT-specific looping-mediated biological pathways and suggests that CA2, an enhancer-promoter looping (EPL)-mediated target gene within the bicarbonate transport metabolism pathway, might play a role in driving tamoxifen resistance. Remarkably, the inhibition of CA2 not only impedes tumor growth both in vitro and in vivo, but also reverses chromatin looping. Our study thus yields significant mechanistic insights into the role and clinical relevance of 3D chromatin architecture in breast cancer endocrine resistance.
3
Gilbertson EN, Brand CM, McArthur E, Rinker DC, Kuang S, Pollard KS, Capra JA. Machine learning reveals the diversity of human 3D chromatin contact patterns. bioRxiv 2023:2023.12.22.573104. [PMID: 38187606 PMCID: PMC10769343 DOI: 10.1101/2023.12.22.573104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Understanding variation in chromatin contact patterns across human populations is critical for interpreting non-coding variants and their ultimate effects on gene expression and phenotypes. However, experimental determination of chromatin contacts at a population scale is prohibitively expensive. To overcome this challenge, we develop and validate a machine learning method to quantify the diversity of 3D chromatin contacts at 2 kilobase resolution from genome sequence alone. We then apply this approach to thousands of diverse modern humans and the inferred human-archaic hominin ancestral genome. While patterns of 3D contact divergence genome-wide are qualitatively similar to patterns of sequence divergence, we find that 3D divergence in local 1-megabase genomic windows does not follow sequence divergence. In particular, we identify 392 windows with significantly greater 3D divergence than expected from sequence. Moreover, 26% of genomic windows have rare 3D contact variation observed in a small number of individuals. Using in silico mutagenesis, we find that most sequence changes do not result in changes to 3D chromatin contacts. However, in windows with substantial 3D divergence, just one or a few variants can lead to divergent 3D chromatin contacts without the individuals carrying those variants having high sequence divergence. In summary, inferring 3D chromatin contact maps across human populations reveals diverse contact patterns. We anticipate that these genetically diverse maps of 3D chromatin contact will provide a reference for future work on the function and evolution of 3D chromatin contact variation across human populations.
Affiliation(s)
- Erin N Gilbertson
  - Biomedical Informatics Graduate Program, University of California San Francisco, San Francisco, CA
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
- Colin M Brand
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, CA
- Evonne McArthur
  - Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN
  - Department of Medicine, University of Washington, Seattle, WA
- David C Rinker
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN
- Shuzhen Kuang
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
- Katherine S Pollard
  - Biomedical Informatics Graduate Program, University of California San Francisco, San Francisco, CA
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, CA
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
  - Chan Zuckerberg Biohub SF, San Francisco, CA
- John A Capra
  - Biomedical Informatics Graduate Program, University of California San Francisco, San Francisco, CA
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, CA
4
Sefer E. A comparison of topologically associating domain callers over mammals at high resolution. BMC Bioinformatics 2022; 23:127. [PMID: 35413815 PMCID: PMC9006547 DOI: 10.1186/s12859-022-04674-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/30/2021] [Accepted: 04/07/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND Topologically associating domains (TADs) are locally highly-interacting genome regions, which also play a critical role in regulating gene expression in the cell. TADs were first identified while investigating the 3D genome structure in High-throughput Chromosome Conformation Capture (Hi-C) interaction datasets. Substantial effort has been devoted to developing techniques for inferring TADs from Hi-C interaction datasets. Many TAD-calling methods have been developed, which differ in their criteria and assumptions in TAD inference. Correspondingly, TADs inferred via these callers vary in terms of both similarities and the biological features they are enriched in. RESULT We have carried out a systematic comparison of 27 TAD-calling methods over mammals. We use Micro-C, a recent high-resolution variant of Hi-C, to compare TADs at a very high resolution, and classify the methods into 3 categories: feature-based methods, clustering methods, and graph-partitioning methods. We have evaluated TAD boundaries, gaps between adjacent TADs, and quality of TADs across various criteria. We also found that CTCF and cohesin proteins in particular are effective in the formation of TADs with corner dots. We have also assessed the callers' performance on simulated datasets, since a gold standard for TADs is missing. TAD sizes and numbers change remarkably between TAD callers and dataset resolutions, indicating that TADs are hierarchically organized domains instead of disjoint regions. A core subset of feature-based TAD callers regularly performs the best while inferring reproducible domains, which are also enriched for TAD-related biological properties. CONCLUSION We have analyzed the fundamental principles of TAD-calling methods, and identified the existing situation in TAD inference across high-resolution Micro-C interaction datasets over mammals. We come up with a systematic, comprehensive, and concise framework to evaluate the TAD-calling methods' performance across Micro-C datasets. Our research will be useful in selecting appropriate methods for TAD inference and evaluation based on available data, experimental design, and biological question of interest. We also introduce our analysis as a benchmarking tool with publicly available source code.
Affiliation(s)
- Emre Sefer
  - Department of Computer Science, Ozyegin University, Istanbul, Turkey
5
Fino J, Marques B, Dong Z, David D. SVInterpreter: A Comprehensive Topologically Associated Domain-Based Clinical Outcome Prediction Tool for Balanced and Unbalanced Structural Variants. Front Genet 2021; 12:757170. [PMID: 34925449 PMCID: PMC8671832 DOI: 10.3389/fgene.2021.757170] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/11/2021] [Accepted: 10/12/2021] [Indexed: 11/13/2022]
Abstract
With the advent of genomic sequencing, a number of balanced and unbalanced structural variants (SVs) can be detected per individual. Mainly due to incompleteness and the scattered nature of the available annotation data of the human genome, manual interpretation of the SV's clinical significance is laborious and cumbersome. Since bioinformatic tools developed for this task are limited, a comprehensive tool to assist clinical outcome prediction of SVs is warranted. Herein, we present SVInterpreter, a free Web application, which analyzes both balanced and unbalanced SVs using topologically associated domains (TADs) as genome units. Among others, gene-associated data (as function and dosage sensitivity), phenotype similarity scores, and copy number variants (CNVs) scoring metrics are retrieved for an informed SV interpretation. For evaluation, we retrospectively applied SVInterpreter to 97 balanced (translocations and inversions) and 125 unbalanced (deletions, duplications, and insertions) previously published SVs, and 145 SVs identified from 20 clinical samples. Our results showed the ability of SVInterpreter to support the evaluation of SVs by (1) confirming more than half of the predictions of the original studies, (2) decreasing 40% of the variants of uncertain significance, and (3) indicating several potential position effect events. To our knowledge, SVInterpreter is the most comprehensive TAD-based tool to identify the possible disease-causing candidate genes and to assist prediction of the clinical outcome of SVs. SVInterpreter is available at http://dgrctools-insa.min-saude.pt/cgi-bin/SVInterpreter.py.
Affiliation(s)
- Joana Fino
  - Department of Human Genetics, National Health Institute Doutor Ricardo Jorge, Lisbon, Portugal
- Bárbara Marques
  - Department of Human Genetics, National Health Institute Doutor Ricardo Jorge, Lisbon, Portugal
- Zirui Dong
  - Department of Obstetrics and Gynaecology, The Chinese University of Hong Kong, Hong Kong, China
  - Shenzhen Research Institute, The Chinese University of Hong Kong, Shenzhen, China
  - Hong Kong Hub of Pediatric Excellence, The Chinese University of Hong Kong, Hong Kong, China
- Dezső David
  - Department of Human Genetics, National Health Institute Doutor Ricardo Jorge, Lisbon, Portugal
6
Marti-Marimon M, Vialaneix N, Lahbib-Mansais Y, Zytnicki M, Camut S, Robelin D, Yerle-Bouissou M, Foissac S. Major Reorganization of Chromosome Conformation During Muscle Development in Pig. Front Genet 2021; 12:748239. [PMID: 34675966 PMCID: PMC8523936 DOI: 10.3389/fgene.2021.748239] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/27/2021] [Accepted: 09/14/2021] [Indexed: 12/12/2022]
Abstract
The spatial organization of the genome in the nucleus plays a crucial role in eukaryotic cell functions, yet little is known about chromatin structure variations during late fetal development in mammals. We performed in situ high-throughput chromosome conformation capture (Hi-C) sequencing of DNA from muscle samples of pig fetuses at two late stages of gestation. Comparative analysis of the resulting Hi-C interaction matrices between both groups showed widespread differences of different types. First, we discovered a complex landscape of stable and group-specific Topologically Associating Domains (TADs). Investigating the nuclear partition of the chromatin into transcriptionally active and inactive compartments, we observed a genome-wide fragmentation of these compartments between 90 and 110 days of gestation. Also, we identified and characterized the distribution of differential cis- and trans-pairwise interactions. In particular, trans-interactions at chromosome extremities revealed a mechanism of telomere clustering further confirmed by 3D Fluorescence in situ Hybridization (FISH). Altogether, we report major variations of the three-dimensional genome conformation during muscle development in pig, involving several levels of chromatin remodeling and structural regulation.
Affiliation(s)
- Sylvie Camut
  - GenPhySE, Université de Toulouse, INRAE, ENVT, Castanet Tolosan, France
- David Robelin
  - GenPhySE, Université de Toulouse, INRAE, ENVT, Castanet Tolosan, France
- Sylvain Foissac
  - GenPhySE, Université de Toulouse, INRAE, ENVT, Castanet Tolosan, France
7
Animesh S, Choudhary R, Wong BJH, Koh CTJ, Ng XY, Tay JKX, Chong WQ, Jian H, Chen L, Goh BC, Fullwood MJ. Profiling of 3D Genome Organization in Nasopharyngeal Cancer Needle Biopsy Patient Samples by a Modified Hi-C Approach. Front Genet 2021; 12:673530. [PMID: 34539729 PMCID: PMC8446523 DOI: 10.3389/fgene.2021.673530] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/27/2021] [Accepted: 07/31/2021] [Indexed: 11/16/2022]
Abstract
Nasopharyngeal cancer (NPC), a cancer derived from epithelial cells in the nasopharynx, is a cancer common in China, Southeast Asia, and Africa. The three-dimensional (3D) genome organization of nasopharyngeal cancer is poorly understood. A major challenge in understanding the 3D genome organization of cancer samples is the lack of a method for the characterization of chromatin interactions in solid cancer needle biopsy samples. Here, we developed Biop-C, a modified in situ Hi-C method using solid cancer needle biopsy samples. We applied Biop-C to characterize three nasopharyngeal cancer solid cancer needle biopsy patient samples. We identified topologically associated domains (TADs), chromatin interaction loops, and frequently interacting regions (FIREs) at key oncogenes in nasopharyngeal cancer from the Biop-C heatmaps. We observed that the genomic features are shared at some important oncogenes, but the patients also display extensive heterogeneity at certain genomic loci. On analyzing the super enhancer landscape in nasopharyngeal cancer cell lines, we found that the super enhancers are associated with FIREs and can be linked to distal genes via chromatin loops in NPC. Taken together, our results demonstrate the utility of our Biop-C method in investigating 3D genome organization in solid cancers.
Affiliation(s)
- Sambhavi Animesh
  - Cancer Science Institute of Singapore, Centre for Translational Medicine, National University of Singapore, Singapore, Singapore
- Ruchi Choudhary
  - Cancer Science Institute of Singapore, Centre for Translational Medicine, National University of Singapore, Singapore, Singapore
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- Charlotte Tze Jia Koh
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- Xin Yi Ng
  - Department of Haematology-Oncology, National University Cancer Institute, National University Health System, Singapore, Singapore
- Joshua Kai Xun Tay
  - Department of Otolaryngology - Head and Neck Surgery, National University of Singapore, Singapore, Singapore
- Wan-Qin Chong
  - Department of Haematology-Oncology, National University Cancer Institute, National University Health System, Singapore, Singapore
- Han Jian
  - Cancer Science Institute of Singapore, Centre for Translational Medicine, National University of Singapore, Singapore, Singapore
- Leilei Chen
  - Cancer Science Institute of Singapore, Centre for Translational Medicine, National University of Singapore, Singapore, Singapore
  - Department of Anatomy, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Boon Cher Goh
  - Cancer Science Institute of Singapore, Centre for Translational Medicine, National University of Singapore, Singapore, Singapore
  - Department of Haematology-Oncology, National University Cancer Institute, National University Health System, Singapore, Singapore
  - Department of Pharmacology, Yong Loo Lin School of Medicine, National University Health System, Singapore, Singapore
- Melissa Jane Fullwood
  - Cancer Science Institute of Singapore, Centre for Translational Medicine, National University of Singapore, Singapore, Singapore
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
8
Poppenberg KE, Zebraski HR, Avasthi N, Waqas M, Siddiqui AH, Jarvis JN, Tutino VM. Epigenetic landscapes of intracranial aneurysm risk haplotypes implicate enhancer function of endothelial cells and fibroblasts in dysregulated gene expression. BMC Med Genomics 2021; 14:162. [PMID: 34134708 PMCID: PMC8210394 DOI: 10.1186/s12920-021-01007-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/10/2020] [Accepted: 06/02/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND Genome-wide association studies have identified many single nucleotide polymorphisms (SNPs) associated with increased risk for intracranial aneurysm (IA). However, how such variants affect gene expression within IA is poorly understood. We used publicly available ChIP-Seq data to study chromatin landscapes surrounding risk loci to determine whether IA-associated SNPs affect functional elements that regulate gene expression in cell types comprising IA tissue. METHODS We mapped 16 significant IA-associated SNPs to linkage disequilibrium (LD) blocks within the human genome. Using ChIP-Seq data, we examined these regions for the presence of H3K4me1, H3K27ac, and H3K9ac histone marks (typically associated with latent/active enhancers). This analysis was conducted in several cell types that are present in IA tissue (endothelial cells, smooth muscle cells, fibroblasts, macrophages, monocytes, neutrophils, T cells, B cells, NK cells). In cell types with significant histone enrichment, we used HiC data to investigate topologically associated domains (TADs) encompassing the LD blocks to identify genes that may be affected by IA-associated variants. Bioinformatic analyses were performed to determine the biological significance of these genes. Genes within HiC-defined TADs were also compared to differentially expressed genes from RNA-seq/microarray studies of IA tissues. RESULTS We found that endothelial cells and fibroblasts, rather than smooth muscle or immune cells, have significant enrichment for enhancer marks on IA risk haplotypes (p < 0.05). Bioinformatic analyses demonstrated that genes within TADs subsuming these regions are associated with structural extracellular matrix components and enzymatic activity. The majority of histone-marked TADs (83% fibroblasts [IMR90], 77% HUVEC) encompassed at least one differentially expressed gene from IA tissue studies.
CONCLUSIONS These findings provide evidence that genetic variants associated with IA risk act on endothelial cells and fibroblasts. There is strong circumstantial evidence that this may be mediated through altered enhancer function, as genes in TADs encompassing enhancer marks have also been shown to be differentially expressed in IA tissue. These genes are largely related to organization and regulation of the extracellular matrix. This study builds upon our previous work (Poppenberg et al., BMC Med Genomics, 2019) by including a more diverse set of data from additional cell types and by identifying potentially affected genes (i.e. those in TADs).
Affiliation(s)
- Kerry E Poppenberg
  - Canon Stroke and Vascular Research Center, University at Buffalo, Clinical and Translational Research Center, 875 Ellicott Street, Buffalo, NY, 14214, USA
  - Department of Neurosurgery, University at Buffalo, Buffalo, NY, USA
- Haley R Zebraski
  - Canon Stroke and Vascular Research Center, University at Buffalo, Clinical and Translational Research Center, 875 Ellicott Street, Buffalo, NY, 14214, USA
  - Department of Biomedical Engineering, University at Buffalo, Buffalo, NY, USA
- Naval Avasthi
  - Canon Stroke and Vascular Research Center, University at Buffalo, Clinical and Translational Research Center, 875 Ellicott Street, Buffalo, NY, 14214, USA
  - Department of Biomedical Engineering, University at Buffalo, Buffalo, NY, USA
- Muhammad Waqas
  - Canon Stroke and Vascular Research Center, University at Buffalo, Clinical and Translational Research Center, 875 Ellicott Street, Buffalo, NY, 14214, USA
  - Department of Neurosurgery, University at Buffalo, Buffalo, NY, USA
- Adnan H Siddiqui
  - Canon Stroke and Vascular Research Center, University at Buffalo, Clinical and Translational Research Center, 875 Ellicott Street, Buffalo, NY, 14214, USA
  - Department of Neurosurgery, University at Buffalo, Buffalo, NY, USA
- James N Jarvis
  - Department of Pediatrics, University at Buffalo, Buffalo, NY, USA
- Vincent M Tutino
  - Canon Stroke and Vascular Research Center, University at Buffalo, Clinical and Translational Research Center, 875 Ellicott Street, Buffalo, NY, 14214, USA
  - Department of Neurosurgery, University at Buffalo, Buffalo, NY, USA
  - Department of Biomedical Engineering, University at Buffalo, Buffalo, NY, USA
  - Department of Pathology and Anatomical Sciences, University at Buffalo, Buffalo, NY, USA
  - Department of Mechanical and Aerospace Engineering, University at Buffalo, Buffalo, NY, USA
9
Xing H, Wu Y, Zhang MQ, Chen Y. Deciphering hierarchical organization of topologically associated domains through change-point testing. BMC Bioinformatics 2021; 22:183. [PMID: 33838653 PMCID: PMC8037919 DOI: 10.1186/s12859-021-04113-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/25/2020] [Accepted: 03/30/2021] [Indexed: 12/20/2022]
Abstract
Background The nucleus of eukaryotic cells spatially packages chromosomes into a hierarchical and distinct segregation that plays critical roles in maintaining transcription regulation. High-throughput methods of chromosome conformation capture, such as Hi-C, have revealed topologically associating domains (TADs) that are defined by biased chromatin interactions within them. Results We introduce a novel method, HiCKey, to decipher hierarchical TAD structures in Hi-C data and compare them across samples. We first derive a generalized likelihood-ratio (GLR) test for detecting change-points in an interaction matrix that follows a negative binomial distribution or general mixture distribution. We then employ several optimal search strategies to decipher hierarchical TADs with p values calculated by the GLR test. Large-scale validations of simulation data show that HiCKey has good precision in recalling known TADs and is robust against random collisions of chromatin interactions. By applying HiCKey to Hi-C data of seven human cell lines, we identified multiple layers of TAD organization among them, but the vast majority had no more than four layers. In particular, we found that TAD boundaries are significantly enriched in active chromosomal regions compared to repressed regions. Conclusions HiCKey is optimized for processing large matrices constructed from high-resolution Hi-C experiments. The method and theoretical result of the GLR test provide a general framework for significance testing of similar experimental chromatin interaction data that may not fully follow negative binomial distributions but rather more general mixture distributions. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04113-8.
Affiliation(s)
- Haipeng Xing
  - Department of Applied Mathematics and Statistics, State University of New York at Stony Brook, 100 Nicolls Rd, Stony Brook, NY, 11794, USA
- Yingru Wu
  - Department of Applied Mathematics and Statistics, State University of New York at Stony Brook, 100 Nicolls Rd, Stony Brook, NY, 11794, USA
- Michael Q Zhang
  - Center for System Biology, University of Texas at Dallas, 800 W Campbell Rd, Richardson, TX, 75080, USA
- Yong Chen
  - Department of Molecular and Cellular Biosciences, Rowan University, 201 Mullica Hill Rd, Glassboro, NJ, 08028, USA
10
Understanding transcription across scales: From base pairs to chromosomes. Mol Cell 2021; 81:1601-1616. [PMID: 33770487 DOI: 10.1016/j.molcel.2021.03.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/12/2021] [Revised: 02/23/2021] [Accepted: 02/26/2021] [Indexed: 02/07/2023]
Abstract
The influence of genome organization on transcription is central to our understanding of cell type specification. Higher-order genome organization is established through short- and long-range DNA interactions. Coordination of these interactions, from single atoms to entire chromosomes, plays a fundamental role in transcriptional control of gene expression. Loss of this coupling can result in disease. Analysis of transcriptional regulation typically involves disparate experimental approaches, from structural studies that define angstrom-level interactions to cell-biological and genomic approaches that assess mesoscale relationships. Thus, to fully understand the mechanisms that regulate gene expression, it is critical to integrate the findings gained across these distinct size scales. In this review, I illustrate fundamental ways in which cells regulate transcription in the context of genome organization.
11
Eres IE, Gilad Y. A TAD Skeptic: Is 3D Genome Topology Conserved? Trends Genet 2021; 37:216-223. [PMID: 33203573 PMCID: PMC8120795 DOI: 10.1016/j.tig.2020.10.009] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 07/31/2020] [Revised: 10/17/2020] [Accepted: 10/20/2020] [Indexed: 01/08/2023]
Abstract
The notion that topologically associating domains (TADs) are highly conserved across species is prevalent in the field of 3D genomics. However, what exactly is meant by 'highly conserved' and what are the actual comparative data that support this notion? To address these questions, we performed a historical review of the relevant literature and retraced numerous citation chains to reveal the primary data that were used as the basis for the widely accepted conclusion that TADs are highly conserved across evolution. A thorough review of the available evidence suggests the answer may be more complex than what is commonly presented.
Affiliation(s)
- Ittai E Eres
  - Department of Human Genetics, University of Chicago, Cummings Life Science Center, 928 E. 58th St., Chicago, IL 60637, USA
- Yoav Gilad
  - Department of Human Genetics, University of Chicago, Cummings Life Science Center, 928 E. 58th St., Chicago, IL 60637, USA
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, 5841 S. Maryland Ave., N417, MC6091, Chicago, IL 60637, USA
12
McArthur E, Capra JA. Topologically associating domain boundaries that are stable across diverse cell types are evolutionarily constrained and enriched for heritability. Am J Hum Genet 2021; 108:269-283. [PMID: 33545030 PMCID: PMC7895846 DOI: 10.1016/j.ajhg.2021.01.001] [Citation(s) in RCA: 91] [Impact Index Per Article: 30.3] [Received: 07/23/2020] [Accepted: 12/29/2020] [Indexed: 12/22/2022]
Abstract
Topologically associating domains (TADs) are fundamental units of three-dimensional (3D) nuclear organization. The regions bordering TADs (TAD boundaries) contribute to the regulation of gene expression by restricting interactions of cis-regulatory sequences to their target genes. TAD and TAD-boundary disruption have been implicated in rare-disease pathogenesis; however, we have a limited framework for integrating TADs and their variation across cell types into the interpretation of common-trait-associated variants. Here, we investigate an attribute of 3D genome architecture, the stability of TAD boundaries across cell types, and demonstrate its relevance to understanding how genetic variation in TADs contributes to complex disease. By synthesizing TAD maps across 37 diverse cell types with 41 genome-wide association studies (GWASs), we investigate the differences in disease association and evolutionary pressure on variation in TADs versus TAD boundaries. We demonstrate that genetic variation in TAD boundaries contributes more to complex-trait heritability, especially for immunologic, hematologic, and metabolic traits. We also show that TAD boundaries are more evolutionarily constrained than TADs. Next, stratifying boundaries by their stability across cell types, we find substantial variation. Compared to boundaries unique to a specific cell type, boundaries stable across cell types are further enriched for complex-trait heritability, evolutionary constraint, CTCF binding, and housekeeping genes. Thus, considering TAD boundary stability across cell types provides valuable context for understanding the genome's functional landscape and enabling variant interpretation that takes 3D structure into account.
Collapse
Affiliation(s)
- Evonne McArthur
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37235, USA
- John A Capra
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37235, USA; Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA; Department of Epidemiology and Biostatistics, University of California, San Francisco, CA, 94158; Bakar Institute for Computational Health Sciences, University of California, San Francisco, CA, 94158.
|
13
|
Abstract
Exploration of genetic variant-to-gene relationships by quantitative trait loci such as expression QTLs is a frequently used tool in genome-wide association studies. However, the wide range of public QTL databases and the lack of batch annotation features complicate a comprehensive annotation of GWAS results. In this work, we introduce the tool “Qtlizer” for annotating lists of variants in human with associated changes in gene expression and protein abundance using an integrated database of published QTLs. Features include incorporation of variants in linkage disequilibrium and reverse search by gene names. Analyzing the database for base pair distances between best significant eQTLs and their affected genes suggests that the commonly used cis-distance limit of 1,000,000 base pairs might be too restrictive, implicating a substantial amount of wrongly and yet undetected eQTLs. We also ranked genes with respect to the maximum number of tissue-specific eQTL studies in which a most significant eQTL signal was consistent. For the top 100 genes we observed the strongest enrichment with housekeeping genes (P = 2 × 10⁻⁶) and with the 10% highest expressed genes (P = 0.005) after grouping eQTLs by r² > 0.95, underlining the relevance of LD information in eQTL analyses. Qtlizer can be accessed via https://genehopper.de/qtlizer or by using the respective Bioconductor R-package (https://doi.org/10.18129/B9.bioc.Qtlizer).
|
14
|
Cresswell KG, Dozmorov MG. TADCompare: An R Package for Differential and Temporal Analysis of Topologically Associated Domains. Front Genet 2020; 11:158. [PMID: 32211023 PMCID: PMC7076128 DOI: 10.3389/fgene.2020.00158] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 09/08/2019] [Accepted: 02/11/2020] [Indexed: 12/02/2022]
Abstract
Recent research using chromatin conformation capture technologies, such as Hi-C, has demonstrated the importance of topologically associated domains (TADs) and smaller chromatin loops, collectively referred to hereafter as "interacting domains." Many such domains change during development or disease, and exhibit cell- and condition-specific differences. Quantification of the dynamic behavior of interacting domains will help to better understand genome regulation. Methods for comparing interacting domains between cells and conditions are highly limited. We developed TADCompare, a method for differential analysis of boundaries of interacting domains between two or more Hi-C datasets. TADCompare is based on a spectral clustering-derived measure called the eigenvector gap, which enables a locus-by-locus comparison of boundary differences. Using this measure, we introduce methods for identifying differential and consensus boundaries of interacting domains and tracking boundary changes over time. We further propose a novel framework for the systematic classification of boundary changes. Colocalization and gene enrichment analyses of different types of boundary changes demonstrated distinct biological functionality associated with them. TADCompare is available on https://github.com/dozmorovlab/TADCompare and Bioconductor (submitted).
|
15
|
Lindblad-Toh K. What animals can teach us about evolution, the human genome, and human disease. Ups J Med Sci 2020; 125:1-9. [PMID: 32054372 PMCID: PMC7054949 DOI: 10.1080/03009734.2020.1722298] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/22/2020] [Accepted: 01/23/2020] [Indexed: 12/14/2022]
Abstract
During the past 20 years, since I started as a postdoc, the world of genetics and genomics has changed dramatically. My main research goal throughout my career has been to understand human disease genetics, and I have developed comparative genomics and comparative genetics to generate resources and tools for understanding human disease. Through comparative genomics I have worked to sequence enough mammals to understand the functional potential of each base in the human genome as well as chosen vertebrates to study the evolutionary changes that have given many species their key traits. Through comparative genetics, I have developed the dog as a model for human disease, characterising the genome itself and determining a list of germ-line loci and somatic mutations causing complex diseases and cancer in the dog. Pulling all these findings and resources together opens new doors for understanding genome evolution, the genetics of complex traits and cancer in man and his best friend.
Affiliation(s)
- Kerstin Lindblad-Toh
- Department for Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
|