1
Wu H, Zhou B, Zhou H, Zhang P, Wang M. Be-1DCNN: a neural network model for chromatin loop prediction based on bagging ensemble learning. Brief Funct Genomics 2023; 22:475-484. [PMID: 37133976 DOI: 10.1093/bfgp/elad015] [Received: 01/16/2023] [Revised: 03/10/2023] [Accepted: 03/29/2023] [Indexed: 05/04/2023]
Abstract
Chromatin loops in the three-dimensional (3D) structure of chromosomes are essential for the regulation of gene expression. Although high-throughput chromatin capture techniques can identify the 3D structure of chromosomes, detecting chromatin loops through biological experiments is arduous and time-consuming, so a computational method is required. Deep neural networks can form complex representations of Hi-C data and make it possible to process biological datasets. We therefore propose a bagging ensemble one-dimensional convolutional neural network (Be-1DCNN) to detect chromatin loops from genome-wide Hi-C maps. First, to obtain accurate and reliable chromatin loops in genome-wide contact maps, the bagging ensemble learning method is used to combine the prediction results of multiple 1DCNN models. Second, each 1DCNN model consists of three 1D convolutional layers for extracting high-dimensional features from input samples and one dense layer for producing the prediction results. Finally, the prediction results of Be-1DCNN are compared to those of existing models. The experimental results indicate that Be-1DCNN predicts high-quality chromatin loops and outperforms the state-of-the-art methods under the same evaluation metrics. The source code of Be-1DCNN is freely available at https://github.com/HaoWuLab-Bioinformatics/Be1DCNN.
Affiliation(s)
- Hao Wu
  - College of Information Engineering, Northwest A&F University, Yangling, 712100 Shaanxi, China
  - School of Software, Shandong University, Jinan, 250101 Shandong, China
- Bing Zhou
  - College of Information Engineering, Northwest A&F University, Yangling, 712100 Shaanxi, China
- Haoru Zhou
  - College of Information Engineering, Northwest A&F University, Yangling, 712100 Shaanxi, China
- Pengyu Zhang
  - College of Information Engineering, Northwest A&F University, Yangling, 712100 Shaanxi, China
- Meili Wang
  - College of Information Engineering, Northwest A&F University, Yangling, 712100 Shaanxi, China
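The bagging step described in the Be-1DCNN abstract (train several models on bootstrap resamples, then combine their predictions) can be illustrated with a minimal sketch. This is not the authors' implementation: the base learner here is a simple threshold rule on a single hypothetical feature, standing in for a 1DCNN, and the function names, the toy data, and the majority-vote rule are all assumptions made for illustration.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


def train_threshold_model(sample):
    """Stand-in base learner (NOT a 1DCNN): returns a decision threshold
    halfway between the mean feature value of each class."""
    pos = [x for x, y in sample if y == 1]
    neg = [x for x, y in sample if y == 0]
    if not pos or not neg:
        # Degenerate resample with one class only: fall back to the overall mean.
        return sum(x for x, _ in sample) / len(sample)
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0


def bagging_fit(data, n_models, seed=0):
    """Train n_models base learners, each on its own bootstrap resample."""
    rng = random.Random(seed)
    return [train_threshold_model(bootstrap_sample(data, rng))
            for _ in range(n_models)]


def bagging_predict(thresholds, x):
    """Combine the base learners by majority vote (1 = loop, 0 = no loop)."""
    votes = Counter(1 if x > t else 0 for t in thresholds)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (feature value, label) pairs.
toy_data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.35, 0),
            (0.7, 1), (0.8, 1), (0.9, 1), (0.95, 1)]
ensemble = bagging_fit(toy_data, n_models=5, seed=0)
print(bagging_predict(ensemble, 0.99))  # -> 1
print(bagging_predict(ensemble, 0.05))  # -> 0
```

An odd number of models avoids tied votes. In a real pipeline each base learner would be a trained network and the combination rule might average predicted probabilities instead of voting.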
2
Gong W, Wee J, Wu MC, Sun X, Li C, Xia K. Persistent spectral simplicial complex-based machine learning for chromosomal structural analysis in cellular differentiation. Brief Bioinform 2022; 23:6583209. [PMID: 35536545 DOI: 10.1093/bib/bbac168] [Received: 02/21/2022] [Revised: 04/12/2022] [Accepted: 03/13/2022] [Indexed: 11/13/2022]
Abstract
The three-dimensional (3D) chromosomal structure plays an essential role in all DNA-templated processes, including gene transcription, DNA replication and other cellular processes. Although chromosome conformation capture (3C) methods such as Hi-C can generate chromosomal contact data that characterize genome-wide chromosomal structural properties, an understanding of the 3D nature of the genome based on Hi-C data is still lacking. Here, we propose a persistent spectral simplicial complex (PerSpectSC) model to describe Hi-C data for the first time. Specifically, a filtration process is introduced to generate a series of nested simplicial complexes at different scales. For each of these simplicial complexes, its spectral information can be calculated from the corresponding Hodge Laplacian matrix. The PerSpectSC model describes the persistence and variation of the spectral information of the nested simplicial complexes during the filtration process. Unlike all previous models, our PerSpectSC-based features provide a quantitative global-scale characterization of chromosome structures and topology. Our descriptors can successfully classify cell types and also cellular differentiation stages for all 24 types of chromosomes simultaneously. In particular, persistent minimum best characterizes cell types and Dim (1) persistent multiplicity best characterizes cellular differentiation. These results demonstrate the great potential of our PerSpectSC-based models in polymeric data analysis.
Affiliation(s)
- Weikang Gong
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China 100124
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
- JunJie Wee
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
- Min-Chun Wu
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
- Xiaohan Sun
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China 100124
- Chunhua Li
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China 100124
- Kelin Xia
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
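The filtration idea in the PerSpectSC abstract can be sketched in its simplest, dimension-0 form. The 0-th Hodge Laplacian of a simplicial complex is the ordinary graph Laplacian, and the multiplicity of its zero eigenvalue equals the number of connected components. The sketch below tracks that multiplicity as a distance threshold grows. It is a hedged illustration only: the distance matrix, the thresholds, and the function names are hypothetical, higher-dimensional Laplacians and non-zero eigenvalues are not covered, and how Hi-C contacts are turned into distances is left open.

```python
def count_components(n, edges):
    """Number of connected components of a graph on n vertices (union-find).
    This equals the multiplicity of eigenvalue 0 of the graph Laplacian."""
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    components = n
    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            components -= 1
    return components


def zero_multiplicity_along_filtration(dist, thresholds):
    """For each threshold t, connect loci i and j when dist[i][j] <= t and
    report the number of connected components of the resulting graph."""
    n = len(dist)
    result = []
    for t in thresholds:
        edges = [(i, j) for i in range(n) for j in range(i + 1, n)
                 if dist[i][j] <= t]
        result.append(count_components(n, edges))
    return result


# Hypothetical symmetric distance matrix for four loci.
toy_dist = [
    [0, 1, 4, 5],
    [1, 0, 2, 4],
    [4, 2, 0, 1],
    [5, 4, 1, 0],
]
print(zero_multiplicity_along_filtration(toy_dist, [0.5, 1, 2, 5]))
# -> [4, 2, 1, 1]
```

Because the graphs are nested as the threshold increases, the component count can only stay the same or decrease, which is the kind of persistence behaviour the filtration is meant to expose.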
3
Nicoletti C. Methods for the Differential Analysis of Hi-C Data. Methods Mol Biol 2022; 2301:61-95. [PMID: 34415531 DOI: 10.1007/978-1-0716-1390-0_4] [Indexed: 06/13/2023]
Abstract
The 3D organization of chromatin within the nucleus enables dynamic regulation and cell type-specific transcription of the genome. This is true at multiple levels of resolution: on a large scale, with chromosomes occupying distinct volumes (chromosome territories); at the level of individual chromatin fibers, which are organized into compartmentalized domains (e.g., Topologically Associating Domains, TADs); and at the level of short-range chromatin interactions between functional elements of the genome (e.g., enhancer-promoter loops). The widespread availability of Chromosome Conformation Capture (3C)-based high-throughput techniques has been instrumental in advancing our knowledge of chromatin nuclear organization. In particular, Hi-C has the potential to achieve the most comprehensive characterization of chromatin 3D interactions, as it is theoretically able to detect any pair of restriction fragments connected as a result of ligation by proximity. This chapter will illustrate how to compare the chromatin interactome in different experimental conditions, starting from pre-computed Hi-C contact matrices, how to visualize the results, and how to correlate the observed variations in chromatin interaction strength with changes in gene expression.
Affiliation(s)
- Chiara Nicoletti
  - Development, Aging and Regeneration Program, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, USA
4
Yang T, He X, An L, Li Q. Methods to Assess the Reproducibility and Similarity of Hi-C Data. Methods Mol Biol 2022; 2301:17-37. [PMID: 34415529 DOI: 10.1007/978-1-0716-1390-0_2] [Indexed: 06/13/2023]
Abstract
Hi-C experiments are costly to perform and involve multiple complex experimental steps. Reproducibility of Hi-C data is essential for ensuring the validity of the scientific conclusions drawn from the data. In this chapter, we describe several recently developed computational methods for assessing reproducibility of Hi-C replicate experiments. These methods can also be used to assess the similarity between any two Hi-C samples.
Affiliation(s)
- Tao Yang
  - Bioinformatics and Genomics Program, Pennsylvania State University, University Park, PA, USA
- Xi He
  - Bioinformatics and Genomics Program, Pennsylvania State University, University Park, PA, USA
- Lin An
  - Bioinformatics and Genomics Program, Pennsylvania State University, University Park, PA, USA
- Qunhua Li
  - Department of Statistics, Pennsylvania State University, University Park, PA, USA
5
Xing H, Wu Y, Zhang MQ, Chen Y. Deciphering hierarchical organization of topologically associated domains through change-point testing. BMC Bioinformatics 2021; 22:183. [PMID: 33838653 PMCID: PMC8037919 DOI: 10.1186/s12859-021-04113-8] [Received: 08/25/2020] [Accepted: 03/30/2021] [Indexed: 12/20/2022]
Abstract
Background: The nucleus of eukaryotic cells spatially packages chromosomes into a hierarchical and distinct segregation that plays critical roles in maintaining transcription regulation. High-throughput methods of chromosome conformation capture, such as Hi-C, have revealed topologically associating domains (TADs) that are defined by biased chromatin interactions within them. Results: We introduce a novel method, HiCKey, to decipher hierarchical TAD structures in Hi-C data and compare them across samples. We first derive a generalized likelihood-ratio (GLR) test for detecting change-points in an interaction matrix that follows a negative binomial distribution or general mixture distribution. We then employ several optimal search strategies to decipher hierarchical TADs with p values calculated by the GLR test. Large-scale validations of simulation data show that HiCKey has good precision in recalling known TADs and is robust against random collisions of chromatin interactions. By applying HiCKey to Hi-C data of seven human cell lines, we identified multiple layers of TAD organization among them, but the vast majority had no more than four layers. In particular, we found that TAD boundaries are significantly enriched in active chromosomal regions compared to repressed regions. Conclusions: HiCKey is optimized for processing large matrices constructed from high-resolution Hi-C experiments. The method and theoretical result of the GLR test provide a general framework for significance testing of similar experimental chromatin interaction data that may not fully follow negative binomial distributions but rather more general mixture distributions. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04113-8.
Affiliation(s)
- Haipeng Xing
  - Department of Applied Mathematics and Statistics, State University of New York at Stony Brook, 100 Nicolls Rd, Stony Brook, NY, 11794, USA
- Yingru Wu
  - Department of Applied Mathematics and Statistics, State University of New York at Stony Brook, 100 Nicolls Rd, Stony Brook, NY, 11794, USA
- Michael Q Zhang
  - Center for System Biology, University of Texas at Dallas, 800 W Campbell Rd, Richardson, TX, 75080, USA
- Yong Chen
  - Department of Molecular and Cellular Biosciences, Rowan University, 201 Mullica Hill Rd, Glassboro, NJ, 08028, USA
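The HiCKey abstract describes a generalized likelihood-ratio (GLR) test for change-points. The general shape of such a statistic can be shown on a one-dimensional sequence of counts. The sketch below swaps in a Poisson model for simplicity, whereas the abstract specifies negative binomial or general mixture distributions, so it should be read as an illustration of the GLR idea rather than HiCKey's test. The function names and the toy counts are hypothetical, and p value calibration is not covered.

```python
import math


def poisson_loglik(counts):
    """Maximized Poisson log-likelihood of one segment, omitting the log(x!)
    terms, which cancel when segment likelihoods are compared."""
    total = sum(counts)
    if total == 0:
        return 0.0
    rate = total / len(counts)  # maximum-likelihood estimate of the rate
    # sum(x * log(rate)) - n * rate, where n * rate == total
    return total * math.log(rate) - total


def glr_statistic(counts, k):
    """Twice the log likelihood ratio of 'rate changes at index k'
    against 'one rate for the whole sequence'."""
    split = poisson_loglik(counts[:k]) + poisson_loglik(counts[k:])
    return 2.0 * (split - poisson_loglik(counts))


def best_change_point(counts):
    """Scan every interior split and return (index, statistic) of the best."""
    best_k = max(range(1, len(counts)), key=lambda k: glr_statistic(counts, k))
    return best_k, glr_statistic(counts, best_k)


# Hypothetical contact counts whose mean shifts after the fourth bin.
toy_counts = [2, 3, 2, 3, 10, 11, 9, 10]
k, stat = best_change_point(toy_counts)
print(k)  # -> 4
```

A large statistic at the best split suggests a boundary. Because the maximum is taken over all candidate splits, its null distribution differs from that of a single fixed split, which is one reason a dedicated theoretical result is needed for significance testing.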
6
Guarnera E, Tan ZW, Berezovsky IN. Three-dimensional chromatin ensemble reconstruction via stochastic embedding. Structure 2021; 29:622-634.e3. [PMID: 33567266 DOI: 10.1016/j.str.2021.01.008] [Received: 06/03/2020] [Revised: 11/17/2020] [Accepted: 01/13/2021] [Indexed: 01/04/2023]
Abstract
We propose a comprehensive method for reconstructing the whole-genome chromatin ensemble from the Hi-C data. The procedure starts from Markov state modeling (MSM), delineating the structural hierarchy of chromatin organization with partitioning and effective interactions archetypal for corresponding levels of hierarchy. The stochastic embedding procedure introduced in this work provides the 3D ensemble reconstruction, using effective interactions obtained by the MSM as the input. As a result, we obtain the structural ensemble of a genome, allowing one to model the functional and the cell-type variability in the chromatin structure. The whole-genome reconstructions performed on the human B lymphoblastoid (GM12878) and lung fibroblast (IMR90) Hi-C data unravel distinctions in their morphologies and in the spatial arrangement of intermingling chromosomal territories, paving the way to studies of chromatin dynamics, developmental changes, and conformational transitions taking place in normal cells and during potential pathological developments.
Affiliation(s)
- Enrico Guarnera
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore 138671, Singapore
- Zhen Wah Tan
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore 138671, Singapore
- Igor N Berezovsky
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore 138671, Singapore
  - Department of Biological Sciences (DBS), National University of Singapore (NUS), 8 Medical Drive, Singapore 117597, Singapore
7
Abstract
Hi-C data is important for studying chromatin three-dimensional structure. However, the resolution of most existing Hi-C data is generally coarse due to sequencing cost. Therefore, it would be helpful to predict high-resolution Hi-C data from low-coverage sequencing data. Here we developed a novel and simple computational method based on deep learning, named super-resolution Hi-C (SRHiC), to enhance the resolution of Hi-C data. We verified SRHiC on Hi-C data from a human cell line. We also evaluated the generalization power of SRHiC by enhancing Hi-C data resolution in other human and mouse cell types. Results showed that SRHiC outperforms the state-of-the-art methods in prediction accuracy.
Affiliation(s)
- Zhilan Li
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
- Zhiming Dai
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
  - Guangdong Province Key Laboratory of Big Data Analysis and Processing, Sun Yat-sen University, Guangzhou, China
8
Abstract
Hi-C is predominantly used to study genome-wide chromatin interactions. In Hi-C experiments, it is believed that biases originating from different systematic deviations lead to extraneous variability among raw samples and affect the reliability of downstream interpretations. As an important step in Hi-C analysis pipelines, normalization seeks to remove the unwanted systematic biases; thus, a comparison between Hi-C normalization methods informs the choice among them and benefits the downstream analysis. In this article, a comprehensive comparison of six Hi-C normalization methods is presented in terms of multiple considerations. The comparison results show that a cross-sample approach significantly outperforms individual-sample methods in most considerations. The differences between these methods are analyzed, some practical recommendations are given, and the results are summarized in a table to facilitate the choice among the six normalization methods. The source code for the implementation of these methods is available at https://github.com/lhqxinghun/bioinformatics/tree/master/Hi-C/NormCompare.
9
Abstract
The three-dimensional organisation of the genome plays a crucial role in developmental gene regulation. In recent years, techniques to investigate this organisation have become more accessible to labs worldwide due to improvements in protocols and decreases in the cost of high-throughput sequencing. However, the resulting datasets are complex and can be challenging to analyse and interpret. Here, we provide a guide to visualisation approaches that can aid the interpretation of such datasets and the communication of biological results.
Affiliation(s)
- Elizabeth Ing-Simmons
  - Max Planck Institute for Molecular Biomedicine, Roentgenstrasse 20, DE-48149 Muenster, Germany
- Juan M Vaquerizas
  - Max Planck Institute for Molecular Biomedicine, Roentgenstrasse 20, DE-48149 Muenster, Germany
10
Abstract
Chromosome conformation capture experiments such as Hi-C are used to map the three-dimensional spatial organization of genomes. One specific feature of the 3D organization is known as topologically associating domains (TADs), which are densely interacting, contiguous chromatin regions playing important roles in regulating gene expression. A few algorithms have been proposed to detect TADs. In particular, the structure of Hi-C data naturally inspires application of community detection methods. However, one of the drawbacks of community detection is that most methods take exchangeability of the nodes in the network for granted, whereas the nodes in this case, that is, the positions on the chromosomes, are not exchangeable. We propose a network model for detecting TADs using Hi-C data that takes into account this nonexchangeability. In addition, our model explicitly makes use of cell-type specific CTCF binding sites as biological covariates and can be used to identify conserved TADs across multiple cell types. The model leads to a likelihood objective that can be efficiently optimized via relaxation. We also prove that when suitably initialized, this model finds the underlying TAD structure with high probability. Using simulated data, we show the advantages of our method and the caveats of popular community detection methods, such as spectral clustering, in this application. Applying our method to real Hi-C data, we demonstrate that the domains identified have desirable epigenetic features and compare them across different cell types.
11
Abstract
Background: Hi-C sequencing offers novel, cost-effective means to study the spatial conformation of chromosomes. We use data obtained from Hi-C experiments to provide new evidence for the existence of spatial gene clusters. These are sets of genes with associated functionality that exhibit close proximity to each other in the spatial conformation of chromosomes across several related species. Results: We present the first gene cluster model capable of handling spatial data. Our model generalizes a popular computational model for gene cluster prediction, called δ-teams, from sequences to graphs. Following previous lines of research, we subsequently extend our model to allow for several vertices being associated with the same label. The model, called δ-teams with families, is particularly suitable for our application as it enables handling of gene duplicates. We develop algorithmic solutions for both models. We implemented the algorithm for discovering δ-teams with families and integrated it into a fully automated workflow for discovering gene clusters in Hi-C data, called GraphTeams. We applied it to human and mouse data to find intra- and interchromosomal gene cluster candidates. The results include intrachromosomal clusters that seem to exhibit a closer proximity in space than on their chromosomal DNA sequence. We further discovered interchromosomal gene clusters that contain genes from different chromosomes within the human genome, but are located on a single chromosome in mouse. Conclusions: By identifying δ-teams with families, we provide a flexible model to discover gene cluster candidates in Hi-C data. Our analysis of Hi-C data from human and mouse reveals several known gene clusters (thus validating our approach), but also a few sparsely studied or possibly unknown gene cluster candidates that could be the source of further experimental investigations.
Affiliation(s)
- Tizian Schulz
  - Faculty of Technology and CeBiTec, Bielefeld University, Universitätsstr. 25, Bielefeld, 33615, Germany
  - International Research Training Group 1906 "Computational Methods for the Analysis of the Diversity and Dynamics of Genomes", Universitätsstr. 25, Bielefeld, 33615, Germany
- Jens Stoye
  - Faculty of Technology and CeBiTec, Bielefeld University, Universitätsstr. 25, Bielefeld, 33615, Germany
- Daniel Doerr
  - Faculty of Technology and CeBiTec, Bielefeld University, Universitätsstr. 25, Bielefeld, 33615, Germany
12
Zhan Y, Giorgetti L, Tiana G. Modelling genome-wide topological associating domains in mouse embryonic stem cells. Chromosome Res 2017; 25:5-14. [PMID: 28108933 DOI: 10.1007/s10577-016-9544-6] [Received: 10/08/2016] [Revised: 12/12/2016] [Accepted: 12/19/2016] [Indexed: 01/21/2023]
Abstract
Chromosome conformation capture (3C)-based techniques such as chromosome conformation capture carbon copy (5C) and Hi-C revealed that the folding of mammalian chromosomes is highly hierarchical. A fundamental structural unit in the hierarchy is represented by topologically associating domains (TADs), sub-megabase regions of the genome within which the chromatin fibre preferentially interacts. 3C-based methods provide the mean contact probabilities between chromosomal loci, averaged over a large number of cells, and do not give immediate access to the single-cell conformations of the chromatin fibre. However, coarse-grained polymer models based on 5C data can be used to extract the single-cell conformations of single TADs. Here, we extend this approach to analyse around 2500 TADs in murine embryonic stem cells based on high-resolution Hi-C data. This allowed us to predict the cell-to-cell variability in single contacts within genome-wide TADs and the correlations between them. Based on these results, we predict that TADs are more similar to ideal chains than to globules in terms of their physical size and three-dimensional shape distribution. Furthermore, we show that the physical size and the degree of structural anisotropy of single TADs are correlated with the level of transcriptional activity of the genes that they harbour. Finally, we show that a large number of multiplets of genomic loci co-localize more often than expected by chance, and these loci are particularly enriched in promoters, enhancers and CTCF-bound sites. These results provide the first genome-wide structural reconstruction of TADs using polymeric models obeying the laws of thermodynamics and reveal important universal trends in the correlation between chromosome structure and transcription.
Affiliation(s)
- Y Zhan
  - Friedrich Miescher Institute for Biomedical Research, CH-4058, Basel, Switzerland
- L Giorgetti
  - Friedrich Miescher Institute for Biomedical Research, CH-4058, Basel, Switzerland
- G Tiana
  - Center for Complexity and Biosystems and Department of Physics, Università degli Studi di Milano and INFN, I-20133, Milan, Italy
13
Park J, Lin S. A random effect model for reconstruction of spatial chromatin structure. Biometrics 2016; 73:52-62. [PMID: 27214023 DOI: 10.1111/biom.12544] [Received: 11/01/2015] [Revised: 03/01/2016] [Accepted: 04/01/2016] [Indexed: 11/28/2022]
Abstract
A gene may be controlled by distal enhancers and repressors, not merely by regulatory elements in its promoter. Spatial organization of chromosomes is the mechanism that brings genes and their distal regulatory elements into close proximity. Recent molecular techniques, coupled with Next Generation Sequencing (NGS) technology, enable genome-wide detection of physical contacts between distant genomic loci. In particular, Hi-C is an NGS-aided assay for the study of genome-wide spatial interactions. The availability of such data makes it possible to reconstruct the underlying three-dimensional (3D) spatial chromatin structure. In this article, we present the Poisson Random effect Architecture Model (PRAM) for such an inference. The main feature of PRAM that separates it from previous methods is that it addresses the issue of over-dispersion and takes correlations among contact counts into consideration, thereby achieving greater consistency with observed data. PRAM was applied to Hi-C data to illustrate its performance and to compare the predicted distances with those measured by a Fluorescence In Situ Hybridization (FISH) validation experiment. Further, PRAM was compared to other methods in the literature based on both real and simulated data.
Affiliation(s)
- Jincheol Park
  - Department of Statistics, Keimyung University, Daegu, South Korea
  - Department of Statistics, The Ohio State University, Columbus, Ohio 43210, U.S.A.
  - Mathematical Biosciences Institute, The Ohio State University, Columbus, Ohio 43210, U.S.A.
- Shili Lin
  - Department of Statistics, The Ohio State University, Columbus, Ohio 43210, U.S.A.
  - Mathematical Biosciences Institute, The Ohio State University, Columbus, Ohio 43210, U.S.A.
14
Xu Z, Zhang G, Duan Q, Chai S, Zhang B, Wu C, Jin F, Yue F, Li Y, Hu M. HiView: an integrative genome browser to leverage Hi-C results for the interpretation of GWAS variants. BMC Res Notes 2016; 9:159. [PMID: 26969411 PMCID: PMC4788823 DOI: 10.1186/s13104-016-1947-0] [Received: 11/02/2015] [Accepted: 02/22/2016] [Indexed: 12/16/2022]
Abstract
BACKGROUND: Genome-wide association studies (GWAS) have identified thousands of genetic variants associated with complex traits and diseases. However, most of them are located in the non-protein coding regions, and therefore it is challenging to hypothesize the functions of these non-coding GWAS variants. Recent large efforts such as the ENCODE and Roadmap Epigenomics projects have predicted a large number of regulatory elements. However, the target genes of these regulatory elements remain largely unknown. Chromatin conformation capture based technologies such as Hi-C can directly measure the chromatin interactions and have generated an increasingly comprehensive catalog of the interactome between the distal regulatory elements and their potential target genes. Leveraging such information revealed by Hi-C holds the promise of elucidating the functions of genetic variants in human diseases. RESULTS: In this work, we present HiView, the first integrative genome browser to leverage Hi-C results for the interpretation of GWAS variants. HiView is able to display Hi-C data and statistical evidence for chromatin interactions in genomic regions surrounding any given GWAS variant, enabling straightforward visualization and interpretation. CONCLUSIONS: We believe that as the first GWAS variants-centered Hi-C genome browser, HiView is a useful tool guiding post-GWAS functional genomics studies. HiView is freely accessible at: http://www.unc.edu/~yunmli/HiView.
Affiliation(s)
- Zheng Xu
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC, 27599, USA
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, 27599, USA
  - Department of Computer Science, University of North Carolina, Chapel Hill, NC, 27599, USA
- Guosheng Zhang
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, 27599, USA
  - Curriculum in Bioinformatics and Computational Biology, University of North Carolina, Chapel Hill, NC, 27599, USA
- Qing Duan
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, 27599, USA
- Shengjie Chai
  - Curriculum in Bioinformatics and Computational Biology, University of North Carolina, Chapel Hill, NC, 27599, USA
- Baqun Zhang
  - School of Statistics, Renmin University of China, Beijing, 100872, China
- Cong Wu
  - College of Veterinary Medicine, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
- Fulai Jin
  - Department of Genetics and Genome Sciences, Case Western Reserve University, Cleveland, OH, 44106, USA
- Feng Yue
  - Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Yun Li
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC, 27599, USA
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, 27599, USA
  - Department of Computer Science, University of North Carolina, Chapel Hill, NC, 27599, USA
- Ming Hu
  - Division of Biostatistics, Department of Population Health, New York University School of Medicine, New York, NY, 10016, USA