1. Tavallaee G, Orouji E. Mapping the 3D genome architecture. Comput Struct Biotechnol J 2024; 27:89-101. [PMID: 39816913 PMCID: PMC11732852 DOI: 10.1016/j.csbj.2024.12.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2024] [Revised: 12/17/2024] [Accepted: 12/20/2024] [Indexed: 01/18/2025] Open
Abstract
The spatial organization of the genome plays a critical role in regulating gene expression, cellular differentiation, and genome stability. This review provides an in-depth examination of the methodologies, computational tools, and frameworks developed to map the three-dimensional (3D) architecture of the genome, focusing on both ligation-based and ligation-free techniques. We also explore the limitations of these methods, including biases introduced by restriction enzyme digestion and ligation inefficiencies, and compare them to more recent ligation-free approaches such as Genome Architecture Mapping (GAM) and Split-Pool Recognition of Interactions by Tag Extension (SPRITE). These techniques offer unique insights into higher-order chromatin structures by bypassing ligation steps, thus enabling the capture of complex multi-way interactions that are often challenging to resolve with traditional methods. Furthermore, we discuss the integration of chromatin interaction data with other genomic layers through multimodal approaches, including recent advances in single-cell technologies like sci-HiC and scSPRITE, which help unravel the heterogeneity of chromatin architecture in development and disease.
Affiliation(s)
- Ghazaleh Tavallaee
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Elias Orouji
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
2. Pérez-de Los Santos FJ, Sotelo-Fonseca JE, Ramírez-Colmenero A, Nützmann HW, Fernandez-Valverde SL, Oktaba K. Plant In Situ Hi-C Experimental Protocol and Bioinformatic Analysis. Methods Mol Biol 2022; 2512:217-247. [PMID: 35818008 DOI: 10.1007/978-1-0716-2429-6_13] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/15/2023]
Abstract
Hi-C enables the characterization of the conformation of the genome in the three-dimensional nuclear space. This technique has revolutionized our ability to detect interactions between linearly distant genomic sites on a genome-wide scale. Here, we detail a protocol to carry out in situ Hi-C in plants and describe a straightforward bioinformatics pipeline for the analysis of such data, in particular for comparing samples from different organs or conditions.
Affiliation(s)
- Francisco J Pérez-de Los Santos
  - Unidad de Genómica Avanzada, Langebio, Centro de Investigación y Estudios Avanzados del Instituto Politécnico Nacional, Irapuato, Guanajuato, Mexico
- Jesús Emiliano Sotelo-Fonseca
  - Unidad de Genómica Avanzada, Langebio, Centro de Investigación y Estudios Avanzados del Instituto Politécnico Nacional, Irapuato, Guanajuato, Mexico
  - Unidad Irapuato, Centro de Investigación y Estudios Avanzados del Instituto Politécnico Nacional, Irapuato, Guanajuato, Mexico
- América Ramírez-Colmenero
  - Unidad de Genómica Avanzada, Langebio, Centro de Investigación y Estudios Avanzados del Instituto Politécnico Nacional, Irapuato, Guanajuato, Mexico
  - Unidad Irapuato, Centro de Investigación y Estudios Avanzados del Instituto Politécnico Nacional, Irapuato, Guanajuato, Mexico
- Hans-Wilhelm Nützmann
  - Department of Biology and Biochemistry, The Milner Centre for Evolution, University of Bath, Bath, UK
- Selene L Fernandez-Valverde
  - Unidad de Genómica Avanzada, Langebio, Centro de Investigación y Estudios Avanzados del Instituto Politécnico Nacional, Irapuato, Guanajuato, Mexico
- Katarzyna Oktaba
  - Unidad Irapuato, Centro de Investigación y Estudios Avanzados del Instituto Politécnico Nacional, Irapuato, Guanajuato, Mexico
3. Galan S, Machnik N, Kruse K, Díaz N, Marti-Renom MA, Vaquerizas JM. CHESS enables quantitative comparison of chromatin contact data and automatic feature extraction. Nat Genet 2020; 52:1247-1255. [PMID: 33077914 PMCID: PMC7610641 DOI: 10.1038/s41588-020-00712-y] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 07/23/2018] [Accepted: 09/04/2020] [Indexed: 12/11/2022]
Abstract
Dynamic changes in the three-dimensional (3D) organization of chromatin are associated with central biological processes, such as transcription, replication and development. Therefore, the comprehensive identification and quantification of these changes is fundamental to understanding evolutionary and regulatory mechanisms. Here, we present Comparison of Hi-C Experiments using Structural Similarity (CHESS), an algorithm for the comparison of chromatin contact maps and automatic differential feature extraction. We demonstrate the robustness of CHESS to experimental variability and showcase its biological applications on (1) interspecies comparisons of syntenic regions in human and mouse models; (2) intraspecies identification of conformational changes in Zelda-depleted Drosophila embryos; (3) patient-specific aberrant chromatin conformation in a diffuse large B-cell lymphoma sample; and (4) the systematic identification of chromatin contact differences in high-resolution Capture-C data. In summary, CHESS is a computationally efficient method for the comparison and classification of changes in chromatin contact data.
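Illustrative sketch (not taken from the paper): the abstract above names structural similarity as the basis for comparing contact maps. The snippet below computes a plain single-window structural similarity index between two small contact matrices to show the general idea. The function name `global_ssim`, the toy matrices, and the choice to take the dynamic range from the data are assumptions made for this example; it should not be read as the CHESS implementation.

```python
def global_ssim(map_a, map_b):
    """Single-window structural similarity between two equally sized contact maps."""
    x = [v for row in map_a for v in row]
    y = [v for row in map_b for v in row]
    if len(x) != len(y) or not x:
        raise ValueError("contact maps must be non-empty and the same size")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((v - mean_x) ** 2 for v in x) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Stabilising constants from the standard SSIM definition. Taking the
    # dynamic range from the data is an assumption of this sketch.
    dyn_range = (max(x + y) - min(x + y)) or 1.0
    c1 = (0.01 * dyn_range) ** 2
    c2 = (0.03 * dyn_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Toy 3x3 contact maps (hypothetical counts).
reference = [[10, 4, 1], [4, 10, 2], [1, 2, 10]]
rearranged = [[10, 1, 4], [1, 10, 2], [4, 2, 10]]
print(global_ssim(reference, reference))
print(global_ssim(reference, rearranged))
```

A map compared with itself scores 1, and any difference in contact pattern lowers the score, which is what makes the index usable as a similarity measure between regions.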
Affiliation(s)
- Silvia Galan
  - Max Planck Institute for Molecular Biomedicine, Münster, Germany
  - National Centre for Genomic Analysis, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain
- Nick Machnik
  - Max Planck Institute for Molecular Biomedicine, Münster, Germany
  - Institute of Science and Technology Austria, Klosterneuburg, Austria
- Kai Kruse
  - Max Planck Institute for Molecular Biomedicine, Münster, Germany
- Noelia Díaz
  - Max Planck Institute for Molecular Biomedicine, Münster, Germany
- Marc A Marti-Renom
  - National Centre for Genomic Analysis, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain
  - Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain
  - Pompeu Fabra University, Barcelona, Spain
  - Catalan Institution for Research and Advanced Studies, Barcelona, Spain
- Juan M Vaquerizas
  - Max Planck Institute for Molecular Biomedicine, Münster, Germany
  - Medical Research Council London Institute of Medical Sciences, Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, UK
4. Bulathsinghalage C, Liu L. Network-based method for regions with statistically frequent interchromosomal interactions at single-cell resolution. BMC Bioinformatics 2020; 21:369. [PMID: 32998686 PMCID: PMC7526258 DOI: 10.1186/s12859-020-03689-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/02/2023] Open
Abstract
BACKGROUND: Chromosome conformation capture-based methods, especially Hi-C, enable scientists to detect genome-wide chromatin interactions and study the spatial organization of chromatin, which plays important roles in gene expression regulation, DNA replication and repair, etc. Thus, developing computational methods to unravel patterns behind the data becomes critical. Existing computational methods focus on intrachromosomal interactions and ignore interchromosomal interactions partly because there is no prior knowledge for interchromosomal interactions and the frequency of interchromosomal interactions is much lower while the search space is much larger. With the development of single-cell technologies, the advent of single-cell Hi-C makes interrogating the spatial structure of chromatin at single-cell resolution possible. It also brings a new type of frequency information, the number of single cells with chromatin interactions between two disjoint chromosome regions. RESULTS: Considering the lack of computational methods on interchromosomal interactions and the unsurprisingly frequent intrachromosomal interactions along the diagonal of a chromatin contact map, we propose a computational method dedicated to analyzing interchromosomal interactions of single-cell Hi-C with this new frequency information. To the best of our knowledge, our proposed tool is the first to identify regions with statistically frequent interchromosomal interactions at single-cell resolution. We demonstrate that the tool utilizing networks and binomial statistical tests can identify interesting structural regions through visualization, comparison and enrichment analysis and it also supports different configurations to provide users with flexibility. CONCLUSIONS: It will be a useful tool for analyzing single-cell Hi-C interchromosomal interactions.
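Illustrative sketch (not taken from the paper): the abstract above describes a frequency defined as the number of single cells in which two regions on different chromosomes are in contact, assessed with binomial tests. The snippet below shows one way such a test could look. The function names, the region labels, the background probability, and the significance threshold are all hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


def frequent_pairs(cell_counts, n_cells, background_p, alpha=0.05):
    """Keep region pairs seen in more cells than the background rate explains.

    cell_counts maps a (region_a, region_b) pair to the number of single
    cells in which that interchromosomal contact was observed.
    """
    hits = {}
    for pair, k in cell_counts.items():
        p_value = binomial_upper_tail(k, n_cells, background_p)
        if p_value < alpha:
            hits[pair] = p_value
    return hits


# Hypothetical counts across 10 cells with an assumed 10% background rate.
counts = {("chr1:0", "chr2:5"): 9, ("chr1:1", "chr3:2"): 1}
print(frequent_pairs(counts, n_cells=10, background_p=0.1))
```

Counting cells rather than reads means each cell contributes at most once per region pair, so a pair seen in 9 of 10 cells stands out even when the per-cell read count is low.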
Affiliation(s)
- Lu Liu
  - North Dakota State University, 1340 Administration Ave, Fargo, 58102, USA
5. Fernandez LR, Gilgenast TG, Phillips-Cremins JE. 3DeFDR: statistical methods for identifying cell type-specific looping interactions in 5C and Hi-C data. Genome Biol 2020; 21:219. [PMID: 32859248 PMCID: PMC7496221 DOI: 10.1186/s13059-020-02061-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 01/03/2019] [Accepted: 05/27/2020] [Indexed: 11/18/2022] Open
Abstract
An important unanswered question in chromatin biology is the extent to which long-range looping interactions change across developmental models, genetic perturbations, drug treatments, and disease states. Computational tools for rigorous assessment of cell type-specific loops across multiple biological conditions are needed. We present 3DeFDR, a simple and effective statistical tool for classifying dynamic loops across biological conditions from Chromosome-Conformation-Capture-Carbon-Copy (5C) and Hi-C data. Our work provides a statistical framework and open-source coding libraries for sensitive detection of cell type-specific loops in high-resolution 5C and Hi-C data from multiple cellular conditions.
Affiliation(s)
- Lindsey R Fernandez
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Thomas G Gilgenast
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Jennifer E Phillips-Cremins
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
6. Stansfield JC, Cresswell KG, Dozmorov MG. multiHiCcompare: joint normalization and comparative analysis of complex Hi-C experiments. Bioinformatics 2020; 35:2916-2923. [PMID: 30668639 DOI: 10.1093/bioinformatics/btz048] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 09/19/2018] [Revised: 12/14/2018] [Accepted: 01/17/2019] [Indexed: 12/17/2022] Open
Abstract
MOTIVATION: With the development of chromatin conformation capture technology and its high-throughput derivative Hi-C sequencing, studies of the three-dimensional interactome of the genome that involve multiple Hi-C datasets are becoming available. To account for the technology-driven biases unique to each dataset, there is a distinct need for methods to jointly normalize multiple Hi-C datasets. Previous attempts at removing biases from Hi-C data have made use of techniques which normalize individual Hi-C datasets, or, at best, jointly normalize two datasets. RESULTS: Here, we present multiHiCcompare, a cyclic loess regression-based joint normalization technique for removing biases across multiple Hi-C datasets. In contrast to other normalization techniques, it properly handles the Hi-C-specific decay of chromatin interaction frequencies with the increasing distance between interacting regions. multiHiCcompare uses the general linear model framework for comparative analysis of multiple Hi-C datasets, adapted for the Hi-C-specific decay of chromatin interaction frequencies. multiHiCcompare outperforms other methods when detecting a priori known chromatin interaction differences from jointly normalized datasets. Applied to the analysis of auxin-treated versus untreated experiments, and CTCF depletion experiments, multiHiCcompare was able to recover the expected epigenetic and gene expression signatures of loss of chromatin interactions and reveal novel insights. AVAILABILITY AND IMPLEMENTATION: multiHiCcompare is freely available on GitHub and as a Bioconductor R package https://bioconductor.org/packages/multiHiCcompare. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- John C Stansfield
  - Department of Biostatistics, Virginia Commonwealth University, Richmond, VA, USA
- Kellen G Cresswell
  - Department of Biostatistics, Virginia Commonwealth University, Richmond, VA, USA
- Mikhail G Dozmorov
  - Department of Biostatistics, Virginia Commonwealth University, Richmond, VA, USA
7. Cook KB, Hristov BH, Le Roch KG, Vert JP, Noble WS. Measuring significant changes in chromatin conformation with ACCOST. Nucleic Acids Res 2020; 48:2303-2311. [PMID: 32034421 PMCID: PMC7049724 DOI: 10.1093/nar/gkaa069] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 08/06/2019] [Revised: 01/17/2020] [Accepted: 02/03/2020] [Indexed: 12/17/2022] Open
Abstract
Chromatin conformation assays such as Hi-C cannot directly measure differences in 3D architecture between cell types or cell states. For this purpose, two or more Hi-C experiments must be carried out, but direct comparison of the resulting Hi-C matrices is confounded by several features of Hi-C data. Most notably, the genomic distance effect, whereby contacts between pairs of genomic loci that are proximal along the chromosome exhibit many more Hi-C contacts than distal pairs of loci, dominates every Hi-C matrix. Furthermore, the form that this distance effect takes often varies between different Hi-C experiments, even between replicate experiments. Thus, a statistical confidence measure designed to identify differential Hi-C contacts must accurately account for the genomic distance effect or risk being misled by large-scale but artifactual differences. ACCOST (Altered Chromatin COnformation STatistics) accomplishes this goal by extending the statistical model employed by DESeq, re-purposing the ‘size factors,’ which were originally developed to account for differences in read depth between samples, to instead model the genomic distance effect. We show via analysis of simulated and real data that ACCOST provides unbiased statistical confidence estimates that compare favorably with competing methods such as diffHiC, FIND and HiCcompare. ACCOST is freely available with an Apache license at https://bitbucket.org/noblelab/accost.
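Illustrative sketch (not taken from the paper): the genomic distance effect described above can be made concrete by averaging a contact matrix along each diagonal and dividing observed counts by that distance-specific expectation. The snippet below does this for a toy matrix. The function names and the counts are hypothetical, and this per-diagonal average is a common simplification rather than the ACCOST statistical model.

```python
def distance_expected(matrix):
    """Mean contact count at each bin separation d (the d-th diagonal)."""
    n = len(matrix)
    expected = []
    for d in range(n):
        values = [matrix[i][i + d] for i in range(n - d)]
        expected.append(sum(values) / len(values))
    return expected


def observed_over_expected(matrix):
    """Divide each entry by the mean of its diagonal to remove distance decay."""
    expected = distance_expected(matrix)
    n = len(matrix)
    result = []
    for i in range(n):
        row = []
        for j in range(n):
            e = expected[abs(i - j)]
            row.append(matrix[i][j] / e if e > 0 else 0.0)
        result.append(row)
    return result


# Toy symmetric contact matrix (hypothetical counts): contacts fall off
# quickly as bins move away from the diagonal.
contacts = [[10, 4, 1], [4, 10, 2], [1, 2, 10]]
print(distance_expected(contacts))
print(observed_over_expected(contacts))
```

After this correction, a value above 1 marks a pair of bins that interact more than is typical for loci at that separation, which is the quantity a differential test needs to compare between experiments.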
Affiliation(s)
- Kate B Cook
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195-5065, USA
- Borislav H Hristov
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195-5065, USA
- Karine G Le Roch
  - Department of Cell Biology, University of California, Riverside, CA 92521, USA
- Jean Philippe Vert
  - Google Brain, Paris, 75009, France
  - Centre for Computational Biology, MINES ParisTech, PSL University, Paris, 75009, France
- William Stafford Noble
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195-5065, USA
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98195-2355, USA
8. de Anda-Jáuregui G, Hernández-Lemus E. Computational Oncology in the Multi-Omics Era: State of the Art. Front Oncol 2020; 10:423. [PMID: 32318338 PMCID: PMC7154096 DOI: 10.3389/fonc.2020.00423] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 12/01/2019] [Accepted: 03/10/2020] [Indexed: 12/24/2022] Open
Abstract
Cancer is the quintessential complex disease. As technologies evolve faster each day, we are able to quantify the different layers of biological elements that contribute to the emergence and development of malignancies. In this multi-omics context, the use of integrative approaches is mandatory in order to gain further insights into oncological phenomena, and to move forward toward the precision medicine paradigm. In this review, we will focus on computational oncology as an integrative discipline that incorporates knowledge from the mathematical, physical, and computational fields to further the biomedical understanding of cancer. We will discuss the current roles of computation in oncology in the context of multi-omic technologies, which include: data acquisition and processing; data management in the clinical and research settings; classification, diagnosis, and prognosis; and the development of models in the research setting, including their use for therapeutic target identification. We will discuss the machine learning and network approaches as two of the most promising emerging paradigms in computational oncology. These approaches provide a foundation on how to integrate different layers of biological description into coherent frameworks that allow advances both in the basic and clinical settings.
Affiliation(s)
- Guillermo de Anda-Jáuregui
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
  - Cátedras Conacyt Para Jóvenes Investigadores, National Council on Science and Technology, Mexico City, Mexico
- Enrique Hernández-Lemus
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
  - Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
9. Tang B, Li F, Li J, Zhao W, Zhang Z. Delta: a new web-based 3D genome visualization and analysis platform. Bioinformatics 2019; 34:1409-1410. [PMID: 29253110 DOI: 10.1093/bioinformatics/btx805] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 08/02/2017] [Accepted: 12/14/2017] [Indexed: 11/13/2022] Open
Abstract
Summary: Delta is an integrative visualization and analysis platform to facilitate visually annotating and exploring the 3D physical architecture of genomes. Delta takes Hi-C or ChIA-PET contact matrix as input and predicts the topologically associating domains and chromatin loops in the genome. It then generates a physical 3D model which represents the plausible consensus 3D structure of the genome. Delta features a highly interactive visualization tool which enhances the integration of genome topology/physical structure with extensive genome annotation by juxtaposing the 3D model with diverse genomic assay outputs. Finally, by visually comparing the 3D model of the β-globin gene locus and its annotation, we speculated a plausible transitory interaction pattern in the locus. Experimental evidence was found to support this speculation by literature survey. This served as an example of intuitive hypothesis testing with the help of Delta. Availability and implementation: Delta is freely accessible from http://delta.big.ac.cn, and the source code is available at https://github.com/zhangzhwlab/delta. Contact: zhangzhihua@big.ac.cn. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Bixia Tang
  - CAS Key Laboratory of Genome Sciences and Information, Chinese Academy of Sciences, Beijing 101300, China
  - BIG Data Center (BIGD), Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 101300, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Feifei Li
  - CAS Key Laboratory of Genome Sciences and Information, Chinese Academy of Sciences, Beijing 101300, China
- Jing Li
  - CAS Key Laboratory of Genome Sciences and Information, Chinese Academy of Sciences, Beijing 101300, China
- Wenming Zhao
  - BIG Data Center (BIGD), Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 101300, China
- Zhihua Zhang
  - CAS Key Laboratory of Genome Sciences and Information, Chinese Academy of Sciences, Beijing 101300, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
10. Durand NC, Robinson JT, Shamim MS, Machol I, Mesirov JP, Lander ES, Aiden EL. Juicebox Provides a Visualization System for Hi-C Contact Maps with Unlimited Zoom. Cell Syst 2019; 3:99-101. [PMID: 27467250 DOI: 10.1016/j.cels.2015.07.012] [Citation(s) in RCA: 1238] [Impact Index Per Article: 206.3] [Received: 04/21/2015] [Revised: 07/23/2015] [Accepted: 07/29/2015] [Indexed: 10/21/2022]
Abstract
Hi-C experiments study how genomes fold in 3D, generating contact maps containing features as small as 20 bp and as large as 200 Mb. Here we introduce Juicebox, a tool for exploring Hi-C and other contact map data. Juicebox allows users to zoom in and out of Hi-C maps interactively, just as a user of Google Earth might zoom in and out of a geographic map. Maps can be compared to one another, or to 1D tracks or 2D feature sets.
Affiliation(s)
- Neva C Durand
  - The Center for Genome Architecture, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Computer Science, Department of Computational and Applied Mathematics, Rice University, Houston, TX 77005, USA
  - Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- James T Robinson
  - The Center for Genome Architecture, Baylor College of Medicine, Houston, TX 77030, USA
  - Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Muhammad S Shamim
  - The Center for Genome Architecture, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Computer Science, Department of Computational and Applied Mathematics, Rice University, Houston, TX 77005, USA
- Ido Machol
  - The Center for Genome Architecture, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Computer Science, Department of Computational and Applied Mathematics, Rice University, Houston, TX 77005, USA
- Jill P Mesirov
  - Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
- Eric S Lander
  - Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
  - Department of Biology, MIT, Cambridge, MA 02139, USA
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Erez Lieberman Aiden
  - The Center for Genome Architecture, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Computer Science, Department of Computational and Applied Mathematics, Rice University, Houston, TX 77005, USA
  - Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
11. Dozmorov MG. Epigenomic annotation-based interpretation of genomic data: from enrichment analysis to machine learning. Bioinformatics 2018; 33:3323-3330. [PMID: 29028263 DOI: 10.1093/bioinformatics/btx414] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 01/05/2017] [Accepted: 06/22/2017] [Indexed: 12/12/2022] Open
Abstract
Motivation: One of the goals of functional genomics is to understand the regulatory implications of experimentally obtained genomic regions of interest (ROIs). Most sequencing technologies now generate ROIs distributed across the whole genome. The interpretation of these genome-wide ROIs represents a challenge as the majority of them lie outside of functionally well-defined protein coding regions. Recent efforts by the members of the International Human Epigenome Consortium have generated volumes of functional/regulatory data (reference epigenomic datasets), effectively annotating the genome with epigenomic properties. Consequently, a wide variety of computational tools has been developed utilizing these epigenomic datasets for the interpretation of genomic data. Results: The purpose of this review is to provide a structured overview of practical solutions for the interpretation of ROIs with the help of epigenomic data. Starting with epigenomic enrichment analysis, we discuss leading tools and machine learning methods utilizing epigenomic and 3D genome structure data. The hierarchy of tools and methods reviewed here presents a practical guide for the interpretation of genome-wide ROIs within an epigenomic context. Contact: mikhail.dozmorov@vcuhealth.org. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mikhail G Dozmorov
  - Department of Biostatistics, Virginia Commonwealth University, Richmond, VA 23298, USA
12. Li R, Liu Y, Hou Y, Gan J, Wu P, Li C. 3D genome and its disorganization in diseases. Cell Biol Toxicol 2018; 34:351-365. [DOI: 10.1007/s10565-018-9430-4] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 12/27/2017] [Accepted: 03/26/2018] [Indexed: 01/25/2023]
13. Waldispühl J, Zhang E, Butyaev A, Nazarova E, Cyr Y. Storage, visualization, and navigation of 3D genomics data. Methods 2018; 142:74-80. [PMID: 29792917 DOI: 10.1016/j.ymeth.2018.05.008] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 11/15/2017] [Revised: 05/07/2018] [Accepted: 05/09/2018] [Indexed: 01/27/2023] Open
Abstract
The field of 3D genomics has grown at an increasing rate over the last decade. The volume and complexity of the 2D and 3D data produced are progressively outpacing the capacities of the technology previously used for distributing genome sequences. The emergence of new technologies also provides novel opportunities for the development of innovative approaches. In this paper, we review the state-of-the-art computing technology, as well as the solutions adopted by the platforms currently available.
Affiliation(s)
- Eric Zhang
  - School of Computer Science, McGill University, Montréal, Canada
- Elena Nazarova
  - School of Computer Science, McGill University, Montréal, Canada
- Yan Cyr
  - Beam Me Up Labs, Montréal, Canada
14. Djekidel MN, Chen Y, Zhang MQ. FIND: difFerential chromatin INteractions Detection using a spatial Poisson process. Genome Res 2018; 28:412-422. [PMID: 29440282 PMCID: PMC5848619 DOI: 10.1101/gr.212241.116] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Received: 07/04/2016] [Accepted: 01/08/2018] [Indexed: 12/11/2022]
Abstract
Polymer-based simulations and experimental studies indicate the existence of a spatial dependency between the adjacent DNA fibers involved in the formation of chromatin loops. However, the existing strategies for detecting differential chromatin interactions assume that the interacting segments are spatially independent from the other segments nearby. To resolve this issue, we developed a new computational method, FIND, which considers the local spatial dependency between interacting loci. FIND uses a spatial Poisson process to detect differential chromatin interactions that show a significant difference in their interaction frequency and the interaction frequency of their neighbors. Simulation and biological data analysis show that FIND outperforms the widely used count-based methods and has a better signal-to-noise ratio.
Affiliation(s)
- Mohamed Nadhir Djekidel
  - MOE Key Laboratory of Bioinformatics and Bioinformatics Division, Center for Synthetic and System Biology, TNLIST/Department of Automation, Tsinghua University, Beijing 100084, China
- Yang Chen
  - MOE Key Laboratory of Bioinformatics and Bioinformatics Division, Center for Synthetic and System Biology, TNLIST/Department of Automation, Tsinghua University, Beijing 100084, China
- Michael Q Zhang
  - MOE Key Laboratory of Bioinformatics and Bioinformatics Division, Center for Synthetic and System Biology, TNLIST/Department of Automation, Tsinghua University, Beijing 100084, China
  - Department of Molecular and Cell Biology, Center for Systems Biology, The University of Texas, Dallas, Richardson, Texas 75080-3021, USA
15. Yu S, Lemos B. The long-range interaction map of ribosomal DNA arrays. PLoS Genet 2018; 14:e1007258. [PMID: 29570716 PMCID: PMC5865718 DOI: 10.1371/journal.pgen.1007258] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 10/07/2017] [Accepted: 02/15/2018] [Indexed: 11/28/2022] Open
Abstract
The repeated rDNA array gives rise to the nucleolus, an organelle that is central to cellular processes as varied as stress response, cell cycle regulation, RNA modification, cell metabolism, and genome stability. The rDNA array is also responsible for the production of more than 70% of all cellular RNAs (the ribosomal RNAs). The rRNAs are produced from two sets of loci: the 5S rDNA array resides exclusively on human chromosome 1 while the 45S rDNA arrays reside on the short arm of five human acrocentric chromosomes. These critical genome elements have remained unassembled and have been excluded from all Hi-C analyses to date. Here we built the first high-resolution map of 5S and 45S rDNA array contacts with the rest of the genome, combining over 15 billion Hi-C reads from several experiments. The data enabled sufficiently high coverage to map rDNA-genome interactions with 1 Mb resolution and identify rDNA-gene contacts. The map showed that the 5S and 45S arrays display preferential contact at common sites along the genome but are not themselves sufficiently close to yield 5S-45S Hi-C contacts. Ribosomal DNA contacts are enriched in segments of closed, repressed, and late replicating chromatin, as well as CTCF binding sites. Finally, we identified functional categories whose dispersed genes coalesced in proximity to the rDNA arrays or instead avoided proximity with the rDNA arrays. The observations further our understanding of the spatial localization of rDNA arrays and their contribution to the architecture of the cell nucleus.
Affiliation(s)
- Shoukai Yu
  - Program in Molecular and Integrative Physiological Sciences & Department of Environmental Health, Harvard T. H. Chan School of Public Health, Boston, MA, United States of America
- Bernardo Lemos
  - Program in Molecular and Integrative Physiological Sciences & Department of Environmental Health, Harvard T. H. Chan School of Public Health, Boston, MA, United States of America
|
16
|
Jamge S, Stam M, Angenent GC, Immink RGH. A cautionary note on the use of chromosome conformation capture in plants. PLANT METHODS 2017; 13:101. [PMID: 29177001 PMCID: PMC5691870 DOI: 10.1186/s13007-017-0251-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2017] [Accepted: 11/08/2017] [Indexed: 06/07/2023]
Abstract
BACKGROUND: The chromosome conformation capture (3C) technique is a method to study chromatin interactions at specific genomic loci. Initially established for yeast, the 3C technique has been adapted to plants in recent years in order to study chromatin interactions and their role in transcriptional gene regulation. As the plant scientific community continues to implement this technology, a discussion on critical controls, validation steps and interpretation of 3C data is essential to fully benefit from 3C in plants. RESULTS: Here we assess the reliability and robustness of the 3C technique for the detection of chromatin interactions in Arabidopsis. As a case study, we applied this methodology to the genomic locus of a floral integrator gene, SUPPRESSOR OF OVEREXPRESSION OF CONSTANS1 (SOC1), and demonstrate the need for several controls and standard validation steps to allow a meaningful interpretation of 3C data. The intricacies of this promising but challenging technique are discussed in depth. CONCLUSIONS: The 3C technique offers an interesting opportunity to study chromatin interactions at a resolution infeasible by microscopy. However, for interpretation of 3C interaction data and identification of true interactions, 3C technology demands a stringent experimental setup and extreme caution.
Affiliation(s)
- Suraj Jamge
  - Laboratory of Molecular Biology, Wageningen University & Research, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands
- Maike Stam
  - Swammerdam Institute for Life Sciences, Universiteit van Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands
- Gerco C. Angenent
  - Laboratory of Molecular Biology, Wageningen University & Research, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands
  - Wageningen Plant Research, Bioscience, Wageningen University & Research, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands
- Richard G. H. Immink
  - Laboratory of Molecular Biology, Wageningen University & Research, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands
  - Wageningen Plant Research, Bioscience, Wageningen University & Research, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands
|
17
|
Orlov YL, Thierry O, Bogomolov AG, Tsukanov AV, Kulakova EV, Galieva ER, Bragin AO, Li G. [Computer methods of analysis of chromosome contacts in the cell nucleus based on sequencing technology data]. BIOMEDIT︠S︡INSKAI︠A︡ KHIMII︠A︡ 2017; 63:418-422. [PMID: 29080874 DOI: 10.18097/pbmc20176305418] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/23/2022]
Abstract
The study of spatial chromosome structure and chromosome folding in the interphase cell nucleus is an important scientific challenge. Eukaryotic genome regions that physically interact with each other can be detected with modern sequencing technologies. Hi-C is a basic method for studying chromosome folding by sequencing all contacting DNA fragments. Long-range chromosomal interactions play an important role in gene transcription and regulation. Studying chromosome interactions, three-dimensional (3D) genome structure, and their effect on gene transcription reveals fundamental biological processes from the viewpoint of structural regulation and is important for cancer research. Chromatin immunoprecipitation followed by sequencing (ChIP-seq) makes it possible to determine the binding sites of transcription factors that regulate the expression of eukaryotic genes; genome-wide transcription factor binding maps have been constructed. The ChIA-PET technology allows exploring not only target protein binding sites but also pairs of such sites on chromosome regions that lie close together and interact in the three-dimensional space of the cell nucleus. Here we discuss the principles of constructing genomic maps and matrices of chromosome contacts from ChIA-PET and Hi-C data that capture chromosome conformation, and we give an overview of existing software for 3D genome analysis, including in-house programs for analyzing gene location in topological domains.
Affiliation(s)
- Y L Orlov
  - Novosibirsk State University, Novosibirsk, Russia; Marine Biology Research Institute, Sevastopol, Russia
- O Thierry
  - Novosibirsk State University, Novosibirsk, Russia; University of Bordeaux, Bordeaux, France
- A G Bogomolov
  - Novosibirsk State University, Novosibirsk, Russia; Institute of Cytology and Genetics, Novosibirsk, Russia
- A V Tsukanov
  - Novosibirsk State University, Novosibirsk, Russia
- E V Kulakova
  - Novosibirsk State University, Novosibirsk, Russia
- E R Galieva
  - Novosibirsk State University, Novosibirsk, Russia
- A O Bragin
  - Institute of Cytology and Genetics, Novosibirsk, Russia
- G Li
  - Huazhong Agricultural University, Wuhan, Hubei, China
|
18
|
Liu L, Ruan J. Utilizing networks for differential analysis of chromatin interactions. J Bioinform Comput Biol 2017; 15:1740008. [PMID: 29113562 DOI: 10.1142/s021972001740008x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/18/2022]
Abstract
Chromatin conformation capture with high-throughput sequencing (Hi-C) is a powerful technique to detect genome-wide chromatin interactions. In this paper, we introduce two novel approaches to detect differentially interacting genomic regions between two Hi-C experiments using a network model. To make input data from multiple experiments comparable, we propose a normalization strategy guided by network topological properties. We then devise two measurements, using local and global connectivity information from the chromatin interaction networks, respectively, to assess the interaction differences between two experiments. When multiple replicates are present in experiments, our approaches provide the flexibility for users to either pool all replicates together to therefore increase the network coverage, or to use the replicates in parallel to increase the signal to noise ratio. We show that while the local method works better in detecting changes from simulated networks, the global method performs better on real Hi-C data. The local and global methods, regardless of pooling, are always superior to two existing methods. Furthermore, our methods work well on both unweighted and weighted networks and our normalization strategy significantly improves the performance compared with raw networks without normalization. Therefore, we believe our methods will be useful for identifying differentially interacting genomic regions.
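As a rough illustration of what a "local" connectivity comparison between two chromatin interaction networks can look like, the Python sketch below scores each region by how much its set of direct neighbors differs between two experiments. The Jaccard-based score, the function names, and the toy edge lists are assumptions made for this example; they are not the measures or normalization defined in the paper.

```python
def neighbors(edges):
    """Build an undirected adjacency map from (region_a, region_b) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def local_difference(edges_1, edges_2):
    """Score each region by 1 - Jaccard similarity of its neighbor sets.

    0.0 means identical neighbors in both networks; 1.0 means no shared
    neighbors. This is an illustrative score, not the published method.
    """
    adj_1, adj_2 = neighbors(edges_1), neighbors(edges_2)
    scores = {}
    for node in set(adj_1) | set(adj_2):
        n1 = adj_1.get(node, set())
        n2 = adj_2.get(node, set())
        # Every node here has at least one neighbor in one network,
        # so the union is never empty.
        union = n1 | n2
        scores[node] = 1.0 - len(n1 & n2) / len(union)
    return scores


# Hypothetical unweighted interaction networks from two experiments.
experiment_1 = [("A", "B"), ("A", "C"), ("B", "C")]
experiment_2 = [("A", "B"), ("A", "D"), ("B", "C")]
scores = local_difference(experiment_1, experiment_2)
```

In this toy case region B keeps the same neighbors in both networks, while region D appears only in the second network and so receives the maximum score.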
Affiliation(s)
- Lu Liu
  - College of Information Technology and Engineering, Marshall University, One John Marshall Drive, Huntington, WV 25755, USA
- Jianhua Ruan
  - Department of Computer Science, The University of Texas at San Antonio, One UTSA Circle, San Antonio, Texas 78249, USA
|
19
|
Kumar R, Sobhy H, Stenberg P, Lizana L. Genome contact map explorer: a platform for the comparison, interactive visualization and analysis of genome contact maps. Nucleic Acids Res 2017; 45:e152. [PMID: 28973466 PMCID: PMC5622372 DOI: 10.1093/nar/gkx644] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 12/25/2016] [Accepted: 07/19/2017] [Indexed: 12/23/2022]
Abstract
Hi-C experiments generate data in the form of large genome contact maps (Hi-C maps). These show that chromosomes are arranged in a hierarchy of three-dimensional compartments. To understand how these compartments form and how much they affect genetic processes such as gene regulation, biologists and bioinformaticians need efficient tools to visualize and analyze Hi-C data. This is technically challenging because the maps are large. In this paper, we address this problem, partly by implementing an efficient file format, and present the genome contact map explorer platform. Apart from tools to process Hi-C data, such as normalization methods and a programmable interface, we built a graphical interface that lets users browse, scroll and zoom Hi-C maps to visually search for patterns in the Hi-C data. The software can also browse several maps simultaneously and plot related genomic data. It is openly accessible to the scientific community.
Affiliation(s)
- Rajendra Kumar
  - Integrated Science Lab, Umeå University, 901 87, Umeå, Sweden
  - Department of Physics, Umeå University, 901 87, Umeå, Sweden
- Haitham Sobhy
  - Department of Molecular Biology, Umeå University, 901 87, Umeå, Sweden
- Per Stenberg
  - Department of Molecular Biology, Umeå University, 901 87, Umeå, Sweden
  - Division of CBRN Security and Defence, FOI-Swedish Defence Research Agency, 906 21, Umeå, Sweden
- Ludvig Lizana
  - Integrated Science Lab, Umeå University, 901 87, Umeå, Sweden
  - Department of Physics, Umeå University, 901 87, Umeå, Sweden
|
20
|
Grob S, Grossniklaus U. Chromosome conformation capture-based studies reveal novel features of plant nuclear architecture. CURRENT OPINION IN PLANT BIOLOGY 2017; 36:149-157. [PMID: 28411415 DOI: 10.1016/j.pbi.2017.03.004] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 02/02/2017] [Revised: 03/13/2017] [Accepted: 03/14/2017] [Indexed: 06/07/2023]
Abstract
Nuclear genome organization has recently received increasing attention due to its manifold functions in basic nuclear processes, such as replication, transcription, and the maintenance of genome integrity. Technologies based on chromosome conformation capture, such as Hi-C, now make it possible to study the three-dimensional organization of the genome at unprecedented resolution, shedding light on a previously unexplored level of nuclear architecture. In plants, research in this field is still in its infancy, but a number of publications have provided first insights into basic principles of nuclear genome organization and the factors that influence it. Apart from general aspects, newly discovered three-dimensional conformations, such as the KNOT, raise particular interest in how nuclear organization may influence the function of the genome in previously unexpected ways.
Affiliation(s)
- Stefan Grob
  - Department of Plant and Microbial Biology & Zurich-Basel Plant Science Center, University of Zurich, 8008 Zurich, Switzerland
- Ueli Grossniklaus
  - Department of Plant and Microbial Biology & Zurich-Basel Plant Science Center, University of Zurich, 8008 Zurich, Switzerland
|
21
|
Abstract
High-throughput assays for measuring the three-dimensional (3D) configuration of DNA have provided unprecedented insights into the relationship between DNA 3D configuration and function. Data interpretation from assays such as ChIA-PET and Hi-C is challenging because the data is large and cannot be easily rendered using standard genome browsers. An effective Hi-C visualization tool must provide several visualization modes and be capable of viewing the data in conjunction with existing, complementary data. We review five software tools that do not require programming expertise. We summarize their complementary functionalities, and highlight which tool is best equipped for specific tasks.
Affiliation(s)
- Galip Gürkan Yardımcı
  - Department of Genome Sciences, University of Washington, 3720 15th Ave NE, Seattle, 98105 WA USA
- William Stafford Noble
  - Department of Genome Sciences, Department of Computer Science and Engineering, University of Washington, 3720 15th Ave NE, Seattle, 98105 WA USA
|
22
|
Abstract
Estrogen is a steroid hormone that plays critical roles in a myriad of intracellular pathways. The expression of many genes is regulated through the steroid hormone receptors ESR1 and ESR2. These bind to DNA and modulate the expression of target genes. Identification of estrogen target genes is greatly facilitated by the use of transcriptomic methods, such as RNA-seq and expression microarrays, and chromatin immunoprecipitation with massively parallel sequencing (ChIP-seq). Combining transcriptomic and ChIP-seq data enables a distinction to be drawn between direct and indirect estrogen target genes. This chapter discusses some methods of identifying estrogen target genes that do not require any expertise in programming languages or complex bioinformatics.
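The distinction between direct and indirect target genes described above amounts to intersecting two gene lists. The Python sketch below shows that set logic under simplifying assumptions: the function name, the placeholder gene identifiers, and the rule "differentially expressed and receptor-bound means direct" are illustrative only and are not taken from the chapter.

```python
def classify_targets(regulated_genes, bound_genes):
    """Split hormone-responsive genes into direct and indirect targets.

    regulated_genes: genes whose expression changes on treatment
                     (e.g. from RNA-seq or microarrays).
    bound_genes:     genes with a receptor binding site assigned to them
                     (e.g. from ChIP-seq peaks).
    """
    regulated = set(regulated_genes)
    bound = set(bound_genes)
    return {
        # Responsive and bound by the receptor: candidate direct targets.
        "direct": sorted(regulated & bound),
        # Responsive without receptor binding: candidate indirect targets.
        "indirect": sorted(regulated - bound),
    }


# Placeholder identifiers, not real gene names.
expression_hits = ["GENE_A", "GENE_B", "GENE_C"]
chip_seq_hits = ["GENE_B", "GENE_C", "GENE_D"]
targets = classify_targets(expression_hits, chip_seq_hits)
```

Genes that are bound but show no expression change (GENE_D here) fall into neither class in this sketch.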
Affiliation(s)
- Adam E Handel
  - Department of Physiology, Anatomy and Genetics, University of Oxford, South Parks Road, Oxford, OX1 3QX, UK
  - Weatherall Institute of Molecular Medicine, University of Oxford, Headley Way, Oxford, OX3 9DS, UK
|
23
|
QuIN: A Web Server for Querying and Visualizing Chromatin Interaction Networks. PLoS Comput Biol 2016; 12:e1004809. [PMID: 27336171 PMCID: PMC4919057 DOI: 10.1371/journal.pcbi.1004809] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 02/05/2016] [Accepted: 05/12/2016] [Indexed: 01/30/2023]
Abstract
Recent studies of the human genome have indicated that regulatory elements (e.g. promoters and enhancers) at distal genomic locations can interact with each other via chromatin folding and affect gene expression levels. Genomic technologies for mapping interactions between DNA regions, e.g., ChIA-PET and Hi-C, can generate genome-wide maps of interactions between regulatory elements. These interaction datasets are important resources to infer distal gene targets of non-coding regulatory elements and to facilitate prioritization of critical loci for important cellular functions. With the increasing diversity and complexity of genomic information and public ontologies, making sense of these datasets demands integrative and easy-to-use software tools. Moreover, network representation of chromatin interaction maps enables effective data visualization, integration, and mining. Currently, there is no software that can take full advantage of network theory approaches for the analysis of chromatin interaction datasets. To fill this gap, we developed a web-based application, QuIN, which enables: 1) building and visualizing chromatin interaction networks, 2) annotating networks with user-provided private and publicly available functional genomics and interaction datasets, 3) querying network components based on gene name or chromosome location, and 4) utilizing network-based measures to identify and prioritize critical regulatory targets and their direct and indirect interactions. AVAILABILITY: QuIN’s web server is available at http://quin.jax.org. QuIN is developed in Java and JavaScript, utilizing an Apache Tomcat web server and MySQL database, and the source code is available on GitHub under the GPLv3 license: https://github.com/UcarLab/QuIN/.
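The network operations listed above (building a network from interactions, querying by chromosome location, ranking nodes by a network measure) can be sketched in a few lines of Python. This is only a minimal illustration, not QuIN's Java implementation: the node representation as `(chrom, start, end)` tuples, the function names, the use of node degree as the ranking measure, and the toy coordinates are all assumptions.

```python
from collections import defaultdict


def build_network(interactions):
    """Build an undirected adjacency map from pairs of anchor regions."""
    adj = defaultdict(set)
    for a, b in interactions:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def query_by_location(adj, chrom, start, end):
    """Return nodes on `chrom` that overlap the half-open range [start, end)."""
    return sorted(
        node for node in adj
        if node[0] == chrom and node[1] < end and node[2] > start
    )


def rank_by_degree(adj):
    """Order nodes by number of interaction partners, highest first."""
    return sorted(adj, key=lambda node: (-len(adj[node]), node))


# Hypothetical anchor regions as (chrom, start, end) tuples.
interactions = [
    (("chr1", 100, 200), ("chr1", 5000, 5100)),
    (("chr1", 100, 200), ("chr2", 300, 400)),
    (("chr1", 5000, 5100), ("chr2", 300, 400)),
    (("chr1", 100, 200), ("chr3", 50, 150)),
]
network = build_network(interactions)
hits = query_by_location(network, "chr1", 0, 1000)
ranking = rank_by_degree(network)
```

Degree is the simplest possible prioritization measure; other centrality measures could be substituted in `rank_by_degree` without changing the rest of the sketch.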
|
24
|
Xu Z, Zhang G, Duan Q, Chai S, Zhang B, Wu C, Jin F, Yue F, Li Y, Hu M. HiView: an integrative genome browser to leverage Hi-C results for the interpretation of GWAS variants. BMC Res Notes 2016; 9:159. [PMID: 26969411 PMCID: PMC4788823 DOI: 10.1186/s13104-016-1947-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 11/02/2015] [Accepted: 02/22/2016] [Indexed: 12/16/2022]
Abstract
BACKGROUND: Genome-wide association studies (GWAS) have identified thousands of genetic variants associated with complex traits and diseases. However, most of them are located in the non-protein coding regions, and therefore it is challenging to hypothesize the functions of these non-coding GWAS variants. Recent large efforts such as the ENCODE and Roadmap Epigenomics projects have predicted a large number of regulatory elements. However, the target genes of these regulatory elements remain largely unknown. Chromatin conformation capture based technologies such as Hi-C can directly measure the chromatin interactions and have generated an increasingly comprehensive catalog of the interactome between the distal regulatory elements and their potential target genes. Leveraging such information revealed by Hi-C holds the promise of elucidating the functions of genetic variants in human diseases. RESULTS: In this work, we present HiView, the first integrative genome browser to leverage Hi-C results for the interpretation of GWAS variants. HiView is able to display Hi-C data and statistical evidence for chromatin interactions in genomic regions surrounding any given GWAS variant, enabling straightforward visualization and interpretation. CONCLUSIONS: We believe that as the first GWAS variants-centered Hi-C genome browser, HiView is a useful tool guiding post-GWAS functional genomics studies. HiView is freely accessible at: http://www.unc.edu/~yunmli/HiView.
Affiliation(s)
- Zheng Xu
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC, 27599, USA
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, 27599, USA
  - Department of Computer Science, University of North Carolina, Chapel Hill, NC, 27599, USA
- Guosheng Zhang
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, 27599, USA
  - Curriculum in Bioinformatics and Computational Biology, University of North Carolina, Chapel Hill, NC, 27599, USA
- Qing Duan
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, 27599, USA
- Shengjie Chai
  - Curriculum in Bioinformatics and Computational Biology, University of North Carolina, Chapel Hill, NC, 27599, USA
- Baqun Zhang
  - School of Statistics, Renmin University of China, Beijing, 100872, China
- Cong Wu
  - College of Veterinary Medicine, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
- Fulai Jin
  - Department of Genetics and Genome Sciences, Case Western Reserve University, Cleveland, OH, 44106, USA
- Feng Yue
  - Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Yun Li
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC, 27599, USA
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, 27599, USA
  - Department of Computer Science, University of North Carolina, Chapel Hill, NC, 27599, USA
- Ming Hu
  - Division of Biostatistics, Department of Population Health, New York University School of Medicine, New York, NY, 10016, USA
|
25
|
Mora A, Sandve GK, Gabrielsen OS, Eskeland R. In the loop: promoter-enhancer interactions and bioinformatics. Brief Bioinform 2015; 17:980-995. [PMID: 26586731 PMCID: PMC5142009 DOI: 10.1093/bib/bbv097] [Citation(s) in RCA: 62] [Impact Index Per Article: 6.2] [Received: 07/30/2015] [Revised: 09/26/2015] [Indexed: 12/17/2022]
Abstract
Enhancer-promoter regulation is a fundamental mechanism underlying differential transcriptional regulation. Spatial chromatin organization brings remote enhancers into contact with target promoters in cis to regulate gene expression. There is considerable evidence for promoter-enhancer interactions (PEIs). In recent years, genome-wide analyses have identified signatures and mapped novel enhancers; however, precisely identifying their target gene(s) requires massive biological and bioinformatics efforts. In this review, we give a short overview of the chromatin landscape and transcriptional regulation. We discuss some key concepts and problems related to chromatin interaction detection technologies, and emerging knowledge from genome-wide chromatin interaction data sets. Then, we critically review different types of bioinformatics analysis methods and tools related to representation and visualization of PEI data, raw data processing and PEI prediction. Lastly, we provide specific examples of how PEIs have been used to elucidate a functional role of non-coding single-nucleotide polymorphisms. The topic is at the forefront of epigenetic research, and by highlighting some future bioinformatics challenges in the field, this review provides a comprehensive background for future PEI studies.
|
26
|
Shavit Y, Merelli I, Milanesi L, Lio’ P. How computer science can help in understanding the 3D genome architecture. Brief Bioinform 2015; 17:733-44. [DOI: 10.1093/bib/bbv085] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 07/10/2015] [Indexed: 01/20/2023]
|
27
|
Schmid MW, Grob S, Grossniklaus U. HiCdat: a fast and easy-to-use Hi-C data analysis tool. BMC Bioinformatics 2015; 16:277. [PMID: 26334796 PMCID: PMC4559209 DOI: 10.1186/s12859-015-0678-x] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Received: 01/28/2015] [Accepted: 07/20/2015] [Indexed: 12/25/2022]
Abstract
Background: The study of nuclear architecture using Chromosome Conformation Capture (3C) technologies is a novel frontier in biology. With further reduction in sequencing costs, the potential of Hi-C in describing nuclear architecture as a phenotype is only about to unfold. To use Hi-C for phenotypic comparisons among different cell types, conditions, or genetic backgrounds, Hi-C data processing needs to be more accessible to biologists. Results: HiCdat provides a simple graphical user interface for data pre-processing and a collection of higher-level data analysis tools implemented in R. Data pre-processing also supports a wide range of additional data types required for in-depth analysis of the Hi-C data (e.g. RNA-Seq, ChIP-Seq, and BS-Seq). Conclusions: HiCdat is easy-to-use and provides solutions starting from aligned reads up to in-depth analyses. Importantly, HiCdat is focussed on the analysis of larger structural features of chromosomes, their correlation to genomic and epigenomic features, and on comparative studies. It uses simple input and output formats and can therefore easily be integrated into existing workflows or combined with alternative tools.
Affiliation(s)
- Marc W Schmid
  - Institute of Plant Biology, University of Zurich, Zollikerstrasse 107, Zürich, 8008, Switzerland
  - Zurich-Basel Plant Science Center, Universitätstrasse 2, Zürich, 8092, Switzerland
- Stefan Grob
  - Institute of Plant Biology, University of Zurich, Zollikerstrasse 107, Zürich, 8008, Switzerland
  - Zurich-Basel Plant Science Center, Universitätstrasse 2, Zürich, 8092, Switzerland
- Ueli Grossniklaus
  - Institute of Plant Biology, University of Zurich, Zollikerstrasse 107, Zürich, 8008, Switzerland
  - Zurich-Basel Plant Science Center, Universitätstrasse 2, Zürich, 8092, Switzerland
|
28
|
Ay F, Noble WS. Analysis methods for studying the 3D architecture of the genome. Genome Biol 2015; 16:183. [PMID: 26328929 PMCID: PMC4556012 DOI: 10.1186/s13059-015-0745-7] [Citation(s) in RCA: 104] [Impact Index Per Article: 10.4] [Received: 05/11/2015] [Accepted: 08/10/2015] [Indexed: 11/10/2022]
Abstract
The rapidly increasing quantity of genome-wide chromosome conformation capture data presents great opportunities and challenges in the computational modeling and interpretation of the three-dimensional genome. In particular, with recent trends towards higher-resolution high-throughput chromosome conformation capture (Hi-C) data, the diversity and complexity of biological hypotheses that can be tested necessitate rigorous computational and statistical methods as well as scalable pipelines to interpret these datasets. Here we review computational tools to interpret Hi-C data, including pipelines for mapping, filtering, and normalization, and methods for confidence estimation, domain calling, visualization, and three-dimensional modeling.
Affiliation(s)
- Ferhat Ay
  - Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA
  - Feinberg School of Medicine, Northwestern University, Chicago, 60661, IL, USA
- William S Noble
  - Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA
  - Department of Computer Science and Engineering, University of Washington, Seattle, 98195, WA, USA
|
29
|
Lun ATL, Smyth GK. diffHic: a Bioconductor package to detect differential genomic interactions in Hi-C data. BMC Bioinformatics 2015; 16:258. [PMID: 26283514 PMCID: PMC4539688 DOI: 10.1186/s12859-015-0683-0] [Citation(s) in RCA: 118] [Impact Index Per Article: 11.8] [Received: 05/11/2015] [Accepted: 07/22/2015] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Chromatin conformation capture with high-throughput sequencing (Hi-C) is a technique that measures the in vivo intensity of interactions between all pairs of loci in the genome. Most conventional analyses of Hi-C data focus on the detection of statistically significant interactions. However, an alternative strategy involves identifying significant changes in the interaction intensity (i.e., differential interactions) between two or more biological conditions. This is more statistically rigorous and may provide more biologically relevant results. RESULTS: Here, we present the diffHic software package for the detection of differential interactions from Hi-C data. diffHic provides methods for read pair alignment and processing, counting into bin pairs, filtering out low-abundance events and normalization of trended or CNV-driven biases. It uses the statistical framework of the edgeR package to model biological variability and to test for significant differences between conditions. Several options for the visualization of results are also included. The use of diffHic is demonstrated with real Hi-C data sets. Performance against existing methods is also evaluated with simulated data. CONCLUSIONS: On real data, diffHic is able to successfully detect interactions with significant differences in intensity between biological conditions. It also compares favourably to existing software tools on simulated data sets. These results suggest that diffHic is a viable approach for differential analyses of Hi-C data.
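Two of the steps named above, counting read pairs into bin pairs and filtering out low-abundance events, can be illustrated with a short Python sketch. diffHic itself is an R/Bioconductor package, so nothing here is its API: the function names, the fixed-width binning, the simple count threshold, and the toy coordinates are assumptions chosen for illustration.

```python
from collections import Counter


def count_bin_pairs(read_pairs, bin_size):
    """Count read pairs per unordered pair of fixed-width genomic bins.

    read_pairs: iterable of ((chrom, pos), (chrom, pos)) mapped read ends.
    """
    counts = Counter()
    for (chrom_a, pos_a), (chrom_b, pos_b) in read_pairs:
        bin_a = (chrom_a, pos_a // bin_size)
        bin_b = (chrom_b, pos_b // bin_size)
        # Sort so that (a, b) and (b, a) fall into the same bin pair.
        counts[tuple(sorted((bin_a, bin_b)))] += 1
    return counts


def filter_low_abundance(counts, min_count):
    """Keep only bin pairs supported by at least `min_count` read pairs."""
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Hypothetical mapped read pairs.
read_pairs = [
    (("chr1", 1_200_000), ("chr1", 5_700_000)),
    (("chr1", 5_100_000), ("chr1", 1_900_000)),
    (("chr1", 300_000), ("chr2", 4_400_000)),
]
counts = count_bin_pairs(read_pairs, bin_size=1_000_000)
kept = filter_low_abundance(counts, min_count=2)
```

The resulting count table (one row per bin pair, one column per library once several samples are counted) is the kind of input on which a count-based statistical framework would then operate.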
Affiliation(s)
- Aaron T L Lun
  - The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, VIC, 3052, Melbourne, Australia
  - Department of Medical Biology, The University of Melbourne, Parkville, VIC, 3010, Melbourne, Australia
- Gordon K Smyth
  - The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, VIC, 3052, Melbourne, Australia
  - Department of Mathematics and Statistics, The University of Melbourne, Parkville, VIC, 3010, Melbourne, Australia
|