1
Kadlof M, Banecki K, Chiliński M, Plewczynski D. Chromatin image-driven modelling. Methods 2024; 226:54-60. [PMID: 38636797 DOI: 10.1016/j.ymeth.2024.04.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 03/13/2024] [Accepted: 04/05/2024] [Indexed: 04/20/2024] Open
Abstract
The challenge of modelling the spatial conformation of chromatin remains an open problem. While multiple data-driven approaches have been proposed, each has limitations. This work introduces two image-driven modelling methods based on the Molecular Dynamics Flexible Fitting (MDFF) approach: the force method and the correlational method. Both methods have already been used successfully in protein modelling. We propose a novel way to employ them for building chromatin models directly from 3D images. This approach is termed image-driven modelling. Additionally, we introduce the initial structure generator, a tool designed to generate optimal starting structures for the proposed algorithms. The methods are versatile and can be applied to various data types, with minor modifications to accommodate new generation imaging techniques.
Affiliation(s)
- Michał Kadlof
  - Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
- Krzysztof Banecki
  - Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland; Centre of New Technologies, University of Warsaw, Warsaw, Poland
- Mateusz Chiliński
  - Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland; Centre of New Technologies, University of Warsaw, Warsaw, Poland; Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Dariusz Plewczynski
  - Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland; Centre of New Technologies, University of Warsaw, Warsaw, Poland
2
Shao W, Wang J, Zhang Y, Zhang C, Chen J, Chen Y, Fei Z, Ma Z, Sun X, Jiao C. The jet-like chromatin structure defines active secondary metabolism in fungi. Nucleic Acids Res 2024; 52:4906-4921. [PMID: 38407438 PMCID: PMC11109943 DOI: 10.1093/nar/gkae131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Revised: 02/06/2024] [Accepted: 02/10/2024] [Indexed: 02/27/2024] Open
Abstract
Eukaryotic genomes are spatially organized within the nucleus in a nonrandom manner. However, fungal genome arrangement and its function in development and adaptation remain largely unexplored. Here, we show that the high-order chromosome structure of Fusarium graminearum is sculpted by both H3K27me3 modification and ancient genome rearrangements. Active secondary metabolic gene clusters form a structure resembling chromatin jets. We demonstrate that these jet-like domains, which can propagate symmetrically for 54 kb, are prevalent in the genome and correlate with active gene transcription and histone acetylation. Deletion of GCN5, which encodes a core and functionally conserved histone acetyltransferase, blocks the formation of the domains. Insertion of an exogenous gene within the jet-like domain significantly augments its transcription. These findings uncover an interesting link between alterations in chromatin structure and the activation of fungal secondary metabolism, which could be a general mechanism for fungi to rapidly respond to environmental cues, and highlight the utility of leveraging three-dimensional genome organization in improving gene transcription in eukaryotes.
Affiliation(s)
- Wenyong Shao
  - State Key Laboratory of Rice Biology, Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Biotechnology, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Jingrui Wang
  - State Key Laboratory of Rice Biology, Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Biotechnology, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Yueqi Zhang
  - State Key Laboratory of Rice Biology, Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Biotechnology, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Chaofan Zhang
  - State Key Laboratory of Rice Biology, Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Biotechnology, Zhejiang University, Hangzhou 310058, Zhejiang, China
  - Collaborative Innovation Center for Efficient and Green Production of Agriculture in Mountainous Areas of Zhejiang Province, College of Horticulture Science, Zhejiang A&F University, Hangzhou 311300, Zhejiang, China
- Jie Chen
  - National Joint Engineering Laboratory of Biopesticide Preparation, College of Forestry and Biotechnology, Zhejiang A&F University, Hangzhou 311300, Zhejiang, China
- Yun Chen
  - State Key Laboratory of Rice Biology, Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Biotechnology, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Zhangjun Fei
  - Boyce Thompson Institute, Cornell University, Ithaca, NY 14853, USA
- Zhonghua Ma
  - State Key Laboratory of Rice Biology, Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Biotechnology, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Xuepeng Sun
  - Collaborative Innovation Center for Efficient and Green Production of Agriculture in Mountainous Areas of Zhejiang Province, College of Horticulture Science, Zhejiang A&F University, Hangzhou 311300, Zhejiang, China
- Chen Jiao
  - State Key Laboratory of Rice Biology, Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Biotechnology, Zhejiang University, Hangzhou 310058, Zhejiang, China
3
Guo Z, Liu J, Wang Y, Chen M, Wang D, Xu D, Cheng J. Diffusion models in bioinformatics and computational biology. Nature Reviews Bioengineering 2024; 2:136-154. [PMID: 38576453 PMCID: PMC10994218 DOI: 10.1038/s44222-023-00114-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 08/25/2023] [Indexed: 04/06/2024]
Abstract
Denoising diffusion models embody a type of generative artificial intelligence that can be applied in computer vision, natural language processing and bioinformatics. In this Review, we introduce the key concepts and theoretical foundations of three diffusion modelling frameworks (denoising diffusion probabilistic models, noise-conditioned scoring networks and score stochastic differential equations). We then explore their applications in bioinformatics and computational biology, including protein design and generation, drug and small-molecule design, protein-ligand interaction modelling, cryo-electron microscopy image data analysis and single-cell data analysis. Finally, we highlight open-source diffusion model tools and consider the future applications of diffusion models in bioinformatics.
Affiliation(s)
- Zhiye Guo
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
  - NextGen Precision Health, University of Missouri, Columbia, MO, USA
- Jian Liu
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
  - NextGen Precision Health, University of Missouri, Columbia, MO, USA
- Yanli Wang
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
  - NextGen Precision Health, University of Missouri, Columbia, MO, USA
- Mengrui Chen
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
  - NextGen Precision Health, University of Missouri, Columbia, MO, USA
- Duolin Wang
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
  - NextGen Precision Health, University of Missouri, Columbia, MO, USA
- Dong Xu
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
  - NextGen Precision Health, University of Missouri, Columbia, MO, USA
- Jianlin Cheng
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
  - NextGen Precision Health, University of Missouri, Columbia, MO, USA
4
Liu T, Qiu QT, Hua KJ, Ma BG. Chromosome structure modeling tools and their evaluation in bacteria. Brief Bioinform 2024; 25:bbae044. [PMID: 38385874 PMCID: PMC10883143 DOI: 10.1093/bib/bbae044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 12/31/2023] [Accepted: 01/22/2024] [Indexed: 02/23/2024] Open
Abstract
The three-dimensional (3D) structure of bacterial chromosomes is crucial for understanding chromosome function. With the growing availability of high-throughput chromosome conformation capture (3C/Hi-C) data, 3D structure reconstruction algorithms have become powerful tools for studying bacterial chromosome structure and function. Recommendations on chromosome structure reconstruction tools are highly desirable to facilitate prokaryotic 3D genomics. In this work, we review existing chromosome 3D structure reconstruction algorithms and classify them based on their underlying computational models into two categories: constraint-based modeling and thermodynamics-based modeling. We briefly compare these algorithms using 3C/Hi-C datasets and fluorescence microscopy data obtained from Escherichia coli and Caulobacter crescentus, as well as simulated datasets. We discuss current challenges in 3D reconstruction algorithms for bacterial chromosomes, primarily focusing on software usability. Finally, we briefly outline future research directions for bacterial chromosome structure reconstruction algorithms.
Affiliation(s)
- Tong Liu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Qin-Tian Qiu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Kang-Jian Hua
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Bin-Guang Ma
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
5
Norollahi SE, Vahidi S, Shams S, Keymoradzdeh A, Soleymanpour A, Solymanmanesh N, Mirzajani E, Jamkhaneh VB, Samadani AA. Analytical and therapeutic profiles of DNA methylation alterations in cancer; an overview of changes in chromatin arrangement and alterations in histone surfaces. Horm Mol Biol Clin Investig 2023; 44:337-356. [PMID: 36799246 DOI: 10.1515/hmbci-2022-0043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2022] [Accepted: 01/24/2023] [Indexed: 02/18/2023]
Abstract
DNA methylation is the most important epigenetic element that inhibits gene transcription and is involved in the pathogenesis of all types of malignancies. The effectors of DNA methylation are DNMTs (DNA methyltransferases), which catalyze de novo methylation or maintain methylation of hemimethylated DNA after DNA replication. DNA methylation patterns are altered in cancer, and DNA methylation contributes to cancer development through three mechanisms: direct mutagenesis, hypomethylation of the cancer genome, and focal hypermethylation of the promoters of TSGs (tumor suppressor genes). The balance of DNA methylation, nucleosome remodeling, RNA-mediated targeting, and histone modification modulates many biological activities that are essential to the genesis of cancer; it can also affect many epigenetic changes, including DNA methylation and histone modifications, as well as the regulation of non-coding miRNA expression in the prevention and treatment of many cancers. Epigenetics refers to heritable modifications in gene expression that do not involve alterations in the DNA sequence. The nucleosome is the basic unit of chromatin, consisting of 147 base pairs (bp) of DNA wrapped around a histone octamer composed of one H3/H4 tetramer and two H2A/H2B dimers. DNA methylation is preferentially distributed over nucleosome regions and is less enriched over flanking nucleosome-depleted DNA, implying a connection between nucleosome positioning and DNA methylation. In carcinogenesis, aberrations in the epigenome may also contribute to the development of drug resistance. In this report, we outline the basic principles behind these epigenetic signaling pathways and highlight the evidence suggesting that their misregulation can result in cancer. These findings, in conjunction with the promising preclinical and clinical outcomes observed with epigenetic drugs against chromatin regulators, confirm the important role of epigenetics in cancer therapy.
Affiliation(s)
- Seyedeh Elham Norollahi
  - Cancer Research Center and Department of Immunology, Semnan University of Medical Sciences, Semnan, Iran
- Sogand Vahidi
  - Medical Biology Research Center, Kermanshah University of Medical Sciences, Kermanshah, Iran
- Shima Shams
  - Student Research Committee, School of Medicine, Guilan University of Medical Sciences, Rasht, Iran
- Arman Keymoradzdeh
  - Department of Neurosurgery, School of Medicine, Imam Hossein Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Armin Soleymanpour
  - Student Research Committee, School of Medicine, Guilan University of Medical Sciences, Rasht, Iran
- Nazanin Solymanmanesh
  - Student Research Committee, School of Medicine, Guilan University of Medical Sciences, Rasht, Iran
- Ebrahim Mirzajani
  - Department of Biochemistry and Biophysics, School of Medicine, Guilan University of Medical Sciences, Rasht, Iran
- Vida Baloui Jamkhaneh
  - Department of Veterinary Medicine, Islamic Azad University of Babol Branch, Babol, Iran
- Ali Akbar Samadani
  - Guilan Road Trauma Research Center, Guilan University of Medical Sciences, Rasht, Iran
6
Wang Y, Guo Z, Cheng J. Single-cell Hi-C data enhancement with deep residual and generative adversarial networks. Bioinformatics 2023; 39:btad458. [PMID: 37498561 PMCID: PMC10403428 DOI: 10.1093/bioinformatics/btad458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Revised: 07/19/2023] [Accepted: 07/25/2023] [Indexed: 07/28/2023] Open
Abstract
MOTIVATION The spatial genome organization of a eukaryotic cell is important for its function. The development of single-cell technologies for probing the 3D genome conformation, especially single-cell chromosome conformation capture techniques, has enabled us to understand genome function better than before. However, due to extreme sparsity and high noise associated with single-cell Hi-C data, it is still difficult to study genome structure and function using the HiC-data of one single cell. RESULTS In this work, we developed a deep learning method ScHiCEDRN based on deep residual networks and generative adversarial networks for the imputation and enhancement of Hi-C data of a single cell. In terms of both image evaluation and Hi-C reproducibility metrics, ScHiCEDRN outperforms the four deep learning methods (DeepHiC, HiCPlus, HiCSR, and Loopenhance) on enhancing the raw single-cell Hi-C data of human and Drosophila. The experiments also show that it can generate single-cell Hi-C data more suitable for identifying topologically associating domain boundaries and reconstructing 3D chromosome structures than the existing methods. Moreover, ScHiCEDRN's performance generalizes well across different single cells and cell types, and it can be applied to improving population Hi-C data. AVAILABILITY AND IMPLEMENTATION The source code of ScHiCEDRN is available at the GitHub repository: https://github.com/BioinfoMachineLearning/ScHiCEDRN.
Affiliation(s)
- Yanli Wang
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, United States
  - NextGen Precision Health Institute, University of Missouri, Columbia, MO 65211, United States
- Zhiye Guo
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, United States
  - NextGen Precision Health Institute, University of Missouri, Columbia, MO 65211, United States
- Jianlin Cheng
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, United States
  - NextGen Precision Health Institute, University of Missouri, Columbia, MO 65211, United States
7
Cheng J, Cao X, Wang X, Wang J, Yue B, Sun W, Huang Y, Lan X, Ren G, Lei C, Chen H. Dynamic chromatin architectures provide insights into the genetics of cattle myogenesis. J Anim Sci Biotechnol 2023; 14:59. [PMID: 37055796 PMCID: PMC10103417 DOI: 10.1186/s40104-023-00855-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2022] [Accepted: 02/16/2023] [Indexed: 04/15/2023] Open
Abstract
BACKGROUND Sharply increased beef consumption is propelling genetic improvement projects for beef cattle in China. Three-dimensional genome structure is confirmed to be an important layer of transcription regulation. Although genome-wide interaction data for several livestock species have already been produced, knowledge of genome structure states and their regulatory rules in cattle muscle is still limited. RESULTS Here we present the first 3D genome data in Longissimus dorsi muscle of fetal and adult cattle (Bos taurus). We showed that compartments, topologically associating domains (TADs), and loops undergo re-organization and that the structure dynamics were consistent with transcriptomic divergence during muscle development. Furthermore, we annotated cis-regulatory elements in the cattle genome during myogenesis and demonstrated the enrichment of promoters and enhancers in selection sweeps. We further validated the regulatory function of one HMGA2 intronic enhancer near a strong sweep region on primary bovine myoblast proliferation. CONCLUSIONS Our data provide key insights into the regulatory function of high-order chromatin structure and cattle myogenic biology, which will benefit the genetic improvement of beef cattle.
Affiliation(s)
- Jie Cheng
  - College of Animal Science and Technology, Northwest A&F University, No.22 Xinong Road, Yangling district, Yangling, Shaanxi province, 712100, China
- Xiukai Cao
  - College of Animal Science and Technology, Northwest A&F University, No.22 Xinong Road, Yangling district, Yangling, Shaanxi province, 712100, China
  - Joint International Research Laboratory of Agriculture and Agri-Product Safety of Ministry of Education of China, Yangzhou University, Yangzhou, 225009, China
- Xiaogang Wang
  - College of Animal Science and Technology, Northwest A&F University, No.22 Xinong Road, Yangling district, Yangling, Shaanxi province, 712100, China
- Jian Wang
  - College of Animal Science and Technology, Northwest A&F University, No.22 Xinong Road, Yangling district, Yangling, Shaanxi province, 712100, China
- Binglin Yue
  - College of Animal Science and Technology, Northwest A&F University, No.22 Xinong Road, Yangling district, Yangling, Shaanxi province, 712100, China
  - Key Laboratory of Qinghai-Tibetan Plateau Animal Genetic Resource Reservation and Utilization, Southwest Minzu University, Chengdu, 610225, China
- Wei Sun
  - College of Animal Science and Technology, Yangzhou University, Yangzhou, 225009, China
- Yongzhen Huang
  - College of Animal Science and Technology, Northwest A&F University, No.22 Xinong Road, Yangling district, Yangling, Shaanxi province, 712100, China
- Xianyong Lan
  - College of Animal Science and Technology, Northwest A&F University, No.22 Xinong Road, Yangling district, Yangling, Shaanxi province, 712100, China
- Gang Ren
  - College of Animal Science and Technology, Northwest A&F University, No.22 Xinong Road, Yangling district, Yangling, Shaanxi province, 712100, China
- Chuzhao Lei
  - College of Animal Science and Technology, Northwest A&F University, No.22 Xinong Road, Yangling district, Yangling, Shaanxi province, 712100, China
- Hong Chen
  - College of Animal Science and Technology, Northwest A&F University, No.22 Xinong Road, Yangling district, Yangling, Shaanxi province, 712100, China
  - College of Animal Science, Xinjiang Agricultural University, Urumqi, 830052, China
8
Clarence T, Robert NS, Sarigol F, Fu X, Bates PA, Simakov O. Robust 3D modeling reveals spatiosyntenic properties of animal genomes. iScience 2023; 26:106136. [PMID: 36876129 PMCID: PMC9976460 DOI: 10.1016/j.isci.2023.106136] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/19/2022] [Revised: 11/18/2022] [Accepted: 01/31/2023] [Indexed: 02/05/2023] Open
Abstract
Animal genomes are organized into chromosomes that are remarkably conserved in their gene content, forming distinct evolutionary units (synteny). Using versatile chromosomal modeling, we infer three-dimensional topology of genomes from representative clades spanning the earliest animal diversification. We apply a partitioning approach using interaction spheres to compensate for varying quality of topological data. Using comparative genomics approaches, we test whether syntenic signal at gene pair, local, and whole chromosomal scale is reflected in the reconstructed spatial organization. We identify evolutionarily conserved three-dimensional networks at all syntenic scales revealing novel evolutionarily maintained interactors associated with known conserved local gene linkages (such as hox). We thus present evidence for evolutionary constraints that are associated with three-, rather than just two-, dimensional animal genome organization, which we term spatiosynteny. As more accurate topological data become available, together with validation approaches, spatiosynteny may become relevant in understanding the functionality behind the observed conservation of animal chromosomes.
Affiliation(s)
- Tereza Clarence
  - Biomolecular Modelling Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
  - Roussos Lab/Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Corresponding author
- Nicolas S.M. Robert
  - Department of Neuroscience and Developmental Biology, University of Vienna, Vienna, Austria
- Fatih Sarigol
  - Department of Neuroscience and Developmental Biology, University of Vienna, Vienna, Austria
- Xiao Fu
  - Biomolecular Modelling Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
- Paul A. Bates
  - Biomolecular Modelling Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
  - Corresponding author
- Oleg Simakov
  - Department of Neuroscience and Developmental Biology, University of Vienna, Vienna, Austria
  - Corresponding author
9
Li FZ, Zhang XF, Cai HY, Ran LQ, Zhou HY, Liu ZE. Chromosome Three-Dimensional Structure Reconstruction: An Iterative ShRec3D Algorithm. J Comput Biol 2023; 30:575-587. [PMID: 36847350 DOI: 10.1089/cmb.2022.0179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2023] Open
Abstract
The three-dimensional (3D) structure of chromosomes is of great significance in ensuring that the genome performs its various functions (e.g., gene expression) correctly and replicates and separates correctly in mitosis. Since the emergence in 2009 of Hi-C, a new experimental technique in molecular biology, researchers have been paying more and more attention to the reconstruction of chromosome 3D structure. To reconstruct the 3D structure of chromosomes from Hi-C experimental data, many algorithms have been proposed, among which ShRec3D is one of the most outstanding. In this article, an iterative ShRec3D algorithm is presented that greatly improves the native ShRec3D algorithm. Experimental results show that our algorithm can significantly improve the performance of ShRec3D, and this improvement holds across almost all ranges of data noise and signal coverage, so it is universal.
Affiliation(s)
- Fang-Zhen Li
  - School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, China
- Xue-Fen Zhang
  - College of Smart City, Beijing Union University, Beijing, China
- Hui-Ying Cai
  - School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, China
- Ling-Qiang Ran
  - School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, China
- Hai-Yan Zhou
  - Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
- Zhi-E Liu
  - College of Physics and Electronic Engineering, Qilu Normal University, Jinan, China
10
Hovenga V, Kalita J, Oluwadare O. HiC-GNN: A generalizable model for 3D chromosome reconstruction using graph convolutional neural networks. Comput Struct Biotechnol J 2022; 21:812-836. [PMID: 36698967 PMCID: PMC9842867 DOI: 10.1016/j.csbj.2022.12.051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2022] [Revised: 12/08/2022] [Accepted: 12/30/2022] [Indexed: 01/02/2023] Open
Abstract
Chromosome conformation capture (3C) is a method of measuring chromosome topology in terms of loci interaction. The Hi-C method is a derivative of 3C that allows for genome-wide quantification of chromosome interaction. From such interaction data, it is possible to infer the three-dimensional (3D) structure of the underlying chromosome. In this paper, we developed a novel method, HiC-GNN, for predicting the 3D structures of chromosomes from Hi-C data. HiC-GNN differs from other methods for chromosome structure prediction in that the models learned by HiC-GNN can be generalized to data that is distinct from the training data. This aspect of HiC-GNN allows models that were trained on one Hi-C contact map to be used for inference on entirely different maps. To the authors' knowledge, this generalizing capability is not present in any existing method. HiC-GNN uses a node embedding algorithm and a graph neural network to predict the 3D coordinates of each genomic locus from the corresponding Hi-C contact data. Unlike other methods, our algorithm allows for the storage of pre-trained parameters, thus enabling prediction on data that is entirely different from the training data. We show that our method can accurately generalize a single model across Hi-C resolutions, multiple restriction enzymes, and multiple cell populations while maintaining reconstruction accuracy across three Hi-C datasets. Our algorithm outperforms the state-of-the-art methods in accuracy of prediction and runtime and introduces a novel method for 3D structure prediction from Hi-C data. All our source code and data are available at https://github.com/OluwadareLab/HiC-GNN.
Affiliation(s)
- Van Hovenga
  - Department of Mathematics, University of Colorado, Colorado Springs, CO, United States
- Jugal Kalita
  - Department of Computer Science, University of Colorado, Colorado Springs, CO, United States
- Oluwatosin Oluwadare
  - Department of Computer Science, University of Colorado, Colorado Springs, CO, United States
  - Corresponding author
11
Vadnais D, Middleton M, Oluwadare O. ParticleChromo3D: a Particle Swarm Optimization algorithm for chromosome 3D structure prediction from Hi-C data. BioData Min 2022; 15:19. [PMID: 36131326 PMCID: PMC9494900 DOI: 10.1186/s13040-022-00305-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/24/2022] [Accepted: 08/31/2022] [Indexed: 11/10/2022] Open
Abstract
Background
The three-dimensional (3D) structure of chromatin has a massive effect on its function. Because of this, it is desirable to have an understanding of the 3D structural organization of chromatin. To gain greater insight into the spatial organization of chromosomes and genomes and the functions they perform, chromosome conformation capture (3C) techniques, particularly Hi-C, have been developed. The Hi-C technology is widely used and well-known because of its ability to profile interactions for all read pairs in an entire genome. The advent of Hi-C has greatly expanded our understanding of the 3D genome, genome folding, gene regulation and has enabled the development of many 3D chromosome structure reconstruction methods.
Results
Here, we propose a novel approach for 3D chromosome and genome structure reconstruction from Hi-C data using Particle Swarm Optimization (PSO) approach called ParticleChromo3D. This algorithm begins with a grouping of candidate solution locations for each chromosome bin, according to the particle swarm algorithm, and then iterates its position towards a global best candidate solution. While moving towards the optimal global solution, each candidate solution or particle uses its own local best information and a randomizer to choose its path. Using several metrics to validate our results, we show that ParticleChromo3D produces a robust and rigorous representation of the 3D structure for input Hi-C data. We evaluated our algorithm on simulated and real Hi-C data in this work. Our results show that ParticleChromo3D is more accurate than most of the existing algorithms for 3D structure reconstruction.
Conclusions
Our results also show that constructed ParticleChromo3D structures are very consistent, hence indicating that it will always arrive at the global solution at every iteration. The source code for ParticleChromo3D, the simulated and real Hi-C datasets, and the models generated for these datasets are available here: https://github.com/OluwadareLab/ParticleChromo3D
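The particle swarm procedure summarized in this abstract can be illustrated with a small generic sketch. It is not the ParticleChromo3D implementation: it assumes target pairwise distances are already available (the abstract does not say how Hi-C contacts are converted to distances), and the function names `distance_matrix`, `stress`, and `pso_reconstruct`, as well as the inertia and acceleration values, are hypothetical choices made only for illustration. Each particle is one candidate set of 3D coordinates; it moves under the pull of its own best position and the swarm's best position, each scaled by a random factor.

```python
import math
import random


def distance_matrix(points):
    """Pairwise Euclidean distances between 3D points (toy target data)."""
    return [[math.dist(p, q) for q in points] for p in points]


def stress(coords, target):
    """Sum of squared differences between model and target distances.

    `coords` is a flat list [x0, y0, z0, x1, y1, z1, ...].
    """
    n = len(target)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[3 * i:3 * i + 3], coords[3 * j:3 * j + 3])
            total += (d - target[i][j]) ** 2
    return total


def pso_reconstruct(target, n_particles=20, n_iter=200,
                    w=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic particle swarm search for coordinates matching `target`.

    w, c1, c2 are assumed inertia and acceleration weights, not values
    taken from the paper. Returns the best flat coordinate list and the
    best cost recorded after each iteration.
    """
    rng = random.Random(seed)
    dim = 3 * len(target)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # each particle's best position
    pbest_cost = [stress(p, target) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]   # swarm-wide best
    history = [gbest_cost]
    for _ in range(n_iter):
        for k in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()  # the "randomizer"
                vel[k][d] = (w * vel[k][d]
                             + c1 * r1 * (pbest[k][d] - pos[k][d])
                             + c2 * r2 * (gbest[d] - pos[k][d]))
                pos[k][d] += vel[k][d]
            cost = stress(pos[k], target)
            if cost < pbest_cost[k]:
                pbest[k], pbest_cost[k] = pos[k][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[k][:], cost
        history.append(gbest_cost)
    return gbest, history


if __name__ == "__main__":
    # Toy chain of four bins with known coordinates.
    true_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0)]
    best, history = pso_reconstruct(distance_matrix(true_points))
    print("initial best cost:", history[0])
    print("final best cost:", history[-1])
```

The recovered coordinates are only defined up to rotation, translation, and reflection, so a comparison against a reference structure would need a superposition step first. The best cost never increases from one iteration to the next, but nothing in this sketch guarantees that it reaches the global optimum.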
12
Yildirim A, Boninsegna L, Zhan Y, Alber F. Uncovering the Principles of Genome Folding by 3D Chromatin Modeling. Cold Spring Harb Perspect Biol 2022; 14:a039693. [PMID: 34400556 PMCID: PMC9248826 DOI: 10.1101/cshperspect.a039693] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/25/2022]
Abstract
Our understanding of how genomic DNA is tightly packed inside the nucleus, yet is still accessible for vital cellular processes, has grown dramatically over recent years with advances in microscopy and genomics technologies. Computational methods have played a pivotal role in the structural interpretation of experimental data, which helped unravel some organizational principles of genome folding. Here, we give an overview of current computational efforts in mechanistic and data-driven 3D chromatin structure modeling. We discuss strengths and limitations of different methods and evaluate the added value and benefits of computational approaches to infer the 3D structural and dynamic properties of the genome and its underlying mechanisms at different scales and resolution, ranging from the dynamic formation of chromatin loops and topological associated domains to nuclear compartmentalization of chromatin and nuclear bodies.
Affiliation(s)
- Asli Yildirim
- Institute for Quantitative and Computational Biosciences, Department of Microbiology, Immunology and Molecular Genetics, University of California Los Angeles, Los Angeles, California 90095, USA
- Lorenzo Boninsegna
- Institute for Quantitative and Computational Biosciences, Department of Microbiology, Immunology and Molecular Genetics, University of California Los Angeles, Los Angeles, California 90095, USA
- Yuxiang Zhan
- Institute for Quantitative and Computational Biosciences, Department of Microbiology, Immunology and Molecular Genetics, University of California Los Angeles, Los Angeles, California 90095, USA
- Quantitative and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, USA
- Frank Alber
- Institute for Quantitative and Computational Biosciences, Department of Microbiology, Immunology and Molecular Genetics, University of California Los Angeles, Los Angeles, California 90095, USA
- Quantitative and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, USA
|
13
|
Collins B, Oluwadare O, Brown P. ChromeBat: A Bio-Inspired Approach to 3D Genome Reconstruction. Genes (Basel) 2021; 12:1757. [PMID: 34828363 PMCID: PMC8617892 DOI: 10.3390/genes12111757] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2021] [Revised: 10/28/2021] [Accepted: 11/01/2021] [Indexed: 11/20/2022] Open
Abstract
With the advent of Next Generation Sequencing and the Hi-C experiment, high quality genome-wide contact data are becoming increasingly available. These data represent an empirical measure of how a genome interacts inside the nucleus. Genome conformation is of particular interest as it has been experimentally shown to be a driving force for many genomic functions from regulation to transcription. Thus, the Three Dimensional-Genome Reconstruction Problem (3D-GRP) seeks to take Hi-C data and produce a complete physical genome structure as it appears in the nucleus for genomic analysis. We propose and develop a novel method to solve the Chromosome and Genome Reconstruction problem based on the Bat Algorithm (BA), which we called ChromeBat. We demonstrate on real Hi-C data that ChromeBat is capable of state-of-the-art performance. Additionally, the domain of Genome Reconstruction has been criticized for lacking algorithmic diversity, and the bio-inspired nature of ChromeBat contributes algorithmic diversity to the problem domain. ChromeBat is an effective approach for solving the Genome Reconstruction Problem.
Affiliation(s)
- Oluwatosin Oluwadare
- Department of Computer Science, University of Colorado, Colorado Springs, CO 80918, USA; (B.C.); (P.B.)
|
14
|
Highsmith M, Cheng J. Four-Dimensional Chromosome Structure Prediction. Int J Mol Sci 2021; 22:ijms22189785. [PMID: 34575948 PMCID: PMC8465368 DOI: 10.3390/ijms22189785] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/05/2021] [Revised: 08/28/2021] [Accepted: 09/07/2021] [Indexed: 11/16/2022] Open
Abstract
Chromatin conformation plays an important role in a variety of genomic processes, including genome replication, gene expression, and gene methylation. Hi-C data is frequently used to analyze structural features of chromatin, such as AB compartments, topologically associated domains, and 3D structural models. Recently, the genomics community has displayed growing interest in chromatin dynamics. Here, we present 4DMax, a novel method, which uses time-series Hi-C data to predict dynamic chromosome conformation. Using both synthetic data and real time-series Hi-C data from processes, such as induced pluripotent stem cell reprogramming and cardiomyocyte differentiation, we construct smooth four-dimensional models of individual chromosomes. These predicted 4D models effectively interpolate chromatin position across time, permitting prediction of unknown Hi-C contact maps at intermittent time points. Furthermore, 4DMax correctly recovers higher order features of chromatin, such as AB compartments and topologically associated domains, even at time points where Hi-C data is not made available to the algorithm. Contact map predictions made using 4DMax outperform naïve numerical interpolation in 87.7% of predictions on the induced pluripotent stem cell dataset. A/B compartment profiles derived from 4DMax interpolation showed higher similarity to ground truth than at least one profile generated from a neighboring time point in 100% of induced pluripotent stem cell experiments. Use of 4DMax may alleviate the cost of expensive Hi-C experiments by interpolating intermediary time points while also providing valuable visualization of dynamic chromatin changes.
|
15
|
MacKay K, Kusalik A. Computational methods for predicting 3D genomic organization from high-resolution chromosome conformation capture data. Brief Funct Genomics 2021; 19:292-308. [PMID: 32353112 PMCID: PMC7388788 DOI: 10.1093/bfgp/elaa004] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 11/19/2019] [Revised: 01/30/2020] [Accepted: 02/07/2020] [Indexed: 12/19/2022] Open
Abstract
The advent of high-resolution chromosome conformation capture assays (such as 5C, Hi-C and Pore-C) has allowed for unprecedented sequence-level investigations into the structure-function relationship of the genome. In order to comprehensively understand this relationship, computational tools are required that utilize data generated from these assays to predict 3D genome organization (the 3D genome reconstruction problem). Many computational tools have been developed that answer this need, but a comprehensive comparison of their underlying algorithmic approaches has not been conducted. This manuscript provides a comprehensive review of the existing computational tools (from November 2006 to September 2019, inclusive) that can be used to predict 3D genome organizations from high-resolution chromosome conformation capture data. Overall, existing tools were found to use a relatively small set of algorithms from one or more of the following categories: dimensionality reduction, graph/network theory, maximum likelihood estimation (MLE) and statistical modeling. Solutions in each category are far from maturity, and the breadth and depth of various algorithmic categories have not been fully explored. While the tools for predicting 3D structure for a genomic region or single chromosome are diverse, there is a general lack of algorithmic diversity among computational tools for predicting the complete 3D genome organization from high-resolution chromosome conformation capture data.
|
16
|
Todd S, Todd P, McGowan SJ, Hughes JR, Kakui Y, Leymarie FF, Latham W, Taylor S. CSynth: an interactive modelling and visualization tool for 3D chromatin structure. Bioinformatics 2021; 37:951-955. [PMID: 32866221 PMCID: PMC8128456 DOI: 10.1093/bioinformatics/btaa757] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/01/2020] [Revised: 08/18/2020] [Accepted: 08/24/2020] [Indexed: 01/22/2023] Open
Abstract
MOTIVATION: The 3D structure of chromatin in the nucleus is important for gene expression and regulation. Chromosome conformation capture techniques, such as Hi-C, generate large amounts of data showing interaction points on the genome but these are hard to interpret using standard tools. RESULTS: We have developed CSynth, an interactive 3D genome browser and real-time chromatin restraint-based modeller to visualize models of any chromosome conformation capture (3C) data. Unlike other modelling systems, CSynth allows dynamic interaction with the modelling parameters, so that users can experiment and see the effects on the model. It also allows comparison of models generated from data in different tissues/cell states and the results of third-party 3D modelling outputs. In addition, we include an option to view and manipulate these complicated structures using Virtual Reality (VR) so scientists can immerse themselves in the models for further understanding. This VR component has also proven to be a valuable teaching and public engagement tool. AVAILABILITY AND IMPLEMENTATION: CSynth is web based and available to use at csynth.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Stephen Todd
- Department of Computing, Goldsmiths, University of London, London, UK
- London Geometry, Ltd., London, UK
- Simon J McGowan
- Analysis, Visualization and Informatics, MRC Weatherall Institute of Molecular Medicine, Oxford, UK
- James R Hughes
- Genome Biology Group, MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, Oxford, UK
- Yasutaka Kakui
- The Francis Crick Institute, Chromosome Segregation Laboratory, London, UK
- Frederic Fol Leymarie
- Department of Computing, Goldsmiths, University of London, London, UK
- London Geometry, Ltd., London, UK
- William Latham
- Department of Computing, Goldsmiths, University of London, London, UK
- London Geometry, Ltd., London, UK
- Stephen Taylor
- Analysis, Visualization and Informatics, MRC Weatherall Institute of Molecular Medicine, Oxford, UK
|
17
|
VEHiCLE: a Variationally Encoded Hi-C Loss Enhancement algorithm for improving and generating Hi-C data. Sci Rep 2021; 11:8880. [PMID: 33893353 PMCID: PMC8065109 DOI: 10.1038/s41598-021-88115-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 12/14/2020] [Accepted: 03/10/2021] [Indexed: 11/23/2022] Open
Abstract
Chromatin conformation plays an important role in a variety of genomic processes. Hi-C is one of the most popular assays for inspecting chromatin conformation. However, the utility of Hi-C contact maps is bottlenecked by resolution. Here we present VEHiCLE, a deep learning algorithm for resolution enhancement of Hi-C contact data. VEHiCLE utilises a variational autoencoder and adversarial training strategy equipped with four loss functions (adversarial loss, variational loss, chromosome topology-inspired insulation loss, and mean square error loss) to enhance contact maps, making them more viable for downstream analysis. VEHiCLE expands previous efforts at Hi-C super resolution by providing novel insight into the biologically meaningful and human interpretable feature extraction. Using a deep variational autoencoder, VEHiCLE provides a user tunable, full generative model for generating synthetic Hi-C data while also providing state-of-the-art results in enhancement of Hi-C data across multiple metrics.
|
18
|
Hovenga V, Oluwadare O. CBCR: A Curriculum Based Strategy For Chromosome Reconstruction. Int J Mol Sci 2021; 22:ijms22084140. [PMID: 33923653 PMCID: PMC8073114 DOI: 10.3390/ijms22084140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2021] [Revised: 04/12/2021] [Accepted: 04/13/2021] [Indexed: 11/30/2022] Open
Abstract
In this paper, we introduce a novel algorithm that aims to estimate chromosomes’ structure from their Hi-C contact data, called Curriculum Based Chromosome Reconstruction (CBCR). Specifically, our method performs this three dimensional reconstruction using cis-chromosomal interactions from Hi-C data. CBCR takes intra-chromosomal Hi-C interaction frequencies as an input and outputs a set of xyz coordinates that estimate the chromosome’s three dimensional structure in the form of a .pdb file. The algorithm relies on progressively training a distance-restraint-based algorithm with a strategy we refer to as curriculum learning. Curriculum learning divides the Hi-C data into classes based on contact frequency and progressively re-trains the distance-restraint algorithm based on the assumed importance of each curriculum in predicting the underlying chromosome structure. The distance-restraint algorithm relies on a modification of a Gaussian maximum likelihood function that scales probabilities based on the importance of features. We evaluate the performance of CBCR on both simulated and actual Hi-C data and perform validation on FISH, HiChIP, and ChIA-PET data as well. We also compare the performance of CBCR to several current methods. Our analysis shows that the use of curricula affects the rate of convergence of the optimization while decreasing the computational cost of our distance-restraint algorithm. Also, CBCR is more robust to increases in data resolution and therefore yields superior reconstruction accuracy of higher resolution data than all other methods in our comparison.
Affiliation(s)
- Van Hovenga
- Department of Mathematics, University of Colorado Colorado Springs, Colorado Springs, CO 80918, USA
- Oluwatosin Oluwadare
- Department of Computer Science, University of Colorado Colorado Springs, Colorado Springs, CO 80918, USA
- Correspondence:
|
19
|
A compendium and comparative epigenomics analysis of cis-regulatory elements in the pig genome. Nat Commun 2021; 12:2217. [PMID: 33850120 PMCID: PMC8044108 DOI: 10.1038/s41467-021-22448-x] [Citation(s) in RCA: 59] [Impact Index Per Article: 19.7] [Received: 02/18/2020] [Accepted: 03/15/2021] [Indexed: 02/01/2023] Open
Abstract
Although major advances in genomics have initiated an exciting new era of research, a lack of information regarding cis-regulatory elements has limited the genetic improvement or manipulation of pigs as a meat source and biomedical model. Here, we systematically characterize cis-regulatory elements and their functions in 12 diverse tissues from four pig breeds by adopting similar strategies as the ENCODE and Roadmap Epigenomics projects, which include RNA-seq, ATAC-seq, and ChIP-seq. In total, we generate 199 datasets and identify more than 220,000 cis-regulatory elements in the pig genome. Surprisingly, we find higher conservation of cis-regulatory elements between human and pig genomes than those between human and mouse genomes. Furthermore, the differences of topologically associating domains between the pig and human genomes are associated with morphological evolution of the head and face. Beyond generating a major new benchmark resource for pig epigenetics, our study provides basic comparative epigenetic data relevant to using pigs as models in human biomedical research.
|
20
|
Gong H, Yang Y, Zhang S, Li M, Zhang X. Application of Hi-C and other omics data analysis in human cancer and cell differentiation research. Comput Struct Biotechnol J 2021; 19:2070-2083. [PMID: 33995903 PMCID: PMC8086027 DOI: 10.1016/j.csbj.2021.04.016] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/10/2020] [Revised: 04/04/2021] [Accepted: 04/04/2021] [Indexed: 02/07/2023] Open
Abstract
With the development of 3C (chromosome conformation capture) and its derivative technology Hi-C (High-throughput chromosome conformation capture), the study of the spatial structure of the genomic sequence in the nucleus helps researchers understand the functions of biological processes such as gene transcription, replication, repair, and regulation. In this paper, we first introduce the research background and purpose of Hi-C data visualization analysis. After that, we discuss the Hi-C data analysis methods from genome 3D structure, A/B compartments, TADs (topologically associating domains), and loop detection. We also discuss how to apply genome visualization technologies to the identification of chromosome feature structures. We continue with a review of correlation analysis differences among multi-omics data, and how to apply Hi-C and other omics data analysis to cancer and cell differentiation research. Finally, we summarize the various problems in joint analyses based on Hi-C and other multi-omics data. We believe this review can help researchers better understand the progress and applications of 3D genome technology.
Affiliation(s)
- Haiyan Gong
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
- Beijing Advanced Innovation Center for Materials Genome Engineering, University of Science and Technology Beijing, Beijing 100083, China
- Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing 100083, China
- Shunde Graduate School of University of Science and Technology Beijing, Foshan 528000, China
- Yi Yang
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
- Sichen Zhang
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
- Minghong Li
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
- Xiaotong Zhang
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
- Beijing Advanced Innovation Center for Materials Genome Engineering, University of Science and Technology Beijing, Beijing 100083, China
- Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing 100083, China
- Shunde Graduate School of University of Science and Technology Beijing, Foshan 528000, China
|
21
|
Meluzzi D, Arya G. Computational approaches for inferring 3D conformations of chromatin from chromosome conformation capture data. Methods 2020; 181-182:24-34. [PMID: 31470090 PMCID: PMC7044057 DOI: 10.1016/j.ymeth.2019.08.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/15/2019] [Revised: 06/24/2019] [Accepted: 08/23/2019] [Indexed: 02/08/2023] Open
Abstract
Chromosome conformation capture (3C) and its variants are powerful experimental techniques for probing intra- and inter-chromosomal interactions within cell nuclei at high resolution and in a high-throughput, quantitative manner. The contact maps derived from such experiments provide an avenue for inferring the 3D spatial organization of the genome. This review provides an overview of the various computational methods developed in the past decade for addressing the very important but challenging problem of deducing the detailed 3D structure or structure population of chromosomal domains, chromosomes, and even entire genomes from 3C contact maps.
Affiliation(s)
- Dario Meluzzi
- Department of Medicine, University of California San Diego, La Jolla, CA 92093, United States
- Gaurav Arya
- Department of Mechanical Engineering and Materials Science, Duke University, Durham, NC 27708, United States.
|
22
|
Chen C, Yu W, Tober J, Gao P, He B, Lee K, Trieu T, Blobel GA, Speck NA, Tan K. Spatial Genome Re-organization between Fetal and Adult Hematopoietic Stem Cells. Cell Rep 2020; 29:4200-4211.e7. [PMID: 31851943 PMCID: PMC7262670 DOI: 10.1016/j.celrep.2019.11.065] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 08/12/2019] [Revised: 10/16/2019] [Accepted: 11/14/2019] [Indexed: 01/28/2023] Open
Abstract
Fetal hematopoietic stem cells (HSCs) undergo a developmental switch to become adult HSCs with distinct functional properties. To better understand the molecular mechanisms underlying the developmental switch, we have conducted deep sequencing of the 3D genome, epigenome, and transcriptome of fetal and adult HSCs in mouse. We find that chromosomal compartments and topologically associating domains (TADs) are largely conserved between fetal and adult HSCs. However, there is a global trend of increased compartmentalization and TAD boundary strength in adult HSCs. In contrast, intra-TAD chromatin interactions are much more dynamic and widespread, involving over a thousand gene promoters and distal enhancers. These developmental-stage-specific enhancer-promoter interactions are mediated by different sets of transcription factors, such as TCF3 and MAFB in fetal HSCs, versus NR4A1 and GATA3 in adult HSCs. Loss-of-function studies of TCF3 confirm the role of TCF3 in mediating condition-specific enhancer-promoter interactions and gene regulation in fetal HSCs. A developmental transition occurs between fetal and adult hematopoietic stem cells. How the 3D genome folding contributes to this transition is poorly understood. Chen et al. show global genome organization is largely conserved, but a large fraction of enhancer-promoter interactions is reorganized and regulates genes contributing to the phenotypic differences.
Affiliation(s)
- Changya Chen
- Division of Oncology and Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Wenbao Yu
- Division of Oncology and Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Joanna Tober
- Department of Cell and Developmental Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Peng Gao
- Division of Oncology and Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Bing He
- Division of Oncology and Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Kiwon Lee
- Sol Sherry Thrombosis Research Center, Temple University Medical School, Philadelphia, PA 19140, USA
- Tuan Trieu
- Department of Computer Science, University of Missouri-Columbia, Columbia, MO 65211, USA
- Gerd A Blobel
- Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Nancy A Speck
- Department of Cell and Developmental Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Kai Tan
- Division of Oncology and Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Cell and Developmental Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
|
23
|
Oluwadare O, Highsmith M, Turner D, Lieberman Aiden E, Cheng J. GSDB: a database of 3D chromosome and genome structures reconstructed from Hi-C data. BMC Mol Cell Biol 2020; 21:60. [PMID: 32758136 PMCID: PMC7405446 DOI: 10.1186/s12860-020-00304-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/26/2020] [Accepted: 07/29/2020] [Indexed: 11/10/2022] Open
Abstract
Advances in the study of chromosome conformation capture technologies, such as Hi-C technique - capable of capturing chromosomal interactions in a genome-wide scale - have led to the development of three-dimensional chromosome and genome structure reconstruction methods from Hi-C data. The three dimensional genome structure is important because it plays a role in a variety of important biological activities such as DNA replication, gene regulation, genome interaction, and gene expression. In recent years, numerous Hi-C datasets have been generated, and likewise, a number of genome structure construction algorithms have been developed. In this work, we outline the construction of a novel Genome Structure Database (GSDB) to create a comprehensive repository that contains 3D structures for Hi-C datasets constructed by a variety of 3D structure reconstruction tools. The GSDB contains over 50,000 structures from 12 state-of-the-art Hi-C data structure prediction algorithms for 32 Hi-C datasets. GSDB functions as a centralized collection of genome structures which will enable the exploration of the dynamic architectures of chromosomes and genomes for biomedical research. GSDB is accessible at http://sysbio.rnet.missouri.edu/3dgenome/GSDB
Affiliation(s)
- Oluwatosin Oluwadare
- Department of Computer Science, University of Colorado, Colorado Springs, CO, 80918, USA
- Max Highsmith
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA
- Douglass Turner
- Elastic Image Software LLC, 21 Walnut Street, Lexington, MA, 02421, USA
- Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA.
|
24
|
Zhu H, Wang Z. SCL: a lattice-based approach to infer 3D chromosome structures from single-cell Hi-C data. Bioinformatics 2020; 35:3981-3988. [PMID: 30865261 PMCID: PMC6792089 DOI: 10.1093/bioinformatics/btz181] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 10/26/2018] [Revised: 01/31/2019] [Accepted: 03/12/2019] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: In contrast to population-based Hi-C data, single-cell Hi-C data are zero-inflated and do not indicate the frequency of proximate DNA segments. There are a limited number of computational tools that can model the 3D structures of chromosomes based on single-cell Hi-C data. RESULTS: We developed single-cell lattice (SCL), a computational method to reconstruct 3D structures of chromosomes based on single-cell Hi-C data. We designed a loss function and a 2D Gaussian function specifically for the characteristics of single-cell Hi-C data. A chromosome is represented as beads-on-a-string and stored in a 3D cubic lattice. Metropolis-Hastings simulation and simulated annealing are used to simulate the structure and minimize the loss function. We evaluated the SCL-inferred 3D structures (at both 500 and 50 kb resolutions) using multiple criteria and compared them with the ones generated by another modeling software program. The results indicate that the 3D structures generated by SCL closely fit single-cell Hi-C data. We also found similar patterns of trans-chromosomal contact beads, Lamin-B1 enriched topologically associating domains (TADs), and H3K4me3 enriched TADs by mapping data from previous studies onto the SCL-inferred 3D structures. AVAILABILITY AND IMPLEMENTATION: The C++ source code of SCL is freely available at http://dna.cs.miami.edu/SCL/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hao Zhu
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, USA
- Zheng Wang
- Department of Computer Science, University of Miami, Coral Gables, FL, USA
|
25
|
Li FZ, Liu ZE, Li XY, Bu LM, Bu HX, Liu H, Zhang CM. Chromatin 3D structure reconstruction with consideration of adjacency relationship among genomic loci. BMC Bioinformatics 2020; 21:272. [PMID: 32611376 PMCID: PMC7329537 DOI: 10.1186/s12859-020-03612-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/26/2019] [Accepted: 06/18/2020] [Indexed: 01/19/2023] Open
Abstract
BACKGROUND: Chromatin 3D conformation plays important roles in regulating gene or protein functions. High-throughput chromosome conformation capture (3C)-based technologies, such as Hi-C, have been exploited to acquire the contact frequencies among genomic loci at genome-scale. Various computational tools have been proposed to recover the underlying chromatin 3D structures from in situ Hi-C contact map data. As connected residuals in a polymer, neighboring genomic loci have intrinsic mutual dependencies in building a 3D conformation. However, current methods seldom take this feature into account. RESULTS: We present a method called ShNeigh, which combines the classical MDS technique with local dependence of neighboring loci modeled by a Gaussian formula, to infer the best 3D structure from noisy and incomplete contact frequency matrices. We validated ShNeigh by comparing it to two typical distance-based algorithms, ShRec3D and ChromSDE. The comparison results on a simulated Hi-C dataset showed that, while keeping the high-speed nature of classical MDS, ShNeigh can recover the true structure better than ShRec3D and ChromSDE. Meanwhile, ShNeigh is more robust to data noise. On the publicly available human GM06990 Hi-C data, we demonstrated that the structures reconstructed by ShNeigh are more reproducible between different restriction enzymes than by ShRec3D and ChromSDE, especially at high resolutions manifested by sparse contact maps, which means ShNeigh is more robust to signal coverage. CONCLUSIONS: Our method can recover stable structures in high noise and sparse signal settings. It can also reconstruct similar structures from Hi-C data obtained using different restriction enzymes. Therefore, our method provides a new direction for enhancing the reconstruction quality of chromatin 3D structures.
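The classical MDS step that this abstract builds on can be sketched in a few lines. This is only the textbook technique (double centering of squared distances followed by an eigendecomposition), not ShNeigh itself: the Gaussian neighbor-dependence term is omitted, and the function name and the random toy points are assumptions made for illustration.

```python
import numpy as np

def classical_mds(dist, dim=3):
    """Embed points in `dim` dimensions from a matrix of pairwise distances."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n       # J = I - (1/n) 1 1^T
    gram = -0.5 * centering @ (dist ** 2) @ centering  # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(gram)                  # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:dim]                 # indices of the largest ones
    scale = np.sqrt(np.clip(vals[top], 0.0, None))     # guard against tiny negatives
    return vecs[:, top] * scale

# Toy check (hypothetical data): distances from random 3D points are reproduced
# by the embedding, up to rotation, reflection, and translation.
rng = np.random.default_rng(0)
points = rng.normal(size=(15, 3))
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

embedded = classical_mds(dist)
dist_embedded = np.linalg.norm(embedded[:, None, :] - embedded[None, :, :], axis=-1)
print("max distance error:", float(np.abs(dist - dist_embedded).max()))
```

With real Hi-C input the distance matrix is noisy and incomplete, so the recovered distances would only approximate the input; handling that is where methods such as the one described here add their own modelling.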
Affiliation(s)
- Fang-Zhen Li
- School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, China; Key Laboratory of Machine Learning and Financial Data Mining in Universities of Shandong, Jinan, China.
- Zhi-E Liu
- College of Physics and Electronic Engineering, Qilu Normal University, Jinan, China
- Xiu-Yuan Li
- School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, China; Key Laboratory of Machine Learning and Financial Data Mining in Universities of Shandong, Jinan, China
- Li-Mei Bu
- Department of Gastroenterology, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Shanghai, China
- Hong-Xia Bu
- Key Laboratory of Machine Learning and Financial Data Mining in Universities of Shandong, Jinan, China
- Hui Liu
- School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, China; Digital Media Technology Key Lab of Shandong Province, Jinan, China
- Cai-Ming Zhang
- School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, China; Digital Media Technology Key Lab of Shandong Province, Jinan, China
|
26
|
Trieu T, Oluwadare O, Wopata J, Cheng J. GenomeFlow: a comprehensive graphical tool for modeling and analyzing 3D genome structure. Bioinformatics 2020; 35:1416-1418. [PMID: 30215673 PMCID: PMC6477968 DOI: 10.1093/bioinformatics/bty802] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/30/2018] [Revised: 08/29/2018] [Accepted: 09/11/2018] [Indexed: 02/01/2023] Open
Abstract
Motivation Three-dimensional (3D) genome organization plays important functional roles in cells. User-friendly tools for reconstructing 3D genome models from chromosomal conformation capturing data and analyzing them are needed for the study of 3D genome organization. Results We built a comprehensive graphical tool (GenomeFlow) to facilitate the entire process of modeling and analysis of 3D genome organization. This process includes the mapping of Hi-C data to one-dimensional (1D) reference genomes, the generation, normalization and visualization of two-dimensional (2D) chromosomal contact maps, the reconstruction and the visualization of the 3D models of chromosome and genome, the analysis of 3D models and the integration of these models with functional genomics data. This graphical tool is the first of its kind in reconstructing, storing, analyzing and annotating 3D genome models. It can reconstruct 3D genome models from Hi-C data and visualize them in real-time. This tool also allows users to overlay gene annotation, gene expression data and genome methylation data on top of 3D genome models. Availability and implementation The source code and user manual: https://github.com/jianlin-cheng/GenomeFlow. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Tuan Trieu
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
| | - Oluwatosin Oluwadare
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
| | - Julia Wopata
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
|
27
|
Fiorillo L, Bianco S, Chiariello AM, Barbieri M, Esposito A, Annunziatella C, Conte M, Corrado A, Prisco A, Pombo A, Nicodemi M. Inference of chromosome 3D structures from GAM data by a physics computational approach. Methods 2019; 181-182:70-79. [PMID: 31604121 DOI: 10.1016/j.ymeth.2019.09.018] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/15/2019] [Revised: 08/02/2019] [Accepted: 09/27/2019] [Indexed: 01/06/2023] Open
Abstract
The combination of modelling and experimental advances can provide deep insights for understanding chromatin 3D organization and ultimately its underlying mechanisms. In particular, models of polymer physics can help comprehend the complexity of genomic contact maps, such as those emerging from technologies like Hi-C, GAM or SPRITE. Here we discuss a method to reconstruct 3D structures from Genome Architecture Mapping (GAM) data, based on PRISMR, a computational approach introduced to find the minimal polymer model best describing Hi-C input data from only polymer physics. After recapitulating the PRISMR procedure, we describe how we extended it for treating GAM data. We successfully test the method on a 6 Mb region around the Sox9 gene and, at a lower resolution, on the whole chromosome 7 in mouse embryonic stem cells. The PRISMR-derived 3D structures from GAM co-segregation data are finally validated against independent Hi-C contact maps. The method proves to be versatile and robust, hinting that it can be similarly applied to different experimental data, such as SPRITE or microscopy distance data.
Affiliation(s)
- Luca Fiorillo
- Dipartimento di Fisica, Università di Napoli Federico II, and INFN Napoli, Complesso Universitario di Monte Sant'Angelo, 80126 Naples, Italy
| | - Simona Bianco
- Dipartimento di Fisica, Università di Napoli Federico II, and INFN Napoli, Complesso Universitario di Monte Sant'Angelo, 80126 Naples, Italy.
| | - Andrea M Chiariello
- Dipartimento di Fisica, Università di Napoli Federico II, and INFN Napoli, Complesso Universitario di Monte Sant'Angelo, 80126 Naples, Italy
| | - Mariano Barbieri
- Berlin Institute for Medical Systems Biology, Max-Delbrück Centre for Molecular Medicine, Robert-Rössle Strasse, Berlin-Buch 13092, Germany
| | - Andrea Esposito
- Dipartimento di Fisica, Università di Napoli Federico II, and INFN Napoli, Complesso Universitario di Monte Sant'Angelo, 80126 Naples, Italy; Berlin Institute for Medical Systems Biology, Max-Delbrück Centre for Molecular Medicine, Robert-Rössle Strasse, Berlin-Buch 13092, Germany
| | - Carlo Annunziatella
- Dipartimento di Fisica, Università di Napoli Federico II, and INFN Napoli, Complesso Universitario di Monte Sant'Angelo, 80126 Naples, Italy
| | - Mattia Conte
- Dipartimento di Fisica, Università di Napoli Federico II, and INFN Napoli, Complesso Universitario di Monte Sant'Angelo, 80126 Naples, Italy
| | - Alfonso Corrado
- Dipartimento di Fisica, Università di Napoli Federico II, and INFN Napoli, Complesso Universitario di Monte Sant'Angelo, 80126 Naples, Italy
| | - Antonella Prisco
- Institute of Genetics and Biophysics, Consiglio Nazionale Delle Ricerche (CNR), Italy
| | - Ana Pombo
- Berlin Institute for Medical Systems Biology, Max-Delbrück Centre for Molecular Medicine, Robert-Rössle Strasse, Berlin-Buch 13092, Germany
| | - Mario Nicodemi
- Dipartimento di Fisica, Università di Napoli Federico II, and INFN Napoli, Complesso Universitario di Monte Sant'Angelo, 80126 Naples, Italy; Berlin Institute of Health (BIH), MDC-Berlin, Germany.
|
28
|
Rosenthal M, Bryner D, Huffer F, Evans S, Srivastava A, Neretti N. Bayesian Estimation of Three-Dimensional Chromosomal Structure from Single-Cell Hi-C Data. J Comput Biol 2019; 26:1191-1202. [PMID: 31211598 PMCID: PMC6856950 DOI: 10.1089/cmb.2019.0100] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 02/03/2023] Open
Abstract
The problem of three-dimensional (3D) chromosome structure inference from Hi-C data sets is important and challenging. While bulk Hi-C data sets contain contact information derived from millions of cells and can capture major structural features shared by the majority of cells in the sample, they do not provide information about local variability between cells. Single-cell Hi-C can overcome this problem, but contact matrices are generally very sparse, making structural inference more problematic. We have developed a Bayesian multiscale approach, named Structural Inference via Multiscale Bayesian Approach, to infer 3D structures of chromosomes from single-cell Hi-C while including the bulk Hi-C data and some regularization terms as a prior. We study the landscape of solutions for each single-cell Hi-C data set as a function of prior strength and demonstrate clustering of solutions using data from the same cell.
Affiliation(s)
- Michael Rosenthal
- Science and Technology Department, Naval Surface Warfare Center, Panama City Division, Panama City, Florida
| | - Darshan Bryner
- Science and Technology Department, Naval Surface Warfare Center, Panama City Division, Panama City, Florida
| | - Fred Huffer
- Department of Statistics, Florida State University, Tallahassee, Florida
| | - Shane Evans
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island
| | - Anuj Srivastava
- Department of Statistics, Florida State University, Tallahassee, Florida
| | - Nicola Neretti
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island; Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, Rhode Island
|
29
|
Hou J, Wu T, Cao R, Cheng J. Protein tertiary structure modeling driven by deep learning and contact distance prediction in CASP13. Proteins 2019; 87:1165-1178. [PMID: 30985027 PMCID: PMC6800999 DOI: 10.1002/prot.25697] [Citation(s) in RCA: 99] [Impact Index Per Article: 19.8] [Received: 02/16/2019] [Revised: 04/04/2019] [Accepted: 04/12/2019] [Indexed: 12/28/2022]
Abstract
Predicting residue‐residue distance relationships (eg, contacts) has become the key direction to advance protein structure prediction since the 2014 CASP11 experiment, while deep learning has revolutionized the technology for contact and distance distribution prediction since its debut in the 2012 CASP10 experiment. During the 2018 CASP13 experiment, we enhanced our MULTICOM protein structure prediction system with three major components: contact distance prediction based on deep convolutional neural networks, distance‐driven template‐free (ab initio) modeling, and protein model ranking empowered by deep learning and contact prediction. Our experiment demonstrates that contact distance prediction and deep learning methods are the key reasons that MULTICOM was ranked 3rd out of all 98 predictors in both template‐free and template‐based structure modeling in CASP13. Deep convolutional neural networks can utilize global information in pairwise residue‐residue features such as coevolution scores to substantially improve contact distance prediction, which played a decisive role in correctly folding some free modeling and hard template‐based modeling targets. Deep learning also successfully integrated one‐dimensional structural features, two‐dimensional contact information, and three‐dimensional structural quality scores to improve protein model quality assessment, where the contact prediction was demonstrated to consistently enhance ranking of protein models for the first time. The success of the MULTICOM system clearly shows that protein contact distance prediction and model selection driven by deep learning hold the key to solving the protein structure prediction problem. However, there are still challenges in accurately predicting protein contact distance when there are few homologous sequences, folding proteins from noisy contact distances, and ranking models of hard targets.
Affiliation(s)
- Jie Hou
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri
| | - Tianqi Wu
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri
| | - Renzhi Cao
- Department of Computer Science, Pacific Lutheran University, Tacoma, Washington
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri
|
30
|
Oluwadare O, Highsmith M, Cheng J. An Overview of Methods for Reconstructing 3-D Chromosome and Genome Structures from Hi-C Data. Biol Proced Online 2019; 21:7. [PMID: 31049033 PMCID: PMC6482566 DOI: 10.1186/s12575-019-0094-0] [Citation(s) in RCA: 69] [Impact Index Per Article: 13.8] [Received: 01/15/2019] [Accepted: 04/01/2019] [Indexed: 01/08/2023] Open
Abstract
Over the past decade, methods for predicting three-dimensional (3-D) chromosome and genome structures have proliferated. This has been primarily due to the development of high-throughput, next-generation chromosome conformation capture (3C) technologies, which have provided next-generation sequencing data about chromosome conformations in order to map the 3-D genome structure. The introduction of the Hi-C technique, a variant of the 3C method, has allowed researchers to extract the interaction frequency (IF) for all loci of a genome at high throughput and at a genome-wide scale. In this review we describe, categorize, and compare the various methods developed to map chromosome and genome structures from 3C data, particularly Hi-C data. We summarize the improvements introduced by these methods, describe the approach used for method evaluation, and discuss how these advancements shape the future of genome structure construction.
Affiliation(s)
- Oluwatosin Oluwadare
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211 USA
| | - Max Highsmith
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211 USA
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211 USA
- Informatics Institute, University of Missouri, Columbia, MO 65211 USA
|
31
|
Hierarchical Reconstruction of High-Resolution 3D Models of Large Chromosomes. Sci Rep 2019; 9:4971. [PMID: 30899036 PMCID: PMC6428844 DOI: 10.1038/s41598-019-41369-w] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 09/23/2018] [Accepted: 03/07/2019] [Indexed: 11/08/2022] Open
Abstract
Eukaryotic chromosomes are often composed of components organized into multiple scales, such as nucleosomes, chromatin fibers, topologically associated domains (TAD), chromosome compartments, and chromosome territories. Therefore, reconstructing detailed 3D models of chromosomes in high resolution is useful for advancing genome research. However, the task of constructing quality high-resolution 3D models is still challenging with existing methods. Hence, we designed a hierarchical algorithm, called Hierarchical3DGenome, to reconstruct 3D chromosome models at high resolution (<=5 Kilobase (KB)). The algorithm first reconstructs high-resolution 3D models at TAD level. The TAD models are then assembled to form complete high-resolution chromosomal models. The assembly of TAD models is guided by a complete low-resolution chromosome model. The algorithm is successfully used to reconstruct 3D chromosome models at 5 KB resolution for the human B-cell (GM12878). These high-resolution models satisfy Hi-C chromosomal contacts well and are consistent with models built at lower (i.e. 1 MB) resolution, and with the data of fluorescent in situ hybridization experiments. The Java source code of Hierarchical3DGenome and its user manual are available here https://github.com/BDM-Lab/Hierarchical3DGenome .
|
32
|
Varoquaux N. Unfolding the Genome: The Case Study of P. falciparum. Int J Biostat 2018; 15:ijb-2017-0061. [PMID: 29878883 DOI: 10.1515/ijb-2017-0061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2017] [Accepted: 05/10/2018] [Indexed: 11/15/2022]
Abstract
The development of new ways to probe samples for the three-dimensional (3D) structure of DNA paves the way for in-depth and systematic analyses of the genome architecture. 3C-like methods coupled with high-throughput sequencing can now assess physical interactions between pairs of loci in a genome-wide fashion, thus enabling the creation of genome-by-genome contact maps. The spreading of such protocols creates many new opportunities for methodological development: how can we infer 3D models from these contact maps? Can such models help us gain insights into biological processes? Several recent studies applied such protocols to P. falciparum (the deadliest of the five human malaria parasites), assessing its genome organization at different moments of its life cycle. With its small genomic size, fairly simple (yet changing) genomic organization during its life cycle and strong correlation between chromatin folding and gene expression, this parasite is the ideal case study for applying and developing methods to infer 3D models and use them for downstream analysis. Here, I review a set of methods used to build and analyse three-dimensional models from contact map data with a special highlight on P. falciparum's genome organization.
Affiliation(s)
- Nelle Varoquaux
- Statistics, University of California, Berkeley, 367 Evans Hall, Berkeley, California, USA
- Berkeley Institute for Data Science, 190 Doe Library, Berkeley, United States of America
|
33
|
Waldispühl J, Zhang E, Butyaev A, Nazarova E, Cyr Y. Storage, visualization, and navigation of 3D genomics data. Methods 2018; 142:74-80. [PMID: 29792917 DOI: 10.1016/j.ymeth.2018.05.008] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 11/15/2017] [Revised: 05/07/2018] [Accepted: 05/09/2018] [Indexed: 01/27/2023] Open
Abstract
The field of 3D genomics has grown at an increasing rate over the last decade. The volume and complexity of the 2D and 3D data produced are progressively outpacing the capacities of the technology previously used for distributing genome sequences. The emergence of new technologies also provides novel opportunities for the development of innovative approaches. In this paper, we review the state-of-the-art computing technology, as well as the solutions adopted by the platforms currently available.
Affiliation(s)
| | - Eric Zhang
- School of Computer Science, McGill University, Montréal, Canada
| | | | - Elena Nazarova
- School of Computer Science, McGill University, Montréal, Canada
| | - Yan Cyr
- Beam Me Up Labs, Montréal, Canada
|
34
|
Caudai C, Salerno E, Zoppe M, Merelli I, Tonazzini A. ChromStruct 4: A Python Code to Estimate the Chromatin Structure from Hi-C Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018:1-1. [PMID: 29993555 DOI: 10.1109/tcbb.2018.2838669] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 06/08/2023]
Abstract
A method and a stand-alone Python(TM) code to estimate the 3D chromatin structure from chromosome conformation capture data are presented. The method is based on a multiresolution, modified-bead-chain chromatin model, evolved through quaternion operators in a Monte Carlo sampling. The solution space to be sampled is generated by a score function with a data-fit part and a constraint part where the available prior knowledge is implicitly coded. The final solution is a set of 3D configurations that are compatible with both the data and the prior knowledge. The iterative code, provided here as additional material, is equipped with a graphical user interface and stores its results in standard-format files for 3D visualization. We describe the mathematical-computational aspects of the method and explain the details of the code. Some experimental results are reported, with a demonstration of their fit to the data.
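As an illustration of how a bead chain can be evolved with quaternion operators, the sketch below rotates the tail of a chain about an axis through a pivot bead, which is one plausible kind of Monte Carlo proposal move. It is not taken from ChromStruct: the function names `quat_mul`, `rotate` and `pivot_move` are hypothetical, and the acceptance step against a score function is omitted.

```python
import math

def quat_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def rotate(point, axis, angle):
    """Rotate `point` by `angle` radians about `axis` through the origin."""
    norm = math.sqrt(sum(c * c for c in axis))
    s = math.sin(angle / 2.0) / norm
    q = (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)
    q_conj = (q[0], -q[1], -q[2], -q[3])
    # v' = q v q*, with the point embedded as a pure quaternion.
    _, x, y, z = quat_mul(quat_mul(q, (0.0, *point)), q_conj)
    return (x, y, z)

def pivot_move(chain, pivot, axis, angle):
    """Rotate every bead after `pivot` about an axis through the pivot bead."""
    px, py, pz = chain[pivot]
    moved = list(chain[: pivot + 1])
    for x, y, z in chain[pivot + 1:]:
        rx, ry, rz = rotate((x - px, y - py, z - pz), axis, angle)
        moved.append((rx + px, ry + py, rz + pz))
    return moved

# Toy chain (an assumption for illustration): three beads on the x axis.
chain = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
bent = pivot_move(chain, pivot=1, axis=(0.0, 0.0, 1.0), angle=math.pi / 2)
```

Because the move is a rigid rotation of the tail, distances between consecutive beads are preserved, which is why such moves are convenient for chain models with fixed bond lengths.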
|
35
|
Abstract
The use of 3C-based methods has revealed the importance of the 3D organization of the chromatin for key aspects of genome biology. However, the different caveats of the variants of 3C techniques have limited their scope and the range of scientific fields that could benefit from these approaches. To address these limitations, we present 4Cin, a method to generate 3D models and derive virtual Hi-C (vHi-C) heat maps of genomic loci based on 4C-seq or any kind of 4C-seq-like data, such as those derived from NG Capture-C. 3D genome organization is determined by integrative consideration of the spatial distances derived from as few as four 4C-seq experiments. The 3D models obtained from 4C-seq data, together with their associated vHi-C maps, allow the inference of all chromosomal contacts within a given genomic region, facilitating the identification of Topological Associating Domains (TAD) boundaries. Thus, 4Cin offers a much cheaper, accessible and versatile alternative to other available techniques while providing a comprehensive 3D topological profiling. By studying TAD modifications in genomic structural variants associated with disease phenotypes and performing cross-species evolutionary comparisons of 3D chromatin structures in a quantitative manner, we demonstrate the broad potential and novel range of applications of our method.
Chromatin conformation capture (3C) methods have revealed the importance of the 3D organization of the chromatin, which is key to understanding many aspects of genome biology. But each of these methods has its own limitations. Here we present 4Cin, software that generates 3D models of the chromatin from a small number of 4C-seq experiments, a 3C-based method that provides the frequency of contacts between one fragment and the genome (one vs all). These 3D models are used to infer all chromosomal contacts within a given genomic region (many vs many). The contact maps facilitate the identification of Topological Associating Domains boundaries. Our software offers a much cheaper, accessible and versatile alternative to other available techniques while providing a comprehensive 3D topological profiling. We applied our software to two different loci to study modifications in genomic structural variants associated with disease phenotypes and to compare the chromatin organization in two different species in a quantitative manner.
|
36
|
Oluwadare O, Zhang Y, Cheng J. A maximum likelihood algorithm for reconstructing 3D structures of human chromosomes from chromosomal contact data. BMC Genomics 2018; 19:161. [PMID: 29471801 PMCID: PMC5824572 DOI: 10.1186/s12864-018-4546-8] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 05/04/2017] [Accepted: 02/13/2018] [Indexed: 01/07/2023] Open
Abstract
Background The development of chromosomal conformation capture techniques, particularly the Hi-C technique, has made the analysis and study of the spatial conformation of a genome an important topic in bioinformatics and computational biology. Aided by high-throughput next-generation sequencing techniques, the Hi-C technique can generate genome-wide, large-scale intra- and inter-chromosomal interaction data capable of describing in detail the spatial interactions within a genome. These data can be used to reconstruct 3D structures of chromosomes that can be used to study DNA replication, gene regulation, genome interaction, genome folding, and genome function. Results Here, we introduce a maximum likelihood algorithm called 3DMax to construct the 3D structure of a chromosome from Hi-C data. 3DMax employs a maximum likelihood approach to infer the 3D structures of a chromosome, while automatically re-estimating the conversion factor (α) for converting Interaction Frequency (IF) to distance. Our results show that the models generated by 3DMax from a simulated Hi-C dataset match the true models better than most of the existing methods. 3DMax is more robust to structural variability and noise. Compared on a real Hi-C dataset, 3DMax constructs chromosomal models that fit the data better than most methods, and it is faster than all other methods. The models reconstructed by 3DMax were consistent with fluorescent in situ hybridization (FISH) experiments and existing knowledge about the organization of human chromosomes, such as chromosome compartmentalization. Conclusions 3DMax is an effective approach to reconstructing 3D chromosomal models. The results, and the models generated for the simulated and real Hi-C datasets are available here: http://sysbio.rnet.missouri.edu/bdm_download/3DMax/. The source code is available here: https://github.com/BDM-Lab/3DMax. A short video demonstrating how to use 3DMax can be found here: https://youtu.be/ehQUFWoHwfo.
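The role of the conversion factor α can be shown with a small sketch. It assumes the commonly used inverse power law d = 1 / IF^α, which the abstract does not spell out, and it replaces the maximum likelihood optimisation with a plain least-squares grid search over α for a fixed structure. The names `if_to_distance`, `squared_error` and `best_alpha` are hypothetical and are not part of the 3DMax code.

```python
import math

def if_to_distance(freq, alpha):
    """Assumed conversion d = 1 / IF**alpha; only defined for IF > 0."""
    return 1.0 / (freq ** alpha)

def squared_error(coords, contacts, alpha):
    """Sum of squared gaps between model distances and converted distances.

    `contacts` maps index pairs (i, j) to positive interaction frequencies.
    """
    total = 0.0
    for (i, j), freq in contacts.items():
        model = math.dist(coords[i], coords[j])
        total += (model - if_to_distance(freq, alpha)) ** 2
    return total

def best_alpha(coords, contacts, candidates):
    """Pick the candidate alpha that best fits a fixed structure."""
    return min(candidates, key=lambda a: squared_error(coords, contacts, a))

# Toy data (an assumption for illustration): four beads on a line, with
# interaction frequencies generated from the true distances at alpha = 0.5.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
true_alpha = 0.5
contacts = {}
for i in range(len(coords)):
    for j in range(i + 1, len(coords)):
        d = math.dist(coords[i], coords[j])
        contacts[(i, j)] = d ** (-1.0 / true_alpha)

chosen = best_alpha(coords, contacts, [0.25, 0.5, 0.75, 1.0])
```

In a full reconstruction the coordinates and α would be updated together; fixing the coordinates here only isolates how the choice of α changes the fit.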
Affiliation(s)
- Oluwatosin Oluwadare
- Electrical Engineering & Computer Science Department, University of Missouri, Columbia, MO, 65211, USA
| | - Yuxiang Zhang
- Electrical Engineering & Computer Science Department, University of Missouri, Columbia, MO, 65211, USA
| | - Jianlin Cheng
- Electrical Engineering & Computer Science Department, University of Missouri, Columbia, MO, 65211, USA; Informatics Institute, University of Missouri, Columbia, MO, 65211, USA.
|