1. Liu T, Qiu QT, Hua KJ, Ma BG. Chromosome structure modeling tools and their evaluation in bacteria. Brief Bioinform 2024; 25:bbae044. [PMID: 38385874 PMCID: PMC10883143 DOI: 10.1093/bib/bbae044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 12/31/2023] [Accepted: 01/22/2024] [Indexed: 02/23/2024] Open
Abstract
The three-dimensional (3D) structure of bacterial chromosomes is crucial for understanding chromosome function. With the growing availability of high-throughput chromosome conformation capture (3C/Hi-C) data, 3D structure reconstruction algorithms have become powerful tools for studying bacterial chromosome structure and function. Recommendations on chromosome structure reconstruction tools are therefore highly desirable to facilitate prokaryotic 3D genomics. In this work, we review existing chromosome 3D structure reconstruction algorithms and classify them, based on their underlying computational models, into two categories: constraint-based modeling and thermodynamics-based modeling. We briefly compare these algorithms using 3C/Hi-C datasets and fluorescence microscopy data obtained from Escherichia coli and Caulobacter crescentus, as well as simulated datasets. We discuss current challenges in 3D reconstruction algorithms for bacterial chromosomes, focusing primarily on software usability. Finally, we briefly outline future research directions for bacterial chromosome structure reconstruction algorithms.
Affiliation(s)
- Tong Liu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Qin-Tian Qiu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Kang-Jian Hua
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Bin-Guang Ma
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China

2. Feng C, Wang J, Chu X. Large-scale data-driven and physics-based models offer insights into the relationships among the structures, dynamics, and functions of chromosomes. J Mol Cell Biol 2023; 15:mjad042. [PMID: 37365687 PMCID: PMC10782906 DOI: 10.1093/jmcb/mjad042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Revised: 03/22/2023] [Accepted: 06/25/2023] [Indexed: 06/28/2023] Open
Abstract
The organized three-dimensional chromosome architecture in the cell nucleus provides scaffolding for precise regulation of gene expression. When the cell changes its identity in the cell-fate decision-making process, extensive rearrangements of chromosome structures occur, accompanied by large-scale adaptations of gene expression, underscoring the importance of chromosome dynamics in shaping genome function. Over the last two decades, rapid development of experimental methods has provided unprecedented data to characterize the hierarchical structures and dynamic properties of chromosomes. In parallel, these large datasets offer valuable opportunities for developing quantitative computational models. Here, we review a variety of large-scale polymer models developed to investigate the structures and dynamics of chromosomes. According to their underlying modeling strategies, these approaches can be classified into data-driven ('top-down') and physics-based ('bottom-up') categories. We discuss the insights they offer into the relationships among the structures, dynamics, and functions of chromosomes, and propose developing approaches that integrate data from different experimental technologies with multidisciplinary theoretical and simulation methods and different modeling strategies.
Affiliation(s)
- Cibo Feng
  - Advanced Materials Thrust, Function Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China
  - Green e Materials Laboratory, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China
  - College of Physics, Jilin University, Changchun 130012, China
- Jin Wang
  - Department of Chemistry and Physics, The State University of New York at Stony Brook, Stony Brook, NY 11794, USA
- Xiakun Chu
  - Advanced Materials Thrust, Function Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China
  - Green e Materials Laboratory, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China
  - Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong SAR 999077, China
  - Guangzhou Municipal Key Laboratory of Materials Informatics, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China

3. Soroczynski J, Risca VI. Technological advances in probing 4D genome organization. Curr Opin Cell Biol 2023; 84:102211. [PMID: 37556867 PMCID: PMC10588670 DOI: 10.1016/j.ceb.2023.102211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2023] [Revised: 05/13/2023] [Accepted: 06/29/2023] [Indexed: 08/11/2023]
Abstract
The last two decades of work on chromosome conformation in eukaryotic nuclei have revealed a complex and highly regulated hierarchy of architectural features, from self-associating domains and compartmental interactions to locus-specific loops. Recent findings have shown that these structures are dynamic and heterogeneous, with emerging insights into the factors that shape them and implications for the control of transcription and other nuclear processes. Here, we review the latest advances in the DNA sequencing- and microscopy-based technologies for probing these features in space and time (4D) and discuss how they have been combined with complementary approaches such as genetic perturbations, protein and RNA measurements, and modeling to gain mechanistic insights about genome regulation across space and time.
Affiliation(s)
- Jan Soroczynski
  - Laboratory of Genome Architecture and Dynamics, The Rockefeller University, 1230 York Ave., Box 176, New York, NY 10065, USA
  - David Rockefeller Graduate Program in Bioscience, The Rockefeller University, 1230 York Ave., New York, NY 10065, USA
- Viviana I Risca
  - Laboratory of Genome Architecture and Dynamics, The Rockefeller University, 1230 York Ave., Box 176, New York, NY 10065, USA

4. Habeck M. Bayesian methods in integrative structure modeling. Biol Chem 2023; 404:741-754. [PMID: 37505205 DOI: 10.1515/hsz-2023-0145] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/27/2023] [Accepted: 07/07/2023] [Indexed: 07/29/2023]
Abstract
There is a growing interest in characterizing the structure and dynamics of large biomolecular assemblies and their interactions within the cellular environment. A diverse array of experimental techniques allows us to study biomolecular systems on a variety of length and time scales. These techniques range from imaging with light, X-rays or electrons, to spectroscopic methods, cross-linking mass spectrometry and functional genomics approaches, and are complemented by AI-assisted protein structure prediction methods. A challenge is to integrate all of these data into a model of the system and its functional dynamics. This review focuses on Bayesian approaches to integrative structure modeling. We sketch the principles of Bayesian inference, highlight recent applications to integrative modeling and conclude with a discussion of current challenges and future perspectives.
Affiliation(s)
- Michael Habeck
  - Microscopic Image Analysis Group, Jena University Hospital, D-07743 Jena, Germany
  - Max Planck Institute for Multidisciplinary Sciences, D-37077 Göttingen, Germany

5. Zhan Y, Yildirim A, Boninsegna L, Alber F. Conformational analysis of chromosome structures reveals vital role of chromosome morphology in gene function. bioRxiv 2023:2023.02.18.528138. [PMID: 36824908 PMCID: PMC9949133 DOI: 10.1101/2023.02.18.528138] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 02/21/2023]
Abstract
The 3D conformations of chromosomes are highly variant and stochastic between single cells. Recent progress in multiplexed 3D FISH imaging, single cell Hi-C and genome structure modeling allows a closer analysis of the structural variations of chromosomes between cells to infer the functional implications of structural heterogeneity. Here, we introduce a two-step dimensionality reduction method to classify a population of single cell 3D chromosome structures, either from simulation or imaging experiment, into dominant conformational clusters with distinct chromosome morphologies. We found that almost half of all structures for each chromosome can be described by 5-10 dominant chromosome morphologies, which play a fundamental role in establishing conformational variation of chromosomes. These morphologies are conserved in different cell types, but vary in their relative proportion of structures. Chromosome morphologies are distinguished by the presence or absence of characteristic chromosome territory domains, which expose some chromosomal regions to varying nuclear environments in different morphologies, such as nuclear positions and associations to nuclear speckles, lamina, and nucleoli. These observations point to distinct functional variations for the same chromosomal region in different chromosome morphologies. We validated chromosome conformational clusters and their associated subnuclear locations with data from DNA-MERFISH imaging and single cell sci-HiC data. Our method provides an important approach to assess the variation of chromosome structures between cells and link differences in conformational states with distinct gene functions.
6. Makai D, Cseh A, Sepsi A, Makai S. A Multigraph-Based Representation of Hi-C Data. Genes (Basel) 2022; 13:genes13122189. [PMID: 36553456 PMCID: PMC9778156 DOI: 10.3390/genes13122189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Revised: 11/10/2022] [Accepted: 11/15/2022] [Indexed: 11/25/2022] Open
Abstract
Chromatin-chromatin interactions and three-dimensional (3D) spatial structures are involved in transcriptional regulation and have a decisive role in DNA replication and repair. To understand how individual genes and their regulatory elements function within the larger genomic context, and how the genome reacts to environmental stimuli, the linear sequence information needs to be interpreted in three-dimensional space, which is still a challenging task. Here, we propose a novel, heuristic approach to represent Hi-C datasets by a whole-genome pseudo-structure in 3D space. Our approach first constructs a multigraph from genomic-sequence data and Hi-C interaction data, then applies a modified force-directed layout algorithm. The resulting layout is a pseudo-structure. While pseudo-structures are not based on direct observation and their details depend on the chosen settings, they surprisingly show overall similarities to the known genome structures of both barley and rice, namely the Rabl and rosette-like conformations. The approach could be extended with additional omics data (RNA-seq, ChIP-seq, etc.), allowing visualization of the dynamics of the pseudo-structures across various tissues or developmental stages. Furthermore, this method would make it possible to revisit most Hi-C data accumulated in the public domain over the last decade.
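The two steps named in this abstract (build a multigraph from sequence adjacency plus Hi-C contacts, then run a force-directed layout in 3D) can be sketched as follows. This is a minimal, generic spring-and-repulsion layout, not the authors' modified algorithm; the function names, force laws, and parameters (`iterations`, `max_step`, `seed`) are all assumptions chosen for illustration.

```python
import math
import random


def build_multigraph_edges(n_bins, hic_contacts):
    """Combine sequence-backbone edges with Hi-C edges into one edge list.

    Parallel edges between the same pair of bins are allowed, which is
    what makes this a multigraph rather than a simple graph.
    """
    backbone = [(i, i + 1, 1.0) for i in range(n_bins - 1)]
    hic = [(i, j, float(w)) for i, j, w in hic_contacts if i != j]
    return backbone + hic


def force_directed_layout_3d(n_nodes, edges, iterations=200, max_step=0.1, seed=0):
    """Generic spring/repulsion layout in 3D (illustrative parameters only)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n_nodes)]
    for it in range(iterations):
        force = [[0.0, 0.0, 0.0] for _ in range(n_nodes)]
        # Repulsion between every pair of nodes, falling off as 1/d^2.
        for a in range(n_nodes):
            for b in range(a + 1, n_nodes):
                delta = [pos[a][k] - pos[b][k] for k in range(3)]
                dist = math.sqrt(sum(d * d for d in delta)) + 1e-9
                mag = 1.0 / (dist * dist)
                for k in range(3):
                    push = mag * delta[k] / dist
                    force[a][k] += push
                    force[b][k] -= push
        # Spring attraction along every edge, scaled by the edge weight.
        for i, j, w in edges:
            delta = [pos[i][k] - pos[j][k] for k in range(3)]
            for k in range(3):
                pull = w * delta[k]
                force[i][k] -= pull
                force[j][k] += pull
        # Move each node along its net force, capped by a cooling step size.
        cap = max_step * (1.0 - it / iterations)
        for a in range(n_nodes):
            norm = math.sqrt(sum(f * f for f in force[a])) + 1e-9
            scale = min(norm, cap) / norm
            for k in range(3):
                pos[a][k] += force[a][k] * scale
    return pos


# Toy input: 10 genomic bins, one long-range contact and one contact
# that parallels a backbone edge (hypothetical weights).
edges = build_multigraph_edges(10, [(0, 9, 5.0), (0, 1, 2.0)])
layout = force_directed_layout_3d(10, edges)
```

Each entry of `layout` is an `[x, y, z]` coordinate for one bin. As the abstract cautions for pseudo-structures in general, the result depends on the settings and is not a measured structure.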
Affiliation(s)
- Diána Makai
  - Department of Biological Resources, Eötvös Loránd Research Network, Centre for Agricultural Research, 2462 Martonvásár, Hungary
- András Cseh
  - Department of Molecular Breeding, Eötvös Loránd Research Network, Centre for Agricultural Research, 2462 Martonvásár, Hungary
- Adél Sepsi
  - Department of Biological Resources, Eötvös Loránd Research Network, Centre for Agricultural Research, 2462 Martonvásár, Hungary
- Szabolcs Makai
  - Department of Molecular Breeding, Eötvös Loránd Research Network, Centre for Agricultural Research, 2462 Martonvásár, Hungary
  - Department of Cereal Breeding, Eötvös Loránd Research Network, Centre for Agricultural Research, 2462 Martonvásár, Hungary

7. Hirata Y, Oda AH, Motono C, Shiro M, Ohta K. Imputation-free reconstructions of three-dimensional chromosome architectures in human diploid single-cells using allele-specified contacts. Sci Rep 2022; 12:11757. [PMID: 35817790 PMCID: PMC9273635 DOI: 10.1038/s41598-022-15038-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2022] [Accepted: 06/16/2022] [Indexed: 11/18/2022] Open
Abstract
Single-cell Hi-C analysis of diploid human cells is difficult because of the lack of dense chromosome contact information and the presence of homologous chromosomes with very similar nucleotide sequences. Here, we propose a new algorithm to reconstruct three-dimensional (3D) chromosomal architectures from the Hi-C dataset of single diploid human cells using allele-specific single-nucleotide variations (SNVs). We modified our recurrence plot-based algorithm, which is suitable for estimating the 3D chromosome structure from sparse Hi-C datasets, by adding a function that discriminates SNVs specific to each homologous chromosome. In essence, we treat a contact map as a recurrence plot. Importantly, the proposed method does not require any imputation of ambiguous segment information, yet can efficiently reconstruct 3D chromosomal structures in single human diploid cells at 1-Mb resolution. Datasets of segments without allele-specific SNVs, which were considered to be of little value, can also be used to validate the estimated chromosome structure. Introducing an additional mathematical measure called a refinement further improved the resolution to 40 kb or 100 kb. The reconstruction data supported the notion that human chromosomes form chromosomal territories and take fractal structures in which the dimension of the underlying chromosome structure is a non-integer value.
Affiliation(s)
- Yoshito Hirata
  - Faculty of Engineering, Information and Systems, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8573, Japan
- Arisa H Oda
  - Department of Life Sciences, Graduate School of Arts and Sciences, The University of Tokyo, Meguro-ku, Tokyo, 153-8902, Japan
- Chie Motono
  - Cellular and Molecular Biotechnology Research Institute, National Institute of Advanced Industrial Science and Technology, Koto-ku, Tokyo, 135-0064, Japan
  - Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo, 169-0072, Japan
- Masanori Shiro
  - Mathematical Neuroscience Research Group, Human Informatics and Interaction Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Ibaraki, 305-8568, Japan
- Kunihiro Ohta
  - Department of Life Sciences, Graduate School of Arts and Sciences, The University of Tokyo, Meguro-ku, Tokyo, 153-8902, Japan
  - Research Center for Complex Systems Biology, Universal Biology Institute, 3-8-1 Komaba, Meguro-ku, Tokyo, 153-8902, Japan

8. Yildirim A, Boninsegna L, Zhan Y, Alber F. Uncovering the Principles of Genome Folding by 3D Chromatin Modeling. Cold Spring Harb Perspect Biol 2022; 14:a039693. [PMID: 34400556 PMCID: PMC9248826 DOI: 10.1101/cshperspect.a039693] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/25/2022]
Abstract
Our understanding of how genomic DNA is tightly packed inside the nucleus, yet is still accessible for vital cellular processes, has grown dramatically over recent years with advances in microscopy and genomics technologies. Computational methods have played a pivotal role in the structural interpretation of experimental data, which helped unravel some organizational principles of genome folding. Here, we give an overview of current computational efforts in mechanistic and data-driven 3D chromatin structure modeling. We discuss strengths and limitations of different methods and evaluate the added value and benefits of computational approaches to infer the 3D structural and dynamic properties of the genome and its underlying mechanisms at different scales and resolutions, ranging from the dynamic formation of chromatin loops and topologically associating domains to nuclear compartmentalization of chromatin and nuclear bodies.
Affiliation(s)
- Asli Yildirim
  - Institute for Quantitative and Computational Biosciences, Department of Microbiology, Immunology and Molecular Genetics, University of California Los Angeles, Los Angeles, California 90095, USA
- Lorenzo Boninsegna
  - Institute for Quantitative and Computational Biosciences, Department of Microbiology, Immunology and Molecular Genetics, University of California Los Angeles, Los Angeles, California 90095, USA
- Yuxiang Zhan
  - Institute for Quantitative and Computational Biosciences, Department of Microbiology, Immunology and Molecular Genetics, University of California Los Angeles, Los Angeles, California 90095, USA
  - Quantitative and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, USA
- Frank Alber
  - Institute for Quantitative and Computational Biosciences, Department of Microbiology, Immunology and Molecular Genetics, University of California Los Angeles, Los Angeles, California 90095, USA
  - Quantitative and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, USA

9. Wang H, Yang J, Zhang Y, Qian J, Wang J. Reconstruct high-resolution 3D genome structures for diverse cell-types using FLAMINGO. Nat Commun 2022; 13:2645. [PMID: 35551182 PMCID: PMC9098643 DOI: 10.1038/s41467-022-30270-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 08/19/2021] [Accepted: 04/22/2022] [Indexed: 11/30/2022] Open
Abstract
High-resolution reconstruction of spatial chromosome organization from chromatin contact maps is in high demand but is hindered by extensive pairwise constraints, substantial missing data, and limited resolution and cell-type availability. Here, we present FLAMINGO, a computational method that addresses these challenges by compressing inter-dependent Hi-C interactions to delineate the underlying low-rank structures in 3D space, based on the low-rank matrix completion technique. FLAMINGO successfully generates 5 kb- and 1 kb-resolution spatial conformations for all chromosomes in the human genome across multiple cell types, the largest such resource to date. Compared to other methods using various experimental metrics, FLAMINGO consistently demonstrates superior accuracy in recapitulating observed structures, with gains in scalability of orders of magnitude. The reconstructed 3D structures facilitate discovery of higher-order multi-way interactions, suggest biological interpretations of long-range QTLs, reveal geometrical properties of chromatin, and provide high-resolution references for understanding structural variability. Importantly, FLAMINGO achieves robust predictions against high rates of missing data and significantly boosts 3D structure resolution. Moreover, FLAMINGO shows robust cross-cell-type structure predictions that capture cell-type-specific spatial configurations via integration of 1D epigenomic signals. FLAMINGO can be widely applied to large-scale chromatin contact maps and expand high-resolution spatial genome conformations for diverse cell types.
High-resolution reconstruction of spatial chromosome organisation is in demand. Here the authors report FLAMINGO, a method for reconstructing high-resolution 3D genome organisation from Hi-C data, which they use to generate both 5 kb- and 1 kb-resolution 3D chromosomal structures for the human genome.
Affiliation(s)
- Hao Wang
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA
- Jiaxin Yang
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA
- Yu Zhang
  - Center for Immunobiology, Department of Investigative Medicine, Western Michigan University Homer Stryker M.D. School of Medicine, Kalamazoo, MI, 49007, USA
- Jianliang Qian
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA
  - Department of Mathematics, Michigan State University, East Lansing, MI, 48824, USA
- Jianrong Wang
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA

10. Rowland B, Huh R, Hou Z, Crowley C, Wen J, Shen Y, Hu M, Giusti-Rodríguez P, Sullivan PF, Li Y. THUNDER: A reference-free deconvolution method to infer cell type proportions from bulk Hi-C data. PLoS Genet 2022; 18:e1010102. [PMID: 35259165 PMCID: PMC8932604 DOI: 10.1371/journal.pgen.1010102] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/01/2021] [Revised: 03/18/2022] [Accepted: 02/14/2022] [Indexed: 11/30/2022] Open
Abstract
Hi-C data provide population averaged estimates of three-dimensional chromatin contacts across cell types and states in bulk samples. Effective analysis of Hi-C data entails controlling for the potential confounding factor of differential cell type proportions across heterogeneous bulk samples. We propose a novel unsupervised deconvolution method for inferring cell type composition from bulk Hi-C data, the Two-step Hi-c UNsupervised DEconvolution appRoach (THUNDER). We conducted extensive simulations to test THUNDER based on combining two published single-cell Hi-C (scHi-C) datasets. THUNDER more accurately estimates the underlying cell type proportions compared to reference-free methods (e.g., TOAST, and NMF) and is more robust than reference-dependent methods (e.g. MuSiC). We further demonstrate the practical utility of THUNDER to estimate cell type proportions and identify cell-type-specific interactions in Hi-C data from adult human cortex tissue samples. THUNDER will be a useful tool in adjusting for varying cell type composition in population samples, facilitating valid and more powerful downstream analysis such as differential chromatin organization studies. Additionally, THUNDER estimated contact profiles provide a useful exploratory framework to investigate cell-type-specificity of the chromatin interactome while experimental data is still rare.
Affiliation(s)
- Bryce Rowland
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Ruth Huh
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Zoey Hou
  - Department of Engineering Sciences and Applied Mathematics, Northwestern University, Evanston, Illinois, United States of America
- Cheynna Crowley
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Jia Wen
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Yin Shen
  - Institute for Human Genetics, University of California San Francisco, San Francisco, California, United States of America
  - Department of Neurology, University of California San Francisco, San Francisco, California, United States of America
- Ming Hu
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, Ohio, United States of America
- Paola Giusti-Rodríguez
  - Department of Psychiatry, University of Florida College of Medicine, Gainesville, Florida, United States of America
- Patrick F. Sullivan
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
  - Department of Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Yun Li
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
  - Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America

11. Galitsyna AA, Gelfand MS. Single-cell Hi-C data analysis: safety in numbers. Brief Bioinform 2021; 22:bbab316. [PMID: 34406348 PMCID: PMC8575028 DOI: 10.1093/bib/bbab316] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 05/31/2021] [Revised: 07/09/2021] [Accepted: 07/21/2021] [Indexed: 02/06/2023] Open
Abstract
Over the past decade, genome-wide assays for chromatin interactions in single cells have enabled the study of individual nuclei at unprecedented resolution and throughput. Current chromosome conformation capture techniques survey contacts for up to tens of thousands of individual cells, improving our understanding of genome function in 3D. However, these methods recover a small fraction of all contacts in single cells, requiring specialised processing of sparse interactome data. In this review, we highlight recent advances in methods for the interpretation of single-cell genomic contacts. After discussing the strengths and limitations of these methods, we outline frontiers for future development in this rapidly moving field.
Affiliation(s)
- Aleksandra A Galitsyna
  - Skolkovo Institute of Science and Technology, Skolkovo, Russia
  - Institute for Information Transmission Problems, RAS, Moscow, Russia
  - Institute of Gene Biology, RAS, Moscow, Russia
- Mikhail S Gelfand
  - Skolkovo Institute of Science and Technology, Skolkovo, Russia
  - Institute for Information Transmission Problems, RAS, Moscow, Russia

12. Bonora G, Ramani V, Singh R, Fang H, Jackson DL, Srivatsan S, Qiu R, Lee C, Trapnell C, Shendure J, Duan Z, Deng X, Noble WS, Disteche CM. Single-cell landscape of nuclear configuration and gene expression during stem cell differentiation and X inactivation. Genome Biol 2021; 22:279. [PMID: 34579774 PMCID: PMC8474932 DOI: 10.1186/s13059-021-02432-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/03/2020] [Accepted: 07/07/2021] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND: Mammalian development is associated with extensive changes in gene expression, chromatin accessibility, and nuclear structure. Here, we follow such changes associated with mouse embryonic stem cell differentiation and X inactivation by integrating, for the first time, allele-specific data from these three modalities obtained by high-throughput single-cell RNA-seq, ATAC-seq, and Hi-C.
RESULTS: Allele-specific contact decay profiles obtained by single-cell Hi-C clearly show that the inactive X chromosome has a unique profile in differentiated cells that have undergone X inactivation. Loss of this inactive X-specific structure at mitosis is followed by its reappearance during the cell cycle, suggesting a "bookmark" mechanism. Differentiation of embryonic stem cells to follow the onset of X inactivation is associated with changes in contact decay profiles that occur in parallel on both the X chromosomes and autosomes. Single-cell RNA-seq and ATAC-seq show evidence of a delay in female versus male cells, due to the presence of two active X chromosomes at early stages of differentiation. The onset of the inactive X-specific structure in single cells occurs later than gene silencing, consistent with the idea that chromatin compaction is a late event of X inactivation. Single-cell Hi-C highlights evidence of discrete changes in nuclear structure characterized by the acquisition of very long-range contacts throughout the nucleus. Novel computational approaches allow for the effective alignment of single-cell gene expression, chromatin accessibility, and 3D chromosome structure.
CONCLUSIONS: Based on trajectory analyses, three distinct nuclear structure states are detected reflecting discrete and profound simultaneous changes not only to the structure of the X chromosomes, but also to that of autosomes during differentiation. Our study reveals that long-range structural changes to chromosomes appear as discrete events, unlike progressive changes in gene expression and chromatin accessibility.
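A contact decay profile, as referred to in this abstract, is commonly computed as the mean contact count at each genomic separation. The sketch below shows that generic calculation on a small symmetric matrix. It is an assumption about the usual definition, not the authors' allele-specific single-cell pipeline, and the function name and toy matrix are hypothetical.

```python
def contact_decay_profile(matrix):
    """Mean contact count at each genomic separation s = 1 .. n-1.

    `matrix` is a square, symmetric list-of-lists of contact counts
    between equally sized genomic bins on one chromosome.
    """
    n = len(matrix)
    profile = []
    for s in range(1, n):
        # All bin pairs (i, i + s) lie on the s-th off-diagonal.
        values = [matrix[i][i + s] for i in range(n - s)]
        profile.append(sum(values) / len(values))
    return profile


# Toy 3-bin contact map (hypothetical counts).
toy = [
    [0, 4, 2],
    [4, 0, 6],
    [2, 6, 0],
]
print(contact_decay_profile(toy))  # [5.0, 2.0]
```

Comparing such profiles between alleles or cell states is one way to see whether a chromosome gains or loses long-range contacts.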
Collapse
Affiliation(s)
- Giancarlo Bonora
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Vijay Ramani
- Department of Biochemistry & Biophysics, University of California San Francisco, San Francisco, CA, USA
| | - Ritambhara Singh
- Department of Computer Science, Brown University, Providence, RI, USA
- Center for Computational Molecular Biology, Brown University, Providence, RI, USA
| | - He Fang
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA
| | - Dana L Jackson
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Sanjay Srivatsan
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Ruolan Qiu
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Choli Lee
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Cole Trapnell
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Allen Discovery Center for Cell Lineage Tracing, Seattle, WA, USA
| | - Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Allen Discovery Center for Cell Lineage Tracing, Seattle, WA, USA
- Howard Hughes Medical Institute, Seattle, WA, USA
| | - Zhijun Duan
- Institute for Stem Cell and Regenerative Medicine, University of Washington, Seattle, USA
- Division of Hematology, Department of Medicine, University of Washington, Seattle, USA
| | - Xinxian Deng
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA
| | - William S Noble
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
| | - Christine M Disteche
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Department of Medicine, University of Washington, Seattle, WA, USA
|
13
|
Si-C is a method for inferring super-resolution intact genome structure from single-cell Hi-C data. Nat Commun 2021; 12:4369. [PMID: 34272403 PMCID: PMC8285481 DOI: 10.1038/s41467-021-24662-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/10/2020] [Accepted: 06/25/2021] [Indexed: 12/21/2022] Open
Abstract
There is a strong demand for methods that can efficiently reconstruct valid super-resolution intact genome 3D structures from sparse and noisy single-cell Hi-C data. Here, we develop Single-Cell Chromosome Conformation Calculator (Si-C) within the Bayesian theory framework and apply this approach to reconstruct intact genome 3D structures from single-cell Hi-C data of eight G1-phase haploid mouse ES cells. The inferred 100-kb and 10-kb structures consistently reproduce the known conserved features of chromatin organization revealed by independent imaging experiments. The analysis of the 10-kb resolution 3D structures reveals cell-to-cell varying domain structures in individual cells and hyperfine structures in domains, such as loops. An average of 0.2 contact reads per divided bin is sufficient for Si-C to obtain reliable structures. The valid super-resolution structures constructed by Si-C demonstrate the potential for visualizing and investigating interactions between all chromatin loci at the genome scale in individual cells. Constructing valid super-resolution intact genome 3D structures from single-cell Hi-C data is essential in investigating chromosome folding. Here the authors develop a method that makes it possible to visualize and investigate chromosome folding in individual cells at the genome scale.
|
14
|
MacKay K, Kusalik A. Computational methods for predicting 3D genomic organization from high-resolution chromosome conformation capture data. Brief Funct Genomics 2021; 19:292-308. [PMID: 32353112 PMCID: PMC7388788 DOI: 10.1093/bfgp/elaa004] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 11/19/2019] [Revised: 01/30/2020] [Accepted: 02/07/2020] [Indexed: 12/19/2022] Open
Abstract
The advent of high-resolution chromosome conformation capture assays (such as 5C, Hi-C and Pore-C) has allowed for unprecedented sequence-level investigations into the structure-function relationship of the genome. In order to comprehensively understand this relationship, computational tools are required that utilize data generated from these assays to predict 3D genome organization (the 3D genome reconstruction problem). Many computational tools have been developed that answer this need, but a comprehensive comparison of their underlying algorithmic approaches has not been conducted. This manuscript provides a comprehensive review of the existing computational tools (from November 2006 to September 2019, inclusive) that can be used to predict 3D genome organizations from high-resolution chromosome conformation capture data. Overall, existing tools were found to use a relatively small set of algorithms from one or more of the following categories: dimensionality reduction, graph/network theory, maximum likelihood estimation (MLE) and statistical modeling. Solutions in each category are far from maturity, and the breadth and depth of various algorithmic categories have not been fully explored. While the tools for predicting 3D structure for a genomic region or single chromosome are diverse, there is a general lack of algorithmic diversity among computational tools for predicting the complete 3D genome organization from high-resolution chromosome conformation capture data.
|
15
|
Li J, Lin Y, Tang Q, Li M. Understanding three-dimensional chromatin organization in diploid genomes. Comput Struct Biotechnol J 2021; 19:3589-3598. [PMID: 34257838 PMCID: PMC8246089 DOI: 10.1016/j.csbj.2021.06.018] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/19/2021] [Revised: 06/11/2021] [Accepted: 06/12/2021] [Indexed: 11/17/2022] Open
Abstract
The three-dimensional (3D) organization of chromatin in the nucleus of diploid eukaryotic organisms has fascinated biologists for many years. Despite major progress in chromatin conformation studies, current knowledge regarding the spatial organization of diploid (maternal and paternal) genomes is still limited. Recent advances in Hi-C technology and data processing approaches have enabled construction of diploid Hi-C contact maps. These maps greatly accelerated the pace of novel discoveries in haplotype-resolved 3D genome studies, revealing the role of allele biased chromatin conformation in transcriptional regulation. Here, we review emerging concepts and haplotype phasing strategies of Hi-C data in 3D diploid genome studies. We discuss new insights on homologous chromosomal organization and the interplay between allelic biased chromatin architecture and several nuclear functions, explaining how haplotype-resolved Hi-C technologies have been used to resolve important biological questions.
Affiliation(s)
- Jing Li
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
| | - Yu Lin
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
| | - Qianzi Tang
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
| | - Mingzhou Li
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
|
16
|
Zha M, Wang N, Zhang C, Wang Z. Inferring Single-Cell 3D Chromosomal Structures Based on the Lennard-Jones Potential. Int J Mol Sci 2021; 22:ijms22115914. [PMID: 34072879 PMCID: PMC8199262 DOI: 10.3390/ijms22115914] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2021] [Revised: 05/23/2021] [Accepted: 05/28/2021] [Indexed: 11/16/2022] Open
Abstract
Reconstructing three-dimensional (3D) chromosomal structures based on single-cell Hi-C data is a challenging scientific problem due to the extreme sparseness of the single-cell Hi-C data. In this research, we used the Lennard-Jones potential to reconstruct both 500 kb and high-resolution 50 kb chromosomal structures based on single-cell Hi-C data. A chromosome was represented by a string of 500 kb or 50 kb DNA beads and put into a 3D cubic lattice for simulations. A 2D Gaussian function was used to impute the sparse single-cell Hi-C contact matrices. We designed a novel loss function based on the Lennard-Jones potential, in which the ε value, i.e., the well depth, was used to indicate how stable the binding of every pair of beads is. For the bead pairs that have single-cell Hi-C contacts and their neighboring bead pairs, the loss function assigns them stronger binding stability. The Metropolis-Hastings algorithm was used to try different locations for the DNA beads, and simulated annealing was used to optimize the loss function. We proved the correctness and validness of the reconstructed 3D structures by evaluating the models according to multiple criteria and comparing the models with 3D-FISH data.
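The two ingredients named in this abstract, a Lennard-Jones-style energy with well depth ε and Metropolis-type acceptance under simulated annealing, can be sketched as follows. This uses the textbook 12-6 Lennard-Jones form and the standard Metropolis criterion; it is not the authors' loss function, and the function names and default parameters are assumptions made for illustration.

```python
import math
import random


def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Textbook 12-6 Lennard-Jones energy for two beads at distance r > 0.

    epsilon is the well depth (a larger value means a more stable pairing);
    sigma is the distance at which the energy crosses zero.
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def metropolis_accept(delta_loss, temperature, rng=random):
    """Standard Metropolis criterion for a proposed bead move.

    Moves that do not increase the loss are always accepted; moves that
    increase it are accepted with probability exp(-delta_loss / temperature),
    so lowering the temperature (annealing) makes uphill moves rarer.
    """
    if delta_loss <= 0:
        return True
    return rng.random() < math.exp(-delta_loss / temperature)


# The minimum of the 12-6 form sits at r = 2**(1/6) * sigma, where the
# energy equals minus the well depth (approximately -2.0 for epsilon = 2.0).
r_min = 2 ** (1 / 6)
print(lennard_jones(r_min, epsilon=2.0))
```

In a scheme of this kind, bead pairs supported by a single-cell contact would be given a larger `epsilon` than unsupported pairs, which is one way to read the abstract's "stronger binding stability".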
Affiliation(s)
- Mengsheng Zha
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, 118 College Dr, Hattiesburg, MS 39406, USA
| | - Nan Wang
- Department of Computer Science, New Jersey City University, 2039 Kennedy Blvd, Jersey City, NJ 07305, USA
| | - Chaoyang Zhang
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, 118 College Dr, Hattiesburg, MS 39406, USA
| | - Zheng Wang
- Department of Computer Science, University of Miami, 1364 Memorial Drive, Coral Gables, FL 33124, USA
|
17
|
Gong H, Yang Y, Zhang S, Li M, Zhang X. Application of Hi-C and other omics data analysis in human cancer and cell differentiation research. Comput Struct Biotechnol J 2021; 19:2070-2083. [PMID: 33995903 PMCID: PMC8086027 DOI: 10.1016/j.csbj.2021.04.016] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/10/2020] [Revised: 04/04/2021] [Accepted: 04/04/2021] [Indexed: 02/07/2023] Open
Abstract
With the development of 3C (chromosome conformation capture) and its derivative technology Hi-C (High-throughput chromosome conformation capture) research, the study of the spatial structure of the genomic sequence in the nucleus helps researchers understand the functions of biological processes such as gene transcription, replication, repair, and regulation. In this paper, we first introduce the research background and purpose of Hi-C data visualization analysis. After that, we discuss the Hi-C data analysis methods from genome 3D structure, A/B compartment, TADs (topologically associated domain), and loop detection. We also discuss how to apply genome visualization technologies to the identification of chromosome feature structures. We continue with a review of correlation analysis differences among multi-omics data, and how to apply Hi-C and other omics data analysis into cancer and cell differentiation research. Finally, we summarize the various problems in joint analyses based on Hi-C and other multi-omics data. We believe this review can help researchers better understand the progress and applications of 3D genome technology.
Affiliation(s)
- Haiyan Gong
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
- Beijing Advanced Innovation Center for Materials Genome Engineering, University of Science and Technology Beijing, Beijing 100083, China
- Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing 100083, China
- Shunde Graduate School of University of Science and Technology Beijing, Foshan 528000, China
| | - Yi Yang
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
| | - Sichen Zhang
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
| | - Minghong Li
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
| | - Xiaotong Zhang
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
- Beijing Advanced Innovation Center for Materials Genome Engineering, University of Science and Technology Beijing, Beijing 100083, China
- Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing 100083, China
- Shunde Graduate School of University of Science and Technology Beijing, Foshan 528000, China
|
18
|
Reinknecht C, Riga A, Rivera J, Snyder DA. Patterns in Protein Flexibility: A Comparison of NMR "Ensembles", MD Trajectories, and Crystallographic B-Factors. Molecules 2021; 26:molecules26051484. [PMID: 33803249 PMCID: PMC7967184 DOI: 10.3390/molecules26051484] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/31/2020] [Revised: 02/18/2021] [Accepted: 02/28/2021] [Indexed: 11/16/2022] Open
Abstract
Proteins are molecular machines requiring flexibility to function. Crystallographic B-factors and Molecular Dynamics (MD) simulations both provide insights into protein flexibility on an atomic scale. Nuclear Magnetic Resonance (NMR) lacks a universally accepted analog of the B-factor. However, a lack of convergence in atomic coordinates in an NMR-based structure calculation also suggests atomic mobility. This paper describes a pattern in the coordinate uncertainties of backbone heavy atoms in NMR-derived structural “ensembles” first noted in the development of FindCore2 (previously called Expanded FindCore: DA Snyder, J Grullon, YJ Huang, R Tejero, GT Montelione, Proteins: Structure, Function, and Bioinformatics 82 (S2), 219–230) and demonstrates that this pattern exists in coordinate variances across MD trajectories but not in crystallographic B-factors. This either suggests that MD trajectories and NMR “ensembles” capture motional behavior of peptide bond units not captured by B-factors or indicates a deficiency common to force fields used in both NMR and MD calculations.
|
19
|
Han C, Xie Q, Lin S. Are dropout imputation methods for scRNA-seq effective for scHi-C data? Brief Bioinform 2020; 22:5985294. [PMID: 33201180 DOI: 10.1093/bib/bbaa289] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/04/2020] [Revised: 09/30/2020] [Accepted: 10/01/2020] [Indexed: 01/07/2023] Open
Abstract
The prevalence of dropout events is a serious problem for single-cell Hi-C (scHiC) data due to insufficient sequencing depth and data coverage, which brings difficulties in downstream studies such as clustering and structural analysis. Complicating things further is the fact that dropouts are confounded with structural zeros due to underlying properties, leading to observed zeros being a mixture of both types of events. Although a great deal of progress has been made in imputing dropout events for single cell RNA-sequencing (RNA-seq) data, little has been done in identifying structural zeros and imputing dropouts for scHiC data. In this paper, we adapted several methods from the single-cell RNA-seq literature for inference on observed zeros in scHiC data and evaluated their effectiveness. Through an extensive simulation study and real data analysis, we have shown that a couple of the adapted single-cell RNA-seq algorithms can be powerful for correctly identifying structural zeros and accurately imputing dropout values. Downstream analysis using the imputed values showed considerable improvement for clustering cells of the same types together over clustering results before imputation.
Affiliation(s)
- Shili Lin
- Translational Data Analytics Institute at the Ohio State University
|
20
|
Gilliot PA, Gorochowski TE. Sequencing enabling design and learning in synthetic biology. Curr Opin Chem Biol 2020; 58:54-62. [DOI: 10.1016/j.cbpa.2020.06.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 02/14/2020] [Revised: 04/21/2020] [Accepted: 06/02/2020] [Indexed: 01/27/2023]
|
21
|
Meluzzi D, Arya G. Computational approaches for inferring 3D conformations of chromatin from chromosome conformation capture data. Methods 2020; 181-182:24-34. [PMID: 31470090 PMCID: PMC7044057 DOI: 10.1016/j.ymeth.2019.08.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/15/2019] [Revised: 06/24/2019] [Accepted: 08/23/2019] [Indexed: 02/08/2023] Open
Abstract
Chromosome conformation capture (3C) and its variants are powerful experimental techniques for probing intra- and inter-chromosomal interactions within cell nuclei at high resolution and in a high-throughput, quantitative manner. The contact maps derived from such experiments provide an avenue for inferring the 3D spatial organization of the genome. This review provides an overview of the various computational methods developed in the past decade for addressing the very important but challenging problem of deducing the detailed 3D structure or structure population of chromosomal domains, chromosomes, and even entire genomes from 3C contact maps.
Affiliation(s)
- Dario Meluzzi
- Department of Medicine, University of California San Diego, La Jolla, CA 92093, United States
| | - Gaurav Arya
- Department of Mechanical Engineering and Materials Science, Duke University, Durham, NC 27708, United States
|
22
|
Han Z, Cui K, Placek K, Hong N, Lin C, Chen W, Zhao K, Jin W. Diploid genome architecture revealed by multi-omic data of hybrid mice. Genome Res 2020; 30:1097-1106. [PMID: 32759226 PMCID: PMC7462080 DOI: 10.1101/gr.257568.119] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 09/26/2019] [Accepted: 07/23/2020] [Indexed: 12/24/2022]
Abstract
Although mammalian genomes are diploid, previous studies extensively investigated the average chromatin architectures without considering the differences between homologous chromosomes. We generated Hi-C, ChIP-seq, and RNA-seq data sets from CD4 T cells of B6, Cast, and hybrid mice, to investigate the diploid chromatin organization and epigenetic regulation. Our data indicate that inter-chromosomal interaction patterns between homologous chromosomes are similar, and the similarity is highly correlated with their allelic coexpression levels. Reconstruction of the 3D nucleus revealed that distances of the homologous chromosomes to the center of nucleus are almost the same. The inter-chromosomal interactions at centromere ends are significantly weaker than those at telomere ends, suggesting that they are located in different regions within the chromosome territories. The majority of A|B compartments or topologically associated domains (TADs) are consistent between B6 and Cast. We found 58% of the haploids in hybrids maintain their parental compartment status at B6/Cast divergent compartments owing to cis effect. About 95% of the trans-effected B6/Cast divergent compartments converge to the same compartment status potentially because of a shared cellular environment. We showed the differentially expressed genes between the two haploids in hybrid were associated with either genetic or epigenetic effects. In summary, our multi-omics data from the hybrid mice provided haploid-specific information on the 3D nuclear architecture and a rich resource for further understanding the epigenetic regulation of haploid-specific gene expression.
Affiliation(s)
- Zhijun Han
- Department of Biology, Southern University of Science and Technology, Shenzhen, Guangdong 518055, China
- Institute of Life Sciences, Southeast University, Nanjing, Jiangsu 210096, China
| | - Kairong Cui
- Systems Biology Center, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Katarzyna Placek
- Systems Biology Center, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Ni Hong
- Department of Biology, Southern University of Science and Technology, Shenzhen, Guangdong 518055, China
| | - Chengqi Lin
- Institute of Life Sciences, Southeast University, Nanjing, Jiangsu 210096, China
| | - Wei Chen
- Department of Biology, Southern University of Science and Technology, Shenzhen, Guangdong 518055, China
| | - Keji Zhao
- Systems Biology Center, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Wenfei Jin
- Department of Biology, Southern University of Science and Technology, Shenzhen, Guangdong 518055, China
|
23
|
Zhu H, Wang Z. SCL: a lattice-based approach to infer 3D chromosome structures from single-cell Hi-C data. Bioinformatics 2020; 35:3981-3988. [PMID: 30865261 PMCID: PMC6792089 DOI: 10.1093/bioinformatics/btz181] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 10/26/2018] [Revised: 01/31/2019] [Accepted: 03/12/2019] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION In contrast to population-based Hi-C data, single-cell Hi-C data are zero-inflated and do not indicate the frequency of proximate DNA segments. There are a limited number of computational tools that can model the 3D structures of chromosomes based on single-cell Hi-C data. RESULTS We developed single-cell lattice (SCL), a computational method to reconstruct 3D structures of chromosomes based on single-cell Hi-C data. We designed a loss function and a 2D Gaussian function specifically for the characteristics of single-cell Hi-C data. A chromosome is represented as beads-on-a-string and stored in a 3D cubic lattice. Metropolis-Hastings simulation and simulated annealing are used to simulate the structure and minimize the loss function. We evaluated the SCL-inferred 3D structures (at both 500 and 50 kb resolutions) using multiple criteria and compared them with the ones generated by another modeling software program. The results indicate that the 3D structures generated by SCL closely fit single-cell Hi-C data. We also found similar patterns of trans-chromosomal contact beads, Lamin-B1 enriched topologically associating domains (TADs), and H3K4me3 enriched TADs by mapping data from previous studies onto the SCL-inferred 3D structures. AVAILABILITY AND IMPLEMENTATION The C++ source code of SCL is freely available at http://dna.cs.miami.edu/SCL/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
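The abstract mentions a 2D Gaussian function designed for sparse single-cell contact matrices. One plausible reading is that each observed contact lends weight to neighbouring bin pairs, and the sketch below shows that idea only. It is not the SCL implementation (which is in C++), and the function name and the `sigma` and `radius` parameters are hypothetical.

```python
import math


def gaussian_impute(contacts, sigma=1.0, radius=2):
    """Spread each observed contact over nearby bin pairs with a 2D Gaussian.

    contacts is a square matrix (list of lists) of contact counts.
    Each non-zero entry (i, j) adds count * exp(-(di^2 + dj^2) / (2 sigma^2))
    to every entry (i + di, j + dj) within `radius` bins in both directions.
    """
    n = len(contacts)
    imputed = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if contacts[i][j] == 0:
                continue
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    a, b = i + di, j + dj
                    if 0 <= a < n and 0 <= b < n:
                        weight = math.exp(-(di * di + dj * dj) / (2 * sigma ** 2))
                        imputed[a][b] += contacts[i][j] * weight
    return imputed


# Hypothetical 5-bin matrix with a single observed contact between bins 1 and 3.
sparse = [[0] * 5 for _ in range(5)]
sparse[1][3] = sparse[3][1] = 1
smoothed = gaussian_impute(sparse)
print(round(smoothed[1][3], 3), round(smoothed[1][2], 3))
```

The observed pair keeps the largest value and its neighbours receive smaller, distance-dependent weights, which gives a loss function something to act on for bin pairs that have no reads of their own.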
Affiliation(s)
- Hao Zhu
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, USA
| | - Zheng Wang
- Department of Computer Science, University of Miami, Coral Gables, FL, USA
|
24
|
Koukos P, Bonvin A. Integrative Modelling of Biomolecular Complexes. J Mol Biol 2020; 432:2861-2881. [DOI: 10.1016/j.jmb.2019.11.009] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 09/14/2019] [Revised: 11/12/2019] [Accepted: 11/13/2019] [Indexed: 12/31/2022]
|
25
|
Bayesian inference of chromatin structure ensembles from population-averaged contact data. Proc Natl Acad Sci U S A 2020; 117:7824-7830. [PMID: 32193349 DOI: 10.1073/pnas.1910364117] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/13/2022] Open
Abstract
Mounting experimental evidence suggests a role for the spatial organization of chromatin in crucial processes of the cell nucleus such as transcription regulation. Chromosome conformation capture techniques allow us to characterize chromatin structure by mapping contacts between chromosomal loci on a genome-wide scale. The most widespread modality is to measure contact frequencies averaged over a population of cells. Single-cell variants exist, but suffer from low contact numbers and have not yet gained the same resolution as population methods. While intriguing biological insights have already been garnered from ensemble-averaged data, information about three-dimensional (3D) genome organization in the underlying individual cells remains largely obscured because the contact maps show only an average over a huge population of cells. Moreover, computational methods for structure modeling of chromatin have mostly focused on fitting a single consensus structure, thereby ignoring any cell-to-cell variability in the model itself. Here, we propose a fully Bayesian method to infer ensembles of chromatin structures and to determine the optimal number of states in a principled, objective way. We illustrate our approach on simulated data and compute multistate models of chromatin from chromosome conformation capture carbon copy (5C) data. Comparison with independent data suggests that the inferred ensembles represent the underlying sample population faithfully. Harnessing the rich information contained in multistate models, we investigate cell-to-cell variability of chromatin organization into topologically associating domains, thus highlighting the ability of our approach to deliver insights into chromatin organization of great biological relevance.
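The abstract's point that a population contact map is only an average over many individual structures can be made concrete with a small forward calculation. The sketch below is an illustration, not the paper's Bayesian method: it assumes a hard distance cutoff for what counts as a contact, and the function names and toy coordinates are hypothetical.

```python
import math


def contact_map(coords, cutoff):
    """Binary contact map for one structure: 1 if two beads lie within cutoff."""
    n = len(coords)
    return [
        [1 if i != j and math.dist(coords[i], coords[j]) <= cutoff else 0
         for j in range(n)]
        for i in range(n)
    ]


def ensemble_average(structures, cutoff):
    """Average the per-structure contact maps, as a population experiment would."""
    maps = [contact_map(s, cutoff) for s in structures]
    n = len(maps[0])
    return [
        [sum(m[i][j] for m in maps) / len(maps) for j in range(n)]
        for i in range(n)
    ]


# Two hypothetical 3-bead conformations: one extended, one folded back on itself.
extended = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
folded = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
average = ensemble_average([extended, folded], cutoff=1.5)
print(average[0][2])  # 0.5
```

Beads 0 and 2 are in contact in only one of the two conformations, so the averaged frequency is 0.5. No single structure reproduces that value, which is the motivation for fitting an ensemble instead of one consensus structure.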
|
26
|
Zhu G, Deng W, Hu H, Ma R, Zhang S, Yang J, Peng J, Kaplan T, Zeng J. Reconstructing spatial organizations of chromosomes through manifold learning. Nucleic Acids Res 2019; 46:e50. [PMID: 29408992 PMCID: PMC5934626 DOI: 10.1093/nar/gky065] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 09/03/2017] [Accepted: 01/23/2018] [Indexed: 01/09/2023] Open
Abstract
Decoding the spatial organizations of chromosomes has crucial implications for studying eukaryotic gene regulation. Recently, chromosomal conformation capture based technologies, such as Hi-C, have been widely used to uncover the interaction frequencies of genomic loci in a high-throughput and genome-wide manner and provide new insights into the folding of three-dimensional (3D) genome structure. In this paper, we develop a novel manifold learning based framework, called GEM (Genomic organization reconstructor based on conformational Energy and Manifold learning), to reconstruct the three-dimensional organizations of chromosomes by integrating Hi-C data with biophysical feasibility. Unlike previous methods, which explicitly assume specific relationships between Hi-C interaction frequencies and spatial distances, our model directly embeds the neighboring affinities from Hi-C space into 3D Euclidean space. Extensive validations demonstrated that GEM not only greatly outperformed other state-of-the-art modeling methods but also provided physically and physiologically valid 3D representations of the organizations of chromosomes. Furthermore, for the first time, we apply the modeled chromatin structures to recover long-range genomic interactions missing from original Hi-C data.
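For contrast with GEM, the kind of explicit frequency-to-distance assumption that the abstract attributes to previous methods is often an inverse power law. The sketch below shows that conversion only; it is not part of GEM, and the function name and the exponent `alpha` are illustrative assumptions.

```python
def frequencies_to_distances(freqs, alpha=1.0):
    """Convert contact frequencies f to target distances f ** -alpha.

    freqs is a square matrix (list of lists) of non-negative frequencies.
    Zero entries, including the diagonal, carry no distance information
    under this assumption and are returned as None.
    """
    return [
        [None if f == 0 else f ** -alpha for f in row]
        for row in freqs
    ]


# Hypothetical 3-bin frequency matrix.
freqs = [
    [0, 4, 1],
    [4, 0, 4],
    [1, 4, 0],
]
distances = frequencies_to_distances(freqs, alpha=0.5)
print(distances[0][1], distances[0][2])  # 0.5 1.0
```

A distance-based method would then search for 3D coordinates whose pairwise distances match these targets, so its output depends on the choice of `alpha`. That dependence is what an embedding that avoids the conversion step is meant to remove.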
Affiliation(s)
- Guangxiang Zhu
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
| | - Wenxuan Deng
- Department of Biostatistics, Yale University, New Haven, CT, USA
| | - Hailin Hu
- School of Medicine, Tsinghua University, Beijing 100084, China
| | - Rui Ma
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
| | - Sai Zhang
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
| | - Jinglin Yang
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
| | - Jian Peng
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
| | - Tommy Kaplan
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, 91904, Israel
| | - Jianyang Zeng
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
|
27
|
Rosenthal M, Bryner D, Huffer F, Evans S, Srivastava A, Neretti N. Bayesian Estimation of Three-Dimensional Chromosomal Structure from Single-Cell Hi-C Data. J Comput Biol 2019; 26:1191-1202. [PMID: 31211598 PMCID: PMC6856950 DOI: 10.1089/cmb.2019.0100] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 02/03/2023] Open
Abstract
The problem of three-dimensional (3D) chromosome structure inference from Hi-C data sets is important and challenging. While bulk Hi-C data sets contain contact information derived from millions of cells and can capture major structural features shared by the majority of cells in the sample, they do not provide information about local variability between cells. Single-cell Hi-C can overcome this problem, but contact matrices are generally very sparse, making structural inference more problematic. We have developed a Bayesian multiscale approach, named Structural Inference via Multiscale Bayesian Approach, to infer 3D structures of chromosomes from single-cell Hi-C while including the bulk Hi-C data and some regularization terms as a prior. We study the landscape of solutions for each single-cell Hi-C data set as a function of prior strength and demonstrate clustering of solutions using data from the same cell.
Affiliation(s)
- Michael Rosenthal
- Science and Technology Department, Naval Surface Warfare Center, Panama City Division, Panama City, Florida
| | - Darshan Bryner
- Science and Technology Department, Naval Surface Warfare Center, Panama City Division, Panama City, Florida
| | - Fred Huffer
- Department of Statistics, Florida State University, Tallahassee, Florida
| | - Shane Evans
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island
| | - Anuj Srivastava
- Department of Statistics, Florida State University, Tallahassee, Florida
| | - Nicola Neretti
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island
- Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, Rhode Island
|
28
|
Abbas A, He X, Niu J, Zhou B, Zhu G, Ma T, Song J, Gao J, Zhang MQ, Zeng J. Integrating Hi-C and FISH data for modeling of the 3D organization of chromosomes. Nat Commun 2019; 10:2049. [PMID: 31053705 PMCID: PMC6499832 DOI: 10.1038/s41467-019-10005-6] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 04/26/2018] [Accepted: 04/12/2019] [Indexed: 12/13/2022] Open
Abstract
The new advances in various experimental techniques that provide complementary information about the spatial conformations of chromosomes have inspired researchers to develop computational methods to fully exploit the merits of individual data sources and combine them to improve the modeling of chromosome structure. Here we propose GEM-FISH, a method for reconstructing the 3D models of chromosomes through systematically integrating both Hi-C and FISH data with the prior biophysical knowledge of a polymer model. Comprehensive tests on a set of chromosomes, for which both Hi-C and FISH data are available, demonstrate that GEM-FISH can outperform previous chromosome structure modeling methods and accurately capture the higher order spatial features of chromosome conformations. Moreover, our reconstructed 3D models of chromosomes revealed interesting patterns of spatial distributions of super-enhancers which can provide useful insights into understanding the functional roles of these super-enhancers in gene regulation.
Affiliation(s)
- Ahmed Abbas
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, 100084, China
| | - Xuan He
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, 100084, China
| | - Jing Niu
- Department of Basic Medical Sciences, School of Medicine, Tsinghua University, Beijing, 100084, China
| | - Bin Zhou
- School of Life Science, Tsinghua University, Beijing, 100084, China
| | - Guangxiang Zhu
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, 100084, China
| | - Tszshan Ma
- Department of Basic Medical Sciences, School of Medicine, Tsinghua University, Beijing, 100084, China
| | - Jiangpeikun Song
- School of Life Science, Tsinghua University, Beijing, 100084, China
| | - Juntao Gao
- MOE Key Laboratory of Bioinformatics; Bioinformatics Division, Center for Synthetic and Systems Biology, BNRist; Department of Automation, Tsinghua University; Center for Synthetic and Systems Biology, Tsinghua University, Beijing, 100084, China
| | - Michael Q Zhang
- Department of Basic Medical Sciences, School of Medicine, Tsinghua University, Beijing, 100084, China
- MOE Key Laboratory of Bioinformatics; Bioinformatics Division, Center for Synthetic and Systems Biology, BNRist; Department of Automation, Tsinghua University; Center for Synthetic and Systems Biology, Tsinghua University, Beijing, 100084, China
- Department of Biological Sciences, Center for Systems Biology, the University of Texas at Dallas, Richardson, TX, 75080-3021, USA
| | - Jianyang Zeng
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, 100084, China.
- MOE Key Laboratory of Bioinformatics; Bioinformatics Division, Center for Synthetic and Systems Biology, BNRist; Department of Automation, Tsinghua University; Center for Synthetic and Systems Biology, Tsinghua University, Beijing, 100084, China.
| |
|
29
|
Oluwadare O, Highsmith M, Cheng J. An Overview of Methods for Reconstructing 3-D Chromosome and Genome Structures from Hi-C Data. Biol Proced Online 2019; 21:7. [PMID: 31049033 PMCID: PMC6482566 DOI: 10.1186/s12575-019-0094-0] [Citation(s) in RCA: 74] [Impact Index Per Article: 14.8] [Received: 01/15/2019] [Accepted: 04/01/2019] [Indexed: 01/08/2023] Open
Abstract
Over the past decade, methods for predicting three-dimensional (3-D) chromosome and genome structures have proliferated. This has been primarily due to the development of high-throughput, next-generation chromosome conformation capture (3C) technologies, which have provided next-generation sequencing data about chromosome conformations in order to map the 3-D genome structure. The introduction of the Hi-C technique, a variant of the 3C method, has allowed researchers to extract the interaction frequency (IF) for all loci of a genome at high throughput and at a genome-wide scale. In this review we describe, categorize, and compare the various methods developed to map chromosome and genome structures from 3C data, particularly Hi-C data. We summarize the improvements introduced by these methods, describe the approach used for method evaluation, and discuss how these advancements shape the future of genome structure construction.
Affiliation(s)
- Oluwatosin Oluwadare
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211 USA
| | - Max Highsmith
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211 USA
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211 USA
- Informatics Institute, University of Missouri, Columbia, MO 65211 USA
| |
|
30
|
Hierarchical Reconstruction of High-Resolution 3D Models of Large Chromosomes. Sci Rep 2019; 9:4971. [PMID: 30899036 PMCID: PMC6428844 DOI: 10.1038/s41598-019-41369-w] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 09/23/2018] [Accepted: 03/07/2019] [Indexed: 11/08/2022] Open
Abstract
Eukaryotic chromosomes are often composed of components organized into multiple scales, such as nucleosomes, chromatin fibers, topologically associating domains (TADs), chromosome compartments, and chromosome territories. Therefore, reconstructing detailed 3D models of chromosomes in high resolution is useful for advancing genome research. However, the task of constructing quality high-resolution 3D models is still challenging with existing methods. Hence, we designed a hierarchical algorithm, called Hierarchical3DGenome, to reconstruct 3D chromosome models at high resolution (≤5 kilobases (kb)). The algorithm first reconstructs high-resolution 3D models at the TAD level. The TAD models are then assembled to form complete high-resolution chromosomal models. The assembly of TAD models is guided by a complete low-resolution chromosome model. The algorithm is successfully used to reconstruct 3D chromosome models at 5 kb resolution for the human B-cell line GM12878. These high-resolution models satisfy Hi-C chromosomal contacts well and are consistent with models built at lower (i.e. 1 Mb) resolution, and with the data of fluorescence in situ hybridization experiments. The Java source code of Hierarchical3DGenome and its user manual are available at https://github.com/BDM-Lab/Hierarchical3DGenome.
|
31
|
Finn EH, Pegoraro G, Brandão HB, Valton AL, Oomen ME, Dekker J, Mirny L, Misteli T. Extensive Heterogeneity and Intrinsic Variation in Spatial Genome Organization. Cell 2019; 176:1502-1515.e10. [PMID: 30799036 DOI: 10.1016/j.cell.2019.01.020] [Citation(s) in RCA: 275] [Impact Index Per Article: 55.0] [Received: 11/09/2017] [Revised: 10/18/2018] [Accepted: 01/09/2019] [Indexed: 01/16/2023]
Abstract
Several general principles of global 3D genome organization have recently been established, including non-random positioning of chromosomes and genes in the cell nucleus, distinct chromatin compartments, and topologically associating domains (TADs). However, the extent and nature of cell-to-cell and cell-intrinsic variability in genome architecture are still poorly characterized. Here, we systematically probe heterogeneity in genome organization. High-throughput optical mapping of several hundred intra-chromosomal interactions in individual human fibroblasts demonstrates low association frequencies, which are determined by genomic distance, higher-order chromatin architecture, and chromatin environment. The structure of TADs is variable between individual cells, and inter-TAD associations are common. Furthermore, single-cell analysis reveals independent behavior of individual alleles in single nuclei. Our observations reveal extensive variability and heterogeneity in genome organization at the level of individual alleles and demonstrate the coexistence of a broad spectrum of genome configurations in a cell population.
Affiliation(s)
| | - Gianluca Pegoraro
- High-throughput Imaging Facility, National Cancer Institute, NIH, Bethesda, MD 20892, USA
| | - Hugo B Brandão
- Graduate Program in Biophysics, Harvard University, Cambridge, MA 02138, USA
| | - Anne-Laure Valton
- Program in Systems Biology, Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Marlies E Oomen
- Program in Systems Biology, Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Job Dekker
- Howard Hughes Medical Institute, Program in Systems Biology, Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Leonid Mirny
- Institute for Medical Engineering and Science and Department of Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Tom Misteli
- National Cancer Institute, NIH, Bethesda, MD 20892, USA.
| |
|
32
|
Lin D, Bonora G, Yardımcı GG, Noble WS. Computational methods for analyzing and modeling genome structure and organization. Wiley Interdiscip Rev Syst Biol Med 2019; 11:e1435. [PMID: 30022617 PMCID: PMC6294685 DOI: 10.1002/wsbm.1435] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 12/12/2017] [Revised: 06/07/2018] [Accepted: 06/16/2018] [Indexed: 12/31/2022]
Abstract
Recent advances in chromosome conformation capture technologies have led to the discovery of previously unappreciated structural features of chromatin. Computational analysis has been critical in detecting these features and thereby helping to uncover the building blocks of genome architecture. Algorithms are being developed to integrate these architectural features to construct better three-dimensional (3D) models of the genome. These computational methods have revealed the importance of 3D genome organization to essential biological processes. In this article, we review the state of the art in analytic and modeling techniques with a focus on their application to answering various biological questions related to chromatin structure. We summarize the limitations of these computational techniques and suggest future directions, including the importance of incorporating multiple sources of experimental data in building a more comprehensive model of the genome. This article is categorized under: Analytical and Computational Methods > Computational Methods; Laboratory Methods and Technologies > Genetic/Genomic Methods; Models of Systems Properties and Processes > Mechanistic Models.
Affiliation(s)
- Dejun Lin
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Giancarlo Bonora
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | | | - William S. Noble
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Department of Computer Science and Engineering, University of Washington, Seattle, WA, USA
| |
|
33
|
Reichel K, Stelzl LS, Köfinger J, Hummer G. Precision DEER Distances from Spin-Label Ensemble Refinement. J Phys Chem Lett 2018; 9:5748-5752. [PMID: 30212206 DOI: 10.1021/acs.jpclett.8b02439] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Indexed: 05/27/2023]
Abstract
Double electron-electron resonance (DEER) experiments probe nanometer-scale distances in spin-labeled proteins and nucleic acids. Rotamer libraries of the covalently attached spin-labels help reduce position uncertainties. Here we show that rotamer reweighting is essential for precision distance measurements, making it possible to resolve Ångström-scale domain motions. We analyze extensive DEER measurements on the three N-terminal polypeptide transport-associated (POTRA) domains of the outer membrane protein Omp85. Using the "Bayesian inference of ensembles" maximum-entropy method, we extract rotamer weights from the DEER measurements. Small weight changes suffice to eliminate otherwise significant discrepancies between experiments and model and unmask 1-3 Å domain motions relative to the crystal structure. Rotamer-weight refinement is a simple yet powerful tool for precision distance measurements that should be broadly applicable to label-based measurements including DEER, paramagnetic relaxation enhancement, and fluorescence resonance energy transfer (FRET).
Affiliation(s)
- Katrin Reichel
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue-Straße 3, 60438 Frankfurt am Main, Germany
| | - Lukas S Stelzl
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue-Straße 3, 60438 Frankfurt am Main, Germany
| | - Jürgen Köfinger
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue-Straße 3, 60438 Frankfurt am Main, Germany
| | - Gerhard Hummer
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue-Straße 3, 60438 Frankfurt am Main, Germany
- Institute of Biophysics, Goethe University, Max-von-Laue-Straße 9, 60438 Frankfurt am Main, Germany
| |
|
34
|
Tan L, Xing D, Chang CH, Li H, Xie XS. Three-dimensional genome structures of single diploid human cells. Science 2018; 361:924-928. [PMID: 30166492 DOI: 10.1126/science.aat5641] [Citation(s) in RCA: 281] [Impact Index Per Article: 46.8] [Received: 03/12/2018] [Accepted: 08/06/2018] [Indexed: 12/17/2022]
Abstract
Three-dimensional genome structures play a key role in gene regulation and cell functions. Characterization of genome structures necessitates single-cell measurements. This has been achieved for haploid cells but has remained a challenge for diploid cells. We developed a single-cell chromatin conformation capture method, termed Dip-C, that combines a transposon-based whole-genome amplification method to detect many chromatin contacts, called META (multiplex end-tagging amplification), and an algorithm to impute the two chromosome haplotypes linked by each contact. We reconstructed the genome structures of single diploid human cells from a lymphoblastoid cell line and from primary blood cells with high spatial resolution, locating specific single-nucleotide and copy number variations in the nucleus. The two alleles of imprinted loci and the two X chromosomes were structurally different. Cells of different types displayed statistically distinct genome structures. Such structural cell typing is crucial for understanding cell functions.
Affiliation(s)
- Longzhi Tan
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138, USA; Systems Biology Ph.D. Program, Harvard Medical School, Boston, MA 02115, USA
| | - Dong Xing
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138, USA.
| | - Chi-Han Chang
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138, USA
| | - Heng Li
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - X Sunney Xie
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138, USA; Innovation Center for Genomics, Peking University, Beijing 100871, China; Biodynamic and Optical Imaging Center, Peking University, Beijing 100871, China
| |
|
35
|
Abstract
The use of 3C-based methods has revealed the importance of the 3D organization of the chromatin for key aspects of genome biology. However, the different caveats of the variants of 3C techniques have limited their scope and the range of scientific fields that could benefit from these approaches. To address these limitations, we present 4Cin, a method to generate 3D models and derive virtual Hi-C (vHi-C) heat maps of genomic loci based on 4C-seq or any kind of 4C-seq-like data, such as those derived from NG Capture-C. 3D genome organization is determined by integrative consideration of the spatial distances derived from as few as four 4C-seq experiments. The 3D models obtained from 4C-seq data, together with their associated vHi-C maps, allow the inference of all chromosomal contacts within a given genomic region, facilitating the identification of topologically associating domain (TAD) boundaries. Thus, 4Cin offers a much cheaper, more accessible and more versatile alternative to other available techniques while providing a comprehensive 3D topological profiling. By studying TAD modifications in genomic structural variants associated with disease phenotypes and performing cross-species evolutionary comparisons of 3D chromatin structures in a quantitative manner, we demonstrate the broad potential and novel range of applications of our method.
Chromatin conformation capture (3C) methods have revealed the importance of the 3D organization of the chromatin, which is key to understanding many aspects of genome biology. But each of these methods has its own limitations. Here we present 4Cin, software that generates 3D models of the chromatin from a small number of 4C-seq experiments, a 3C-based method that provides the frequency of contacts between one fragment and the genome (one vs all). These 3D models are used to infer all chromosomal contacts within a given genomic region (many vs many). The contact maps facilitate the identification of topologically associating domain boundaries. Our software offers a much cheaper, more accessible and more versatile alternative to other available techniques while providing a comprehensive 3D topological profiling. We applied our software to two different loci to study modifications in genomic structural variants associated with disease phenotypes and to compare the chromatin organization in two different species in a quantitative manner.
|
36
|
Integrative modelling of cellular assemblies. Curr Opin Struct Biol 2017; 46:102-109. [PMID: 28735107 DOI: 10.1016/j.sbi.2017.07.001] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 03/20/2017] [Revised: 07/01/2017] [Accepted: 07/04/2017] [Indexed: 02/06/2023]
Abstract
A wide variety of experimental techniques can be used for understanding the precise molecular mechanisms underlying the activities of cellular assemblies. The inherent limitations of a single experimental technique often require integration of data from complementary approaches to gain sufficient insights into the assembly structure and function. Here, we review popular computational approaches for integrative modelling of cellular assemblies, including protein complexes and genomic assemblies. We provide recent examples of integrative models generated for such assemblies by different experimental techniques, especially including data from 3D electron microscopy (3D-EM) and chromosome conformation capture experiments, respectively. We highlight general concepts in integrative modelling and discuss the need for careful formulation and merging of different types of information.
|
37
|
Habeck M. Bayesian Modeling of Biomolecular Assemblies with Cryo-EM Maps. Front Mol Biosci 2017; 4:15. [PMID: 28382301 PMCID: PMC5360716 DOI: 10.3389/fmolb.2017.00015] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 12/01/2017] [Accepted: 03/07/2017] [Indexed: 01/09/2023] Open
Abstract
A growing array of experimental techniques allows us to characterize the three-dimensional structure of large biological assemblies at increasingly higher resolution. In addition to X-ray crystallography and nuclear magnetic resonance in solution, new structure determination methods such as cryo-electron microscopy (cryo-EM), crosslinking/mass spectrometry and solid-state NMR have emerged. Often it is not sufficient to use a single experimental method, but complementary data need to be collected by using multiple techniques. The integration of all datasets can only be achieved by computational means. This article describes Inferential structure determination, a Bayesian approach to integrative modeling of biomolecular complexes with hybrid structural data. I will introduce probabilistic models for cryo-EM maps and outline Markov chain Monte Carlo algorithms for sampling model structures from the posterior distribution. I will focus on rigid and flexible modeling with cryo-EM data and discuss some of the computational challenges of Bayesian inference in the context of biomolecular modeling.
Affiliation(s)
- Michael Habeck
- Statistical Inverse Problems in Biophysics, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany; Felix Bernstein Institute for Mathematical Statistics in the Biosciences, University of Göttingen, Göttingen, Germany
| |
|