1. Rana V, Peng J, Pan C, Lyu H, Cheng A, Kim M, Milenkovic O. Interpretable online network dictionary learning for inferring long-range chromatin interactions. PLoS Comput Biol 2024; 20:e1012095. [PMID: 38753877 PMCID: PMC11135774 DOI: 10.1371/journal.pcbi.1012095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Revised: 05/29/2024] [Accepted: 04/20/2024] [Indexed: 05/18/2024] [Open Access]
Abstract
Dictionary learning (DL), implemented via matrix factorization (MF), is commonly used in computational biology to tackle ubiquitous clustering problems. The method is favored due to its conceptual simplicity and relatively low computational complexity. However, DL algorithms produce results that lack interpretability in terms of real biological data. Additionally, they are not optimized for graph-structured data and hence often fail to handle them in a scalable manner. In order to address these limitations, we propose a novel DL algorithm called online convex network dictionary learning (online cvxNDL). Unlike classical DL algorithms, online cvxNDL is implemented via MF and designed to handle extremely large datasets by virtue of its online nature. Importantly, it enables the interpretation of dictionary elements, which serve as cluster representatives, through convex combinations of real measurements. Moreover, the algorithm can be applied to data with a network structure by incorporating specialized subnetwork sampling techniques. To demonstrate the utility of our approach, we apply cvxNDL on 3D-genome RNAPII ChIA-Drop data with the goal of identifying important long-range interaction patterns (long-range dictionary elements). ChIA-Drop probes higher-order interactions, and produces data in the form of hypergraphs whose nodes represent genomic fragments. The hyperedges represent observed physical contacts. Our hypergraph model analysis has the objective of creating an interpretable dictionary of long-range interaction patterns that accurately represent global chromatin physical contact maps. Through the use of dictionary information, one can also associate the contact maps with RNA transcripts and infer cellular functions. To accomplish the task at hand, we focus on RNAPII-enriched ChIA-Drop data from Drosophila melanogaster S2 cell lines. Our results offer two key insights.
First, we demonstrate that online cvxNDL retains the accuracy of classical DL (MF) methods while simultaneously ensuring unique interpretability and scalability. Second, we identify distinct collections of proximal and distal interaction patterns involving chromatin elements shared by related processes across different chromosomes, as well as patterns unique to specific chromosomes. To associate the dictionary elements with biological properties of the corresponding chromatin regions, we employ Gene Ontology (GO) enrichment analysis and perform multiple RNA coexpression studies.
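The abstract's statement that dictionary elements are convex combinations of real measurements can be illustrated with a minimal sketch. This is not the authors' implementation of online cvxNDL: the function name `convex_combination`, the toy `samples`, and the `weights` are all hypothetical, chosen only to show what the convexity constraint (non-negative weights summing to one) means in practice.

```python
def convex_combination(samples, weights, tol=1e-9):
    """Return the weighted average of `samples` under convex weights.

    `samples` is a list of equal-length numeric vectors (hypothetical
    stand-ins for real measurements); `weights` must be non-negative
    and sum to 1, which is what makes the combination convex.
    """
    if len(samples) != len(weights):
        raise ValueError("need exactly one weight per sample")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must be non-negative and sum to 1")
    dim = len(samples[0])
    return [sum(w * s[k] for w, s in zip(weights, samples)) for k in range(dim)]


# Three toy measurement vectors (illustrative values, not real data).
samples = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
weights = [0.5, 0.25, 0.25]

element = convex_combination(samples, weights)
print(element)
```

Because the result is a weighted average of observed vectors, each weight can be read as the contribution of one real sample to the cluster representative, which is the interpretability property the abstract emphasizes.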
Affiliation(s)
- Vishal Rana
  - Department of Electrical and Computer Engineering, University of Illinois, Urbana-Champaign, Illinois, United States of America
- Jianhao Peng
  - Department of Electrical and Computer Engineering, University of Illinois, Urbana-Champaign, Illinois, United States of America
- Chao Pan
  - Department of Electrical and Computer Engineering, University of Illinois, Urbana-Champaign, Illinois, United States of America
- Hanbaek Lyu
  - Department of Mathematics, University of Wisconsin - Madison, Madison, Wisconsin, United States of America
- Albert Cheng
  - School of Biological and Health Systems Engineering, Arizona State University, Phoenix, Arizona, United States of America
- Minji Kim
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Olgica Milenkovic
  - Department of Electrical and Computer Engineering, University of Illinois, Urbana-Champaign, Illinois, United States of America

2. Jeong D, Shi G, Li X, Thirumalai D. Structural basis for the preservation of a subset of topologically associating domains in interphase chromosomes upon cohesin depletion. eLife 2024; 12:RP88564. [PMID: 38502563 PMCID: PMC10950330 DOI: 10.7554/elife.88564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/21/2024] [Open Access]
Abstract
Compartment formation in interphase chromosomes is a result of spatial segregation between euchromatin and heterochromatin on a few megabase pairs (Mbp) scale. On the sub-Mbp scales, topologically associating domains (TADs) appear as interacting domains along the diagonal in the ensemble averaged Hi-C contact map. Hi-C experiments showed that most of the TADs vanish upon deleting cohesin, while the compartment structure is maintained, and perhaps even enhanced. However, closer inspection of the data reveals that a non-negligible fraction of TADs is preserved (P-TADs) after cohesin loss. Imaging experiments show that, at the single-cell level, TAD-like structures are present even without cohesin. To provide a structural basis for these findings, we first used polymer simulations to show that certain TADs with epigenetic switches across their boundaries survive after depletion of loops. More importantly, the three-dimensional structures show that many of the P-TADs have sharp physical boundaries. Informed by the simulations, we analyzed the Hi-C maps (with and without cohesin) in mouse liver and human colorectal carcinoma cell lines, which affirmed that epigenetic switches and physical boundaries (calculated using the predicted 3D structures using the data-driven HIPPS method that uses Hi-C as the input) explain the origin of the P-TADs. Single-cell structures display TAD-like features in the absence of cohesin that are remarkably similar to the findings in imaging experiments. Some P-TADs, with physical boundaries, are relevant to the retention of enhancer-promoter/promoter-promoter interactions. Overall, our study shows that preservation of a subset of TADs upon removing cohesin is a robust phenomenon that is valid across multiple cell lines.
Affiliation(s)
- Davin Jeong
  - Department of Chemistry, University of Texas at Austin, Austin, United States
- Guang Shi
  - Department of Chemistry, University of Texas at Austin, Austin, United States
- Xin Li
  - Department of Chemistry, University of Texas at Austin, Austin, United States
- D Thirumalai
  - Department of Chemistry, University of Texas at Austin, Austin, United States
  - Department of Physics, University of Texas at Austin, Austin, United States

3. Feng C, Wang J, Chu X. Large-scale data-driven and physics-based models offer insights into the relationships among the structures, dynamics, and functions of chromosomes. J Mol Cell Biol 2023; 15:mjad042. [PMID: 37365687 PMCID: PMC10782906 DOI: 10.1093/jmcb/mjad042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Revised: 03/22/2023] [Accepted: 06/25/2023] [Indexed: 06/28/2023] [Open Access]
Abstract
The organized three-dimensional chromosome architecture in the cell nucleus provides scaffolding for precise regulation of gene expression. When the cell changes its identity in the cell-fate decision-making process, extensive rearrangements of chromosome structures occur accompanied by large-scale adaptations of gene expression, underscoring the importance of chromosome dynamics in shaping genome function. Over the last two decades, rapid development of experimental methods has provided unprecedented data to characterize the hierarchical structures and dynamic properties of chromosomes. In parallel, these enormous data offer valuable opportunities for developing quantitative computational models. Here, we review a variety of large-scale polymer models developed to investigate the structures and dynamics of chromosomes. Based on their underlying modeling strategies, these approaches can be classified into data-driven ('top-down') and physics-based ('bottom-up') categories. We discuss their contributions to offering valuable insights into the relationships among the structures, dynamics, and functions of chromosomes and propose the perspective of developing data integration approaches from different experimental technologies and multidisciplinary theoretical/simulation methods combined with different modeling strategies.
Affiliation(s)
- Cibo Feng
  - Advanced Materials Thrust, Function Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China
  - Green e Materials Laboratory, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China
  - College of Physics, Jilin University, Changchun 130012, China
- Jin Wang
  - Department of Chemistry and Physics, The State University of New York at Stony Brook, Stony Brook, NY 11794, USA
- Xiakun Chu
  - Advanced Materials Thrust, Function Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China
  - Green e Materials Laboratory, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China
  - Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong SAR 999077, China
  - Guangzhou Municipal Key Laboratory of Materials Informatics, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China

4. Schuette G, Ding X, Zhang B. Efficient Hi-C inversion facilitates chromatin folding mechanism discovery and structure prediction. Biophys J 2023; 122:3425-3438. [PMID: 37496267 PMCID: PMC10502442 DOI: 10.1016/j.bpj.2023.07.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2023] [Revised: 07/10/2023] [Accepted: 07/24/2023] [Indexed: 07/28/2023] [Open Access]
Abstract
Genome-wide chromosome conformation capture (Hi-C) experiments have revealed many structural features of chromatin across multiple length scales. Further understanding genome organization requires relating these discoveries to the mechanisms that establish chromatin structures and reconstructing these structures in three dimensions, but both objectives are difficult to achieve with existing algorithms that are often computationally expensive. To alleviate this challenge, we present an algorithm that efficiently converts Hi-C data into contact energies, which measure the interaction strength between genomic loci brought into proximity. Contact energies are local quantities unaffected by the topological constraints that correlate Hi-C contact probabilities. Thus, extracting contact energies from Hi-C contact probabilities distills the biologically unique information contained in the data. We show that contact energies reveal the location of chromatin loop anchors, support a phase separation mechanism for genome compartmentalization, and parameterize polymer simulations that predict three-dimensional chromatin structures. Therefore, we anticipate that contact energy extraction will unleash the full potential of Hi-C data and that our inversion algorithm will facilitate the widespread adoption of contact energy analysis.
Affiliation(s)
- Greg Schuette
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts
- Xinqiang Ding
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts
- Bin Zhang
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts

5. Senapati S, Irshad IU, Sharma AK, Kumar H. Fundamental insights into the correlation between chromosome configuration and transcription. Phys Biol 2023; 20:051002. [PMID: 37467757 DOI: 10.1088/1478-3975/ace8e5] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/18/2023] [Accepted: 07/19/2023] [Indexed: 07/21/2023]
Abstract
Eukaryotic chromosomes exhibit a hierarchical organization that spans a spectrum of length scales, ranging from sub-regions known as loops, which typically comprise hundreds of base pairs, to much larger chromosome territories that can encompass a few mega base pairs. Chromosome conformation capture experiments that involve high-throughput sequencing methods combined with microscopy techniques have enabled a new understanding of inter- and intra-chromosomal interactions with unprecedented details. This information also provides mechanistic insights on the relationship between genome architecture and gene expression. In this article, we review the recent findings on three-dimensional interactions among chromosomes at the compartment, topologically associating domain, and loop levels and the impact of these interactions on the transcription process. We also discuss current understanding of various biophysical processes involved in multi-layer structural organization of chromosomes. Then, we discuss the relationships between gene expression and genome structure from perturbative genome-wide association studies. Furthermore, for a better understanding of how chromosome architecture and function are linked, we emphasize the role of epigenetic modifications in the regulation of gene expression. Such an understanding of the relationship between genome architecture and gene expression can provide a new perspective on the range of potential future discoveries and therapeutic research.
Affiliation(s)
- Swayamshree Senapati
  - School of Basic Sciences, Indian Institute of Technology, Bhubaneswar, Argul, Odisha 752050, India
- Inayat Ullah Irshad
  - Department of Physics, Indian Institute of Technology, Jammu, Jammu 181221, India
- Ajeet K Sharma
  - Department of Physics, Indian Institute of Technology, Jammu, Jammu 181221, India
  - Department of Biosciences and Bioengineering, Indian Institute of Technology Jammu, Jammu 181221, India
- Hemant Kumar
  - School of Basic Sciences, Indian Institute of Technology, Bhubaneswar, Argul, Odisha 752050, India

6. Schuette G, Ding X, Zhang B. Efficient Hi-C inversion facilitates chromatin folding mechanism discovery and structure prediction. bioRxiv 2023:2023.03.17.533194. [PMID: 36993500 PMCID: PMC10055272 DOI: 10.1101/2023.03.17.533194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/31/2023]
Abstract
Genome-wide chromosome conformation capture (Hi-C) experiments have revealed many structural features of chromatin across multiple length scales. Further understanding genome organization requires relating these discoveries to the mechanisms that establish chromatin structures and reconstructing these structures in three dimensions, but both objectives are difficult to achieve with existing algorithms that are often computationally expensive. To alleviate this challenge, we present an algorithm that efficiently converts Hi-C data into contact energies, which measure the interaction strength between genomic loci brought into proximity. Contact energies are local quantities unaffected by the topological constraints that correlate Hi-C contact probabilities. Thus, extracting contact energies from Hi-C contact probabilities distills the biologically unique information contained in the data. We show that contact energies reveal the location of chromatin loop anchors, support a phase separation mechanism for genome compartmentalization, and parameterize polymer simulations that predict three-dimensional chromatin structures. Therefore, we anticipate that contact energy extraction will unleash the full potential of Hi-C data and that our inversion algorithm will facilitate the widespread adoption of contact energy analysis.
Affiliation(s)
- Greg Schuette
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Xinqiang Ding
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Bin Zhang
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA

7. Goychuk A, Kannan D, Chakraborty AK, Kardar M. Polymer folding through active processes recreates features of genome organization. Proc Natl Acad Sci U S A 2023; 120:e2221726120. [PMID: 37155885 PMCID: PMC10194017 DOI: 10.1073/pnas.2221726120] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 12/25/2022] [Accepted: 04/02/2023] [Indexed: 05/10/2023] [Open Access]
Abstract
From proteins to chromosomes, polymers fold into specific conformations that control their biological function. Polymer folding has long been studied with equilibrium thermodynamics, yet intracellular organization and regulation involve energy-consuming, active processes. Signatures of activity have been measured in the context of chromatin motion, which shows spatial correlations and enhanced subdiffusion only in the presence of adenosine triphosphate. Moreover, chromatin motion varies with genomic coordinate, pointing toward a heterogeneous pattern of active processes along the sequence. How do such patterns of activity affect the conformation of a polymer such as chromatin? We address this question by combining analytical theory and simulations to study a polymer subjected to sequence-dependent correlated active forces. Our analysis shows that a local increase in activity (larger active forces) can cause the polymer backbone to bend and expand, while less active segments straighten out and condense. Our simulations further predict that modest activity differences can drive compartmentalization of the polymer consistent with the patterns observed in chromosome conformation capture experiments. Moreover, segments of the polymer that show correlated active (sub)diffusion attract each other through effective long-ranged harmonic interactions, whereas anticorrelations lead to effective repulsions. Thus, our theory offers nonequilibrium mechanisms for forming genomic compartments, which cannot be distinguished from affinity-based folding using structural data alone. As a first step toward exploring whether active mechanisms contribute to shaping genome conformations, we discuss a data-driven approach.
Affiliation(s)
- Andriy Goychuk
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA 02139
- Deepti Kannan
  - Department of Physics, Massachusetts Institute of Technology, Cambridge, MA 02139
- Arup K. Chakraborty
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA 02139
  - Department of Physics, Massachusetts Institute of Technology, Cambridge, MA 02139
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
  - Ragon Institute of Massachusetts General Hospital, Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02139
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139
- Mehran Kardar
  - Department of Physics, Massachusetts Institute of Technology, Cambridge, MA 02139

8. Kamat K, Lao Z, Qi Y, Wang Y, Ma J, Zhang B. Compartmentalization with nuclear landmarks yields random, yet precise, genome organization. Biophys J 2023; 122:1376-1389. [PMID: 36871158 PMCID: PMC10111368 DOI: 10.1016/j.bpj.2023.03.003] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 10/05/2022] [Revised: 02/19/2023] [Accepted: 03/01/2023] [Indexed: 03/06/2023] [Open Access]
Abstract
The 3D organization of eukaryotic genomes plays an important role in genome function. While significant progress has been made in deciphering the folding mechanisms of individual chromosomes, the principles of the dynamic large-scale spatial arrangement of all chromosomes inside the nucleus are poorly understood. We use polymer simulations to model the diploid human genome compartmentalization relative to nuclear bodies such as nuclear lamina, nucleoli, and speckles. We show that a self-organization process based on a cophase separation between chromosomes and nuclear bodies can capture various features of genome organization, including the formation of chromosome territories, phase separation of A/B compartments, and the liquid property of nuclear bodies. The simulated 3D structures quantitatively reproduce both sequencing-based genomic mapping and imaging assays that probe chromatin interaction with nuclear bodies. Importantly, our model captures the heterogeneous distribution of chromosome positioning across cells while simultaneously producing well-defined distances between active chromatin and nuclear speckles. Such heterogeneity and preciseness of genome organization can coexist due to the nonspecificity of phase separation and the slow chromosome dynamics. Together, our work reveals that the cophase separation provides a robust mechanism for us to produce functionally important 3D contacts without requiring thermodynamic equilibration that can be difficult to achieve.
Affiliation(s)
- Kartik Kamat
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts
- Zhuohan Lao
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts
- Yifeng Qi
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts
- Yuchuan Wang
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Jian Ma
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Bin Zhang
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts

9. Sangeet S, Sarkar R, Mohanty SK, Roy S. Quantifying Mutational Response to Track the Evolution of SARS-CoV-2 Spike Variants: Introducing a Statistical-Mechanics-Guided Machine Learning Method. J Phys Chem B 2022; 126:7895-7905. [PMID: 36178371 PMCID: PMC9534491 DOI: 10.1021/acs.jpcb.2c04574] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/30/2022] [Revised: 09/16/2022] [Indexed: 02/07/2023]
Abstract
The emergence of SARS-CoV-2 and its variants that critically affect global public health requires characterization of mutations and their evolutionary pattern from specific Variants of Interest (VOIs) to Variants of Concern (VOCs). Leveraging the concept of equilibrium statistical mechanics, we introduce a new responsive quantity defined as "Mutational Response Function (MRF)" aptly quantifying domain-wise average entropy-fluctuation in the spike glycoprotein sequence of SARS-CoV-2 based on its evolutionary database. As the evolution transits from a specific variant to VOC, we find that the evolutionary crossover is accompanied by a dramatic change in MRF, upholding the characteristic of a dynamic phase transition. With this entropic information, we have developed an ancestral-based machine learning method that helps predict future domain-specific mutations. The feedforward binary classification model pinpoints possible residues prone to future mutations that have implications for enhanced fusogenicity and pathogenicity of the virus. We believe such MRF analyses followed by a statistical mechanics augmented ML approach could help track different evolutionary stages of such species and identify a critical evolutionary transition that is alarming.
Affiliation(s)
- Satyam Sangeet
  - Department of Chemical Sciences, Indian Institute of Science Education and Research Kolkata, Kolkata, West Bengal 741246, India
- Raju Sarkar
  - Department of Chemical Sciences, Indian Institute of Science Education and Research Kolkata, Kolkata, West Bengal 741246, India
- Saswat K. Mohanty
  - Department of Chemical Sciences, Indian Institute of Science Education and Research Kolkata, Kolkata, West Bengal 741246, India
- Susmita Roy
  - Department of Chemical Sciences, Indian Institute of Science Education and Research Kolkata, Kolkata, West Bengal 741246, India

10. Das P, Shen T, McCord RP. Characterizing the variation in chromosome structure ensembles in the context of the nuclear microenvironment. PLoS Comput Biol 2022; 18:e1010392. [PMID: 35969616 PMCID: PMC9410561 DOI: 10.1371/journal.pcbi.1010392] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/18/2022] [Revised: 08/25/2022] [Accepted: 07/15/2022] [Indexed: 11/23/2022] [Open Access]
Abstract
Inside the nucleus, chromosomes are subjected to direct physical interaction between different components, active forces, and thermal noise, leading to the formation of an ensemble of three-dimensional structures. However, it is still not well understood to what extent and how the structural ensemble varies from one chromosome region or cell-type to another. We designed a statistical analysis technique and applied it to single-cell chromosome imaging data to reveal the heterogeneity of individual chromosome structures. By analyzing the resulting structural landscape, we find that the largest dynamic variation is the overall radius of gyration of the chromatin region, followed by domain reorganization within the region. By comparing different human cell-lines and experimental perturbation data using this statistical analysis technique and a network-based similarity quantification approach, we identify both cell-type and condition-specific features of the structural landscapes. We identify a relationship between epigenetic state and the properties of chromosome structure fluctuation and validate this relationship through polymer simulations. Overall, our study suggests that the types of variation in a chromosome structure ensemble are cell-type as well as region-specific and can be attributed to constraints placed on the structure by factors such as variation in epigenetic state. Recent work has revealed principles of how chromosomes are folded and structured inside the human nucleus. It is now even possible to microscopically trace the path of chromosomes in 3D in individual cells. With this data, we can start to examine how much variation exists in chromosome structure and what biological factors may restrict or enhance this variation. Are chromosomes stuck in just a few possible positions or do they move around more freely, sampling many configurations? 
Here, we use a mathematical approach to compare chromosome structure variation in different cell types, at different locations along the genome, and when key structural proteins are removed. Through these comparisons and dynamic simulations of chromosome behavior, we identify factors that may constrain or promote variation in chromosome structure.
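The radius of gyration named in this abstract as the largest mode of variation has a simple textbook definition: the root-mean-square distance of the traced positions from their centroid. The following is a minimal sketch of that definition only, assuming equally weighted loci. It does not reproduce the statistical analysis of the paper, and the function name and the toy `trace` coordinates are hypothetical.

```python
import math


def radius_of_gyration(coords):
    """Root-mean-square distance of 3D points from their centroid.

    `coords` is a list of (x, y, z) tuples, e.g. traced positions of
    consecutive chromatin segments in one cell. Every point is given
    equal weight (an assumption of this sketch).
    """
    n = len(coords)
    centroid = [sum(p[k] for p in coords) / n for k in range(3)]
    total_sq = sum(
        sum((p[k] - centroid[k]) ** 2 for k in range(3)) for p in coords
    )
    return math.sqrt(total_sq / n)


# Hypothetical trace: four loci on the corners of a 2 x 2 square
# (arbitrary units). The centroid is (1, 1, 0).
trace = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 2.0, 0.0), (0.0, 2.0, 0.0)]
rg = radius_of_gyration(trace)
print(rg)
```

Computing this quantity per cell and comparing its distribution across regions or cell types is one simple way to see how compact or expanded the traced structures are.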
Affiliation(s)
- Priyojit Das
  - UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, Tennessee, United States of America
- Tongye Shen
  - UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, Tennessee, United States of America
  - Biochemistry & Cellular and Molecular Biology, University of Tennessee, Knoxville, Tennessee, United States of America
- Rachel Patton McCord
  - UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, Tennessee, United States of America
  - Biochemistry & Cellular and Molecular Biology, University of Tennessee, Knoxville, Tennessee, United States of America

11.
Abstract
The human genome is arranged in the cell nucleus nonrandomly, and phase separation has been proposed as an important driving force for genome organization. However, the cell nucleus is an active system, and the contribution of nonequilibrium activities to phase separation and genome structure and dynamics remains to be explored. We simulated the genome using an energy function parametrized with chromosome conformation capture (Hi-C) data with the presence of active, nondirectional forces that break the detailed balance. We found that active forces that may arise from transcription and chromatin remodeling can dramatically impact the spatial localization of heterochromatin. When applied to euchromatin, active forces can drive heterochromatin to the nuclear envelope and compete with passive interactions among heterochromatin that tend to pull them in opposite directions. Furthermore, active forces induce long-range spatial correlations among genomic loci beyond single chromosome territories. We further showed that the impact of active forces could be understood from the effective temperature defined as the fluctuation-dissipation ratio. Our study suggests that nonequilibrium activities can significantly impact genome structure and dynamics, producing unexpected collective phenomena.
Affiliation(s)
- Zhongling Jiang
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139-4307, United States
- Yifeng Qi
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139-4307, United States
- Kartik Kamat
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139-4307, United States
- Bin Zhang
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139-4307, United States

12. Liu T, Wang Z. scHiCEmbed: Bin-Specific Embeddings of Single-Cell Hi-C Data Using Graph Auto-Encoders. Genes (Basel) 2022; 13:genes13061048. [PMID: 35741810 PMCID: PMC9222580 DOI: 10.3390/genes13061048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2022] [Revised: 06/08/2022] [Accepted: 06/09/2022] [Indexed: 02/05/2023] [Open Access]
Abstract
Most publicly accessible single-cell Hi-C data are sparse and cannot reach a higher resolution. Therefore, learning latent representations (bin-specific embeddings) of sparse single-cell Hi-C matrices would provide us with a novel way of mining valuable information hidden in the limited number of single-cell Hi-C contacts. We present scHiCEmbed, an unsupervised computational method for learning bin-specific embeddings of single-cell Hi-C data, and the computational system is applied to the tasks of 3D structure reconstruction of whole genomes and detection of topologically associating domains (TAD). The only input of scHiCEmbed is a raw or scHiCluster-imputed single-cell Hi-C matrix. The main process of scHiCEmbed is to embed each node/bin in a higher dimensional space using graph auto-encoders. The learned n-by-3 bin-specific embedding/latent matrix is considered the final reconstructed 3D genome structure. For TAD detection, we use constrained hierarchical clustering on the latent matrix to classify bins: S_Dbw is used to determine the optimal number of clusters, and each cluster is considered as one potential TAD. Our reconstructed 3D structures for individual chromatins at different cell stages reveal the expanding process of chromatins during the cell cycle. We observe that the TADs called from single-cell Hi-C data are not shared across individual cells and that the TAD boundaries called from raw or imputed single-cell Hi-C are significantly different from those called from bulk Hi-C, confirming the cell-to-cell variability in terms of TAD definitions. The source code for scHiCEmbed is publicly available, and the URL can be found in the conclusion section.
|
13
|
Bera P, Wasim A, Mondal J. Hi-C embedded polymer model of Escherichia coli reveals the origin of heterogeneous subdiffusion in chromosomal loci. Phys Rev E 2022; 105:064402. [PMID: 35854496 DOI: 10.1103/physreve.105.064402] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/29/2021] [Accepted: 05/10/2022] [Indexed: 06/15/2023]
Abstract
Underneath its apparently simple architecture, the circular chromosome of Escherichia coli is known for displaying complex dynamics in its cytoplasm, with past investigations hinting at inherently diverse mobilities of chromosomal loci across the genome. To decipher its origin, we simulate the dynamics of a genome-wide spectrum of E. coli chromosomal loci by integrating its experimentally derived Hi-C interaction matrix within a polymer-based model. Our analysis demonstrates that, while the dynamics of the chromosome is subdiffusive in a viscoelastic medium, the diffusion constants are strongly dependent on chromosomal loci coordinates and diffusive exponents (α) are widely heterogeneous with α ≈ 0.36-0.60. The loci-dependent heterogeneous dynamics and mean first-passage times of interloci encounter were found to be modulated via genetically distant interloci communications and are robust even in the presence of active, ATP-dependent noise. Control investigations reveal that the absence of Hi-C-derived interactions in the model would have abolished the traits of heterogeneous loci diffusion, underscoring the key role of loci-specific genetically distant interaction in modulating the underlying heterogeneity of the loci diffusion.
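As an illustration of what a diffusive exponent α means in practice, the sketch below fits a power law MSD(t) = K·t^α to mean-squared-displacement values by ordinary least squares in log-log space. This is a generic estimation technique, not necessarily the procedure used by the authors; the function name, the lag times, and the synthetic MSD values (generated with α = 0.45, a value inside the reported 0.36-0.60 range) are assumptions made for the example.

```python
import math


def fit_power_law(lags, msd):
    """Least-squares fit of log(msd) = log(K) + alpha * log(lag).

    Returns (alpha, K). alpha < 1 indicates subdiffusion.
    """
    xs = [math.log(t) for t in lags]
    ys = [math.log(m) for m in msd]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    prefactor = math.exp(mean_y - alpha * mean_x)
    return alpha, prefactor


# Synthetic, noise-free MSD curve for one hypothetical locus.
lags = [1, 2, 4, 8, 16, 32]
msd = [0.05 * t ** 0.45 for t in lags]
alpha, prefactor = fit_power_law(lags, msd)
```

Repeating such a fit locus by locus would give one exponent per chromosomal coordinate, which is the kind of heterogeneity the abstract reports; with noisy simulation data the fitted range of lag times would need to be chosen with care.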
Affiliation(s)
- Palash Bera
  - Tata Institute of Fundamental Research, Hyderabad 500046, India
- Abdul Wasim
  - Tata Institute of Fundamental Research, Hyderabad 500046, India
|
14
|
Xie L, Dong P, Qi Y, Hsieh THS, English BP, Jung S, Chen X, De Marzio M, Casellas R, Chang HY, Zhang B, Tjian R, Liu Z. BRD2 compartmentalizes the accessible genome. Nat Genet 2022; 54:481-491. [PMID: 35410381 PMCID: PMC9099420 DOI: 10.1038/s41588-022-01044-9] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 12/31/2020] [Accepted: 03/01/2022] [Indexed: 12/15/2022]
Abstract
Mammalian chromosomes are organized into megabase-sized compartments that are further subdivided into topologically associated domains (TADs). While the formation of TADs is dependent on Cohesin, the mechanism behind compartmentalization remains enigmatic. Here, we show that the bromodomain and extraterminal (BET) family scaffold protein BRD2 promotes spatial mixing and compartmentalization of active chromatin after Cohesin loss. This activity is independent of transcription but requires BRD2 to recognize acetylated targets through its double bromodomain and interact with binding partners with its low complexity domain. Notably, genome compartmentalization mediated by BRD2 is antagonized on one hand by Cohesin and on the other by the BET homolog protein BRD4, both of which inhibit BRD2 binding to chromatin. Polymer simulation of our data supports a BRD2-Cohesin interplay model of nuclear topology, where genome compartmentalization results from a competition between loop extrusion and chromatin state-specific affinity interactions.
|
15
|
Chu X, Wang J. Deciphering the molecular mechanism of the cancer formation by chromosome structural dynamics. PLoS Comput Biol 2021; 17:e1009596. [PMID: 34752443 PMCID: PMC8631624 DOI: 10.1371/journal.pcbi.1009596] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/02/2021] [Revised: 11/30/2021] [Accepted: 10/28/2021] [Indexed: 12/15/2022] Open
Abstract
Cancer reflects the dysregulation of the underlying gene network, which is strongly related to the 3D genome organization. Numerous efforts have been spent on experimental characterizations of the structural alterations in cancer genomes. However, there is still a lack of genomic structural-level understanding of the temporal dynamics for cancer initiation and progression. Here, we use a landscape-switching model to investigate the chromosome structural transition during the cancerization and reversion processes. We find that the chromosome undergoes a non-monotonic structural shape-changing pathway with initial expansion followed by compaction during both of these processes. Furthermore, our analysis reveals that the chromosome with a more expanding structure than those at both the normal and cancer cell during cancerization exhibits a sparse contact pattern, which shows significant structural similarity to the one at the embryonic stem cell in many aspects, including the trend of contact probability declining with the genomic distance, the global structural shape geometry and the spatial distribution of loci on the chromosome. In light of the intimate structure-function relationship at the chromosomal level, we further describe the cell state transition processes by the chromosome structural changes, suggesting an elevated cell stemness during the formation of the cancer cells. We show that cell cancerization and reversion are highly irreversible processes in terms of the chromosome structural transition pathways, spatial repositioning of chromosomal loci and hysteresis loop of contact evolution analysis. Our model draws a molecular-scale picture of cell cancerization from the chromosome structural perspective. The process contains initial reprogramming towards the stem cell followed by the differentiation towards the cancer cell, accompanied by an initial increase and subsequent decrease of the cell stemness.
Affiliation(s)
- Xiakun Chu
  - Department of Chemistry, State University of New York at Stony Brook, Stony Brook, New York, United States of America
- Jin Wang
  - Department of Chemistry, State University of New York at Stony Brook, Stony Brook, New York, United States of America
  - Department of Physics and Astronomy, State University of New York at Stony Brook, Stony Brook, New York, United States of America
|
16
|
Lin X, Qi Y, Latham AP, Zhang B. Multiscale modeling of genome organization with maximum entropy optimization. J Chem Phys 2021; 155:010901. [PMID: 34241389 PMCID: PMC8253599 DOI: 10.1063/5.0044150] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 01/14/2021] [Accepted: 04/28/2021] [Indexed: 12/15/2022] Open
Abstract
Three-dimensional (3D) organization of the human genome plays an essential role in all DNA-templated processes, including gene transcription, gene regulation, and DNA replication. Computational modeling can be an effective way of building high-resolution genome structures and improving our understanding of these molecular processes. However, it faces significant challenges as the human genome consists of over 6 × 10^9 base pairs, a system size that exceeds the capacity of traditional modeling approaches. In this perspective, we review the progress that has been made in modeling the human genome. Coarse-grained models parameterized to reproduce experimental data via the maximum entropy optimization algorithm serve as effective means to study genome organization at various length scales. They have provided insight into the principles of whole-genome organization and enabled de novo predictions of chromosome structures from epigenetic modifications. Applications of these models at a near-atomistic resolution further revealed physicochemical interactions that drive the phase separation of disordered proteins and dictate chromatin stability in situ. We conclude with an outlook on the opportunities and challenges in studying chromosome dynamics.
Affiliation(s)
- Xingcheng Lin
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Yifeng Qi
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Andrew P. Latham
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Bin Zhang
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
|
17
|
Sood A, Zhang B. Quantifying the Stability of Coupled Genetic and Epigenetic Switches With Variational Methods. Front Genet 2021; 11:636724. [PMID: 33552146 PMCID: PMC7862759 DOI: 10.3389/fgene.2020.636724] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/01/2020] [Accepted: 12/29/2020] [Indexed: 01/23/2023] Open
Abstract
The Waddington landscape provides an intuitive metaphor to view development as a ball rolling down the hill, with distinct phenotypes as basins and differentiation pathways as valleys. Since, at a molecular level, cell differentiation arises from interactions among the genes, a mathematical definition for the Waddington landscape can, in principle, be obtained by studying the gene regulatory networks. For eukaryotes, gene regulation is inextricably and intimately linked to histone modifications. However, the impact of such modifications on both landscape topography and stability of attractor states is not fully understood. In this work, we introduced a minimal kinetic model for gene regulation that combines the impact of both histone modifications and transcription factors. We further developed an approximation scheme based on variational principles to solve the corresponding master equation in a second quantized framework. By analyzing the steady-state solutions at various parameter regimes, we found that histone modification kinetics can significantly alter the behavior of a genetic network, resulting in qualitative changes in gene expression profiles. The emerging epigenetic landscape captures the delicate interplay between transcription factors and histone modifications in driving cell-fate decisions.
Affiliation(s)
- Amogh Sood
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, United States
- Bin Zhang
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, United States
|
18
|
Qi Y, Reyes A, Johnstone SE, Aryee MJ, Bernstein BE, Zhang B. Data-Driven Polymer Model for Mechanistic Exploration of Diploid Genome Organization. Biophys J 2020; 119:1905-1916. [PMID: 33086041 DOI: 10.1016/j.bpj.2020.09.009] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 04/29/2020] [Revised: 08/24/2020] [Accepted: 09/08/2020] [Indexed: 12/21/2022] Open
Abstract
Chromosomes are positioned nonrandomly inside the nucleus to coordinate with their transcriptional activity. The molecular mechanisms that dictate the global genome organization and the nuclear localization of individual chromosomes are not fully understood. We introduce a polymer model to study the organization of the diploid human genome. It is data-driven because all parameters can be derived from Hi-C data; it is also a mechanistic model because the energy function is explicitly written out based on a few biologically motivated hypotheses. These two features distinguish the model from existing approaches and make it useful both for reconstructing genome structures and for exploring the principles of genome organization. We carried out extensive validations to show that simulated genome structures reproduce a wide variety of experimental measurements, including chromosome radial positions and spatial distances between homologous pairs. Detailed mechanistic investigations support the importance of both specific interchromosomal interactions and centromere clustering for chromosome positioning. We anticipate the polymer model, when combined with Hi-C experiments, to be a powerful tool for investigating large-scale rearrangements in genome structure upon cell differentiation and tumor progression.
Affiliation(s)
- Yifeng Qi
  - Departments of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts
- Alejandro Reyes
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts; Department of Data Sciences, Dana Farber Cancer Institute, Boston, Massachusetts; Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts
- Sarah E Johnstone
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts; Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts; Center for Cancer Research, Massachusetts General Hospital, Boston, Massachusetts
- Martin J Aryee
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts; Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts; Center for Cancer Research, Massachusetts General Hospital, Boston, Massachusetts
- Bradley E Bernstein
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts; Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts; Center for Cancer Research, Massachusetts General Hospital, Boston, Massachusetts
- Bin Zhang
  - Departments of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts
|