1
Bammidi LS, Gayen S. Multifaceted role of CTCF in X-chromosome inactivation. Chromosoma 2024; 133:217-231. [PMID: 39433641 DOI: 10.1007/s00412-024-00826-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Revised: 10/02/2024] [Accepted: 10/07/2024] [Indexed: 10/23/2024]
Abstract
Therian female mammals compensate for the dosage of X-linked gene expression by inactivating one of the X-chromosomes. X-inactivation is facilitated by the master regulator Xist long non-coding RNA, which coats the inactive-X and facilitates heterochromatinization through recruiting different chromatin modifiers and changing the X-chromosome 3D conformation. However, many mechanistic aspects behind the X-inactivation process remain poorly understood. Among the many contributing players, CTCF has emerged as one of the key players in orchestrating various aspects related to X-chromosome inactivation by interacting with several other protein and RNA partners. In general, CTCF is a well-known architectural protein, which plays an important role in chromatin organization and transcriptional regulation. Here, we provide significant insight into the role of CTCF in orchestrating X-chromosome inactivation and highlight future perspectives.
Affiliation(s)
- Lakshmi Sowjanya Bammidi
- Chromatin RNA and Genome (CRG) Lab, Department of Developmental Biology and Genetics, Indian Institute of Science, Bangalore-560012, India
- Srimonta Gayen
- Chromatin RNA and Genome (CRG) Lab, Department of Developmental Biology and Genetics, Indian Institute of Science, Bangalore-560012, India
2
Lam JC, Aboreden NG, Midla SC, Wang S, Huang A, Keller CA, Giardine B, Henderson KA, Hardison RC, Zhang H, Blobel GA. YY1-controlled regulatory connectivity and transcription are influenced by the cell cycle. Nat Genet 2024; 56:1938-1952. [PMID: 39210046 PMCID: PMC11687402 DOI: 10.1038/s41588-024-01871-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Accepted: 07/16/2024] [Indexed: 09/04/2024]
Abstract
Few transcription factors have been examined for their direct roles in physically connecting enhancers and promoters. Here acute degradation of Yin Yang 1 (YY1) in erythroid cells revealed its requirement for the maintenance of numerous enhancer-promoter loops, but not compartments or domains. Despite its reported ability to interact with cohesin, the formation of YY1-dependent enhancer-promoter loops does not involve stalling of cohesin-mediated loop extrusion. Integrating mitosis-to-G1-phase dynamics, we observed partial retention of YY1 on mitotic chromatin, predominantly at gene promoters, followed by rapid rebinding during mitotic exit, coinciding with enhancer-promoter loop establishment. YY1 degradation during the mitosis-to-G1-phase interval revealed a set of enhancer-promoter loops that require YY1 for establishment during G1-phase entry but not for maintenance in interphase, suggesting that cell cycle stage influences YY1's architectural function. Thus, as revealed here for YY1, chromatin architectural functions of transcription factors can vary in their interplay with CTCF and cohesin as well as by cell cycle stage.
Affiliation(s)
- Jessica C Lam
- Division of Hematology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Nicholas G Aboreden
- Division of Hematology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Susannah C Midla
- Division of Hematology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Siqing Wang
- Division of Hematology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Anran Huang
- Division of Hematology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Cheryl A Keller
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, USA
- Genomics Research Incubator, Pennsylvania State University, University Park, PA, USA
- Belinda Giardine
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, USA
- Kate A Henderson
- Division of Hematology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Ross C Hardison
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, USA
- Haoyue Zhang
- Institute of Molecular Physiology, Shenzhen Bay Laboratory, Shenzhen, China
- Gerd A Blobel
- Division of Hematology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
3
Liu B, Zhang W, Zeng X, Loza M, Park SJ, Nakai K. TF-EPI: an interpretable enhancer-promoter interaction detection method based on Transformer. Front Genet 2024; 15:1444459. [PMID: 39184348 PMCID: PMC11341371 DOI: 10.3389/fgene.2024.1444459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2024] [Accepted: 07/24/2024] [Indexed: 08/27/2024]
Abstract
The detection of enhancer-promoter interactions (EPIs) is crucial for understanding gene expression regulation, disease mechanisms, and more. In this study, we developed TF-EPI, a deep learning model based on Transformer designed to detect these interactions solely from DNA sequences. The performance of TF-EPI surpassed that of other state-of-the-art methods on multiple benchmark datasets. Importantly, by utilizing the attention mechanism of the Transformer, we identified distinct cell type-specific motifs and sequences in enhancers and promoters, which were validated against databases such as JASPAR and UniBind, highlighting the potential of our method in discovering new biological insights. Moreover, our analysis of the transcription factors (TFs) corresponding to these motifs and short sequence pairs revealed the heterogeneity and commonality of gene regulatory mechanisms and demonstrated the ability to identify TFs relevant to the source information of the cell line. Finally, the introduction of transfer learning can mitigate the challenges posed by cell type-specific gene regulation, yielding enhanced accuracy in cross-cell line EPI detection. Overall, our work unveils important sequence information for the investigation of enhancer-promoter pairs based on the attention mechanism of the Transformer, providing an important milestone in the investigation of cis-regulatory grammar.
Affiliation(s)
- Bowen Liu
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, Tokyo, Japan
- Weihang Zhang
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, Tokyo, Japan
- Xin Zeng
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, Tokyo, Japan
- Martin Loza
- Human Genome Center, Institute of Medical Science, University of Tokyo, Tokyo, Japan
- Sung-Joon Park
- Human Genome Center, Institute of Medical Science, University of Tokyo, Tokyo, Japan
- Kenta Nakai
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, Tokyo, Japan
- Human Genome Center, Institute of Medical Science, University of Tokyo, Tokyo, Japan
4
Oh JW, Beer MA. Gapped-kmer sequence modeling robustly identifies regulatory vocabularies and distal enhancers conserved between evolutionarily distant mammals. Nat Commun 2024; 15:6464. [PMID: 39085231 PMCID: PMC11291912 DOI: 10.1038/s41467-024-50708-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Accepted: 07/17/2024] [Indexed: 08/02/2024]
Abstract
Gene regulatory elements drive complex biological phenomena and their mutations are associated with common human diseases. The impacts of human regulatory variants are often tested using model organisms such as mice. However, mapping human enhancers to conserved elements in mice remains a challenge, due to both rapid enhancer evolution and limitations of current computational methods. We analyze distal enhancers across 45 matched human/mouse cell/tissue pairs from a comprehensive dataset of DNase-seq experiments, and show that while cell-specific regulatory vocabulary is conserved, enhancers evolve more rapidly than promoters and CTCF binding sites. Enhancer conservation rates vary across cell types, in part explainable by tissue specific transposable element activity. We present an improved genome alignment algorithm using gapped-kmer features, called gkm-align, and make genome wide predictions for 1,401,803 orthologous regulatory elements. We show that gkm-align discovers 23,660 novel human/mouse conserved enhancers missed by previous algorithms, with strong evidence of conserved functional activity.
Affiliation(s)
- Jin Woo Oh
- Department of Biomedical Engineering and McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
- Michael A Beer
- Department of Biomedical Engineering and McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
5
Dai Y, Itai T, Pei G, Yan F, Chu Y, Jiang X, Weinberg SM, Mukhopadhyay N, Marazita ML, Simon LM, Jia P, Zhao Z. DeepFace: Deep-learning-based framework to contextualize orofacial-cleft-related variants during human embryonic craniofacial development. HGG Adv 2024; 5:100312. [PMID: 38796699 PMCID: PMC11193024 DOI: 10.1016/j.xhgg.2024.100312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 05/23/2024] [Accepted: 05/23/2024] [Indexed: 05/28/2024]
Abstract
Orofacial clefts (OFCs) are among the most common human congenital birth defects. Previous multiethnic studies have identified dozens of associated loci for both cleft lip with or without cleft palate (CL/P) and cleft palate alone (CP). Although several nearby genes have been highlighted, the "causal" variants are largely unknown. Here, we developed DeepFace, a convolutional neural network model, to assess the functional impact of variants by SNP activity difference (SAD) scores. The DeepFace model is trained with 204 epigenomic assays from crucial human embryonic craniofacial developmental stages of post-conception week (pcw) 4 to pcw 10. The Pearson correlation coefficient between the predicted and actual values for 12 epigenetic features achieved a median range of 0.50-0.83. Specifically, our model revealed that SNPs significantly associated with OFCs tended to exhibit higher SAD scores across various variant categories compared to less related groups, indicating a context-specific impact of OFC-related SNPs. Notably, we identified six SNPs with a significant linear relationship to SAD scores throughout developmental progression, suggesting that these SNPs could play a temporal regulatory role. Furthermore, our cell-type specificity analysis pinpointed the trophoblast cell as having the highest enrichment of risk signals associated with OFCs. Overall, DeepFace can harness distal regulatory signals from extensive epigenomic assays, offering new perspectives for prioritizing OFC variants using contextualized functional genomic features. We expect DeepFace to be instrumental in assessing and predicting the regulatory roles of variants associated with OFCs, and the model can be extended to study other complex diseases or traits.
Affiliation(s)
- Yulin Dai
- Center for Precision Health, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Toshiyuki Itai
- Center for Precision Health, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Guangsheng Pei
- Center for Precision Health, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Fangfang Yan
- Center for Precision Health, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Yan Chu
- Center for Secure Artificial Intelligence for Healthcare, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Xiaoqian Jiang
- Center for Secure Artificial Intelligence for Healthcare, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Seth M Weinberg
- Department of Oral and Craniofacial Sciences, School of Dental Medicine, Center for Craniofacial and Dental Genetics, University of Pittsburgh, Pittsburgh, PA 15213, USA; Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, PA 15261, USA
- Nandita Mukhopadhyay
- Department of Oral and Craniofacial Sciences, School of Dental Medicine, Center for Craniofacial and Dental Genetics, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Mary L Marazita
- Department of Oral and Craniofacial Sciences, School of Dental Medicine, Center for Craniofacial and Dental Genetics, University of Pittsburgh, Pittsburgh, PA 15213, USA; Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, PA 15261, USA; Clinical and Translational Science Institute, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Lukas M Simon
- Therapeutic Innovation Center, Baylor College of Medicine, Houston, TX 77030, USA
- Peilin Jia
- Center for Precision Health, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Zhongming Zhao
- Center for Precision Health, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, Houston, TX 77030, USA
6
Razavi-Mohseni M, Huang W, Guo YA, Shigaki D, Ho SWT, Tan P, Skanderup AJ, Beer MA. Machine learning identifies activation of RUNX/AP-1 as drivers of mesenchymal and fibrotic regulatory programs in gastric cancer. Genome Res 2024; 34:680-695. [PMID: 38777607 PMCID: PMC11216402 DOI: 10.1101/gr.278565.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Accepted: 05/13/2024] [Indexed: 05/25/2024]
Abstract
Gastric cancer (GC) is the fifth most common cancer worldwide and is a heterogeneous disease. Among GC subtypes, the mesenchymal phenotype (Mes-like) is more invasive than the epithelial phenotype (Epi-like). Although gene expression of the epithelial-to-mesenchymal transition (EMT) has been studied, the regulatory landscape shaping this process is not fully understood. Here we use ATAC-seq and RNA-seq data from a compendium of GC cell lines and primary tumors to detect drivers of regulatory state changes and their transcriptional responses. Using the ATAC-seq data, we developed a machine learning approach to determine the transcription factors (TFs) regulating the subtypes of GC. We identified TFs driving the mesenchymal (RUNX2, ZEB1, SNAI2, AP-1 dimer) and the epithelial (GATA4, GATA6, KLF5, HNF4A, FOXA2, GRHL2) states in GC. We identified DNA copy number alterations associated with dysregulation of these TFs, specifically deletion of GATA4 and amplification of MAPK9. Comparisons with bulk and single-cell RNA-seq data sets identified activation toward fibroblast-like epigenomic and expression signatures in Mes-like GC. The activation of this mesenchymal fibrotic program is associated with differentially accessible DNA cis-regulatory elements flanking upregulated mesenchymal genes. These findings establish a map of TF activity in GC and highlight the role of copy number driven alterations in shaping epigenomic regulatory programs as potential drivers of GC heterogeneity and progression.
Affiliation(s)
- Milad Razavi-Mohseni
- Department of Biomedical Engineering and McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland 21205, USA
- Weitai Huang
- Laboratory of Computational Cancer Genomics, Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore 138672
- Yu A Guo
- Laboratory of Computational Cancer Genomics, Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore 138672
- Dustin Shigaki
- Department of Biomedical Engineering and McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland 21205, USA
- Shamaine Wei Ting Ho
- Laboratory of Cancer Epigenetic Regulation, Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore 138672
- Patrick Tan
- Laboratory of Cancer Epigenetic Regulation, Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore 138672
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore 169857
- Cancer Science Institute of Singapore, National University of Singapore, Singapore 117599
- Department of Physiology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117593
- Anders J Skanderup
- Laboratory of Computational Cancer Genomics, Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore 138672
- Michael A Beer
- Department of Biomedical Engineering and McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland 21205, USA
7
Bose S, Saha S, Goswami H, Shanmugam G, Sarkar K. Involvement of CCCTC-binding factor in epigenetic regulation of cancer. Mol Biol Rep 2023; 50:10383-10398. [PMID: 37840067 DOI: 10.1007/s11033-023-08879-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/13/2023] [Accepted: 10/03/2023] [Indexed: 10/17/2023]
Abstract
A major global health burden continues to be borne by the complex and multifaceted disease of cancer. Epigenetic changes, which are essential for the emergence and spread of cancer, have drawn a huge amount of attention recently. The CCCTC-binding factor (CTCF), which takes part in a wide range of cellular processes including genomic imprinting, X chromosome inactivation, 3D chromatin architecture, local modifications of histone, and RNA polymerase II-mediated gene transcription, stands out among the diverse array of epigenetic regulators. CTCF not only functions as an architectural protein but also modulates DNA methylation and histone modifications. Epigenetic regulation of cancer has already been the focus of plenty of studies. Understanding the role of CTCF in the cancer epigenetic landscape may lead to the development of novel targeted therapeutic strategies for cancer. CTCF has already earned its status as a tumor suppressor gene by acting like a homeostatic regulator of genome integrity and function. Moreover, CTCF has a direct effect on many important transcriptional regulators that control the cell cycle, apoptosis, senescence, and differentiation. As we learn more about CTCF-mediated epigenetic modifications and transcriptional regulations, the possibility of utilizing CTCF as a diagnostic marker and therapeutic target for cancer will also increase. Thus, the current review intends to promote personalized and precision-based therapeutics for cancer patients by shedding light on the complex interplay between CTCF and epigenetic processes.
Affiliation(s)
- Sayani Bose
- Department of Biotechnology, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, 603203, India
- Srawsta Saha
- Department of Biotechnology, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, 603203, India
- Harsita Goswami
- Department of Biotechnology, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, 603203, India
- Geetha Shanmugam
- Department of Biotechnology, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, 603203, India
- Koustav Sarkar
- Department of Biotechnology, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, 603203, India
8
Toussaint PA, Leiser F, Thiebes S, Schlesner M, Brors B, Sunyaev A. Explainable artificial intelligence for omics data: a systematic mapping study. Brief Bioinform 2023; 25:bbad453. [PMID: 38113073 PMCID: PMC10729786 DOI: 10.1093/bib/bbad453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Revised: 07/28/2023] [Accepted: 11/08/2023] [Indexed: 12/21/2023]
Abstract
Researchers increasingly turn to explainable artificial intelligence (XAI) to analyze omics data and gain insights into the underlying biological processes. Yet, given the interdisciplinary nature of the field, many findings have only been shared in their respective research community. An overview of XAI for omics data is needed to highlight promising approaches and help detect common issues. Toward this end, we conducted a systematic mapping study. To identify relevant literature, we queried Scopus, PubMed, Web of Science, BioRxiv, MedRxiv and arXiv. Based on keywording, we developed a coding scheme with 10 facets regarding the studies' AI methods, explainability methods and omics data. Our mapping study resulted in 405 included papers published between 2010 and 2023. The inspected papers analyze DNA-based (mostly genomic), transcriptomic, proteomic or metabolomic data by means of neural networks, tree-based methods, statistical methods and further AI methods. The preferred post-hoc explainability methods are feature relevance (n = 166) and visual explanation (n = 52), while papers using interpretable approaches often resort to the use of transparent models (n = 83) or architecture modifications (n = 72). With many research gaps still apparent for XAI for omics data, we deduced eight research directions and discuss their potential for the field. We also provide exemplary research questions for each direction. Many problems with the adoption of XAI for omics data in clinical practice are yet to be resolved. This systematic mapping study outlines extant research on the topic and provides research directions for researchers and practitioners.
Affiliation(s)
- Philipp A Toussaint
- Department of Economics and Management, Karlsruhe Institute of Technology, Karlsruhe, Germany
- HIDSS4Health – Helmholtz Information and Data Science School for Health, Karlsruhe, Heidelberg, Germany
- Florian Leiser
- Department of Economics and Management, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Scott Thiebes
- Department of Economics and Management, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Matthias Schlesner
- Biomedical Informatics, Data Mining and Data Analytics, Faculty of Applied Computer Science and Medical Faculty, University of Augsburg, Augsburg, Germany
- Benedikt Brors
- Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Translational Oncology, National Center for Tumor Diseases, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Ali Sunyaev
- Department of Economics and Management, Karlsruhe Institute of Technology, Karlsruhe, Germany
9
Xu H, Yi X, Fan X, Wu C, Wang W, Chu X, Zhang S, Dong X, Wang Z, Wang J, Zhou Y, Zhao K, Yao H, Zheng N, Wang J, Chen Y, Plewczynski D, Sham PC, Chen K, Huang D, Li MJ. Inferring CTCF-binding patterns and anchored loops across human tissues and cell types. Patterns (N Y) 2023; 4:100798. [PMID: 37602215 PMCID: PMC10436006 DOI: 10.1016/j.patter.2023.100798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Revised: 01/25/2023] [Accepted: 06/20/2023] [Indexed: 08/22/2023]
Abstract
CCCTC-binding factor (CTCF) is a transcription regulator with a complex role in gene regulation. The recognition and effects of CTCF on DNA sequences, chromosome barriers, and enhancer blocking are not well understood. Existing computational tools struggle to assess the regulatory potential of CTCF-binding sites and their impact on chromatin loop formation. Here we have developed a deep-learning model, DeepAnchor, to accurately characterize CTCF binding using high-resolution genomic/epigenomic features. This has revealed distinct chromatin and sequence patterns for CTCF-mediated insulation and looping. An optimized implementation of a previous loop model based on DeepAnchor score excels in predicting CTCF-anchored loops. We have established a compendium of CTCF-anchored loops across 52 human tissue/cell types, and this suggests that genomic disruption of these loops could be a general mechanism of disease pathogenesis. These computational models and resources can help investigate how CTCF-mediated cis-regulatory elements shape context-specific gene regulation in cell development and disease progression.
Affiliation(s)
- Hang Xu
- Department of Epidemiology and Biostatistics, Key Laboratory of Prevention and Control of Human Major Diseases (Ministry of Education), National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A*STAR), Singapore 138648, Singapore
- Xianfu Yi
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Xutong Fan
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Chengyue Wu
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Wei Wang
- Department of Epidemiology and Biostatistics, Key Laboratory of Prevention and Control of Human Major Diseases (Ministry of Education), National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Xinlei Chu
- Department of Epidemiology and Biostatistics, Key Laboratory of Prevention and Control of Human Major Diseases (Ministry of Education), National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Shijie Zhang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Xiaobao Dong
- Department of Genetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Zhao Wang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Jianhua Wang
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Yao Zhou
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Ke Zhao
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Hongcheng Yao
- Centre for PanorOmic Sciences-Genomics and Bioinformatics Cores, The University of Hong Kong, Hong Kong 999077, China
- Nan Zheng
- Department of Network Security and Informatization, Tianjin Medical University, Tianjin 300070, China
- Junwen Wang
- Department of Health Sciences Research and Center for Individualized Medicine, Mayo Clinic, Scottsdale, AZ 85259, USA
- Yupeng Chen
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Dariusz Plewczynski
- Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
- Pak Chung Sham
- Centre for PanorOmic Sciences-Genomics and Bioinformatics Cores, The University of Hong Kong, Hong Kong 999077, China
- Kexin Chen
- Department of Epidemiology and Biostatistics, Key Laboratory of Prevention and Control of Human Major Diseases (Ministry of Education), National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Dandan Huang
- Wuxi School of Medicine, Jiangnan University, Wuxi 214122, China
- Mulin Jun Li
- Department of Epidemiology and Biostatistics, Key Laboratory of Prevention and Control of Human Major Diseases (Ministry of Education), National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
10
Luo R, Yan J, Oh JW, Xi W, Shigaki D, Wong W, Cho HS, Murphy D, Cutler R, Rosen BP, Pulecio J, Yang D, Glenn RA, Chen T, Li QV, Vierbuchen T, Sidoli S, Apostolou E, Huangfu D, Beer MA. Dynamic network-guided CRISPRi screen identifies CTCF-loop-constrained nonlinear enhancer gene regulatory activity during cell state transitions. Nat Genet 2023; 55:1336-1346. [PMID: 37488417 PMCID: PMC11012226 DOI: 10.1038/s41588-023-01450-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 10/15/2022] [Accepted: 06/20/2023] [Indexed: 07/26/2023]
Abstract
Comprehensive enhancer discovery is challenging because most enhancers, especially those contributing to complex diseases, have weak effects on gene expression. Our gene regulatory network modeling identified that nonlinear enhancer gene regulation during cell state transitions can be leveraged to improve the sensitivity of enhancer discovery. Using human embryonic stem cell definitive endoderm differentiation as a dynamic transition system, we conducted a mid-transition CRISPRi-based enhancer screen. We discovered a comprehensive set of enhancers for each of the core endoderm-specifying transcription factors. Many enhancers had strong effects mid-transition but weak effects post-transition, consistent with the nonlinear temporal responses to enhancer perturbation predicted by the modeling. Integrating three-dimensional genomic information, we were able to develop a CTCF-loop-constrained Interaction Activity model that can better predict functional enhancers compared to models that rely on Hi-C-based enhancer-promoter contact frequency. Our study provides generalizable strategies for sensitive and systematic enhancer discovery in both normal and pathological cell state transitions.
Affiliation(s)
- Renhe Luo
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
- Louis V. Gerstner Jr. Graduate School of Biomedical Sciences, Memorial Sloan Kettering Cancer Center, New York City, NY, USA
| | - Jielin Yan
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
- Louis V. Gerstner Jr. Graduate School of Biomedical Sciences, Memorial Sloan Kettering Cancer Center, New York City, NY, USA
| | - Jin Woo Oh
- Department of Biomedical Engineering and McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
| | - Wang Xi
- Department of Biomedical Engineering and McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
| | - Dustin Shigaki
- Department of Biomedical Engineering and McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
| | - Wilfred Wong
- Computational & Systems Biology Program, Sloan Kettering Institute, New York City, NY, USA
- Weill Cornell Graduate School of Medical Sciences, Weill Cornell Medicine, New York City, NY, USA
| | - Hyein S Cho
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
| | - Dylan Murphy
- Weill Cornell Graduate School of Medical Sciences, Weill Cornell Medicine, New York City, NY, USA
- Department of Medicine, Weill Cornell Medicine, New York City, NY, USA
| | - Ronald Cutler
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, NY, USA
| | - Bess P Rosen
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
- Weill Cornell Graduate School of Medical Sciences, Weill Cornell Medicine, New York City, NY, USA
| | - Julian Pulecio
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
| | - Dapeng Yang
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
| | - Rachel A Glenn
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
- Weill Cornell Graduate School of Medical Sciences, Weill Cornell Medicine, New York City, NY, USA
| | - Tingxu Chen
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
- Louis V. Gerstner Jr. Graduate School of Biomedical Sciences, Memorial Sloan Kettering Cancer Center, New York City, NY, USA
| | - Qing V Li
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
- Louis V. Gerstner Jr. Graduate School of Biomedical Sciences, Memorial Sloan Kettering Cancer Center, New York City, NY, USA
| | - Thomas Vierbuchen
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
| | - Simone Sidoli
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, NY, USA
| | - Effie Apostolou
- Department of Medicine, Weill Cornell Medicine, New York City, NY, USA
| | - Danwei Huangfu
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA.
| | - Michael A Beer
- Department of Biomedical Engineering and McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA.
| |
|
11
|
Liu T, Wang Z. DeepChIA-PET: Accurately predicting ChIA-PET from Hi-C and ChIP-seq with deep dilated networks. PLoS Comput Biol 2023; 19:e1011307. [PMID: 37440599 PMCID: PMC10368233 DOI: 10.1371/journal.pcbi.1011307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Accepted: 06/26/2023] [Indexed: 07/15/2023] Open
Abstract
Chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) can capture genome-wide chromatin interactions mediated by a specific DNA-associated protein. The ChIA-PET experiments have been applied to explore the key roles of different protein factors in chromatin folding and transcription regulation. However, compared with widely available Hi-C and ChIP-seq data, there are not many ChIA-PET datasets available in the literature. A computational method for accurately predicting ChIA-PET interactions from Hi-C and ChIP-seq data is needed that can save the efforts of performing wet-lab experiments. Here we present DeepChIA-PET, a supervised deep learning approach that can accurately predict ChIA-PET interactions by learning the latent relationships between ChIA-PET and two widely used data types: Hi-C and ChIP-seq. We trained our deep models with CTCF-mediated ChIA-PET of GM12878 as ground truth, and the deep network contains 40 dilated residual convolutional blocks. We first showed that DeepChIA-PET with only Hi-C as input significantly outperforms Peakachu, another computational method for predicting ChIA-PET from Hi-C but using random forests. We next proved that adding ChIP-seq as one extra input does improve the classification performance of DeepChIA-PET, but Hi-C plays a more prominent role in DeepChIA-PET than ChIP-seq. Our evaluation results indicate that our learned models can accurately predict not only CTCF-mediated ChIA-PET in GM12878 and HeLa but also non-CTCF ChIA-PET interactions, including RNA polymerase II (RNAPII) ChIA-PET of GM12878, RAD21 ChIA-PET of GM12878, and RAD21 ChIA-PET of K562. In total, DeepChIA-PET is an accurate tool for predicting the ChIA-PET interactions mediated by various chromatin-associated proteins from different cell types.
Affiliation(s)
- Tong Liu
- Department of Computer Science, University of Miami, Coral Gables, Florida, United States of America
| | - Zheng Wang
- Department of Computer Science, University of Miami, Coral Gables, Florida, United States of America
| |
|
12
|
Chan B, Rubinstein M. Theory of chromatin organization maintained by active loop extrusion. Proc Natl Acad Sci U S A 2023; 120:e2222078120. [PMID: 37253009 PMCID: PMC10266055 DOI: 10.1073/pnas.2222078120] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 01/02/2023] [Accepted: 04/13/2023] [Indexed: 06/01/2023] Open
Abstract
The active loop extrusion hypothesis proposes that chromatin threads through the cohesin protein complex into progressively larger loops until reaching specific boundary elements. We build upon this hypothesis and develop an analytical theory for active loop extrusion which predicts that loop formation probability is a nonmonotonic function of loop length and describes chromatin contact probabilities. We validate our model with Monte Carlo and hybrid Molecular Dynamics-Monte Carlo simulations and demonstrate that our theory recapitulates experimental chromatin conformation capture data. Our results support active loop extrusion as a mechanism for chromatin organization and provide an analytical description of chromatin organization that may be used to specifically modify chromatin contact probabilities.
Affiliation(s)
- Brian Chan
- Department of Biomedical Engineering, Duke University, Durham, NC 27708
| | - Michael Rubinstein
- Department of Biomedical Engineering, Duke University, Durham, NC 27708
- Department of Mechanical Engineering and Materials Science, Duke University, Durham, NC 27708
- Department of Chemistry, Duke University, Durham, NC 27708
- Department of Physics, Duke University, Durham, NC 27708
- Institute for Chemical Reaction Design and Discovery (World Premier International Research Center Initiative-ICReDD), Hokkaido University, Sapporo 001-0021, Japan
| |
|
13
|
Villaman C, Pollastri G, Saez M, Martin AJ. Benefiting from the intrinsic role of epigenetics to predict patterns of CTCF binding. Comput Struct Biotechnol J 2023; 21:3024-3031. [PMID: 37266407 PMCID: PMC10229758 DOI: 10.1016/j.csbj.2023.05.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2022] [Revised: 05/11/2023] [Accepted: 05/11/2023] [Indexed: 06/03/2023] Open
Abstract
Motivation: One of the most relevant mechanisms involved in the determination of chromatin structure is the formation of structural loops that are also related with the conservation of chromatin states. Many of these loops are stabilized by CCCTC-binding factor (CTCF) proteins at their base. Despite the relevance of chromatin structure and the key role of CTCF, the role of the epigenetic factors that are involved in the regulation of CTCF binding, and thus, in the formation of structural loops in the chromatin, is not thoroughly understood. Results: Here we describe a CTCF binding predictor based on Random Forest that employs different epigenetic data and genomic features. Importantly, given the ability of Random Forests to determine the relevance of features for the prediction, our approach also shows how the different types of descriptors impact the binding of CTCF, confirming previous knowledge on the relevance of chromatin accessibility and DNA methylation, but demonstrating the effect of epigenetic modifications on the activity of CTCF. We compared our approach against other predictors and found improved performance in terms of areas under PR and ROC curves (PRAUC-ROCAUC), outperforming current state-of-the-art methods.
Affiliation(s)
- Camilo Villaman
- Programa de Doctorado en Genómica Integrativa, Vicerrectoría de Investigación, Universidad Mayor, Santiago, Chile
- Laboratorio de Redes Biológicas, Centro Científico y Tecnológico de Excelencia Ciencia & Vida, Fundación Ciencia & Vida, Escuela de Ingeniería, Facultad de Ingeniería, Arquitectura y Diseño, Universidad San Sebastián, Santiago, Chile
| | | | - Mauricio Saez
- Centro de Oncología de Precisión, Facultad de Medicina y Ciencias de la Salud, Universidad Mayor, Santiago, Chile
- Laboratorio de Investigación en Salud de Precisión, Departamento de Procesos Diagnósticos y Evaluación, Facultad de Ciencias de la Salud, Universidad Católica de Temuco, Chile
| | - Alberto J.M. Martin
- Laboratorio de Redes Biológicas, Centro Científico y Tecnológico de Excelencia Ciencia & Vida, Fundación Ciencia & Vida, Escuela de Ingeniería, Facultad de Ingeniería, Arquitectura y Diseño, Universidad San Sebastián, Santiago, Chile
| |
|
14
|
Stefan K, Barski A. Cis-regulatory atlas of primary human CD4+ T cells. BMC Genomics 2023; 24:253. [PMID: 37170195 PMCID: PMC10173520 DOI: 10.1186/s12864-023-09288-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Accepted: 03/31/2023] [Indexed: 05/13/2023] Open
Abstract
Cis-regulatory elements (CRE) are critical for coordinating gene expression programs that dictate cell-specific differentiation and homeostasis. Recently developed self-transcribing active regulatory region sequencing (STARR-Seq) has allowed for genome-wide annotation of functional CREs. Despite this, STARR-Seq assays are only employed in cell lines, in part, due to difficulties in delivering reporter constructs. Herein, we implemented and validated a STARR-Seq-based screen in human CD4+ T cells using a non-integrating lentiviral transduction system. Lenti-STARR-Seq is the first example of a genome-wide assay of CRE function in human primary cells, identifying thousands of functional enhancers and negative regulatory elements (NREs) in human CD4+ T cells. We find an unexpected difference in nucleosome organization between enhancers and NRE: enhancers are located between nucleosomes, whereas NRE are occupied by nucleosomes in their endogenous locations. We also describe chromatin modification, eRNA production, and transcription factor binding at both enhancers and NREs. Our findings support the idea of silencer repurposing as enhancers in alternate cell types. Collectively, these data suggest that Lenti-STARR-Seq is a successful approach for CRE screening in primary human cell types, and provides an atlas of functional CREs in human CD4+ T cells.
Affiliation(s)
- Kurtis Stefan
- Division of Allergy & Immunology, Cincinnati Children's Hospital Medical Center, 3333 Burnet Avenue, MLC 7028, Cincinnati, OH, 45229-3026, USA
- Medical Scientist Training Program (MSTP), University of Cincinnati College of Medicine, Cincinnati, OH, 45267, USA
| | - Artem Barski
- Division of Allergy & Immunology, Cincinnati Children's Hospital Medical Center, 3333 Burnet Avenue, MLC 7028, Cincinnati, OH, 45229-3026, USA.
- Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, 45229-3026, USA.
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, 45267, USA.
| |
|
15
|
Zhang X, Zhu W, Sun H, Ding Y, Liu L. Prediction of CTCF loop anchor based on machine learning. Front Genet 2023; 14:1181956. [PMID: 37077544 PMCID: PMC10106609 DOI: 10.3389/fgene.2023.1181956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Accepted: 03/24/2023] [Indexed: 04/05/2023] Open
Abstract
Introduction: Various activities in biological cells are affected by three-dimensional genome structure. The insulators play an important role in the organization of higher-order structure. CTCF is a representative of mammalian insulators, which can produce barriers to prevent the continuous extrusion of chromatin loop. As a multifunctional protein, CTCF has tens of thousands of binding sites in the genome, but only a portion of them can be used as anchors of chromatin loops. It is still unclear how cells select the anchor in the process of chromatin looping. Methods: In this paper, a comparative analysis is performed to investigate the sequence preference and binding strength of anchor and non-anchor CTCF binding sites. Furthermore, a machine learning model based on the CTCF binding intensity and DNA sequence is proposed to predict which CTCF sites can form chromatin loop anchors. Results: The accuracy of the machine learning model that we constructed for predicting the anchor of the chromatin loop mediated by CTCF reached 0.8646. And we find that the formation of loop anchor is mainly influenced by the CTCF binding strength and binding pattern (which can be interpreted as the binding of different zinc fingers). Discussion: In conclusion, our results suggest that the CTCF core motif and its flanking sequence may be responsible for the binding specificity. This work contributes to understanding the mechanism of loop anchor selection and provides a reference for the prediction of CTCF-mediated chromatin loops.
Affiliation(s)
- Xiao Zhang
- School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China
| | - Wen Zhu
- School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China
- *Correspondence: Wen Zhu,
| | - Huimin Sun
- School of Physical Science and Technology, Inner Mongolia University, Hohhot, China
| | - Yijie Ding
- Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China
| | - Li Liu
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
| |
|
16
|
Luo R, Yan J, Oh JW, Xi W, Shigaki D, Wong W, Cho H, Murphy D, Cutler R, Rosen BP, Pulecio J, Yang D, Glenn R, Chen T, Li QV, Vierbuchen T, Sidoli S, Apostolou E, Huangfu D, Beer MA. Dynamic network-guided CRISPRi screen reveals CTCF loop-constrained nonlinear enhancer-gene regulatory activity in cell state transitions. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.03.07.531569. [PMID: 36945628 PMCID: PMC10028945 DOI: 10.1101/2023.03.07.531569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/18/2023]
Abstract
Comprehensive enhancer discovery is challenging because most enhancers, especially those affected in complex diseases, have weak effects on gene expression. Our network modeling revealed that nonlinear enhancer-gene regulation during cell state transitions can be leveraged to improve the sensitivity of enhancer discovery. Utilizing hESC definitive endoderm differentiation as a dynamic transition system, we conducted a mid-transition CRISPRi-based enhancer screen. The screen discovered a comprehensive set of enhancers (4 to 9 per locus) for each of the core endoderm lineage-specifying transcription factors, and many enhancers had strong effects mid-transition but weak effects post-transition. Through integrating enhancer activity measurements and three-dimensional enhancer-promoter interaction information, we were able to develop a CTCF loop-constrained Interaction Activity (CIA) model that can better predict functional enhancers compared to models that rely on Hi-C-based enhancer-promoter contact frequency. Our study provides generalizable strategies for sensitive and more comprehensive enhancer discovery in both normal and pathological cell state transitions.
|
17
|
Fan C, Chen K, Wang Y, Ball EV, Stenson PD, Mort M, Bacolla A, Kehrer-Sawatzki H, Tainer JA, Cooper DN, Zhao H. Profiling human pathogenic repeat expansion regions by synergistic and multi-level impacts on molecular connections. Hum Genet 2023; 142:245-274. [PMID: 36344696 PMCID: PMC10290229 DOI: 10.1007/s00439-022-02500-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Accepted: 10/24/2022] [Indexed: 11/09/2022]
Abstract
Whilst DNA repeat expansions cause numerous heritable human disorders, their origins and underlying pathological mechanisms are often unclear. We collated a dataset comprising 224 human repeat expansions encompassing 203 different genes, and performed a systematic analysis with respect to key topological features at the DNA, RNA and protein levels. Comparison with controls without known pathogenicity and genomic regions lacking repeats, allowed the construction of the first tool to discriminate repeat regions harboring pathogenic repeat expansions (DPREx). At the DNA level, pathogenic repeat expansions exhibited stronger signals for DNA regulatory factors (e.g. H3K4me3, transcription factor-binding sites) in exons, promoters, 5'UTRs and 5'genes but were not significantly different from controls in introns, 3'UTRs and 3'genes. Additionally, pathogenic repeat expansions were also found to be enriched in non-B DNA structures. At the RNA level, pathogenic repeat expansions were characterized by lower free energy for forming RNA secondary structure and were closer to splice sites in introns, exons, promoters and 5'genes than controls. At the protein level, pathogenic repeat expansions exhibited a preference to form coil rather than other types of secondary structure, and tended to encode surface-located protein domains. Guided by these features, DPREx ( http://biomed.nscc-gz.cn/zhaolab/geneprediction/# ) achieved an Area Under the Curve (AUC) value of 0.88 in a test on an independent dataset. Pathogenic repeat expansions are thus located such that they exert a synergistic influence on the gene expression pathway involving inter-molecular connections at the DNA, RNA and protein levels.
Affiliation(s)
- Cong Fan
- Department of Medical Research Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, 107 Yan Jiang West Road, Guangzhou, 500001, People's Republic of China
| | - Ken Chen
- School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou, 500001, China
| | - Yukai Wang
- School of Life Science, Sun Yat-Sen University, Guangzhou, 500001, China
| | - Edward V Ball
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Peter D Stenson
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Matthew Mort
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Albino Bacolla
- Department of Molecular and Cellular Oncology, The University of Texas MD Anderson Cancer Center, 6767 Bertner Avenue, Houston, TX, 77030, USA
| | | | - John A Tainer
- Department of Molecular and Cellular Oncology, The University of Texas MD Anderson Cancer Center, 6767 Bertner Avenue, Houston, TX, 77030, USA
| | - David N Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Huiying Zhao
- Department of Medical Research Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, 107 Yan Jiang West Road, Guangzhou, 500001, People's Republic of China.
| |
|
18
|
3D chromatin remodeling potentiates transcriptional programs driving cell invasion. Proc Natl Acad Sci U S A 2022; 119:e2203452119. [PMID: 36037342 PMCID: PMC9457068 DOI: 10.1073/pnas.2203452119] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/18/2022] Open
Abstract
The contribution of deregulated chromatin architecture, including topologically associated domains (TADs), to cancer progression remains ambiguous. CCCTC-binding factor (CTCF) is a central regulator of higher-order chromatin structure that undergoes copy number loss in over half of all breast cancers, but the impact of this defect on epigenetic programming and chromatin architecture remains unclear. We find that under physiological conditions, CTCF organizes subTADs to limit the expression of oncogenic pathways, including phosphatidylinositol 3-kinase (PI3K) and cell adhesion networks. Loss of a single CTCF allele potentiates cell invasion through compromised chromatin insulation and a reorganization of chromatin architecture and histone programming that facilitates de novo promoter-enhancer contacts. However, this change in the higher-order chromatin landscape leads to a vulnerability to inhibitors of mTOR. These data support a model whereby subTAD reorganization drives both modification of histones at de novo enhancer-promoter contacts and transcriptional up-regulation of oncogenic transcriptional networks.
|
19
|
Zhou T, Feng Q. Androgen receptor signaling and spatial chromatin organization in castration-resistant prostate cancer. Front Med (Lausanne) 2022; 9:924087. [PMID: 35966880 PMCID: PMC9372301 DOI: 10.3389/fmed.2022.924087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2022] [Accepted: 07/12/2022] [Indexed: 12/03/2022] Open
Abstract
Prostate cancer is one of the leading causes of cancer death and affects millions of men in the world. The American Cancer Society estimated about 34,500 deaths from prostate cancer in the United States in 2022. Androgen receptor (AR) signaling is a major pathway that sustains local and metastatic prostate tumor growth. Androgen-deprivation therapy (ADT) is the standard of care for metastatic prostate cancer patients and can suppress tumor growth for a median of 2-3 years. Unfortunately, the malignancy inevitably progresses to castration-resistant prostate cancer (CRPC), which is more aggressive and no longer responsive to ADT. Surprisingly, for most CRPC patients, cancer growth still depends on androgen receptor signaling. Accumulating evidence suggests that CRPC cells have rewired their transcriptional program to retain AR signaling in the absence of androgens. Besides AR, other transcription factors also contribute to the resistance mechanism through multiple pathways, including enhancing the AR signaling pathway and activating other complementary signaling pathways in favor of AR downstream gene expression. More recent studies have shown the role of transcription factors in reconfiguring chromatin 3D structure and regulating topologically associating domains (TADs). Pioneer factors, transcription factors and coactivators form liquid-liquid phase separation compartments that can modulate transcriptional events along with configuring TADs. The role of AR and other transcription factors in chromatin structure change and formation of condensate compartments in prostate cancer cells has only recently been investigated and appreciated. This review intends to provide an overview of transcription factors that contribute to AR signaling through activation of gene expression, governing 3D chromatin structure and establishing phase to phase separation. A more detailed understanding of the spatial role of transcription factors in CRPC might provide novel therapeutic targets for the treatment of CRPC.
Affiliation(s)
| | - Qin Feng
- Center for Nuclear Receptors and Cell Signaling, Department of Biology and Biochemistry, University of Houston, Houston, TX, United States
| |
|
20
|
Peng A, Peng W, Wang R, Zhao H, Yu X, Sun Y. Regulation of 3D Organization and Its Role in Cancer Biology. Front Cell Dev Biol 2022; 10:879465. [PMID: 35757006 PMCID: PMC9213882 DOI: 10.3389/fcell.2022.879465] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/19/2022] [Accepted: 04/27/2022] [Indexed: 11/13/2022] Open
Abstract
Three-dimensional (3D) genomics is the frontier field in the post-genomics era; its central subject is the relationship between chromatin spatial conformation and regulation of gene transcription. Cancer biology is a complex system resulting from genetic alterations in key tumor oncogenes and suppressor genes for cell proliferation, DNA replication, cell differentiation, and homeostatic functions. Although scientific research in recent decades has revealed how the genome sequence is mutated in many cancers, high-order chromosomal structures involved in the development and fate of cancer cells represent a crucial but rarely explored aspect of cancer genomics. Hence, dissection of the 3D genome conformation of cancer helps understand the unique epigenetic patterns and gene regulation processes that distinguish cancer biology from normal physiological states. In recent years, research in tumor 3D genomics has grown quickly. With the rapid progress of 3D genomics technology, we can now better determine the relationship between cancer pathogenesis and the chromatin structure of cancer cells. It is becoming increasingly clear that changes in 3D chromatin structure play a vital role in controlling oncogene transcription. This review focuses on the relationships between tumor gene expression regulation, tumor 3D chromatin structure, and cancer phenotypic plasticity. Furthermore, based on the functional consequences of spatial disorganization in the cancer genome, we look forward to the clinical application prospects of 3D genomic biomarkers.
Affiliation(s)
- Anghui Peng
- Zhuhai Interventional Medical Center, Zhuhai Precision Medical Center, Zhuhai People's Hospital, Zhuhai Hospital Affiliated with Jinan University, Jinan University, Zhuhai, China
- Guangdong Provincial Key Laboratory of Tumor Interventional Diagnosis and Treatment, Zhuhai Institute of Translational Medicine, Zhuhai People's Hospital Affiliated with Jinan University, Jinan University, Zhuhai, China
| | - Wang Peng
- Department of Oncology, Liuzhou People's Hospital, Liuzhou, China
| | - Ruiqi Wang
- Department of Pharmacy, Zhuhai People's Hospital, Zhuhai Hospital Affiliated with Jinan University, Jinan University, Zhuhai, China
| | - Hao Zhao
- The First College of Clinical Medical Science, China Three Gorges University, Yichang, China
| | - Xinyang Yu
- Zhuhai Interventional Medical Center, Zhuhai Precision Medical Center, Zhuhai People's Hospital, Zhuhai Hospital Affiliated with Jinan University, Jinan University, Zhuhai, China
- Guangdong Provincial Key Laboratory of Tumor Interventional Diagnosis and Treatment, Zhuhai Institute of Translational Medicine, Zhuhai People's Hospital Affiliated with Jinan University, Jinan University, Zhuhai, China
| | - Yihao Sun
- Zhuhai Interventional Medical Center, Zhuhai Precision Medical Center, Zhuhai People's Hospital, Zhuhai Hospital Affiliated with Jinan University, Jinan University, Zhuhai, China
- Guangdong Provincial Key Laboratory of Tumor Interventional Diagnosis and Treatment, Zhuhai Institute of Translational Medicine, Zhuhai People's Hospital Affiliated with Jinan University, Jinan University, Zhuhai, China
| |
|
21
|
Deng S, Feng Y, Pauklin S. 3D chromatin architecture and transcription regulation in cancer. J Hematol Oncol 2022; 15:49. [PMID: 35509102 PMCID: PMC9069733 DOI: 10.1186/s13045-022-01271-x] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 01/11/2022] [Accepted: 04/21/2022] [Indexed: 12/18/2022] Open
Abstract
Chromatin has distinct three-dimensional (3D) architectures important in key biological processes, such as cell cycle, replication, differentiation, and transcription regulation. In turn, aberrant 3D structures play a vital role in developing abnormalities and diseases such as cancer. This review discusses key 3D chromatin structures (topologically associating domain, lamina-associated domain, and enhancer-promoter interactions) and corresponding structural protein elements mediating 3D chromatin interactions [CCCTC-binding factor, polycomb group protein, cohesin, and Brother of the Regulator of Imprinted Sites (BORIS) protein] with a highlight of their associations with cancer. We also summarise the recent development of technologies and bioinformatics approaches to study the 3D chromatin interactions in gene expression regulation, including crosslinking and proximity ligation methods in the bulk cell population (ChIA-PET and HiChIP) or single-molecule resolution (ChIA-drop), and methods other than proximity ligation, such as GAM, SPRITE, and super-resolution microscopy techniques.
Affiliation(s)
- Siwei Deng
- Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, University of Oxford, Old Road, Headington, Oxford, OX3 7LD, UK
| | - Yuliang Feng
- Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, University of Oxford, Old Road, Headington, Oxford, OX3 7LD, UK
| | - Siim Pauklin
- Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, University of Oxford, Old Road, Headington, Oxford, OX3 7LD, UK.
| |
|
22
|
Nuñez-Olvera SI, Puente-Rivera J, Ramos-Payán R, Pérez-Plasencia C, Salinas-Vera YM, Aguilar-Arnal L, López-Camarillo C. Three-Dimensional Genome Organization in Breast and Gynecological Cancers: How Chromatin Folding Influences Tumorigenic Transcriptional Programs. Cells 2021; 11:75. [PMID: 35011637 PMCID: PMC8750285 DOI: 10.3390/cells11010075] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/02/2021] [Revised: 12/15/2021] [Accepted: 12/24/2021] [Indexed: 12/19/2022] Open
Abstract
A growing body of research on the transcriptome and cancer genome has demonstrated that many gynecological tumor-specific gene mutations are located in cis-regulatory elements. Through chromosomal looping, cis-regulatory elements interact with each other to control gene expression by bringing distant regulatory elements, such as enhancers and insulators, into close proximity with promoters. It is well known that chromatin connections may be disrupted in cancer cells, promoting transcriptional dysregulation and the abnormal expression of tumor suppressor genes and oncogenes. In this review, we examine the roles of alterations in 3D chromatin interactions. These include changes in CTCF protein function, cancer-risk single nucleotide polymorphisms, viral integration, and hormonal response as part of the mechanisms that lead to the acquisition of enhancers or super-enhancers. The translocation of existing enhancers, as well as enhancer loss or acquisition of insulator elements that interact with gene promoters, is also reviewed. Remarkably, similar processes that modify 3D chromatin contacts in gene promoters may also influence the expression of non-coding RNAs, such as long non-coding RNAs (lncRNAs) and microRNAs (miRNAs), which have emerged as key regulators of gene expression in a variety of cancers, including gynecological malignancies.
Affiliation(s)
- Stephanie I. Nuñez-Olvera
- Departamento de Biología Celular y Fisiología, Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico;
| | - Jonathan Puente-Rivera
- Posgrado en Ciencias Genómicas, Universidad Autónoma de la Ciudad de México, Mexico City 03100, Mexico;
| | - Rosalio Ramos-Payán
- Facultad de Ciencias Químico Biológicas, Universidad Autónoma de Sinaloa, Culiacan City 80030, Mexico;
| | | | - Yarely M. Salinas-Vera
- Departamento de Bioquímica, Centro de Investigación y Estudios Avanzados, Mexico City 07360, Mexico;
| | - Lorena Aguilar-Arnal
- Departamento de Biología Celular y Fisiología, Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico;
| | - César López-Camarillo
- Posgrado en Ciencias Genómicas, Universidad Autónoma de la Ciudad de México, Mexico City 03100, Mexico;
| |
|
23
|
Amândio AR, Beccari L, Lopez-Delisle L, Mascrez B, Zakany J, Gitto S, Duboule D. Sequential in cis mutagenesis in vivo reveals various functions for CTCF sites at the mouse HoxD cluster. Genes Dev 2021; 35:1490-1509. [PMID: 34711654 PMCID: PMC8559674 DOI: 10.1101/gad.348934.121] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 08/15/2021] [Accepted: 09/21/2021] [Indexed: 12/12/2022]
Abstract
Mammalian Hox gene clusters contain a range of CTCF binding sites. In addition to their importance in organizing a TAD border, which isolates the most posterior genes from the rest of the cluster, the positions and orientations of these sites suggest that CTCF may be instrumental in the selection of various subsets of contiguous genes, which are targets of distinct remote enhancers located in the flanking regulatory landscapes. We examined this possibility by producing an allelic series of cumulative in cis mutations in these sites, up to the abrogation of CTCF binding in the five sites located on one side of the TAD border. In the most impactful alleles, the global chromatin architecture of the locus was modified, yet not drastically, illustrating that CTCF sites located on one side of a strong TAD border are sufficient to organize at least part of this insulation. Spatial colinearity in the expression of these genes along the major body axis was nevertheless maintained, despite abnormal expression boundaries. In contrast, strong effects were scored in the selection of target genes responding to particular enhancers, leading to the misregulation of Hoxd genes in specific structures. Altogether, while most enhancer-promoter interactions can occur in the absence of this series of CTCF sites, the binding of CTCF in the Hox cluster is required to properly transform a rather imprecise process into a highly discriminative mechanism of interactions, which is translated into various patterns of transcription accompanied by the distinctive chromatin topology found at this locus. Our allelic series also allowed us to reveal the distinct functional contributions of CTCF sites within this Hox cluster, some acting as insulator elements, others being necessary to anchor or stabilize enhancer-promoter interactions, and some doing both, while all of them together contribute to the formation of a TAD border. This variety of tasks may explain the amazing evolutionary conservation in the distribution of these sites among paralogous Hox clusters or between various vertebrates.
Affiliation(s)
- Ana Rita Amândio
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
- Department of Genetics and Evolution, University of Geneva, 1211 Geneva, Switzerland
| | - Leonardo Beccari
- Department of Genetics and Evolution, University of Geneva, 1211 Geneva, Switzerland
| | - Lucille Lopez-Delisle
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
| | - Bénédicte Mascrez
- Department of Genetics and Evolution, University of Geneva, 1211 Geneva, Switzerland
| | - Jozsef Zakany
- Department of Genetics and Evolution, University of Geneva, 1211 Geneva, Switzerland
| | - Sandra Gitto
- Department of Genetics and Evolution, University of Geneva, 1211 Geneva, Switzerland
| | - Denis Duboule
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
- Department of Genetics and Evolution, University of Geneva, 1211 Geneva, Switzerland
- Collège de France, 75231 Paris, France
| |
|
24
|
Lange M, Begolli R, Giakountis A. Non-Coding Variants in Cancer: Mechanistic Insights and Clinical Potential for Personalized Medicine. Noncoding RNA 2021; 7:47. [PMID: 34449663 PMCID: PMC8395730 DOI: 10.3390/ncrna7030047] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 06/28/2021] [Revised: 07/26/2021] [Accepted: 08/01/2021] [Indexed: 12/11/2022] Open
Abstract
The cancer genome is characterized by extensive variability, in the form of Single Nucleotide Polymorphisms (SNPs) or structural variations such as Copy Number Alterations (CNAs) across wider genomic areas. At the molecular level, most SNPs and/or CNAs reside in non-coding sequences, ultimately affecting the regulation of oncogenes and/or tumor suppressors in a cancer-specific manner. Notably, inherited non-coding variants can predispose to cancer decades prior to disease onset. Furthermore, the accumulation of additional non-coding driver mutations during progression of the disease gives rise to genomic instability, acting as the driving force of neoplastic development and malignant evolution. Therefore, detection and characterization of such mutations can improve risk assessment for healthy carriers and expand the diagnostic and therapeutic toolbox for the patient. This review focuses on functional variants that reside in transcribed or non-transcribed non-coding regions of the cancer genome and presents a collection of appropriate state-of-the-art methodologies to study them.
Affiliation(s)
- Marios Lange
- Department of Biochemistry and Biotechnology, University of Thessaly, Biopolis, 41500 Larissa, Greece; (M.L.); (R.B.)
| | - Rodiola Begolli
- Department of Biochemistry and Biotechnology, University of Thessaly, Biopolis, 41500 Larissa, Greece; (M.L.); (R.B.)
| | - Antonis Giakountis
- Department of Biochemistry and Biotechnology, University of Thessaly, Biopolis, 41500 Larissa, Greece; (M.L.); (R.B.)
- Institute for Fundamental Biomedical Research, B.S.R.C “Alexander Fleming”, 34 Fleming Str., 16672 Vari, Greece
| |
|