1
Tang L, Liao J, Hill MC, Hu J, Zhao Y, Ellinor P, Li M. MMCT-Loop: a mix model-based pipeline for calling targeted 3D chromatin loops. Nucleic Acids Res 2024; 52:e25. [PMID: 38281134 PMCID: PMC10954456 DOI: 10.1093/nar/gkae029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2023] [Revised: 12/03/2023] [Accepted: 01/12/2024] [Indexed: 01/30/2024] Open
Abstract
Protein-specific Chromatin Conformation Capture (3C)-based technologies have become essential for identifying distal genomic interactions with critical roles in gene regulation. The standard techniques include Chromatin Interaction Analysis by Paired-End Tag (ChIA-PET) and in situ Hi-C followed by chromatin immunoprecipitation (HiChIP), also known as PLAC-seq. To identify chromatin interactions from these data, a variety of computational methods have emerged. Although these state-of-the-art methods address many issues with loop calling, only a few can fit different data types simultaneously, and the accuracy and efficiency of these approaches remain limited. Here we have generated a pipeline, MMCT-Loop, which ensures the accurate identification of strong loops as well as dynamic or weak loops through a mixed model. MMCT-Loop outperforms existing methods in accuracy, and the detected loops show higher activation functionality. To highlight the utility of MMCT-Loop, we applied it to conformational data derived from neural stem cells (NSCs) and uncovered several previously unidentified regulatory regions for key master regulators of stem cell identity. MMCT-Loop is an accurate and efficient loop caller for targeted conformation capture data that supports raw data or pre-processed valid pairs as input; the output interactions are formatted for easy upload to a genome browser for visualization.
Affiliation(s)
- Li Tang
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Jiaqi Liao
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Matthew C Hill
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA 02129, USA
  - Cardiovascular Disease Initiative, The Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
- Jiaxin Hu
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Yichao Zhao
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Patrick T Ellinor
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA 02129, USA
  - Cardiovascular Disease Initiative, The Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
- Min Li
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
2
Murphy D, Salataj E, Di Giammartino DC, Rodriguez-Hernaez J, Kloetgen A, Garg V, Char E, Uyehara CM, Ee LS, Lee U, Stadtfeld M, Hadjantonakis AK, Tsirigos A, Polyzos A, Apostolou E. 3D Enhancer-promoter networks provide predictive features for gene expression and coregulation in early embryonic lineages. Nat Struct Mol Biol 2024; 31:125-140. [PMID: 38053013 PMCID: PMC10897904 DOI: 10.1038/s41594-023-01130-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Accepted: 09/18/2023] [Indexed: 12/07/2023]
Abstract
Mammalian embryogenesis commences with two pivotal and binary cell fate decisions that give rise to three essential lineages: the trophectoderm, the epiblast and the primitive endoderm. Although key signaling pathways and transcription factors that control these early embryonic decisions have been identified, the non-coding regulatory elements through which transcriptional regulators enact these fates remain understudied. Here, we characterize, at a genome-wide scale, enhancer activity and 3D connectivity in embryo-derived stem cell lines that represent each of the early developmental fates. We observe extensive enhancer remodeling and fine-scale 3D chromatin rewiring among the three lineages, which strongly associate with transcriptional changes, although distinct groups of genes are irresponsive to topological changes. In each lineage, a high degree of connectivity, or 'hubness', positively correlates with levels of gene expression and enriches for cell-type specific and essential genes. Genes within 3D hubs also show a significantly stronger probability of coregulation across lineages compared to genes in linear proximity or within the same contact domains. By incorporating 3D chromatin features, we build a predictive model for transcriptional regulation (3D-HiChAT) that outperforms models using only 1D promoter or proximal variables to predict levels and cell-type specificity of gene expression. Using 3D-HiChAT, we identify, in silico, candidate functional enhancers and hubs in each cell lineage, and with CRISPRi experiments, we validate several enhancers that control gene expression in their respective lineages. Our study identifies 3D regulatory hubs associated with the earliest mammalian lineages and describes their relationship to gene expression and cell identity, providing a framework to comprehensively understand lineage-specific transcriptional behaviors.
Affiliation(s)
- Dylan Murphy
  - Sanford I. Weill Department of Medicine, Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
  - Physiology, Biophysics and Systems Biology Program, Weill Cornell Graduate School of Medical Sciences, Cornell University, New York, NY, USA
- Eralda Salataj
  - Sanford I. Weill Department of Medicine, Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
- Dafne Campigli Di Giammartino
  - Sanford I. Weill Department of Medicine, Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
  - 3D Chromatin Conformation and RNA Genomics Laboratory, Center for Human Technologies (CHT), Istituto Italiano di Tecnologia (IIT), Genova, Italy
- Javier Rodriguez-Hernaez
  - Department of Pathology, New York University Langone Health, New York, NY, USA
  - Department of Medicine, New York University Langone Health, New York, NY, USA
  - Applied Bioinformatics Laboratory, New York University Langone Health, New York, NY, USA
- Andreas Kloetgen
  - Department of Pathology, New York University Langone Health, New York, NY, USA
  - Department of Medicine, New York University Langone Health, New York, NY, USA
  - Applied Bioinformatics Laboratory, New York University Langone Health, New York, NY, USA
- Vidur Garg
  - Developmental Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Biochemistry Cell and Molecular Biology Program, Weill Cornell Graduate School of Medical Sciences, Cornell University, New York, NY, USA
- Erin Char
  - Tri-Institutional Training Program in Computational Biology and Medicine, Weill Cornell Medical College, New York, NY, USA
- Christopher M Uyehara
  - Sanford I. Weill Department of Medicine, Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
- Ly-Sha Ee
  - Sanford I. Weill Department of Medicine, Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
- UkJin Lee
  - Sanford I. Weill Department of Medicine, Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
  - Biochemistry Cell and Molecular Biology Program, Weill Cornell Graduate School of Medical Sciences, Cornell University, New York, NY, USA
- Matthias Stadtfeld
  - Sanford I. Weill Department of Medicine, Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
- Anna-Katerina Hadjantonakis
  - Developmental Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Aristotelis Tsirigos
  - Department of Pathology, New York University Langone Health, New York, NY, USA
  - Department of Medicine, New York University Langone Health, New York, NY, USA
  - Applied Bioinformatics Laboratory, New York University Langone Health, New York, NY, USA
- Alexander Polyzos
  - Sanford I. Weill Department of Medicine, Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
- Effie Apostolou
  - Sanford I. Weill Department of Medicine, Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
3
Murphy D, Salataj E, Di Giammartino DC, Rodriguez-Hernaez J, Kloetgen A, Garg V, Char E, Uyehara CM, Ee LS, Lee U, Stadtfeld M, Hadjantonakis AK, Tsirigos A, Polyzos A, Apostolou E. Systematic mapping and modeling of 3D enhancer-promoter interactions in early mouse embryonic lineages reveal regulatory principles that determine the levels and cell-type specificity of gene expression. bioRxiv 2023:2023.07.19.549714. [PMID: 37577543 PMCID: PMC10422694 DOI: 10.1101/2023.07.19.549714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023]
Abstract
Mammalian embryogenesis commences with two pivotal and binary cell fate decisions that give rise to three essential lineages, the trophectoderm (TE), the epiblast (EPI) and the primitive endoderm (PrE). Although key signaling pathways and transcription factors that control these early embryonic decisions have been identified, the non-coding regulatory elements via which transcriptional regulators enact these fates remain understudied. To address this gap, we have characterized, at a genome-wide scale, enhancer activity and 3D connectivity in embryo-derived stem cell lines that represent each of the early developmental fates. We observed extensive enhancer remodeling and fine-scale 3D chromatin rewiring among the three lineages, which strongly associate with transcriptional changes, although there are distinct groups of genes that are irresponsive to topological changes. In each lineage, a high degree of connectivity or "hubness" positively correlates with levels of gene expression and enriches for cell-type specific and essential genes. Genes within 3D hubs also show a significantly stronger probability of coregulation across lineages, compared to genes in linear proximity or within the same contact domains. By incorporating 3D chromatin features, we build a novel predictive model for transcriptional regulation (3D-HiChAT), which outperformed models that use only 1D promoter or proximal variables in predicting levels and cell-type specificity of gene expression. Using 3D-HiChAT, we performed genome-wide in silico perturbations to nominate candidate functional enhancers and hubs in each cell lineage, and with CRISPRi experiments we validated several novel enhancers that control expression of one or more genes in their respective lineages. 
Our study comprehensively identifies 3D regulatory hubs associated with the earliest mammalian lineages and describes their relationship to gene expression and cell identity, providing a framework to understand lineage-specific transcriptional behaviors.
Affiliation(s)
- Dylan Murphy
  - Sanford I. Weill Department of Medicine, Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, New York, NY, United States
- Eralda Salataj
  - Sanford I. Weill Department of Medicine, Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, New York, NY, United States
- Dafne Campigli Di Giammartino
  - Sanford I. Weill Department of Medicine, Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, New York, NY, United States
  - 3D Chromatin Conformation and RNA Genomics Laboratory, Istituto Italiano di Tecnologia (IIT), Center for Human Technologies (CHT), Genova, Italy (current affiliation)
- Javier Rodriguez-Hernaez
  - Department of Pathology, New York University Langone Health, New York, NY 10016, USA
  - Applied Bioinformatics Laboratory, New York University Langone Health, New York, NY 10016, USA
- Andreas Kloetgen
  - Department of Pathology, New York University Langone Health, New York, NY 10016, USA
  - Applied Bioinformatics Laboratory, New York University Langone Health, New York, NY 10016, USA
- Vidur Garg
  - Developmental Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
  - Biochemistry Cell and Molecular Biology Program, Weill Cornell Graduate School of Medical Sciences, Cornell University, New York, NY 10065, USA
- Erin Char
  - Tri-Institutional Training Program in Computational Biology and Medicine, Weill Cornell Medical College, New York, 10065, New York, USA
- Christopher M. Uyehara
  - Sanford I. Weill Department of Medicine, Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, New York, NY, United States
- Ly-sha Ee
  - Sanford I. Weill Department of Medicine, Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, New York, NY, United States
- UkJin Lee
  - Sanford I. Weill Department of Medicine, Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, New York, NY, United States
- Matthias Stadtfeld
  - Sanford I. Weill Department of Medicine, Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, New York, NY, United States
- Anna-Katerina Hadjantonakis
  - Developmental Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
  - Biochemistry Cell and Molecular Biology Program, Weill Cornell Graduate School of Medical Sciences, Cornell University, New York, NY 10065, USA
- Aristotelis Tsirigos
  - Department of Pathology, New York University Langone Health, New York, NY 10016, USA
  - Applied Bioinformatics Laboratory, New York University Langone Health, New York, NY 10016, USA
- Alexander Polyzos
  - Sanford I. Weill Department of Medicine, Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, New York, NY, United States
- Effie Apostolou
  - Sanford I. Weill Department of Medicine, Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, New York, NY, United States
4
Zhong W, Liu W, Chen J, Sun Q, Hu M, Li Y. Understanding the function of regulatory DNA interactions in the interpretation of non-coding GWAS variants. Front Cell Dev Biol 2022; 10:957292. [PMID: 36060805 PMCID: PMC9437546 DOI: 10.3389/fcell.2022.957292] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/30/2022] [Accepted: 07/21/2022] [Indexed: 01/11/2023] Open
Abstract
Genome-wide association studies (GWAS) have identified a vast number of variants associated with various complex human diseases and traits. However, most of these GWAS variants reside in non-coding regions producing no proteins, making the interpretation of these variants a daunting challenge. Prior evidence indicates that a subset of non-coding variants detected within or near cis-regulatory elements (e.g., promoters, enhancers, silencers, and insulators) might play a key role in disease etiology by regulating gene expression. Advanced sequencing- and imaging-based technologies, together with powerful computational methods, enabling comprehensive characterization of regulatory DNA interactions, have substantially improved our understanding of the three-dimensional (3D) genome architecture. Recent literature witnesses plenty of examples where using chromosome conformation capture (3C)-based technologies successfully links non-coding variants to their target genes and prioritizes relevant tissues or cell types. These examples illustrate the critical capability of 3D genome organization in annotating non-coding GWAS variants. This review discusses how 3D genome organization information contributes to elucidating the potential roles of non-coding GWAS variants in disease etiology.
Affiliation(s)
- Wujuan Zhong
  - Biostatistics and Research Decision Sciences, Merck & Co, Inc, Rahway, NJ, United States
- Weifang Liu
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Jiawen Chen
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Quan Sun
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Ming Hu
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH, United States
- Yun Li
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
  - Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
5
Tang L, Zhong Z, Lin Y, Yang Y, Wang J, Martin J, Li M. EPIXplorer: A web server for prediction, analysis and visualization of enhancer-promoter interactions. Nucleic Acids Res 2022; 50:W290-W297. [PMID: 35639508 PMCID: PMC9252822 DOI: 10.1093/nar/gkac397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2022] [Revised: 05/01/2022] [Accepted: 05/05/2022] [Indexed: 11/13/2022] Open
Abstract
Long-distance enhancers can physically interact with promoters to regulate gene expression through formation of enhancer-promoter (E-P) interactions. Identification of E-P interactions is also important for a profound understanding of normal development and disease-associated risk variants. Although state-of-the-art predictive computational methods facilitate the identification of E-P interactions to a certain extent, there is currently no efficient method that can meet the various requirements of usage. Here we developed EPIXplorer, a user-friendly web server for efficient prediction, analysis and visualization of E-P interactions. EPIXplorer integrates 9 robust predictive algorithms and supports multiple types of 3D contact data and multi-omics data as input. The output from EPIXplorer is scored and fully annotated by regulatory elements and risk single-nucleotide polymorphisms (SNPs). In addition, the Visualization and Downstream modules provide further functional analysis, and all the output files and high-quality images are available for download. Together, EPIXplorer provides a user-friendly interface to predict E-P interactions in an acceptable time, as well as to understand how genome-wide association study (GWAS) variants influence disease pathology by altering DNA looping between enhancers and their target gene promoters. EPIXplorer is available at https://www.csuligroup.com/EPIXplorer.
Affiliation(s)
- Li Tang
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Zhizhou Zhong
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Yisheng Lin
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Yifei Yang
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Jun Wang
  - Department of Pediatrics, McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- James F Martin
  - Department of Molecular Physiology and Biophysics, Baylor College of Medicine, Houston, TX 77030, USA
  - Cardiovascular Research Institute, Baylor College of Medicine, Houston, TX 77030, USA
  - Texas Heart Institute, Houston, TX 77030, USA
- Min Li
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China