1
Wei J, Resztak JA, Ranjbaran A, Alazizi A, Mair-Meijers HE, Slatcher RB, Zilioli S, Wen X, Luca F, Pique-Regi R. Functional characterization of eQTLs and asthma risk loci with scATAC-seq across immune cell types and contexts. Am J Hum Genet 2025; 112:301-317. [PMID: 39814021 DOI: 10.1016/j.ajhg.2024.12.017] [Received: 02/27/2024] [Revised: 12/13/2024] [Accepted: 12/17/2024] [Indexed: 01/18/2025]
Abstract
cis-regulatory elements (CREs) control gene transcription dynamics across cell types and in response to the environment. In asthma, multiple immune cell types play an important role in the inflammatory process. Genetic variants in CREs can also affect gene expression response dynamics and contribute to asthma risk. However, the regulatory mechanisms underlying control of transcriptional dynamics across different environmental contexts and cell types at single-cell resolution remain to be elucidated. To resolve this question, we performed single-cell ATAC-seq (scATAC-seq) in peripheral blood mononuclear cells (PBMCs) from 16 children with asthma. PBMCs were activated with phytohemagglutinin (PHA) or lipopolysaccharide (LPS) and treated with dexamethasone (DEX), an anti-inflammatory glucocorticoid. We analyzed changes in chromatin accessibility, measured transcription factor motif activity, and identified treatment- and cell-type-specific transcription factors that drive changes in both gene expression mean and variability. We observed a strong positive linear dependence between motif response and their target gene expression changes but a negative relationship with changes in target gene expression variability. This result suggests that an increase of transcription factor binding tightens the variability of gene expression around the mean. We then annotated genetic variants in chromatin accessibility peaks and response motifs, followed by computational fine-mapping of expression quantitative trait loci (eQTL) from a pediatric asthma cohort. We found that eQTLs were 5-fold enriched in peaks with response motifs and refined the credible set for 410 asthma risk genes, with 191 having the causal variant in response motifs. In conclusion, scATAC-seq enhances the understanding of molecular mechanisms for asthma risk variants mediated by gene expression.
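The "5-fold enriched" figure above is a ratio of rates. The sketch below shows only that arithmetic: it compares the fraction of variants that are eQTLs inside an annotation with the fraction outside it. The function name and all counts are hypothetical, and this is not the enrichment model used in the paper, which the abstract does not specify.

```python
def fold_enrichment(hits_in, total_in, hits_out, total_out):
    """Ratio of the eQTL rate inside an annotation to the rate outside it.

    hits_in / total_in   : eQTL variants / all variants inside the annotation
    hits_out / total_out : eQTL variants / all variants outside the annotation
    """
    if total_in <= 0 or total_out <= 0:
        raise ValueError("totals must be positive")
    if hits_out == 0:
        raise ValueError("background rate is zero; fold enrichment is undefined")
    # (hits_in / total_in) / (hits_out / total_out), rearranged to a single division
    return (hits_in * total_out) / (total_in * hits_out)


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = fold_enrichment(hits_in=50, total_in=1000, hits_out=100, total_out=10000)
print(example)
```

With these made-up counts the rate inside the annotation is 5% and the rate outside is 1%, so the ratio is 5.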
Affiliation(s)
- Julong Wei
  - Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, USA
- Justyna A Resztak
  - Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, USA
- Ali Ranjbaran
  - Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, USA
- Adnan Alazizi
  - Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, USA
- Samuele Zilioli
  - Department of Psychology, Wayne State University, Detroit, MI, USA; Department of Family Medicine and Public Health Sciences, Wayne State University, Detroit, MI, USA
- Xiaoquan Wen
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
- Francesca Luca
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
- Roger Pique-Regi
  - Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, USA; Department of Obstetrics and Gynecology, Wayne State University, Detroit, MI 48201, USA
2
Kim SH, Marinov GK, Greenleaf WJ. KAS-ATAC reveals the genome-wide single-stranded accessible chromatin landscape of the human genome. Genome Res 2025; 35:124-134. [PMID: 39572230 PMCID: PMC11789636 DOI: 10.1101/gr.279621.124] [Received: 05/24/2024] [Accepted: 11/19/2024] [Indexed: 01/24/2025]
Abstract
Gene regulation in most eukaryotes involves two fundamental processes: alterations in genome packaging by nucleosomes, with active cis-regulatory elements (CREs) generally characterized by open-chromatin configuration, and transcriptional activation. Mapping these physical properties and biochemical activities, through profiling chromatin accessibility and active transcription, is a key tool for understanding the logic and mechanisms of transcription and its regulation. However, the relationship between these two states has not been accessible to simultaneous measurement. To this end, we developed KAS-ATAC, a combination of the kethoxal-assisted ssDNA sequencing (KAS-seq) and assay for transposase-accessible chromatin using sequencing (ATAC-seq) methods for mapping single-stranded DNA (and thus active transcription) and chromatin accessibility, respectively, enabling the genome-wide identification of DNA fragments that are simultaneously accessible and contain ssDNA. We use KAS-ATAC to evaluate levels of active transcription over different CRE classes, to estimate absolute levels of transcribed accessible DNA over CREs, to map nucleosomal configurations associated with RNA polymerase activities, and to assess transcription factor association with transcribed DNA through transcription factor binding site (TFBS) footprinting. We observe lower levels of transcription over distal enhancers compared with promoters and distinct nucleosomal configurations around transcription initiation sites associated with active transcription. We find that most TFs associate equally with transcribed and nontranscribed DNA, but a few factors specifically do not exhibit footprints over ssDNA-containing fragments. We anticipate KAS-ATAC to continue to derive useful insights into chromatin organization and transcriptional regulation in other contexts in the future.
Affiliation(s)
- Samuel H Kim
  - Cancer Biology Programs, School of Medicine, Stanford University, Stanford, California 94305, USA
- Georgi K Marinov
  - Department of Genetics, School of Medicine, Stanford University, Stanford, California 94305, USA
- William J Greenleaf
  - Department of Genetics, School of Medicine, Stanford University, Stanford, California 94305, USA
  - Department of Applied Physics, Stanford University, Stanford, California 94305, USA
  - Center for Personal Dynamic Regulomes, Stanford University, Stanford, California 94305, USA
  - Chan Zuckerberg Biohub, San Francisco, California 94158, USA
3
Wu Y, Xie X, Zhu J, Guan L, Li M. Overview and Prospects of DNA Sequence Visualization. Int J Mol Sci 2025; 26:477. [PMID: 39859192 PMCID: PMC11764684 DOI: 10.3390/ijms26020477] [Received: 11/19/2024] [Revised: 12/30/2024] [Accepted: 01/04/2025] [Indexed: 01/27/2025]
Abstract
Due to advances in big data technology, deep learning, and knowledge engineering, biological sequence visualization has been extensively explored. In the post-genome era, biological sequence visualization enables the visual representation of both structured and unstructured biological sequence data. However, a universal visualization method for all types of sequences has not been reported. Biological sequence data are expanding exponentially, and the acquisition, extraction, fusion, and inference of knowledge from biological sequences are critical supporting technologies for visualization research. These areas are important and require in-depth exploration. This paper presents a comprehensive overview of visualization methods for DNA sequences from four perspectives (two-dimensional, three-dimensional, four-dimensional, and dynamic visualization approaches) and discusses the strengths and limitations of each method in detail. Furthermore, this paper proposes two potential future research directions for biological sequence visualization in response to the challenges of inefficient graphical feature extraction and knowledge association network generation in existing methods. The first direction is the construction of knowledge graphs for biological sequence big data, and the second direction is the cross-modal visualization of biological sequences using machine learning methods. This review is anticipated to provide valuable insights and contributions to computational biology, bioinformatics, genomic computing, genetic breeding, evolutionary analysis, and other related disciplines in the fields of biology, medicine, chemistry, statistics, and computing. It has important reference value for biological sequence recommendation systems and knowledge question answering systems.
Affiliation(s)
- Mengshan Li
  - School of Mathematics and Computer Science, Gannan Normal University, Ganzhou 341000, China; (Y.W.); (X.X.); (J.Z.); (L.G.)
4
Pampari A, Shcherbina A, Kvon EZ, Kosicki M, Nair S, Kundu S, Kathiria AS, Risca VI, Kuningas K, Alasoo K, Greenleaf WJ, Pennacchio LA, Kundaje A. ChromBPNet: bias factorized, base-resolution deep learning models of chromatin accessibility reveal cis-regulatory sequence syntax, transcription factor footprints and regulatory variants. bioRxiv 2025:2024.12.25.630221. [PMID: 39829783 PMCID: PMC11741299 DOI: 10.1101/2024.12.25.630221] [Indexed: 01/22/2025]
Abstract
Despite extensive mapping of cis-regulatory elements (cREs) across cellular contexts with chromatin accessibility assays, the sequence syntax and genetic variants that regulate transcription factor (TF) binding and chromatin accessibility at context-specific cREs remain elusive. We introduce ChromBPNet, a deep learning DNA sequence model of base-resolution accessibility profiles that detects, learns and deconvolves assay-specific enzyme biases from regulatory sequence determinants of accessibility, enabling robust discovery of compact TF motif lexicons, cooperative motif syntax and precision footprints across assays and sequencing depths. Extensive benchmarks show that ChromBPNet, despite its lightweight design, is competitive with much larger contemporary models at predicting variant effects on chromatin accessibility, pioneer TF binding and reporter activity across assays, cell contexts and ancestry, while providing interpretation of disrupted regulatory syntax. ChromBPNet also helps prioritize and interpret regulatory variants that influence complex traits and rare diseases, thereby providing a powerful lens to decode regulatory DNA and genetic variation.
Affiliation(s)
- Anusri Pampari
  - Department of Computer Science, Stanford University, Stanford CA, 94305
- Anna Shcherbina
  - Department of Biomedical Data Sciences, Stanford University, Stanford CA, 94305
- Evgeny Z. Kvon
  - Department of Developmental and Cell Biology, University of California, Irvine, CA 92697, USA
- Michael Kosicki
  - Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Surag Nair
  - Department of Computer Science, Stanford University, Stanford CA, 94305
- Soumya Kundu
  - Department of Computer Science, Stanford University, Stanford CA, 94305
- Kaur Alasoo
  - Institute of Computer Science, University of Tartu, Tartu, Estonia
- William James Greenleaf
  - Department of Genetics, Stanford University, Stanford CA, 94305
  - Department of Applied Physics, Stanford University, Stanford, California 94305, USA
- Len A. Pennacchio
  - Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Anshul Kundaje
  - Department of Computer Science, Stanford University, Stanford CA, 94305
  - Department of Genetics, Stanford University, Stanford CA, 94305
5
Wanniarachchi DV, Viswakula S, Wickramasuriya AM. The evaluation of transcription factor binding site prediction tools in human and Arabidopsis genomes. BMC Bioinformatics 2024; 25:371. [PMID: 39623329 PMCID: PMC11613939 DOI: 10.1186/s12859-024-05995-0] [Received: 06/23/2024] [Accepted: 11/21/2024] [Indexed: 12/06/2024]
Abstract
BACKGROUND The precise prediction of transcription factor binding sites (TFBSs) is pivotal for unraveling the gene regulatory networks underlying biological processes. While numerous tools have emerged for in silico TFBS prediction in recent years, the evolving landscape of computational biology necessitates thorough assessments of tool performance to ensure accuracy and reliability. Only a limited number of studies have been conducted to evaluate the performance of TFBS prediction tools comprehensively. Thus, the present study focused on assessing twelve widely used TFBS prediction tools and four de novo motif discovery tools using a benchmark dataset comprising real, generic, Markov, and negative sequences. TFBSs of Arabidopsis thaliana and Homo sapiens genomes downloaded from the JASPAR database were implanted in these sequences and the performance of tools was evaluated using several statistical parameters at different overlap percentages between the lengths of known and predicted binding sites.
RESULTS Overall, the Multiple Cluster Alignment and Search Tool (MCAST) emerged as the best TFBS prediction tool, followed by Find Individual Motif Occurrences (FIMO) and MOtif Occurrence Detection Suite (MOODS). In addition, MotEvo and Dinucleotide Weight Tensor Toolbox (DWT-toolbox) demonstrated the highest sensitivity in identifying TFBSs at 90% and 80% overlap. Further, MCAST and DWT-toolbox demonstrated the highest sensitivity across all three data types: real, generic, and Markov. Among the de novo motif discovery tools, the Multiple Em for Motif Elicitation (MEME) emerged as the best performer. An analysis of the promoter regions of genes involved in the anthocyanin biosynthesis pathway in plants and the pentose phosphate pathway in humans, using the three best-performing tools, revealed considerable variation among the top 20 motifs identified by these tools.
CONCLUSION The findings of this study lay a robust groundwork for selecting optimal TFBS prediction tools for future research. Given the variability observed in tool performance, employing multiple tools for identifying TFBSs in a set of sequences is highly recommended. In addition, further studies are recommended to develop an integrated toolbox that incorporates TFBS prediction or motif discovery tools, aiming to streamline result precision and accuracy.
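The evaluation described above scores predictions by how much they overlap known binding sites. A minimal sketch of that idea follows, assuming sites are half-open `(start, end)` intervals on one sequence and that "overlap" means the fraction of the known site covered by a prediction. Both conventions, the function names, and the coordinates are assumptions made for illustration, not the study's exact definitions.

```python
def overlap_fraction(known, predicted):
    """Fraction of the known site covered by the predicted site.

    Both sites are half-open (start, end) intervals on the same sequence.
    """
    k_start, k_end = known
    p_start, p_end = predicted
    shared = min(k_end, p_end) - max(k_start, p_start)
    return max(0, shared) / (k_end - k_start)


def sensitivity(known_sites, predicted_sites, min_overlap):
    """Share of known sites recovered by at least one prediction
    whose overlap fraction reaches min_overlap."""
    if not known_sites:
        raise ValueError("need at least one known site")
    found = sum(
        any(overlap_fraction(k, p) >= min_overlap for p in predicted_sites)
        for k in known_sites
    )
    return found / len(known_sites)


# Hypothetical coordinates: one exact hit, one shifted by 2 bp, one miss.
known = [(10, 20), (50, 60), (100, 110)]
predicted = [(10, 20), (52, 62), (200, 210)]

print(sensitivity(known, predicted, 0.9))  # only the exact hit counts
print(sensitivity(known, predicted, 0.8))  # the shifted hit now counts too
```

Raising the overlap threshold can only lower sensitivity, which is why a tool's ranking may differ between the 80% and 90% settings.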
Affiliation(s)
- Dinithi V Wanniarachchi
  - Department of Plant Sciences, Faculty of Science, University of Colombo, Colombo 03, Sri Lanka
- Sameera Viswakula
  - Department of Statistics, Faculty of Science, University of Colombo, Colombo 03, Sri Lanka
6
Morgan D, DeMeo DL, Glass K. Using methylation data to improve transcription factor binding prediction. Epigenetics 2024; 19:2309826. [PMID: 38300850 PMCID: PMC10841018 DOI: 10.1080/15592294.2024.2309826] [Received: 08/07/2023] [Accepted: 01/01/2024] [Indexed: 02/03/2024]
Abstract
Modelling the regulatory mechanisms that determine cell fate, response to external perturbation, and disease state depends on measuring many factors, a task made more difficult by the plasticity of the epigenome. Scanning the genome for the sequence patterns defined by Position Weight Matrices (PWM) can be used to estimate transcription factor (TF) binding locations. However, this approach does not incorporate information regarding the epigenetic context necessary for TF binding. CpG methylation is an epigenetic mark influenced by environmental factors that is commonly assayed in human cohort studies. We developed a framework to score inferred TF binding locations using methylation data. We intersected motif locations identified using PWMs with methylation information captured in both whole-genome bisulfite sequencing and Illumina EPIC array data for six cell lines, scored motif locations based on these data, and compared with experimental data characterizing TF binding (ChIP-seq). We found that for most TFs, binding prediction improves using methylation-based scoring compared to standard PWM-scores. We also illustrate that our approach can be generalized to infer TF binding when methylation information is only proximally available, i.e. measured for nearby CpGs that do not directly overlap with a motif location. Overall, our approach provides a framework for inferring context-specific TF binding using methylation data. Importantly, the availability of DNA methylation data in existing patient populations provides an opportunity to use our approach to understand the impact of methylation on gene regulatory processes in the context of human disease.
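The baseline that the abstract starts from, scanning a sequence with a Position Weight Matrix, can be sketched as follows. This illustrates plain log-odds PWM scoring only. It does not implement the authors' methylation-based scoring, whose formula the abstract does not give. The motif counts, pseudocount, uniform background, and function names are all hypothetical.

```python
import math


def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores
    against a uniform background (an assumption of this sketch)."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence; returns (start, score) pairs.
    Windows containing characters other than A, C, G, T are skipped."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue
        score = sum(pwm[i][base] for i, base in enumerate(window))
        hits.append((start, score))
    return hits


# Hypothetical 3-bp motif that favours "TGA".
counts = [{"T": 8, "A": 1, "C": 1}, {"G": 9, "A": 1}, {"A": 10}]
pwm = log_odds_pwm(counts)
hits = scan("CCTGACC", pwm)
best_start, best_score = max(hits, key=lambda hit: hit[1])
print(best_start, round(best_score, 2))
```

A context-aware approach of the kind described above would take these candidate locations and rescore them with additional data, here methylation measurements at or near each site.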
Affiliation(s)
- Daniel Morgan
  - Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Dawn L. DeMeo
  - Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Kimberly Glass
  - Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
  - Department of Biostatistics, Harvard Chan School of Public Health, Boston, MA, USA
7
Jo Y, Greene TT, Zhang K, Chiale C, Fang Z, Dallari S, Marooki N, Wang W, Zuniga EI. Genomic Analysis of Progenitors in Viral Infection Implicates Glucocorticoids as Suppressors of Plasmacytoid Dendritic Cell Generation. bioRxiv 2024:2024.10.28.620771. [PMID: 39554106 PMCID: PMC11565824 DOI: 10.1101/2024.10.28.620771] [Indexed: 11/19/2024]
Abstract
Plasmacytoid dendritic cells (pDCs) are the most potent producers of interferons, which are critical antiviral cytokines. pDC development is, however, compromised following a viral infection, and this phenomenon, as well as its relationship to conventional (c)DC development, is still incompletely understood. By using lymphocytic choriomeningitis virus (LCMV) infection in mice as a model system, we observed that DC progenitors skewed away from pDC and towards cDC development during in vivo viral infection. Subsequent characterization of the transcriptional and epigenetic landscape of fms-like tyrosine kinase 3+ (Flt3+) DC progenitors and follow-up studies revealed increased apoptosis and reduced proliferation in different individual DC progenitors as well as a profound IFN-I-dependent ablation of pre-pDCs, but not pre-DC precursor, after both acute and chronic LCMV infections. In addition, integrated genomic analysis identified altered activity of 34 transcription factors in Flt3+ DC progenitors from infected mice, including two regulators of glucocorticoid (GC) responses. Subsequent studies demonstrated that addition of GCs to DC progenitors led to downregulated pDC-primed genes while upregulating cDC-primed genes, and that endogenous GCs selectively decreased pDC, but not cDC, numbers upon in vivo LCMV infection. These findings demonstrate a significant ablation of pre-pDCs in infected mice and identify GCs as suppressors of pDC generation from early progenitors. This provides an explanation for the impaired pDC development following viral infection and links pDC generation to the hypothalamic-pituitary-adrenal axis.
Significance Statement
Plasmacytoid dendritic cells (pDCs) play critical roles in antiviral responses. However, adaptations of DC progenitors lead to compromised pDC generation after viral infection. Here, we characterized the transcriptional and epigenetic landscapes of DC progenitors after infection. We observed widespread changes in gene expression and chromatin accessibility, reflecting shifts in proliferation, apoptosis, and differentiation potential into various DC subsets. Notably, we identified alterations in the predicted activity of 34 transcription factors, including two regulators of glucocorticoid responses. Our data demonstrate that glucocorticoids inhibit pDC generation by reprogramming DC progenitors. These findings establish a molecular framework for understanding how DC progenitors adapt to infection and highlight the role of glucocorticoid signaling in this process.
8
Chinnam NB, Thapar R, Arvai AS, Sarker AH, Soll JM, Paul T, Syed A, Rosenberg DJ, Hammel M, Bacolla A, Katsonis P, Asthana A, Tsai MS, Ivanov I, Lichtarge O, Silverman RH, Mosammaparast N, Tsutakawa SE, Tainer JA. ASCC1 structures and bioinformatics reveal a novel helix-clasp-helix RNA-binding motif linked to a two-histidine phosphodiesterase. J Biol Chem 2024; 300:107368. [PMID: 38750793 PMCID: PMC11214414 DOI: 10.1016/j.jbc.2024.107368] [Received: 01/22/2024] [Revised: 05/07/2024] [Accepted: 05/09/2024] [Indexed: 06/06/2024]
Abstract
Activating signal co-integrator complex 1 (ASCC1) acts with the ASCC-ALKBH3 complex in alkylation damage responses. ASCC1 uniquely combines two evolutionarily ancient domains: nucleotide-binding K-Homology (KH) (associated with regulating splicing, transcription, and translation) and two-histidine phosphodiesterase (PDE; associated with hydrolysis of cyclic nucleotide phosphate bonds). Germline mutations link loss of ASCC1 function to spinal muscular atrophy with congenital bone fractures 2 (SMABF2). Herein, analysis of The Cancer Genome Atlas (TCGA) suggests ASCC1 RNA overexpression in certain tumors correlates with poor survival, Signatures 29 and 3 mutations, and genetic instability markers. We determined crystal structures of Alvinella pompejana (Ap) ASCC1 and human (Hs) PDE domain, revealing high-resolution details and features conserved over 500 million years of evolution. Extending our understanding of the KH domain Gly-X-X-Gly sequence motif, we define a novel structural Helix-Clasp-Helix (HCH) nucleotide binding motif and show ASCC1 sequence-specific binding to CGCG-containing RNA. The V-shaped PDE nucleotide binding channel has two His-Φ-Ser/Thr-Φ (HXT) motifs (Φ being hydrophobic) positioned to initiate cyclic phosphate bond hydrolysis. A conserved atypical active-site histidine torsion angle implies a novel PDE substrate. The flexible active site loop and arginine-rich domain linker appear regulatory. Small-angle X-ray scattering (SAXS) revealed aligned KH-PDE RNA binding sites with limited flexibility in solution. Quantitative evolutionary bioinformatic analyses of disease and cancer-associated mutations support implied functional roles for RNA binding, phosphodiesterase activity, and regulation. Collective results inform ASCC1's roles in transactivation and alkylation damage responses, its targeting by structure-based inhibitors, and how ASCC1 mutations may impact inherited disease and cancer.
Affiliation(s)
- Naga Babu Chinnam
  - Department of Molecular and Cellular Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Roopa Thapar
  - Department of Molecular and Cellular Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Andrew S Arvai
  - Integrative Structural & Computational Biology, The Scripps Research Institute, La Jolla, California, USA
- Altaf H Sarker
  - Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Jennifer M Soll
  - Division of Laboratory and Genomic Medicine, Department of Pathology and Immunology, Washington University in St. Louis, St. Louis, Missouri, USA
- Tanmoy Paul
  - Department of Chemistry, Center for Diagnostics and Therapeutics, Georgia State University, Atlanta, Georgia, USA
- Aleem Syed
  - Department of Molecular and Cellular Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Daniel J Rosenberg
  - Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Michal Hammel
  - Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Albino Bacolla
  - Department of Molecular and Cellular Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Panagiotis Katsonis
  - Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
- Abhishek Asthana
  - Department Cancer Biology, Cleveland Clinic Foundation, Lerner Research Institute, Cleveland, Ohio, USA
- Miaw-Sheue Tsai
  - Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Ivaylo Ivanov
  - Department of Chemistry, Center for Diagnostics and Therapeutics, Georgia State University, Atlanta, Georgia, USA
- Olivier Lichtarge
  - Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
- Robert H Silverman
  - Department Cancer Biology, Cleveland Clinic Foundation, Lerner Research Institute, Cleveland, Ohio, USA
- Nima Mosammaparast
  - Division of Laboratory and Genomic Medicine, Department of Pathology and Immunology, Washington University in St. Louis, St. Louis, Missouri, USA
- Susan E Tsutakawa
  - Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- John A Tainer
  - Department of Molecular and Cellular Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA; Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, California, USA; Department of Cancer Biology, University of Texas MD Anderson Cancer Center, Houston, Texas, USA
9
Veerappa A, Guda C. Coordination among frequent genetic variants imparts substance use susceptibility and pathogenesis. Front Neurosci 2024; 18:1332419. [PMID: 38660223 PMCID: PMC11041639 DOI: 10.3389/fnins.2024.1332419] [Received: 11/13/2023] [Accepted: 04/02/2024] [Indexed: 04/26/2024]
Abstract
Determining the key genetic variants is a crucial step to comprehensively understand substance use disorders (SUDs). In this study, utilizing whole exome sequences of five multi-generational pedigrees with SUDs, we used an integrative omics-based approach to uncover candidate genetic variants that impart susceptibility to SUDs and influence addiction traits. We identified several SNPs and rare, protein-function altering variants in the genes GRIA3, NCOR1, and SHANK1, and compound heterozygous variants in LNPEP, LRP1, and TBX2, which play a significant role in the neurotransmitter-neuropeptide axis, specifically in the dopaminergic circuits. We also noted a greater frequency of heterozygous and recessive variants in genes involved in the structural and functional integrity of synapse receptors: CHRNA4, CNR2, GABBR1, DRD4, NPAS4, ADH1B, ADH1C, OPRM1, and GABBR2. Variant analysis in upstream promoter regions revealed regulatory variants in NEK9, PRRX1, PRPF4B, CELA2A, RABGEF1, and CRBN, crucial for dopamine regulation. Using family- and pedigree-based data, we identified heterozygous recessive alleles in LNPEP, LRP1 (4 frameshift deletions), and TBX2 (2 frameshift deletions) linked to SUDs. GWAS overlap identified several SNPs associated with SUD susceptibility, including rs324420 and rs1229984. Furthermore, miRNA variant analysis revealed notable variants in mir-548U and mir-532. Pathway studies identified the presence of extensive coordination among these genetic variants to impart substance use susceptibility and pathogenesis. This study identified variants that were found to be overrepresented among genes of dopaminergic circuits participating in the neurotransmitter-neuropeptide axis, suggesting pleiotropic influences in the development and sustenance of chronic substance use. The presence of a diverse set of haploinsufficient variants in varying frequencies demonstrates the existence of extraordinary coordination among them in attributing risk and modulating severity to SUDs.
Affiliation(s)
- Avinash Veerappa
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, United States
- Chittibabu Guda
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, United States
  - Center for Biomedical Informatics Research and Innovation, University of Nebraska Medical Center, Omaha, NE, United States
10
Mondal A, Kolomeisky AB. Why Are Nucleosome Breathing Dynamics Asymmetric? J Phys Chem Lett 2024; 15:422-431. [PMID: 38180351 DOI: 10.1021/acs.jpclett.3c03339] [Indexed: 01/06/2024]
Abstract
In eukaryotic cells, DNA is bound to nucleosomes, but DNA segments occasionally unbind in a process known as nucleosome breathing. Although DNA can unwrap simultaneously from both ends of the nucleosome (symmetric breathing), experiments indicate that DNA prefers to dissociate from only one end (asymmetric breathing). However, the molecular origin of the asymmetry is not understood. We developed a new theoretical approach that gives microscopic explanations of asymmetric breathing. It is based on a stochastic description that leads to a comprehensive evaluation of dynamics by using effective free-energy landscapes. It is shown that asymmetric breathing follows the kinetically preferred pathways. In addition, it is found that asymmetric breathing leads to a faster target search by transcription factors. Theoretical predictions, supported by computer simulations, agree with experiments. It is proposed that nature utilizes the asymmetry of nucleosome breathing to achieve a better dynamic accessibility of chromatin for more efficient genetic regulation.
Affiliation(s)
- Anupam Mondal
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Anatoly B Kolomeisky
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, United States
11
Oh KS, Aqdas M, Sung MH. XL-DNase-Seq: Footprinting Analysis of Dynamic Transcription Factors. Methods Mol Biol 2024; 2846:243-261. [PMID: 39141240 DOI: 10.1007/978-1-0716-4071-5_15] [Indexed: 08/15/2024]
Abstract
We have developed a novel method for genomic footprinting of transcription factors (TFs) that detects potential gene regulatory relationships from DNase-seq data at the nucleotide level. We introduce an assay termed cross-link (XL)-DNase-seq, designed to capture chromatin interactions of dynamic TFs. A mild cross-linking step in XL-DNase-seq improves the detection of DNase-based footprints of dynamic TFs. The footprint strengths and detectability depend on an optimal cross-linking procedure. This method may help extract novel gene regulatory circuits involving previously undetectable TFs. The XL-DNase-seq method is illustrated here for activated mouse macrophage-like cells, which share several features with inflammatory macrophages.
Affiliation(s)
- Kyu-Seon Oh
- Laboratory of Molecular Biology and Immunology, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
| | - Mohammad Aqdas
- Laboratory of Molecular Biology and Immunology, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
| | - Myong-Hee Sung
- Laboratory of Molecular Biology and Immunology, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA.
|
12
|
Nanda D, Pant P, Machha P, Sowpati DT, Kumarswamy R. Transcriptional changes during isoproterenol-induced cardiac fibrosis in mice. Front Mol Biosci 2023; 10:1263913. [PMID: 38178867 PMCID: PMC10765171 DOI: 10.3389/fmolb.2023.1263913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Accepted: 10/23/2023] [Indexed: 01/06/2024] Open
Abstract
Introduction: β-adrenergic stimulation using β-agonists such as isoproterenol has been routinely used to induce cardiac fibrosis in experimental animal models. Although transcriptome changes in surgical models of cardiac fibrosis such as transverse aortic constriction (TAC) and coronary artery ligation (CAL) are well-studied, transcriptional changes during isoproterenol-induced cardiac fibrosis are not well-explored. Methods: Cardiac fibrosis was induced in male C57BL6 mice by administration of isoproterenol for 4, 8, or 11 days at 50 mg/kg/day dose. Temporal changes in gene expression were studied by RNA sequencing. Results and discussion: We observed a significant alteration in the transcriptome profile across the different experimental groups compared to the saline group. Isoproterenol treatment caused upregulation of genes associated with ECM organization, cell-cell contact, three-dimensional structure, and cell growth, while genes associated with fatty acid oxidation, sarcoplasmic reticulum calcium ion transport, and cardiac muscle contraction were downregulated. A number of known long non-coding RNAs (lncRNAs) and putative novel lncRNAs exhibited differential regulation. In conclusion, our study shows that isoproterenol administration leads to the dysregulation of genes relevant to ECM deposition and cardiac contraction, and serves as an excellent alternate model to the surgical models of heart failure.
Affiliation(s)
- Disha Nanda
- Council of Scientific and Industrial Research (CSIR)–Centre for Cellular and Molecular Biology, Hyderabad, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
| | - Priyanka Pant
- Council of Scientific and Industrial Research (CSIR)–Centre for Cellular and Molecular Biology, Hyderabad, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
| | - Pratheusa Machha
- Council of Scientific and Industrial Research (CSIR)–Centre for Cellular and Molecular Biology, Hyderabad, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
| | - Divya Tej Sowpati
- Council of Scientific and Industrial Research (CSIR)–Centre for Cellular and Molecular Biology, Hyderabad, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
| | - Regalla Kumarswamy
- Council of Scientific and Industrial Research (CSIR)–Centre for Cellular and Molecular Biology, Hyderabad, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
|
13
|
Hecker D, Lauber M, Behjati Ardakani F, Ashrafiyan S, Manz Q, Kersting J, Hoffmann M, Schulz MH, List M. Computational tools for inferring transcription factor activity. Proteomics 2023; 23:e2200462. [PMID: 37706624 DOI: 10.1002/pmic.202200462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 08/11/2023] [Accepted: 08/22/2023] [Indexed: 09/15/2023]
Abstract
Transcription factors (TFs) are essential players in orchestrating the regulatory landscape in cells. Still, their exact modes of action and dependencies on other regulatory aspects remain elusive. Since TFs act in a cell-type-specific manner and each TF has its own characteristics, untangling their regulatory interactions from an experimental point of view is laborious and convoluted. Thus, there is an ongoing development of computational tools that estimate transcription factor activity (TFA) from a variety of data modalities, either based on a mapping of TFs to their putative target genes or in a genome-wide, gene-unspecific fashion. These tools can help to gain insights into TF regulation and to prioritize candidates for experimental validation. We want to give an overview of available computational tools that estimate TFA, illustrate examples of their application, debate common result validation strategies, and discuss assumptions and concomitant limitations.
Affiliation(s)
- Dennis Hecker
- Goethe University Frankfurt, Frankfurt am Main, Germany
- German Center for Cardiovascular Research, Partner site Rhein-Main, Frankfurt am Main, Germany
- Cardio-Pulmonary Institute, Goethe University Hospital, Frankfurt am Main, Germany
| | - Michael Lauber
- Big Data in BioMedicine Group, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
| | - Fatemeh Behjati Ardakani
- Goethe University Frankfurt, Frankfurt am Main, Germany
- German Center for Cardiovascular Research, Partner site Rhein-Main, Frankfurt am Main, Germany
- Cardio-Pulmonary Institute, Goethe University Hospital, Frankfurt am Main, Germany
| | - Shamim Ashrafiyan
- Goethe University Frankfurt, Frankfurt am Main, Germany
- German Center for Cardiovascular Research, Partner site Rhein-Main, Frankfurt am Main, Germany
- Cardio-Pulmonary Institute, Goethe University Hospital, Frankfurt am Main, Germany
| | - Quirin Manz
- Big Data in BioMedicine Group, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
| | - Johannes Kersting
- Big Data in BioMedicine Group, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- GeneSurge GmbH, München, Germany
| | - Markus Hoffmann
- Big Data in BioMedicine Group, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Institute for Advanced Study, Technical University of Munich, Garching, Germany
- National Institute of Diabetes, Digestive, and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, USA
| | - Marcel H Schulz
- Goethe University Frankfurt, Frankfurt am Main, Germany
- German Center for Cardiovascular Research, Partner site Rhein-Main, Frankfurt am Main, Germany
- Cardio-Pulmonary Institute, Goethe University Hospital, Frankfurt am Main, Germany
| | - Markus List
- Big Data in BioMedicine Group, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
|
14
|
Xue D, Narisu N, Taylor DL, Zhang M, Grenko C, Taylor HJ, Yan T, Tang X, Sinha N, Zhu J, Vandana JJ, Nok Chong AC, Lee A, Mansell EC, Swift AJ, Erdos MR, Zhong A, Bonnycastle LL, Zhou T, Chen S, Collins FS. Functional interrogation of twenty type 2 diabetes-associated genes using isogenic human embryonic stem cell-derived β-like cells. Cell Metab 2023; 35:1897-1914.e11. [PMID: 37858332 PMCID: PMC10841752 DOI: 10.1016/j.cmet.2023.09.013] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 05/01/2023] [Revised: 07/26/2023] [Accepted: 09/28/2023] [Indexed: 10/21/2023]
Abstract
Genetic studies have identified numerous loci associated with type 2 diabetes (T2D), but the functional roles of many loci remain unexplored. Here, we engineered isogenic knockout human embryonic stem cell lines for 20 genes associated with T2D risk. We examined the impacts of each knockout on β cell differentiation, functions, and survival. We generated gene expression and chromatin accessibility profiles on β cells derived from each knockout line. Analyses of T2D-association signals overlapping HNF4A-dependent ATAC peaks identified a likely causal variant at the FAIM2 T2D-association signal. Additionally, the integrative association analyses identified four genes (CP, RNASE1, PCSK1N, and GSTA2) associated with insulin production, and two genes (TAGLN3 and DHRS2) associated with β cell sensitivity to lipotoxicity. Finally, we leveraged deep ATAC-seq read coverage to assess allele-specific imbalance at variants heterozygous in the parental line and identified a single likely functional variant at each of 23 T2D-association signals.
Affiliation(s)
- Dongxiang Xue
- Department of Surgery, Weill Cornell Medicine, 1300 York Avenue, New York, NY 10065, USA; Center for Genomic Health, Weill Cornell Medicine, 1300 York Avenue, New York, NY 10065, USA
| | - Narisu Narisu
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - D Leland Taylor
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Meili Zhang
- Department of Surgery, Weill Cornell Medicine, 1300 York Avenue, New York, NY 10065, USA
| | - Caleb Grenko
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Henry J Taylor
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA; Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, CB1 8RN Cambridge, UK
| | - Tingfen Yan
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Xuming Tang
- Department of Surgery, Weill Cornell Medicine, 1300 York Avenue, New York, NY 10065, USA; Center for Genomic Health, Weill Cornell Medicine, 1300 York Avenue, New York, NY 10065, USA
| | - Neelam Sinha
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Jiajun Zhu
- Department of Surgery, Weill Cornell Medicine, 1300 York Avenue, New York, NY 10065, USA; Center for Genomic Health, Weill Cornell Medicine, 1300 York Avenue, New York, NY 10065, USA
| | - J Jeya Vandana
- Department of Surgery, Weill Cornell Medicine, 1300 York Avenue, New York, NY 10065, USA; Center for Genomic Health, Weill Cornell Medicine, 1300 York Avenue, New York, NY 10065, USA; Tri-Institutional PhD Program in Chemical Biology, Weill Cornell Medicine, The Rockefeller University, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Angie Chi Nok Chong
- Department of Surgery, Weill Cornell Medicine, 1300 York Avenue, New York, NY 10065, USA; Center for Genomic Health, Weill Cornell Medicine, 1300 York Avenue, New York, NY 10065, USA
| | - Angela Lee
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Erin C Mansell
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Amy J Swift
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Michael R Erdos
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Aaron Zhong
- Stem Cell Research Facility, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
| | - Lori L Bonnycastle
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Ting Zhou
- Stem Cell Research Facility, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
| | - Shuibing Chen
- Department of Surgery, Weill Cornell Medicine, 1300 York Avenue, New York, NY 10065, USA; Center for Genomic Health, Weill Cornell Medicine, 1300 York Avenue, New York, NY 10065, USA.
| | - Francis S Collins
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA.
|
15
|
Lee H, Seo P. Accessible gene borders establish a core structural unit for chromatin architecture in Arabidopsis. Nucleic Acids Res 2023; 51:10261-10277. [PMID: 37884483 PMCID: PMC10602878 DOI: 10.1093/nar/gkad710] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/22/2023] [Revised: 08/08/2023] [Accepted: 08/16/2023] [Indexed: 10/28/2023] Open
Abstract
Three-dimensional (3D) chromatin structure is linked to transcriptional regulation in multicellular eukaryotes including plants. Taking advantage of high-resolution Hi-C (high-throughput chromatin conformation capture), we detected a small structural unit with 3D chromatin architecture in the Arabidopsis genome, which lacks topologically associating domains, and also in the genomes of tomato, maize, and Marchantia polymorpha. The 3D folding domain unit was usually established around an individual gene and was dependent on chromatin accessibility at the transcription start site (TSS) and transcription end site (TES). We also observed larger contact domains containing two or more neighboring genes, which were dependent on accessible border regions. Binding of transcription factors to accessible TSS/TES regions formed these gene domains. We successfully simulated these Hi-C contact maps via computational modeling using chromatin accessibility as input. Our results demonstrate that gene domains establish basic 3D chromatin architecture units that likely contribute to higher-order 3D genome folding in plants.
Affiliation(s)
- Hongwoo Lee
- Department of Chemistry, Seoul National University, Seoul 08826, Korea
| | - Pil Joon Seo
- Department of Chemistry, Seoul National University, Seoul 08826, Korea
- Plant Genomics and Breeding Institute, Seoul National University, Seoul 08826, Korea
|
16
|
Grau J, Schmidt F, Schulz MH. Widespread effects of DNA methylation and intra-motif dependencies revealed by novel transcription factor binding models. Nucleic Acids Res 2023; 51:e95. [PMID: 37650641 PMCID: PMC10570048 DOI: 10.1093/nar/gkad693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2022] [Revised: 07/20/2023] [Accepted: 08/10/2023] [Indexed: 09/01/2023] Open
Abstract
Several studies suggested that transcription factor (TF) binding to DNA may be impaired or enhanced by DNA methylation. We present MeDeMo, a toolbox for TF motif analysis that combines information about DNA methylation with models capturing intra-motif dependencies. In a large-scale study using ChIP-seq data for 335 TFs, we identify novel TFs that show a binding behaviour associated with DNA methylation. Overall, we find that the presence of CpG methylation decreases the likelihood of binding for the majority of methylation-associated TFs. For a considerable subset of TFs, we show that intra-motif dependencies are pivotal for accurately modelling the impact of DNA methylation on TF binding. We illustrate that the novel methylation-aware TF binding models allow the prediction of differential ChIP-seq peaks and improve the genome-wide analysis of TF binding. Our work indicates that simplistic models that neglect the effect of DNA methylation on DNA binding may lead to systematic underperformance for methylation-associated TFs.
Affiliation(s)
- Jan Grau
- Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle 06120, Germany
| | - Florian Schmidt
- Goethe-University Frankfurt, Institute for Cardiovascular Regeneration, Theodor-Stern-Kai 7, 60590 Frankfurt, Germany
- Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken 66123, Germany
- Systems Biology and Data Analytics, Genome Institute of Singapore, Singapore 13862, Singapore
- ImmunoScape Pte Ltd, Singapore 228208, Singapore
| | - Marcel H Schulz
- Goethe-University Frankfurt, Institute for Cardiovascular Regeneration, Theodor-Stern-Kai 7, 60590 Frankfurt, Germany
- Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken 66123, Germany
- German Center for Cardiovascular Research, Partner site Rhein-Main, 60590 Frankfurt am Main, Germany
- Cardio-Pulmonary Institute, Goethe University, Frankfurt am Main, Germany
|
17
|
Xu Z, He L, Wu Y, Yang L, Li C, Wu H. PTEN regulates hematopoietic lineage plasticity via PU.1-dependent chromatin accessibility. Cell Rep 2023; 42:112967. [PMID: 37561626 DOI: 10.1016/j.celrep.2023.112967] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/23/2023] [Revised: 06/20/2023] [Accepted: 07/26/2023] [Indexed: 08/12/2023] Open
Abstract
PTEN loss in fetal liver hematopoietic stem cells (HSCs) leads to alterations in myeloid, T-, and B-lineage potentials and T-lineage acute lymphoblastic leukemia (T-ALL) development. To explore the mechanism underlying PTEN-regulated hematopoietic lineage choices, we carry out integrated assay for transposase-accessible chromatin using sequencing (ATAC-seq), single-cell RNA-seq, and in vitro culture analyses using in vivo-isolated mouse pre-leukemic HSCs and progenitors. We find that PTEN loss alters chromatin accessibility of key lineage transcription factor (TF) binding sites at the prepro-B stage, corresponding to increased myeloid and T-lineage potentials and reduced B-lineage potential. Importantly, we find that PU.1 is an essential TF downstream of PTEN and that altering PU.1 levels can reprogram the chromatin accessibility landscape and myeloid, T-, and B-lineage potentials in Pten-null prepro-B cells. Our study discovers prepro-B as the key developmental stage underlying PTEN-regulated hematopoietic lineage choices and suggests a critical role of PU.1 in modulating the epigenetic state and lineage plasticity of prepro-B progenitors.
Affiliation(s)
- Zihan Xu
- The MOE Key Laboratory of Cell Proliferation and Differentiation, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China; Center for Statistical Science, Peking University, Beijing, China
| | - Libing He
- The MOE Key Laboratory of Cell Proliferation and Differentiation, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China; Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
| | - Yilin Wu
- The MOE Key Laboratory of Cell Proliferation and Differentiation, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China; Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
| | - Lu Yang
- The MOE Key Laboratory of Cell Proliferation and Differentiation, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China; Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
| | - Cheng Li
- The MOE Key Laboratory of Cell Proliferation and Differentiation, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China; Center for Statistical Science, Peking University, Beijing, China.
| | - Hong Wu
- The MOE Key Laboratory of Cell Proliferation and Differentiation, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China; Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China.
|
18
|
Mondal A, Kolomeisky AB. Role of Nucleosome Sliding in the Protein Target Search for Covered DNA Sites. J Phys Chem Lett 2023; 14:7073-7082. [PMID: 37527481 DOI: 10.1021/acs.jpclett.3c01704] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 08/03/2023]
Abstract
Associations of transcription factors (TFs) with specific sites on DNA initiate major cellular processes. But DNA in eukaryotic cells is covered by nucleosomes which prevent TFs from binding. However, nucleosome structures on DNA are not static and exhibit breathing and sliding. We develop a theoretical framework to investigate the effect of nucleosome sliding on a protein target search. By analysis of a discrete-state stochastic model of nucleosome sliding, search dynamics are explicitly evaluated. It is found that for long sliding lengths the target search dynamics are faster for normal TFs that cannot enter the nucleosomal DNA. But for more realistic short sliding lengths, the so-called pioneer TFs, which can invade nucleosomal DNA, locate specific sites faster. It is also suggested that nucleosome breathing, which is a faster process, has a stronger effect on protein search dynamics than that of nucleosome sliding. Theoretical arguments to explain these observations are presented.
Affiliation(s)
- Anupam Mondal
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
| | - Anatoly B Kolomeisky
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, United States
|
19
|
Abstract
Nearly three-fourths of all eukaryotic DNA is occupied by nucleosomes, protein-DNA complexes comprising octameric histone core proteins and ∼150 base pairs of DNA. In addition to acting as a DNA compaction vehicle, the dynamics of nucleosomes regulate the DNA site accessibility for the nonhistone proteins, thereby controlling regulatory processes involved in determining the cell identity and cell fate. Here, we propose an analytical framework to analyze the role of nucleosome dynamics on the target search process of transcription factors through a simple discrete-state stochastic description of the search process. By considering the experimentally determined kinetic rates associated with protein and nucleosome dynamics as the only inputs, we estimate the target search time of a protein via first-passage probability calculations separately during nucleosome breathing and sliding dynamics. Although both the nucleosome dynamics permit transient access to the DNA sites that are otherwise occluded by the histone proteins, our result suggests substantial differences between the protein search mechanism on a nucleosome performing breathing and sliding dynamics. Furthermore, we identify the molecular factors that influence the search efficiency and demonstrate how these factors together portray a highly dynamic landscape of gene regulation. Our analytical results are validated using extensive Monte Carlo simulations.
Affiliation(s)
- Sujeet Kumar Mishra
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
| | - Arnab Bhattacherjee
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
|
20
|
Zitti B, Hoffer E, Zheng W, Pandey RV, Schlums H, Perinetti Casoni G, Fusi I, Nguyen L, Kärner J, Kokkinou E, Carrasco A, Gahm J, Ehrström M, Happaniemi S, Keita ÅV, Hedin CRH, Mjösberg J, Eidsmo L, Bryceson YT. Human skin-resident CD8+ T cells require RUNX2 and RUNX3 for induction of cytotoxicity and expression of the integrin CD49a. Immunity 2023:S1074-7613(23)00220-0. [PMID: 37269830 DOI: 10.1016/j.immuni.2023.05.003] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 08/17/2022] [Revised: 01/26/2023] [Accepted: 05/05/2023] [Indexed: 06/05/2023]
Abstract
The integrin CD49a marks highly cytotoxic epidermal-tissue-resident memory (TRM) cells, but their differentiation from circulating populations remains poorly defined. We demonstrate enrichment of RUNT family transcription-factor-binding motifs in human epidermal CD8+CD103+CD49a+ TRM cells, paralleled by high RUNX2 and RUNX3 protein expression. Sequencing of paired skin and blood samples revealed clonal overlap between epidermal CD8+CD103+CD49a+ TRM cells and circulating memory CD8+CD45RA-CD62L+ T cells. In vitro stimulation of circulating CD8+CD45RA-CD62L+ T cells with IL-15 and TGF-β induced CD49a expression and cytotoxic transcriptional profiles in a RUNX2- and RUNX3-dependent manner. We therefore identified a reservoir of circulating cells with cytotoxic TRM potential. In melanoma patients, high RUNX2, but not RUNX3, transcription correlated with a cytotoxic CD8+CD103+CD49a+ TRM cell signature and improved patient survival. Together, our results indicate that combined RUNX2 and RUNX3 activity promotes the differentiation of cytotoxic CD8+CD103+CD49a+ TRM cells, providing immunosurveillance of infected and malignant cells.
Affiliation(s)
- Beatrice Zitti
- Center for Hematology and Regenerative Medicine, Department of Medicine Huddinge, Karolinska Institute, 14157 Stockholm, Sweden
| | - Elena Hoffer
- Division of Rheumatology, Department of Medicine Solna, Karolinska Institutet and Unit of Rheumatology, Karolinska University Hospital, 17176 Stockholm, Sweden; Leo Foundation Skin Immunology Center, Department of Immunology and Microbiology, University of Copenhagen, 2200 Copenhagen, Denmark
| | - Wenning Zheng
- Division of Rheumatology, Department of Medicine Solna, Karolinska Institutet and Unit of Rheumatology, Karolinska University Hospital, 17176 Stockholm, Sweden; Leo Foundation Skin Immunology Center, Department of Immunology and Microbiology, University of Copenhagen, 2200 Copenhagen, Denmark
| | - Ram Vinay Pandey
- Center for Hematology and Regenerative Medicine, Department of Medicine Huddinge, Karolinska Institute, 14157 Stockholm, Sweden
| | - Heinrich Schlums
- Center for Hematology and Regenerative Medicine, Department of Medicine Huddinge, Karolinska Institute, 14157 Stockholm, Sweden
| | - Giovanna Perinetti Casoni
- Center for Hematology and Regenerative Medicine, Department of Medicine Huddinge, Karolinska Institute, 14157 Stockholm, Sweden
| | - Irene Fusi
- Center for Hematology and Regenerative Medicine, Department of Medicine Huddinge, Karolinska Institute, 14157 Stockholm, Sweden; University of Siena, 53100 Siena, Italy
| | - Lien Nguyen
- Center for Hematology and Regenerative Medicine, Department of Medicine Huddinge, Karolinska Institute, 14157 Stockholm, Sweden
| | - Jaanika Kärner
- Division of Rheumatology, Department of Medicine Solna, Karolinska Institutet and Unit of Rheumatology, Karolinska University Hospital, 17176 Stockholm, Sweden
| | - Efthymia Kokkinou
- Center for Infectious Medicine, Department of Medicine Huddinge, Karolinska Institutet, Karolinska University Hospital Huddinge, 14157 Stockholm, Sweden
| | - Anna Carrasco
- Center for Infectious Medicine, Department of Medicine Huddinge, Karolinska Institutet, Karolinska University Hospital Huddinge, 14157 Stockholm, Sweden
| | - Jessica Gahm
- Department of Reconstructive surgery, Karolinska Institutet and Karolinska University Hospital, 17176 Stockholm, Sweden
| | - Åsa V Keita
- Department of Biomedical and Clinical Sciences, Linköping University, 58183 Linköping, Sweden
| | - Charlotte R H Hedin
- Department of Medicine Solna, Karolinska Institutet, 17176 Stockholm, Sweden; Gastroenterology Unit, Department of Gastroenterology, Dermatovenereology and Rheumatology, Karolinska University Hospital, 17176 Stockholm, Sweden
| | - Jenny Mjösberg
- Center for Infectious Medicine, Department of Medicine Huddinge, Karolinska Institutet, Karolinska University Hospital Huddinge, 14157 Stockholm, Sweden
| | - Liv Eidsmo
- Division of Rheumatology, Department of Medicine Solna, Karolinska Institutet and Unit of Rheumatology, Karolinska University Hospital, 17176 Stockholm, Sweden; Leo Foundation Skin Immunology Center, Department of Immunology and Microbiology, University of Copenhagen, 2200 Copenhagen, Denmark.
| | - Yenan T Bryceson
- Center for Hematology and Regenerative Medicine, Department of Medicine Huddinge, Karolinska Institute, 14157 Stockholm, Sweden; Department of Clinical Immunology and Transfusion Medicine, Karolinska University Hospital, 17176 Stockholm, Sweden; Broegelmann Research Laboratory, Department of Clinical Sciences, University of Bergen, 5030 Bergen, Norway.
|
21
|
Marri D, Filipovic D, Kana O, Tischkau S, Bhattacharya S. Prediction of mammalian tissue-specific CLOCK-BMAL1 binding to E-box DNA motifs. Sci Rep 2023; 13:7742. [PMID: 37173345 PMCID: PMC10182026 DOI: 10.1038/s41598-023-34115-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/15/2023] [Accepted: 04/25/2023] [Indexed: 05/15/2023] Open
Abstract
The Brain and Muscle ARNTL-Like 1 protein (BMAL1) forms a heterodimer with either Circadian Locomotor Output Cycles Kaput (CLOCK) or Neuronal PAS domain protein 2 (NPAS2) to act as a master regulator of the mammalian circadian clock gene network. The dimer binds to E-box gene regulatory elements on DNA, activating downstream transcription of clock genes. Identification of transcription factor binding sites and genomic features that correlate to DNA binding by BMAL1 is a challenging problem, given that CLOCK-BMAL1 or NPAS2-BMAL1 bind to several distinct binding motifs (CANNTG) on DNA. Using three different types of tissue-specific machine learning models with features based on (1) DNA sequence, (2) DNA sequence plus DNA shape, and (3) DNA sequence and shape plus histone modifications, we developed an interpretable predictive model of genome-wide BMAL1 binding to E-box motifs and dissected the mechanisms underlying BMAL1-DNA binding. Our results indicated that histone modifications, the local shape of the DNA, and the flanking sequence of the E-box motif are sufficient predictive features for BMAL1-DNA binding. Our models also provide mechanistic insights into tissue specificity of DNA binding by BMAL1.
Affiliation(s)
- Daniel Marri
- Department of Biomedical Engineering, Michigan State University, East Lansing, MI, USA
- Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI, USA
- David Filipovic
- Department of Biomedical Engineering, Michigan State University, East Lansing, MI, USA
- Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI, USA
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, USA
- Omar Kana
- Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI, USA
- Department of Pharmacology and Toxicology, Michigan State University, East Lansing, MI, USA
- Institute for Integrative Toxicology, Michigan State University, East Lansing, MI, USA
- Shelley Tischkau
- Department of Pharmacology, Southern Illinois University School of Medicine, Springfield, IL, USA
- Sudin Bhattacharya
- Department of Biomedical Engineering, Michigan State University, East Lansing, MI, USA.
- Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI, USA.
- Department of Pharmacology and Toxicology, Michigan State University, East Lansing, MI, USA.
- Institute for Integrative Toxicology, Michigan State University, East Lansing, MI, USA.
22
Xue D, Narisu N, Taylor DL, Zhang M, Grenko C, Taylor HJ, Yan T, Tang X, Sinha N, Zhu J, Vandana JJ, Chong ACN, Lee A, Mansell EC, Swift AJ, Erdos MR, Zhou T, Bonnycastle LL, Zhong A, Chen S, Collins FS. Functional interrogation of twenty type 2 diabetes-associated genes using isogenic hESC-derived β-like cells. bioRxiv 2023:2023.05.07.539774. [PMID: 37214922 PMCID: PMC10197532 DOI: 10.1101/2023.05.07.539774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Genetic studies have identified numerous loci associated with type 2 diabetes (T2D), but the functional role of many loci has remained unexplored. In this study, we engineered isogenic knockout human embryonic stem cell (hESC) lines for 20 genes associated with T2D risk. We systematically examined β-cell differentiation, insulin production and secretion, and survival. We performed RNA-seq and ATAC-seq on hESC-β cells from each knockout line. Analyses of T2D GWAS signals overlapping with HNF4A-dependent ATAC peaks identified a specific SNP as a likely causal variant. In addition, we performed integrative association analyses and identified four genes (CP, RNASE1, PCSK1N and GSTA2) associated with insulin production, and two genes (TAGLN3 and DHRS2) associated with sensitivity to lipotoxicity. Finally, we leveraged deep ATAC-seq read coverage to assess allele-specific imbalance at variants heterozygous in the parental hESC line, to identify a single likely functional variant at each of 23 T2D GWAS signals.
23
Mondal A, Felipe C, Kolomeisky AB. Nucleosome Breathing Facilitates the Search for Hidden DNA Sites by Pioneer Transcription Factors. J Phys Chem Lett 2023; 14:4096-4103. [PMID: 37125729 DOI: 10.1021/acs.jpclett.3c00529] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 05/05/2023]
Abstract
Transfer of genetic information starts with transcription factors (TFs) binding to specific sites on DNA. But in living cells, DNA is mostly covered by nucleosomes. There are proteins, known as pioneer TFs, that can efficiently reach the DNA sites hidden by nucleosomes, although the underlying mechanisms are not understood. Using the recently proposed idea of interaction-compensation mechanism, we develop a stochastic model for the target search on DNA with nucleosome breathing. It is found that nucleosome breathing can significantly accelerate the search by pioneer TFs in comparison to situations without breathing. We argue that this is the result of the interaction-compensation mechanism that allows proteins to enter the inner nucleosome region through the outer DNA segment. It is suggested that nature optimized pioneer TFs to take advantage of nucleosome breathing. The presented theoretical picture provides a possible microscopic explanation for the successful invasion of nucleosome-buried genes.
Affiliation(s)
- Anupam Mondal
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Cayke Felipe
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Department of Physics and Astronomy, Rice University, Houston, Texas 77005, United States
- Anatoly B Kolomeisky
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Department of Physics and Astronomy, Rice University, Houston, Texas 77005, United States
- Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, United States
24
Kumar Mishra S, Bhattacherjee A. Understanding the Target Search by Multiple Transcription Factors on Nucleosomal DNA. Chemphyschem 2023; 24:e202200644. [PMID: 36602094 DOI: 10.1002/cphc.202200644] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2022] [Revised: 01/04/2023] [Accepted: 01/05/2023] [Indexed: 01/06/2023]
Abstract
The association of multiple Transcription Factors (TFs) in the cis-regulatory region is imperative for developmental changes in eukaryotes. The underlying process is exceedingly complex, and it is not at all clear what orchestrates the overall search process by multiple TFs. In this study, by developing a theoretical model based on a discrete-state stochastic approach, we investigated the target search mechanism of multiple TFs on nucleosomal DNA. Experimental kinetic rate constants of different TFs are taken as input to estimate the Mean-First-Passage time to recognize the binding motifs by two TFs on a dynamic nucleosome model. The theory systematically analyzes when the TFs search their binding motifs hierarchically and when simultaneously by proceeding via the formation of a protein-protein complex. Our results, validated by extensive Monte Carlo simulations, elucidate the molecular basis of the complex target search phenomenon of multiple TFs on nucleosomal DNA.
Affiliation(s)
- Sujeet Kumar Mishra
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
- Arnab Bhattacherjee
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
25
Voit RA, Tao L, Yu F, Cato LD, Cohen B, Fleming TJ, Antoszewski M, Liao X, Fiorini C, Nandakumar SK, Wahlster L, Teichert K, Regev A, Sankaran VG. A genetic disorder reveals a hematopoietic stem cell regulatory network co-opted in leukemia. Nat Immunol 2023; 24:69-83. [PMID: 36522544 PMCID: PMC9810535 DOI: 10.1038/s41590-022-01370-4] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 02/22/2022] [Accepted: 10/25/2022] [Indexed: 12/23/2022]
Abstract
The molecular regulation of human hematopoietic stem cell (HSC) maintenance is therapeutically important, but limitations in experimental systems and interspecies variation have constrained our knowledge of this process. Here, we have studied a rare genetic disorder due to MECOM haploinsufficiency, characterized by an early-onset absence of HSCs in vivo. By generating a faithful model of this disorder in primary human HSCs and coupling functional studies with integrative single-cell genomic analyses, we uncover a key transcriptional network involving hundreds of genes that is required for HSC maintenance. Through our analyses, we nominate cooperating transcriptional regulators and identify how MECOM prevents the CTCF-dependent genome reorganization that occurs as HSCs differentiate. We show that this transcriptional network is co-opted in high-risk leukemias, thereby enabling these cancers to acquire stem cell properties. Collectively, we illuminate a regulatory network necessary for HSC self-renewal through the study of a rare experiment of nature.
Affiliation(s)
- Richard A Voit
- Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA.
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA.
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Liming Tao
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Genentech, South San Francisco, CA, USA
- Fulong Yu
- Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Liam D Cato
- Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Blake Cohen
- Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Travis J Fleming
- Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Mateusz Antoszewski
- Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Xiaotian Liao
- Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Claudia Fiorini
- Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Satish K Nandakumar
- Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Cell Biology, Albert Einstein College of Medicine, Albert Einstein Cancer Center, Ruth L. and David S. Gottesman Institute for Stem Cell Research and Regenerative Medicine, Bronx, NY, USA
- Lara Wahlster
- Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Kristian Teichert
- Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Aviv Regev
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Genentech, South San Francisco, CA, USA
- Vijay G Sankaran
- Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA.
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA.
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Harvard Stem Cell Institute, Cambridge, MA, USA.
26
Cazares TA, Rizvi FW, Iyer B, Chen X, Kotliar M, Bejjani AT, Wayman JA, Donmez O, Wronowski B, Parameswaran S, Kottyan LC, Barski A, Weirauch MT, Prasath VBS, Miraldi ER. maxATAC: Genome-scale transcription-factor binding prediction from ATAC-seq with deep neural networks. PLoS Comput Biol 2023; 19:e1010863. [PMID: 36719906 PMCID: PMC9917285 DOI: 10.1371/journal.pcbi.1010863] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 07/06/2022] [Revised: 02/10/2023] [Accepted: 01/10/2023] [Indexed: 02/01/2023]
Abstract
Transcription factors read the genome, fundamentally connecting DNA sequence to gene expression across diverse cell types. Determining how, where, and when TFs bind chromatin will advance our understanding of gene regulatory networks and cellular behavior. The 2017 ENCODE-DREAM in vivo Transcription-Factor Binding Site (TFBS) Prediction Challenge highlighted the value of chromatin accessibility data to TFBS prediction, establishing state-of-the-art methods for TFBS prediction from DNase-seq. However, the more recent Assay-for-Transposase-Accessible-Chromatin (ATAC)-seq has surpassed DNase-seq as the most widely-used chromatin accessibility profiling method. Furthermore, ATAC-seq is the only such technique available at single-cell resolution from standard commercial platforms. While ATAC-seq datasets grow exponentially, suboptimal motif scanning is unfortunately the most common method for TFBS prediction from ATAC-seq. To enable community access to state-of-the-art TFBS prediction from ATAC-seq, we (1) curated an extensive benchmark dataset (127 TFs) for ATAC-seq model training and (2) built "maxATAC", a suite of user-friendly, deep neural network models for genome-wide TFBS prediction from ATAC-seq in any cell type. With models available for 127 human TFs, maxATAC is the largest collection of high-performance TFBS prediction models for ATAC-seq. maxATAC performance extends to primary cells and single-cell ATAC-seq, enabling improved TFBS prediction in vivo. We demonstrate maxATAC's capabilities by identifying TFBS associated with allele-dependent chromatin accessibility at atopic dermatitis genetic risk loci.
Affiliation(s)
- Tareian A. Cazares
- Immunology Graduate Program, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America
- Faiz W. Rizvi
- Systems Biology and Physiology Graduate Program, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America
- Balaji Iyer
- Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, Ohio, United States of America
- Xiaoting Chen
- The Center for Autoimmune Genetics and Etiology (CAGE), Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- Michael Kotliar
- Division of Allergy and Immunology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- Anthony T. Bejjani
- Molecular and Developmental Biology Graduate Program, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America
- Joseph A. Wayman
- Division of Immunobiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- Omer Donmez
- The Center for Autoimmune Genetics and Etiology (CAGE), Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- Benjamin Wronowski
- Division of Allergy and Immunology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- Sreeja Parameswaran
- The Center for Autoimmune Genetics and Etiology (CAGE), Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- Leah C. Kottyan
- The Center for Autoimmune Genetics and Etiology (CAGE), Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America
- Division of Human Genetics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- Artem Barski
- Division of Allergy and Immunology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America
- Division of Human Genetics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- Matthew T. Weirauch
- Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- The Center for Autoimmune Genetics and Etiology (CAGE), Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America
- Division of Human Genetics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- Division of Developmental Biology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- V. B. Surya Prasath
- Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, Ohio, United States of America
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America
- Emily R. Miraldi
- Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, Ohio, United States of America
- Division of Immunobiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America
27
Yang T, Henao R. TAMC: A deep-learning approach to predict motif-centric transcriptional factor binding activity based on ATAC-seq profile. PLoS Comput Biol 2022; 18:e1009921. [PMID: 36094959 PMCID: PMC9499209 DOI: 10.1371/journal.pcbi.1009921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2022] [Revised: 09/22/2022] [Accepted: 08/24/2022] [Indexed: 11/18/2022]
Abstract
Determining transcriptional factor binding sites (TFBSs) is critical for understanding the molecular mechanisms regulating gene expression in different biological conditions. Biological assays designed to directly map TFBSs require large sample sizes and intensive resources. As an alternative, the ATAC-seq assay is simple to conduct and provides genomic cleavage profiles that contain rich information for imputing TFBSs indirectly. Previous footprint-based tools are inherently limited by the accuracy of their bias correction algorithms and the efficiency of their feature extraction models. Here we introduce TAMC (Transcriptional factor binding prediction from ATAC-seq profile at Motif-predicted binding sites using Convolutional neural networks), a deep-learning approach for predicting motif-centric TF binding activity from paired-end ATAC-seq data. TAMC does not require bias correction during signal processing. By leveraging a one-dimensional convolutional neural network (1D-CNN) model, TAMC makes predictions based on both footprint and non-footprint features at binding sites for each TF and outperforms existing footprinting tools in TFBS prediction, particularly for ATAC-seq data with limited sequencing depth.
Affiliation(s)
- Tianqi Yang
- Department of Pharmacology and Cancer Biology, Duke University School of Medicine, Durham, North Carolina, United States of America
- Department of Cell Biology, Duke University School of Medicine, Durham, North Carolina, United States of America
- Ricardo Henao
- Center for Applied Genomics and Precision Medicine, Duke University School of Medicine, Durham, North Carolina, United States of America
- Department of Biostatistics and Informatics, Duke University, Durham, North Carolina, United States of America
28
Kiani K, Sanford EM, Goyal Y, Raj A. Changes in chromatin accessibility are not concordant with transcriptional changes for single-factor perturbations. Mol Syst Biol 2022; 18:e10979. [PMID: 36069349 PMCID: PMC9450098 DOI: 10.15252/msb.202210979] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 02/15/2022] [Revised: 08/11/2022] [Accepted: 08/19/2022] [Indexed: 11/23/2022]
Abstract
A major goal in the field of transcriptional regulation is the mapping of changes in the binding of transcription factors to the resultant changes in gene expression. Recently, methods for measuring chromatin accessibility have enabled us to measure changes in accessibility across the genome, which are thought to correspond to transcription factor-binding events. In concert with RNA-sequencing, these data in principle enable such mappings; however, few studies have looked at their concordance over short-duration treatments with specific perturbations. Here, we used tandem bulk ATAC-seq and RNA-seq measurements from MCF-7 breast carcinoma cells to systematically evaluate the concordance between changes in accessibility and changes in expression in response to retinoic acid and TGF-β. We found two classes of genes whose expression showed a significant change: those that showed some changes in the accessibility of nearby chromatin, and those that showed virtually no change despite strong changes in expression. The peaks associated with genes in the former group had lower baseline accessibility prior to exposure to signal. Focusing the analysis specifically on peaks with motifs for transcription factors associated with retinoic acid and TGF-β signaling did not reduce the lack of correspondence. Analysis of paired chromatin accessibility and gene expression data from distinct paths along the hematopoietic differentiation trajectory showed a much stronger correspondence, suggesting that the multifactorial biological processes associated with differentiation may lead to changes in chromatin accessibility that reflect rather than drive altered transcriptional status. Together, these results show that many gene expression changes can happen independently of changes in the accessibility of local chromatin in the context of a single-factor perturbation.
Affiliation(s)
- Karun Kiani
- Genetics and Epigenetics, Cell and Molecular Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Eric M Sanford
- Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Yogesh Goyal
- Department of Bioengineering, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Cell and Developmental Biology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
- Center for Synthetic Biology, Northwestern University, Chicago, Illinois, USA
- Arjun Raj
- Department of Bioengineering, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
29
Song S, Sun H, Liu JS, Hou L. Multi-Cell-Type Openness-Weighted Association Studies for Trait-Associated Genomic Segments Prioritization. Genes (Basel) 2022; 13:1220. [PMID: 35886003 PMCID: PMC9323627 DOI: 10.3390/genes13071220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2022] [Revised: 06/30/2022] [Accepted: 07/03/2022] [Indexed: 02/01/2023]
Abstract
Openness-weighted association study (OWAS) is a method that leverages the in silico prediction of chromatin accessibility to prioritize genome-wide association studies (GWAS) signals, and can provide novel insights into the roles of non-coding variants in complex diseases. A prerequisite to apply OWAS is to choose a trait-related cell type beforehand. However, for most complex traits, the trait-relevant cell types remain elusive. In addition, many complex traits involve multiple related cell types. To address these issues, we develop OWAS-joint, an efficient framework that aggregates predicted chromatin accessibility across multiple cell types, to prioritize disease-associated genomic segments. In simulation studies, we demonstrate that OWAS-joint achieves a greater statistical power compared to OWAS. Moreover, the heritability explained by OWAS-joint segments is higher than or comparable to OWAS segments. OWAS-joint segments also have high replication rates in independent replication cohorts. Applying the method to six complex human traits, we demonstrate the advantages of OWAS-joint over a single-cell-type OWAS approach. We highlight that OWAS-joint enhances the biological interpretation of disease mechanisms, especially for non-coding regions.
Affiliation(s)
- Shuang Song
- Center for Statistical Science, Department of Industrial Engineering, Tsinghua University, Beijing 100084, China
- Hongyi Sun
- Center for Statistical Science, Department of Industrial Engineering, Tsinghua University, Beijing 100084, China
- Jun S. Liu
- Department of Statistics, Harvard University, Cambridge, MA 02138, USA
- Lin Hou
- Center for Statistical Science, Department of Industrial Engineering, Tsinghua University, Beijing 100084, China
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing 100084, China
30
Wang RR, Qiu X, Pan R, Fu H, Zhang Z, Wang Q, Chen H, Wu QQ, Pan X, Zhou Y, Shan P, Wang S, Guo G, Zheng M, Zhu L, Meng ZX. Dietary intervention preserves β cell function in mice through CTCF-mediated transcriptional reprogramming. J Exp Med 2022; 219:213256. [PMID: 35652891 DOI: 10.1084/jem.20211779] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 08/22/2021] [Revised: 04/04/2022] [Accepted: 05/12/2022] [Indexed: 12/12/2022]
Abstract
Pancreatic β cell plasticity is the primary determinant of disease progression and remission of type 2 diabetes (T2D). However, the dynamic nature of β cell adaptation remains elusive. Here, we establish a mouse model exhibiting the compensation-to-decompensation adaptation of β cell function in response to increasing duration of high-fat diet (HFD) feeding. Comprehensive islet functional and transcriptome analyses reveal a dynamic orchestration of transcriptional networks featuring temporal alteration of chromatin remodeling. Interestingly, prediabetic dietary intervention completely rescues β cell dysfunction, accompanied by a remarkable reversal of HFD-induced reprogramming of islet chromatin accessibility and transcriptome. Mechanistically, ATAC-based motif analysis identifies CTCF as the top candidate driving dietary intervention-induced preservation of β cell function. CTCF expression is markedly decreased in β cells from obese and diabetic mice and humans. Both dietary intervention and AAV-mediated restoration of CTCF expression ameliorate β cell dysfunction ex vivo and in vivo, through transducing the lipid toxicity and inflammatory signals to transcriptional reprogramming of genes critical for β cell glucose metabolism and stress response.
Affiliation(s)
- Ruo-Ran Wang
- Department of Pathology and Pathophysiology and Department of Cardiology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Key Laboratory of Disease Proteomics of Zhejiang Province, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Chronic Disease Research Institute, School of Public Health, Zhejiang University, Hangzhou, Zhejiang, China
- Xinyuan Qiu
- Department of Pathology and Pathophysiology and Department of Cardiology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Department of Biology and Chemistry, College of Liberal Arts and Sciences, National University of Defense Technology, Changsha, Hunan, China
- Ran Pan
- Department of Pathology and Pathophysiology and Department of Cardiology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Key Laboratory of Disease Proteomics of Zhejiang Province, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Chronic Disease Research Institute, School of Public Health, Zhejiang University, Hangzhou, Zhejiang, China
- Hongxing Fu
- Department of Hepatobiliary and Pancreatic Surgery of the First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Ziyin Zhang
- Department of Pathology and Pathophysiology and Department of Cardiology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Key Laboratory of Disease Proteomics of Zhejiang Province, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Chronic Disease Research Institute, School of Public Health, Zhejiang University, Hangzhou, Zhejiang, China
- Qintao Wang
- Department of Pathology and Pathophysiology and Department of Cardiology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Key Laboratory of Disease Proteomics of Zhejiang Province, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Chronic Disease Research Institute, School of Public Health, Zhejiang University, Hangzhou, Zhejiang, China
- Haide Chen
- Center for Stem Cell and Regenerative Medicine, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Qing-Qian Wu
- Department of Pathology and Pathophysiology and Department of Cardiology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Key Laboratory of Disease Proteomics of Zhejiang Province, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Chronic Disease Research Institute, School of Public Health, Zhejiang University, Hangzhou, Zhejiang, China
- Xiaowen Pan
- Department of Endocrinology and Metabolism, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Yanping Zhou
- Department of Pathology and Pathophysiology and Department of Cardiology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Pengfei Shan
- Department of Endocrinology and Metabolism, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Shusen Wang
- Organ Transplant Center, Tianjin First Central Hospital, Tianjin, China
- NHC Key Laboratory for Critical Care Medicine, Tianjin First Central Hospital, Tianjin, China
- Guoji Guo
- Center for Stem Cell and Regenerative Medicine, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Min Zheng
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Lingyun Zhu
- Department of Biology and Chemistry, College of Liberal Arts and Sciences, National University of Defense Technology, Changsha, Hunan, China
- Zhuo-Xian Meng
- Department of Pathology and Pathophysiology and Department of Cardiology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Key Laboratory of Disease Proteomics of Zhejiang Province, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Chronic Disease Research Institute, School of Public Health, Zhejiang University, Hangzhou, Zhejiang, China
- Department of Geriatrics, Affiliated Hangzhou First People's Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
|
31
|
Lee HJ, Hou Y, Maeng JH, Shah NM, Chen Y, Lawson HA, Yang H, Yue F, Wang T. Epigenomic analysis reveals prevalent contribution of transposable elements to cis-regulatory elements, tissue-specific expression, and alternative promoters in zebrafish. Genome Res 2022; 32:1424-1436. [PMID: 35649578 PMCID: PMC9341505 DOI: 10.1101/gr.276052.121] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/07/2021] [Accepted: 05/27/2022] [Indexed: 12/04/2022]
Abstract
Transposable elements (TEs) encode regulatory elements that impact gene expression in multiple species, yet a comprehensive analysis of zebrafish TEs in the context of gene regulation is lacking. Here, we systematically investigate the epigenomic and transcriptomic landscape of TEs across 11 adult zebrafish tissues using multidimensional sequencing data. We find that TEs contribute substantially to a diverse array of regulatory elements in the zebrafish genome and that 37% of TEs are positioned in active regulatory states in adult zebrafish tissues. We identify TE subfamilies enriched in highly specific regulatory elements among different tissues. We use transcript assembly to discover TE-derived transcriptional units expressed across tissues. Finally, we show that novel TE-derived promoters can initiate tissue-specific transcription of alternate gene isoforms. This work provides a comprehensive profile of TE activity across normal zebrafish tissues, shedding light on mechanisms underlying the regulation of gene expression in this widely used model organism.
Affiliation(s)
- Hyung Joo Lee
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Yiran Hou
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Ju Heon Maeng
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Nakul M Shah
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Yujie Chen
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Heather A Lawson
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Hongbo Yang
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, Illinois 60611, USA
| | - Feng Yue
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, Illinois 60611, USA
- Robert H. Lurie Comprehensive Cancer Center of Northwestern University, Chicago, Illinois 60611, USA
| | - Ting Wang
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri 63108, USA
|
32
|
Karimzadeh M, Hoffman MM. Virtual ChIP-seq: predicting transcription factor binding by learning from the transcriptome. Genome Biol 2022; 23:126. [PMID: 35681170 PMCID: PMC9185870 DOI: 10.1186/s13059-022-02690-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/04/2022] [Accepted: 05/16/2022] [Indexed: 11/29/2022] Open
Abstract
Existing methods for computational prediction of transcription factor (TF) binding sites evaluate genomic regions with similarity to known TF sequence preferences. Most TF binding sites, however, do not resemble known TF sequence motifs, and many TFs are not sequence-specific. We developed Virtual ChIP-seq, which predicts binding of individual TFs in new cell types, integrating learned associations with gene expression and binding, TF binding sites from other cell types, and chromatin accessibility data in the new cell type. This approach outperforms methods that predict TF binding solely based on sequence preference, predicting binding for 36 TFs (MCC>0.3).
Affiliation(s)
- Mehran Karimzadeh
- Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada.,Princess Margaret Cancer Centre, Toronto, ON, Canada.,Vector Institute, Toronto, ON, Canada
| | - Michael M Hoffman
- Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada.,Princess Margaret Cancer Centre, Toronto, ON, Canada.,Vector Institute, Toronto, ON, Canada.,Department of Computer Science, University of Toronto, Toronto, ON, Canada
|
33
|
Luo K, Zhong J, Safi A, Hong LK, Tewari AK, Song L, Reddy TE, Ma L, Crawford GE, Hartemink AJ. Profiling the quantitative occupancy of myriad transcription factors across conditions by modeling chromatin accessibility data. Genome Res 2022; 32:1183-1198. [PMID: 35609992 PMCID: PMC9248881 DOI: 10.1101/gr.272203.120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2020] [Accepted: 05/06/2022] [Indexed: 11/24/2022]
Abstract
Over a thousand different transcription factors (TFs) bind with varying occupancy across the human genome. Chromatin immunoprecipitation (ChIP) can assay occupancy genome-wide, but only one TF at a time, limiting our ability to comprehensively observe the TF occupancy landscape, let alone quantify how it changes across conditions. We developed TF occupancy profiler (TOP), a Bayesian hierarchical regression framework, to profile genome-wide quantitative occupancy of numerous TFs using data from a single chromatin accessibility experiment (DNase- or ATAC-seq). TOP is supervised, and its hierarchical structure allows it to predict the occupancy of any sequence-specific TF, even those never assayed with ChIP. We used TOP to profile the quantitative occupancy of hundreds of sequence-specific TFs at sites throughout the genome and examined how their occupancies changed in multiple contexts: in approximately 200 human cell types, through 12 h of exposure to different hormones, and across the genetic backgrounds of 70 individuals. TOP enables cost-effective exploration of quantitative changes in the landscape of TF binding.
Affiliation(s)
- Kaixuan Luo
- Computational Biology & Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA
- Department of Computer Science, Duke University, Durham, North Carolina 27708, USA
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Jianling Zhong
- Computational Biology & Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA
- Department of Computer Science, Duke University, Durham, North Carolina 27708, USA
| | - Alexias Safi
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA
- Department of Pediatrics, Duke University Medical Center, Durham, North Carolina 27710, USA
| | - Linda K Hong
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA
- Department of Pediatrics, Duke University Medical Center, Durham, North Carolina 27710, USA
| | - Alok K Tewari
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts 02215, USA
| | - Lingyun Song
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA
- Department of Pediatrics, Duke University Medical Center, Durham, North Carolina 27710, USA
| | - Timothy E Reddy
- Computational Biology & Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA
- Department of Biostatistics and Bioinformatics, Durham, North Carolina 27710, USA
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina 27710, USA
- Department of Biomedical Engineering, Duke University, Durham, North Carolina 27708, USA
| | - Li Ma
- Computational Biology & Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA
- Department of Statistical Science, Duke University, Durham, North Carolina 27708, USA
| | - Gregory E Crawford
- Computational Biology & Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA
- Department of Pediatrics, Duke University Medical Center, Durham, North Carolina 27710, USA
| | - Alexander J Hartemink
- Computational Biology & Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA
- Department of Computer Science, Duke University, Durham, North Carolina 27708, USA
- Department of Biology, Duke University, Durham, North Carolina 27708, USA
|
34
|
Hesami M, Alizadeh M, Jones AMP, Torkamaneh D. Machine learning: its challenges and opportunities in plant system biology. Appl Microbiol Biotechnol 2022; 106:3507-3530. [PMID: 35575915 DOI: 10.1007/s00253-022-11963-6] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 03/14/2022] [Revised: 03/14/2022] [Accepted: 05/07/2022] [Indexed: 12/25/2022]
Abstract
Sequencing technologies are evolving at a rapid pace, enabling the generation of massive amounts of data in multiple dimensions (e.g., genomics, epigenomics, transcriptomics, metabolomics, proteomics, and single-cell omics) in plants. To provide comprehensive insights into the complexity of plant biological systems, it is important to integrate different omics datasets. Although recent advances in computational analytical pipelines have enabled efficient and high-quality exploration and exploitation of single omics data, the integration of multidimensional, heterogeneous, and large datasets (i.e., multi-omics) remains a challenge. In this regard, machine learning (ML) offers promising approaches to integrate large datasets and to recognize fine-grained patterns and relationships. Nevertheless, ML approaches require rigorous optimization to process multi-omics-derived datasets. In this review, we discuss the main concepts of machine learning as well as the key challenges and solutions related to the big data derived from plant system biology. We also provide in-depth insight into the principles of data integration using ML, as well as challenges and opportunities in different contexts including multi-omics, single-cell omics, protein function, and protein-protein interaction. KEY POINTS: • The key challenges and solutions related to the big data derived from plant system biology have been highlighted. • Different methods of data integration have been discussed. • Challenges and opportunities of the application of machine learning in plant system biology have been highlighted and discussed.
Affiliation(s)
- Mohsen Hesami
- Department of Plant Agriculture, University of Guelph, Guelph, ON, N1G 2W1, Canada
| | - Milad Alizadeh
- Department of Botany, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
| | | | - Davoud Torkamaneh
- Département de Phytologie, Université Laval, Québec City, QC, G1V 0A6, Canada.,Institut de Biologie Intégrative Et Des Systèmes (IBIS), Université Laval, Québec City, QC, G1V 0A6, Canada
|
35
|
Li H, Guan Y. Asymmetric predictive relationships across histone modifications. Nat Mach Intell 2022; 4:288-299. [DOI: 10.1038/s42256-022-00455-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
|
36
|
Zibetti C. Deciphering the Retinal Epigenome during Development, Disease and Reprogramming: Advancements, Challenges and Perspectives. Cells 2022; 11:806. [PMID: 35269428 PMCID: PMC8908986 DOI: 10.3390/cells11050806] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2022] [Revised: 02/15/2022] [Accepted: 02/18/2022] [Indexed: 02/01/2023] Open
Abstract
Retinal neurogenesis is driven by concerted actions of transcription factors, some of which are expressed in a continuum and across several cell subtypes throughout development. While seemingly redundant, many factors diversify their regulatory outcome on gene expression, by coordinating variations in chromatin landscapes to drive divergent retinal specification programs. Recent studies have furthered the understanding of the epigenetic contribution to the progression of age-related macular degeneration, a leading cause of blindness in the elderly. The knowledge of the epigenomic mechanisms that control the acquisition and stabilization of retinal cell fates and are evoked upon damage, holds the potential for the treatment of retinal degeneration. Herein, this review presents the state-of-the-art approaches to investigate the retinal epigenome during development, disease, and reprogramming. A pipeline is then reviewed to functionally interrogate the epigenetic and transcriptional networks underlying cell fate specification, relying on a truly unbiased screening of open chromatin states. The related work proposes an inferential model to identify gene regulatory networks, features the first footprinting analysis and the first tentative, systematic query of candidate pioneer factors in the retina ever conducted in any model organism, leading to the identification of previously uncharacterized master regulators of retinal cell identity, such as the nuclear factor I, NFI. This pipeline is virtually applicable to the study of genetic programs and candidate pioneer factors in any developmental context. Finally, challenges and limitations intrinsic to the current next-generation sequencing techniques are discussed, as well as recent advances in super-resolution imaging, enabling spatio-temporal resolution of the genome.
Affiliation(s)
- Cristina Zibetti
- Department of Ophthalmology, Institute of Clinical Medicine, University of Oslo, Kirkeveien 166, Building 36, 0455 Oslo, Norway
|
37
|
Suriyalaksh M, Raimondi C, Mains A, Segonds-Pichon A, Mukhtar S, Murdoch S, Aldunate R, Krueger F, Guimerà R, Andrews S, Sales-Pardo M, Casanueva O. Gene regulatory network inference in long-lived C. elegans reveals modular properties that are predictive of novel aging genes. iScience 2022; 25:103663. [PMID: 35036864 PMCID: PMC8753122 DOI: 10.1016/j.isci.2021.103663] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/23/2021] [Revised: 09/09/2021] [Accepted: 12/15/2021] [Indexed: 11/24/2022] Open
Abstract
We design a “wisdom-of-the-crowds” GRN inference pipeline and couple it to complex network analysis to understand the organizational principles governing gene regulation in long-lived glp-1/Notch Caenorhabditis elegans. The GRN has three layers (input, core, and output) and is topologically equivalent to bow-tie/hourglass structures prevalent among metabolic networks. To assess the functional importance of structural layers, we screened 80% of regulators and discovered 50 new aging genes, 86% with human orthologues. Genes essential for longevity—including ones involved in insulin-like signaling (ILS)—are at the core, indicating that GRN's structure is predictive of functionality. We used in vivo reporters and a novel functional network covering 5,497 genetic interactions to make mechanistic predictions. We used genetic epistasis to test some of these predictions, uncovering a novel transcriptional regulator, sup-37, that works alongside DAF-16/FOXO. We present a framework with predictive power that can accelerate discovery in C. elegans and potentially humans.
• Gene-regulatory inference provides global network of long-lived animals. • The large-scale topology of the network has an hourglass structure. • Membership to the core of the hourglass is a good predictor of functionality. • Discovered 50 novel aging genes, including sup-37, a DAF-16 dependent gene.
Affiliation(s)
| | | | - Abraham Mains
- Babraham Institute, Babraham, Cambridge CB22 3AT, UK
| | | | | | | | - Rebeca Aldunate
- Escuela de Biotecnología, Facultad de Ciencias, Universidad Santo Tomas, Santiago, Chile
| | - Felix Krueger
- Babraham Institute, Babraham, Cambridge CB22 3AT, UK
| | - Roger Guimerà
- ICREA, Barcelona 08010, Catalonia, Spain.,Department of Chemical Engineering, Universitat Rovira i Virgili, Tarragona 43007, Catalonia, Spain
| | - Simon Andrews
- Babraham Institute, Babraham, Cambridge CB22 3AT, UK
| | - Marta Sales-Pardo
- Department of Chemical Engineering, Universitat Rovira i Virgili, Tarragona 43007, Catalonia, Spain
|
38
|
Loft A, Andersen MW, Madsen JGS, Mandrup S. Analysis of Enhancers and Transcriptional Networks in Thermogenic Adipocytes. Methods Mol Biol 2022; 2448:155-175. [PMID: 35167097 DOI: 10.1007/978-1-0716-2087-8_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Transcription factor (TF) networks orchestrate the regulation of gene programs in mammalian cells, including white and brown adipocytes. In this protocol, we outline how genomics and transcriptomics data can be integrated to infer causal TFs of a given cellular response or cell type using "Integrated analysis of Motif Activity and Gene Expression changes of transcription factors" (IMAGE). Here, we show how key regulatory TFs controlling white and brown adipocyte gene programs can be predicted from chromatin accessibility and RNA-seq data. Furthermore, we demonstrate how information about target sites and target genes of the predicted key regulators can be integrated to propose testable hypotheses regarding the role and mechanisms of TFs.
Affiliation(s)
- Anne Loft
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark.
- Center for Functional Genomics and Tissue Plasticity (ATLAS), University of Southern Denmark, Odense, Denmark.
| | - Maja Worm Andersen
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark
| | - Jesper Grud Skat Madsen
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark
- Center for Functional Genomics and Tissue Plasticity (ATLAS), University of Southern Denmark, Odense, Denmark
| | - Susanne Mandrup
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark.
- Center for Functional Genomics and Tissue Plasticity (ATLAS), University of Southern Denmark, Odense, Denmark.
|
39
|
Zhang Q, Huang Z, Zuo H, Lin Y, Xiao Y, Yan Y, Cui Y, Lin C, Pei F, Chen Z, Liu H. Chromatin Accessibility Predetermines Odontoblast Terminal Differentiation. Front Cell Dev Biol 2021; 9:769193. [PMID: 34901015 PMCID: PMC8655119 DOI: 10.3389/fcell.2021.769193] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/01/2021] [Accepted: 10/29/2021] [Indexed: 12/03/2022] Open
Abstract
Embryonic development and stem cell differentiation are orchestrated by changes in sequential binding of regulatory transcription factors to their motifs. These processes are invariably accompanied by alterations in chromatin accessibility, conformation, and histone modification. Odontoblast lineage originates from cranial neural crest cells and is crucial in dentinogenesis. Our previous work revealed several transcription factors (TFs) that promote odontoblast differentiation. However, it remains elusive as to whether chromatin accessibility affects odontoblast terminal differentiation. Herein, integration of single-cell RNA-seq and bulk RNA-seq revealed that in vitro odontoblast differentiation using dental papilla cells at E18.5 was comparable to the crown odontoblast differentiation trajectory of OC (osteocalcin)-positive odontogenic lineage. Before in vitro odontoblast differentiation, ATAC-seq and H3K27Ac CUT&Tag experiments demonstrated high accessibility of chromatin regions adjacent to genes associated with odontogenic potential. However, following odontoblastic induction, regions near mineralization-related genes became accessible. Integration of RNA-seq and ATAC-seq results further revealed that the expression levels of these genes were correlated with the accessibility of nearby chromatin. Time-course ATAC-seq experiments further demonstrated that odontoblast terminal differentiation was correlated with the occupation of the basic region/leucine zipper motif (bZIP) TF family, whereby we validated the positive role of ATF5 in vitro. Collectively, this study reports a global mapping of open chromatin regulatory elements during dentinogenesis and illustrates how these regions are regulated via dynamic binding of different TF families, resulting in odontoblast terminal differentiation. The findings also shed light on understanding the genetic regulation of dentin regeneration using dental mesenchymal stem cells.
Affiliation(s)
- Qian Zhang
- The State Key Laboratory Breeding Base of Basic Science of Stomatology and Key Laboratory for Oral Biomedicine of Ministry of Education, School and Hospital of Stomatology, Wuhan University, Wuhan, China
| | - Zhen Huang
- Fujian Key Laboratory of Developmental and Neuro Biology, College of Life Science, Fujian Normal University, Fuzhou, China
| | - Huanyan Zuo
- The State Key Laboratory Breeding Base of Basic Science of Stomatology and Key Laboratory for Oral Biomedicine of Ministry of Education, School and Hospital of Stomatology, Wuhan University, Wuhan, China
| | - Yuxiu Lin
- The State Key Laboratory Breeding Base of Basic Science of Stomatology and Key Laboratory for Oral Biomedicine of Ministry of Education, School and Hospital of Stomatology, Wuhan University, Wuhan, China
| | - Yao Xiao
- The State Key Laboratory Breeding Base of Basic Science of Stomatology and Key Laboratory for Oral Biomedicine of Ministry of Education, School and Hospital of Stomatology, Wuhan University, Wuhan, China
| | - Yanan Yan
- Fujian Key Laboratory of Developmental and Neuro Biology, College of Life Science, Fujian Normal University, Fuzhou, China
| | - Yu Cui
- The State Key Laboratory Breeding Base of Basic Science of Stomatology and Key Laboratory for Oral Biomedicine of Ministry of Education, School and Hospital of Stomatology, Wuhan University, Wuhan, China
| | - Chujiao Lin
- Division of Rheumatology, Department of Medicine, University of Massachusetts Medical School, Worcester, MA, United States
| | - Fei Pei
- The State Key Laboratory Breeding Base of Basic Science of Stomatology and Key Laboratory for Oral Biomedicine of Ministry of Education, School and Hospital of Stomatology, Wuhan University, Wuhan, China
| | - Zhi Chen
- The State Key Laboratory Breeding Base of Basic Science of Stomatology and Key Laboratory for Oral Biomedicine of Ministry of Education, School and Hospital of Stomatology, Wuhan University, Wuhan, China
| | - Huan Liu
- The State Key Laboratory Breeding Base of Basic Science of Stomatology and Key Laboratory for Oral Biomedicine of Ministry of Education, School and Hospital of Stomatology, Wuhan University, Wuhan, China.,Department of Periodontology, School of Stomatology, Wuhan University, Wuhan, China
|
40
|
Interleukin-10 receptor signaling promotes the maintenance of a PD-1int TCF-1+ CD8+ T cell population that sustains anti-tumor immunity. Immunity 2021; 54:2825-2841.e10. [PMID: 34879221 DOI: 10.1016/j.immuni.2021.11.004] [Citation(s) in RCA: 62] [Impact Index Per Article: 15.5] [Received: 01/09/2021] [Revised: 03/26/2021] [Accepted: 11/09/2021] [Indexed: 12/20/2022]
Abstract
T cell exhaustion limits anti-tumor immunity and responses to immunotherapy. Here, we explored the microenvironmental signals regulating T cell exhaustion using a model of chronic lymphocytic leukemia (CLL). Single-cell analyses identified a subset of PD-1hi, functionally impaired CD8+ T cells that accumulated in secondary lymphoid organs during disease progression and a functionally competent PD-1int subset. Frequencies of PD-1int TCF-1+ CD8+ T cells decreased upon Il10rb or Stat3 deletion, leading to accumulation of PD-1hi cells and accelerated tumor progression. Mechanistically, inhibition of IL-10R signaling altered chromatin accessibility and disrupted cooperativity between the transcription factors NFAT and AP-1, promoting a distinct NFAT-associated program. Low IL10 expression or loss of IL-10R-STAT3 signaling correlated with increased frequencies of exhausted CD8+ T cells and poor survival in CLL and in breast cancer patients. Thus, balance between PD-1hi, exhausted CD8+ T cells and functional PD-1int TCF-1+ CD8+ T cells is regulated by cell-intrinsic IL-10R signaling, with implications for immunotherapy.
|
41
|
Constructing gene regulatory networks using epigenetic data. NPJ Syst Biol Appl 2021; 7:45. [PMID: 34887443 PMCID: PMC8660777 DOI: 10.1038/s41540-021-00208-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 09/30/2020] [Accepted: 11/01/2021] [Indexed: 12/24/2022] Open
Abstract
The biological processes that drive cellular function can be represented by a complex network of interactions between regulators (transcription factors) and their targets (genes). A cell's epigenetic state plays an important role in mediating these interactions, primarily by influencing chromatin accessibility. However, how to effectively use epigenetic data when constructing a gene regulatory network remains an open question. Almost all existing network reconstruction approaches focus on estimating transcription factor to gene connections using transcriptomic data. In contrast, computational approaches for analyzing epigenetic data generally focus on improving transcription factor binding site predictions rather than deducing regulatory network relationships. We bridged this gap by developing SPIDER, a network reconstruction approach that incorporates epigenetic data into a message-passing framework to estimate gene regulatory networks. We validated SPIDER's predictions using ChIP-seq data from ENCODE and found that SPIDER networks are both highly accurate and include cell-line-specific regulatory interactions. Notably, SPIDER can recover ChIP-seq verified transcription factor binding events in the regulatory regions of genes that do not have a corresponding sequence motif. The networks estimated by SPIDER have the potential to identify novel hypotheses that will allow us to better characterize cell-type and phenotype specific regulatory mechanisms.
|
42
|
Mondal A, Mishra SK, Bhattacherjee A. Kinetic origin of nucleosome invasion by pioneer transcription factors. Biophys J 2021; 120:5219-5230. [PMID: 34757077 DOI: 10.1016/j.bpj.2021.10.039] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/12/2020] [Revised: 05/14/2021] [Accepted: 10/27/2021] [Indexed: 01/25/2023] Open
Abstract
Recently, a cryo-electron microscopy study has captured different stages of nucleosome breathing dynamics that show partial unwrapping of DNA from histone core to permit transient access to the DNA sites by transcription factors. In practice, however, only a subset of transcription factors named pioneer factors can invade nucleosomes and bind to specific DNA sites to trigger essential DNA metabolic processes. We propose a discrete-state stochastic model that considers the interplay of nucleosome breathing and protein dynamics explicitly and estimate the mean time to search the target DNA sites. It is found that the molecular principle governing the search process on nucleosome is very different compared to that on naked DNA. The pioneer factors minimize their search times on nucleosomal DNA by compensating their nucleosome association rates by dissociation rates. A fine balance between the two presents a tradeoff between their nuclear mobility and error associated with the search process.
Affiliation(s)
- Anupam Mondal
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
| | - Sujeet Kumar Mishra
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India; Institute for Theoretical Physics, Heidelberg University, Heidelberg, Germany
| | - Arnab Bhattacherjee
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India.
| |
|
43
|
Geusz RJ, Wang A, Lam DK, Vinckier NK, Alysandratos KD, Roberts DA, Wang J, Kefalopoulou S, Ramirez A, Qiu Y, Chiou J, Gaulton KJ, Ren B, Kotton DN, Sander M. Sequence logic at enhancers governs a dual mechanism of endodermal organ fate induction by FOXA pioneer factors. Nat Commun 2021; 12:6636. [PMID: 34789735 PMCID: PMC8599738 DOI: 10.1038/s41467-021-26950-0] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 10/29/2020] [Accepted: 10/28/2021] [Indexed: 01/15/2023] Open
Abstract
FOXA pioneer transcription factors (TFs) associate with primed enhancers in endodermal organ precursors. Using a human stem cell model of pancreas differentiation, we here discover that only a subset of pancreatic enhancers is FOXA-primed, whereas the majority is unprimed and engages FOXA upon lineage induction. Primed enhancers are enriched for signal-dependent TF motifs and harbor abundant and strong FOXA motifs. Unprimed enhancers harbor fewer, more degenerate FOXA motifs, and FOXA recruitment to unprimed but not primed enhancers requires pancreatic TFs. Strengthening FOXA motifs at an unprimed enhancer near NKX6.1 renders FOXA recruitment pancreatic TF-independent, induces priming, and broadens the NKX6.1 expression domain. We make analogous observations about FOXA binding during hepatic and lung development. Our findings suggest a dual role for FOXA in endodermal organ development: first, FOXA facilitates signal-dependent lineage initiation via enhancer priming, and second, FOXA enforces organ cell type-specific gene expression via indirect recruitment by lineage-specific TFs.
Affiliation(s)
- Ryan J. Geusz
- Department of Pediatrics, Pediatric Diabetes Research Center, University of California, La Jolla, San Diego, CA 92093 USA; Department of Cellular & Molecular Medicine, University of California, La Jolla, San Diego, CA 92093 USA; Sanford Consortium for Regenerative Medicine, La Jolla, San Diego, CA 92093 USA; Biomedical Graduate Studies Program, University of California San Diego, La Jolla, San Diego, CA 92037 USA
| | - Allen Wang
- Department of Pediatrics, Pediatric Diabetes Research Center, University of California, La Jolla, San Diego, CA 92093 USA; Department of Cellular & Molecular Medicine, University of California, La Jolla, San Diego, CA 92093 USA; Sanford Consortium for Regenerative Medicine, La Jolla, San Diego, CA 92093 USA
| | - Dieter K. Lam
- Department of Pediatrics, Pediatric Diabetes Research Center, University of California, La Jolla, San Diego, CA 92093 USA; Department of Cellular & Molecular Medicine, University of California, La Jolla, San Diego, CA 92093 USA; Sanford Consortium for Regenerative Medicine, La Jolla, San Diego, CA 92093 USA
| | - Nicholas K. Vinckier
- Department of Pediatrics, Pediatric Diabetes Research Center, University of California, La Jolla, San Diego, CA 92093 USA; Department of Cellular & Molecular Medicine, University of California, La Jolla, San Diego, CA 92093 USA; Sanford Consortium for Regenerative Medicine, La Jolla, San Diego, CA 92093 USA
| | - Konstantinos-Dionysios Alysandratos
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118 USA; The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA 02118 USA
| | - David A. Roberts
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118 USA
| | - Jinzhao Wang
- Department of Pediatrics, Pediatric Diabetes Research Center, University of California, La Jolla, San Diego, CA 92093 USA; Department of Cellular & Molecular Medicine, University of California, La Jolla, San Diego, CA 92093 USA; Sanford Consortium for Regenerative Medicine, La Jolla, San Diego, CA 92093 USA
| | - Samy Kefalopoulou
- Department of Pediatrics, Pediatric Diabetes Research Center, University of California, La Jolla, San Diego, CA 92093 USA; Department of Cellular & Molecular Medicine, University of California, La Jolla, San Diego, CA 92093 USA; Sanford Consortium for Regenerative Medicine, La Jolla, San Diego, CA 92093 USA
| | - Araceli Ramirez
- Department of Pediatrics, Pediatric Diabetes Research Center, University of California, La Jolla, San Diego, CA 92093 USA; Department of Cellular & Molecular Medicine, University of California, La Jolla, San Diego, CA 92093 USA; Sanford Consortium for Regenerative Medicine, La Jolla, San Diego, CA 92093 USA
| | - Yunjiang Qiu
- Department of Cellular & Molecular Medicine, University of California, La Jolla, San Diego, CA 92093 USA
| | - Joshua Chiou
- Department of Pediatrics, Pediatric Diabetes Research Center, University of California, La Jolla, San Diego, CA 92093 USA; Biomedical Graduate Studies Program, University of California San Diego, La Jolla, San Diego, CA 92037 USA
| | - Kyle J. Gaulton
- Department of Pediatrics, Pediatric Diabetes Research Center, University of California, La Jolla, San Diego, CA 92093 USA
| | - Bing Ren
- Department of Cellular & Molecular Medicine, University of California, La Jolla, San Diego, CA 92093 USA; Ludwig Institute for Cancer Research, La Jolla, San Diego, CA 92093-0653 USA
| | - Darrell N. Kotton
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118 USA; The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA 02118 USA
| | - Maike Sander
- Department of Pediatrics, Pediatric Diabetes Research Center, University of California, La Jolla, San Diego, CA 92093 USA; Department of Cellular & Molecular Medicine, University of California, La Jolla, San Diego, CA 92093 USA; Sanford Consortium for Regenerative Medicine, La Jolla, San Diego, CA 92093 USA
| |
|
44
|
Wang S, He Y, Chen Z, Zhang Q. FCNGRU: Locating Transcription Factor Binding Sites by combing Fully Convolutional Neural Network with Gated Recurrent Unit. IEEE J Biomed Health Inform 2021; 26:1883-1890. [PMID: 34613923 DOI: 10.1109/jbhi.2021.3117616] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/08/2022]
Abstract
Deciphering the relationship between transcription factors (TFs) and DNA sequences is very helpful for computational inference of gene regulation and a comprehensive understanding of gene regulation mechanisms. Transcription factor binding sites (TFBSs) are short, specific DNA sequences that play a pivotal role in controlling gene expression through interaction with TF proteins. Although many computational and deep learning methods have recently been proposed to predict the sequence specificity of TF-DNA binding, there is still a lack of effective methods to directly locate TFBSs. To address this problem, we propose FCNGRU, which combines a fully convolutional neural network (FCN) with a gated recurrent unit (GRU) to directly locate TFBSs. Furthermore, we present a two-task framework (FCNGRU-double): one is a classification task at the nucleotide level, which predicts the probability of each nucleotide and locates TFBSs, and the other is a regression task at the sequence level, which predicts the intensity of each sequence. A series of experiments are conducted on 45 in vitro datasets collected from the UniPROBE database and derived from universal protein binding microarrays (uPBMs). Compared with competing methods, FCNGRU-double achieves much better results on these datasets. Moreover, FCNGRU-double has an advantage over a single-task framework, FCNGRU-single, which only contains the branch for locating TFBSs. In addition, we incorporate in vivo datasets for further analysis and discussion. The source code is available at https://github.com/wangguoguoa/FCNGRU.
|
45
|
Jang HS, Chen Y, Ge J, Wilkening AN, Hou Y, Lee HJ, Choi YR, Lowdon RF, Xing X, Li D, Kaufman CK, Johnson SL, Wang T. Epigenetic dynamics shaping melanophore and iridophore cell fate in zebrafish. Genome Biol 2021; 22:282. [PMID: 34607603 PMCID: PMC8489059 DOI: 10.1186/s13059-021-02493-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/26/2021] [Accepted: 09/09/2021] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND: Zebrafish pigment cell differentiation provides an attractive model for studying cell fate progression as a neural crest progenitor engenders diverse cell types, including two morphologically distinct pigment cells: black melanophores and reflective iridophores. Nontrivial classical genetic and transcriptomic approaches have revealed essential molecular mechanisms and gene regulatory circuits that drive neural crest-derived cell fate decisions. However, how the epigenetic landscape contributes to pigment cell differentiation, especially in the context of iridophore cell fate, is poorly understood. RESULTS: We chart the global changes in the epigenetic landscape, including DNA methylation and chromatin accessibility, during neural crest differentiation into melanophores and iridophores to identify epigenetic determinants shaping cell type-specific gene expression. Motif enrichment in the epigenetically dynamic regions reveals putative transcription factors that might be responsible for driving pigment cell identity. Through this effort, in the relatively uncharacterized iridophores, we validate alx4a as a necessary and sufficient transcription factor for iridophore differentiation and present evidence for alx4a's potential regulatory role in the guanine synthesis pathway. CONCLUSIONS: Pigment cell fate is marked by substantial DNA demethylation events coupled with dynamic chromatin accessibility to potentiate gene regulation through cis-regulatory control. Here, we provide a multi-omic resource for neural crest differentiation into melanophores and iridophores. This work led to the discovery and validation of the iridophore-specific alx4a transcription factor.
Affiliation(s)
- Hyo Sik Jang
- Department of Genetics, Washington University School of Medicine, St Louis, MO USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO USA
- Present address: Department of Epigenetics, Van Andel Institute, Grand Rapids, MI USA
| | - Yujie Chen
- Department of Genetics, Washington University School of Medicine, St Louis, MO USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO USA
| | - Jiaxin Ge
- Department of Genetics, Washington University School of Medicine, St Louis, MO USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO USA
| | - Alicia N. Wilkening
- Department of Genetics, Washington University School of Medicine, St Louis, MO USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO USA
| | - Yiran Hou
- Department of Genetics, Washington University School of Medicine, St Louis, MO USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO USA
| | - Hyung Joo Lee
- Department of Genetics, Washington University School of Medicine, St Louis, MO USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO USA
| | - You Rim Choi
- Department of Genetics, Washington University School of Medicine, St Louis, MO USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO USA
| | - Rebecca F. Lowdon
- Department of Genetics, Washington University School of Medicine, St Louis, MO USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO USA
| | - Xiaoyun Xing
- Department of Genetics, Washington University School of Medicine, St Louis, MO USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO USA
| | - Daofeng Li
- Department of Genetics, Washington University School of Medicine, St Louis, MO USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO USA
| | - Charles K. Kaufman
- Department of Medicine, Division of Medical Oncology, and Department of Developmental Biology, Washington University in Saint Louis, St. Louis, MO USA
| | - Stephen L. Johnson
- Department of Genetics, Washington University School of Medicine, St Louis, MO USA
| | - Ting Wang
- Department of Genetics, Washington University School of Medicine, St Louis, MO USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO USA
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO USA
| |
|
46
|
Jin Y, Jiang J, Wang R, Qin ZS. Systematic Evaluation of DNA Sequence Variations on in vivo Transcription Factor Binding Affinity. Front Genet 2021; 12:667866. [PMID: 34567058 PMCID: PMC8458901 DOI: 10.3389/fgene.2021.667866] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/15/2021] [Accepted: 08/02/2021] [Indexed: 02/01/2023] Open
Abstract
The majority of the single nucleotide variants (SNVs) identified by genome-wide association studies (GWAS) fall outside of the protein-coding regions. Elucidating the functional implications of these variants has been a major challenge. A possible mechanism for functional non-coding variants is that they disrupt canonical transcription factor (TF) binding sites and thereby affect the in vivo binding of the TF. However, their impact varies since many positions within a TF binding motif are not well conserved. Therefore, simply annotating all variants located in putative TF binding sites may overestimate the functional impact of these SNVs. We conducted a comprehensive survey to study the effect of SNVs on TF binding affinity. A sequence-based machine learning method was used to estimate the change in binding affinity for each SNV located inside a putative motif site. From the results obtained on 18 TF binding motifs, we found substantial variation in an SNV’s impact on TF binding affinity. We found that only about 20% of SNVs located inside putative TF binding sites are likely to have a significant impact on TF-DNA binding.
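The point that an SNV's impact depends on how conserved its motif position is can be illustrated with a short sketch. This is not the sequence-based machine learning method used in the study; it swaps in a plain position weight matrix log-odds score, and the matrix `PFM`, the uniform background, and the function names are all hypothetical.

```python
import math

BASES = "ACGT"
BACKGROUND = 0.25  # assumed uniform background frequency

# Hypothetical 4-position frequency matrix (rows: motif positions; columns: A, C, G, T).
PFM = [
    [0.97, 0.01, 0.01, 0.01],  # strongly conserved A
    [0.25, 0.25, 0.25, 0.25],  # uninformative position
    [0.01, 0.01, 0.97, 0.01],  # strongly conserved G
    [0.10, 0.40, 0.10, 0.40],  # weakly informative position
]

def log_odds_score(site):
    """Sum of per-position log2(frequency / background) for a site of motif length."""
    return sum(
        math.log2(PFM[i][BASES.index(base)] / BACKGROUND)
        for i, base in enumerate(site)
    )

def snv_delta(site, pos, alt):
    """Change in motif score when the base at index pos is replaced by alt."""
    mutated = site[:pos] + alt + site[pos + 1:]
    return log_odds_score(mutated) - log_odds_score(site)
```

Under these assumptions, `snv_delta("ATGC", 1, "C")` is zero because the position is uninformative, while `snv_delta("ATGC", 0, "C")` is strongly negative because the variant hits a conserved base.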
Affiliation(s)
- Yutong Jin
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, United States
| | - Jiahui Jiang
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, United States
| | - Ruixuan Wang
- College of Environmental Sciences and Engineering, Peking University, Beijing, China
| | - Zhaohui S Qin
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, United States
| |
|
47
|
Zhou W, Hongkai J. Genome-wide Prediction of Chromatin Accessibility Based on Gene Expression. Wiley Interdiscip Rev Comput Stat 2021; 13:e1544. [PMID: 39391743 PMCID: PMC11466374 DOI: 10.1002/wics.1544] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/19/2019] [Accepted: 11/28/2020] [Indexed: 10/12/2024]
Abstract
Decoding gene regulation in a biological system requires information from both transcriptome and regulome. While multiple high-throughput transcriptome and regulome mapping technologies are available, transcriptome profiling is more widely used. Today, over a million bulk and single-cell gene expression samples are stored in public databases. This number is orders of magnitude larger than the number of available regulome samples. Most of the gene expression samples do not have corresponding regulome data. However, it is possible to obtain regulome information via prediction. Open chromatin is a hallmark of active regulatory elements. This mini-review discusses recent advances in predicting chromatin accessibility using gene expression data, including both the development of prediction methods and their applications in expanding the regulome catalog, improving regulome analysis, integrating transcriptome and regulome data, and facilitating single-cell analysis of gene regulation.
Affiliation(s)
- Weiqiang Zhou
- Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, 615 North Wolfe Street, Baltimore, MD 21205, USA
| | - Ji Hongkai
- Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, 615 North Wolfe Street, Baltimore, MD 21205, USA
| |
|
48
|
Findley AS, Zhang X, Boye C, Lin YL, Kalita CA, Barreiro L, Lohmueller KE, Pique-Regi R, Luca F. A signature of Neanderthal introgression on molecular mechanisms of environmental responses. PLoS Genet 2021; 17:e1009493. [PMID: 34570765 PMCID: PMC8509894 DOI: 10.1371/journal.pgen.1009493] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 03/18/2021] [Revised: 10/12/2021] [Accepted: 08/18/2021] [Indexed: 12/17/2022] Open
Abstract
Ancient human migrations led to the settlement of population groups in varied environmental contexts worldwide. The extent to which adaptation to local environments has shaped human genetic diversity is a longstanding question in human evolution. Recent studies have suggested that introgression of archaic alleles in the genome of modern humans may have contributed to adaptation to environmental pressures such as pathogen exposure. Functional genomic studies have demonstrated that variation in gene expression across individuals and in response to environmental perturbations is a main mechanism underlying complex trait variation. We considered gene expression response to in vitro treatments as a molecular phenotype to identify genes and regulatory variants that may have played an important role in adaptations to local environments. We investigated if Neanderthal introgression in the human genome may contribute to the transcriptional response to environmental perturbations. To this end we used eQTLs for genes differentially expressed in a panel of 52 cellular environments, resulting from 5 cell types and 26 treatments, including hormones, vitamins, drugs, and environmental contaminants. We found that SNPs with introgressed Neanderthal alleles (N-SNPs) disrupt binding of transcription factors important for environmental responses, including ionizing radiation and hypoxia, and for glucose metabolism. We identified an enrichment for N-SNPs among eQTLs for genes differentially expressed in response to 8 treatments, including glucocorticoids, caffeine, and vitamin D. Using Massively Parallel Reporter Assays (MPRA) data, we validated the regulatory function of 21 introgressed Neanderthal variants in the human genome, corresponding to 8 eQTLs regulating 15 genes that respond to environmental perturbations. 
These findings expand the set of environments where archaic introgression may have contributed to adaptations to local environments in modern humans and provide experimental validation for the regulatory function of introgressed variants.
Affiliation(s)
- Anthony S. Findley
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, United States of America
| | - Xinjun Zhang
- Department of Ecology and Evolutionary Biology, UCLA, Los Angeles, California, United States of America
| | - Carly Boye
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, United States of America
| | - Yen Lung Lin
- Genetics Section, Department of Medicine, University of Chicago, Chicago, Illinois, United States of America
| | - Cynthia A. Kalita
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, United States of America
| | - Luis Barreiro
- Genetics Section, Department of Medicine, University of Chicago, Chicago, Illinois, United States of America
| | - Kirk E. Lohmueller
- Department of Ecology and Evolutionary Biology, UCLA, Los Angeles, California, United States of America
- Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, California, United States of America
| | - Roger Pique-Regi
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, United States of America
- Department of Obstetrics and Gynecology, Wayne State University, Detroit, Michigan, United States of America
| | - Francesca Luca
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, United States of America
- Department of Obstetrics and Gynecology, Wayne State University, Detroit, Michigan, United States of America
| |
|
49
|
Abstract
Chromatin accessibility is directly linked with transcription in eukaryotes. Accessible regions associated with regulatory proteins are highly sensitive to DNase I digestion and are termed DNase I hypersensitive sites (DHSs). DHSs can be identified by DNase I digestion, followed by high-throughput DNA sequencing (DNase-seq). The single-base-pair resolution digestion patterns from DNase-seq allow the identification of transcription factor (TF) footprints of local DNA protection that predict TF-DNA binding. The identification of differential footprinting between two conditions allows mapping of relevant TF regulatory interactions. Here, we provide step-by-step instructions to build gene regulatory networks from DNase-seq data. Our pipeline includes steps for DHS calling, identification of differential TF footprints between treatment and control conditions, and construction of gene regulatory networks. Even though the data we used in this example were obtained from Arabidopsis thaliana, the workflow developed in this guide can be adapted to work with DNase-seq data from any organism with a sequenced genome.
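The last step of such a workflow, turning differential footprints into a regulatory network, can be sketched as follows. This is an illustrative assumption rather than the published pipeline: the record layout, the TF and gene names, the footprint scores, and the `min_change` threshold are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (TF, target gene, footprint score in control, score in treatment).
FOOTPRINTS = [
    ("TF_A", "gene1", 0.20, 0.90),
    ("TF_A", "gene2", 0.50, 0.55),
    ("TF_B", "gene1", 0.80, 0.10),
]

def build_network(footprints, min_change=0.3):
    """Keep TF -> gene edges whose footprint score changes by at least min_change."""
    network = defaultdict(dict)
    for tf, gene, control, treatment in footprints:
        change = treatment - control
        if abs(change) >= min_change:
            # Label the edge by the direction of the footprint change.
            network[tf][gene] = "gained" if change > 0 else "lost"
    return dict(network)
```

With the example records, only the TF_A to gene1 edge (gained) and the TF_B to gene1 edge (lost) survive the threshold; a real analysis would use a statistical test for differential footprinting instead of a fixed cutoff.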
|
50
|
Yao Q, Ferragina P, Reshef Y, Lettre G, Bauer DE, Pinello L. Motif-Raptor: a cell type-specific and transcription factor centric approach for post-GWAS prioritization of causal regulators. Bioinformatics 2021; 37:2103-2111. [PMID: 33532840 PMCID: PMC11025460 DOI: 10.1093/bioinformatics/btab072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2020] [Revised: 11/30/2020] [Accepted: 01/28/2021] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Genome-wide association studies (GWASs) have identified thousands of common trait-associated genetic variants, but interpretation of their function remains challenging. These genetic variants can overlap the binding sites of transcription factors (TFs) and therefore could alter gene expression. However, we currently lack a systematic understanding of how this mechanism contributes to phenotype. RESULTS: We present Motif-Raptor, a TF-centric computational tool that integrates sequence-based predictive models, chromatin accessibility, gene expression datasets and GWAS summary statistics to systematically investigate how TF function is affected by genetic variants. Given trait-associated non-coding variants, Motif-Raptor can recover relevant cell types and critical TFs to drive hypotheses regarding their mechanism of action. We tested Motif-Raptor on complex traits such as rheumatoid arthritis and red blood cell count and demonstrated its ability to prioritize relevant cell types, potential regulatory TFs and non-coding SNPs that have been previously characterized and validated. AVAILABILITY AND IMPLEMENTATION: Motif-Raptor is freely available as a Python package at: https://github.com/pinellolab/MotifRaptor. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Qiuming Yao
- Department of Pathology, Massachusetts General Hospital, Charlestown, MA 02129, USA
- Division of Hematology/Oncology, Boston Children’s Hospital, Boston, MA 02115, USA
- Harvard Medical School, Boston, MA 02115, USA
| | - Paolo Ferragina
- Department of Computer Science, University of Pisa, Pisa 56128, Italy
| | - Yakir Reshef
- Department of Computer Science, Harvard University, Cambridge, MA 02138, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Guillaume Lettre
- Faculty of Medicine, Université de Montréal, Montreal, Quebec H3C3J7, Canada
- Montreal Heart Institute, Montreal, Quebec H1T1C8, Canada
| | - Daniel E Bauer
- Division of Hematology/Oncology, Boston Children’s Hospital, Boston, MA 02115, USA
- Harvard Medical School, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02115, USA
| | - Luca Pinello
- Department of Pathology, Massachusetts General Hospital, Charlestown, MA 02129, USA
- Harvard Medical School, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| |
|