Reference Citation Analysis

For: Worsley Hunt R, Mathelier A, Del Peso L, Wasserman WW. Improving analysis of transcription factor binding sites within ChIP-Seq data based on topological motif enrichment. BMC Genomics 2014;15:472. [PMID: 24927817 PMCID: PMC4082612 DOI: 10.1186/1471-2164-15-472] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 12/20/2013] [Accepted: 05/20/2014] [Indexed: 11/10/2022]

Cited by Other Article(s)

Sundaram L, Kumar A, Zatzman M, Salcedo A, Ravindra N, Shams S, Louie BH, Bagdatli ST, Myers MA, Sarmashghi S, Choi HY, Choi WY, Yost KE, Zhao Y, Granja JM, Hinoue T, Hayes DN, Cherniack A, Felau I, Choudhry H, Zenklusen JC, Farh KKH, McPherson A, Curtis C, Laird PW, Demchok JA, Yang L, Tarnuzzer R, Caesar-Johnson SJ, Wang Z, Doane AS, Khurana E, Castro MAA, Lazar AJ, Broom BM, Weinstein JN, Akbani R, Kumar SV, Raphael BJ, Wong CK, Stuart JM, Safavi R, Benz CC, Johnson BK, Kyi C, Shen H, Corces MR, Chang HY, Greenleaf WJ. Single-cell chromatin accessibility reveals malignant regulatory programs in primary human cancers. Science 2024;385:eadk9217. [PMID: 39236169 DOI: 10.1126/science.adk9217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Accepted: 07/03/2024] [Indexed: 09/07/2024]

Affiliation(s)

Laksshman Sundaram Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA Department of Computer Science, Stanford University, Stanford, CA, USA Illumina AI laboratory, Illumina Inc, Foster City, CA, USA NVIDIA Bio Research, NVIDIA, Santa Clara, CA, USA
Arvind Kumar Illumina AI laboratory, Illumina Inc, Foster City, CA, USA
Matthew Zatzman Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
Adriana Salcedo Illumina AI laboratory, Illumina Inc, Foster City, CA, USA
Neal Ravindra Illumina AI laboratory, Illumina Inc, Foster City, CA, USA
Shadi Shams Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA Department of Dermatology, Stanford University School of Medicine, Stanford, CA, USA Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA 94305, USA
Bryan H Louie Department of Dermatology, Stanford University School of Medicine, Stanford, CA, USA Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA 94305, USA
S Tansu Bagdatli Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA Department of Dermatology, Stanford University School of Medicine, Stanford, CA, USA Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA 94305, USA
Matthew A Myers Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
Shahab Sarmashghi Broad Institute of MIT and Harvard, Cambridge, MA, USA
Hyo Young Choi Department of Preventive Medicine, University of Tennessee Health Science Center, Memphis, TN, USA Department of Medicine, University of Tennessee Health Science Center, Memphis, TN, USA
Won-Young Choi UTHSC Center for Cancer Research, University of Tennessee Health Science Center, Memphis, TN, USA
Kathryn E Yost Department of Dermatology, Stanford University School of Medicine, Stanford, CA, USA Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA 94305, USA
Yanding Zhao Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA Department of Dermatology, Stanford University School of Medicine, Stanford, CA, USA Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA 94305, USA
Jeffrey M Granja Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
Toshinori Hinoue Center for Epigenetics, Van Andel Institute, Grand Rapids, MI 49503, USA
D Neil Hayes Department of Preventive Medicine, University of Tennessee Health Science Center, Memphis, TN, USA Department of Medicine, University of Tennessee Health Science Center, Memphis, TN, USA UTHSC Center for Cancer Research, University of Tennessee Health Science Center, Memphis, TN, USA
Andrew Cherniack Broad Institute of MIT and Harvard, Cambridge, MA, USA
Ina Felau National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
Hani Choudhry Department of Biochemistry, Faculty of Science, Cancer and Mutagenesis Unit, King Fahd Center for Medical Research, King Abdulaziz University, Jeddah, Saudi Arabia
Jean C Zenklusen National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
Kyle Kai-How Farh Illumina AI laboratory, Illumina Inc, Foster City, CA, USA
Andrew McPherson Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
Christina Curtis Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA 94305, USA Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA Chan Zuckerberg Biohub, San Francisco, CA, USA
Peter W Laird Center for Epigenetics, Van Andel Institute, Grand Rapids, MI 49503, USA
John A Demchok Center for Cancer Genomics, National Cancer Institute, Bethesda, MD 20892, USA
Liming Yang Center for Cancer Genomics, National Cancer Institute, Bethesda, MD 20892, USA
Roy Tarnuzzer Center for Cancer Genomics, National Cancer Institute, Bethesda, MD 20892, USA
Samantha J Caesar-Johnson Center for Cancer Genomics, National Cancer Institute, Bethesda, MD 20892, USA
Zhining Wang Center for Biomedical Informatics and Information Technology, National Cancer Institute, NIH, 9609 Medical Center Drive, Rockville, MD 20850, USA
Ashley S Doane Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10065, USA
Ekta Khurana Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10065, USA Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, New York, NY 10065, USA Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10065, USA Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10065, USA
Mauro A A Castro Bioinformatics and Systems Biology Laboratory, Federal University of Paraná, Curitiba 81520-260, Brazil
Alexander J Lazar Departments of Pathology & Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
Bradley M Broom Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
John N Weinstein Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA Department of Systems Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030
Rehan Akbani Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
Shwetha V Kumar Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
Benjamin J Raphael Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08540
Christopher K Wong Biomolecular Engineering Department, School of Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
Joshua M Stuart Biomolecular Engineering Department, School of Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
Rojin Safavi Biomolecular Engineering Department, School of Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
Christopher C Benz Buck Institute for Research on Aging, Novato, CA 94945, USA
Benjamin K Johnson Center for Epigenetics, Van Andel Institute, Grand Rapids, MI 49503, USA
Cindy Kyi Center for Cancer Genomics, National Cancer Institute, Bethesda, MD 20892, USA
Hui Shen Center for Epigenetics, Van Andel Institute, Grand Rapids, MI 49503, USA
M Ryan Corces Department of Dermatology, Stanford University School of Medicine, Stanford, CA, USA Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA Department of Neurology, University of California San Francisco, San Francisco, CA, USA
Howard Y Chang Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA Department of Dermatology, Stanford University School of Medicine, Stanford, CA, USA Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA 94305, USA Howard Hughes Medical Institute, Stanford University, School of Medicine, Stanford, CA, USA
William J Greenleaf Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA 94305, USA Department of Applied Physics, Stanford University, Stanford, CA, USA

Raditsa V, Tsukanov A, Bogomolov A, Levitsky V. Genomic background sequences systematically outperform synthetic ones in de novo motif discovery for ChIP-seq data. NAR Genom Bioinform 2024;6:lqae090. [PMID: 39071850 PMCID: PMC11282361 DOI: 10.1093/nargab/lqae090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Revised: 06/03/2024] [Accepted: 07/19/2024] [Indexed: 07/30/2024]

Ramamurthy E, Agarwal S, Toong N, Sestili H, Kaplow IM, Chen Z, Phan B, Pfenning AR. Regression convolutional neural network models implicate peripheral immune regulatory variants in the predisposition to Alzheimer's disease. PLoS Comput Biol 2024;20:e1012356. [PMID: 39186798 PMCID: PMC11389932 DOI: 10.1371/journal.pcbi.1012356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2023] [Revised: 09/11/2024] [Accepted: 07/23/2024] [Indexed: 08/28/2024]

Romero R, Menichelli C, Vroland C, Marin JM, Lèbre S, Lecellier CH, Bréhélin L. TFscope: systematic analysis of the sequence features involved in the binding preferences of transcription factors. Genome Biol 2024;25:187. [PMID: 38987807 DOI: 10.1186/s13059-024-03321-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Accepted: 06/24/2024] [Indexed: 07/12/2024]

Ang DA, Carter JM, Deka K, Tan JHL, Zhou J, Chen Q, Chng WJ, Harmston N, Li Y. Aberrant non-canonical NF-κB signalling reprograms the epigenome landscape to drive oncogenic transcriptomes in multiple myeloma. Nat Commun 2024;15:2513. [PMID: 38514625 PMCID: PMC10957915 DOI: 10.1038/s41467-024-46728-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Accepted: 03/07/2024] [Indexed: 03/23/2024]

Affiliation(s)

Daniel A Ang School of Biological Sciences (SBS), Nanyang Technological University (NTU), 60 Nanyang Drive, Singapore, 637551, Singapore
Jean-Michel Carter School of Biological Sciences (SBS), Nanyang Technological University (NTU), 60 Nanyang Drive, Singapore, 637551, Singapore
Kamalakshi Deka School of Biological Sciences (SBS), Nanyang Technological University (NTU), 60 Nanyang Drive, Singapore, 637551, Singapore
Joel H L Tan Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research (A*STAR), 61 Biopolis Drive, Proteos, Singapore, 138673, Singapore
Jianbiao Zhou Cancer Science Institute of Singapore, National University of Singapore, 14 Medical Drive, Centre for Translational Medicine, Singapore, 117599, Republic of Singapore Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 117597, Republic of Singapore NUS Centre for Cancer Research, 14 Medical Drive, Centre for Translational Medicine, Singapore, 117599, Singapore
Qingfeng Chen Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research (A*STAR), 61 Biopolis Drive, Proteos, Singapore, 138673, Singapore
Wee Joo Chng Cancer Science Institute of Singapore, National University of Singapore, 14 Medical Drive, Centre for Translational Medicine, Singapore, 117599, Republic of Singapore Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 117597, Republic of Singapore NUS Centre for Cancer Research, 14 Medical Drive, Centre for Translational Medicine, Singapore, 117599, Singapore Department of Hematology-Oncology, National University Cancer Institute of Singapore (NCIS), The National University Health System (NUHS), 1E, Kent Ridge Road, Singapore, 119228, Republic of Singapore
Nathan Harmston Division of Science, Yale-NUS College, Singapore, 138527, Singapore Program in Cancer and Stem Cell Biology, Duke-NUS Medical School, Singapore, 169857, Singapore Molecular Biosciences Division, Cardiff School of Biosciences, Cardiff University, Cardiff, CF10 3AX, UK
Yinghui Li School of Biological Sciences (SBS), Nanyang Technological University (NTU), 60 Nanyang Drive, Singapore, 637551, Singapore. Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research (A*STAR), 61 Biopolis Drive, Proteos, Singapore, 138673, Singapore.

Vishnevsky OV, Bocharnikov AV, Ignatieva EV. Peak Scores Significantly Depend on the Relationships between Contextual Signals in ChIP-Seq Peaks. Int J Mol Sci 2024;25:1011. [PMID: 38256085 PMCID: PMC10816497 DOI: 10.3390/ijms25021011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 12/13/2023] [Accepted: 01/09/2024] [Indexed: 01/24/2024]

Kaplow IM, Lawler AJ, Schäffer DE, Srinivasan C, Sestili HH, Wirthlin ME, Phan BN, Prasad K, Brown AR, Zhang X, Foley K, Genereux DP, Karlsson EK, Lindblad-Toh K, Meyer WK, Pfenning AR. Relating enhancer genetic variation across mammals to complex phenotypes using machine learning. Science 2023;380:eabm7993. [PMID: 37104615 PMCID: PMC10322212 DOI: 10.1126/science.abm7993] [Citation(s) in RCA: 22] [Impact Index Per Article: 22.0] [Received: 10/12/2021] [Accepted: 02/23/2023] [Indexed: 04/29/2023]

Affiliation(s)

Irene M. Kaplow Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
Alyssa J. Lawler Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA Department of Biology, Carnegie Mellon University, Pittsburgh, PA, USA
Daniel E. Schäffer Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
Chaitanya Srinivasan Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
Heather H. Sestili Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
Morgan E. Wirthlin Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
BaDoi N. Phan Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA Medical Scientist Training Program, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
Kavya Prasad Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
Ashley R. Brown Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
Xiaomeng Zhang Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
Kathleen Foley Department of Biological Sciences, Lehigh University, Bethlehem, PA, USA
Diane P. Genereux Broad Institute, Cambridge, MA, USA Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
Zoonomia Consortium
Elinor K. Karlsson Broad Institute, Cambridge, MA, USA Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
Kerstin Lindblad-Toh Broad Institute, Cambridge, MA, USA Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
Wynn K. Meyer Department of Biological Sciences, Lehigh University, Bethlehem, PA, USA
Andreas R. Pfenning Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA Department of Biology, Carnegie Mellon University, Pittsburgh, PA, USA

Saha S, Spinelli L, Castro Mondragon JA, Kervadec A, Lynott M, Kremmer L, Roder L, Krifa S, Torres M, Brun C, Vogler G, Bodmer R, Colas AR, Ocorr K, Perrin L. Genetic architecture of natural variation of cardiac performance from flies to humans. eLife 2022;11:82459. [DOI: 10.7554/elife.82459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2022] [Accepted: 10/25/2022] [Indexed: 11/17/2022]

Tsukanov AV, Mironova VV, Levitsky VG. Motif models proposing independent and interdependent impacts of nucleotides are related to high and low affinity transcription factor binding sites in Arabidopsis. Front Plant Sci 2022;13:938545. [PMID: 35968123 PMCID: PMC9373801 DOI: 10.3389/fpls.2022.938545] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/07/2022] [Accepted: 07/05/2022] [Indexed: 05/15/2023]

Abstract

Position weight matrix (PWM) is the traditional motif model representing the transcription factor (TF) binding sites. It proposes that the positions contribute independently to TFs binding affinity, although this hypothesis does not fit the data perfectly. This explains why PWM hits are missing in a substantial fraction of ChIP-seq peaks. To study various modes of the direct binding of plant TFs, we compiled the benchmark collection of 111 ChIP-seq datasets for Arabidopsis thaliana, and applied the traditional PWM, and two alternative motif models BaMM and SiteGA, proposing the dependencies of the positions. The variation in the stringency of the recognition thresholds for the models proposed that the hits of PWM, BaMM, and SiteGA models are associated with the sites of high/medium, any, and low affinity, respectively. At the medium recognition threshold, about 60% of ChIP-seq peaks contain PWM hits consisting of conserved core consensuses, while BaMM and SiteGA provide hits for an additional 15% of peaks in which a weaker core consensus is compensated through intra-motif dependencies. The presence/absence of these dependencies in the motifs of alternative/traditional models was confirmed by the dependency logo DepLogo visualizing the position-wise partitioning of the alignments of predicted sites. We exemplify the detailed analysis of ChIP-seq profiles for plant TFs CCA1, MYC2, and SEP3. Gene ontology (GO) enrichment analysis revealed that among the three motif models, the SiteGA had the highest portions of genes with the significantly enriched GO terms among all predicted genes. We showed that both alternative motif models provide for traditional PWM greater extensions in predicted sites for TFs MYC2/SEP3 with condition/tissue specific functions, compared to those for TF CCA1 with housekeeping functions. 
Overall, the combined application of standard and alternative motif models is beneficial to detect various modes of the direct TF-DNA interactions in the maximal portion of ChIP-seq loci.
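
The independence assumption that the abstract attributes to the PWM can be illustrated with a minimal sketch: the score of a candidate site is simply the sum of per-position log-odds terms, so no term depends on its neighbours. The count matrix, uniform background, pseudocount, and function names below are hypothetical values chosen for illustration. They are not taken from the cited study, and the sketch does not implement BaMM or SiteGA.

```python
import math

# Hypothetical count matrix for a 4-bp motif (one dict per position).
# These numbers are illustrative only, not data from the cited study.
COUNTS = [
    {"A": 8, "C": 1, "G": 0, "T": 1},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 1, "T": 8},
]
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}  # assumed uniform
PSEUDOCOUNT = 1.0  # assumed; avoids log(0) for unseen bases


def build_pwm(counts, background, pseudocount):
    """Convert per-position counts into log2-odds weights."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + pseudocount * len(background)
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background[base])
            for base in background
        })
    return pwm


def score_site(pwm, site):
    """Sum independent per-position weights: the PWM assumption."""
    if len(site) != len(pwm):
        raise ValueError("site length must equal motif width")
    return sum(column[base] for column, base in zip(pwm, site))


def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in sequence."""
    width = len(pwm)
    return max(
        (score_site(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


PWM = build_pwm(COUNTS, BACKGROUND, PSEUDOCOUNT)
score, start = best_hit(PWM, "TTACGTAA")
print(f"best window starts at {start} with score {score:.2f}")
```

Because each position is scored in isolation, a site with a weak core cannot be rescued by favourable combinations of nucleotides at other positions, which is the gap the dependency-aware models in the abstract are meant to address.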

Srinivasan C, Phan BN, Lawler AJ, Ramamurthy E, Kleyman M, Brown AR, Kaplow IM, Wirthlin ME, Pfenning AR. Addiction-Associated Genetic Variants Implicate Brain Cell Type- and Region-Specific Cis-Regulatory Elements in Addiction Neurobiology. J Neurosci 2021;41:9008-9030. [PMID: 34462306 PMCID: PMC8549541 DOI: 10.1523/jneurosci.2534-20.2021] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/28/2020] [Revised: 06/18/2021] [Accepted: 07/10/2021] [Indexed: 12/14/2022]

Abstract

Recent large genome-wide association studies have identified multiple confident risk loci linked to addiction-associated behavioral traits. Most genetic variants linked to addiction-associated traits lie in noncoding regions of the genome, likely disrupting cis-regulatory element (CRE) function. CREs tend to be highly cell type-specific and may contribute to the functional development of the neural circuits underlying addiction. Yet, a systematic approach for predicting the impact of risk variants on the CREs of specific cell populations is lacking. To dissect the cell types and brain regions underlying addiction-associated traits, we applied stratified linkage disequilibrium score regression to compare genome-wide association studies to genomic regions collected from human and mouse assays for open chromatin, which is associated with CRE activity. We found enrichment of addiction-associated variants in putative CREs marked by open chromatin in neuronal (NeuN+) nuclei collected from multiple prefrontal cortical areas and striatal regions known to play major roles in reward and addiction. To further dissect the cell type-specific basis of addiction-associated traits, we also identified enrichments in human orthologs of open chromatin regions of female and male mouse neuronal subtypes: cortical excitatory, D1, D2, and PV. Last, we developed machine learning models to predict mouse cell type-specific open chromatin, enabling us to further categorize human NeuN+ open chromatin regions into cortical excitatory or striatal D1 and D2 neurons and predict the functional impact of addiction-associated genetic variants. 
Our results suggest that different neuronal subtypes within the reward system play distinct roles in the variety of traits that contribute to addiction.

SIGNIFICANCE STATEMENT

We combine statistical genetic and machine learning techniques to find that the predisposition for nicotine, alcohol, and cannabis use behaviors can be partially explained by genetic variants in conserved regulatory elements within specific brain regions and neuronal subtypes of the reward system. Our computational framework can flexibly integrate open chromatin data across species to screen for putative causal variants in a cell type- and tissue-specific manner for numerous complex traits.

Novakovsky G, Saraswat M, Fornes O, Mostafavi S, Wasserman WW. Biologically relevant transfer learning improves transcription factor binding prediction. Genome Biol 2021;22:280. [PMID: 34579793 PMCID: PMC8474956 DOI: 10.1186/s13059-021-02499-5] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 12/21/2020] [Accepted: 09/15/2021] [Indexed: 12/27/2022]

Khan A, Riudavets Puig R, Boddie P, Mathelier A. BiasAway: command-line and web server to generate nucleotide composition-matched DNA background sequences. Bioinformatics 2021;37:1607-1609. [PMID: 33135764 PMCID: PMC8275979 DOI: 10.1093/bioinformatics/btaa928] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/17/2020] [Revised: 10/11/2020] [Accepted: 10/19/2020] [Indexed: 12/20/2022]

Puig RR, Boddie P, Khan A, Castro-Mondragon JA, Mathelier A. UniBind: maps of high-confidence direct TF-DNA interactions across nine species. BMC Genomics 2021;22:482. [PMID: 34174819 PMCID: PMC8236138 DOI: 10.1186/s12864-021-07760-6] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 04/29/2021] [Accepted: 05/27/2021] [Indexed: 12/17/2022]

Fagny M, Kuijjer ML, Stam M, Joets J, Turc O, Rozière J, Pateyron S, Venon A, Vitte C. Identification of Key Tissue-Specific, Biological Processes by Integrating Enhancer Information in Maize Gene Regulatory Networks. Front Genet 2021;11:606285. [PMID: 33505431 PMCID: PMC7834273 DOI: 10.3389/fgene.2020.606285] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/17/2020] [Accepted: 12/03/2020] [Indexed: 12/27/2022]

Abstract

Enhancers are key players in the spatio-temporal coordination of gene expression during numerous crucial processes, including tissue differentiation across development. Characterizing the transcription factors (TFs) and genes they connect, and the molecular functions underpinned is important to better characterize developmental processes. In plants, the recent molecular characterization of enhancers revealed their capacity to activate the expression of several target genes. Nevertheless, identifying these target genes at a genome-wide level is challenging, particularly for large-genome species, where enhancers and target genes can be hundreds of kilobases away. Therefore, the contribution of enhancers to plant regulatory networks remains poorly understood. Here, we investigate the enhancer-driven regulatory network of two maize tissues at different stages: leaves at seedling stage (V2-IST) and husks (bracts) at flowering. Using systems biology, we integrate genomic, epigenomic, and transcriptomic data to model the regulatory relationships between TFs and their potential target genes, and identify regulatory modules specific to husk and V2-IST. We show that leaves at the V2-IST stage are characterized by the response to hormones and macromolecules biogenesis and assembly, which are regulated by the BBR/BPC and AP2/ERF TF families, respectively. In contrast, husks are characterized by cell wall modification and response to abiotic stresses, which are, respectively, orchestrated by the C2C2/DOF and AP2/EREB families. Analysis of the corresponding enhancer sequences reveals that two different transposable element families (TIR transposon Mutator and MITE Pif/Harbinger) have shaped part of the regulatory network in each tissue, and that MITEs have provided potential new TF binding sites involved in husk tissue-specificity.

Delos Santos NP, Texari L, Benner C. MEIRLOP: improving score-based motif enrichment by incorporating sequence bias covariates. BMC Bioinformatics 2020;21:410. [PMID: 32938397 PMCID: PMC7493370 DOI: 10.1186/s12859-020-03739-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/25/2020] [Accepted: 09/04/2020] [Indexed: 12/23/2022]

Abstract

BACKGROUND

Motif enrichment analysis (MEA) identifies over-represented transcription factor binding (TF) motifs in the DNA sequence of regulatory regions, enabling researchers to infer which transcription factors can regulate transcriptional response to a stimulus, or identify sequence features found near a target protein in a ChIP-seq experiment. Score-based MEA determines motifs enriched in regions exhibiting extreme differences in regulatory activity, but existing methods do not control for biases in GC content or dinucleotide composition. This lack of control for sequence bias, such as those often found in CpG islands, can obscure the enrichment of biologically relevant motifs.

RESULTS

We developed Motif Enrichment In Ranked Lists of Peaks (MEIRLOP), a novel MEA method that determines enrichment of TF binding motifs in a list of scored regulatory regions, while controlling for sequence bias. In this study, we compare MEIRLOP against other MEA methods in identifying binding motifs found enriched in differentially active regulatory regions after interferon-beta stimulus, finding that using logistic regression and covariates improves the ability to call enrichment of ISGF3 binding motifs from differential acetylation ChIP-seq data compared to other methods. Our method achieves similar or better performance compared to other methods when quantifying the enrichment of TF binding motifs from ENCODE TF ChIP-seq datasets. We also demonstrate how MEIRLOP is broadly applicable to the analysis of numerous types of NGS assays and experimental designs.

CONCLUSIONS

Our results demonstrate the importance of controlling for sequence bias when accurately identifying enriched DNA sequence motifs using score-based MEA. MEIRLOP is available for download from https://github.com/npdeloss/meirlop under the MIT license.

Partridge EC, Chhetri SB, Prokop JW, Ramaker RC, Jansen CS, Goh ST, Mackiewicz M, Newberry KM, Brandsmeier LA, Meadows SK, Messer CL, Hardigan AA, Coppola CJ, Dean EC, Jiang S, Savic D, Mortazavi A, Wold BJ, Myers RM, Mendenhall EM. Occupancy maps of 208 chromatin-associated proteins in one human cell type. Nature 2020;583:720-728. [PMID: 32728244 PMCID: PMC7398277 DOI: 10.1038/s41586-020-2023-4] [Citation(s) in RCA: 73] [Impact Index Per Article: 18.3] [Received: 10/04/2017] [Accepted: 01/09/2020] [Indexed: 01/02/2023]

Abstract

Transcription factors are DNA-binding proteins that have key roles in gene regulation^1,2. Genome-wide occupancy maps of transcriptional regulators are important for understanding gene regulation and its effects on diverse biological processes^3–6. However, only a minority of the more than 1,600 transcription factors encoded in the human genome has been assayed. Here we present, as part of the ENCODE (Encyclopedia of DNA Elements) project, data and analyses from chromatin immunoprecipitation followed by high-throughput sequencing (ChIP–seq) experiments using the human HepG2 cell line for 208 chromatin-associated proteins (CAPs). These comprise 171 transcription factors and 37 transcriptional cofactors and chromatin regulator proteins, and represent nearly one-quarter of CAPs expressed in HepG2 cells. The binding profiles of these CAPs form major groups associated predominantly with promoters or enhancers, or with both. We confirm and expand the current catalogue of DNA sequence motifs for transcription factors, and describe motifs that correspond to other transcription factors that are co-enriched with the primary ChIP target. For example, FOX family motifs are enriched in ChIP–seq peaks of 37 other CAPs. We show that motif content and occupancy patterns can distinguish between promoters and enhancers. This catalogue reveals high-occupancy target regions at which many CAPs associate, although each contains motifs for only a minority of the numerous associated transcription factors. These analyses provide a more complete overview of the gene regulatory networks that define this cell type, and demonstrate the usefulness of the large-scale production efforts of the ENCODE Consortium.

ChIP–seq and CETCh–seq data are used to analyse binding maps for 208 transcription factors and other chromatin-associated proteins in a single human cell type, providing a comprehensive catalogue of the transcription factor landscape and gene regulatory networks in these cells.

Affiliation(s)

E Christopher Partridge HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
Surya B Chhetri HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.,Department of Biological Sciences, The University of Alabama in Huntsville, Huntsville, AL, USA.,Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
Jeremy W Prokop HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.,Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI, USA
Ryne C Ramaker HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.,Department of Genetics, University of Alabama at Birmingham, Birmingham, AL, USA
Camden S Jansen Department of Developmental and Cell Biology, University of California Irvine, Irvine, CA, USA
Say-Tar Goh Division of Biology, California Institute of Technology, Pasadena, CA, USA
Mark Mackiewicz HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
Kimberly M Newberry HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
Laurel A Brandsmeier HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
Sarah K Meadows HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
C Luke Messer HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
Andrew A Hardigan HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.,Department of Genetics, University of Alabama at Birmingham, Birmingham, AL, USA
Candice J Coppola Department of Biological Sciences, The University of Alabama in Huntsville, Huntsville, AL, USA
Emma C Dean HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.,Department of Pathology, University of Alabama at Birmingham, Birmingham, AL, USA
Shan Jiang Department of Developmental and Cell Biology, University of California Irvine, Irvine, CA, USA
Daniel Savic Pharmaceutical Sciences Department, St Jude Children's Research Hospital, Memphis, TN, USA
Ali Mortazavi Department of Developmental and Cell Biology, University of California Irvine, Irvine, CA, USA
Barbara J Wold Division of Biology, California Institute of Technology, Pasadena, CA, USA
Richard M Myers HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.
Eric M Mendenhall HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.,Department of Biological Sciences, The University of Alabama in Huntsville, Huntsville, AL, USA.

Ibarra IL, Hollmann NM, Klaus B, Augsten S, Velten B, Hennig J, Zaugg JB. Mechanistic insights into transcription factor cooperativity and its impact on protein-phenotype interactions. Nat Commun 2020;11:124. [PMID: 31913281 PMCID: PMC6949242 DOI: 10.1038/s41467-019-13888-7] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 05/16/2019] [Accepted: 11/28/2019] [Indexed: 11/25/2022] Open

Villanueva-Cañas JL, Horvath V, Aguilera L, González J. Diverse families of transposable elements affect the transcriptional regulation of stress-response genes in Drosophila melanogaster. Nucleic Acids Res 2020;47:6842-6857. [PMID: 31175824 PMCID: PMC6649756 DOI: 10.1093/nar/gkz490] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 12/19/2018] [Revised: 05/20/2019] [Accepted: 05/22/2019] [Indexed: 12/25/2022] Open

Müller AU, Imkamp F, Weber-Ban E. The Mycobacterial LexA/RecA-Independent DNA Damage Response Is Controlled by PafBC and the Pup-Proteasome System. Cell Rep 2019;23:3551-3564. [PMID: 29924998 DOI: 10.1016/j.celrep.2018.05.073] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 03/09/2018] [Revised: 04/16/2018] [Accepted: 05/22/2018] [Indexed: 12/11/2022] Open

Berest I, Arnold C, Reyes-Palomares A, Palla G, Rasmussen KD, Giles H, Bruch PM, Huber W, Dietrich S, Helin K, Zaugg JB. Quantification of Differential Transcription Factor Activity and Multiomics-Based Classification into Activators and Repressors: diffTF. Cell Rep 2019;29:3147-3159.e12. [DOI: 10.1016/j.celrep.2019.10.106] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 12/01/2018] [Revised: 09/20/2019] [Accepted: 10/28/2019] [Indexed: 12/26/2022] Open

Gheorghe M, Sandve GK, Khan A, Chèneby J, Ballester B, Mathelier A. A map of direct TF-DNA interactions in the human genome. Nucleic Acids Res 2019;47:e21. [PMID: 30517703 PMCID: PMC6393237 DOI: 10.1093/nar/gky1210] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 08/18/2018] [Revised: 10/31/2018] [Accepted: 11/20/2018] [Indexed: 12/11/2022] Open

Youn A, Marquez EJ, Lawlor N, Stitzel ML, Ucar D. BiFET: sequencing Bias-free transcription factor Footprint Enrichment Test. Nucleic Acids Res 2019;47:e11. [PMID: 30428075 PMCID: PMC6344870 DOI: 10.1093/nar/gky1117] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/19/2018] [Accepted: 10/23/2018] [Indexed: 01/15/2023] Open

Lecellier CH, Wasserman WW, Mathelier A. Human Enhancers Harboring Specific Sequence Composition, Activity, and Genome Organization Are Linked to the Immune Response. Genetics 2018;209:1055-1071. [PMID: 29871881 PMCID: PMC6063234 DOI: 10.1534/genetics.118.301116] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 12/18/2018] [Accepted: 06/01/2018] [Indexed: 12/15/2022] Open

Wang M, Tai C, E W, Wei L. DeFine: deep convolutional neural networks accurately quantify intensities of transcription factor-DNA binding and facilitate evaluation of functional non-coding variants. Nucleic Acids Res 2018;46:e69. [PMID: 29617928 PMCID: PMC6009584 DOI: 10.1093/nar/gky215] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 07/03/2017] [Revised: 03/12/2018] [Accepted: 03/14/2018] [Indexed: 01/19/2023] Open

Wyler E, Menegatti J, Franke V, Kocks C, Boltengagen A, Hennig T, Theil K, Rutkowski A, Ferrai C, Baer L, Kermas L, Friedel C, Rajewsky N, Akalin A, Dölken L, Grässer F, Landthaler M. Widespread activation of antisense transcription of the host genome during herpes simplex virus 1 infection. Genome Biol 2017;18:209. [PMID: 29089033 PMCID: PMC5663069 DOI: 10.1186/s13059-017-1329-5] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 06/07/2017] [Accepted: 09/29/2017] [Indexed: 12/19/2022] Open

Affiliation(s)

Emanuel Wyler Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association, Robert-Rössle-Strasse 10, 13125, Berlin, Germany
Jennifer Menegatti Institute of Virology, Saarland University Medical School, Kirrbergerstrasse, Haus 47, 66421, Homburg/Saar, Germany
Vedran Franke Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association, Robert-Rössle-Strasse 10, 13125, Berlin, Germany
Christine Kocks Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association, Robert-Rössle-Strasse 10, 13125, Berlin, Germany
Anastasiya Boltengagen Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association, Robert-Rössle-Strasse 10, 13125, Berlin, Germany
Thomas Hennig Institut für Virologie und Immunbiologie, Julius-Maximilians-Universität Würzburg, Versbacherstr. 7, 97078, Würzburg, Germany
Kathrin Theil Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association, Robert-Rössle-Strasse 10, 13125, Berlin, Germany
Andrzej Rutkowski Department of Medicine, University of Cambridge, Addenbrookes Hospital, Box 157, Hills Rd, Cambridge, CB2 0QQ, UK.,Present address: AstraZeneca, Darwin Building, 310 Cambridge Science Park, Cambridge, CB4 0WG, UK
Carmelo Ferrai Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association, Robert-Rössle-Strasse 10, 13125, Berlin, Germany
Laura Baer Institute of Virology, Saarland University Medical School, Kirrbergerstrasse, Haus 47, 66421, Homburg/Saar, Germany
Lisa Kermas Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association, Robert-Rössle-Strasse 10, 13125, Berlin, Germany
Caroline Friedel Institut für Informatik, Ludwig-Maximilians-Universität München, Amalienstraße 17, 80333, München, Germany
Nikolaus Rajewsky Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association, Robert-Rössle-Strasse 10, 13125, Berlin, Germany
Altuna Akalin Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association, Robert-Rössle-Strasse 10, 13125, Berlin, Germany
Lars Dölken Institut für Virologie und Immunbiologie, Julius-Maximilians-Universität Würzburg, Versbacherstr. 7, 97078, Würzburg, Germany
Friedrich Grässer Institute of Virology, Saarland University Medical School, Kirrbergerstrasse, Haus 47, 66421, Homburg/Saar, Germany.
Markus Landthaler Berlin Institute for Medical Systems Biology, Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association, Robert-Rössle-Strasse 10, 13125, Berlin, Germany.,IRI Life Sciences, Institut für Biologie, Humboldt Universität zu Berlin, Philippstraße 13, 10115, Berlin, Germany.

Mariani L, Weinand K, Vedenko A, Barrera LA, Bulyk ML. Identification of Human Lineage-Specific Transcriptional Coregulators Enabled by a Glossary of Binding Modules and Tunable Genomic Backgrounds. Cell Syst 2017;5:187-201.e7. [PMID: 28957653 PMCID: PMC5657590 DOI: 10.1016/j.cels.2017.06.015] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 02/09/2017] [Revised: 06/03/2017] [Accepted: 06/29/2017] [Indexed: 01/08/2023]

Jayaram N, Usvyat D, Martin ACR. Evaluating tools for transcription factor binding site prediction. BMC Bioinformatics 2016;17:547. [PMID: 27806697 PMCID: PMC6889335 DOI: 10.1186/s12859-016-1298-9] [Citation(s) in RCA: 56] [Impact Index Per Article: 7.0] [Received: 06/19/2016] [Accepted: 10/20/2016] [Indexed: 12/21/2022] Open

Mathelier A, Xin B, Chiu TP, Yang L, Rohs R, Wasserman WW. DNA Shape Features Improve Transcription Factor Binding Site Predictions In Vivo. Cell Syst 2016;3:278-286.e4. [PMID: 27546793 PMCID: PMC5042832 DOI: 10.1016/j.cels.2016.07.001] [Citation(s) in RCA: 85] [Impact Index Per Article: 10.6] [Received: 09/12/2015] [Revised: 03/04/2016] [Accepted: 06/30/2016] [Indexed: 01/09/2023]

Shi W, Fornes O, Mathelier A, Wasserman WW. Evaluating the impact of single nucleotide variants on transcription factor binding. Nucleic Acids Res 2016;44:10106-10116. [PMID: 27492288 PMCID: PMC5137422 DOI: 10.1093/nar/gkw691] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 01/25/2016] [Revised: 07/25/2016] [Accepted: 07/26/2016] [Indexed: 12/21/2022] Open

Differences in the Early Development of Human and Mouse Embryonic Stem Cells. PLoS One 2015;10:e0140803. [PMID: 26473594 PMCID: PMC4608779 DOI: 10.1371/journal.pone.0140803] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 01/09/2015] [Accepted: 09/30/2015] [Indexed: 01/22/2023] Open

Abstract

We performed a systematic analysis of gene expression features in early (10–21 days) development of human vs mouse embryonic stem cells (hESCs vs mESCs). Many developmental features were found to be conserved, and a majority of differentially regulated genes show similar expression changes in both organisms. The similarity is especially evident when gene expression profiles are clustered together and the properties of the clustered groups of genes are compared. The first 10 days of mESC development match the features of hESC development within 21 days, in accordance with the differences in population doubling time between human and mouse ESCs. At the same time, several important differences are seen. There is a clear difference in the initial expression change of transcription factors and stimulus-responsive genes, which may be caused by the difference in experimental procedures. However, we also found that some biological processes develop differently; this can clearly be shown, for example, for neuron and sensory organ development. Some groups of genes show peaks of expression during development, and these peaks cannot be claimed to occur at the same time points in the two organisms, nor for the same groups of (orthologous) genes. We also detected a larger number of upregulated genes during development of mESCs as compared to hESCs. The differences were quantified by comparing promoters of related genes. Most gene groups behave similarly and have similar transcription factor (TF) binding sites in their promoters. A few groups of genes have similar promoters but are expressed differently in the two species. Interestingly, there are groups of genes that are expressed similarly although they have different promoters, which can be shown by comparing their TF binding sites. Namely, a large group of similarly expressed cell cycle-related genes is found to have discrepant TF binding properties in mouse vs human.

Dabrowski M, Dojer N, Krystkowiak I, Kaminska B, Wilczynski B. Optimally choosing PWM motif databases and sequence scanning approaches based on ChIP-seq data. BMC Bioinformatics 2015;16:140. [PMID: 25927199 PMCID: PMC4436866 DOI: 10.1186/s12859-015-0573-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 08/26/2014] [Accepted: 04/14/2015] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

For many years now, binding preferences of Transcription Factors have been described by so-called motifs, usually mathematically defined by position weight matrices or similar models, for the purpose of predicting potential binding sites. However, despite the availability of thousands of motif models in public and commercial databases, a researcher who wants to use them is left with many competing methods of identifying potential binding sites in a genome of interest and there is little published information regarding the optimality of different choices. Thanks to the availability of a large number of different motif models as well as a number of experimental datasets describing actual binding of TFs in hundreds of TF-ChIP-seq pairs, we set out to perform a comprehensive analysis of this matter.

RESULTS

We focus on the task of identifying potential transcription factor binding sites in the human genome. Firstly, we provide a comprehensive comparison of the coverage and quality of models available in different databases, showing that the public databases have comparable TF coverage and better motif performance than commercial databases. Secondly, we compare different motif scanners showing that, regardless of the database used, the tools developed by the scientific community outperform the commercial tools. Thirdly, we calculate for each motif a detection threshold optimizing the accuracy of prediction. Finally, we provide an in-depth comparison of different methods of choosing thresholds for all motifs a priori. Surprisingly, we show that selecting a common false-positive rate gives results that are the least biased by the information content of the motif and therefore most uniformly accurate.

CONCLUSION

We provide a guide for researchers working with transcription factor motifs. It is supplemented with detailed results of the analysis and the benchmark datasets at http://bioputer.mimuw.edu.pl/papers/motifs/ .
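
The idea of choosing a PWM detection threshold from a common false-positive rate, mentioned in the RESULTS above, can be illustrated with a minimal Python sketch. It is not the authors' procedure: the 4-position log-odds matrix, the uniform background made of all 256 possible 4-mers, and the function names `score` and `threshold_at_fpr` are all assumptions introduced here for illustration.

```python
from itertools import product

# Hypothetical 4-position log-odds PWM; the values are illustrative only.
PWM = [
    {"A": 1.2, "C": -1.0, "G": -1.0, "T": -0.5},
    {"A": -1.0, "C": 1.3, "G": -0.8, "T": -1.0},
    {"A": -0.9, "C": -1.0, "G": 1.4, "T": -1.0},
    {"A": -0.7, "C": -1.0, "G": -1.0, "T": 1.1},
]


def score(site, pwm=PWM):
    """Sum the per-position log-odds for a site of the same length as the PWM."""
    return sum(column[base] for column, base in zip(pwm, site))


def threshold_at_fpr(background_scores, fpr):
    """Return a cutoff such that at most a fraction `fpr` of the
    background scores are strictly greater than it."""
    if not 0 <= fpr < 1:
        raise ValueError("fpr must be in [0, 1)")
    ranked = sorted(background_scores, reverse=True)
    tolerated = int(fpr * len(ranked))  # background hits we accept
    return ranked[tolerated]


# Assumed background: every possible 4-mer, i.e. a uniform base composition.
BACKGROUND = [score("".join(kmer)) for kmer in product("ACGT", repeat=len(PWM))]

cutoff = threshold_at_fpr(BACKGROUND, fpr=0.01)
is_hit = score("ACGT") > cutoff  # a site is called only if it beats the cutoff
```

Applying the same `fpr` to every motif gives each matrix its own score cutoff, which is the sense in which the threshold is "common"; a realistic background would be drawn from genomic sequence rather than from an exhaustive list of k-mers.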

Mathelier A, Shi W, Wasserman WW. Identification of altered cis-regulatory elements in human disease. Trends Genet 2015;31:67-76. [DOI: 10.1016/j.tig.2014.12.003] [Citation(s) in RCA: 82] [Impact Index Per Article: 9.1] [Received: 11/07/2014] [Revised: 12/19/2014] [Accepted: 12/19/2014] [Indexed: 02/01/2023]

Worsley Hunt R, Wasserman WW. Non-targeted transcription factors motifs are a systemic component of ChIP-seq datasets. Genome Biol 2014;15:412. [PMID: 25070602 PMCID: PMC4165360 DOI: 10.1186/s13059-014-0412-4] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 04/25/2014] [Accepted: 07/29/2014] [Indexed: 12/15/2022] Open

Abstract

Background

The global effort to annotate the non-coding portion of the human genome relies heavily on chromatin immunoprecipitation data generated with high-throughput DNA sequencing (ChIP-seq). ChIP-seq is generally successful in detailing the segments of the genome bound by the immunoprecipitated transcription factor (TF); however, almost all datasets contain genomic regions devoid of the canonical motif for the TF. It remains to be determined if these regions are related to the immunoprecipitated TF or whether, despite the use of controls, there is a portion of peaks that can be attributed to other causes.

Results

Analyses across hundreds of ChIP-seq datasets generated for sequence-specific DNA binding TFs reveal a small set of TF binding profiles for which predicted TF binding site motifs are repeatedly observed to be significantly enriched. Grouping related binding profiles, the set includes: CTCF-like, ETS-like, JUN-like, and THAP11 profiles. These frequently enriched profiles are termed ‘zingers’ to highlight their unanticipated enrichment in datasets for which they were not the targeted TF, and their potential impact on the interpretation and analysis of TF ChIP-seq data. Peaks with zinger motifs and lacking the ChIPped TF’s motif are observed to compose up to 45% of a ChIP-seq dataset. There is substantial overlap of zinger-motif-containing regions between diverse TF datasets, suggesting a mechanism that is not TF-specific for the recovery of these regions.

Conclusions

Based on the zinger regions' proximity to cohesin-bound segments, a loading station model is proposed. Further study of zingers will advance understanding of gene regulation.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0412-4) contains supplementary material, which is available to authorized users.
