1
Hemphill WO, Steiner HR, Kominsky JR, Wuttke DS, Cech TR. Transcription factors ERα and Sox2 have differing multiphasic DNA- and RNA-binding mechanisms. RNA (New York, N.Y.) 2024; 30:1089-1105. [PMID: 38760076 PMCID: PMC11251522 DOI: 10.1261/rna.080027.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Accepted: 05/01/2024] [Indexed: 05/19/2024]
Abstract
Many transcription factors (TFs) have been shown to bind RNA, leading to open questions regarding the mechanism(s) of this RNA binding and its role in regulating TF activities. Here, we use biophysical assays to interrogate the k_on, k_off, and K_d for DNA and RNA binding of two model human TFs, ERα and Sox2. Unexpectedly, we found that both proteins exhibit multiphasic nucleic acid-binding kinetics. We propose that Sox2 RNA and DNA multiphasic binding kinetics can be explained by a conventional model for sequential Sox2 monomer association and dissociation. In contrast, ERα nucleic acid binding exhibited biphasic dissociation paired with novel triphasic association behavior, in which two apparent binding transitions are separated by a 10-20 min "lag" phase depending on protein concentration. We considered several conventional models for the observed kinetic behavior, none of which adequately explained all the ERα nucleic acid-binding data. Instead, simulations with a model incorporating sequential ERα monomer association, ERα nucleic acid complex isomerization, and product "feedback" on isomerization rate recapitulated the general kinetic trends for both ERα DNA and RNA binding. Collectively, our findings reveal that Sox2 and ERα bind RNA and DNA with previously unappreciated multiphasic binding kinetics, and that their reaction mechanisms differ with ERα binding nucleic acids via a novel reaction mechanism.
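For orientation, the relationship between the three quantities named in the abstract can be sketched for the simplest textbook case, a one-step reversible binding reaction. This is not one of the models used in the paper: the function names and rate values below are illustrative assumptions, and the sketch assumes protein in large excess over nucleic acid. It shows the single-exponential (monophasic) association curve against which multiphasic kinetics stand out.

```python
import math


def dissociation_constant(k_on: float, k_off: float) -> float:
    """K_d (M) for one-step binding P + N <-> PN.

    k_on is in M^-1 s^-1 and k_off is in s^-1.
    """
    return k_off / k_on


def fraction_bound(t: float, protein_conc: float, k_on: float, k_off: float) -> float:
    """Fraction of nucleic acid bound at time t (s), starting from none bound.

    Assumes protein is in large excess, so its concentration stays constant
    and the approach to equilibrium is a single exponential.
    """
    k_obs = k_on * protein_conc + k_off        # observed rate constant (s^-1)
    plateau = k_on * protein_conc / k_obs      # equilibrium fraction bound
    return plateau * (1.0 - math.exp(-k_obs * t))


if __name__ == "__main__":
    # Illustrative (made-up) rate constants, not values from the paper.
    k_on, k_off = 1e6, 1e-2
    k_d = dissociation_constant(k_on, k_off)
    print(f"K_d = {k_d:.2e} M")
    for t in (0, 30, 120, 600):
        print(t, round(fraction_bound(t, k_d, k_on, k_off), 3))
```

With the protein concentration set equal to K_d, the curve in this one-step scheme plateaus at half occupancy. A lag phase or several distinct transitions, as reported for ERα, cannot arise from this scheme alone.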
Affiliation(s)
- Wayne O Hemphill
- Department of Biochemistry, University of Colorado Boulder, Boulder, Colorado 80303, USA
- Howard Hughes Medical Institute and BioFrontiers Institute, University of Colorado Boulder, Boulder, Colorado 80303, USA
- Halley R Steiner
- Department of Biochemistry, University of Colorado Boulder, Boulder, Colorado 80303, USA
- Jackson R Kominsky
- Department of Biochemistry, University of Colorado Boulder, Boulder, Colorado 80303, USA
- Howard Hughes Medical Institute and BioFrontiers Institute, University of Colorado Boulder, Boulder, Colorado 80303, USA
- Deborah S Wuttke
- Department of Biochemistry, University of Colorado Boulder, Boulder, Colorado 80303, USA
- Thomas R Cech
- Department of Biochemistry, University of Colorado Boulder, Boulder, Colorado 80303, USA
- Howard Hughes Medical Institute and BioFrontiers Institute, University of Colorado Boulder, Boulder, Colorado 80303, USA
2
Mandal K, Tomar SK, Kumar Santra M. Decoding the ubiquitin language: Orchestrating transcription initiation and gene expression through chromatin remodelers and histones. Gene 2024; 904:148218. [PMID: 38307220 DOI: 10.1016/j.gene.2024.148218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 01/22/2024] [Accepted: 01/25/2024] [Indexed: 02/04/2024]
Abstract
Eukaryotic transcription is a finely orchestrated process controlled by transcription factors as well as epigenetic regulators. Transcription factors and epigenetic regulators undergo different types of posttranslational modifications, including ubiquitination, to control the transcription process. Ubiquitination, traditionally associated with protein degradation, has emerged as a crucial contributor to the regulation of chromatin structure through ubiquitination of histones and chromatin remodelers. Ubiquitination introduces new layers of intricacy to the regulation of transcription initiation by controlling the equilibrium between euchromatin and heterochromatin states. The spacing of nucleosomes, the fundamental units of chromatin, in euchromatin and heterochromatin states is regulated by histone modification and chromatin remodeling complexes. Chromatin remodeling complexes actively sculpt the chromatin architecture and thereby influence the transcriptional states of genes. Therefore, understanding the dynamic behavior of nucleosome spacing is critical, as it impacts various cellular functions by controlling gene expression profiles. In this comprehensive review, we discuss the intricate interplay between ubiquitination and transcription initiation, and illuminate the underlying molecular mechanisms that occur in a variety of biological contexts. This exploration sheds light on the complex regulatory networks that govern eukaryotic transcription, providing important insights into the fine orchestration of gene expression and chromatin dynamics.
Affiliation(s)
- Kartik Mandal
- Cancer Biology Division, National Centre for Cell Science, Ganeshkhind Road, Pune, Maharashtra 411007, India; Department of Biotechnology, Savitribai Phule Pune University, Ganeshkhind Road, Pune, Maharashtra 411007, India
- Shiva Kumar Tomar
- Cancer Biology Division, National Centre for Cell Science, Ganeshkhind Road, Pune, Maharashtra 411007, India; Department of Biotechnology, Savitribai Phule Pune University, Ganeshkhind Road, Pune, Maharashtra 411007, India
- Manas Kumar Santra
- Cancer Biology Division, National Centre for Cell Science, Ganeshkhind Road, Pune, Maharashtra 411007, India.
3
Schultheis H, Bentsen M, Heger V, Looso M. Uncovering uncharacterized binding of transcription factors from ATAC-seq footprinting data. Sci Rep 2024; 14:9275. [PMID: 38654130 DOI: 10.1038/s41598-024-59989-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Accepted: 04/17/2024] [Indexed: 04/25/2024] Open
Abstract
Transcription factors (TFs) are crucial epigenetic regulators, which enable cells to dynamically adjust gene expression in response to environmental signals. Computational procedures like digital genomic footprinting on chromatin accessibility assays such as ATAC-seq can be used to identify bound TFs on a genome-wide scale. This method utilizes short regions of low accessibility signal caused by steric hindrance of DNA-bound proteins, called footprints (FPs), which are combined with motif databases for TF identification. However, while over 1600 TFs have been described in the human genome, only ~700 of these have a known binding motif. Thus, a substantial number of FPs without overlap to a known DNA motif are normally discarded from FP analysis. In addition, the FP method is restricted to organisms with a substantial number of known TF motifs. Here we present DENIS (DE Novo motIf diScovery), a framework to generate and systematically investigate the potential of de novo TF motif discovery from FPs. DENIS includes functionality (1) to isolate FPs without binding motifs, (2) to perform de novo motif generation and (3) to characterize novel motifs. Here, we show that the framework rediscovers artificially removed TF motifs, quantifies de novo motif usage during an early embryonic development example dataset, and is able to analyze and uncover TF activity in organisms lacking canonical motifs. The latter task is exemplified by an investigation of a scATAC-seq dataset in zebrafish which covers different cell types during hematopoiesis.
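Step (1) of the abstract, isolating footprints that do not overlap any known motif hit, is at heart an interval-overlap filter. The sketch below illustrates that idea only and is not the DENIS implementation: the tuple layout `(chrom, start, end)`, the half-open coordinate convention, and both function names are assumptions made for the example.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open [start, end)


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def footprints_without_motif(
    footprints: List[Interval], motif_hits: List[Interval]
) -> List[Interval]:
    """Keep only the footprints that overlap no known motif hit.

    A plain quadratic scan, fine for a toy example; genome-scale data would
    normally use sorted intervals or an interval tree instead.
    """
    return [
        fp for fp in footprints
        if not any(overlaps(fp, hit) for hit in motif_hits)
    ]


if __name__ == "__main__":
    # Made-up coordinates for illustration.
    fps = [("chr1", 100, 120), ("chr1", 200, 220), ("chr2", 100, 120)]
    hits = [("chr1", 110, 125)]
    print(footprints_without_motif(fps, hits))
```

The footprints that survive this filter are the ones a conventional motif-based analysis would discard, and they would be the input to a de novo motif discovery step.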
Affiliation(s)
- Hendrik Schultheis
- Bioinformatics Core Unit (BCU), Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany
- Mette Bentsen
- Bioinformatics Core Unit (BCU), Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany
- Vanessa Heger
- Bioinformatics Core Unit (BCU), Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany
- Mario Looso
- Bioinformatics Core Unit (BCU), Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany.
- Cardio-Pulmonary Institute (CPI), Bad Nauheim, Germany.
4
Hemphill WO, Steiner HR, Kominsky JR, Wuttke DS, Cech TR. Transcription factors ERα and Sox2 have differing multiphasic DNA and RNA binding mechanisms. bioRxiv: the preprint server for biology 2024:2024.03.18.585577. [PMID: 38562825 PMCID: PMC10983890 DOI: 10.1101/2024.03.18.585577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Many transcription factors (TFs) have been shown to bind RNA, leading to open questions regarding the mechanism(s) of this RNA binding and its role in regulating TF activities. Here we use biophysical assays to interrogate the k_on, k_off, and K_d for DNA and RNA binding of two model human transcription factors, ERα and Sox2. Unexpectedly, we found that both proteins exhibited multiphasic nucleic acid binding kinetics. We propose that Sox2 RNA and DNA multiphasic binding kinetics could be explained by a conventional model for sequential Sox2 monomer association and dissociation. In contrast, ERα nucleic acid binding exhibited biphasic dissociation paired with novel triphasic association behavior, where two apparent binding transitions are separated by a 10-20 min "lag" phase depending on protein concentration. We considered several conventional models for the observed kinetic behavior, none of which adequately explained all the ERα nucleic acid binding data. Instead, simulations with a model incorporating sequential ERα monomer association, ERα nucleic acid complex isomerization, and product "feedback" on isomerization rate recapitulated the general kinetic trends for both ERα DNA and RNA binding. Collectively, our findings reveal that Sox2 and ERα bind RNA and DNA with previously unappreciated multiphasic binding kinetics, and that their reaction mechanisms differ with ERα binding nucleic acids via a novel reaction mechanism.
Affiliation(s)
- Wayne O. Hemphill
- Department of Biochemistry, University of Colorado Boulder, Boulder, Colorado 80303 USA
- Howard Hughes Medical Institute and BioFrontiers Institute, University of Colorado Boulder, Boulder, Colorado 80303 USA
- Halley R. Steiner
- Department of Biochemistry, University of Colorado Boulder, Boulder, Colorado 80303 USA
- Jackson R. Kominsky
- Department of Biochemistry, University of Colorado Boulder, Boulder, Colorado 80303 USA
- Howard Hughes Medical Institute and BioFrontiers Institute, University of Colorado Boulder, Boulder, Colorado 80303 USA
- Deborah S. Wuttke
- Department of Biochemistry, University of Colorado Boulder, Boulder, Colorado 80303 USA
- Thomas R. Cech
- Department of Biochemistry, University of Colorado Boulder, Boulder, Colorado 80303 USA
- Howard Hughes Medical Institute and BioFrontiers Institute, University of Colorado Boulder, Boulder, Colorado 80303 USA
5
Elbaz B, Darwish A, Vardy M, Isaac S, Tokars HM, Dzhashiashvili Y, Korshunov K, Prakriya M, Eden A, Popko B. The bone transcription factor Osterix controls extracellular matrix- and node of Ranvier-related gene expression in oligodendrocytes. Neuron 2024; 112:247-263.e6. [PMID: 37924811 PMCID: PMC10843489 DOI: 10.1016/j.neuron.2023.10.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Revised: 08/24/2023] [Accepted: 10/04/2023] [Indexed: 11/06/2023]
Abstract
Oligodendrocytes are the primary producers of many extracellular matrix (ECM)-related proteins found in the CNS. Therefore, oligodendrocytes play a critical role in the determination of brain stiffness, node of Ranvier formation, perinodal ECM deposition, and perineuronal net formation, all of which depend on the ECM. Nevertheless, the transcription factors that control ECM-related gene expression in oligodendrocytes remain unknown. Here, we found that the transcription factor Osterix (also known as Sp7) binds in proximity to genes important for CNS ECM and node of Ranvier formation and mediates their expression. Oligodendrocyte-specific ablation of Sp7 changes ECM composition and brain stiffness and results in aberrant node of Ranvier formation. Sp7 is known to control osteoblast maturation and bone formation. Our comparative analyses suggest that Sp7 plays a conserved biological role in oligodendrocytes and in bone-forming cells, where it mediates brain and bone tissue stiffness by controlling expression of ECM components.
Affiliation(s)
- Benayahu Elbaz
- Department of Neurology, Division of Multiple Sclerosis and Neuroimmunology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA.
- Alaa Darwish
- Department of Genetics, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Maia Vardy
- Department of Neurology, Division of Multiple Sclerosis and Neuroimmunology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Sara Isaac
- Department of Genetics, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Haley Margaret Tokars
- Department of Neurology, Division of Multiple Sclerosis and Neuroimmunology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Yulia Dzhashiashvili
- Department of Neurology, Division of Multiple Sclerosis and Neuroimmunology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Kirill Korshunov
- Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Murali Prakriya
- Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Amir Eden
- Department of Genetics, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Brian Popko
- Department of Neurology, Division of Multiple Sclerosis and Neuroimmunology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA.
6
Viner C, Ishak CA, Johnson J, Walker NJ, Shi H, Sjöberg-Herrera MK, Shen SY, Lardo SM, Adams DJ, Ferguson-Smith AC, De Carvalho DD, Hainer SJ, Bailey TL, Hoffman MM. Modeling methyl-sensitive transcription factor motifs with an expanded epigenetic alphabet. Genome Biol 2024; 25:11. [PMID: 38191487 PMCID: PMC10773111 DOI: 10.1186/s13059-023-03070-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Accepted: 09/21/2023] [Indexed: 01/10/2024] Open
Abstract
BACKGROUND: Transcription factors bind DNA in specific sequence contexts. In addition to distinguishing one nucleobase from another, some transcription factors can distinguish between unmodified and modified bases. Current models of transcription factor binding tend not to take DNA modifications into account, while the recent few that do often have limitations. This makes a comprehensive and accurate profiling of transcription factor affinities difficult. RESULTS: Here, we develop methods to identify transcription factor binding sites in modified DNA. Our models expand the standard A/C/G/T DNA alphabet to include cytosine modifications. We develop Cytomod to create modified genomic sequences and we also enhance the MEME Suite, adding the capacity to handle custom alphabets. We adapt the well-established position weight matrix (PWM) model of transcription factor binding affinity to this expanded DNA alphabet. Using these methods, we identify modification-sensitive transcription factor binding motifs. We confirm established binding preferences, such as the preference of ZFP57 and C/EBPβ for methylated motifs and the preference of c-Myc for unmethylated E-box motifs. CONCLUSIONS: Using known binding preferences to tune model parameters, we discover novel modified motifs for a wide array of transcription factors. Finally, we validate our binding preference predictions for OCT4 using cleavage under targets and release using nuclease (CUT&RUN) experiments across conventional, methylation-, and hydroxymethylation-enriched sequences. Our approach readily extends to other DNA modifications. As more genome-wide single-base resolution modification data becomes available, we expect that our method will yield insights into altered transcription factor binding affinities across many different modifications.
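The idea of a position weight matrix over an expanded alphabet can be sketched in a few lines. This is a toy illustration rather than the Cytomod or MEME Suite implementation: the symbol `m` for 5-methylcytosine, the uniform background, and the two-position motif are all assumptions chosen for the example.

```python
import math
from typing import Dict, List, Tuple

# Standard DNA alphabet plus one extra symbol; 'm' stands in for
# 5-methylcytosine here (an illustrative choice of symbol).
ALPHABET = "ACGTm"


def log_odds_pwm(
    freqs: List[Dict[str, float]], background: Dict[str, float]
) -> List[Dict[str, float]]:
    """Convert per-position symbol frequencies into log2-odds scores."""
    return [
        {s: math.log2(col[s] / background[s]) for s in ALPHABET}
        for col in freqs
    ]


def best_match(pwm: List[Dict[str, float]], sequence: str) -> Tuple[float, int]:
    """Return (score, offset) of the highest-scoring window in the sequence."""
    width = len(pwm)
    scored = [
        (sum(pwm[i][sequence[offset + i]] for i in range(width)), offset)
        for offset in range(len(sequence) - width + 1)
    ]
    return max(scored)


if __name__ == "__main__":
    background = {s: 0.2 for s in ALPHABET}
    # Toy motif preferring a methylated C followed by G (made-up frequencies).
    freqs = [
        {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.05, "m": 0.80},
        {"A": 0.05, "C": 0.05, "G": 0.80, "T": 0.05, "m": 0.05},
    ]
    pwm = log_odds_pwm(freqs, background)
    print(best_match(pwm, "ACGTmGA"))
```

Because the modified base is its own symbol, the same scoring code distinguishes `mG` from `CG` without any special casing, which is the property that lets a model express a preference for methylated or unmethylated sites.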
Affiliation(s)
- Coby Viner
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Charles A Ishak
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Department of Epigenetics and Molecular Carcinogenesis, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- James Johnson
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
- Nicolas J Walker
- Department of Genetics, University of Cambridge, Cambridge, England
- Hui Shi
- Department of Genetics, University of Cambridge, Cambridge, England
- Marcela K Sjöberg-Herrera
- Wellcome Sanger Institute, Cambridge, England
- Faculty of Biological Sciences, Pontificia Universidad Católica de Chile, Santiago, Chile
- Shu Yi Shen
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Santana M Lardo
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
- Daniel D De Carvalho
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
- Sarah J Hainer
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
- Timothy L Bailey
- Department of Pharmacology, University of Nevada, Reno, Reno, NV, USA
- Michael M Hoffman
- Department of Computer Science, University of Toronto, Toronto, ON, Canada.
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada.
- Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada.
- Vector Institute for Artificial Intelligence, Toronto, ON, Canada.
7
Vorontsov IE, Eliseeva IA, Zinkevich A, Nikonov M, Abramov S, Boytsov A, Kamenets V, Kasianova A, Kolmykov S, Yevshin I, Favorov A, Medvedeva YA, Jolma A, Kolpakov F, Makeev V, Kulakovskiy I. HOCOMOCO in 2024: a rebuild of the curated collection of binding models for human and mouse transcription factors. Nucleic Acids Res 2024; 52:D154-D163. [PMID: 37971293 PMCID: PMC10767914 DOI: 10.1093/nar/gkad1077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Revised: 10/17/2023] [Accepted: 10/26/2023] [Indexed: 11/19/2023] Open
Abstract
We present a major update of the HOCOMOCO collection that provides DNA binding specificity patterns of 949 human transcription factors and 720 mouse orthologs. To make this release, we performed motif discovery in peak sets that originated from 14 183 ChIP-Seq experiments and reads from 2554 HT-SELEX experiments yielding more than 400 thousand candidate motifs. The candidate motifs were annotated according to their similarity to known motifs and the hierarchy of DNA-binding domains of the respective transcription factors. Next, the motifs underwent human expert curation to stratify distinct motif subtypes and remove non-informative patterns and common artifacts. Finally, the curated subset of 100 thousand motifs was supplied to the automated benchmarking to select the best-performing motifs for each transcription factor. The resulting HOCOMOCO v12 core collection contains 1443 verified position weight matrices, including distinct subtypes of DNA binding motifs for particular transcription factors. In addition to the core collection, HOCOMOCO v12 provides motif sets optimized for the recognition of binding sites in vivo and in vitro, and for annotation of regulatory sequence variants. HOCOMOCO is available at https://hocomoco12.autosome.org and https://hocomoco.autosome.org.
Affiliation(s)
- Ilya E Vorontsov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia
- Irina A Eliseeva
- Institute of Protein Research, Russian Academy of Sciences, 142290 Pushchino, Russia
- Arsenii Zinkevich
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 119991 Moscow, Russia
- Mikhail Nikonov
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 119991 Moscow, Russia
- Sergey Abramov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia
- Altius Institute for Biomedical Sciences, 98121 Seattle, WA, USA
- Alexandr Boytsov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia
- Altius Institute for Biomedical Sciences, 98121 Seattle, WA, USA
- Vasily Kamenets
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia
- Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Institute of Biochemistry and Genetics of the Ufa Federal Research Centre of the Russian Academy of Sciences, 450054 Ufa, Russia
- Alexandra Kasianova
- Skolkovo Institute of Science and Technology, 121205 Moscow, Russia
- Institute for Information Transmission Problems of the Russian Academy of Sciences, 127051 Moscow, Russia
- Semyon Kolmykov
- Department of Computational Biology, Sirius University of Science and Technology, 354340 Sirius, Krasnodar region, Russia
- Alexander Favorov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia
- Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Yulia A Medvedeva
- Research Center of Biotechnology RAS, Russian Academy of Sciences, 119071 Moscow, Russia
- Arttu Jolma
- Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Fedor Kolpakov
- Department of Computational Biology, Sirius University of Science and Technology, 354340 Sirius, Krasnodar region, Russia
- Bioinformatics Laboratory, Federal Research Center for Information and Computational Technologies, 630090 Novosibirsk, Russia
- Vsevolod J Makeev
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia
- Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Institute of Biochemistry and Genetics of the Ufa Federal Research Centre of the Russian Academy of Sciences, 450054 Ufa, Russia
- Ivan V Kulakovskiy
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia
- Institute of Protein Research, Russian Academy of Sciences, 142290 Pushchino, Russia
- Laboratory of Regulatory Genomics, Institute of Fundamental Medicine and Biology, Kazan Federal University, 420008 Kazan, Russia
8
Acencio ML, Vazquez M, Chawla K, Lægreid A, Kuiper M. TFCheckpoint database update, a cross-referencing system for transcription factors from human, mouse and rat. Nucleic Acids Res 2024; 52:D334-D344. [PMID: 37992291 PMCID: PMC10767992 DOI: 10.1093/nar/gkad1030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Revised: 10/20/2023] [Accepted: 10/21/2023] [Indexed: 11/24/2023] Open
Abstract
Prior knowledge about DNA-binding transcription factors (dbTFs), transcription co-regulators (coTFs) and general transcription factors (GTFs) is crucial for the study and understanding of the regulation of transcription. This is reflected by the many publications and database resources describing knowledge about TFs. We previously launched the TFCheckpoint database, an integrated resource focused on human, mouse and rat dbTFs, providing users access to a comprehensive overview of these proteins. Here, we describe TFCheckpoint 2.0 (https://www.tfcheckpoint.org/index.php), comprising 13 collections of dbTFs, coTFs and GTFs. TFCheckpoint 2.0 provides an easy and versatile cross-referencing system for users to view and download collections that may otherwise be cumbersome to find, compare and retrieve.
Affiliation(s)
- Marcio L Acencio
- Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, NO-7491, Norway
- Miguel Vazquez
- Barcelona Supercomputing Center, Barcelona, 08034, Spain
- Konika Chawla
- Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, NO-7491, Norway
- Bioinformatics Core Facility, St. Olavs hospital HF, Trondheim, NO-7491, Norway
- Astrid Lægreid
- Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, NO-7491, Norway
- Martin Kuiper
- Department of Biology, Norwegian University of Science and Technology, Trondheim, NO-7491, Norway
9
Liu Z, Samee M. Structural underpinnings of mutation rate variations in the human genome. Nucleic Acids Res 2023; 51:7184-7197. [PMID: 37395403 PMCID: PMC10415140 DOI: 10.1093/nar/gkad551] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 04/19/2022] [Revised: 06/06/2023] [Accepted: 06/15/2023] [Indexed: 07/04/2023] Open
Abstract
Single nucleotide mutation rates have critical implications for human evolution and genetic diseases. Importantly, the rates vary substantially across the genome and the principles underlying such variations remain poorly understood. A recent model explained much of this variation by considering higher-order nucleotide interactions in the 7-mer sequence context around mutated nucleotides. This model's success implicates a connection between DNA shape and mutation rates. DNA shape, i.e. structural properties like helical twist and tilt, is known to capture interactions between nucleotides within a local context. Thus, we hypothesized that changes in DNA shape features at and around mutated positions can explain mutation rate variations in the human genome. Indeed, DNA shape-based models of mutation rates showed similar or improved performance over current nucleotide sequence-based models. These models accurately characterized mutation hotspots in the human genome and revealed the shape features whose interactions underlie mutation rate variations. DNA shape also impacts mutation rates within putative functional regions like transcription factor binding sites where we find a strong association between DNA shape and position-specific mutation rates. This work demonstrates the structural underpinnings of nucleotide mutations in the human genome and lays the groundwork for future models of genetic variations to incorporate DNA shape.
Affiliation(s)
- Zian Liu
- Department of Integrative Physiology, Baylor College of Medicine, Houston, TX 77030, USA
- Md Abul Hassan Samee
- Department of Integrative Physiology, Baylor College of Medicine, Houston, TX 77030, USA
10
Striker SS, Wilferd SF, Lewis EM, O'Connor SA, Plaisier CL. Systematic integration of protein-affecting mutations, gene fusions, and copy number alterations into a comprehensive somatic mutational profile. Cell Reports Methods 2023; 3:100442. [PMID: 37159661 PMCID: PMC10162952 DOI: 10.1016/j.crmeth.2023.100442] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Revised: 12/21/2022] [Accepted: 03/10/2023] [Indexed: 05/11/2023]
Abstract
Somatic mutations occur as random genetic changes in genes through protein-affecting mutations (PAMs), gene fusions, or copy number alterations (CNAs). Mutations of different types can have a similar phenotypic effect (i.e., allelic heterogeneity) and should be integrated into a unified gene mutation profile. We developed OncoMerge to fill this niche of integrating somatic mutations to capture allelic heterogeneity, assign a function to mutations, and overcome known obstacles in cancer genetics. Application of OncoMerge to TCGA Pan-Cancer Atlas increased detection of somatically mutated genes and improved the prediction of the somatic mutation role as either activating or loss of function. Using integrated somatic mutation matrices increased the power to infer gene regulatory networks and uncovered the enrichment of switch-like feedback motifs and delay-inducing feedforward loops. These studies demonstrate that OncoMerge efficiently integrates PAMs, fusions, and CNAs and strengthens downstream analyses linking somatic mutations to cancer phenotypes.
Affiliation(s)
- Shawn S. Striker
- School of Biological and Health Systems Engineering, Fulton Schools of Engineering, Arizona State University, Tempe, AZ 85287-9709, USA
- Sierra F. Wilferd
- School of Biological and Health Systems Engineering, Fulton Schools of Engineering, Arizona State University, Tempe, AZ 85287-9709, USA
- Erika M. Lewis
- School of Biological and Health Systems Engineering, Fulton Schools of Engineering, Arizona State University, Tempe, AZ 85287-9709, USA
- Samantha A. O'Connor
- School of Biological and Health Systems Engineering, Fulton Schools of Engineering, Arizona State University, Tempe, AZ 85287-9709, USA
- Christopher L. Plaisier
- School of Biological and Health Systems Engineering, Fulton Schools of Engineering, Arizona State University, Tempe, AZ 85287-9709, USA
11
Lesha E, George H, Zaki MM, Smith CJ, Khoshakhlagh P, Ng AHM. A Survey of Transcription Factors in Cell Fate Control. Methods Mol Biol 2023; 2594:133-141. [PMID: 36264493 DOI: 10.1007/978-1-0716-2815-7_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Transcription factors (TFs) play a cardinal role in the development and maintenance of human physiology by acting as mediators of gene expression and cell state control. Recent advancements have broadened our knowledge on the potency of TFs in governing cell physiology and have deepened our understanding of the mechanisms through which they exert this control. The ability of TFs to program cell fates has gathered significant interest in recent decades, and high-throughput technologies now allow for the systematic discovery of forward programming factors to convert pluripotent stem cells into numerous differentiated cell types. The next generation of these technologies has the potential to improve our understanding and control of cell fates and states and provide advanced therapeutic modalities to address many medical conditions.
Affiliation(s)
- Emal Lesha
- GC Therapeutics Inc., Cambridge, MA, USA
- Department of Neurosurgery, University of Tennessee Health Science Center, Memphis, TN, USA
- Haydy George
- GC Therapeutics Inc., Cambridge, MA, USA
- School of Medicine, St. George's University, West Indies, Grenada
- Mark M Zaki
- GC Therapeutics Inc., Cambridge, MA, USA
- Department of Neurosurgery, University of Michigan, Ann Arbor, MI, USA
12
|
Larcombe MR, Hsu S, Polo JM, Knaupp AS. Indirect Mechanisms of Transcription Factor-Mediated Gene Regulation during Cell Fate Changes. ADVANCED GENETICS (HOBOKEN, N.J.) 2022; 3:2200015. [PMID: 36911290 PMCID: PMC9993476 DOI: 10.1002/ggn2.202200015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2022] [Indexed: 06/18/2023]
Abstract
Transcription factors (TFs) are the master regulators of cellular identity, capable of driving cell fate transitions including differentiations, reprogramming, and transdifferentiations. Pioneer TFs recognize partial motifs exposed on nucleosomal DNA, allowing for TF-mediated activation of repressed chromatin. Moreover, there is evidence suggesting that certain TFs can repress actively expressed genes either directly through interactions with accessible regulatory elements or indirectly through mechanisms that impact the expression, activity, or localization of other regulatory factors. Recent evidence suggests that during reprogramming, the reprogramming TFs initiate opening of chromatin regions rich in somatic TF motifs that are inaccessible in the initial and final cellular states. It is postulated that analogous to a sponge, these transiently accessible regions "soak up" somatic TFs, hence lowering the initial barriers to cell fate changes. This indirect TF-mediated gene regulation event, which is aptly named the "sponge effect," may play an essential role in the silencing of the somatic transcriptional network during different cellular conversions.
Affiliation(s)
- Michael R. Larcombe
- Department of Anatomy and Developmental Biology, Monash University, Clayton, Victoria 3168, Australia
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Clayton, Victoria 3168, Australia
- Australian Regenerative Medicine Institute, Monash University, Clayton, Victoria 3168, Australia
- Sheng Hsu
- Department of Anatomy and Developmental Biology, Monash University, Clayton, Victoria 3168, Australia
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Clayton, Victoria 3168, Australia
- Australian Regenerative Medicine Institute, Monash University, Clayton, Victoria 3168, Australia
- Jose M. Polo
- Department of Anatomy and Developmental Biology, Monash University, Clayton, Victoria 3168, Australia
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Clayton, Victoria 3168, Australia
- Australian Regenerative Medicine Institute, Monash University, Clayton, Victoria 3168, Australia
- South Australian Immunogenomics Cancer Institute, Faculty of Health and Medical Sciences, University of Adelaide, Adelaide, South Australia 5005, Australia
- Adelaide Centre for Epigenetics, Faculty of Health and Medical Sciences, University of Adelaide, Adelaide, South Australia 5005, Australia
- Anja S. Knaupp
- Department of Anatomy and Developmental Biology, Monash University, Clayton, Victoria 3168, Australia
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Clayton, Victoria 3168, Australia
- Australian Regenerative Medicine Institute, Monash University, Clayton, Victoria 3168, Australia
13
|
Transcription Factors as Important Regulators of Changes in Behavior through Domestication of Gray Rats: Quantitative Data from RNA Sequencing. Int J Mol Sci 2022; 23:12269. [PMID: 36293128 PMCID: PMC9603081 DOI: 10.3390/ijms232012269] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/31/2022] [Revised: 09/28/2022] [Accepted: 10/12/2022] [Indexed: 11/16/2022]
Abstract
Studies on hereditary fixation of the tame-behavior phenotype during animal domestication remain relevant and important because they are of both basic research and applied significance. For the first time, we used high-throughput RNA sequencing to investigate differential expression of genes in tissue samples from the tegmental region of the midbrain of 2-month-old gray rats (Rattus norvegicus), model animals bred for either an enhancement or reduction in defensive response to humans and showing either tame or aggressive behavior. A total of 42 differentially expressed genes (DEGs; adjusted p-value < 0.01 and fold-change > 2) were identified, with 20 upregulated and 22 downregulated genes in the tissue samples from tame rats compared with aggressive rats. Among them, three genes encoding transcription factors (TFs) were detected: Ascl3 was upregulated, whereas Fos and Fosb were downregulated in tissue samples from the brains of tame rats. Other DEGs were annotated as associated with extracellular matrix components, transporter proteins, the neurotransmitter system, signaling molecules, and immune system proteins. We believe that these DEGs encode proteins that constitute a multifactorial system determining the behavior for which the rats have been artificially selected. We demonstrated that several structural subtypes of E-box motifs (known as binding sites for many developmental TFs of the bHLH class, including the ASCL subfamily of TFs) are enriched in the set of promoters of the DEGs downregulated in the tissue samples of tame rats. Because ASCL3 may act as a repressor on target genes of other developmental TFs of the bHLH class, we hypothesize that the expression of TF gene Ascl3 in tame rats indicates longer neurogenesis (as compared to aggressive rats), which is a sign of neoteny and domestication. Thus, our domestication model shows a new function of TF ASCL3: it may play the most important role in behavioral changes in animals.
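The DEG criterion quoted in this abstract (adjusted p-value < 0.01 and fold-change > 2) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the gene names and values are invented, the `GeneResult` type and `call_degs` function are hypothetical, and the fold-change cutoff is assumed to apply symmetrically to up- and downregulation (|log2 fold-change| > 1), which the abstract does not state explicitly.

```python
import math
from typing import NamedTuple


class GeneResult(NamedTuple):
    gene: str
    log2_fold_change: float  # tame vs. aggressive; invented values
    padj: float              # multiple-testing-adjusted p-value


def call_degs(results, padj_cutoff=0.01, fold_change_cutoff=2.0):
    """Split genes passing both thresholds into (upregulated, downregulated)."""
    # Assumption: the cutoff is symmetric, so a 2-fold decrease also counts.
    log2_cutoff = math.log2(fold_change_cutoff)
    up, down = [], []
    for r in results:
        if r.padj < padj_cutoff and abs(r.log2_fold_change) > log2_cutoff:
            (up if r.log2_fold_change > 0 else down).append(r.gene)
    return up, down


# Hypothetical input, not data from the study.
results = [
    GeneResult("GeneA", 1.8, 0.002),
    GeneResult("GeneB", -2.1, 0.0005),
    GeneResult("GeneC", 0.4, 0.0001),  # significant, but change too small
    GeneResult("GeneD", -3.0, 0.2),    # large change, but not significant
]
up, down = call_degs(results)
print("up:", up, "down:", down)
```

Requiring both thresholds at once is what excludes genes such as the hypothetical GeneC and GeneD, which each pass only one criterion.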
14
|
Steenwyk JL, Goltz DC, Buida TJ, Li Y, Shen XX, Rokas A. OrthoSNAP: A tree splitting and pruning algorithm for retrieving single-copy orthologs from gene family trees. PLoS Biol 2022; 20:e3001827. [PMID: 36228036 PMCID: PMC9595520 DOI: 10.1371/journal.pbio.3001827] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/04/2021] [Revised: 10/25/2022] [Accepted: 09/13/2022] [Indexed: 11/19/2022]
Abstract
Molecular evolution studies, such as phylogenomic studies and genome-wide surveys of selection, often rely on gene families of single-copy orthologs (SC-OGs). Large gene families with multiple homologs in 1 or more species (a phenomenon observed among several important families of genes such as transporters and transcription factors) are often ignored because identifying and retrieving SC-OGs nested within them is challenging. To address this issue and increase the number of markers used in molecular evolution studies, we developed OrthoSNAP, a software tool that uses a phylogenetic framework to simultaneously split gene families into SC-OGs and prune species-specific inparalogs. We term SC-OGs identified by OrthoSNAP as SNAP-OGs because they are identified using a splitting and pruning procedure analogous to snapping branches on a tree. From 415,129 orthologous groups of genes inferred across 7 eukaryotic phylogenomic datasets, we identified 9,821 SC-OGs; using OrthoSNAP on the remaining 405,308 orthologous groups of genes, we identified an additional 10,704 SNAP-OGs. Comparison of SNAP-OGs and SC-OGs revealed that their phylogenetic information content was similar, even in complex datasets that contain a whole-genome duplication, complex patterns of duplication and loss, transcriptome data where each gene typically has multiple transcripts, and contentious branches in the tree of life. OrthoSNAP is useful for increasing the number of markers used in molecular evolution data matrices, a critical step for robustly inferring and exploring the tree of life.
Affiliation(s)
- Jacob L. Steenwyk
- Vanderbilt University, Department of Biological Sciences, Nashville, Tennessee, United States of America
- Vanderbilt Evolutionary Studies Initiative, Vanderbilt University, Nashville, Tennessee, United States of America
- * E-mail: (JLS); (AR)
- Dayna C. Goltz
- Independent Researcher, Nashville, Tennessee, United States of America
- Thomas J. Buida
- Independent Researcher, Nashville, Tennessee, United States of America
- Yuanning Li
- Vanderbilt University, Department of Biological Sciences, Nashville, Tennessee, United States of America
- Vanderbilt Evolutionary Studies Initiative, Vanderbilt University, Nashville, Tennessee, United States of America
- Institute of Marine Science and Technology, Shandong University, Qingdao, China
- Xing-Xing Shen
- Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China
- Antonis Rokas
- Vanderbilt University, Department of Biological Sciences, Nashville, Tennessee, United States of America
- Vanderbilt Evolutionary Studies Initiative, Vanderbilt University, Nashville, Tennessee, United States of America
- Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- * E-mail: (JLS); (AR)
15
|
Liang X, Brooks MJ, Swaroop A. Developmental genome-wide occupancy analysis of bZIP transcription factor NRL uncovers the role of c-Jun in early differentiation of rod photoreceptors in the mammalian retina. Hum Mol Genet 2022; 31:3914-3933. [PMID: 35776116 DOI: 10.1093/hmg/ddac143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2022] [Revised: 06/15/2022] [Accepted: 06/21/2022] [Indexed: 11/12/2022]
Abstract
The basic motif-leucine zipper (bZIP) transcription factor NRL determines rod photoreceptor cell fate during retinal development, and its loss leads to cone-only retina in mice. NRL works synergistically with homeodomain protein CRX and other regulatory factors to control the transcription of most genes associated with rod morphogenesis and functional maturation, which span over a period of several weeks in the mammalian retina. We predicted that NRL gradually establishes rod cell identity and function by temporal and dynamic regulation of stage-specific transcriptional targets. Therefore, we mapped the genomic occupancy of NRL at four stages of mouse photoreceptor differentiation by CUT&RUN analysis. Dynamics of NRL-binding revealed concordance with the corresponding changes in transcriptome of the developing rods. Notably, we identified c-Jun proto-oncogene as one of the targets of NRL, which could bind to specific cis-elements in the c-Jun promoter and modulate its activity in HEK293 cells. Coimmunoprecipitation studies showed association of NRL with c-Jun, also a bZIP protein, in transfected cells as well as in developing mouse retina. Additionally, shRNA-mediated knockdown of c-Jun in the mouse retina in vivo resulted in altered expression of almost 1000 genes, with reduced expression of phototransduction genes and many direct targets of NRL in rod photoreceptors. We propose that c-Jun-NRL heterodimers prime the NRL-directed transcriptional program in neonatal rod photoreceptors before high NRL expression suppresses c-Jun at later stages. Our study highlights a broader cooperation among cell-type restricted and widely expressed bZIP proteins, such as c-Jun, in specific spatiotemporal contexts during cellular differentiation.
Affiliation(s)
- Xulong Liang
- Neurobiology, Neurodegeneration and Repair Laboratory, National Eye Institute, National Institutes of Health, 6 Center Drive, MSC0610, Bethesda, MD 20892, USA
- Matthew J Brooks
- Neurobiology, Neurodegeneration and Repair Laboratory, National Eye Institute, National Institutes of Health, 6 Center Drive, MSC0610, Bethesda, MD 20892, USA
- Anand Swaroop
- Neurobiology, Neurodegeneration and Repair Laboratory, National Eye Institute, National Institutes of Health, 6 Center Drive, MSC0610, Bethesda, MD 20892, USA
16
|
Steimle JD, Grisanti Canozo FJ, Park M, Kadow ZA, Samee MAH, Martin JF. Decoding the PITX2-controlled genetic network in atrial fibrillation. JCI Insight 2022; 7:e158895. [PMID: 35471998 PMCID: PMC9221021 DOI: 10.1172/jci.insight.158895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Atrial fibrillation (AF), the most common sustained cardiac arrhythmia and a major risk factor for stroke, often arises through ectopic electrical impulses derived from the pulmonary veins (PVs). Sequence variants in enhancers controlling expression of the transcription factor PITX2, which is expressed in the cardiomyocytes (CMs) of the PV and left atrium (LA), have been implicated in AF predisposition. Single nuclei multiomic profiling of RNA and analysis of chromatin accessibility combined with spectral clustering uncovered distinct PV- and LA-enriched CM cell states. Pitx2-mutant PV and LA CMs exhibited gene expression changes consistent with cardiac dysfunction through cell type-distinct, PITX2-directed, cis-regulatory grammars controlling target gene expression. The perturbed network targets in each CM were enriched in distinct human AF predisposition genes, suggesting combinatorial risk for AF genesis. Our data further reveal that PV and LA Pitx2-mutant CMs signal to endothelial and endocardial cells through BMP10 signaling with pathogenic potential. This work provides a multiomic framework for interrogating the basis of AF predisposition in the PVs of humans.
Affiliation(s)
- Zachary A. Kadow
- Program in Developmental Biology, and
- Medical Scientist Training Program, Baylor College of Medicine, Houston, Texas, USA
- James F. Martin
- Department of Integrative Physiology
- Texas Heart Institute, Houston, Texas, USA
- Center for Organ Repair and Renewal, Baylor College of Medicine, Houston, Texas, USA
17
|
Hojo H, Ohba S. Sp7 Action in the Skeleton: Its Mode of Action, Functions, and Relevance to Skeletal Diseases. Int J Mol Sci 2022; 23:5647. [PMID: 35628456 PMCID: PMC9143072 DOI: 10.3390/ijms23105647] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 04/27/2022] [Revised: 05/14/2022] [Accepted: 05/16/2022] [Indexed: 02/01/2023]
Abstract
Osteoblast differentiation is a tightly regulated process in which key transcription factors (TFs) and their target genes constitute gene regulatory networks (GRNs) under the control of osteogenic signaling pathways. Among these TFs, Sp7 works as an osteoblast determinant critical for osteoblast differentiation. Following the identification of Sp7 and a large number of its functional studies, recent genome-scale analyses have made a major contribution to the identification of a "non-canonical" mode of Sp7 action as well as "canonical" ones. The analyses have not only confirmed known Sp7 targets but have also uncovered its additional targets and upstream factors. In addition, biochemical analyses have demonstrated that Sp7 actions are regulated by chemical modifications and protein-protein interaction with other transcriptional regulators. Sp7 is also involved in chondrocyte differentiation and osteocyte biology as well as postnatal bone metabolism. The critical role of SP7 in the skeleton is supported by its relevance to human skeletal diseases. This review aims to overview the Sp7 actions in skeletal development and maintenance, particularly focusing on recent advances in our understanding of how Sp7 functions in the skeleton under physiological and pathological conditions.
Affiliation(s)
- Hironori Hojo
- Center for Disease Biology and Integrative Medicine, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
- Shinsuke Ohba
- Department of Cell Biology, Institute of Biomedical Sciences, Nagasaki University, Nagasaki 852-8588, Japan
- Department of Oral Anatomy and Developmental Biology, Osaka University Graduate School of Dentistry, Osaka 565-0871, Japan
18
|
Planat M, Amaral MM, Fang F, Chester D, Aschheim R, Irwin K. Group Theory of Syntactical Freedom in DNA Transcription and Genome Decoding. Curr Issues Mol Biol 2022; 44:1417-1433. [PMID: 35723353 PMCID: PMC9164029 DOI: 10.3390/cimb44040095] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/21/2022] [Revised: 03/19/2022] [Accepted: 03/20/2022] [Indexed: 12/24/2022]
Abstract
Transcription factors (TFs) are proteins that recognize specific DNA fragments in order to decode the genome and ensure its optimal functioning. TFs work at the local and global scales by specifying cell type, cell growth and death, cell migration, organization and timely tasks. We investigate the structure of DNA-binding motifs with the theory of finitely generated groups. The DNA ‘word’ in the binding domain—the motif—may be seen as the generator of a finitely generated group Fdna on four letters, the bases A, T, G and C. It is shown that, most of the time, the DNA-binding motifs have subgroup structures close to free groups of rank three or less, a property that we call ‘syntactical freedom’. Such a property is associated with the aperiodicity of the motif when it is seen as a substitution sequence. Examples are provided for the major families of TFs, such as leucine zipper factors, zinc finger factors, homeo-domain factors, etc. We also discuss the exceptions to the existence of such DNA syntactical rules and their functional roles. This includes the TATA box in the promoter region of some genes, the single-nucleotide markers (SNP) and the motifs of some genes of ubiquitous roles in transcription and regulation.
Affiliation(s)
- Michel Planat
- Institut FEMTO-ST CNRS UMR 6174, Université de Bourgogne-Franche-Comté, F-25044 Besançon, France
- Marcelo M. Amaral
- Quantum Gravity Research, Los Angeles, CA 90290, USA
- Fang Fang
- Quantum Gravity Research, Los Angeles, CA 90290, USA
- David Chester
- Quantum Gravity Research, Los Angeles, CA 90290, USA
- Raymond Aschheim
- Quantum Gravity Research, Los Angeles, CA 90290, USA
- Klee Irwin
- Quantum Gravity Research, Los Angeles, CA 90290, USA
19
|
Ali O, Farooq A, Yang M, Jin VX, Bjørås M, Wang J. abc4pwm: affinity based clustering for position weight matrices in applications of DNA sequence analysis. BMC Bioinformatics 2022; 23:83. [PMID: 35240993 PMCID: PMC8896320 DOI: 10.1186/s12859-022-04615-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2021] [Accepted: 02/18/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Transcription factor (TF) binding motifs are identified by high-throughput sequencing technologies as a means to capture protein-DNA interactions. These motifs are often represented by consensus sequences in the form of position weight matrices (PWMs). With an ever-increasing pool of TF binding motifs from multiple sources, redundancy issues are difficult to avoid, especially when every source maintains its own database for collection. One solution is to cluster biologically relevant or similar PWMs, whether they come from experimental detection or in silico predictions. However, there is a lack of efficient tools to cluster PWMs. Assessing the quality of PWM clusters is yet another challenge. Therefore, new methods and tools are required to efficiently cluster PWMs and assess the quality of clusters.
RESULTS: A new Python package, Affinity Based Clustering for Position Weight Matrices (abc4pwm), was developed. It efficiently clustered PWMs from multiple sources with or without using DNA-Binding Domain (DBD) information, generated a representative motif for each cluster, evaluated the clustering quality automatically, and filtered out incorrectly clustered PWMs. Additionally, it was able to update the human DBD family database automatically, classified known human TF PWMs to the respective DBD family, and performed TF motif searching and motif discovery by a new ensemble learning approach.
CONCLUSION: This work demonstrates applications of abc4pwm in DNA sequence analysis for various high-throughput sequencing data using ~1770 human TF PWMs. It recovered known TF motifs at gene promoters based on gene expression profiles (RNA-seq) and identified true TF binding targets for motifs predicted from ChIP-seq experiments. Abc4pwm is a useful tool for TF motif searching, clustering, quality assessment and integration in multiple types of sequence data analysis including RNA-seq, ChIP-seq and ATAC-seq.
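For readers unfamiliar with position weight matrices, the following Python sketch shows one common way a PWM can be derived from aligned binding sites and used to score a candidate sequence. It is not the abc4pwm API and does not reproduce the package's clustering method; the function names `build_pwm` and `score`, the pseudocount, the uniform background, and the example sites are all assumptions chosen for illustration.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Return a log-odds PWM (one dict per position) from equal-length sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for i in range(length):
        # Start every base at the pseudocount so unseen bases keep a finite score.
        counts = {b: pseudocount for b in BASES}
        for s in sites:
            counts[s[i]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background) for b in BASES})
    return pwm


def score(pwm, seq):
    """Sum the per-position log-odds for a sequence of matching length."""
    if len(seq) != len(pwm):
        raise ValueError("sequence length must match PWM length")
    return sum(col[base] for col, base in zip(pwm, seq))


# Illustrative, invented sites resembling an E-box (CANNTG) motif.
sites = ["CACGTG", "CACGTG", "CAGCTG", "CACGTG"]
pwm = build_pwm(sites)
print(score(pwm, "CACGTG"), score(pwm, "TTTTTT"))
```

A sequence matching the consensus scores higher than an unrelated one; clustering tools then need some similarity measure between whole matrices, which this sketch deliberately leaves out.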
Affiliation(s)
- Omer Ali
- Department of Pathology, Oslo University Hospital - Norwegian Radium Hospital, Oslo, Norway
- Amna Farooq
- Department of Pathology, Oslo University Hospital - Norwegian Radium Hospital, Oslo, Norway
- Mingyi Yang
- Department of Medical Biochemistry, Oslo University Hospital and University of Oslo, Oslo, Norway
- Department of Microbiology, Oslo University Hospital and University of Oslo, Oslo, Norway
- Victor X Jin
- Department of Molecular Medicine, University of Texas Health San Antonio, San Antonio, TX, USA
- Magnar Bjørås
- Department of Microbiology, Oslo University Hospital and University of Oslo, Oslo, Norway
- Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
- Junbai Wang
- Department of Clinical Molecular Biology, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Department of Clinical Molecular Biology (EpiGen), Akershus University Hospital, Lørenskog, Norway
20
|
Abstract
The change in cell state from normal to malignant is driven fundamentally by oncogenic mutations in cooperation with epigenetic alterations of chromatin. These alterations in chromatin can be a consequence of environmental stressors or germline and/or somatic mutations that directly alter the structure of chromatin machinery proteins, their levels, or their regulatory function. These changes can result in an inability of the cell to differentiate along a predefined lineage path, or drive a hyperactive, highly proliferative state with addiction to high levels of transcriptional output. We discuss how these genetic alterations hijack the chromatin machinery for the oncogenic process to reveal unique vulnerabilities and novel targets for cancer therapy.
Affiliation(s)
- Berkley Gryder
- Department of Genetics and Genome Sciences, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Genetics Branch, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland 20892, USA
- Peter C Scacheri
- Department of Genetics and Genome Sciences, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Thomas Ried
- Genetics Branch, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland 20892, USA
- Javed Khan
- Genetics Branch, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland 20892, USA
21
|
Lui JC, Raimann A, Hojo H, Dong L, Roschger P, Kikani B, Wintergerst U, Fratzl-Zelman N, Jee YH, Haeusler G, Baron J. A neomorphic variant in SP7 alters sequence specificity and causes a high-turnover bone disorder. Nat Commun 2022; 13:700. [PMID: 35121733 PMCID: PMC8816926 DOI: 10.1038/s41467-022-28318-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/12/2020] [Accepted: 01/20/2022] [Indexed: 12/14/2022]
Abstract
SP7/Osterix is a transcription factor critical for osteoblast maturation and bone formation. Homozygous loss-of-function mutations in SP7 cause osteogenesis imperfecta type XII, but neomorphic (gain-of-new-function) mutations of SP7 have not been reported in humans. Here we describe a de novo dominant neomorphic missense variant (c.926 C > G:p.S309W) in SP7 in a patient with craniosynostosis, cranial hyperostosis, and long bone fragility. Histomorphometry shows increased osteoblasts but decreased bone mineralization. Mice with the corresponding variant also show a complex skeletal phenotype distinct from that of Sp7-null mice. The mutation alters the binding specificity of SP7 from AT-rich motifs to a GC-consensus sequence (typical of other SP family members) and produces an aberrant gene expression profile, including increased expression of Col1a1 and endogenous Sp7, but decreased expression of genes involved in matrix mineralization. Our study identifies a pathogenic mechanism in which a mutation in a transcription factor shifts DNA binding specificity and provides important in vivo evidence that the affinity of SP7 for AT-rich motifs, unique among SP proteins, is critical for normal osteoblast differentiation.
Affiliation(s)
- Julian C Lui
- Section on Growth and Development, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD, USA
- Adalbert Raimann
- Department of Pediatrics and Adolescent Medicine, Division of Pediatric Pulmonology, Allergology and Endocrinology, Medical University of Vienna, Vienna, Austria
- Vienna Bone and Growth Center, Vienna, Austria
- Hironori Hojo
- Center for Disease and Integrative Medicine, University of Tokyo, Tokyo, Japan
- Lijin Dong
- Genetic Engineering Core, National Eye Institute, National Institute of Health, Bethesda, MD, USA
- Paul Roschger
- Ludwig Boltzmann Institute of Osteology at the Hanusch Hospital of OEGK and AUVA Trauma Centre Meidling, 1st Medical Department, Hanusch Hospital, Vienna, Austria
- Bijal Kikani
- Section on Growth and Development, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD, USA
- Uwe Wintergerst
- Department of Pediatrics, Hospital of Braunau, Braunau, Austria
- Nadja Fratzl-Zelman
- Vienna Bone and Growth Center, Vienna, Austria
- Ludwig Boltzmann Institute of Osteology at the Hanusch Hospital of OEGK and AUVA Trauma Centre Meidling, 1st Medical Department, Hanusch Hospital, Vienna, Austria
- Youn Hee Jee
- Section on Growth and Development, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD, USA
- Gabriele Haeusler
- Department of Pediatrics and Adolescent Medicine, Division of Pediatric Pulmonology, Allergology and Endocrinology, Medical University of Vienna, Vienna, Austria
- Vienna Bone and Growth Center, Vienna, Austria
- Jeffrey Baron
- Section on Growth and Development, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD, USA
22
|
Comparative Investigation of Gene Regulatory Processes Underlying Avian Influenza Viruses in Chicken and Duck. BIOLOGY 2022; 11:219. [PMID: 35205087 PMCID: PMC8868632 DOI: 10.3390/biology11020219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2021] [Revised: 01/07/2022] [Accepted: 01/25/2022] [Indexed: 11/30/2022]
Abstract
Simple Summary: Avian influenza poses a great risk to gallinaceous poultry, while mallard ducks can withstand most virus strains. To date, the mechanisms underlying the susceptibility of chickens and the effective immune response of ducks have not been completely understood. In this study, our aim is to investigate the transcriptional gene regulation governing the expression of important avian-influenza-induced genes and to reveal the master regulators that stimulate an effective immune response after virus infection in ducks but fail to do so in chickens.
Abstract: The avian influenza virus (AIV) mainly affects birds and not only causes animals' deaths, but also poses a great risk of zoonotically infecting humans. While ducks and wild waterfowl are seen as a natural reservoir for AIVs and can withstand most virus strains, chickens mostly succumb to infection with highly pathogenic avian influenza (HPAI). To date, the mechanisms underlying the susceptibility of chickens and the effective immune response of ducks have not been completely unraveled. In this study, we investigate the transcriptional gene regulation underlying disease progression in chickens and ducks after AIV infection. For this purpose, we use a publicly available RNA-sequencing dataset from chickens and ducks infected with low-pathogenic avian influenza (LPAI) H5N2 and HPAI H5N1 (lung and ileum tissues, 1 and 3 days post-infection). Unlike previous studies, we performed a promoter analysis based on orthologous genes to detect important transcription factors (TFs) and their cooperation, based on which we apply a systems biology approach to identify common and species-specific master regulators. We found master regulators such as EGR1, FOS, and SP1 specifically for chicken, and ETS1 and SMAD3/4 specifically for duck, which could be responsible for the duck's effective and the chicken's ineffective immune response.
23
|
Ohba S. Genome-scale actions of master regulators directing skeletal development. JAPANESE DENTAL SCIENCE REVIEW 2021; 57:217-223. [PMID: 34745394 PMCID: PMC8556520 DOI: 10.1016/j.jdsr.2021.10.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2021] [Revised: 09/14/2021] [Accepted: 10/10/2021] [Indexed: 11/03/2022]
Abstract
The mammalian skeleton develops through two distinct modes of ossification: intramembranous ossification and endochondral ossification. During the process of skeletal development, SRY-box containing gene 9 (Sox9), runt-related transcription factor 2 (Runx2), and Sp7 work as master transcription factors (TFs) or transcriptional regulators, underlying cell fate specification of the two distinct populations: bone-forming osteoblasts and cartilage-forming chondrocytes. In the past two decades, core transcriptional circuits underlying skeletal development have been identified mainly through mouse genetics and biochemical approaches. Recently emerging next-generation sequencer (NGS)-based studies have provided genome-scale views on the gene regulatory landscape programmed by the master TFs/transcriptional regulators. With particular focus on Sox9, Runx2, and Sp7, this review aims to discuss the gene regulatory landscape in skeletal development, which has been identified by genome-scale data, and provide future perspectives in this field.
Affiliation(s)
- Shinsuke Ohba
- Department of Cell Biology, Institute of Biomedical Sciences, Nagasaki University, 1-7-1 Sakamoto, Nagasaki 852-8588, Japan
|
24
|
Muley VY, König R. Human transcriptional gene regulatory network compiled from 14 data resources. Biochimie 2021; 193:115-125. [PMID: 34740743 DOI: 10.1016/j.biochi.2021.10.016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/29/2021] [Revised: 10/28/2021] [Accepted: 10/29/2021] [Indexed: 11/02/2022]
Abstract
The transcriptional regulatory network (TRN) in a cell orchestrates spatio-temporal expression of genes to generate cellular responses for maintenance, reproduction, development and survival of the cell and its hosting organism. Transcription factors (TFs) regulate the expression of their target genes (TGs) and are the fundamental units of the TRN. Given their importance in understanding cellular physiology, several databases have been developed to catalogue the human TRN based on low- and high-throughput experimental and computational studies. However, the literature lacks a comparative assessment of their strengths and weaknesses. We compared over 2.2 million regulatory pairs between 1379 TFs and 22,518 TGs from 14 resources. Our study reveals that the TFs and TGs were common across data resources, but their regulatory pairs were not. The TFs and TGs of the regulatory pairs showed weak expression correlation, significant gene ontology overlap, and co-citations in PubMed, and only a low number of TF-TG pairs represented transcriptional repression relationships. We assigned each TF-TG regulatory pair a combined confidence score reflecting its reliability based on its presence in multiple databases. The assembled TRN contains 2,246,598 TF-TG pairs, of which 44,284 carry information on the TF's activating or repressing effect on its TG, and is available upon request. This study brings the information about transcriptional regulation scattered over the literature and databases together in one place, in the form of one of the most comprehensive and complete human TRNs assembled to date. It will be a valuable resource for benchmarking TRN prediction tools, and for the scientific community working in functional genomics, gene expression and regulation analysis.
Affiliation(s)
- Rainer König
- Institute for Infectious Diseases and Infection Control, Jena University Hospital, Jena, Germany; Integrated Research and Treatment Center, Center for Sepsis Control and Care, Jena University Hospital, Jena, Germany.
|
25
|
Yao Z, Liu H, Xie F, Fischer S, Adkins RS, Aldridge AI, Ament SA, Bartlett A, Behrens MM, Van den Berge K, Bertagnolli D, de Bézieux HR, Biancalani T, Booeshaghi AS, Bravo HC, Casper T, Colantuoni C, Crabtree J, Creasy H, Crichton K, Crow M, Dee N, Dougherty EL, Doyle WI, Dudoit S, Fang R, Felix V, Fong O, Giglio M, Goldy J, Hawrylycz M, Herb BR, Hertzano R, Hou X, Hu Q, Kancherla J, Kroll M, Lathia K, Li YE, Lucero JD, Luo C, Mahurkar A, McMillen D, Nadaf NM, Nery JR, Nguyen TN, Niu SY, Ntranos V, Orvis J, Osteen JK, Pham T, Pinto-Duarte A, Poirion O, Preissl S, Purdom E, Rimorin C, Risso D, Rivkin AC, Smith K, Street K, Sulc J, Svensson V, Tieu M, Torkelson A, Tung H, Vaishnav ED, Vanderburg CR, van Velthoven C, Wang X, White OR, Huang ZJ, Kharchenko PV, Pachter L, Ngai J, Regev A, Tasic B, Welch JD, Gillis J, Macosko EZ, Ren B, Ecker JR, Zeng H, Mukamel EA. A transcriptomic and epigenomic cell atlas of the mouse primary motor cortex. Nature 2021; 598:103-110. [PMID: 34616066 PMCID: PMC8494649 DOI: 10.1038/s41586-021-03500-8] [Citation(s) in RCA: 132] [Impact Index Per Article: 44.0] [Received: 03/05/2020] [Accepted: 03/26/2021] [Indexed: 12/30/2022]
Abstract
Single-cell transcriptomics can provide quantitative molecular signatures for large, unbiased samples of the diverse cell types in the brain (refs. 1-3). With the proliferation of multi-omics datasets, a major challenge is to validate and integrate results into a biological understanding of cell-type organization. Here we generated transcriptomes and epigenomes from more than 500,000 individual cells in the mouse primary motor cortex, a structure that has an evolutionarily conserved role in locomotion. We developed computational and statistical methods to integrate multimodal data and quantitatively validate cell-type reproducibility. The resulting reference atlas, containing over 56 neuronal cell types that are highly replicable across analysis methods, sequencing technologies and modalities, is a comprehensive molecular and genomic account of the diverse neuronal and non-neuronal cell types in the mouse primary motor cortex. The atlas includes a population of excitatory neurons that resemble pyramidal cells in layer 4 in other cortical regions (ref. 4). We further discovered thousands of concordant marker genes and gene regulatory elements for these cell types. Our results highlight the complex molecular regulation of cell types in the brain and will directly enable the design of reagents to target specific cell types in the mouse primary motor cortex for functional analysis.
Affiliation(s)
- Zizhen Yao
- Allen Institute for Brain Science, Seattle, WA, USA
- Hanqing Liu
- Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Fangming Xie
- Department of Physics, University of California, San Diego, La Jolla, CA, USA
- Stephan Fischer
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Ricky S. Adkins
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Andrew I. Aldridge
- Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Seth A. Ament
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Anna Bartlett
- Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- M. Margarita Behrens
- Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Koen Van den Berge
- Department of Statistics, University of California, Berkeley, Berkeley, CA, USA; Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Gent, Belgium
- Hector Roux de Bézieux
- Division of Biostatistics, School of Public Health, University of California, Berkeley, Berkeley, CA, USA
- A. Sina Booeshaghi
- California Institute of Technology, Pasadena, CA, USA
- Héctor Corrada Bravo
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, College Park, MD, USA
- Tamara Casper
- Allen Institute for Brain Science, Seattle, WA, USA
- Carlo Colantuoni
- Johns Hopkins School of Medicine, Department of Neurology, Baltimore, MD, USA; Johns Hopkins School of Medicine, Department of Neuroscience, Baltimore, MD, USA; University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- Jonathan Crabtree
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Heather Creasy
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Megan Crow
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Nick Dee
- Allen Institute for Brain Science, Seattle, WA, USA
- Wayne I. Doyle
- Department of Cognitive Science, University of California, San Diego, La Jolla, CA, USA
- Sandrine Dudoit
- Department of Statistics, University of California, Berkeley, Berkeley, CA, USA
- Rongxin Fang
- Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, San Diego, CA, USA
- Victor Felix
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Olivia Fong
- Allen Institute for Brain Science, Seattle, WA, USA
- Michelle Giglio
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Jeff Goldy
- Allen Institute for Brain Science, Seattle, WA, USA
- Mike Hawrylycz
- Allen Institute for Brain Science, Seattle, WA, USA
- Brian R. Herb
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Ronna Hertzano
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA; Department of Otorhinolaryngology, Anatomy and Neurobiology, University of Maryland School of Medicine, Baltimore, MD, USA
- Xiaomeng Hou
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Qiwen Hu
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Jayaram Kancherla
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, College Park, MD, USA
- Matthew Kroll
- Allen Institute for Brain Science, Seattle, WA, USA
- Kanan Lathia
- Allen Institute for Brain Science, Seattle, WA, USA
- Yang Eric Li
- Ludwig Institute for Cancer Research, La Jolla, CA, USA
- Jacinta D. Lucero
- Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Chongyuan Luo
- Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA; Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA; Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Anup Mahurkar
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Naeem M. Nadaf
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Joseph R. Nery
- Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Sheng-Yong Niu
- Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Vasilis Ntranos
- University of California, San Francisco, San Francisco, CA, USA
- Joshua Orvis
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Julia K. Osteen
- Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Thanh Pham
- Allen Institute for Brain Science, Seattle, WA, USA
- Antonio Pinto-Duarte
- Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Olivier Poirion
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Sebastian Preissl
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Elizabeth Purdom
- Department of Statistics, University of California, Berkeley, Berkeley, CA, USA
- Davide Risso
- Department of Statistical Sciences, University of Padova, Padova, Italy
- Angeline C. Rivkin
- Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Kimberly Smith
- Allen Institute for Brain Science, Seattle, WA, USA
- Kelly Street
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Josef Sulc
- Allen Institute for Brain Science, Seattle, WA, USA
- Valentine Svensson
- California Institute of Technology, Pasadena, CA, USA
- Michael Tieu
- Allen Institute for Brain Science, Seattle, WA, USA
- Amy Torkelson
- Allen Institute for Brain Science, Seattle, WA, USA
- Herman Tung
- Allen Institute for Brain Science, Seattle, WA, USA
- Xinxin Wang
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA; Present Address: McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Owen R. White
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Z. Josh Huang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Peter V. Kharchenko
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Lior Pachter
- California Institute of Technology, Pasadena, CA, USA
- John Ngai
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Aviv Regev
- Broad Institute of MIT and Harvard, Cambridge, MA, USA; Howard Hughes Medical Institute, Department of Biology, MIT, Cambridge, MA, USA
- Bosiljka Tasic
- Allen Institute for Brain Science, Seattle, WA, USA
- Joshua D. Welch
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Jesse Gillis
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Evan Z. Macosko
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Bing Ren
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA; Ludwig Institute for Cancer Research, La Jolla, CA, USA
- Joseph R. Ecker
- Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA; Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Hongkui Zeng
- Allen Institute for Brain Science, Seattle, WA, USA.
- Eran A. Mukamel
- Department of Cognitive Science, University of California, San Diego, La Jolla, CA, USA
|
26
|
Ignatieva EV, Matrosova EA. Disease-associated genetic variants in the regulatory regions of human genes: mechanisms of action on transcription and genomic resources for dissecting these mechanisms. Vavilovskii Zhurnal Genet Selektsii 2021; 25:18-29. [PMID: 34541447 PMCID: PMC8408020 DOI: 10.18699/vj21.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2020] [Revised: 01/18/2021] [Accepted: 01/18/2021] [Indexed: 11/21/2022] Open
Abstract
Whole genome and whole exome sequencing technologies play a very important role in studies of the genetic aspects of the pathogenesis of various diseases. The ample use of genome-wide and exome-wide association study methodology (GWAS and EWAS) has made it possible to identify a large number of genetic variants associated with diseases. This information is accumulated in databases such as GWAS Central, GWAS Catalog, OMIM, ClinVar, etc. Most of the variants identified by the GWAS technique are located in the noncoding regions of the human genome. According to the ENCODE project, the fraction of regions in the human genome potentially involved in transcriptional control is many times greater than the fraction of coding regions. Thus, genetic variation in noncoding regions of the genome can increase the susceptibility to diseases by disrupting various regulatory elements (promoters, enhancers, silencers, insulator regions, etc.). However, identifying the mechanisms by which pathogenic genetic variants influence disease risk is difficult due to the wide variety of regulatory elements. The present review focuses on the molecular genetic mechanisms by which pathogenic genetic variants affect gene expression. At the same time, attention is concentrated on the transcriptional level of regulation as an initial step in the expression of any gene. A triggering event mediating the effect of a pathogenic genetic variant on the level of gene expression can be, for example, a change in the functional activity of transcription factor binding sites (TFBSs) or a DNA methylation change, which, in turn, affects the functional activity of promoters or enhancers. Dissecting the regulatory roles of polymorphic loci would be impossible without close integration of modern experimental approaches with computer analysis of a growing wealth of genetic and biological data obtained using omics technologies. The review provides a brief description of a number of the most well-known public genomic information resources containing data obtained using omics technologies, including (1) resources that accumulate data on the chromatin states and the regions of transcription factor binding derived from ChIP-seq experiments; (2) resources containing data on genomic loci for which allele-specific transcription factor binding was revealed based on ChIP-seq technology; (3) resources containing in silico predicted data on the potential impact of genetic variants on the transcription factor binding sites.
Affiliation(s)
- E V Ignatieva
- Institute of Cytology and Genetics of Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia; Novosibirsk State University, Novosibirsk, Russia
- E A Matrosova
- Institute of Cytology and Genetics of Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia; Novosibirsk State University, Novosibirsk, Russia
|
27
|
Jovanovic VM, Sarfert M, Reyna-Blanco CS, Indrischek H, Valdivia DI, Shelest E, Nowick K. Positive Selection in Gene Regulatory Factors Suggests Adaptive Pleiotropic Changes During Human Evolution. Front Genet 2021; 12:662239. [PMID: 34079582 PMCID: PMC8166252 DOI: 10.3389/fgene.2021.662239] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/31/2021] [Accepted: 04/19/2021] [Indexed: 01/09/2023] Open
Abstract
Gene regulatory factors (GRFs), such as transcription factors, co-factors and histone-modifying enzymes, play many important roles in modifying gene expression in biological processes. They have also been proposed to underlie speciation and adaptation. To investigate potential contributions of GRFs to primate evolution, we analyzed GRF genes in 27 publicly available primate genomes. Genes coding for zinc finger (ZNF) proteins, especially ZNFs with a Krüppel-associated box (KRAB) domain were the most abundant TFs in all genomes. Gene numbers per TF family differed between all species. To detect signs of positive selection in GRF genes we investigated more than 3,000 human GRFs with their more than 70,000 orthologs in 26 non-human primates. We implemented two independent tests for positive selection, the branch-site-model of the PAML suite and aBSREL of the HyPhy suite, focusing on the human and great ape branch. Our workflow included rigorous procedures to reduce the number of false positives: excluding distantly similar orthologs, manual corrections of alignments, and considering only genes and sites detected by both tests for positive selection. Furthermore, we verified the candidate sites for selection by investigating their variation within human and non-human great ape population data. In order to approximately assign a date to positively selected sites in the human lineage, we analyzed archaic human genomes. Our work revealed with high confidence five GRFs that have been positively selected on the human lineage and one GRF that has been positively selected on the great ape lineage. These GRFs are scattered on different chromosomes and have been previously linked to diverse functions. For some of them a role in speciation and/or adaptation can be proposed based on the expression pattern or association with human diseases, but it seems that they all contributed independently to human evolution. 
Four of the positively selected GRFs are KRAB-ZNF proteins, which act by inducing changes in target-gene co-expression and/or through an arms race with transposable elements. Since each positively selected GRF contains several sites with evidence for positive selection, we suggest that these GRFs contributed pleiotropically to phenotypic adaptations in humans.
Affiliation(s)
- Vladimir M Jovanovic
- Human Biology and Primate Evolution, Freie Universität Berlin, Berlin, Germany; Bioinformatics Solution Center, Freie Universität Berlin, Berlin, Germany
- Melanie Sarfert
- Human Biology and Primate Evolution, Freie Universität Berlin, Berlin, Germany
- Carlos S Reyna-Blanco
- Department of Biology, University of Fribourg, Fribourg, Switzerland; Swiss Institute of Bioinformatics, Fribourg, Switzerland
- Henrike Indrischek
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany; Max Planck Institute for the Physics of Complex Systems, Dresden, Germany; Center for Systems Biology Dresden, Dresden, Germany
- Dulce I Valdivia
- Evolutionary Genomics Laboratory and Genome Topology and Regulation Laboratory, Genetic Engineering Department, Center for Research and Advanced Studies of the National Polytechnic Institute (CINVESTAV-Irapuato), Irapuato, Mexico
- Ekaterina Shelest
- Centre for Enzyme Innovation, University of Portsmouth, Portsmouth, United Kingdom
- Katja Nowick
- Human Biology and Primate Evolution, Freie Universität Berlin, Berlin, Germany
|
28
|
Kamaraj US, Chen J, Katwadi K, Ouyang JF, Yang Sun YB, Lim YM, Liu X, Handoko L, Polo JM, Petretto E, Rackham OJ. EpiMogrify Models H3K4me3 Data to Identify Signaling Molecules that Improve Cell Fate Control and Maintenance. Cell Syst 2020; 11:509-522.e10. [DOI: 10.1016/j.cels.2020.09.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/19/2019] [Revised: 04/30/2020] [Accepted: 09/14/2020] [Indexed: 12/14/2022]
|
29
|
Skeffington AW, Donath A. ProminTools: shedding light on proteins of unknown function in biomineralization with user friendly tools illustrated using mollusc shell matrix protein sequences. PeerJ 2020; 8:e9852. [PMID: 32974096 PMCID: PMC7489238 DOI: 10.7717/peerj.9852] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/14/2020] [Accepted: 08/11/2020] [Indexed: 01/24/2023] Open
Abstract
Biominerals are crucial to the fitness of many organisms, and studies of the mechanisms of biomineralization are driving research into novel materials. Biomineralization is generally controlled by a matrix of organic molecules including proteins, so proteomic studies of biominerals are important for understanding biomineralization mechanisms. Many such studies identify large numbers of proteins of unknown function, which are often of low sequence complexity and biased in their amino acid composition. A lack of user-friendly tools to find patterns in such sequences and robustly analyse their statistical properties relative to the background proteome means that they are often neglected in follow-up studies. Here we present ProminTools, a user-friendly package for comparison of two sets of protein sequences in terms of their global properties and motif content. Outputs include data tables, graphical summaries in an HTML file and an R script as a starting point for data-set-specific visualizations. We demonstrate the utility of ProminTools using a previously published shell matrix proteome of the giant limpet Lottia gigantea.
Affiliation(s)
- Andreas Donath
- Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany
|
30
|
Hojo H, Ohba S. Gene regulatory landscape in osteoblast differentiation. Bone 2020; 137:115458. [PMID: 32474244 DOI: 10.1016/j.bone.2020.115458] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/30/2020] [Revised: 05/25/2020] [Accepted: 05/25/2020] [Indexed: 12/29/2022]
Abstract
The development of osteoblasts, a bone-forming cell population, occurs in conjunction with development of the skeleton, which creates our physical framework and shapes the body. In the past two decades, genetic studies have uncovered the molecular framework of this process: namely, transcriptional regulators and signaling pathways coordinate the cell fate determination and differentiation of osteoblasts in a spatial and temporal manner. Recently emerging genome-wide studies provide additional layers of understanding of the gene regulatory landscape during osteoblast differentiation, allowing us to gain novel insight into the modes of action of the key regulators, functional interaction among the regulator-bound enhancers, epigenetic regulations, and the complex nature of regulatory inputs. In this review, we summarize current understanding of the transcriptional regulation in osteoblasts, in terms of the gene regulatory landscape.
Affiliation(s)
- Hironori Hojo
- Department of Clinical Biotechnology, Center for Disease Biology and Integrative Medicine, The University of Tokyo Graduate School of Medicine, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
- Shinsuke Ohba
- Department of Cell Biology, Institute of Biomedical Sciences, Nagasaki University, 1-7-1 Sakamoto, Nagasaki 852-8588, Japan.
|
31
|
Partridge EC, Chhetri SB, Prokop JW, Ramaker RC, Jansen CS, Goh ST, Mackiewicz M, Newberry KM, Brandsmeier LA, Meadows SK, Messer CL, Hardigan AA, Coppola CJ, Dean EC, Jiang S, Savic D, Mortazavi A, Wold BJ, Myers RM, Mendenhall EM. Occupancy maps of 208 chromatin-associated proteins in one human cell type. Nature 2020; 583:720-728. [PMID: 32728244 PMCID: PMC7398277 DOI: 10.1038/s41586-020-2023-4] [Citation(s) in RCA: 72] [Impact Index Per Article: 18.0] [Received: 10/04/2017] [Accepted: 01/09/2020] [Indexed: 01/02/2023]
Abstract
Transcription factors are DNA-binding proteins that have key roles in gene regulation (refs. 1, 2). Genome-wide occupancy maps of transcriptional regulators are important for understanding gene regulation and its effects on diverse biological processes (refs. 3–6). However, only a minority of the more than 1,600 transcription factors encoded in the human genome has been assayed. Here we present, as part of the ENCODE (Encyclopedia of DNA Elements) project, data and analyses from chromatin immunoprecipitation followed by high-throughput sequencing (ChIP–seq) experiments using the human HepG2 cell line for 208 chromatin-associated proteins (CAPs). These comprise 171 transcription factors and 37 transcriptional cofactors and chromatin regulator proteins, and represent nearly one-quarter of CAPs expressed in HepG2 cells. The binding profiles of these CAPs form major groups associated predominantly with promoters or enhancers, or with both. We confirm and expand the current catalogue of DNA sequence motifs for transcription factors, and describe motifs that correspond to other transcription factors that are co-enriched with the primary ChIP target. For example, FOX family motifs are enriched in ChIP–seq peaks of 37 other CAPs. We show that motif content and occupancy patterns can distinguish between promoters and enhancers. This catalogue reveals high-occupancy target regions at which many CAPs associate, although each contains motifs for only a minority of the numerous associated transcription factors. These analyses provide a more complete overview of the gene regulatory networks that define this cell type, and demonstrate the usefulness of the large-scale production efforts of the ENCODE Consortium.
ChIP–seq and CETCh–seq data are used to analyse binding maps for 208 transcription factors and other chromatin-associated proteins in a single human cell type, providing a comprehensive catalogue of the transcription factor landscape and gene regulatory networks in these cells.
Affiliation(s)
| | - Surya B Chhetri
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.,Department of Biological Sciences, The University of Alabama in Huntsville, Huntsville, AL, USA.,Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MA, USA
| | - Jeremy W Prokop
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.,Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI, USA
| | - Ryne C Ramaker
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.,Department of Genetics, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Camden S Jansen
- Department of Developmental and Cell Biology, University of California Irvine, Irvine, CA, USA
| | - Say-Tar Goh
- Division of Biology, California Institute of Technology, Pasadena, CA, USA
| | - Mark Mackiewicz
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
| | | | | | - Sarah K Meadows
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
| | - C Luke Messer
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
| | - Andrew A Hardigan
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.,Department of Genetics, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Candice J Coppola
- Department of Biological Sciences, The University of Alabama in Huntsville, Huntsville, AL, USA
| | - Emma C Dean
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.,Department of Pathology, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Shan Jiang
- Department of Developmental and Cell Biology, University of California Irvine, Irvine, CA, USA
| | - Daniel Savic
- Pharmaceutical Sciences Department, St Jude Children's Research Hospital, Memphis, TN, USA
| | - Ali Mortazavi
- Department of Developmental and Cell Biology, University of California Irvine, Irvine, CA, USA
| | - Barbara J Wold
- Division of Biology, California Institute of Technology, Pasadena, CA, USA
| | - Richard M Myers
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.
| | - Eric M Mendenhall
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.,Department of Biological Sciences, The University of Alabama in Huntsville, Huntsville, AL, USA.
|
32
|
Buljan M, Ciuffa R, van Drogen A, Vichalkovski A, Mehnert M, Rosenberger G, Lee S, Varjosalo M, Pernas LE, Spegg V, Snijder B, Aebersold R, Gstaiger M. Kinase Interaction Network Expands Functional and Disease Roles of Human Kinases. Mol Cell 2020; 79:504-520.e9. [PMID: 32707033 PMCID: PMC7427327 DOI: 10.1016/j.molcel.2020.07.001] [Citation(s) in RCA: 56] [Impact Index Per Article: 14.0] [Received: 05/13/2019] [Revised: 02/14/2020] [Accepted: 06/30/2020] [Indexed: 12/30/2022]
Abstract
Protein kinases are essential for signal transduction and control of most cellular processes, including metabolism, membrane transport, motility, and cell cycle. Despite the critical role of kinases in cells and their strong association with diseases, good coverage of their interactions is available for only a fraction of the 535 human kinases. Here, we present a comprehensive mass-spectrometry-based analysis of a human kinase interaction network covering more than 300 kinases. The interaction dataset is a high-quality resource with more than 5,000 previously unreported interactions. We extensively characterized the obtained network and were able to identify previously described, as well as predict new, kinase functional associations, including those of the less well-studied kinases PIM3 and protein O-mannose kinase (POMK). Importantly, the presented interaction map is a valuable resource for assisting biomedical studies. We uncover dozens of kinase-disease associations spanning from genetic disorders to complex diseases, including cancer.
Affiliation(s)
- Marija Buljan
- Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland; Empa, Swiss Federal Laboratories for Materials Science and Technology, 9014 St. Gallen, Switzerland
| | - Rodolfo Ciuffa
- Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland
| | - Audrey van Drogen
- Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland
| | - Anton Vichalkovski
- Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland
| | - Martin Mehnert
- Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland
| | - George Rosenberger
- Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland; Columbia University Department of Systems Biology, New York, NY 10032, USA
| | - Sohyon Lee
- Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland
| | - Markku Varjosalo
- Institute of Biotechnology, University of Helsinki, Helsinki 00014, Finland
| | - Lucia Espona Pernas
- Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland
| | - Vincent Spegg
- Department of Molecular Mechanisms of Disease, University of Zurich, 8057 Zurich, Switzerland
| | - Berend Snijder
- Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland
| | - Ruedi Aebersold
- Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland; Faculty of Science, University of Zurich, Zurich, Switzerland.
| | - Matthias Gstaiger
- Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland.
|
33
|
Sulakhe D, D'Souza M, Wang S, Balasubramanian S, Athri P, Xie B, Canzar S, Agam G, Gilliam TC, Maltsev N. Exploring the functional impact of alternative splicing on human protein isoforms using available annotation sources. Brief Bioinform 2020; 20:1754-1768. [PMID: 29931155 DOI: 10.1093/bib/bby047] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 01/19/2018] [Revised: 05/02/2018] [Indexed: 12/30/2022]
Abstract
In recent years, the emphasis of scientific inquiry has shifted from whole-genome analyses to an understanding of cellular responses specific to tissue, developmental stage or environmental conditions. One of the central mechanisms underlying the diversity and adaptability of the contextual responses is alternative splicing (AS). It enables a single gene to encode multiple isoforms with distinct biological functions. However, to date, the functions of the vast majority of differentially spliced protein isoforms are not known. Integration of genomic, proteomic, functional, phenotypic and contextual information is essential for supporting isoform-based modeling and analysis. Such integrative proteogenomics approaches promise to provide insights into the functions of the alternatively spliced protein isoforms and provide high-confidence hypotheses to be validated experimentally. This manuscript provides a survey of the public databases supporting isoform-based biology. It also presents an overview of the potential global impact of AS on the human canonical gene functions, molecular interactions and cellular pathways.
Affiliation(s)
- Dinanath Sulakhe
- Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, USA.,Computation Institute, University of Chicago, 5735 S. Ellis Avenue, Chicago, IL, USA
| | - Mark D'Souza
- Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, USA
| | - Sheng Wang
- Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, USA.,Toyota Technological Institute at Chicago, 6045 S. Kenwood Avenue, Chicago, IL, USA
| | - Sandhya Balasubramanian
- Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, USA.,Genentech, Inc. 1 DNA Way, Mail Stop: 35-6J, South San Francisco, CA, USA
| | - Prashanth Athri
- Department of Computer Science and Engineering, Amrita School of Engineering, Bengaluru, Amrita Vishwa Vidyapeetham, Kasavanahalli, Carmelaram P.O., Bengaluru, Karnataka, India
| | - Bingqing Xie
- Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, USA.,Department of Computer Science, Illinois Institute of Technology, Chicago, IL, USA
| | - Stefan Canzar
- Toyota Technological Institute at Chicago, 6045 S. Kenwood Avenue, Chicago, IL, USA.,Gene Center, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Gady Agam
- Department of Computer Science, Illinois Institute of Technology, Chicago, IL, USA
| | - T Conrad Gilliam
- Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, USA.,Computation Institute, University of Chicago, 5735 S. Ellis Avenue, Chicago, IL, USA
| | - Natalia Maltsev
- Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, USA.,Computation Institute, University of Chicago, 5735 S. Ellis Avenue, Chicago, IL, USA
|
34
|
Gorkin DU, Barozzi I, Zhao Y, Zhang Y, Huang H, Lee AY, Li B, Chiou J, Wildberg A, Ding B, Zhang B, Wang M, Strattan JS, Davidson JM, Qiu Y, Afzal V, Akiyama JA, Plajzer-Frick I, Novak CS, Kato M, Garvin TH, Pham QT, Harrington AN, Mannion BJ, Lee EA, Fukuda-Yuzawa Y, He Y, Preissl S, Chee S, Han JY, Williams BA, Trout D, Amrhein H, Yang H, Cherry JM, Wang W, Gaulton K, Ecker JR, Shen Y, Dickel DE, Visel A, Pennacchio LA, Ren B. An atlas of dynamic chromatin landscapes in mouse fetal development. Nature 2020; 583:744-751. [PMID: 32728240 PMCID: PMC7398618 DOI: 10.1038/s41586-020-2093-3] [Citation(s) in RCA: 190] [Impact Index Per Article: 47.5] [Received: 08/08/2017] [Accepted: 06/11/2019] [Indexed: 02/08/2023]
Abstract
The Encyclopedia of DNA Elements (ENCODE) project has established a genomic resource for mammalian development, profiling a diverse panel of mouse tissues at 8 developmental stages from 10.5 days after conception until birth, including transcriptomes, methylomes and chromatin states. Here we systematically examined the state and accessibility of chromatin in the developing mouse fetus. In total we performed 1,128 chromatin immunoprecipitation with sequencing (ChIP-seq) assays for histone modifications and 132 assay for transposase-accessible chromatin using sequencing (ATAC-seq) assays for chromatin accessibility across 72 distinct tissue-stages. We used integrative analysis to develop a unified set of chromatin state annotations, infer the identities of dynamic enhancers and key transcriptional regulators, and characterize the relationship between chromatin state and accessibility during developmental gene regulation. We also leveraged these data to link enhancers to putative target genes and demonstrate tissue-specific enrichments of sequence variants associated with disease in humans. The mouse ENCODE data sets provide a compendium of resources for biomedical researchers and achieve, to our knowledge, the most comprehensive view of chromatin dynamics during mammalian fetal development to date.
Affiliation(s)
- David U Gorkin
- Ludwig Institute for Cancer Research, La Jolla, CA, USA
- Center for Epigenomics, University of California, San Diego School of Medicine, La Jolla, CA, USA
| | - Iros Barozzi
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Department of Surgery and Cancer, Imperial College London, London, UK
| | - Yuan Zhao
- Ludwig Institute for Cancer Research, La Jolla, CA, USA
- Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA, USA
| | - Yanxiao Zhang
- Ludwig Institute for Cancer Research, La Jolla, CA, USA
| | - Hui Huang
- Ludwig Institute for Cancer Research, La Jolla, CA, USA
- Biomedical Sciences Graduate Program, University of California, San Diego School of Medicine, La Jolla, CA, USA
| | - Ah Young Lee
- Ludwig Institute for Cancer Research, La Jolla, CA, USA
| | - Bin Li
- Ludwig Institute for Cancer Research, La Jolla, CA, USA
| | - Joshua Chiou
- Biomedical Sciences Graduate Program, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Department of Pediatrics, University of California, San Diego School of Medicine, La Jolla, CA, USA
| | - Andre Wildberg
- Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
| | - Bo Ding
- Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
| | - Bo Zhang
- Department of Biochemistry and Molecular Biology, Penn State School of Medicine, Hershey, PA, USA
| | - Mengchi Wang
- Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
| | - J Seth Strattan
- Stanford University School of Medicine, Department of Genetics, Stanford, CA, USA
| | - Jean M Davidson
- Stanford University School of Medicine, Department of Genetics, Stanford, CA, USA
| | - Yunjiang Qiu
- Ludwig Institute for Cancer Research, La Jolla, CA, USA
- Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA, USA
| | - Veena Afzal
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Jennifer A Akiyama
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Ingrid Plajzer-Frick
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Catherine S Novak
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Momoe Kato
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Tyler H Garvin
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Quan T Pham
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Anne N Harrington
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Brandon J Mannion
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Elizabeth A Lee
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Yoko Fukuda-Yuzawa
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Yupeng He
- Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA, USA
- Genomic Analysis Laboratory, Salk Institute for Biological Studies, La Jolla, CA, USA
| | - Sebastian Preissl
- Ludwig Institute for Cancer Research, La Jolla, CA, USA
- Center for Epigenomics, University of California, San Diego School of Medicine, La Jolla, CA, USA
| | - Sora Chee
- Ludwig Institute for Cancer Research, La Jolla, CA, USA
| | - Jee Yun Han
- Center for Epigenomics, University of California, San Diego School of Medicine, La Jolla, CA, USA
| | - Brian A Williams
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
| | - Diane Trout
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
| | - Henry Amrhein
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
| | - Hongbo Yang
- Department of Biochemistry and Molecular Biology, Penn State School of Medicine, Hershey, PA, USA
| | - J Michael Cherry
- Stanford University School of Medicine, Department of Genetics, Stanford, CA, USA
| | - Wei Wang
- Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
| | - Kyle Gaulton
- Department of Pediatrics, University of California, San Diego School of Medicine, La Jolla, CA, USA
| | - Joseph R Ecker
- Genomic Analysis Laboratory, Salk Institute for Biological Studies, La Jolla, CA, USA
- Howard Hughes Medical Institute, Salk Institute for Biological Studies, La Jolla, CA, USA
| | - Yin Shen
- Institute for Human Genetics and University of California, San Francisco, San Francisco, CA, USA
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| | - Diane E Dickel
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Axel Visel
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA.
- School of Natural Sciences, University of California, Merced, Merced, CA, USA.
| | - Len A Pennacchio
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA.
- Comparative Biochemistry Program, University of California, Berkeley, Berkeley, CA, USA.
| | - Bing Ren
- Ludwig Institute for Cancer Research, La Jolla, CA, USA.
- Center for Epigenomics, University of California, San Diego School of Medicine, La Jolla, CA, USA.
- Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA.
- Institute of Genomic Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA.
- Moores Cancer Center, University of California, San Diego School of Medicine, La Jolla, CA, USA.
|
35
|
Identifying Cattle Breed-Specific Partner Choice of Transcription Factors during the African Trypanosomiasis Disease Progression Using Bioinformatics Analysis. Vaccines (Basel) 2020; 8:vaccines8020246. [PMID: 32456126 PMCID: PMC7350023 DOI: 10.3390/vaccines8020246] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/14/2020] [Revised: 05/13/2020] [Accepted: 05/21/2020] [Indexed: 12/18/2022]
Abstract
African Animal Trypanosomiasis (AAT) is a disease caused by pathogenic trypanosomes which affects millions of livestock every year causing huge economic losses in agricultural production especially in sub-Saharan Africa. The disease is spread by the tsetse fly which carries the parasite in its saliva. During the disease progression, the cattle are prominently subjected to anaemia, weight loss, intermittent fever, chills, neuronal degeneration, congestive heart failure, and finally death. According to their different genetic programs governing the level of tolerance to AAT, cattle breeds are classified as either resistant or susceptible. In this study, we focus on the cattle breeds N’Dama and Boran which are known to be resistant and susceptible to trypanosomiasis, respectively. Despite the rich literature on both breeds, the gene regulatory mechanisms of the underlying biological processes for their resistance and susceptibility have not been extensively studied. To address the limited knowledge about the tissue-specific transcription factor (TF) cooperations associated with trypanosomiasis, we investigated gene expression data from these cattle breeds computationally. Consequently, we identified significant cooperative TF pairs (especially DBP−PPARA and DBP−THAP1 in N’Dama and DBP−PAX8 in Boran liver tissue) which could help understand the underlying AAT tolerance/susceptibility mechanism in both cattle breeds.
|
36
|
Levitsky V, Zemlyanskaya E, Oshchepkov D, Podkolodnaya O, Ignatieva E, Grosse I, Mironova V, Merkulova T. A single ChIP-seq dataset is sufficient for comprehensive analysis of motifs co-occurrence with MCOT package. Nucleic Acids Res 2020; 47:e139. [PMID: 31750523 PMCID: PMC6868382 DOI: 10.1093/nar/gkz800] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 03/30/2019] [Revised: 08/12/2019] [Accepted: 09/09/2019] [Indexed: 01/20/2023]
Abstract
Recognition of composite elements consisting of two transcription factor binding sites underlies studies of tissue-, stage- and condition-specific transcription. Genome-wide data on transcription factor binding generated with the ChIP-seq method facilitate the identification of composite elements, but the existing bioinformatics tools either require ChIP-seq datasets for both partner transcription factors, or omit composite elements with overlapping motifs. Here we present a universal Motifs Co-Occurrence Tool (MCOT) that retrieves maximum information about overrepresented composite elements from a single ChIP-seq dataset. This includes homo- and heterotypic composite elements of four mutual orientations of motifs, separated by a spacer or overlapping, even if recognition of motifs within a composite element requires various stringencies. Analysis of 52 ChIP-seq datasets for 18 human transcription factors confirmed that, for over 60% of analyzed datasets and transcription factors, predicted co-occurrence of motifs implied experimentally proven protein-protein interaction of the respective transcription factors. Analysis of 164 ChIP-seq datasets for 57 mammalian transcription factors showed that the abundance of predicted composite elements with an overlap of motifs, compared to those with a spacer, more than doubled; and they had a 1.5-fold increase of asymmetrical pairs of motifs with one more conservative 'leading' motif and another one 'guided'.
Affiliation(s)
- Victor Levitsky
- Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia.,Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia
| | - Elena Zemlyanskaya
- Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia.,Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia
| | - Dmitry Oshchepkov
- Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia
| | - Olga Podkolodnaya
- Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia
| | - Elena Ignatieva
- Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia.,Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia
| | - Ivo Grosse
- Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia.,Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany.,German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Leipzig, Germany
| | - Victoria Mironova
- Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia.,Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia
| | - Tatyana Merkulova
- Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia.,Department of Molecular Genetics, Institute of Cytology and Genetics, Novosibirsk 630090, Russia
|
37
|
Daou R, Beißbarth T, Wingender E, Gültas M, Haubrock M. Constructing temporal regulatory cascades in the context of development and cell differentiation. PLoS One 2020; 15:e0231326. [PMID: 32275727 PMCID: PMC7147753 DOI: 10.1371/journal.pone.0231326] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/09/2019] [Accepted: 03/20/2020] [Indexed: 12/02/2022]
Abstract
Cell differentiation is a complex process orchestrated by sets of regulators precisely appearing at certain time points, resulting in regulatory cascades that affect the expression of broader sets of genes, ending up in the formation of different tissues and organ parts. The identification of stage-specific master regulators and the mechanism by which they activate each other is a key to understanding and controlling differentiation, particularly in the fields of tissue regeneration and organoid engineering. Here we present a workflow that combines a comprehensive general regulatory network based on binding site predictions with user-provided temporal gene expression data, to generate a temporally connected series of stage-specific regulatory networks, which we call a temporal regulatory cascade (TRC). A TRC identifies those regulators that are unique for each time point, resulting in a cascade that shows the emergence of these regulators and regulatory interactions across time. The model was implemented in the form of a user-friendly, visual web-tool that requires no expert knowledge in programming or statistics, making it directly usable for life scientists. In addition to generating TRCs, the tool links multiple interactive visual workflows, in which a user can track and investigate further different regulators, target genes, and interactions, directing the tool along the way into biologically sensible results based on the given dataset. We applied the TRC model to two different expression datasets, one based on experiments conducted on human induced pluripotent stem cells (hiPSCs) undergoing differentiation into mature cardiomyocytes and the other based on the differentiation of H1-derived human neuronal precursor cells. The model was successful in identifying previously known and new potential key regulators, in addition to the particular time points with which these regulators are associated, in cardiac and neural development.
Affiliation(s)
- Rayan Daou
- Department of Medical Bioinformatics, University Medical Center Göttingen, Goettingen, Niedersachsen, Germany
| | - Tim Beißbarth
- Department of Medical Bioinformatics, University Medical Center Göttingen, Goettingen, Niedersachsen, Germany
| | - Edgar Wingender
- Department of Medical Bioinformatics, University Medical Center Göttingen, Goettingen, Niedersachsen, Germany
| | - Mehmet Gültas
- Breeding Informatics Group, Department of Animal Science, Georg-August University, Goettingen, Niedersachsen, Germany
- Center for Integrated Breeding Research (CiBreed), Georg-August University, Goettingen, Niedersachsen, Germany
| | - Martin Haubrock
- Department of Medical Bioinformatics, University Medical Center Göttingen, Goettingen, Niedersachsen, Germany
|
38
|
Baumgarten N, Schmidt F, Schulz MH. Improved linking of motifs to their TFs using domain information. Bioinformatics 2020; 36:1655-1662. [PMID: 31742324 PMCID: PMC7703792 DOI: 10.1093/bioinformatics/btz855] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/01/2019] [Revised: 11/08/2019] [Accepted: 11/16/2019] [Indexed: 11/23/2022]
Abstract
Motivation: A central aim of molecular biology is to identify mechanisms of transcriptional regulation. Transcription factors (TFs), which are DNA-binding proteins, are highly involved in these processes, so it is crucial to know where TFs interact with DNA and to be aware of the TFs’ DNA-binding motifs. For that reason, computational tools exist that link DNA-binding motifs to TFs either without sequence information or based on TF-associated sequences, e.g. identified via a chromatin immunoprecipitation followed by sequencing (ChIP-seq) experiment. In this paper, we present MASSIF, a novel method to improve the performance of existing tools that link motifs to TFs relying on TF-associated sequences. MASSIF is based on the idea that a DNA-binding motif, which is correctly linked to a TF, should be assigned to a DNA-binding domain (DBD) similar to that of the mapped TF. Because DNA-binding motifs are in general not linked to DBDs, it is not possible to compare the DBD of a TF and the motif directly. Instead we created a DBD collection, which consists of TFs with a known DBD and an associated motif. This collection enables us to evaluate how likely it is that a linked motif and a TF of interest are associated to the same DBD. We named this similarity measure domain score, and represent it as a P-value. We developed two different ways to improve the performance of existing tools that link motifs to TFs based on TF-associated sequences: (i) using meta-analysis to combine P-values from one or several of these tools with the P-value of the domain score and (ii) filtering unlikely motifs based on the domain score.
Results: We demonstrate the functionality of MASSIF on several human ChIP-seq datasets, using either motifs from the HOCOMOCO database or de novo identified ones as input motifs. In addition, we show that both variants of our method significantly improve the performance of tools that link motifs to TFs based on TF-associated sequences, independent of the considered DBD type.
Availability and implementation: MASSIF is freely available online at https://github.com/SchulzLab/MASSIF.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Nina Baumgarten
- Institute for Cardiovascular Regeneration, Goethe University, Frankfurt am Main 60590, Germany.,German Center for Cardiovascular Regeneration, Partner Site Rhein-Main, Frankfurt am Main 60590, Germany
| | - Florian Schmidt
- High-throughput Genomics & Systems Biology, Cluster of Excellence MMCI, Saarland University.,Research Group Computational Biology, Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken 66123, Germany
| | - Marcel H Schulz
- Institute for Cardiovascular Regeneration, Goethe University, Frankfurt am Main 60590, Germany.,German Center for Cardiovascular Regeneration, Partner Site Rhein-Main, Frankfurt am Main 60590, Germany.,High-throughput Genomics & Systems Biology, Cluster of Excellence MMCI, Saarland University.,Research Group Computational Biology, Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken 66123, Germany
|
39
|
Ochsner SA, Abraham D, Martin K, Ding W, McOwiti A, Kankanamge W, Wang Z, Andreano K, Hamilton RA, Chen Y, Hamilton A, Gantner ML, Dehart M, Qu S, Hilsenbeck SG, Becnel LB, Bridges D, Ma'ayan A, Huss JM, Stossi F, Foulds CE, Kralli A, McDonnell DP, McKenna NJ. The Signaling Pathways Project, an integrated 'omics knowledgebase for mammalian cellular signaling pathways. Sci Data 2019; 6:252. [PMID: 31672983 PMCID: PMC6823428 DOI: 10.1038/s41597-019-0193-4] [Citation(s) in RCA: 70] [Impact Index Per Article: 14.0] [Received: 02/22/2019] [Accepted: 09/11/2019] [Indexed: 12/28/2022]
Abstract
Mining of integrated public transcriptomic and ChIP-Seq (cistromic) datasets can illuminate functions of mammalian cellular signaling pathways not yet explored in the research literature. Here, we designed a web knowledgebase, the Signaling Pathways Project (SPP), which incorporates community classifications of signaling pathway nodes (receptors, enzymes, transcription factors and co-nodes) and their cognate bioactive small molecules. We then mapped over 10,000 public transcriptomic or cistromic experiments to their pathway node or biosample of study. To enable prediction of pathway node-gene target transcriptional regulatory relationships through SPP, we generated consensus 'omics signatures, or consensomes, which ranked genes based on measures of their significant differential expression or promoter occupancy across transcriptomic or cistromic experiments mapped to a specific node family. Consensomes were validated using alignment with canonical literature knowledge, gene target-level integration of transcriptomic and cistromic data points, and in bench experiments confirming previously uncharacterized node-gene target regulatory relationships. To expose the SPP knowledgebase to researchers, a web browser interface was designed that accommodates numerous routine data mining strategies. SPP is freely accessible at https://www.signalingpathways.org.
Affiliation(s)
- Scott A Ochsner
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - David Abraham
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Kirt Martin
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Wei Ding
- Duncan NCI Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Apollo McOwiti
- Duncan NCI Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Wasula Kankanamge
- Duncan NCI Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Zichen Wang
- Icahn School of Medicine, Mount Sinai University, New York, NY, 10029, USA
| | - Kaitlyn Andreano
- Department of Pharmacology and Cancer Biology, Duke University School of Medicine, Durham, NC, 27710, USA
| | - Ross A Hamilton
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Yue Chen
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Angelica Hamilton
- Diabetes & Metabolism Research Institute, City of Hope, Duarte, CA, 91010, USA
| | - Marin L Gantner
- Department of Chemical Physiology, Scripps Research Institute, La Jolla, CA, 92037, USA
| | - Michael Dehart
- Duncan NCI Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Shijing Qu
- Duncan NCI Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Susan G Hilsenbeck
- Duncan NCI Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Lauren B Becnel
- Duncan NCI Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Dave Bridges
- University of Michigan School of Public Health, Ann Arbor, MI, 48109, USA
| | - Avi Ma'ayan
- Icahn School of Medicine, Mount Sinai University, New York, NY, 10029, USA
| | - Janice M Huss
- Diabetes & Metabolism Research Institute, City of Hope, Duarte, CA, 91010, USA
| | - Fabio Stossi
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Charles E Foulds
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Anastasia Kralli
- Department of Chemical Physiology, Scripps Research Institute, La Jolla, CA, 92037, USA
| | - Donald P McDonnell
- Department of Pharmacology and Cancer Biology, Duke University School of Medicine, Durham, NC, 27710, USA
| | - Neil J McKenna
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, 77030, USA.
| |
Collapse
|
40
|
Kulakovskiy IV, Vorontsov IE, Yevshin IS, Sharipov RN, Fedorova AD, Rumynskiy EI, Medvedeva YA, Magana-Mora A, Bajic VB, Papatsenko DA, Kolpakov FA, Makeev VJ. HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-Seq analysis. Nucleic Acids Res 2018; 46:D252-D259. [PMID: 29140464 PMCID: PMC5753240 DOI: 10.1093/nar/gkx1106] [Citation(s) in RCA: 473] [Impact Index Per Article: 94.6] [Received: 09/14/2017] [Accepted: 10/31/2017] [Indexed: 12/15/2022]
Abstract
We present a major update of the HOCOMOCO collection that consists of patterns describing DNA binding specificities for human and mouse transcription factors. In this release, we profited from a nearly doubled volume of published in vivo experiments on transcription factor (TF) binding to expand the repertoire of binding models, replace low-quality models previously based on in vitro data only and cover more than a hundred TFs with previously unknown binding specificities. This was achieved by systematic motif discovery from more than five thousand ChIP-Seq experiments uniformly processed within the BioUML framework with several ChIP-Seq peak calling tools and aggregated in the GTRD database. HOCOMOCO v11 contains binding models for 453 mouse and 680 human transcription factors and includes 1302 mononucleotide and 576 dinucleotide position weight matrices, which describe primary binding preferences of each transcription factor and reliable alternative binding specificities. An interactive interface and bulk downloads are available on the web: http://hocomoco.autosome.ru and http://www.cbrc.kaust.edu.sa/hocomoco11. In this release, we complement HOCOMOCO by MoLoTool (Motif Location Toolbox, http://molotool.autosome.ru) that applies HOCOMOCO models for visualization of binding sites in short DNA sequences.
Affiliation(s)
- Ivan V Kulakovskiy
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991, GSP-1, Vavilova 32, Moscow, Russia; Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia; Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, 143026 Moscow, Russia
- Ilya E Vorontsov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia
- Ivan S Yevshin
- BIOSOFT.RU Ltd, 630058, Russkaya 41/1, Novosibirsk, Russia
- Ruslan N Sharipov
- BIOSOFT.RU Ltd, 630058, Russkaya 41/1, Novosibirsk, Russia; Institute of Computational Technologies, Siberian Branch of the Russian Academy of Sciences, 630090, Akad. Rzhanova 6, Novosibirsk, Russia; Novosibirsk State University, 630090, Pirogova 2, Novosibirsk, Russia
- Alla D Fedorova
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 119234, Leninskiye Gory 1-73, Moscow, Russia
- Eugene I Rumynskiy
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia; Moscow Institute of Physics and Technology (State University), 141700, 9 Institutskiy per, Dolgoprudny, Russia
- Yulia A Medvedeva
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia; Moscow Institute of Physics and Technology (State University), 141700, 9 Institutskiy per, Dolgoprudny, Russia; Institute of Bioengineering, Research Center of Biotechnology of the Russian Academy of Sciences, 119071, 2 Leninsky Ave. 33, Moscow, Russia
- Arturo Magana-Mora
- National Institute of Advanced Industrial Science and Technology (AIST), Com. Bio Big-Data Open Innovation Lab. (CBBD-OIL), AIST Tokyo Waterfront Main Bldg. #323, 2-3-26 Aomi, Tokyo 135-0064, Japan; King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Saudi Arabia
- Vladimir B Bajic
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Saudi Arabia
- Dmitry A Papatsenko
- Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, 143026 Moscow, Russia
- Fedor A Kolpakov
- BIOSOFT.RU Ltd, 630058, Russkaya 41/1, Novosibirsk, Russia; Institute of Computational Technologies, Siberian Branch of the Russian Academy of Sciences, 630090, Akad. Rzhanova 6, Novosibirsk, Russia
- Vsevolod J Makeev
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991, GSP-1, Vavilova 32, Moscow, Russia; Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia; Moscow Institute of Physics and Technology (State University), 141700, 9 Institutskiy per, Dolgoprudny, Russia
41
Garcia-Alonso L, Holland CH, Ibrahim MM, Turei D, Saez-Rodriguez J. Benchmark and integration of resources for the estimation of human transcription factor activities. Genome Res 2019; 29:1363-1375. [PMID: 31340985 PMCID: PMC6673718 DOI: 10.1101/gr.240663.118] [Citation(s) in RCA: 411] [Impact Index Per Article: 82.2] [Received: 06/18/2018] [Accepted: 05/28/2019] [Indexed: 12/25/2022]
Abstract
The prediction of transcription factor (TF) activities from the gene expression of their targets (i.e., TF regulon) is becoming a widely used approach to characterize the functional status of transcriptional regulatory circuits. Several strategies and data sets have been proposed to link the target genes likely regulated by a TF, each one providing a different level of evidence. The most established ones are (1) manually curated repositories, (2) interactions derived from ChIP-seq binding data, (3) in silico prediction of TF binding on gene promoters, and (4) reverse-engineered regulons from large gene expression data sets. However, it is not known how these different sources of regulons affect the TF activity estimations and, thereby, downstream analysis and interpretation. Here we compared the accuracy and biases of these strategies to define human TF regulons by means of their ability to predict changes in TF activities in three reference benchmark data sets. We assembled a collection of TF-target interactions for 1541 human TFs and evaluated how different molecular and regulatory properties of the TFs, such as the DNA-binding domain, specificities, or mode of interaction with the chromatin, affect the predictions of TF activity. We assessed their coverage and found little overlap on the regulons derived from each strategy and better performance by literature-curated information followed by ChIP-seq data. We provide an integrated resource of all TF-target interactions derived through these strategies, with confidence scores, as a resource for enhanced prediction of TF activities.
Affiliation(s)
- Luz Garcia-Alonso
- European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, CB10 1SD Cambridge, United Kingdom
- Open Targets, Wellcome Genome Campus, CB10 1SD Cambridge, United Kingdom
- Christian H Holland
- Joint Research Centre for Computational Biomedicine (JRC-COMBINE), RWTH Aachen University, Faculty of Medicine, 52074 Aachen, Germany
- Institute of Computational Biomedicine, Heidelberg University, Faculty of Medicine, 69120 Heidelberg, Germany
- Mahmoud M Ibrahim
- Joint Research Centre for Computational Biomedicine (JRC-COMBINE), RWTH Aachen University, Faculty of Medicine, 52074 Aachen, Germany
- Department of Nephrology, RWTH Aachen University, Faculty of Medicine, 52074 Aachen, Germany
- Denes Turei
- Joint Research Centre for Computational Biomedicine (JRC-COMBINE), RWTH Aachen University, Faculty of Medicine, 52074 Aachen, Germany
- Institute of Computational Biomedicine, Heidelberg University, Faculty of Medicine, 69120 Heidelberg, Germany
- Julio Saez-Rodriguez
- European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, CB10 1SD Cambridge, United Kingdom
- Open Targets, Wellcome Genome Campus, CB10 1SD Cambridge, United Kingdom
- Joint Research Centre for Computational Biomedicine (JRC-COMBINE), RWTH Aachen University, Faculty of Medicine, 52074 Aachen, Germany
- Institute of Computational Biomedicine, Heidelberg University, Faculty of Medicine, 69120 Heidelberg, Germany
42
Wingender E, Schoeps T, Haubrock M, Krull M, Dönitz J. TFClass: expanding the classification of human transcription factors to their mammalian orthologs. Nucleic Acids Res 2018; 46:D343-D347. [PMID: 29087517 PMCID: PMC5753292 DOI: 10.1093/nar/gkx987] [Citation(s) in RCA: 80] [Impact Index Per Article: 16.0] [Received: 09/15/2017] [Accepted: 10/12/2017] [Indexed: 02/03/2023]
Abstract
TFClass is a resource that classifies eukaryotic transcription factors (TFs) according to their DNA-binding domains (DBDs), available online at http://tfclass.bioinf.med.uni-goettingen.de. The classification scheme of TFClass was originally derived for human TFs and is expanded here to the whole taxonomic class of mammalia. Combining information from different resources, manually checking the retrieved mammalian TF sequences and applying extensive phylogenetic analyses, >39 000 TFs from up to 41 mammalian species were assigned to the Superclasses, Classes, Families and Subfamilies of TFClass. As a result, TFClass now provides the corresponding sequence collection in FASTA format, sequence logos and phylogenetic trees at different classification levels, predicted TF binding sites for human, mouse, dog and cow genomes as well as links to several external databases. In particular, all those TFs that are also documented in the TRANSFAC® database (FACTOR table) have been linked and can be freely accessed. TRANSFAC® FACTOR can also be queried through its own search interface.
Affiliation(s)
- Edgar Wingender
- Institute of Bioinformatics, University Medical Center Göttingen, Georg August University, D-37077 Göttingen, Germany; geneXplain GmbH, D-38302 Wolfenbüttel, Germany
- Torsten Schoeps
- Institute of Bioinformatics, University Medical Center Göttingen, Georg August University, D-37077 Göttingen, Germany
- Martin Haubrock
- Institute of Bioinformatics, University Medical Center Göttingen, Georg August University, D-37077 Göttingen, Germany
- Jürgen Dönitz
- Institute of Bioinformatics, University Medical Center Göttingen, Georg August University, D-37077 Göttingen, Germany; Dpt. of Evolutionary Developmental Genetics, Johann-Friedrich-Blumenbach Institute of Zoology and Anthropology, Georg August University, D-37077 Göttingen, Germany
43
Homotypic cooperativity and collective binding are determinants of bHLH specificity and function. Proc Natl Acad Sci U S A 2019; 116:16143-16152. [PMID: 31341088 DOI: 10.1073/pnas.1818015116] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 12/26/2022]
Abstract
Eukaryotic cells express transcription factor (TF) paralogues that bind to nearly identical DNA sequences in vitro but bind at different genomic loci and perform different functions in vivo. Predicting how 2 paralogous TFs bind in vivo using DNA sequence alone is an important open problem. Here, we analyzed 2 yeast bHLH TFs, Cbf1p and Tye7p, which have highly similar binding preferences in vitro, yet bind at almost completely nonoverlapping target loci in vivo. We dissected the determinants of specificity for these 2 proteins by making a number of chimeric TFs in which we swapped different domains of Cbf1p and Tye7p and determined the effects on in vivo binding and cellular function. From these experiments, we learned that the Cbf1p dimer achieves its specificity by binding cooperatively with other Cbf1p dimers bound nearby. In contrast, we found that Tye7p achieves its specificity by binding cooperatively with 3 other DNA-binding proteins, Gcr1p, Gcr2p, and Rap1p. Remarkably, most promoters (63%) that are bound by Tye7p do not contain a consensus Tye7p binding site. Using this information, we were able to build simple models to accurately discriminate bound and unbound genomic loci for both Cbf1p and Tye7p. We then successfully reprogrammed the human bHLH NPAS2 to bind Cbf1p in vivo targets and a Tye7p target intergenic region to be bound by Cbf1p. These results demonstrate that the genome-wide binding targets of paralogous TFs can be discriminated using sequence information, and provide lessons about TF specificity that can be applied across the phylogenetic tree.
44
Pearl JR, Colantuoni C, Bergey DE, Funk CC, Shannon P, Basu B, Casella AM, Oshone RT, Hood L, Price ND, Ament SA. Genome-Scale Transcriptional Regulatory Network Models of Psychiatric and Neurodegenerative Disorders. Cell Syst 2019; 8:122-135.e7. [DOI: 10.1016/j.cels.2019.01.002] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 02/12/2018] [Revised: 10/19/2018] [Accepted: 01/14/2019] [Indexed: 12/23/2022]
45
Raccaud M, Friman ET, Alber AB, Agarwal H, Deluz C, Kuhn T, Gebhardt JCM, Suter DM. Mitotic chromosome binding predicts transcription factor properties in interphase. Nat Commun 2019; 10:487. [PMID: 30700703 PMCID: PMC6353955 DOI: 10.1038/s41467-019-08417-5] [Citation(s) in RCA: 53] [Impact Index Per Article: 10.6] [Received: 09/18/2018] [Accepted: 01/08/2019] [Indexed: 12/31/2022]
Abstract
Mammalian transcription factors (TFs) differ broadly in their nuclear mobility and sequence-specific/non-specific DNA binding. How these properties affect their ability to occupy specific genomic sites and modify the epigenetic landscape is unclear. The association of TFs with mitotic chromosomes observed by fluorescence microscopy is largely mediated by non-specific DNA interactions and differs broadly between TFs. Here we combine quantitative measurements of mitotic chromosome binding (MCB) of 501 TFs, TF mobility measurements by fluorescence recovery after photobleaching, single molecule imaging of DNA binding, and mapping of TF binding and chromatin accessibility. TFs associating to mitotic chromosomes are enriched in DNA-rich compartments in interphase and display slower mobility in interphase and mitosis. Remarkably, MCB correlates with relative TF on-rates and genome-wide specific site occupancy, but not with TF residence times. This suggests that non-specific DNA binding properties of TFs regulate their search efficiency and occupancy of specific genomic sites.
Affiliation(s)
- Mahé Raccaud
- Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015, Lausanne, Switzerland
- Elias T Friman
- Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015, Lausanne, Switzerland
- Andrea B Alber
- Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015, Lausanne, Switzerland
- Harsha Agarwal
- Institute of Biophysics, Ulm University, Albert-Einstein-Allee 11, 89081, Ulm, Germany
- Cédric Deluz
- Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015, Lausanne, Switzerland
- Timo Kuhn
- Institute of Biophysics, Ulm University, Albert-Einstein-Allee 11, 89081, Ulm, Germany
- J Christof M Gebhardt
- Institute of Biophysics, Ulm University, Albert-Einstein-Allee 11, 89081, Ulm, Germany
- David M Suter
- Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015, Lausanne, Switzerland.
46
Samee MAH, Bruneau BG, Pollard KS. A De Novo Shape Motif Discovery Algorithm Reveals Preferences of Transcription Factors for DNA Shape Beyond Sequence Motifs. Cell Syst 2019; 8:27-42.e6. [PMID: 30660610 PMCID: PMC6368855 DOI: 10.1016/j.cels.2018.12.001] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 03/09/2018] [Revised: 08/18/2018] [Accepted: 12/03/2018] [Indexed: 12/17/2022]
Abstract
DNA shape adds specificity to sequence motifs but has not been explored systematically outside this context. We hypothesized that DNA-binding proteins (DBPs) preferentially occupy DNA with specific structures ("shape motifs") regardless of whether or not these correspond to high information content sequence motifs. We present ShapeMF, a Gibbs sampling algorithm that identifies de novo shape motifs. Using binding data from hundreds of in vivo and in vitro experiments, we show that most DBPs have shape motifs and can occupy these in the absence of sequence motifs. This "shape-only binding" is common for many DBPs and in regions co-bound by multiple DBPs. When shape and sequence motifs co-occur, they can be overlapping, flanking, or separated by consistent spacing. Finally, DBPs within the same protein family have different shape motifs, explaining their distinct genome-wide occupancy despite having similar sequence motifs. These results suggest that shape motifs not only complement sequence motifs but also facilitate recognition of DNA beyond conventionally defined sequence motifs.
Affiliation(s)
- Benoit G Bruneau
- Gladstone Institutes, San Francisco, CA 94158, USA; Department of Pediatrics and Cardiovascular Research Institute, University of California, San Francisco, San Francisco, CA 94158, USA
- Katherine S Pollard
- Gladstone Institutes, San Francisco, CA 94158, USA; Department of Epidemiology & Biostatistics, Institute for Human Genetics, Quantitative Biology Institute, and Institute for Computational Health Sciences, University of California, San Francisco, San Francisco, CA 94158, USA; Chan-Zuckerberg Biohub, San Francisco, CA 94158, USA.
47
Swindell WR, Bojanowski K, Kindy MS, Chau RMW, Ko D. GM604 regulates developmental neurogenesis pathways and the expression of genes associated with amyotrophic lateral sclerosis. Transl Neurodegener 2018; 7:30. [PMID: 30524706 PMCID: PMC6276193 DOI: 10.1186/s40035-018-0135-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 07/02/2018] [Accepted: 10/21/2018] [Indexed: 12/11/2022]
Abstract
Background: Amyotrophic lateral sclerosis (ALS) is currently an incurable disease without highly effective pharmacological treatments. The peptide drug GM604 (GM6 or Alirinetide) was developed as a candidate ALS therapy, which has demonstrated safety and good drug-like properties with a favorable pharmacokinetic profile. GM6 is hypothesized to bolster neuron survival through the multi-target regulation of developmental pathways, but mechanisms of action are not fully understood. Methods: This study used RNA-seq to evaluate transcriptome responses in SH-SY5Y neuroblastoma cells following GM6 treatment (6, 24 and 48 h). Results: We identified 2867 protein-coding genes with expression significantly altered by GM6 (FDR < 0.10). Early (6 h) responses included up-regulation of Notch and hedgehog signaling components, with increased expression of developmental genes mediating neurogenesis and axon growth. Prolonged GM6 treatment (24 and 48 h) altered the expression of genes contributing to cell adhesion and the extracellular matrix. GM6 further down-regulated the expression of genes associated with mitochondria, inflammatory responses, mRNA processing and chromatin organization. GM6-increased genes were located near GC-rich motifs interacting with C2H2 zinc finger transcription factors, whereas GM6-decreased genes were located near AT-rich motifs associated with helix-turn-helix homeodomain factors. Such motifs interacted with a diverse network of transcription factors encoded by GM6-regulated genes (STAT3, HOXD11, HES7, GLI1). We identified 77 ALS-associated genes with expression significantly altered by GM6 treatment (FDR < 0.10), which were known to function in neurogenesis, axon guidance and the intrinsic apoptosis pathway. Conclusions: Our findings support the hypothesis that GM6 acts through developmental-stage pathways to influence neuron survival. Gene expression responses were consistent with neurotrophic effects, ECM modulation, and activation of the Notch and hedgehog neurodevelopmental pathways. This multifaceted mechanism of action is unique among existing ALS drug candidates and may be applicable to multiple neurodegenerative diseases.
Affiliation(s)
- William R Swindell
- Heritage College of Osteopathic Medicine, Ohio University, Athens, OH USA
- Mark S Kindy
- Department of Pharmaceutical Sciences, College of Pharmacy, University of South Florida, Tampa, FL USA; James A. Haley VAMC, Tampa, FL USA
- Dorothy Ko
- Genervon Biopharmaceuticals LLC, Pasadena, CA USA
48
Guo Z, Qin J, Zhou X, Zhang Y. Insect Transcription Factors: A Landscape of Their Structures and Biological Functions in Drosophila and beyond. Int J Mol Sci 2018; 19:3691. [PMID: 30469390 PMCID: PMC6274879 DOI: 10.3390/ijms19113691] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 10/23/2018] [Revised: 11/16/2018] [Accepted: 11/16/2018] [Indexed: 12/17/2022]
Abstract
Transcription factors (TFs) play essential roles in the transcriptional regulation of functional genes, and are involved in diverse physiological processes in living organisms. The fruit fly Drosophila melanogaster, a simple and easily manipulated organismal model, has been extensively applied to study the biological functions of TFs and their related transcriptional regulation mechanisms. It is noteworthy that with the development of genetic tools such as CRISPR/Cas9 and next-generation genome sequencing techniques in recent years, identification and dissection of the complex genetic regulatory networks of TFs have also made great progress in other insects beyond Drosophila. However, there is no comprehensive review that systematically summarizes the structures and biological functions of TFs in both model and non-model insects. Here, we spend extensive effort in collecting vast related studies, and attempt to provide an impartial overview of the progress of the structure and biological functions of currently documented TFs in insects, as well as the classical and emerging research methods for studying their regulatory functions. Consequently, considering the importance of versatile TFs in orchestrating diverse insect physiological processes, this review will assist a growing number of entomologists to interrogate this understudied field, and to propel the progress of their contributions to pest control and even human health.
Affiliation(s)
- Zhaojiang Guo
- Department of Plant Protection, Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing 100081, China.
- Jianying Qin
- Department of Plant Protection, Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing 100081, China.
- Longping Branch, Graduate School of Hunan University, Changsha 410125, China.
- Xiaomao Zhou
- Longping Branch, Graduate School of Hunan University, Changsha 410125, China.
- Youjun Zhang
- Department of Plant Protection, Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing 100081, China.
49
Gysi DM, Voigt A, Fragoso TDM, Almaas E, Nowick K. wTO: an R package for computing weighted topological overlap and a consensus network with integrated visualization tool. BMC Bioinformatics 2018; 19:392. [PMID: 30355288 PMCID: PMC6201546 DOI: 10.1186/s12859-018-2351-7] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 11/23/2017] [Accepted: 08/30/2018] [Indexed: 12/17/2022]
Abstract
BACKGROUND: Network analyses, such as of gene co-expression networks, metabolic networks and ecological networks, have become a central approach for the systems-level study of biological data. Several software packages exist for generating and analyzing such networks, either from correlation scores or the absolute value of a transformed score called weighted topological overlap (wTO). However, since gene regulatory processes can up- or down-regulate genes, it is of great interest to explicitly consider both positive and negative correlations when constructing a gene co-expression network. RESULTS: Here, we present an R package for calculating the weighted topological overlap (wTO) that, in contrast to existing packages, explicitly addresses the sign of the wTO values, and is thus especially valuable for the analysis of gene regulatory networks. The package includes the calculation of p-values (raw and adjusted) for each pairwise gene score. Our package also allows the calculation of networks from time series (without replicates). Since networks from independent datasets (biological repeats or related studies) are not the same due to technical and biological noise in the data, we additionally incorporated a novel method for calculating a consensus network (CN) from two or more networks into our R package. To graphically inspect the resulting networks, the R package contains a visualization tool, which allows for the direct network manipulation and access of node and link information. When testing the package on a standard laptop computer, we can conduct all calculations for systems of more than 20,000 genes in under two hours. We compare our new wTO package to state-of-the-art packages and demonstrate the application of the wTO and CN functions using 3 independently derived datasets from healthy human pre-frontal cortex samples. To showcase an example for the time series application we utilized a metagenomics data set. CONCLUSION: In this work, we developed a software package that allows the computation of wTO networks, CNs and a visualization tool in the R statistical environment. It is publicly available on CRAN repositories under the GPL-2 Open Source License (https://cran.r-project.org/web/packages/wTO/).
Affiliation(s)
- Deisy Morselli Gysi
- Department of Computer Science, Interdisciplinary Center of Bioinformatics, University of Leipzig, Haertelstrasse 16-18, Leipzig, 04109 Germany
- Swarm Intelligence and Complex Systems Group, Faculty of Mathematics and Computer Science, University of Leipzig, Augustusplatz 10, Leipzig, 04109 Germany
- Andre Voigt
- Department of Biotechnology, NTNU - Norwegian University of Science and Technology, Trondheim, N-7049 Norway
- Eivind Almaas
- Department of Biotechnology, NTNU - Norwegian University of Science and Technology, Trondheim, N-7049 Norway
- K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health, NTNU - Norwegian University of Science and Technology, Trondheim, N-7049 Norway
- Katja Nowick
- Freie Universität Berlin, Human Biology Group, Institute for Zoology, Department of Biology, Chemistry and Pharmacy, Königin-Luise-Straße 1-3, Berlin, D-14195 Germany
50
Zuo H, Yang L, Zheng J, Su Z, Weng S, He J, Xu X. A single C4 Zinc finger-containing protein from Litopenaeus vannamei involved in antibacterial responses. Fish Shellfish Immunol 2018; 81:493-501. [PMID: 30064017 DOI: 10.1016/j.fsi.2018.07.053] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 04/10/2018] [Revised: 07/18/2018] [Accepted: 07/27/2018] [Indexed: 06/08/2023]
Abstract
Zinc finger domains (ZnFs), which contain finger-like protrusions stabilized by zinc ions and function to bind DNA, RNA, protein and lipid substrates, are ubiquitously present in a large number of proteins. In this study, a novel protein containing a single C4-type ZnF domain (SZnF) was identified from Pacific white shrimp, Litopenaeus vannamei, and its role in immunity was further investigated. The ZnF domain of SZnF, but not other regions, shared high homology with those of fushi tarazu-factor 1 (FTZ-F1) proteins. The SZnF protein was mainly localized in the cytoplasm and was also present in the nucleus at a low level. SZnF was highly expressed in the scape and muscle tissues of healthy shrimp and its expression in gill and hepatopancreas was strongly up-regulated during bacterial infection. Silencing of SZnF in vivo could strongly increase the susceptibility of shrimp to infection with Vibrio parahaemolyticus but not white spot syndrome virus (WSSV), suggesting that SZnF could be mainly involved in antibacterial responses. Both dual luciferase reporter assays and real-time PCR analysis demonstrated that SZnF could positively regulate the expression of various antimicrobial peptides in vitro and in vivo, which could be part of the mechanism underlying its antibacterial effects. In summary, the current study could help learn more about the function of ZnF-containing proteins and the regulatory mechanisms of immune responses against pathogen infection in crustaceans.
Affiliation(s)
- Hongliang Zuo
- MOE Key Laboratory of Aquatic Product Safety/State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, PR China; Institute of Aquatic Economic Animals and Guangdong Province Key Laboratory for Aquatic Economic Animals, Sun Yat-sen University, Guangzhou, PR China
- Linwei Yang
- MOE Key Laboratory of Aquatic Product Safety/State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, PR China; Institute of Aquatic Economic Animals and Guangdong Province Key Laboratory for Aquatic Economic Animals, Sun Yat-sen University, Guangzhou, PR China
- Jiefu Zheng
- MOE Key Laboratory of Aquatic Product Safety/State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, PR China; Institute of Aquatic Economic Animals and Guangdong Province Key Laboratory for Aquatic Economic Animals, Sun Yat-sen University, Guangzhou, PR China
- Ziqi Su
- MOE Key Laboratory of Aquatic Product Safety/State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, PR China; Institute of Aquatic Economic Animals and Guangdong Province Key Laboratory for Aquatic Economic Animals, Sun Yat-sen University, Guangzhou, PR China
- Shaoping Weng
- MOE Key Laboratory of Aquatic Product Safety/State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, PR China; Institute of Aquatic Economic Animals and Guangdong Province Key Laboratory for Aquatic Economic Animals, Sun Yat-sen University, Guangzhou, PR China
- Jianguo He
- MOE Key Laboratory of Aquatic Product Safety/State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, PR China; Institute of Aquatic Economic Animals and Guangdong Province Key Laboratory for Aquatic Economic Animals, Sun Yat-sen University, Guangzhou, PR China
- Xiaopeng Xu
- MOE Key Laboratory of Aquatic Product Safety/State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, PR China; Institute of Aquatic Economic Animals and Guangdong Province Key Laboratory for Aquatic Economic Animals, Sun Yat-sen University, Guangzhou, PR China