1. Bhattacharjya A, Islam MM, Uddin MA, Talukder MA, Azad AKM, Aryal S, Paul BK, Tasnim W, Almoyad MAA, Moni MA. Exploring gene regulatory interaction networks and predicting therapeutic molecules for hypopharyngeal cancer and EGFR-mutated lung adenocarcinoma. FEBS Open Bio 2024; 14:1166-1191. [PMID: 38783639 PMCID: PMC11216941 DOI: 10.1002/2211-5463.13807] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2023] [Revised: 01/30/2024] [Accepted: 04/16/2024] [Indexed: 05/25/2024] [Open access]
Abstract
Hypopharyngeal cancer is a disease that is associated with EGFR-mutated lung adenocarcinoma. Here we utilized a bioinformatics approach to identify genetic commonalities between these two diseases. To this end, we examined microarray datasets from GEO (Gene Expression Omnibus) to identify differentially expressed genes, common genes, and hub genes between the selected two diseases. Our analyses identified potential therapeutic molecules for the selected diseases based on 10 hub genes with the highest interactions according to the degree topology method and the maximum clique centrality (MCC). These therapeutic molecules may have the potential for simultaneous treatment of these diseases.
Affiliation(s)
- Abanti Bhattacharjya
  - Department of Computer Science and Engineering, Jagannath University, Dhaka, Bangladesh
- Md Manowarul Islam
  - Department of Computer Science and Engineering, Jagannath University, Dhaka, Bangladesh
- Md Ashraf Uddin
  - School of Information Technology, Deakin University, Geelong, Australia
- Md Alamin Talukder
  - Department of Computer Science and Engineering, International University of Business Agriculture and Technology, Dhaka, Bangladesh
- AKM Azad
  - Department of Mathematics and Statistics, Faculty of Science, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, Saudi Arabia
- Sunil Aryal
  - School of Information Technology, Deakin University, Geelong, Australia
- Bikash Kumar Paul
  - Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Tangail, Bangladesh
  - Department of Software Engineering, Daffodil International University, Dhaka, Bangladesh
- Wahia Tasnim
  - Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Tangail, Bangladesh
- Mohammad Ali Moni
  - Artificial Intelligence & Data Science, Faculty of Health and Behavioural Sciences, The University of Queensland, Brisbane, Australia
  - AI & Digital Health Technology, Artificial Intelligence and Cyber Futures Institute, Charles Sturt University, Bathurst, Australia
  - Rural Health Research Institute, Charles Sturt University, Orange, Australia

2. Andreani V, South EJ, Dunlop MJ. Generating information-dense promoter sequences with optimal string packing. bioRxiv 2024:2023.11.01.565124. [PMID: 37961203 PMCID: PMC10635063 DOI: 10.1101/2023.11.01.565124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Dense arrangements of binding sites within nucleotide sequences can collectively influence downstream transcription rates or initiate biomolecular interactions. For example, natural promoter regions can harbor many overlapping transcription factor binding sites that influence the rate of transcription initiation. Despite the prevalence of overlapping binding sites in nature, rapid design of nucleotide sequences with many overlapping sites remains a challenge. Here, we show that this is an NP-hard problem, coined here as the nucleotide String Packing Problem (SPP). We then introduce a computational technique that efficiently assembles sets of DNA-protein binding sites into dense, contiguous stretches of double-stranded DNA. For the efficient design of nucleotide sequences spanning hundreds of base pairs, we reduce the SPP to an Orienteering Problem with integer distances, and then leverage modern integer linear programming solvers. Our method optimally packs libraries of 20-100 binding sites into dense nucleotide arrays of 50-300 base pairs in 0.05-10 seconds. Unlike approximation algorithms or meta-heuristics, our approach finds provably optimal solutions. We demonstrate how our method can generate large sets of diverse sequences suitable for library generation, where the frequency of binding site usage across the returned sequences can be controlled by modulating the objective function. As an example, we then show how adding additional constraints, like the inclusion of sequence elements with fixed positions, allows for the design of bacterial promoters. The nucleotide string packing approach we present can accelerate the design of sequences with complex DNA-protein interactions. When used in combination with synthesis and high-throughput screening, this design strategy could help interrogate how complex binding site arrangements impact either gene expression or biomolecular mechanisms in varied cellular contexts. 
Author Summary
The way protein binding sites are arranged on DNA can control the regulation and transcription of downstream genes. Areas with a high concentration of binding sites can enable complex interplay between transcription factors, a feature that is exploited by natural promoters. However, designing synthetic promoters that contain dense arrangements of binding sites is a challenge. The task involves overlapping many binding sites, each typically about 10 nucleotides long, within a constrained sequence area, which becomes increasingly difficult as sequence length decreases and binding site variety increases. We introduce an approach to design nucleotide sequences with optimally packed protein binding sites, which we call the nucleotide String Packing Problem (SPP). We show that the SPP can be solved efficiently using integer linear programming to identify the densest arrangements of binding sites for a specified sequence length. We show how adding additional constraints, like the inclusion of sequence elements with fixed positions, allows for the design of bacterial promoters. The presented approach enables the rapid design and study of nucleotide sequences with complex, dense binding site architectures.
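To make the packing idea in this entry concrete, the following is a minimal Python sketch, not the authors' method. It assumes single-stranded sites (no reverse complements), chains sites by suffix/prefix overlap only, and replaces the integer linear programming solver described in the abstract with an exhaustive search that is only practical for a handful of sites. The names `append_cost` and `pack` are hypothetical. The value returned by `append_cost` is one plausible reading of the "integer distances" between sites mentioned in the abstract.

```python
from itertools import permutations


def append_cost(a: str, b: str) -> int:
    """Nucleotides that must be added to place site b directly after site a,
    reusing the longest proper suffix of a that is also a prefix of b."""
    max_k = min(len(a), len(b)) - 1  # assumption: proper overlaps only
    for k in range(max_k, 0, -1):
        if a.endswith(b[:k]):
            return len(b) - k
    return len(b)  # no overlap: the whole site is appended


def pack(sites, max_len):
    """Brute-force search (illustration only, exponential time).

    Returns (number_of_sites, sequence) for a chain of distinct sites that
    fits within max_len nucleotides and uses as many sites as possible.
    """
    best = (0, "")
    for r in range(1, len(sites) + 1):
        for order in permutations(sites, r):
            seq = order[0]
            for prev, nxt in zip(order, order[1:]):
                cost = append_cost(prev, nxt)
                seq += nxt[len(nxt) - cost:]  # append only the new tail
            if len(seq) <= max_len and r > best[0]:
                best = (r, seq)
    return best


if __name__ == "__main__":
    toy_sites = ["ATGC", "GCAT", "CATT"]  # toy 4-mers, not real binding sites
    print(pack(toy_sites, 7))
```

With the toy input, three 4-nucleotide sites fit into 7 nucleotides because consecutive sites share 2 and 3 nucleotides respectively. A real design tool would also have to handle both DNA strands and scale beyond what enumeration allows, which is where the solver-based formulation in the abstract comes in.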

3. Vishnevsky OV, Bocharnikov AV, Ignatieva EV. Peak Scores Significantly Depend on the Relationships between Contextual Signals in ChIP-Seq Peaks. Int J Mol Sci 2024; 25:1011. [PMID: 38256085 PMCID: PMC10816497 DOI: 10.3390/ijms25021011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 12/13/2023] [Accepted: 01/09/2024] [Indexed: 01/24/2024] [Open access]
Abstract
Chromatin immunoprecipitation followed by massively parallel DNA sequencing (ChIP-seq) is a central genome-wide method for in vivo analyses of DNA-protein interactions in various cellular conditions. Numerous studies have demonstrated the complex contextual organization of ChIP-seq peak sequences and the presence of binding sites for transcription factors in them. We assessed the dependence of the ChIP-seq peak score on the presence of different contextual signals in the peak sequences by analyzing these sequences from several ChIP-seq experiments using our fully enumerative GPU-based de novo motif discovery method, Argo_CUDA. Analysis revealed sets of significant IUPAC motifs corresponding to the binding sites of the target and partner transcription factors. For these ChIP-seq experiments, multiple regression models were constructed, demonstrating a significant dependence of the peak scores on the presence in the peak sequences of not only highly significant target motifs but also less significant motifs corresponding to the binding sites of the partner transcription factors. A significant correlation was shown between the presence of the target motifs FOXA2 and the partner motifs HNF4G, which found experimental confirmation in the scientific literature, demonstrating the important contribution of the partner transcription factors to the binding of the target transcription factor to DNA and, consequently, their important contribution to the peak score.
Affiliation(s)
- Oleg V. Vishnevsky
  - Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
  - Department of Natural Science, Novosibirsk State University, 630090 Novosibirsk, Russia
- Andrey V. Bocharnikov
  - Department of Natural Science, Novosibirsk State University, 630090 Novosibirsk, Russia
- Elena V. Ignatieva
  - Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
  - Department of Natural Science, Novosibirsk State University, 630090 Novosibirsk, Russia

4. Cain B, Webb J, Yuan Z, Cheung D, Lim HW, Kovall R, Weirauch MT, Gebelein B. Prediction of cooperative homeodomain DNA binding sites from high-throughput-SELEX data. Nucleic Acids Res 2023; 51:6055-6072. [PMID: 37114997 PMCID: PMC10325903 DOI: 10.1093/nar/gkad318] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/18/2022] [Revised: 04/12/2023] [Accepted: 04/25/2023] [Indexed: 04/29/2023] [Open access]
Abstract
Homeodomain proteins constitute one of the largest families of metazoan transcription factors. Genetic studies have demonstrated that homeodomain proteins regulate many developmental processes. Yet, biochemical data reveal that most bind highly similar DNA sequences. Defining how homeodomain proteins achieve DNA binding specificity has therefore been a long-standing goal. Here, we developed a novel computational approach to predict cooperative dimeric binding of homeodomain proteins using High-Throughput (HT) SELEX data. Importantly, we found that 15 of 88 homeodomain factors form cooperative homodimer complexes on DNA sites with precise spacing requirements. Approximately one third of the paired-like homeodomain proteins cooperatively bind palindromic sequences spaced 3 bp apart, whereas other homeodomain proteins cooperatively bind sites with distinct orientation and spacing requirements. Combining structural models of a paired-like factor with our cooperativity predictions identified key amino acid differences that help differentiate between cooperative and non-cooperative factors. Finally, we confirmed predicted cooperative dimer sites in vivo using available genomic data for a subset of factors. These findings demonstrate how HT-SELEX data can be computationally mined to predict cooperativity. In addition, the binding site spacing requirements of select homeodomain proteins provide a mechanism by which seemingly similar AT-rich DNA sequences can preferentially recruit specific homeodomain factors.
Affiliation(s)
- Brittany Cain
  - Department of Biomedical Engineering, University of Cincinnati, Cincinnati, OH 45221, USA
  - Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave, MLC 7007, Cincinnati, OH 45229, USA
- Jordan Webb
  - Department of Molecular Genetics, Biochemistry and Microbiology, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
- Zhenyu Yuan
  - Department of Molecular Genetics, Biochemistry and Microbiology, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
- David Cheung
  - Graduate Program in Molecular and Developmental Biology, Cincinnati Children's Hospital Research Foundation, Cincinnati, OH 45229, USA
- Hee-Woong Lim
  - Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA
- Rhett A Kovall
  - Department of Molecular Genetics, Biochemistry and Microbiology, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
- Matthew T Weirauch
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA
  - Divisions of Human Genetics, Biomedical Informatics and Developmental Biology, Center for Autoimmune Genomics and Etiology (CAGE), Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
- Brian Gebelein
  - Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave, MLC 7007, Cincinnati, OH 45229, USA
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA

5. Moeckel C, Zaravinos A, Georgakopoulos-Soares I. Strand Asymmetries Across Genomic Processes. Comput Struct Biotechnol J 2023; 21:2036-2047. [PMID: 36968020 PMCID: PMC10030826 DOI: 10.1016/j.csbj.2023.03.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Revised: 03/08/2023] [Accepted: 03/08/2023] [Indexed: 03/12/2023] [Open access]
Abstract
Across biological systems, a number of genomic processes, including transcription, replication, DNA repair, and transcription factor binding, display intrinsic directionalities. These directionalities are reflected in the asymmetric distribution of nucleotides, motifs, genes, transposon integration sites, and other functional elements across the two complementary strands. Strand asymmetries, including GC skews and mutational biases, have shaped the nucleotide composition of diverse organisms. The investigation of strand asymmetries often serves as a method to understand underlying biological mechanisms, including protein binding preferences, transcription factor interactions, retrotransposition, DNA damage and repair preferences, transcription-replication collisions, and mutagenesis mechanisms. Research into this subject also enables the identification of functional genomic sites, such as replication origins and transcription start sites. Improvements in our ability to detect and quantify DNA strand asymmetries will provide insights into diverse functionalities of the genome, the contribution of different mutational mechanisms in germline and somatic mutagenesis, and our knowledge of genome instability and evolution, which all have significant clinical implications in human disease, including cancer. In this review, we describe key developments that have been made across the field of genomic strand asymmetries, as well as the discovery of associated mechanisms.
Affiliation(s)
- Camille Moeckel
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Apostolos Zaravinos
  - Department of Life Sciences, European University Cyprus, Diogenis Str., 6, Nicosia 2404, Cyprus
  - Cancer Genetics, Genomics and Systems Biology laboratory, Basic and Translational Cancer Research Center (BTCRC), Nicosia 1516, Cyprus
  - Corresponding author at: Department of Life Sciences, European University Cyprus, Diogenis Str., 6, Nicosia 2404, Cyprus.
- Ilias Georgakopoulos-Soares
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Corresponding author.

6. Wu X, Liu S, Liang G. Detecting clusters of transcription factors based on a nonhomogeneous Poisson process model. BMC Bioinformatics 2022; 23:535. [PMID: 36494794 PMCID: PMC9738027 DOI: 10.1186/s12859-022-05090-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2022] [Accepted: 11/30/2022] [Indexed: 12/13/2022] [Open access]
Abstract
BACKGROUND: Rapidly growing genome-wide ChIP-seq data have provided unprecedented opportunities to explore transcription factor (TF) binding under various cellular conditions. Despite the rich resources, development of analytical methods for studying the interaction among TFs in gene regulation still lags behind.
RESULTS: In order to address cooperative TF binding and detect TF clusters with coordinative functions, we have developed novel computational methods based on clustering the sample paths of nonhomogeneous Poisson processes. Simulation studies demonstrated the capability of these methods to accurately detect TF clusters and uncover the hierarchy of TF interactions. A further application to the multiple-TF ChIP-seq data in mouse embryonic stem cells (ESCs) showed that our methods identified the cluster of core ESC regulators reported in the literature and provided new insights on functional implications of transcriptional regulatory modules.
CONCLUSIONS: Effective analytical tools are essential for studying protein-DNA relations. Information derived from this research will help us better understand the orchestration of transcription factors in gene regulation processes.
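For readers unfamiliar with the term, a "sample path" of a nonhomogeneous Poisson process can be pictured as a list of event positions (here, hypothetically, binding positions along a genomic coordinate) whose local density follows a position-dependent rate. The sketch below simulates one such path with the standard thinning technique. It does not implement the clustering method of this paper, and the function name, the rate function, and all parameter values are illustrative assumptions.

```python
import math
import random


def simulate_nhpp(rate, rate_max, t_end, rng):
    """Simulate event positions on (0, t_end] by thinning.

    Candidates are drawn from a homogeneous Poisson process with intensity
    rate_max; each candidate at position t is kept with probability
    rate(t) / rate_max. Requires 0 <= rate(t) <= rate_max on the interval.
    """
    events = []
    t = 0.0
    while True:
        t += rng.expovariate(rate_max)  # gap to the next candidate
        if t > t_end:
            return events
        if rng.random() < rate(t) / rate_max:
            events.append(t)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable

    def toy_rate(t):
        # Hypothetical intensity oscillating between 0 and 10 events per unit.
        return 5.0 + 5.0 * math.sin(t)

    path = simulate_nhpp(toy_rate, rate_max=10.0, t_end=20.0, rng=rng)
    print(len(path), "events; first few:", [round(x, 2) for x in path[:5]])
```

Paths simulated this way for several TFs could serve as toy input when experimenting with clustering ideas, under the assumption that each TF's binding events are modeled independently.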
Affiliation(s)
- Xiaowei Wu
  - Department of Statistics, Virginia Tech, 250 Drillfield Drive, Blacksburg, VA 24061, USA (GRID: grid.438526.e; ISNI: 0000 0001 0694 4940)
- Shicheng Liu
  - Department of Mathematics, Virginia Tech, 225 Stanger Street, Blacksburg, VA 24061, USA (GRID: grid.438526.e; ISNI: 0000 0001 0694 4940)
- Guanying Liang
  - Department of Mathematics, Virginia Tech, 225 Stanger Street, Blacksburg, VA 24061, USA (GRID: grid.438526.e; ISNI: 0000 0001 0694 4940)

7. Jiang Y, Harigaya Y, Zhang Z, Zhang H, Zang C, Zhang NR. Nonparametric single-cell multiomic characterization of trio relationships between transcription factors, target genes, and cis-regulatory regions. Cell Syst 2022; 13:737-751.e4. [PMID: 36055233 PMCID: PMC9509445 DOI: 10.1016/j.cels.2022.08.004] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 04/22/2022] [Revised: 06/23/2022] [Accepted: 08/11/2022] [Indexed: 01/26/2023]
Abstract
The epigenetic control of gene expression is highly cell-type and context specific. Yet, despite its complexity, gene regulatory logic can be broken down into modular components consisting of a transcription factor (TF) activating or repressing the target gene expression through its binding to a cis-regulatory region. We propose a nonparametric approach, TRIPOD, to detect and characterize the three-way relationships between a TF, its target gene, and the accessibility of the TF's binding site using single-cell RNA and ATAC multiomic data. We apply TRIPOD to interrogate the cell-type-specific regulatory logic in peripheral blood mononuclear cells and contrast our results to detections from enhancer databases, cis-eQTL studies, ChIP-seq experiments, and TF knockdown/knockout studies. We then apply TRIPOD to mouse embryonic brain data and identify regulatory relationships, validated by ChIP-seq and PLAC-seq. Finally, we demonstrate TRIPOD on the SHARE-seq data of differentiating mouse hair follicle cells and identify lineage-specific regulation supported by histone marks and super-enhancer annotations. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Yuchao Jiang
  - Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC 27599, USA
  - Department of Genetics, School of Medicine, University of North Carolina, Chapel Hill, NC 27599, USA
  - Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA
- Yuriko Harigaya
  - Curriculum in Bioinformatics and Computational Biology, School of Medicine, University of North Carolina, Chapel Hill, NC 27599, USA
- Zhaojun Zhang
  - Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA 19104, USA
- Hongpan Zhang
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA
  - Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA 22908, USA
- Chongzhi Zang
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA
  - Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA 22908, USA
  - Department of Public Health Sciences, University of Virginia, Charlottesville, VA 22908, USA
- Nancy R Zhang
  - Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA 19104, USA

8. Bentsen M, Heger V, Schultheis H, Kuenne C, Looso M. TF-COMB - discovering grammar of transcription factor binding sites. Comput Struct Biotechnol J 2022; 20:4040-4051. [PMID: 35983231 PMCID: PMC9358416 DOI: 10.1016/j.csbj.2022.07.025] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/14/2022] [Accepted: 07/12/2022] [Indexed: 02/07/2023] [Open access]
Abstract
Cooperativity between transcription factors is important to regulate target gene expression. In particular, the binding grammar of TFs in relation to each other, as well as in the context of other genomic elements, is crucial for TF functionality. However, tools to easily uncover co-occurrence between DNA-binding proteins, and investigate the regulatory modules of TFs, are limited. Here we present TF-COMB (Transcription Factor Co-Occurrence using Market Basket analysis) - a tool to investigate co-occurring TFs and binding grammar within regulatory regions. We found that TF-COMB can accurately identify known co-occurring TFs from ChIP-seq data, as well as uncover preferential localization to other genomic elements. With the use of ATAC-seq footprinting and TF motif locations, we found that TFs exhibit both preferred orientation and distance in relation to each other, and that these are biologically significant. Finally, we extended the analysis to not only investigate individual TF pairs, but also TF pairs in the context of networks, which enabled the investigation of TF complexes and TF hubs. In conclusion, TF-COMB is a flexible tool to investigate various aspects of TF binding grammar.
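As a rough illustration of what market basket analysis means in this setting, the Python sketch below treats each regulatory region as a "basket" holding the TFs bound there and scores TF pairs with two generic association measures, support and lift. This is not TF-COMB's API and not necessarily the score that tool reports. The function name, the toy TF labels, and the choice of measures are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_scores(regions):
    """Score TF pairs across regions.

    regions: list of sets, each holding the TF names found in one region.
    Returns {(tf_a, tf_b): (support, lift)} with tf_a < tf_b alphabetically.
      support = fraction of regions containing both TFs
      lift    = support / (P(tf_a) * P(tf_b)); values above 1 mean the pair
                co-occurs more often than expected if binding were independent
    """
    n = len(regions)
    single = Counter()
    pair = Counter()
    for tfs in regions:
        single.update(tfs)
        pair.update(combinations(sorted(tfs), 2))
    scores = {}
    for (a, b), count in pair.items():
        support = count / n
        lift = support / ((single[a] / n) * (single[b] / n))
        scores[(a, b)] = (support, lift)
    return scores


if __name__ == "__main__":
    # Toy baskets with placeholder TF names, not real data.
    toy_regions = [{"A", "B"}, {"A", "B"}, {"A"}, {"C"}]
    for tf_pair, (support, lift) in cooccurrence_scores(toy_regions).items():
        print(tf_pair, round(support, 3), round(lift, 3))
```

In the toy input, A and B share two of four regions, giving a support of 0.5 and a lift of about 1.33. Orientation and distance preferences, which the abstract also mentions, would need positional information that this basket representation discards.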
Affiliation(s)
- Mette Bentsen
  - Bioinformatics Core Unit (BCU), Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany
- Vanessa Heger
  - Bioinformatics Core Unit (BCU), Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany
- Hendrik Schultheis
  - Bioinformatics Core Unit (BCU), Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany
- Carsten Kuenne
  - Bioinformatics Core Unit (BCU), Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany
- Mario Looso
  - Bioinformatics Core Unit (BCU), Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany
  - Cardio-Pulmonary Institute (CPI), Bad Nauheim, Germany
  - Corresponding author at: Bioinformatics Core Unit (BCU), Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany.

9. Zheng D, Zhang W. Characterization of Expression and Epigenetic Features of Core Genes in Common Wheat. Genes (Basel) 2022; 13:1112. [PMID: 35885895 PMCID: PMC9317296 DOI: 10.3390/genes13071112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2022] [Revised: 06/16/2022] [Accepted: 06/20/2022] [Indexed: 12/10/2022] [Open access]
Abstract
The availability of multiple wheat genome sequences enables us to identify core genes and characterize their genetic and epigenetic features, thereby advancing our understanding of their biological implications within individual plant species. It is, however, largely understudied in wheat. To this end, we reanalyzed genome sequences from 16 different wheat varieties and identified 62,299 core genes. We found that core and non-core genes have different roles in subgenome differentiation. Meanwhile, according to their expression profiles, these core genes can be classified into genes related to tissue development and stress responses, including 3376 genes highly expressed in both spikelets and at high temperatures. After associating with six histone marks and open chromatin, we found that these core genes can be divided into eight sub-clusters with distinct epigenomic features. Furthermore, we found that ca. 51% of the expressed transcription factors (TFs) were marked with both H3K27me3 and H3K4me3, indicative of the bivalency feature, which can be involved in tissue development through the TF-centered regulatory network. Thus, our study provides a valuable resource for the functional characterization of core genes in stress responses and tissue development in wheat.

10. Krinsky BH, Arthur RK, Xia S, Sosa D, Arsala D, White KP, Long M. Rapid Cis-Trans Coevolution Driven by a Novel Gene Retroposed from a Eukaryotic Conserved CCR4-NOT Component in Drosophila. Genes (Basel) 2021; 13:57. [PMID: 35052398 PMCID: PMC8774992 DOI: 10.3390/genes13010057] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/20/2021] [Revised: 12/10/2021] [Accepted: 12/23/2021] [Indexed: 12/11/2022] [Open access]
Abstract
Young, or newly evolved, genes arise ubiquitously across the tree of life, and they can rapidly acquire novel functions that influence a diverse array of biological processes. Previous work identified a young regulatory duplicate gene in Drosophila, Zeus, that unexpectedly diverged rapidly from its parent, Caf40, an extremely conserved component in the CCR4-NOT machinery in post-transcriptional and post-translational regulation of eukaryotic cells, and took on roles in the male reproductive system. This neofunctionalization was accompanied by differential binding of the Zeus protein to loci throughout the Drosophila melanogaster genome. However, the way in which new DNA-binding proteins acquire and coevolve with their targets in the genome is not understood. Here, by comparing Zeus ChIP-Seq data from D. melanogaster and D. simulans to the ancestral Caf40 binding events from D. yakuba, a species that diverged before the duplication event, we found a dynamic pattern in which Zeus binding rapidly coevolved with a previously unknown DNA motif, which we term Caf40 and Zeus-Associated Motif (CAZAM), under the influence of positive selection. Interestingly, while both copies of Zeus acquired targets at male-biased and testis-specific genes, D. melanogaster and D. simulans proteins have specialized binding on different chromosomes, a pattern echoed in the evolution of the associated motif. Using CRISPR-Cas9-mediated gene knockout of Zeus and RNA-Seq, we found that Zeus regulated the expression of 661 differentially expressed genes (DEGs). Our results suggest that the evolution of young regulatory genes can be coupled to substantial rewiring of the transcriptional networks into which they integrate, even over short evolutionary timescales. Our results thus uncover dynamic genome-wide evolutionary processes associated with new genes.
Affiliation(s)
- Benjamin H. Krinsky
  - Committee on Evolutionary Biology, University of Chicago, Chicago, IL 60637, USA
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Robert K. Arthur
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
  - Institute for Genomics and Systems Biology, Department of Human Genetics, University of Chicago and Argonne National Laboratory, Chicago, IL 60637, USA
- Shengqian Xia
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Dylan Sosa
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Deanna Arsala
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Kevin P. White
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
  - Institute for Genomics and Systems Biology, Department of Human Genetics, University of Chicago and Argonne National Laboratory, Chicago, IL 60637, USA
- Manyuan Long
  - Committee on Evolutionary Biology, University of Chicago, Chicago, IL 60637, USA
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA

11. Dergilev AI, Orlova NG, Dobrovolskaya OB, Orlov YL. Statistical estimates of multiple transcription factors binding in the model plant genomes based on ChIP-seq data. J Integr Bioinform 2021; 19:jib-2020-0036. [PMID: 34953471 PMCID: PMC9069649 DOI: 10.1515/jib-2020-0036] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/16/2020] [Accepted: 11/25/2021] [Indexed: 12/03/2022] [Open access]
Abstract
The development of high-throughput genomic sequencing coupled with chromatin immunoprecipitation technologies allows the binding sites of protein transcription factors (TFs) to be studied at genome scale. The growth in the volume of data on experimentally determined binding sites raises qualitatively new problems for the analysis of gene expression regulation, the prediction of transcription factor target genes, and the reconstruction of regulatory gene networks. Genome regulation remains insufficiently studied in plants, even though plants have complex molecular mechanisms regulating gene expression and the response to environmental stresses. It is important to develop new software tools for analyzing the location of TF binding sites and their clustering in plant genomes, for visualization, and for the subsequent statistical estimates. This study presents an analysis of multiple TF binding profiles in three evolutionarily distant model plant organisms. The construction and analysis of non-random ChIP-seq binding clusters of different TFs in mammalian embryonic stem cells were discussed earlier using similar bioinformatics approaches. Such clusters of TF binding sites may indicate gene regulatory regions, enhancers, and gene transcription regulatory hubs. They can be used for the analysis of gene promoters as well as a basis for transcription network reconstruction. We discuss statistical estimates of the TF binding site clusters in the model plant genomes. The distributions of the number of different TFs per binding cluster follow the same power-law distribution for all the genomes studied. The binding clusters in the Arabidopsis thaliana genome are discussed here in detail.
Collapse
Affiliation(s)
- Arthur I. Dergilev
- Novosibirsk State University, 630090 Novosibirsk, Russia
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, 630090 Novosibirsk, Russia
| | - Nina G. Orlova
- Financial University under the Government of the Russian Federation, 125993 Moscow, Russia
- Moscow State Technical University of Civil Aviation, 125993 Moscow, Russia
| | - Oxana B. Dobrovolskaya
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, 630090 Novosibirsk, Russia
- Agrarian and Technological Institute, Peoples’ Friendship University of Russia, 117198 Moscow, Russia
| | - Yuriy L. Orlov
- Novosibirsk State University, 630090 Novosibirsk, Russia
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, 630090 Novosibirsk, Russia
- Agrarian and Technological Institute, Peoples’ Friendship University of Russia, 117198 Moscow, Russia
- The Digital Health Institute, I.M. Sechenov First Moscow State Medical University (Sechenov University), 119991 Moscow, Russia
| |
|
12
|
Yi X, Zheng Z, Xu H, Zhou Y, Huang D, Wang J, Feng X, Zhao K, Fan X, Zhang S, Dong X, Wang Z, Shen Y, Cheng H, Shi L, Li MJ. Interrogating cell type-specific cooperation of transcriptional regulators in 3D chromatin. iScience 2021; 24:103468. [PMID: 34888502 PMCID: PMC8634045 DOI: 10.1016/j.isci.2021.103468] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/02/2021] [Revised: 09/23/2021] [Accepted: 11/12/2021] [Indexed: 12/14/2022] Open
Abstract
Context-specific activities of transcription regulators (TRs) in the nucleus modulate spatiotemporal gene expression precisely. Using the largest ChIP-seq data and chromatin loops in the human K562 cell line, we initially interrogated TR cooperation in 3D chromatin via a graphical model and revealed many known and novel TRs manipulating context-specific pathways. To explore TR cooperation across broad tissue/cell types, we systematically leveraged large-scale open chromatin profiles, computational footprinting, and high-resolution chromatin interactions to investigate tissue/cell type-specific TR cooperation. We first delineated a landscape of TR cooperation across 40 human tissue/cell types. Network modularity analyses uncovered the commonality and specificity of TR cooperation in different conditions. We also demonstrated that TR cooperation information can better interpret the disease-causal variants identified by genome-wide association studies and recapitulate cell states during neural development. Our study characterizes shared and unique patterns of TR cooperation associated with the cell type specificity of gene regulation in 3D chromatin.
- Computational inference of transcriptional regulator (TR) cooperation in 3D chromatin
- A landscape of 3D TR cooperation across 40 human tissue/cell types
- TR cooperation can better interpret the disease-causal variants identified by GWAS
- Cooperation of certain TRs shapes context-specific gene regulation in cell development
Affiliation(s)
- Xianfu Yi
- School of Biomedical Engineering and Technology, Tianjin Medical University, Tianjin 300070, China.,Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China
| | - Zhanye Zheng
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Hang Xu
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China
| | - Yao Zhou
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China.,Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Dandan Huang
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China.,Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Jianhua Wang
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China.,Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Xiangling Feng
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China.,Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Ke Zhao
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China.,Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Xutong Fan
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China.,Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Shijie Zhang
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China.,Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Xiaobao Dong
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China.,Department of Genetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Zhao Wang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Yujun Shen
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Hui Cheng
- State Key Laboratory of Experimental Hematology, Chinese Academy of Medical Sciences, Tianjin 300070, China
| | - Lei Shi
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Mulin Jun Li
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China.,Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China.,Department of Epidemiology and Biostatistics, Tianjin Key Laboratory of Molecular Cancer Epidemiology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| |
|
13
|
Dibaeinia P, Sinha S. Deciphering enhancer sequence using thermodynamics-based models and convolutional neural networks. Nucleic Acids Res 2021; 49:10309-10327. [PMID: 34508359 PMCID: PMC8501998 DOI: 10.1093/nar/gkab765] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/16/2021] [Revised: 08/18/2021] [Accepted: 08/25/2021] [Indexed: 11/18/2022] Open
Abstract
Deciphering the sequence-function relationship encoded in enhancers holds the key to interpreting non-coding variants and understanding mechanisms of transcriptomic variation. Several quantitative models exist for predicting enhancer function and underlying mechanisms; however, there has been no systematic comparison of these models characterizing their relative strengths and shortcomings. Here, we interrogated a rich data set of neuroectodermal enhancers in Drosophila, representing cis- and trans- sources of expression variation, with a suite of biophysical and machine learning models. We performed rigorous comparisons of thermodynamics-based models implementing different mechanisms of activation, repression and cooperativity. Moreover, we developed a convolutional neural network (CNN) model, called CoNSEPT, that learns enhancer ‘grammar’ in an unbiased manner. CoNSEPT is the first general-purpose CNN tool for predicting enhancer function in varying conditions, such as different cell types and experimental conditions, and we show that such complex models can suggest interpretable mechanisms. We found model-based evidence for mechanisms previously established for the studied system, including cooperative activation and short-range repression. The data also favored one hypothesized activation mechanism over another and suggested an intriguing role for a direct, distance-independent repression mechanism. Our modeling shows that while fundamentally different models can yield similar fits to data, they vary in their utility for mechanistic inference. CoNSEPT is freely available at: https://github.com/PayamDiba/CoNSEPT.
Affiliation(s)
- Payam Dibaeinia
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Saurabh Sinha
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.,Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.,Cancer Center at Illinois, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| |
|
14
|
Dai S, Qu L, Li J, Chen Y. Toward a mechanistic understanding of DNA binding by forkhead transcription factors and its perturbation by pathogenic mutations. Nucleic Acids Res 2021; 49:10235-10249. [PMID: 34551426 PMCID: PMC8501956 DOI: 10.1093/nar/gkab807] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 04/15/2021] [Revised: 09/01/2021] [Accepted: 09/08/2021] [Indexed: 01/12/2023] Open
Abstract
Forkhead box (FOX) proteins are an evolutionarily conserved family of transcription factors that play numerous regulatory roles in eukaryotes during developmental and adult life. Dysfunction of FOX proteins has been implicated in a variety of human diseases, including cancer, neurodevelopment disorders and genetic diseases. The FOX family members share a highly conserved DNA-binding domain (DBD), which is essential for DNA recognition, binding and function. Since the first FOX structure was resolved in 1993, >30 FOX structures have been reported to date. It is clear now that the structure and DNA recognition mechanisms vary among FOX members; however, a systematic review on this aspect is lacking. In this manuscript, we present an overview of the mechanisms by which FOX transcription factors bind DNA, including protein structures, DNA binding properties and disease-causing mutations. This review should enable a better understanding of FOX family transcription factors for basic researchers and clinicians.
Affiliation(s)
- Shuyan Dai
- Department of Oncology, NHC Key Laboratory of Cancer Proteomics, Laboratory of Structural Biology, National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Linzhi Qu
- Department of Oncology, NHC Key Laboratory of Cancer Proteomics, Laboratory of Structural Biology, National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Jun Li
- Department of Oncology, NHC Key Laboratory of Cancer Proteomics, Laboratory of Structural Biology, National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Yongheng Chen
- Department of Oncology, NHC Key Laboratory of Cancer Proteomics, Laboratory of Structural Biology, National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| |
|
15
|
Waymack R, Gad M, Wunderlich Z. Molecular competition can shape enhancer activity in the Drosophila embryo. iScience 2021; 24:103034. [PMID: 34568782 PMCID: PMC8449247 DOI: 10.1016/j.isci.2021.103034] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/07/2021] [Revised: 07/27/2021] [Accepted: 08/20/2021] [Indexed: 01/12/2023] Open
Abstract
Transgenic reporters allow the measurement of regulatory DNA activity in vivo and consequently have long been useful tools for studying enhancers. Despite their utility, few studies have investigated the effects these reporters may have on the expression of other genes. Understanding these effects is required to accurately interpret reporter data and characterize gene regulatory mechanisms. By measuring the expression of Kruppel (Kr) enhancer reporters in live Drosophila embryos, we find reporters inhibit one another’s expression and that of a nearby endogenous gene. Using synthetic transcription factor (TF) binding site arrays, we present evidence that competition for TFs is partially responsible for the observed transcriptional inhibition. We develop a simple thermodynamic model that predicts competition of the measured magnitude specifically when TF binding is restricted to distinct nuclear subregions. Our findings underline an unexpected role of the non-homogenous nature of the nucleus in regulating gene expression.
- Live tracking of transcription reveals competition between transgenic reporters
- Transgenic reporters can also depress the expression of a neighboring gene
- Expression inhibition is in part because of competition for transcription factors (TFs)
- Competition is predicted with a model that restricts TFs to sub-nuclear “hubs”
Affiliation(s)
- Rachel Waymack
- Department of Developmental and Cell Biology, University of California, Irvine, CA 92697, USA
| | - Mario Gad
- Department of Developmental and Cell Biology, University of California, Irvine, CA 92697, USA
| | - Zeba Wunderlich
- Department of Developmental and Cell Biology, University of California, Irvine, CA 92697, USA.,Department of Biology, Boston University, 610 Commonwealth Ave., Boston, MA 02215, USA.,Biological Design Center, Boston University, 610 Commonwealth Avenue, Boston, MA 02215, USA
| |
|
16
|
Chen L, Capra JA. Learning and interpreting the gene regulatory grammar in a deep learning framework. PLoS Comput Biol 2020; 16:e1008334. [PMID: 33137083 PMCID: PMC7660921 DOI: 10.1371/journal.pcbi.1008334] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/16/2019] [Revised: 11/12/2020] [Accepted: 09/12/2020] [Indexed: 12/12/2022] Open
Abstract
Deep neural networks (DNNs) have achieved state-of-the-art performance in identifying gene regulatory sequences, but they have provided limited insight into the biology of regulatory elements due to the difficulty of interpreting the complex features they learn. Several models of how combinatorial binding of transcription factors, i.e. the regulatory grammar, drives enhancer activity have been proposed, ranging from the flexible TF billboard model to the stringent enhanceosome model. However, there is limited knowledge of the prevalence of these (or other) sequence architectures across enhancers. Here we perform several hypothesis-driven analyses to explore the ability of DNNs to learn the regulatory grammar of enhancers. We created synthetic datasets based on existing hypotheses about combinatorial transcription factor binding site (TFBS) patterns, including homotypic clusters, heterotypic clusters, and enhanceosomes, from real TF binding motifs from diverse TF families. We then trained deep residual neural networks (ResNets) to model the sequences under a range of scenarios that reflect real-world multi-label regulatory sequence prediction tasks. We developed a gradient-based unsupervised clustering method to extract the patterns learned by the ResNet models. We demonstrated that simulated regulatory grammars are best learned in the penultimate layer of the ResNets, and the proposed method can accurately retrieve the regulatory grammar even when there is heterogeneity in the enhancer categories and a large fraction of TFBS outside of the regulatory grammar. However, we also identify common scenarios where ResNets fail to learn simulated regulatory grammars. Finally, we applied the proposed method to mouse developmental enhancers and were able to identify the components of a known heterotypic TF cluster. 
Our results provide a framework for interpreting the regulatory rules learned by ResNets, and they demonstrate that the ability and efficiency of ResNets in learning the regulatory grammar depends on the nature of the prediction task.
Affiliation(s)
- Ling Chen
- Department of Biological Sciences, Vanderbilt University, Nashville, TN, United States of America
| | - John A. Capra
- Department of Biological Sciences, Vanderbilt University, Nashville, TN, United States of America
- Vanderbilt Genetics Institute and Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Department of Computer Science, Vanderbilt University, Nashville, TN, United States of America
| |
|
17
|
Pratihar S, Suseela YV, Govindaraju T. Threading Intercalator-Induced Nanocondensates and Role of Endogenous Metal Ions in Decondensation for DNA Delivery. ACS APPLIED BIO MATERIALS 2020; 3:6979-6991. [PMID: 35019357 DOI: 10.1021/acsabm.0c00870] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/24/2022]
Abstract
The interplay of condensation and decondensation of DNA plays a crucial role in chromosome maintenance and gene expression. The molecular architectonics governing the chromatin condensation-decondensation cycle are worth studying, as DNA performs unique and distinct roles in each state and switches between the two states without loss of structural and functional integrity. This phenomenon has been adapted and implemented in transfection studies. Effective gene delivery into cells with respectable transfection efficiency has remained a challenge, which emphasizes the need to understand the steps involved in DNA delivery and transfection. In particular, recognizing the factors that effectively regulate DNA decondensation can provide logical solutions to the hurdles affecting transfection efficiency. We designed a set of small molecule-based threading intercalation ligands as model condensing agents to study various factors influencing the DNA condensation and decondensation process. This study revealed that the threading intercalator condenses DNA into nanocondensates and that endogenous stimuli induce effective decondensation. Further, DNA nanocondensates were tracked using their intrinsic fluorescence at the lower pH of the endocytic pathway and were evaluated as nonviral vectors for in cellulo delivery of plasmids. The correlation of decondensation of DNA nanocondensates with endogenous metal ions at their physiological concentrations provided valuable insights and implications for intracellular DNA delivery.
Affiliation(s)
- Sumon Pratihar
- Bioorganic Chemistry Laboratory, New Chemistry Unit and School of Advanced Materials (SAMat), Jawaharlal Nehru Centre for Advanced Scientific Research (JNCASR), Jakkur, P.O., Bengaluru, Karnataka 560064, India
| | - Yelisetty Venkata Suseela
- Bioorganic Chemistry Laboratory, New Chemistry Unit and School of Advanced Materials (SAMat), Jawaharlal Nehru Centre for Advanced Scientific Research (JNCASR), Jakkur, P.O., Bengaluru, Karnataka 560064, India
| | - Thimmaiah Govindaraju
- Bioorganic Chemistry Laboratory, New Chemistry Unit and School of Advanced Materials (SAMat), Jawaharlal Nehru Centre for Advanced Scientific Research (JNCASR), Jakkur, P.O., Bengaluru, Karnataka 560064, India
| |
|
18
|
Levitsky V, Oshchepkov D, Zemlyanskaya E, Merkulova T. Asymmetric Conservation within Pairs of Co-Occurred Motifs Mediates Weak Direct Binding of Transcription Factors in ChIP-Seq Data. Int J Mol Sci 2020; 21:E6023. [PMID: 32825662 PMCID: PMC7504069 DOI: 10.3390/ijms21176023] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/25/2020] [Revised: 08/18/2020] [Accepted: 08/18/2020] [Indexed: 12/30/2022] Open
Abstract
(1) Background: Transcription factors (TFs) are the main regulators of eukaryotic gene expression. Cooperative binding of at least two TFs to genomic DNA is a widespread mechanism of transcription regulation. Cooperating TFs can be revealed through the analysis of co-occurrence of their motifs. (2) Methods: We applied the motifs co-occurrence tool (MCOT), which predicts pairs of spaced or overlapped motifs (composite elements, CEs) for a single ChIP-seq dataset. We improved MCOT's capability to predict asymmetric CEs, in which one of the participating motifs is more conserved than the other. (3) Results: Analysis of 119 ChIP-seq datasets for 45 human TFs revealed that, for almost all families of TFs, co-occurrence with an overlap between motifs of target TFs and more conserved partner motifs was significantly higher than that for less conserved partner motifs. The asymmetry toward partner TFs was clearest for partner motifs of TFs from the ETS (E26 Transformation Specific) family. (4) Conclusion: Co-occurrence with an overlap of a less conserved motif of a target TF and more conserved motifs of partner TFs explained a substantial portion of ChIP-seq data lacking conserved motifs of target TFs. Among TF families, conserved motifs of TFs from the ETS family were the most prone to mediate interaction of target TFs with their weak motifs in ChIP-seq.
Affiliation(s)
- Victor Levitsky
- Department of System Biology, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
- Department of Natural Science, Novosibirsk State University, 630090 Novosibirsk, Russia
| | - Dmitry Oshchepkov
- Department of System Biology, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
| | - Elena Zemlyanskaya
- Department of System Biology, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
- Department of Natural Science, Novosibirsk State University, 630090 Novosibirsk, Russia
| | - Tatyana Merkulova
- Department of System Biology, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
- Department of Natural Science, Novosibirsk State University, 630090 Novosibirsk, Russia
| |
|
19
|
Levitsky V, Oshchepkov D, Zemlyanskaya E, Merkulova T. Asymmetric Conservation within Pairs of Co-Occurred Motifs Mediates Weak Direct Binding of Transcription Factors in ChIP-Seq Data. Int J Mol Sci 2020; 21:ijms21176023. [PMID: 32825662 DOI: 10.20944/preprints202007.0639.v2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/25/2020] [Revised: 08/18/2020] [Accepted: 08/18/2020] [Indexed: 05/28/2023] Open
Affiliation(s)
- Victor Levitsky
- Department of System Biology, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
- Department of Natural Science, Novosibirsk State University, 630090 Novosibirsk, Russia
| | - Dmitry Oshchepkov
- Department of System Biology, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
| | - Elena Zemlyanskaya
- Department of System Biology, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
- Department of Natural Science, Novosibirsk State University, 630090 Novosibirsk, Russia
| | - Tatyana Merkulova
- Department of System Biology, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
- Department of Natural Science, Novosibirsk State University, 630090 Novosibirsk, Russia
| |
|
20
|
Levitsky V, Zemlyanskaya E, Oshchepkov D, Podkolodnaya O, Ignatieva E, Grosse I, Mironova V, Merkulova T. A single ChIP-seq dataset is sufficient for comprehensive analysis of motifs co-occurrence with MCOT package. Nucleic Acids Res 2020; 47:e139. [PMID: 31750523 PMCID: PMC6868382 DOI: 10.1093/nar/gkz800] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 03/30/2019] [Revised: 08/12/2019] [Accepted: 09/09/2019] [Indexed: 01/20/2023] Open
Abstract
Recognition of composite elements consisting of two transcription factor binding sites lies behind studies of tissue-, stage- and condition-specific transcription. Genome-wide data on transcription factor binding generated with the ChIP-seq method facilitate the identification of composite elements, but the existing bioinformatics tools either require ChIP-seq datasets for both partner transcription factors or omit composite elements with overlapping motifs. Here we present a universal Motifs Co-Occurrence Tool (MCOT) that retrieves maximum information about overrepresented composite elements from a single ChIP-seq dataset. This includes homo- and heterotypic composite elements in four mutual orientations of motifs, separated by a spacer or overlapping, even if recognition of motifs within a composite element requires different stringencies. Analysis of 52 ChIP-seq datasets for 18 human transcription factors confirmed that, for over 60% of analyzed datasets and transcription factors, predicted co-occurrence of motifs implied experimentally proven protein-protein interaction of the respective transcription factors. Analysis of 164 ChIP-seq datasets for 57 mammalian transcription factors showed that predicted composite elements with an overlap of motifs were more than twice as abundant as those with a spacer, and that they had a 1.5-fold increase in asymmetrical pairs of motifs, with one more conserved 'leading' motif and another 'guided' one.
Affiliation(s)
- Victor Levitsky
- Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia.,Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia
| | - Elena Zemlyanskaya
- Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia.,Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia
| | - Dmitry Oshchepkov
- Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia
| | - Olga Podkolodnaya
- Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia
| | - Elena Ignatieva
- Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia.,Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia
| | - Ivo Grosse
- Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia.,Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany.,German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Leipzig, Germany
| | - Victoria Mironova
- Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia.,Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia
| | - Tatyana Merkulova
- Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia.,Department of Molecular Genetics, Institute of Cytology and Genetics, Novosibirsk 630090, Russia
| |
|
21
|
Toivonen J, Das PK, Taipale J, Ukkonen E. MODER2: first-order Markov modeling and discovery of monomeric and dimeric binding motifs. Bioinformatics 2020; 36:2690-2696. [PMID: 31999322 PMCID: PMC7203737 DOI: 10.1093/bioinformatics/btaa045] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/21/2019] [Revised: 12/23/2019] [Accepted: 01/23/2020] [Indexed: 12/21/2022] Open
Abstract
MOTIVATION: Position-specific probability matrices (PPMs, also called position-specific weight matrices) have been the dominating model for transcription factor (TF)-binding motifs in DNA. There is, however, increasing recent evidence of better performance of higher order models such as Markov models of order one, also called adjacent dinucleotide matrices (ADMs). ADMs can model dependencies between adjacent nucleotides, unlike PPMs. A modeling technique and software tool that would estimate such models simultaneously both for monomers and their dimers have been missing. RESULTS: We present an ADM-based mixture model for monomeric and dimeric TF-binding motifs and an expectation maximization algorithm MODER2 for learning such models from training data and seeds. The model is a mixture that includes monomers and dimers, built from the monomers, with a description of the dimeric structure (spacing, orientation). The technique is modular, meaning that the co-operative effect of dimerization is made explicit by evaluating the difference between expected and observed models. The model is validated using HT-SELEX and generated datasets, and by comparing to some earlier PPM and ADM techniques. The ADM models explain data slightly better than PPM models for 314 tested TFs (or their DNA-binding domains) from four families (bHLH, bZIP, ETS and Homeodomain), the ADM mixture models by MODER2 being the best on average. AVAILABILITY AND IMPLEMENTATION: Software implementation is available from https://github.com/jttoivon/moder2. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
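The difference between a PPM and a first-order Markov model (ADM) can be seen in how each scores a candidate site. The Python sketch below is only an illustration of that distinction, not the MODER2 algorithm or its API: the function names and all probability values are hypothetical, and the expectation maximization step and the dimer mixture are not shown.

```python
import math

BASES = "ACGT"

def ppm_log_prob(seq, ppm):
    """Log-probability of seq under a PPM, where positions are independent."""
    return sum(math.log(ppm[i][base]) for i, base in enumerate(seq))

def adm_log_prob(seq, initial, transitions):
    """Log-probability of seq under a first-order (adjacent dinucleotide) model.

    initial[b] is P(first base = b); transitions[i][a][b] is
    P(base at i+1 = b | base at i = a).
    """
    logp = math.log(initial[seq[0]])
    for i in range(len(seq) - 1):
        logp += math.log(transitions[i][seq[i]][seq[i + 1]])
    return logp

# Toy two-position motif (made-up numbers): both models share the same
# single-base frequencies, but the ADM also encodes that A tends to follow A
# and C tends to follow C.
ppm = [{"A": 0.4, "C": 0.4, "G": 0.1, "T": 0.1}] * 2
initial = ppm[0]
uniform = {b: 0.25 for b in BASES}
transitions = [{
    "A": {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    "C": {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    "G": uniform,
    "T": uniform,
}]

ppm_aa = ppm_log_prob("AA", ppm)
ppm_ac = ppm_log_prob("AC", ppm)
adm_aa = adm_log_prob("AA", initial, transitions)
adm_ac = adm_log_prob("AC", initial, transitions)
```

In this toy setting the PPM cannot tell "AA" from "AC", while the ADM assigns "AA" a higher probability because it conditions each base on its left neighbor.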
Affiliation(s)
- Jarkko Toivonen
- Department of Computer Science, University of Helsinki, Helsinki FI-00014, Finland
| | - Pratyush K Das
- Applied Tumor Genomics, Research Programs Unit, University of Helsinki, Helsinki FI-00014, Finland
| | - Jussi Taipale
- Department of Biochemistry, University of Cambridge, CB2 1GA Cambridge, UK
- Division of Functional Genomics and Systems Biology, Department of Medical Biochemistry and Biophysics, SE 141 83 Stockholm, Sweden
- Department of Biosciences and Nutrition, Karolinska Institutet, SE 141 83 Stockholm, Sweden
- Genome-Scale Biology Program, University of Helsinki, Helsinki FI-00014, Finland
| | - Esko Ukkonen
- Department of Computer Science, University of Helsinki, Helsinki FI-00014, Finland
|
22
|
Multi-level and lineage-specific interactomes of the Hox transcription factor Ubx contribute to its functional specificity. Nat Commun 2020; 11:1388. [PMID: 32170121 PMCID: PMC7069958 DOI: 10.1038/s41467-020-15223-x] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 06/26/2019] [Accepted: 02/21/2020] [Indexed: 12/21/2022]
Abstract
Transcription factors (TFs) control cell fates by precisely orchestrating gene expression. However, how individual TFs promote transcriptional diversity remains unclear. Here, we use the Hox TF Ultrabithorax (Ubx) as a model to explore how a single TF specifies multiple cell types. Using proximity-dependent Biotin IDentification in Drosophila, we identify Ubx interactomes in three embryonic tissues. We find that Ubx interacts with largely non-overlapping sets of proteins, few of which have tissue-specific RNA expression. Instead, most interactors are active in many cell types, controlling gene expression from chromatin regulation to the initiation of translation. Genetic interaction assays in vivo confirm that they act in a strictly lineage- and process-specific manner. Thus, functional specificity of Ubx seems to play out at several regulatory levels and to result from the controlled restriction of the interaction potential by the cellular environment. Thereby, it challenges long-standing assumptions such as differential RNA expression as a determinant of protein complexes. Many transcription factors regulate gene expression in a lineage- and process-specific manner, despite being expressed in several cell types. Here, the authors show that the Hox transcription factor Ubx has lineage-specific interactomes, which contribute to its cell context-dependent functions.
|
23
|
Panchy NL, Lloyd JP, Shiu SH. Improved recovery of cell-cycle gene expression in Saccharomyces cerevisiae from regulatory interactions in multiple omics data. BMC Genomics 2020; 21:159. [PMID: 32054475 PMCID: PMC7020519 DOI: 10.1186/s12864-020-6554-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/02/2019] [Accepted: 02/04/2020] [Indexed: 12/11/2022]
Abstract
BACKGROUND Gene expression is regulated by DNA-binding transcription factors (TFs). Together with their target genes, these factors and their interactions collectively form a gene regulatory network (GRN), which is responsible for producing patterns of transcription, including cyclical processes such as genome replication and cell division. However, identifying how this network regulates the timing of these patterns, including important interactions and regulatory motifs, remains a challenging task. RESULTS We employed four in vivo and in vitro regulatory data sets to investigate the regulatory basis of expression timing and phase-specific patterns of cell-cycle expression in Saccharomyces cerevisiae. Specifically, we considered interactions based on direct binding between TF and target gene, indirect effects of TF deletion on gene expression, and computational inference. We found that the source of regulatory information significantly impacts the accuracy and completeness of recovering known cell-cycle expressed genes. The best approach involved combining TF-target and TF-TF interaction features from multiple datasets in a single model. In addition, TFs important to multiple phases of cell-cycle expression also have the greatest impact on individual phases. Important TFs regulating a cell-cycle phase also tend to form modules in the GRN, including two sub-modules composed entirely of unannotated cell-cycle regulators (STE12-TEC1 and RAP1-HAP1-MSN4). CONCLUSION Our findings illustrate the importance of integrating both multiple omics data and regulatory motifs in order to understand the significance of regulatory interactions involved in timing gene expression. This integrated approach allowed us to recover both known cell-cycle interactions and the overall pattern of phase-specific expression across the cell cycle better than any single data set. Likewise, by looking at regulatory motifs in the form of TF-TF interactions, we identified sets of TFs whose co-regulation of target genes was important for cell-cycle expression, even when regulation by individual TFs was not. Overall, this demonstrates the power of integrating multiple data sets and models of interaction in order to understand the regulatory basis of established biological processes and their associated gene regulatory networks.
Affiliation(s)
- Nicholas L Panchy
- Genetics Graduate Program, Michigan State University, East Lansing, MI, 48824, USA; Present address: National Institute for Mathematical and Biological Synthesis, University of Tennessee, 1122 Volunteer Blvd., Suite 106, Knoxville, TN, 37996-3410, USA
| | - John P Lloyd
- Department of Human Genetics and Internal Medicine, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Shin-Han Shiu
- Genetics Graduate Program, Michigan State University, East Lansing, MI, 48824, USA; Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA; Michigan State University, Plant Biology Laboratories, 612 Wilson Road, Room 166, East Lansing, MI, 48824-1312, USA
|
24
|
Mahmud AKMF, Yang D, Stenberg P, Ioshikhes I, Nandi S. Exploring a Drosophila Transcription Factor Interaction Network to Identify Cis-Regulatory Modules. J Comput Biol 2019; 27:1313-1328. [PMID: 31855461 DOI: 10.1089/cmb.2018.0160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Multiple transcription factors (TFs) bind to specific sites in the genome and interact among themselves to form cis-regulatory modules (CRMs). These modules are essential in modulating the expression of genes, and it is important to study this interplay to understand gene regulation. In the present study, we integrated experimentally identified TF binding sites collected from published studies with computationally predicted TF binding sites to identify Drosophila CRMs. Along with the detection of previously known CRMs, this approach identified novel protein combinations. We determined high-occupancy target sites, where a large number of TFs bind. Investigating these sites revealed that Giant, Dichaete, and Knirps are highly enriched in these locations. A common TAGteam motif was observed at these sites, which might play a role in recruiting other TFs. While comparing the binding sites at distal and proximal promoters, we found that certain regulatory TFs, such as Zelda, were highly enriched in enhancers. Our study has shown that, from the information available concerning the TF binding sites, the real CRMs could be predicted accurately and efficiently. Although we may claim only co-occurrence of these proteins in this study, it may actually point to their interaction (as known interacting proteins typically co-occur). Such an integrative approach can, therefore, help provide a better understanding of the interplay among the factors, even though further experimental verification is required.
Affiliation(s)
| | - Doo Yang
- Ottawa Institute of Computational Biology and Bioinformatics (OICBB) and Ottawa Institute of Systems Biology (OISB) and Department of Biochemistry, Microbiology and Immunology (BMI), Faculty of Medicine, University of Ottawa, Ottawa, Canada
| | - Per Stenberg
- Department of Molecular Biology, Umeå University, Umeå, Sweden
| | - Ilya Ioshikhes
- Ottawa Institute of Computational Biology and Bioinformatics (OICBB) and Ottawa Institute of Systems Biology (OISB) and Department of Biochemistry, Microbiology and Immunology (BMI), Faculty of Medicine, University of Ottawa, Ottawa, Canada
| | - Soumyadeep Nandi
- Life Sciences Division, Institute of Advanced Study in Science and Technology, Vigyan Path, Paschim Boragaon, Guwahati, India; Amity University Haryana, Gurugram, India
|
25
|
Saha TT, Roy S, Pei G, Dou W, Zou Z, Raikhel AS. Synergistic action of the transcription factors Krüppel homolog 1 and Hairy in juvenile hormone/Methoprene-tolerant-mediated gene-repression in the mosquito Aedes aegypti. PLoS Genet 2019; 15:e1008443. [PMID: 31661489 PMCID: PMC6818763 DOI: 10.1371/journal.pgen.1008443] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 03/19/2019] [Accepted: 09/23/2019] [Indexed: 12/31/2022]
Abstract
Arthropod-specific juvenile hormones control numerous essential functions in development and reproduction. In the dengue-fever mosquito Aedes aegypti, in addition to its role in immature stages, juvenile hormone III (JH) governs post-eclosion (PE) development in adult females, a phase required for competence acquisition for blood feeding and subsequent egg maturation. During PE, JH through its receptor Methoprene-tolerant (Met) regulates the expression of many genes, causing either activation or repression. Met-mediated gene repression is indirect, requiring involvement of intermediate repressors. Hairy, which functions downstream of Met in the JH gene-repression hierarchy, is one such factor. Krüppel-homolog 1 (Kr-h1), a zinc-finger transcription factor, is directly regulated by Met and has been implicated in both activation and repression of JH-regulated genes. However, the interaction between Hairy and Kr-h1 in the JH-repression hierarchy is not well understood. Our RNAseq-based transcriptomic analysis of the Kr-h1-depleted mosquito fat body revealed that 92% of Kr-h1-repressed genes are also repressed by Met, supporting the existence of a hierarchy between Met and Kr-h1 as previously demonstrated in various insects. Notably, 130 genes are co-repressed by both Kr-h1 and Hairy, indicating regulatory complexity of the JH-mediated PE gene repression. A mosquito Kr-h1 binding site in genes co-regulated by this factor and Hairy was identified computationally and validated using electrophoretic mobility shift assays. A complete phenocopy of the effect of Met RNAi depletion on target genes could only be observed after Kr-h1 and Hairy double RNAi knockdown, suggesting a synergistic action between these two factors in target gene repression. This was confirmed using a cell-culture-based luciferase reporter assay.
Taken together, our results indicate that Hairy and Kr-h1 not only function as intermediate downstream factors, but also act together in a synergistic fashion in the JH/Met gene repression hierarchy. Juvenile hormone (JH) plays an essential role in preparing Aedes aegypti female mosquitoes for blood feeding, egg development, and pathogen transmission. JH acting through its receptor Methoprene-tolerant (Met) regulates the expression of large gene cohorts. JH-mediated gene repression, unlike activation that is directly mediated by Met, is indirect and requires the intermediate transcriptional repressors Hairy and Krüppel-homolog 1 (Kr-h1). Here, we demonstrate that Hairy and Kr-h1 can act synergistically in the JH-Met gene repression pathway in Aedes female mosquitoes. These interact directly with regulatory regions of the genes that have both Hairy and Kr-h1 binding sites. Thus, this study has significantly advanced our understanding of the complexity of the JH-mediated gene expression pathway. This research yields valuable information about the JH control of reproductive development of the mosquito A. aegypti, one of the most important vectors of human diseases.
Affiliation(s)
- Tusar T. Saha
- Department of Entomology and Institute of Integrative Biology, University of California, Riverside, California, United States of America
- Department of Biological Sciences, Birla Institute of Technology and Science Pilani, K. K. Birla Goa Campus, Goa, India
| | - Sourav Roy
- Department of Entomology and Institute of Integrative Biology, University of California, Riverside, California, United States of America
- Department of Biological Sciences, University of Texas El Paso, Texas
| | - Gaofeng Pei
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Wei Dou
- Department of Entomology and Institute of Integrative Biology, University of California, Riverside, California, United States of America
- College of Plant Protection, Southwest University, Chongqing, China
| | - Zhen Zou
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Alexander S. Raikhel
- Department of Entomology and Institute of Integrative Biology, University of California, Riverside, California, United States of America
|
26
|
Toivonen J, Kivioja T, Jolma A, Yin Y, Taipale J, Ukkonen E. Modular discovery of monomeric and dimeric transcription factor binding motifs for large data sets. Nucleic Acids Res 2019; 46:e44. [PMID: 29385521 PMCID: PMC5934673 DOI: 10.1093/nar/gky027] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 12/01/2017] [Accepted: 01/12/2018] [Indexed: 01/06/2023]
Abstract
In some dimeric cases of transcription factor (TF) binding, the specificity of dimeric motifs has been observed to differ notably from what would be expected were the two factors to bind to DNA independently of each other. Current motif discovery methods are unable to learn monomeric and dimeric motifs in a modular fashion such that deviations from the expected motif would become explicit and the noise from dimeric occurrences would not corrupt monomeric models. We propose a novel modeling technique and an expectation maximization algorithm, implemented as the software tool MODER, for discovering monomeric TF binding motifs and their dimeric combinations. Given training data and seeds for monomeric motifs, the algorithm learns in the same probabilistic framework a mixture model which represents monomeric motifs as standard position-specific probability matrices (PPMs), and dimeric motifs as pairs of monomeric PPMs, with associated orientation and spacing preferences. For dimers the model represents deviations from the pure modular model of two independent monomers, thus making co-operative binding effects explicit. MODER can analyze tens of Mbp of training data in reasonable time. We validated the tool on HT-SELEX and ChIP-seq data. Our findings include some TFs whose expected model has palindromic symmetry but whose observed model is directional.
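The idea of an "expected" dimer model built from two independent monomers, and a deviation of the observed model from it, can be sketched in Python as follows. This is a simplified illustration and not MODER's code: it assumes non-overlapping half-sites separated by a uniform gap, and the names `expected_dimer`, `deviation`, and the toy matrix are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
BASES = "ACGT"

def reverse_complement(ppm):
    """Reverse the column order and swap each base for its complement."""
    return [{COMPLEMENT[b]: p for b, p in col.items()} for col in reversed(ppm)]

def expected_dimer(left, right, spacing, flip_right=False):
    """Expected dimer PPM if the two monomers bound independently.

    Assumption for this sketch: spacing >= 0, and gap columns are uniform.
    """
    if spacing < 0:
        raise ValueError("this sketch only handles non-overlapping half-sites")
    second = reverse_complement(right) if flip_right else right
    gap = [{b: 0.25 for b in BASES} for _ in range(spacing)]
    return [dict(col) for col in left] + gap + [dict(col) for col in second]

def deviation(observed, expected):
    """Column-wise difference between observed and expected dimer models."""
    return [{b: obs[b] - exp[b] for b in BASES} for obs, exp in zip(observed, expected)]

# Hypothetical two-column monomer preferring "AC".
monomer = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
# Head-to-head arrangement with a one-base gap: AC n GT.
expected = expected_dimer(monomer, monomer, spacing=1, flip_right=True)
print(len(expected), expected[3], expected[4])
```

A deviation of zero everywhere would indicate purely modular binding; nonzero entries mark positions where the observed dimer differs from two independent monomers.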
Affiliation(s)
- Jarkko Toivonen
- Department of Computer Science, P.O. Box 68, FI-00014 University of Helsinki, Helsinki, Finland
| | - Teemu Kivioja
- Genome-Scale Biology Program, P.O. Box 63, FI-00014 University of Helsinki, Helsinki, Finland
| | - Arttu Jolma
- Division of Functional Genomics and Systems Biology, Department of Medical Biochemistry and Biophysics, and Department of Biosciences and Nutrition, Karolinska Institutet, SE 141 83 Stockholm, Sweden
| | - Yimeng Yin
- Division of Functional Genomics and Systems Biology, Department of Medical Biochemistry and Biophysics, and Department of Biosciences and Nutrition, Karolinska Institutet, SE 141 83 Stockholm, Sweden
| | - Jussi Taipale
- Genome-Scale Biology Program, P.O. Box 63, FI-00014 University of Helsinki, Helsinki, Finland; Division of Functional Genomics and Systems Biology, Department of Medical Biochemistry and Biophysics, and Department of Biosciences and Nutrition, Karolinska Institutet, SE 141 83 Stockholm, Sweden; Department of Biochemistry, University of Cambridge, CB2 1GA Cambridge, UK
| | - Esko Ukkonen
- Department of Computer Science, P.O. Box 68, FI-00014 University of Helsinki, Helsinki, Finland; Helsinki Institute for Information Technology HIIT, University of Helsinki & Aalto University, Helsinki, Finland
|
27
|
Ghosh RP, Shi Q, Yang L, Reddick MP, Nikitina T, Zhurkin VB, Fordyce P, Stasevich TJ, Chang HY, Greenleaf WJ, Liphardt JT. Satb1 integrates DNA binding site geometry and torsional stress to differentially target nucleosome-dense regions. Nat Commun 2019; 10:3221. [PMID: 31324780 PMCID: PMC6642133 DOI: 10.1038/s41467-019-11118-8] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 11/15/2018] [Accepted: 06/20/2019] [Indexed: 01/12/2023]
Abstract
The Satb1 genome organizer regulates multiple cellular and developmental processes. It is not yet clear how Satb1 selects different sets of targets throughout the genome. Here we have used live-cell single molecule imaging and deep sequencing to assess determinants of Satb1 binding-site selectivity. We have found that Satb1 preferentially targets nucleosome-dense regions and can directly bind consensus motifs within nucleosomes. Some genomic regions harbor multiple, regularly spaced Satb1 binding motifs (typical separation ~1 turn of the DNA helix) characterized by highly cooperative binding. The Satb1 homeodomain is dispensable for high affinity binding but is essential for specificity. Finally, we find that Satb1-DNA interactions are mechanosensitive. Increasing negative torsional stress in DNA enhances Satb1 binding and Satb1 stabilizes base unpairing regions against melting by molecular machines. The ability of Satb1 to control diverse biological programs may reflect its ability to combinatorially use multiple site selection criteria.
Affiliation(s)
- Rajarshi P Ghosh
- Bioengineering, Stanford University, Stanford, CA, 94305, USA
- BioX Institute, Stanford University, Stanford, CA, 94305, USA
- ChEM-H, Stanford University, Stanford, CA, 94305, USA
- Cell Biology Division, Stanford Cancer Institute, Stanford, CA, 94305, USA
| | - Quanming Shi
- Bioengineering, Stanford University, Stanford, CA, 94305, USA
- BioX Institute, Stanford University, Stanford, CA, 94305, USA
- ChEM-H, Stanford University, Stanford, CA, 94305, USA
- Cell Biology Division, Stanford Cancer Institute, Stanford, CA, 94305, USA
| | - Linfeng Yang
- Bioengineering, Stanford University, Stanford, CA, 94305, USA
- BioX Institute, Stanford University, Stanford, CA, 94305, USA
- ChEM-H, Stanford University, Stanford, CA, 94305, USA
- Cell Biology Division, Stanford Cancer Institute, Stanford, CA, 94305, USA
| | - Michael P Reddick
- Bioengineering, Stanford University, Stanford, CA, 94305, USA
- BioX Institute, Stanford University, Stanford, CA, 94305, USA
- ChEM-H, Stanford University, Stanford, CA, 94305, USA
- Cell Biology Division, Stanford Cancer Institute, Stanford, CA, 94305, USA
- Chemical Engineering, Stanford University, Stanford, CA, 94305, USA
| | - Tatiana Nikitina
- Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Victor B Zhurkin
- Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Polly Fordyce
- Bioengineering, Stanford University, Stanford, CA, 94305, USA
- ChEM-H, Stanford University, Stanford, CA, 94305, USA
- Department of Genetics, Stanford University, Stanford, CA, 94305, USA
- Chan Zuckerberg Biohub, San Francisco, CA, 94158, USA
| | - Timothy J Stasevich
- Department of Biochemistry and Molecular Biology and the Institute for Genome Architecture and Function, Colorado State University, Fort Collins, CO, USA
| | - Howard Y Chang
- Department of Genetics, Stanford University, Stanford, CA, 94305, USA
- Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA, 94305, USA
- Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
| | - William J Greenleaf
- Department of Genetics, Stanford University, Stanford, CA, 94305, USA
- Department of Applied Physics, Stanford University, Stanford, United States
| | - Jan T Liphardt
- Bioengineering, Stanford University, Stanford, CA, 94305, USA
- BioX Institute, Stanford University, Stanford, CA, 94305, USA
- ChEM-H, Stanford University, Stanford, CA, 94305, USA
- Cell Biology Division, Stanford Cancer Institute, Stanford, CA, 94305, USA
|
28
|
Assad N, Tillo D, Ray S, Dzienny A, FitzGerald PC, Vinson C. GABPα and CREB1 Binding to Double Nucleotide Polymorphisms of Their Consensus Motifs and Cooperative Binding to the Composite ETS ⇔ CRE Motif (ACCGGAAGTGACGTCA). ACS Omega 2019; 4:9904-9910. [PMID: 34151054 PMCID: PMC8208074 DOI: 10.1021/acsomega.9b00540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2019] [Accepted: 05/24/2019] [Indexed: 06/13/2023]
Abstract
Previously, cooperative binding of the bZIP domain of CREB1 and the ETS domain of GABPα was observed for the composite DNA ETS ⇔ CRE motif (A₀C₁C₂G₃G₄A₅A₆G₇T₈G₉A₁₀C₁₁G₁₂T₁₃C₁₄A₁₅). Single nucleotide polymorphisms (SNPs) at the beginning and end of the ETS motif (ACCGGAAGT) increased cooperative binding. Here, we use an Agilent microarray of 60-mers containing all double nucleotide polymorphisms (DNPs) of the ETS ⇔ CRE motif to explore GABPα and CREB1 binding to their individual motifs and their cooperative binding. For GABPα, all DNPs were bound as if each SNP acted independently. In contrast, CREB1 binding to some DNPs was stronger or weaker than expected, depending on the locations of each SNP. CREB1 binding to DNPs where both SNPs were in the same half site, T₈G₉A₁₀ or T₁₃C₁₄A₁₅, was greater than expected, indicating that an additional SNP cannot destroy binding as much as expected, suggesting that an individual SNP is enough to abolish sequence-specific DNA binding of a single bZIP monomer. If a DNP contains SNPs in each half site, binding is weaker than expected. Similar results were observed for additional ETS and bZIP family members. Cooperative binding between GABPα and CREB1 to the ETS ⇔ CRE motif was weaker than expected except for DNPs containing A₇ and SNPs at the beginning of the ETS motif.
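The comparison of observed and "expected" DNP binding can be illustrated with a small Python sketch. It assumes a multiplicative independence model, in which the expected relative binding of a DNP is the product of the relative binding of its two SNPs (additive in log space). That model, the function names, and the numbers are assumptions for illustration; the abstract does not state how the authors computed their expectation.

```python
import math

def expected_dnp(rel_snp_a, rel_snp_b):
    """Expected DNP binding relative to consensus if the two SNPs act independently.

    Each argument is binding to the single-SNP probe divided by binding to
    the consensus probe (hypothetical microarray intensities).
    """
    return rel_snp_a * rel_snp_b

def log2_deviation(observed_dnp, rel_snp_a, rel_snp_b):
    """Positive: DNP binds better than expected; negative: worse than expected."""
    return math.log2(observed_dnp / expected_dnp(rel_snp_a, rel_snp_b))

# Hypothetical relative binding values for two SNPs and their DNP.
rel_a, rel_b = 0.5, 0.4
print(expected_dnp(rel_a, rel_b))         # independence prediction
print(log2_deviation(0.4, rel_a, rel_b))  # stronger than expected
print(log2_deviation(0.1, rel_a, rel_b))  # weaker than expected
```

Under this convention, a factor whose SNPs act independently would give deviations near zero for all DNPs, while same-half-site and cross-half-site DNPs of a dimeric factor would show positive and negative deviations respectively.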
Affiliation(s)
- Nima Assad
- Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
| | - Desiree Tillo
- Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
| | - Sreejana Ray
- Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
| | - Alexa Dzienny
- Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
| | - Peter C. FitzGerald
- Genome Analysis Unit, Genetics Branch, National Cancer Institute, National Institutes of Health, Building 37, Bethesda, Maryland 20892, United States
| | - Charles Vinson
- Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
|
29
|
Datta V, Hannenhalli S, Siddharthan R. ChIPulate: A comprehensive ChIP-seq simulation pipeline. PLoS Comput Biol 2019; 15:e1006921. [PMID: 30897079 PMCID: PMC6445533 DOI: 10.1371/journal.pcbi.1006921] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/12/2018] [Revised: 04/02/2019] [Accepted: 03/04/2019] [Indexed: 12/17/2022]
Abstract
ChIP-seq (Chromatin Immunoprecipitation followed by sequencing) is a high-throughput technique to identify genomic regions that are bound in vivo by a particular protein, e.g., a transcription factor (TF). Biological factors, such as chromatin state, indirect and cooperative binding, as well as experimental factors, such as antibody quality, cross-linking, and PCR biases, are known to affect the outcome of ChIP-seq experiments. However, the relative impact of these factors on inferences made from ChIP-seq data is not entirely clear. Here, via a detailed ChIP-seq simulation pipeline, ChIPulate, we assess the impact of various biological and experimental sources of variation on several outcomes of a ChIP-seq experiment, viz., the recoverability of the TF binding motif, accuracy of TF-DNA binding detection, the sensitivity of inferred TF-DNA binding strength, and number of replicates needed to confidently infer binding strength. We find that the TF motif can be recovered despite poor and non-uniform extraction and PCR amplification efficiencies. The recovery of the motif is, however, affected to a larger extent by the fraction of sites that are either cooperatively or indirectly bound. Importantly, our simulations reveal that the number of ChIP-seq replicates needed to accurately measure in vivo occupancy at high-affinity sites is larger than the recommended community standards. Our results establish statistical limits on the accuracy of inferences of protein-DNA binding from ChIP-seq and suggest that increasing the mean extraction efficiency, rather than amplification efficiency, would better improve sensitivity. The source code and instructions for running ChIPulate can be found at https://github.com/vishakad/chipulate. DNA-binding proteins perform many key roles in biology, such as transcriptional regulation of gene expression and chromatin modification. 
ChIP-seq (Chromatin immunoprecipitation followed by high-throughput sequencing) is a widely used experimental technique to identify DNA-binding sites of specific proteins of interest, within cells, genome-wide. DNA fragments from genomic regions that are bound by a protein of interest, often a transcription factor (TF), are selectively extracted using specific antibodies, amplified using PCR, and sequenced. The sequences are mapped to the reference genome. Regions where many sequences map, called “peaks”, are used to infer the location of TF-bound loci (peaks), in vivo occupancy at those loci, and the sequence pattern (motif) to which the TF shows a binding affinity. But measurements of TF occupancy and motif inference are vulnerable to several biological and experimental sources of variation that are poorly understood and difficult to assess directly. Here, we simulate key steps of the ChIP-seq protocol with the aim of estimating the relative effects of various sources of variations on motif inference and binding affinity estimations. Besides providing specific insights and recommendations, we provide a general framework to simulate sequence reads in a ChIP-seq experiment, which should considerably aid in the development of software aimed at analyzing ChIP-seq data.
Affiliation(s)
- Vishaka Datta
- Simons Centre for the Study of Living Machines, National Centre for Biological Sciences, TIFR, Bengaluru, Karnataka, India
| | - Sridhar Hannenhalli
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, United States of America
| | - Rahul Siddharthan
- The Institute of Mathematical Sciences/HBNI, Taramani, Chennai, India
|
30
|
Vandel J, Cassan O, Lèbre S, Lecellier CH, Bréhélin L. Probing transcription factor combinatorics in different promoter classes and in enhancers. BMC Genomics 2019; 20:103. [PMID: 30709337 PMCID: PMC6359851 DOI: 10.1186/s12864-018-5408-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 03/21/2018] [Accepted: 12/26/2018] [Indexed: 12/31/2022]
Abstract
BACKGROUND In eukaryotic cells, transcription factors (TFs) are thought to act in a combinatorial way, by competing and collaborating to regulate common target genes. However, several questions remain regarding the conservation of these combinations among different gene classes, regulatory regions and cell types. RESULTS We propose a new approach named TFcoop to infer the TF combinations involved in the binding of a target TF in a particular cell type. TFcoop aims to predict the binding sites of the target TF from the nucleotide content of the sequences and the binding affinity of all identified cooperating TFs. The set of cooperating TFs and model parameters are learned from ChIP-seq data of the target TF. We used TFcoop to investigate the TF combinations involved in the binding of 106 TFs on 41 cell types and in four regulatory regions: promoters of mRNAs, lncRNAs and pri-miRNAs, and enhancers. We first show that TFcoop is accurate and outperforms simple PWM methods for predicting TF binding sites. Next, analysis of the learned models sheds light on important properties of TF combinations in different promoter classes and in enhancers. First, we show that combinations governing TF binding on enhancers are more cell-type specific than those governing binding in promoters. Second, for a given TF and cell type, we observe that TF combinations are different between promoters and enhancers, but similar for promoters of mRNAs, lncRNAs and pri-miRNAs. Analysis of the TFs cooperating with the different targets shows over-representation of pioneer TFs and a clear preference for TFs with binding motif composition similar to that of the target. Lastly, our models accurately distinguish promoters associated with specific biological processes. CONCLUSIONS TFcoop appears to be an accurate approach for studying TF combinations. Its use on ENCODE and FANTOM data allowed us to discover important properties of human TF combinations in different promoter classes and in enhancers. The R code for learning a TFcoop model and for reproducing the main experiments described in the paper is available in an R Markdown file at https://gite.lirmm.fr/brehelin/TFcoop.
Affiliation(s)
- Jimmy Vandel
- LIRMM, Univ. Montpellier, CNRS, Montpellier, France
- IBC, CNRS, Univ. Montpellier, Montpellier, France
| | - Océane Cassan
- LIRMM, Univ. Montpellier, CNRS, Montpellier, France
- IBC, CNRS, Univ. Montpellier, Montpellier, France
| | - Sophie Lèbre
- IBC, CNRS, Univ. Montpellier, Montpellier, France
- IMAG, Univ. Montpellier, CNRS, Montpellier, France
- Univ. Paul Valery Montpellier, Montpellier, France
| | - Charles-Henri Lecellier
- IBC, CNRS, Univ. Montpellier, Montpellier, France.
- Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France.
| | - Laurent Bréhélin
- LIRMM, Univ. Montpellier, CNRS, Montpellier, France.
- IBC, CNRS, Univ. Montpellier, Montpellier, France.
| |
|
31
|
Hope CM, Webber JL, Tokamov SA, Rebay I. Tuned polymerization of the transcription factor Yan limits off-DNA sequestration to confer context-specific repression. eLife 2018; 7:37545. [PMID: 30412049 PMCID: PMC6226293 DOI: 10.7554/elife.37545] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 04/13/2018] [Accepted: 10/22/2018] [Indexed: 01/08/2023] Open
Abstract
During development, transcriptional complexes at enhancers regulate gene expression in complex spatiotemporal patterns. To achieve robust expression without spurious activation, the affinity and specificity of transcription factor–DNA interactions must be precisely balanced. Protein–protein interactions among transcription factors are also critical, yet how their affinities impact enhancer output is not understood. The Drosophila transcription factor Yan provides a well-suited model to address this, as its function depends on the coordinated activities of two independent and essential domains: the DNA-binding ETS domain and the self-associating SAM domain. To explore how protein–protein affinity influences Yan function, we engineered mutants that increase SAM affinity over four orders of magnitude. This produced a dramatic subcellular redistribution of Yan into punctate structures, reduced repressive output and compromised survival. Cell-type specification and genetic interaction defects suggest distinct requirements for polymerization in different regulatory decisions. We conclude that tuned protein–protein interactions enable the dynamic spectrum of complexes that are required for proper regulation.
Affiliation(s)
- C Matthew Hope
- Department of Biochemistry and Molecular Biophysics, University of Chicago, Chicago, United States
| | - Jemma L Webber
- Ben May Department for Cancer Research, University of Chicago, Chicago, United States
| | - Sherzod A Tokamov
- Committee on Development, Regeneration, and Stem Cell Biology, University of Chicago, Chicago, United States
| | - Ilaria Rebay
- Ben May Department for Cancer Research, University of Chicago, Chicago, United States
- Committee on Development, Regeneration, and Stem Cell Biology, University of Chicago, Chicago, United States
| |
|
32
|
van Bömmel A, Love MI, Chung HR, Vingron M. coTRaCTE predicts co-occurring transcription factors within cell-type specific enhancers. PLoS Comput Biol 2018; 14:e1006372. [PMID: 30142147 PMCID: PMC6126874 DOI: 10.1371/journal.pcbi.1006372] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 01/12/2018] [Revised: 09/06/2018] [Accepted: 07/17/2018] [Indexed: 02/06/2023] Open
Abstract
Cell-type specific gene expression is regulated by the combinatorial action of transcription factors (TFs). In this study, we predict transcription factor (TF) combinations that cooperatively bind in a cell-type specific manner. We first divide DNase hypersensitive sites into cell-type specifically open vs. ubiquitously open sites in 64 cell types to describe possible cell-type specific enhancers. Based on the pattern contrast between these two groups of sequences we develop "co-occurring TF predictor on Cell-Type specific Enhancers" (coTRaCTE) - a novel statistical method to determine regulatory TF co-occurrences. Contrasting the co-binding of TF pairs between cell-type specific and ubiquitously open chromatin guarantees the high cell-type specificity of the predictions. coTRaCTE predicts more than 2000 co-occurring TF pairs in 64 cell types. The large majority (70%) of these TF pairs is highly cell-type specific and overlaps in TF pair co-occurrence are highly consistent among related cell types. Furthermore, independently validated co-occurring and directly interacting TFs are significantly enriched in our predictions. Focusing on the regulatory network derived from the predicted co-occurring TF pairs in embryonic stem cells (ESCs) we find that it consists of three subnetworks with distinct functions: maintenance of pluripotency governed by OCT4, SOX2 and NANOG, regulation of early development governed by KLF4, STAT3, ZIC3 and ZNF148 and general functions governed by MYC, TCF3 and YY1. In summary, coTRaCTE predicts highly cell-type specific co-occurring TFs which reveal new insights into transcriptional regulatory mechanisms.
Affiliation(s)
- Alena van Bömmel
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
| | - Michael I. Love
- Department of Biostatistics, Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Ho-Ryun Chung
- Otto Warburg Laboratory, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Philipps-Universität Marburg, Fachbereich Medizin, Institut für Medizinische Bioinformatik und Biostatistik, Marburg, Germany
| | - Martin Vingron
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
| |
|
33
|
Detection of cooperatively bound transcription factor pairs using ChIP-seq peak intensities and expectation maximization. PLoS One 2018; 13:e0199771. [PMID: 30016330 PMCID: PMC6049898 DOI: 10.1371/journal.pone.0199771] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 01/17/2018] [Accepted: 06/13/2018] [Indexed: 11/19/2022] Open
Abstract
Transcription factors (TFs) often work cooperatively, where the binding of one TF to DNA enhances the binding affinity of a second TF to a nearby location. Such cooperative binding is important for activating gene expression from promoters and enhancers in both prokaryotic and eukaryotic cells. Existing methods to detect cooperative binding of a TF pair rely on analyzing the sequence that is bound. We propose a method that uses, instead, only ChIP-seq peak intensities and an expectation maximization (CPI-EM) algorithm. We validate our method using ChIP-seq data from cells where one of a pair of TFs under consideration has been genetically knocked out. Our algorithm relies on our observation that cooperative TF-TF binding is correlated with weak binding of one of the TFs, which we demonstrate in a variety of cell types, including E. coli, S. cerevisiae and M. musculus cells. We show that this method performs significantly better than a predictor based only on the ChIP-seq peak distance of the TFs under consideration. This suggests that peak intensities contain information that can help detect the cooperative binding of a TF pair. CPI-EM also outperforms an existing sequence-based algorithm in detecting cooperative binding. The CPI-EM algorithm is available at https://github.com/vishakad/cpi-em.
|
34
|
Fiévet JB, Nidelet T, Dillmann C, de Vienne D. Heterosis Is a Systemic Property Emerging From Non-linear Genotype-Phenotype Relationships: Evidence From in Vitro Genetics and Computer Simulations. Front Genet 2018; 9:159. [PMID: 29868111 PMCID: PMC5968397 DOI: 10.3389/fgene.2018.00159] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 11/13/2017] [Accepted: 04/17/2018] [Indexed: 11/13/2022] Open
Abstract
Heterosis, the superiority of hybrids over their parents for quantitative traits, represents a crucial issue in plant and animal breeding as well as evolutionary biology. Heterosis has given rise to countless genetic, genomic and molecular studies, but has rarely been investigated from the point of view of systems biology. We hypothesized that heterosis is an emergent property of living systems resulting from frequent concave relationships between genotypic variables and phenotypes, or between different phenotypic levels. We chose the enzyme-flux relationship as a model of the concave genotype-phenotype (GP) relationship, and showed that heterosis can be easily created in the laboratory. First, we reconstituted in vitro the upper part of glycolysis. We simulated genetic variability of enzyme activity by varying enzyme concentrations in test tubes. Mixing the content of "parental" tubes resulted in "hybrids," whose fluxes were compared to the parental fluxes. Frequent heterotic fluxes were observed, under conditions that were determined analytically and confirmed by computer simulation. Second, to test this model in a more realistic situation, we modeled the glycolysis/fermentation network in yeast by considering one input flux, glucose, and two output fluxes, glycerol and acetaldehyde. We simulated genetic variability by randomly drawing parental enzyme concentrations under various conditions, and computed the parental and hybrid fluxes using a system of differential equations. Again we found that a majority of hybrids exhibited positive heterosis for metabolic fluxes. Cases of negative heterosis were due to local convexity between certain enzyme concentrations and fluxes. In both approaches, heterosis was maximized when the parents were phenotypically close and when the distributions of parental enzyme concentrations were contrasted and constrained. 
These conclusions are not restricted to metabolic systems: they only depend on the concavity of the GP relationship, which is commonly observed at various levels of the phenotypic hierarchy, and could account for the pervasiveness of heterosis.
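The concavity argument in this abstract can be made concrete with a small numerical sketch in Python. The flux function below, J = 1 / Σ(1/E_i), is an assumption chosen for illustration because it is concave in the enzyme concentrations; it is not the model used in the paper. The parental concentrations and the function names are hypothetical. The hybrid is formed by averaging the parental concentrations, mimicking the mixing of equal volumes from two "parental" tubes.

```python
def flux(enzymes):
    """Assumed concave flux through a chain of enzymes: J = 1 / sum(1 / E_i)."""
    return 1.0 / sum(1.0 / e for e in enzymes)


def hybrid(parent1, parent2):
    """Hybrid enzyme concentrations: the mean of the two parental values."""
    return [(a + b) / 2.0 for a, b in zip(parent1, parent2)]


def mid_parent_heterosis(parent1, parent2):
    """Hybrid flux minus the mean parental flux (non-negative when flux is concave)."""
    mean_parental_flux = (flux(parent1) + flux(parent2)) / 2.0
    return flux(hybrid(parent1, parent2)) - mean_parental_flux


if __name__ == "__main__":
    # Hypothetical parents with equal fluxes but contrasted enzyme distributions.
    p1, p2 = [1.0, 4.0], [4.0, 1.0]
    print("parental fluxes:", flux(p1), flux(p2))
    print("hybrid flux:", flux(hybrid(p1, p2)))
    print("mid-parent heterosis:", mid_parent_heterosis(p1, p2))
```

In this toy case both parents have a flux of 0.8 and the hybrid has a flux of 1.25. The parents are phenotypically identical but have contrasted enzyme distributions, which is the kind of situation the abstract reports as maximizing heterosis.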
Affiliation(s)
- Julie B Fiévet
- GQE-Le Moulon, INRA, Centre National de la Recherche Scientifique, AgroParisTech, Université Paris-Sud, Gif-sur-Yvette, France
| | - Thibault Nidelet
- Sciences Pour l'Œnologie, INRA, Université de Montpellier, Montpellier, France
| | - Christine Dillmann
- GQE-Le Moulon, INRA, Centre National de la Recherche Scientifique, AgroParisTech, Université Paris-Sud, Gif-sur-Yvette, France
| | - Dominique de Vienne
- GQE-Le Moulon, INRA, Centre National de la Recherche Scientifique, AgroParisTech, Université Paris-Sud, Gif-sur-Yvette, France
| |
|
35
|
Mitani T, Yabuta Y, Ohta H, Nakamura T, Yamashiro C, Yamamoto T, Saitou M, Kurimoto K. Principles for the regulation of multiple developmental pathways by a versatile transcriptional factor, BLIMP1. Nucleic Acids Res 2017; 45:12152-12169. [PMID: 28981894 PMCID: PMC5716175 DOI: 10.1093/nar/gkx798] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 04/13/2017] [Accepted: 08/30/2017] [Indexed: 11/14/2022] Open
Abstract
Single transcription factors (TFs) regulate multiple developmental pathways, but the underlying mechanisms remain unclear. Here, we quantitatively characterized the genome-wide occupancy profiles of BLIMP1, a key transcriptional regulator for diverse developmental processes, during the development of three germ-layer derivatives (photoreceptor precursors, embryonic intestinal epithelium and plasmablasts) and the germ cell lineage (primordial germ cells). We identified BLIMP1-binding sites shared among multiple developmental processes, and such sites were highly occupied by BLIMP1 with a stringent recognition motif and were located predominantly in promoter proximities. A subset of bindings common to all the lineages exhibited a new, strong recognition sequence, a GGGAAA repeat. Paradoxically, however, the shared/common bindings had only a slight impact on the associated gene expression. In contrast, BLIMP1 occupied more distal sites in a cell type-specific manner; despite lower occupancy and flexible sequence recognitions, such bindings contributed effectively to the repression of the associated genes. Recognition motifs of other key TFs in BLIMP1-binding sites had little impact on the expression-level changes. These findings suggest that the shared/common sites might serve as potential reservoirs of BLIMP1 that functions at the specific sites, providing the foundation for a unified understanding of the genome regulation by BLIMP1, and, possibly, TFs in general.
Affiliation(s)
- Tadahiro Mitani
- Department of Anatomy and Cell Biology, Graduate School of Medicine, Kyoto University, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
- JST, ERATO, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
| | - Yukihiro Yabuta
- Department of Anatomy and Cell Biology, Graduate School of Medicine, Kyoto University, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
- JST, ERATO, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
| | - Hiroshi Ohta
- Department of Anatomy and Cell Biology, Graduate School of Medicine, Kyoto University, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
- JST, ERATO, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
| | - Tomonori Nakamura
- Department of Anatomy and Cell Biology, Graduate School of Medicine, Kyoto University, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
- JST, ERATO, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
| | - Chika Yamashiro
- Department of Anatomy and Cell Biology, Graduate School of Medicine, Kyoto University, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
- JST, ERATO, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
| | - Takuya Yamamoto
- Center for iPS Cell Research and Application, Kyoto University, 53 Kawahara-cho, Shogoin, Sakyo-ku, Kyoto 606-8507, Japan
- AMED-CREST, AMED 1-7-1 Otemachi, Chiyoda-ku, Tokyo 100-0004, Japan
| | - Mitinori Saitou
- Department of Anatomy and Cell Biology, Graduate School of Medicine, Kyoto University, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
- JST, ERATO, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
- Center for iPS Cell Research and Application, Kyoto University, 53 Kawahara-cho, Shogoin, Sakyo-ku, Kyoto 606-8507, Japan
- Institute for Integrated Cell-Material Sciences, Kyoto University, Yoshida-Ushinomiya-cho, Sakyo-ku, Kyoto 606-8501, Japan
| | - Kazuki Kurimoto
- Department of Anatomy and Cell Biology, Graduate School of Medicine, Kyoto University, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
- JST, ERATO, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
| |
|
36
|
Ye Y, Gao L, Zhang S. Integrative Analysis of Transcription Factor Combinatorial Interactions Using a Bayesian Tensor Factorization Approach. Front Genet 2017; 8:140. [PMID: 29033978 PMCID: PMC5625019 DOI: 10.3389/fgene.2017.00140] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/20/2017] [Accepted: 09/15/2017] [Indexed: 11/13/2022] Open
Abstract
Transcription factors play a key role in the transcriptional regulation of genes and the determination of cellular identity through combinatorial interactions. However, current studies of combinatorial regulation are limited by the lack of experimental data from the same cellular environment and by pervasive data noise. Here, we adopt a Bayesian CANDECOMP/PARAFAC (CP) factorization approach (BCPF) to integrate multiple datasets in a network paradigm for determining precise TF interaction landscapes. In our first application, we apply BCPF to integrate three networks built from diverse ENCODE datasets covering multiple cell lines, to predict a global and precise TF interaction network. This network gives 38 novel TF interactions with distinct biological functions. In our second application, we apply BCPF to seven types of cell-type TF regulatory networks and predict seven cell-lineage TF interaction networks. By further exploring their dynamics and modularity, we find that cell lineage-specific hub TFs participate in cell type- or lineage-specific regulation by interacting with non-specific TFs. Furthermore, we illustrate the biological function of hub TFs by taking those of the cancer and blood lineages as examples. Taken together, our integrative analysis reveals a more precise and extensive description of human TF combinatorial interactions.
Affiliation(s)
- Yusen Ye
- School of Computer Science and Technology, Xidian University, Xi'an, China
| | - Lin Gao
- School of Computer Science and Technology, Xidian University, Xi'an, China
| | - Shihua Zhang
- NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
| |
|
37
|
Saul MC, Seward CH, Troy JM, Zhang H, Sloofman LG, Lu X, Weisner PA, Caetano-Anolles D, Sun H, Zhao SD, Chandrasekaran S, Sinha S, Stubbs L. Transcriptional regulatory dynamics drive coordinated metabolic and neural response to social challenge in mice. Genome Res 2017; 27:959-972. [PMID: 28356321 PMCID: PMC5453329 DOI: 10.1101/gr.214221.116] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 08/03/2016] [Accepted: 03/24/2017] [Indexed: 12/22/2022]
Abstract
Agonistic encounters are powerful effectors of future behavior, and the ability to learn from this type of social challenge is an essential adaptive trait. We recently identified a conserved transcriptional program defining the response to social challenge across animal species, highly enriched in transcription factor (TF), energy metabolism, and developmental signaling genes. To understand the trajectory of this program and to uncover the most important regulatory influences controlling this response, we integrated gene expression data with the chromatin landscape in the hypothalamus, frontal cortex, and amygdala of socially challenged mice over time. The expression data revealed a complex spatiotemporal patterning of events starting with neural signaling molecules in the frontal cortex and ending in the modulation of developmental factors in the amygdala and hypothalamus, underpinned by a systems-wide shift in expression of energy metabolism-related genes. The transcriptional signals were correlated with significant shifts in chromatin accessibility and a network of challenge-associated TFs. Among these, the conserved metabolic and developmental regulator ESRRA was highlighted for an especially early and important regulatory role. Cell-type deconvolution analysis attributed the differential metabolic and developmental signals in this social context primarily to oligodendrocytes and neurons, respectively, and we show that ESRRA is expressed in both cell types. Localizing ESRRA binding sites in cortical chromatin, we show that this nuclear receptor binds both differentially expressed energy-related and neurodevelopmental TF genes. These data link metabolic and neurodevelopmental signaling to social challenge, and identify key regulatory drivers of this process with unprecedented tissue and temporal resolution.
Affiliation(s)
- Michael C Saul
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
| | - Christopher H Seward
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Department of Cell and Developmental Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
| | - Joseph M Troy
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Illinois Informatics Institute, Urbana, Illinois 61801, USA
| | - Huimin Zhang
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Department of Cell and Developmental Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
| | - Laura G Sloofman
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
| | - Xiaochen Lu
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Department of Cell and Developmental Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
| | - Patricia A Weisner
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Neuroscience Program, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
| | - Derek Caetano-Anolles
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Department of Cell and Developmental Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
| | - Hao Sun
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
| | - Sihai Dave Zhao
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Department of Statistics, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
| | - Sriram Chandrasekaran
- Harvard Society of Fellows, Harvard University, Cambridge, Massachusetts 02138, USA
- Faculty of Arts and Sciences, Harvard University, Cambridge, Massachusetts 02138, USA
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
| | - Saurabh Sinha
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Department of Entomology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
| | - Lisa Stubbs
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Department of Cell and Developmental Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Neuroscience Program, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
| |
|
38
|
Shpigler HY, Saul MC, Murdoch EE, Cash-Ahmed AC, Seward CH, Sloofman L, Chandrasekaran S, Sinha S, Stubbs LJ, Robinson GE. Behavioral, transcriptomic and epigenetic responses to social challenge in honey bees. GENES BRAIN AND BEHAVIOR 2017; 16:579-591. [DOI: 10.1111/gbb.12379] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Received: 12/21/2016] [Revised: 03/03/2017] [Accepted: 03/14/2017] [Indexed: 01/06/2023]
Affiliation(s)
- H. Y. Shpigler
- Carl R. Woese Institute for Genomic Biology; University of Illinois at Urbana-Champaign (UIUC); Urbana IL USA
| | - M. C. Saul
- Carl R. Woese Institute for Genomic Biology; University of Illinois at Urbana-Champaign (UIUC); Urbana IL USA
| | - E. E. Murdoch
- Carl R. Woese Institute for Genomic Biology; University of Illinois at Urbana-Champaign (UIUC); Urbana IL USA
| | - A. C. Cash-Ahmed
- Carl R. Woese Institute for Genomic Biology; University of Illinois at Urbana-Champaign (UIUC); Urbana IL USA
| | - C. H. Seward
- Carl R. Woese Institute for Genomic Biology; University of Illinois at Urbana-Champaign (UIUC); Urbana IL USA
- Department of Cell and Developmental Biology; University of Illinois at Urbana-Champaign (UIUC); Urbana IL USA
| | - L. Sloofman
- Carl R. Woese Institute for Genomic Biology; University of Illinois at Urbana-Champaign (UIUC); Urbana IL USA
- Center for Biophysics and Quantitative Biology; University of Illinois at Urbana-Champaign (UIUC); Urbana IL USA
| | - S. Chandrasekaran
- Harvard Society of Fellows; Harvard University; Cambridge MA USA
- Faculty of Arts and Sciences; Harvard University; Cambridge MA USA
- Broad Institute of MIT and Harvard; Cambridge MA USA
- Department of Biomedical Engineering; University of Michigan; Ann Arbor MI USA
| | - S. Sinha
- Carl R. Woese Institute for Genomic Biology; University of Illinois at Urbana-Champaign (UIUC); Urbana IL USA
- Center for Biophysics and Quantitative Biology; University of Illinois at Urbana-Champaign (UIUC); Urbana IL USA
- Department of Computer Science; University of Illinois at Urbana-Champaign (UIUC); Urbana IL USA
- Department of Entomology; University of Illinois at Urbana-Champaign (UIUC); Urbana IL USA
| | - L. J. Stubbs
- Carl R. Woese Institute for Genomic Biology; University of Illinois at Urbana-Champaign (UIUC); Urbana IL USA
- Department of Cell and Developmental Biology; University of Illinois at Urbana-Champaign (UIUC); Urbana IL USA
- Neuroscience Program; University of Illinois at Urbana-Champaign (UIUC); Urbana IL USA
| | - G. E. Robinson
- Carl R. Woese Institute for Genomic Biology; University of Illinois at Urbana-Champaign (UIUC); Urbana IL USA
- Department of Entomology; University of Illinois at Urbana-Champaign (UIUC); Urbana IL USA
- Neuroscience Program; University of Illinois at Urbana-Champaign (UIUC); Urbana IL USA
| |
|
39
|
Lengyel IM, Morelli LG. Multiple binding sites for transcriptional repressors can produce regular bursting and enhance noise suppression. Phys Rev E 2017; 95:042412. [PMID: 28505727 DOI: 10.1103/physreve.95.042412] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 02/24/2016] [Indexed: 06/07/2023]
Abstract
Cells may control fluctuations in protein levels by means of negative autoregulation, where transcription factors bind DNA sites to repress their own production. Theoretical studies have assumed a single binding site for the repressor, while in most species it is found that multiple binding sites are arranged in clusters. We study a stochastic description of negative autoregulation with multiple binding sites for the repressor. We find that increasing the number of binding sites induces regular bursting of gene products. By tuning the threshold for repression, we show that multiple binding sites can also suppress fluctuations. Our results highlight possible roles for the presence of multiple binding sites of negative autoregulators.
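A stochastic description of negative autoregulation of the kind this abstract mentions can be explored with a Gillespie-style simulation. The Python sketch below is a simplified toy model and not the authors' model. It assumes that production is switched off once the number of occupied sites reaches a threshold, that bound repressors are not removed from the free protein pool, and that all rate constants are arbitrary. The function name `simulate` and its parameters are hypothetical.

```python
import random


def simulate(n_sites=3, threshold=2, beta=20.0, gamma=1.0,
             k_on=0.05, k_off=1.0, t_end=50.0, seed=0):
    """Gillespie direct-method simulation of a toy negative autoregulation model.

    State: n = protein copy number, b = number of occupied binding sites.
    Production (rate beta) is active only while b < threshold.
    Assumption: binding does not deplete the free protein pool.
    Returns a list of (time, n, b) tuples.
    """
    rng = random.Random(seed)
    t, n, b = 0.0, 0, 0
    trajectory = [(t, n, b)]
    while True:
        rates = [
            beta if b < threshold else 0.0,  # production, repressed at threshold
            gamma * n,                       # degradation
            k_on * n * (n_sites - b),        # repressor binds a free site
            k_off * b,                       # repressor unbinds
        ]
        total = sum(rates)
        if total == 0.0:
            break  # no reaction can fire
        t += rng.expovariate(total)  # exponential waiting time to next event
        if t > t_end:
            break
        # Pick which reaction fires, with probability proportional to its rate.
        r = rng.random() * total
        if r < rates[0]:
            n += 1
        elif r < rates[0] + rates[1]:
            n -= 1
        elif r < rates[0] + rates[1] + rates[2]:
            b += 1
        else:
            b -= 1
        trajectory.append((t, n, b))
    return trajectory


if __name__ == "__main__":
    traj = simulate()
    print("events simulated:", len(traj) - 1)
    print("final state (time, protein, bound sites):", traj[-1])
```

Varying `n_sites` and `threshold` in such a sketch is one way to see how the number of binding sites changes the trajectory. Whether the toy model reproduces the regular bursting and noise suppression reported in the paper is not claimed here.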
Affiliation(s)
- Iván M Lengyel
- Instituto de Investigación en Biomedicina de Buenos Aires (IBioBA)-CONICET-Partner Institute of the Max Planck Society, Polo Científico Tecnológico, Godoy Cruz 2390, C1425FQD, Buenos Aires, Argentina
- Departamento de Física, FCEyN UBA, Ciudad Universitaria, 1428 Buenos Aires, Argentina
| | - Luis G Morelli
- Instituto de Investigación en Biomedicina de Buenos Aires (IBioBA)-CONICET-Partner Institute of the Max Planck Society, Polo Científico Tecnológico, Godoy Cruz 2390, C1425FQD, Buenos Aires, Argentina
- Departamento de Física, FCEyN UBA, Ciudad Universitaria, 1428 Buenos Aires, Argentina
- Max Planck Institute for Molecular Physiology, Department of Systemic Cell Biology, Otto-Hahn-Strasse 11, D-44227 Dortmund, Germany
| |
|
40
|
Singh P, Han EH, Endrizzi JA, O'Brien RM, Chi YI. Crystal structures reveal a new and novel FoxO1 binding site within the human glucose-6-phosphatase catalytic subunit 1 gene promoter. J Struct Biol 2017; 198:54-64. [PMID: 28223045 DOI: 10.1016/j.jsb.2017.02.006] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 01/11/2017] [Revised: 02/10/2017] [Accepted: 02/14/2017] [Indexed: 01/07/2023]
Abstract
Human glucose-6-phosphatase plays a vital role in blood glucose homeostasis and holds promise as a therapeutic target for diabetes. Expression of its catalytic subunit gene 1 (G6PC1) is tightly regulated by metabolic-response transcription factors such as FoxO1 and CREB. Although at least three potential FoxO1 binding sites (insulin response elements, IREs) and one CREB binding site (cAMP response element, CRE) within the proximal region of the G6PC1 promoter have been identified, the interplay between FoxO1 and CREB and between FoxO1 bound at multiple IREs has not been well characterized. Here we present the crystal structures of the FoxO1 DNA binding domain in complex with the G6PC1 promoter. These complexes reveal the presence of a new non-consensus FoxO1 binding site that overlaps the CRE, suggesting a mutual exclusion mechanism for FoxO1 and CREB binding at the G6PC1 promoter. Additional findings include (i) non-canonical FoxO1 recognition sites, (ii) incomplete FoxO1 occupancies at the available IRE sites, and (iii) FoxO1 dimeric interactions that may play a role in stabilizing DNA looping. These findings provide insight into the regulation of G6PC1 gene transcription by FoxO1, and demonstrate a high versatility of target gene recognition by FoxO1 that correlates with its diverse roles in biology.
Affiliation(s)
- Puja Singh
- Section of Structural Biology, Hormel Institute, University of Minnesota, Austin, MN 55912, United States
| | - Eun Hee Han
- Section of Structural Biology, Hormel Institute, University of Minnesota, Austin, MN 55912, United States
| | - James A Endrizzi
- Section of Structural Biology, Hormel Institute, University of Minnesota, Austin, MN 55912, United States
| | - Richard M O'Brien
- Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, TN 37232, United States.
| | - Young-In Chi
- Section of Structural Biology, Hormel Institute, University of Minnesota, Austin, MN 55912, United States.
| |
|
41
|
|
42
|
van Dijk D, Sharon E, Lotan-Pompan M, Weinberger A, Segal E, Carey LB. Large-scale mapping of gene regulatory logic reveals context-dependent repression by transcriptional activators. Genome Res 2016; 27:87-94. [PMID: 27965290 PMCID: PMC5204347 DOI: 10.1101/gr.212316.116] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 07/05/2016] [Accepted: 11/15/2016] [Indexed: 12/31/2022]
Abstract
Transcription factors (TFs) are key mediators that propagate extracellular and intracellular signals through to changes in gene expression profiles. However, the rules by which promoters decode the amount of active TF into target gene expression are not well understood. To determine the mapping between promoter DNA sequence, TF concentration, and gene expression output, we have conducted in budding yeast a large-scale measurement of the activity of thousands of designed promoters at six different levels of TF. We observe that maximum promoter activity is determined by TF concentration and not by the number of binding sites. Surprisingly, the addition of an activator site often reduces expression. A thermodynamic model that incorporates competition between neighboring binding sites for a local pool of TF molecules explains this behavior and accurately predicts both absolute expression and the amount by which addition of a site increases or reduces expression. Taken together, our findings support a model in which neighboring binding sites interact competitively when TF is limiting but otherwise act additively.
Affiliation(s)
- David van Dijk
- Department of Biological Sciences, Department of Systems Biology, Columbia University, New York, New York 10027, USA.,Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, 76100 Rehovot, Israel.,Department of Molecular Cell Biology, Weizmann Institute of Science, 76100 Rehovot, Israel
| | - Eilon Sharon
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, 76100 Rehovot, Israel.,Department of Molecular Cell Biology, Weizmann Institute of Science, 76100 Rehovot, Israel
| | - Maya Lotan-Pompan
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, 76100 Rehovot, Israel.,Department of Molecular Cell Biology, Weizmann Institute of Science, 76100 Rehovot, Israel
| | - Adina Weinberger
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, 76100 Rehovot, Israel.,Department of Molecular Cell Biology, Weizmann Institute of Science, 76100 Rehovot, Israel
| | - Eran Segal
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, 76100 Rehovot, Israel.,Department of Molecular Cell Biology, Weizmann Institute of Science, 76100 Rehovot, Israel
| | - Lucas B Carey
- Department of Experimental and Health Sciences, Universitat Pompeu Fabra, 08003 Barcelona, Spain
| |
|
43
|
Jin J, Lian T, Gu C, Yu K, Gao YQ, Su XD. The effects of cytosine methylation on general transcription factors. Sci Rep 2016; 6:29119. [PMID: 27385050 PMCID: PMC4935894 DOI: 10.1038/srep29119] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 03/15/2016] [Accepted: 06/15/2016] [Indexed: 12/24/2022] Open
Abstract
DNA methylation on CpG sites is the most common epigenetic modification. Recently, methylation in a non-CpG context was found to occur widely on genomic DNA. Moreover, methylation of non-CpG sites is a highly controlled process, and its level may vary during cellular development. To study non-CpG methylation effects on DNA/protein interactions, we have chosen three human transcription factors (TFs): glucocorticoid receptor (GR), brain and muscle ARNT-like 1 (BMAL1) - circadian locomotor output cycles kaput (CLOCK) and estrogen receptor (ER) with methylated or unmethylated DNA binding sequences, using single-molecule and isothermal titration calorimetry assays. The results demonstrated that these TFs interact with methylated DNA with different effects compared with their cognate DNA sequences. The effects of non-CpG methylation on transcriptional regulation were validated by cell-based luciferase assay at protein level. The mechanisms of non-CpG methylation influencing DNA-protein interactions were investigated by crystallographic analyses and molecular dynamics simulation. With BisChIP-seq assays in HEK-293T cells, we found that GR can recognize highly methylated sites within chromatin in cells. Therefore, we conclude that non-CpG methylation of DNA can provide a mechanism for regulating gene expression through directly affecting the binding of TFs.
Affiliation(s)
- Jianshi Jin
- Biodynamic Optical Imaging Center (BIOPIC), School of Life Sciences, Peking University, Beijing, China
- State Key Laboratory of Protein and Plant Gene Research, Peking University, Beijing, China
| | - Tengfei Lian
- Biodynamic Optical Imaging Center (BIOPIC), School of Life Sciences, Peking University, Beijing, China
- State Key Laboratory of Protein and Plant Gene Research, Peking University, Beijing, China
| | - Chan Gu
- Biodynamic Optical Imaging Center (BIOPIC), School of Life Sciences, Peking University, Beijing, China
- Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
| | - Kai Yu
- Biodynamic Optical Imaging Center (BIOPIC), School of Life Sciences, Peking University, Beijing, China
- State Key Laboratory of Protein and Plant Gene Research, Peking University, Beijing, China
| | - Yi Qin Gao
- Biodynamic Optical Imaging Center (BIOPIC), School of Life Sciences, Peking University, Beijing, China
- Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
| | - Xiao-Dong Su
- Biodynamic Optical Imaging Center (BIOPIC), School of Life Sciences, Peking University, Beijing, China
- State Key Laboratory of Protein and Plant Gene Research, Peking University, Beijing, China
| |
|
44
|
Jovelin R, Krizus A, Taghizada B, Gray JC, Phillips PC, Claycomb JM, Cutter AD. Comparative genomic analysis of upstream miRNA regulatory motifs in Caenorhabditis. RNA (New York, N.Y.) 2016; 22:968-978. [PMID: 27140965 PMCID: PMC4911920 DOI: 10.1261/rna.055392.115] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/23/2015] [Accepted: 02/18/2016] [Indexed: 06/05/2023]
Abstract
MicroRNAs (miRNAs) comprise a class of short noncoding RNA molecules that play diverse developmental and physiological roles by controlling mRNA abundance and protein output of the vast majority of transcripts. Despite the importance of miRNAs in regulating gene function, we still lack a complete understanding of how miRNAs themselves are transcriptionally regulated. To fill this gap, we predicted regulatory sequences by searching for abundant short motifs located upstream of miRNAs in eight species of Caenorhabditis nematodes. We identified three conserved motifs across the Caenorhabditis phylogeny that show clear signatures of purifying selection from comparative genomics, patterns of nucleotide changes in motifs of orthologous miRNAs, and correlation between motif incidence and miRNA expression. We then validated our predictions with transgenic green fluorescent protein reporters and site-directed mutagenesis for a subset of motifs located in an enhancer region upstream of let-7. We demonstrate that a CT-dinucleotide motif is sufficient for proper expression of GFP in the seam cells of adult C. elegans, and that two other motifs play incremental roles in combination with the CT-rich motif. Thus, functional tests of sequence motifs identified through analysis of molecular evolutionary signatures provide a powerful path for efficiently characterizing the transcriptional regulation of miRNA genes.
Affiliation(s)
- Richard Jovelin
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario M5S 3B2, Canada
- Informatics and Bio-Computing Program, Ontario Institute for Cancer Research, Toronto, Ontario M5G 0A3, Canada
| | - Aldis Krizus
- Department of Molecular Genetics, University of Toronto, Ontario M5S 1A8, Canada
| | - Bakhtiyar Taghizada
- Department of Molecular Genetics, University of Toronto, Ontario M5S 1A8, Canada
| | - Jeremy C Gray
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario M5S 3B2, Canada
| | - Patrick C Phillips
- Institute of Ecology and Evolution, University of Oregon, Oregon 97403, USA
| | - Julie M Claycomb
- Department of Molecular Genetics, University of Toronto, Ontario M5S 1A8, Canada
| | - Asher D Cutter
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario M5S 3B2, Canada
| |
|
45
|
Kim J, Kang H, Park J, Kim W, Yoo J, Lee N, Kim J, Yoon TY, Choi G. PIF1-Interacting Transcription Factors and Their Binding Sequence Elements Determine the in Vivo Targeting Sites of PIF1. The Plant Cell 2016; 28:1388-405. [PMID: 27303023 PMCID: PMC4944412 DOI: 10.1105/tpc.16.00125] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Received: 02/16/2016] [Revised: 06/06/2016] [Accepted: 06/10/2016] [Indexed: 05/18/2023]
Abstract
The bHLH transcription factor PHYTOCHROME INTERACTING FACTOR1 (PIF1) binds G-box elements in vitro and inhibits light-dependent germination in Arabidopsis thaliana. A previous genome-wide analysis of PIF1 targeting indicated that PIF1 binds 748 sites in imbibed seeds, only 59% of which possess G-box elements. This suggests the G-box is not the sole determinant of PIF1 targeting. The targeting of PIF1 to specific sites could be stabilized by PIF1-interacting transcription factors (PTFs) that bind other nearby sequence elements. Here, we report PIF1 targeting sites are enriched with not only G-boxes but also with other hexameric sequence elements we named G-box coupling elements (GCEs). One of these GCEs possesses an ACGT core and serves as a binding site for group A bZIP transcription factors, including ABSCISIC ACID INSENSITIVE5 (ABI5), which inhibits seed germination in abscisic acid signaling. PIF1 interacts with ABI5 and other group A bZIP transcription factors and together they target a subset of PIF1 binding sites in vivo. In vitro single-molecule fluorescence imaging confirms that ABI5 facilitates PIF1 binding to DNA fragments possessing multiple G-boxes or the GCE alone. Thus, we show in vivo PIF1 targeting to specific binding sites is determined by its interaction with PTFs and their binding to GCEs.
Affiliation(s)
- Junghyun Kim
- Department of Biological Sciences, KAIST, Daejeon 34141, Korea
| | - Hyojin Kang
- Department of Convergence Technology Research, KISTI, Daejeon 34141, Korea
| | - Jeongmoo Park
- Department of Biological Sciences, KAIST, Daejeon 34141, Korea
| | - Woohyun Kim
- Department of Biological Sciences, KAIST, Daejeon 34141, Korea
| | - Janghyun Yoo
- Department of Physics, KAIST, Daejeon 34141, Korea
| | - Nayoung Lee
- Department of Biological Sciences, KAIST, Daejeon 34141, Korea
| | - Jaewook Kim
- Department of Biological Sciences, KAIST, Daejeon 34141, Korea
| | | | - Giltsu Choi
- Department of Biological Sciences, KAIST, Daejeon 34141, Korea
| |
|
46
|
Sayal R, Dresch JM, Pushel I, Taylor BR, Arnosti DN. Quantitative perturbation-based analysis of gene expression predicts enhancer activity in early Drosophila embryo. eLife 2016; 5. [PMID: 27152947 PMCID: PMC4859806 DOI: 10.7554/elife.08445] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 04/30/2015] [Accepted: 04/04/2016] [Indexed: 01/02/2023] Open
Abstract
Enhancers constitute one of the major components of regulatory machinery of metazoans. Although several genome-wide studies have focused on finding and locating enhancers in the genomes, the fundamental principles governing their internal architecture and cis-regulatory grammar remain elusive. Here, we describe an extensive, quantitative perturbation analysis targeting the dorsal-ventral patterning gene regulatory network (GRN) controlled by Drosophila NF-κB homolog Dorsal. To understand transcription factor interactions on enhancers, we employed an ensemble of mathematical models, testing effects of cooperativity, repression, and factor potency. Models trained on the dataset correctly predict activity of evolutionarily divergent regulatory regions, providing insights into spatial relationships between repressor and activator binding sites. Importantly, the collective predictions of sets of models were effective at novel enhancer identification and characterization. Our study demonstrates how experimental dataset and modeling can be effectively combined to provide quantitative insights into cis-regulatory information on a genome-wide scale. DOI: http://dx.doi.org/10.7554/eLife.08445.001
DNA contains regions known as genes, which may be “transcribed” to produce the RNA molecules that act as templates for building proteins and regulate cell activity. Proteins called transcription factors can bind to specific sequences of DNA to influence whether nearby genes are transcribed. For example, so-called enhancer regions of DNA contain several binding sites for transcription factors, and this binding activates gene transcription. Little is known about how the transcription factor binding sites are organized in enhancer regions, which makes it difficult to use DNA sequence information alone to predict the regulation of genes. A transcription factor called Dorsal controls the activity of a network of genes that plays a crucial role in the development of fruit fly embryos. Dorsal binds to the enhancer region of a gene called rhomboid, which has been well studied and is known to be a fairly typical example of an enhancer region. To understand the regulatory information encoded in the DNA sequences of enhancers, Sayal, Dresch et al. have now used a technique called perturbation analysis to investigate the interactions that are likely to occur between Dorsal and other transcription factors as they bind to the rhomboid enhancer. This technique involves systematically mutating the enhancer to remove different combinations of transcription factor binding sites and quantitatively investigating the effect this has on gene activity. A large set of mathematical models were then trained using this data and shown to correctly predict the activity of a range of other gene regulatory regions. The collective predictions of the models identified new enhancer regions and revealed details about how different types of transcription factor binding sites are arranged within enhancers. As we enter an era where the DNA sequences of entire human populations are increasingly accessible, we would like to know the functional significance of changes in gene regulatory regions. Sayal, Dresch et al. show that the regulatory properties of specific control proteins are accessible by employing quantitative experiments and mathematical models. Similar studies will be required to learn how mutations found across the genome may alter gene expression, leading to better diagnosis and treatment of disease. DOI: http://dx.doi.org/10.7554/eLife.08445.002
Affiliation(s)
- Rupinder Sayal
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, United States.,Department of Biochemistry, DAV University, Jalandhar, India
| | - Jacqueline M Dresch
- Department of Mathematics, Michigan State University, East Lansing, United States.,Department of Mathematics and Computer Science, Clark University, Worcester, United States
| | - Irina Pushel
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, United States.,Stowers Institute for Medical Research, Kansas City, United States
| | - Benjamin R Taylor
- Department of Computer Science and Engineering, Michigan State University, East Lansing, United States.,School of Computer Science, Georgia Institute of Technology, Atlanta, United States
| | - David N Arnosti
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, United States
| |
|
47
|
Jankowski A, Tiuryn J, Prabhakar S. Romulus: robust multi-state identification of transcription factor binding sites from DNase-seq data. Bioinformatics 2016; 32:2419-26. [PMID: 27153645 PMCID: PMC4978937 DOI: 10.1093/bioinformatics/btw209] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 09/18/2015] [Accepted: 04/12/2016] [Indexed: 12/24/2022] Open
Abstract
Motivation: Computational prediction of transcription factor (TF) binding sites in the genome remains a challenging task. Here, we present Romulus, a novel computational method for identifying individual TF binding sites from genome sequence information and cell-type–specific experimental data, such as DNase-seq. It combines the strengths of previous approaches, and improves robustness by reducing the number of free parameters in the model by an order of magnitude. Results: We show that Romulus significantly outperforms existing methods across three sources of DNase-seq data, by assessing the performance of these tools against ChIP-seq profiles. The difference was particularly significant when applied to binding site prediction for low-information-content motifs. Our method is capable of inferring multiple binding modes for a single TF, which differ in their DNase I cut profile. Finally, using the model learned by Romulus and ChIP-seq data, we introduce Binding in Closed Chromatin (BCC) as a quantitative measure of TF pioneer factor activity. Uniquely, our measure quantifies a defining feature of pioneer factors, namely their ability to bind closed chromatin. Availability and Implementation: Romulus is freely available as an R package at http://github.com/ajank/Romulus. Contact: ajank@mimuw.edu.pl Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Aleksander Jankowski
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, 02-097 Warszawa, Poland
- Computational and Systems Biology, Genome Institute of Singapore, Singapore 138672, Singapore
| | - Jerzy Tiuryn
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, 02-097 Warszawa, Poland
| | - Shyam Prabhakar
- Computational and Systems Biology, Genome Institute of Singapore, Singapore 138672, Singapore
| |
|
48
|
Bottani S, Veitia RA. Hill function-based models of transcriptional switches: impact of specific, nonspecific, functional and nonfunctional binding. Biol Rev Camb Philos Soc 2016; 92:953-963. [PMID: 27061969 DOI: 10.1111/brv.12262] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 07/23/2015] [Revised: 02/12/2016] [Accepted: 02/16/2016] [Indexed: 12/25/2022]
Abstract
We explore minimalist models of transcription in which we take into account that a cis-regulatory sequence is embedded in, and interacts with, a complex genome. The classical Hill equation is the simplest way to represent a transcriptional response. However, it may overlook the fact that a transcription factor (TF) establishes specific and nonspecific nonfunctional interactions with chromatin. Classical papers have shown that nonfunctional binding (not leading to transcription) may influence gene expression. We examine how the presence of additional binding sites for a TF, besides those on the gene(s) of interest, affect the shape and parameters of the transcriptional response. We consider two conditions: at equilibrium and at steady-state. In many cases the TF level is determined by the position of the cell within a spatial or temporal gradient. We show that such gradients can be adjusted by evolutionary selection to compensate for the alteration of the gene transcription response by the presence of nonfunctional binding sites. Finally, we analyse how the transcriptional response is affected by a decrease in TF concentration, as in cases of haploinsufficiency. We show that the nonlinearity of the transcriptional response as a function of [TF] exacerbates the effect of a decrease in the latter, at least for weakly expressed TFs. Although decades of work on TFs have led to the impression that almost everything is known about the control of gene expression, we show that even the simplest models of transcription control have not delivered all their secrets yet.
Affiliation(s)
- Samuel Bottani
- Matière et Systèmes Complexes CNRS UMR 7057, 75013 Paris, France.,Université Paris Diderot, Sorbonne Paris Cité, 75013 Paris, France
| | - Reiner A Veitia
- Université Paris Diderot, Sorbonne Paris Cité, 75013 Paris, France.,Institut Jacques Monod, CNRS UMR 7592, 75013 Paris, France
| |
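The Bottani and Veitia abstract above describes how nonfunctional binding sites and a reduced TF dose change a Hill-type transcriptional response. The following is a minimal Python sketch of that idea, not the authors' model: it assumes a textbook equilibrium in which each nonfunctional ("decoy") site binds one TF molecule with dissociation constant `kd`, and that the gene's own sites sequester a negligible amount of TF. The names `hill`, `free_tf`, `response`, and `haplo_ratio`, and all parameter values, are hypothetical and chosen for illustration only.

```python
import math


def hill(tf, k, n):
    """Classical Hill response: fraction of maximal transcription at free TF level tf."""
    return tf**n / (k**n + tf**n)


def free_tf(total_tf, decoys, kd):
    """Free TF at equilibrium in the presence of nonfunctional (decoy) sites.

    Assumed mass balance: total_tf = f + decoys * f / (kd + f),
    i.e. f is the positive root of f^2 + (kd + decoys - total_tf) * f - kd * total_tf = 0.
    """
    b = kd + decoys - total_tf
    return (-b + math.sqrt(b * b + 4.0 * kd * total_tf)) / 2.0


def response(total_tf, decoys, kd, k, n):
    """Hill response evaluated at the free (non-sequestered) TF level."""
    return hill(free_tf(total_tf, decoys, kd), k, n)


def haplo_ratio(total_tf, decoys, kd, k, n):
    """Response at half the TF dose divided by the response at full dose."""
    full = response(total_tf, decoys, kd, k, n)
    half = response(total_tf / 2.0, decoys, kd, k, n)
    return half / full
```

Under these assumptions, with a weakly expressed TF (`total_tf` well below `k`) and a cooperative response (`n = 2`), `haplo_ratio` falls below 0.5, so halving the dose more than halves the output; with `n = 1` it stays above 0.5. Adding decoy sites lowers `free_tf` and shifts the response toward higher total TF levels.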
|
49
|
Abstract
Transcriptional control of gene expression requires interactions between the cis-regulatory elements (CREs) controlling gene promoters. We developed a sensitive computational method to identify CRE combinations with conserved spacing that does not require genome alignments. When applied to seven sensu stricto and sensu lato Saccharomyces species, 80% of the predicted interactions displayed some evidence of combinatorial transcriptional behavior in several existing datasets including: (1) chromatin immunoprecipitation data for colocalization of transcription factors, (2) gene expression data for coexpression of predicted regulatory targets, and (3) gene ontology databases for common pathway membership of predicted regulatory targets. We tested several predicted CRE interactions with chromatin immunoprecipitation experiments in a wild-type strain and strains in which a predicted cofactor was deleted. Our experiments confirmed that transcription factor (TF) occupancy at the promoters of the CRE combination target genes depends on the predicted cofactor while occupancy of other promoters is independent of the predicted cofactor. Our method has the additional advantage of identifying regulatory differences between species. By analyzing the S. cerevisiae and S. bayanus genomes, we identified differences in combinatorial cis-regulation between the species and showed that the predicted changes in gene regulation explain several of the species-specific differences seen in gene expression datasets. In some instances, the same CRE combinations appear to regulate genes involved in distinct biological processes in the two different species. The results of this research demonstrate that (1) combinatorial cis-regulation can be inferred by multi-genome analysis and (2) combinatorial cis-regulation can explain differences in gene expression between species.
|
50
|
Kulakovskiy IV, Vorontsov IE, Yevshin IS, Soboleva AV, Kasianov AS, Ashoor H, Ba-Alawi W, Bajic VB, Medvedeva YA, Kolpakov FA, Makeev VJ. HOCOMOCO: expansion and enhancement of the collection of transcription factor binding sites models. Nucleic Acids Res 2016; 44:D116-25. [PMID: 26586801 PMCID: PMC4702883 DOI: 10.1093/nar/gkv1249] [Citation(s) in RCA: 146] [Impact Index Per Article: 18.3] [Received: 09/14/2015] [Revised: 10/29/2015] [Accepted: 10/30/2015] [Indexed: 02/06/2023] Open
Abstract
Models of transcription factor (TF) binding sites provide a basis for a wide spectrum of studies in regulatory genomics, from reconstruction of regulatory networks to functional annotation of transcripts and sequence variants. While TFs may recognize different sequence patterns in different conditions, it is pragmatic to have a single generic model for each particular TF as a baseline for practical applications. Here we present the expanded and enhanced version of HOCOMOCO (http://hocomoco.autosome.ru and http://www.cbrc.kaust.edu.sa/hocomoco10), the collection of models of DNA patterns, recognized by transcription factors. HOCOMOCO now provides position weight matrix (PWM) models for binding sites of 601 human TFs and, in addition, PWMs for 396 mouse TFs. Furthermore, we introduce the largest up to date collection of dinucleotide PWM models for 86 (52) human (mouse) TFs. The update is based on the analysis of massive ChIP-Seq and HT-SELEX datasets, with the validation of the resulting models on in vivo data. To facilitate a practical application, all HOCOMOCO models are linked to gene and protein databases (Entrez Gene, HGNC, UniProt) and accompanied by precomputed score thresholds. Finally, we provide command-line tools for PWM and diPWM threshold estimation and motif finding in nucleotide sequences.
Affiliation(s)
- Ivan V Kulakovskiy
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991, GSP-1, Vavilova 32, Moscow, Russia
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia
| | - Ilya E Vorontsov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia
| | - Ivan S Yevshin
- Design Technological Institute of Digital Techniques, Siberian Branch of the Russian Academy of Sciences, 630090, Academician Rzhanov 6, Novosibirsk, Russia
- Institute of Systems Biology Ltd, 630112, office 901, Krasina 54, Novosibirsk, Russia
| | - Anastasiia V Soboleva
- Moscow Institute of Physics and Technology, 141700, Institutskiy per. 9, Dolgoprudny, Moscow Region, Russia
| | - Artem S Kasianov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia
| | - Haitham Ashoor
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Saudi Arabia
| | - Wail Ba-Alawi
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Saudi Arabia
| | - Vladimir B Bajic
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Saudi Arabia
| | - Yulia A Medvedeva
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia
- Center for Bioengineering, Russian Academy of Sciences, 117312, 60-letiya Oktyabrya 7/2, Moscow, Russia
| | - Fedor A Kolpakov
- Design Technological Institute of Digital Techniques, Siberian Branch of the Russian Academy of Sciences, 630090, Academician Rzhanov 6, Novosibirsk, Russia
- Institute of Systems Biology Ltd, 630112, office 901, Krasina 54, Novosibirsk, Russia
| | - Vsevolod J Makeev
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991, GSP-1, Vavilova 32, Moscow, Russia
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia
- Moscow Institute of Physics and Technology, 141700, Institutskiy per. 9, Dolgoprudny, Moscow Region, Russia
| |
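The HOCOMOCO abstract above refers to position weight matrix (PWM) models applied with precomputed score thresholds. As a rough illustration of what applying such a model involves, here is a minimal Python sketch of log-odds PWM scoring and thresholded scanning. It is not the HOCOMOCO command-line tooling and does not use HOCOMOCO data: the count matrix `TOY_COUNTS`, the pseudocount, the uniform background, and the threshold are all made-up assumptions, and only the forward strand is scanned.

```python
import math

BASES = "ACGT"


def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log-odds weights against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append(
            {b: math.log((column[b] + pseudocount) / total / background) for b in BASES}
        )
    return pwm


def score(pwm, site):
    """Sum the position-specific weights for a site of the same width as the PWM."""
    return sum(column[base] for column, base in zip(pwm, site))


def scan(pwm, sequence, threshold):
    """Return (start, site, score) for every forward-strand window scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or other ambiguity codes
        s = score(pwm, window)
        if s >= threshold:
            hits.append((start, window, s))
    return hits


# Hypothetical toy motif (TATA), 20 aligned sites per position; illustration only.
TOY_COUNTS = [
    {"A": 0, "C": 0, "G": 0, "T": 20},
    {"A": 20, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 20},
    {"A": 20, "C": 0, "G": 0, "T": 0},
]
```

For example, `scan(log_odds_pwm(TOY_COUNTS), "GGTATACC", 4.0)` reports a single hit, the window `TATA` starting at index 2. In practice the threshold would come from the score distribution of the specific model rather than being picked by hand.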
|