1
Long T, Bhattacharyya T, Repele A, Naylor M, Nooti S, Krueger S, Manu. The contributions of DNA accessibility and transcription factor occupancy to enhancer activity during cellular differentiation. G3 (Bethesda) 2024; 14:jkad269. [PMID: 38124496 PMCID: PMC11090500 DOI: 10.1093/g3journal/jkad269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Accepted: 11/01/2023] [Indexed: 12/23/2023]
Abstract
During gene regulation, DNA accessibility is thought to limit the availability of transcription factor (TF) binding sites, while TFs can increase DNA accessibility to recruit additional factors that upregulate gene expression. Given this interplay, the causative regulatory events in the modulation of gene expression remain unknown for the vast majority of genes. We utilized deeply sequenced ATAC-Seq data and site-specific knock-in reporter genes to investigate the relationship between the binding-site resolution dynamics of DNA accessibility and the expression dynamics of the enhancers of Cebpa during macrophage-neutrophil differentiation. While the enhancers upregulate reporter expression during the earliest stages of differentiation, there is little corresponding increase in their total accessibility. Conversely, total accessibility peaks during the last stages of differentiation without any increase in enhancer activity. The accessibility of positions neighboring C/EBP-family TF binding sites, which indicates TF occupancy, does increase significantly during early differentiation, showing that the early upregulation of enhancer activity is driven by TF binding. These results imply that a generalized increase in DNA accessibility is not sufficient, and binding by enhancer-specific TFs is necessary, for the upregulation of gene expression. Additionally, high-coverage ATAC-Seq combined with time-series expression data can infer the sequence of regulatory events at binding-site resolution.
Affiliation(s)
- Trevor Long
  - Department of Biology, University of North Dakota, Grand Forks, ND 58202-9019, USA
- Tapas Bhattacharyya
  - Department of Biology, University of North Dakota, Grand Forks, ND 58202-9019, USA
- Andrea Repele
  - Department of Biology, University of North Dakota, Grand Forks, ND 58202-9019, USA
- Madison Naylor
  - Department of Biology, University of North Dakota, Grand Forks, ND 58202-9019, USA
- Sunil Nooti
  - Department of Biology, University of North Dakota, Grand Forks, ND 58202-9019, USA
- Shawn Krueger
  - Department of Biology, University of North Dakota, Grand Forks, ND 58202-9019, USA
- Manu
  - Department of Biology, University of North Dakota, Grand Forks, ND 58202-9019, USA
2
Kang CK, Kim AR. Deep molecular learning of transcriptional control of a synthetic CRE enhancer and its variants. iScience 2024; 27:108747. [PMID: 38222110 PMCID: PMC10784702 DOI: 10.1016/j.isci.2023.108747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Revised: 08/29/2023] [Accepted: 12/12/2023] [Indexed: 01/16/2024]
Abstract
Massively parallel reporter assays measure the transcriptional activities of various cis-regulatory modules (CRMs) in a single experiment. We developed a thermodynamic computational model framework that calculates quantitative levels of gene expression directly from regulatory DNA sequences. Using the framework, we investigated the molecular mechanisms of cis-regulatory mutations of a synthetic enhancer that cause abnormal gene expression. We found that, in a human cell line, competitive binding between family transcription factors (TFs) with slightly different binding preferences significantly increases the accuracy of recapitulating the transcriptional effects of thousands of single or multiple mutations. We also discovered that even if various harmful mutations occur in an activator binding site, a CRM can stably maintain or even increase gene expression through a certain form of competitive binding between family TFs. These findings enhance understanding of the effects of SNPs and indels on CRMs and would help in building robust custom-designed CRMs for biologics production and gene therapy.
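As a rough illustration of the competitive binding mentioned in this abstract, the sketch below computes equilibrium occupancies when two family TFs compete for a single binding site. It is a generic statistical-thermodynamic calculation, not the authors' framework; the function name `site_occupancy`, the TF labels, and every numeric weight are hypothetical.

```python
def site_occupancy(weights):
    """Return the occupancy probability of each competitor at one shared site.

    `weights` maps a TF label to its statistical weight K * [TF]
    (association constant times concentration). The empty site has weight 1,
    and at most one TF can occupy the site at a time.
    """
    # Partition function over the mutually exclusive states of the site.
    z = 1.0 + sum(weights.values())
    return {tf: w / z for tf, w in weights.items()}


# Hypothetical weights: the activator binds the intact site strongly.
wild_type = site_occupancy({"activator": 9.0, "family_member": 1.0})

# Hypothetical mutation: activator affinity drops tenfold, while a family
# member with a slightly different preference now binds better.
mutant = site_occupancy({"activator": 0.9, "family_member": 3.0})

print(wild_type)
print(mutant)
```

With these made-up numbers the mutated site is still occupied most of the time, but mainly by the family member. Whether that preserves expression would depend on how the family member acts once bound, which this sketch does not model.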
Affiliation(s)
- Chan-Koo Kang
  - School of Life Science, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
  - Department of Advanced Convergence, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
- Ah-Ram Kim
  - School of Life Science, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
  - Department of Advanced Convergence, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
  - School of Applied Artificial Intelligence, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
3
Nooti S, Naylor M, Long T, Groll B, Manu. LucFlow: A method to measure Luciferase reporter expression in single cells. PLoS One 2023; 18:e0292317. [PMID: 37792708 PMCID: PMC10550117 DOI: 10.1371/journal.pone.0292317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Accepted: 09/18/2023] [Indexed: 10/06/2023]
Abstract
Reporter assays, in which the expression of an inert protein is driven by gene regulatory elements such as promoters and enhancers, are a workhorse for investigating gene regulation. Techniques for measuring reporter gene expression vary from single-cell or single-molecule approaches having low throughput to bulk Luciferase assays that have high throughput. We developed a Luciferase Reporter Assay using Flow-Cytometry (LucFlow), which measures reporter expression in single cells immunostained for Luciferase. We optimized and tested LucFlow with a murine cell line that can be differentiated into neutrophils, into which promoter-reporter and enhancer-promoter-reporter constructs have been integrated in a site-specific manner. The single-cell measurements are comparable to bulk ones but we found that dead cells have no detectable Luciferase protein, so that bulk assays underestimate reporter expression. LucFlow is able to achieve a higher accuracy than bulk methods by excluding dead cells during flow cytometry. Prior to fixation and staining, the samples are spiked with stained cells that can be discriminated during flow cytometry and control for tube-to-tube variation in experimental conditions. Computing fold change relative to control cells allows LucFlow to achieve a high level of precision. LucFlow, therefore, enables the accurate and precise measurement of reporter expression in a high throughput manner.
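The normalization idea in this abstract (fold change relative to spiked-in control cells measured in the same tube) can be sketched as follows. This is only an illustration of why a within-tube ratio cancels tube-to-tube variation; the use of medians, the function name `fold_change`, and the intensity values are assumptions rather than details taken from the paper.

```python
from statistics import median


def fold_change(sample_intensities, control_intensities):
    """Median reporter signal of sample cells relative to spiked-in controls.

    Both lists hold per-cell fluorescence intensities from the same tube,
    after dead cells have been gated out (gating is not shown here).
    """
    control_level = median(control_intensities)
    if control_level <= 0:
        raise ValueError("control intensity must be positive")
    return median(sample_intensities) / control_level


# Hypothetical tube 1.
tube1 = fold_change([200.0, 220.0, 180.0], [100.0, 110.0, 90.0])

# Hypothetical tube 2: staining is 1.5x brighter for every cell in the tube,
# so sample and control cells are scaled by the same factor.
tube2 = fold_change([300.0, 330.0, 270.0], [150.0, 165.0, 135.0])

print(tube1, tube2)
```

Because the sample and control cells share a tube, any multiplicative difference in staining or acquisition affects both and drops out of the ratio.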
Affiliation(s)
- Sunil Nooti
  - Department of Biology, University of North Dakota, Grand Forks, ND, United States of America
- Madison Naylor
  - Department of Biology, University of North Dakota, Grand Forks, ND, United States of America
- Trevor Long
  - Department of Biology, University of North Dakota, Grand Forks, ND, United States of America
- Brayden Groll
  - Department of Biology, University of North Dakota, Grand Forks, ND, United States of America
- Manu
  - Department of Biology, University of North Dakota, Grand Forks, ND, United States of America
4
Long T, Bhattacharyya T, Repele A, Naylor M, Nooti S, Krueger S, Manu. The contributions of DNA accessibility and transcription factor occupancy to enhancer activity during cellular differentiation. bioRxiv 2023:2023.02.22.529579. [PMID: 37090616 PMCID: PMC10120690 DOI: 10.1101/2023.02.22.529579] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2023]
Abstract
The upregulation of gene expression by enhancers depends upon the interplay between the binding of sequence-specific transcription factors (TFs) and DNA accessibility. DNA accessibility is thought to limit the ability of TFs to bind to their sites, while TFs can increase accessibility to recruit additional factors that upregulate gene expression. Given this interplay, the causative regulatory events underlying the modulation of gene expression during cellular differentiation remain unknown for the vast majority of genes. We investigated the binding-site resolution dynamics of DNA accessibility and the expression dynamics of the enhancers of an important neutrophil gene, Cebpa, during macrophage-neutrophil differentiation. Reporter genes were integrated in a site-specific manner in PUER cells, which are progenitors that can be differentiated into neutrophils or macrophages in vitro by activating the pan-leukocyte TF PU.1. Time series data show that two enhancers upregulate reporter expression during the first 48 hours of neutrophil differentiation. Surprisingly, there is little or no increase in the total accessibility, measured by ATAC-Seq, of the enhancers during the same time period. Conversely, total accessibility peaks 96 hours after PU.1 activation, consistent with its role as a pioneer, but the enhancers do not upregulate gene expression. Combining deeply sequenced ATAC-Seq data with a new bias-correction method allowed the profiling of accessibility at single-nucleotide resolution and revealed protected regions in the enhancers that match all previously characterized TF binding sites and ChIP-Seq data. Although the accessibility of most positions does not change during early differentiation, that of positions neighboring TF binding sites, an indicator of TF occupancy, did increase significantly.
The localized accessibility changes are limited to nucleotides neighboring C/EBP-family TF binding sites, showing that the upregulation of enhancer activity during early differentiation is driven by C/EBP-family TF binding. These results show that increasing the total accessibility of enhancers is not sufficient for upregulating their activity and other events such as TF binding are necessary for upregulation. Also, TF binding can cause upregulation without a perceptible increase in total accessibility. Finally, this study demonstrates the feasibility of comprehensively mapping individual TF binding sites as footprints using high coverage ATAC-Seq and inferring the sequence of events in gene regulation by combining with time-series gene expression data.
Affiliation(s)
- Trevor Long
  - Department of Biology, University of North Dakota, Grand Forks, ND 58202-9019, USA
- Tapas Bhattacharyya
  - Department of Biology, University of North Dakota, Grand Forks, ND 58202-9019, USA
- Andrea Repele
  - Department of Biology, University of North Dakota, Grand Forks, ND 58202-9019, USA
- Madison Naylor
  - Department of Biology, University of North Dakota, Grand Forks, ND 58202-9019, USA
- Sunil Nooti
  - Department of Biology, University of North Dakota, Grand Forks, ND 58202-9019, USA
- Shawn Krueger
  - Department of Biology, University of North Dakota, Grand Forks, ND 58202-9019, USA
- Manu
  - Department of Biology, University of North Dakota, Grand Forks, ND 58202-9019, USA
5
Perkins ML, Gandara L, Crocker J. A synthetic synthesis to explore animal evolution and development. Philos Trans R Soc Lond B Biol Sci 2022; 377:20200517. [PMID: 35634925 PMCID: PMC9149795 DOI: 10.1098/rstb.2020.0517] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/12/2022]
Abstract
Identifying the general principles by which genotypes are converted into phenotypes remains a challenge in the post-genomic era. We still lack a predictive understanding of how genes shape interactions among cells and tissues in response to signalling and environmental cues, and hence how regulatory networks generate the phenotypic variation required for adaptive evolution. Here, we discuss how techniques borrowed from synthetic biology may facilitate a systematic exploration of evolvability across biological scales. Synthetic approaches permit controlled manipulation of both endogenous and fully engineered systems, providing a flexible platform for investigating causal mechanisms in vivo. Combining synthetic approaches with multi-level phenotyping (phenomics) will supply a detailed, quantitative characterization of how internal and external stimuli shape the morphology and behaviour of living organisms. We advocate integrating high-throughput experimental data with mathematical and computational techniques from a variety of disciplines in order to pursue a comprehensive theory of evolution. This article is part of the theme issue ‘Genetic basis of adaptation and speciation: from loci to causative mutations’.
Affiliation(s)
- Mindy Liu Perkins
  - Developmental Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Lautaro Gandara
  - Developmental Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Justin Crocker
  - Developmental Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
6
Gaiewski MJ, Drewell RA, Dresch JM. Fitting thermodynamic-based models: Incorporating parameter sensitivity improves the performance of an evolutionary algorithm. Math Biosci 2021; 342:108716. [PMID: 34687735 DOI: 10.1016/j.mbs.2021.108716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2021] [Revised: 09/10/2021] [Accepted: 09/17/2021] [Indexed: 11/30/2022]
Abstract
A detailed comprehension of transcriptional regulation is critical to understanding the genetic control of development and disease across many different organisms. To more fully investigate the complex molecular interactions controlling the precise expression of genes, many groups have constructed mathematical models to complement their experimental approaches. A critical step in such studies is choosing the most appropriate parameter estimation algorithm to enable detailed analysis of the parameters that contribute to the models. In this study, we develop a novel set of evolutionary algorithms that use a pseudo-random Sobol Set to construct the initial population and incorporate parameter sensitivities into the adaptation of mutation rates, using local, global, and hybrid strategies. Comparison of the performance of these new algorithms to a number of current state-of-the-art global parameter estimation algorithms on a range of continuous test functions, as well as synthetic biological data representing models of gene regulatory systems, reveals improved performance of the new algorithms in terms of runtime, error and reproducibility. In addition, by analyzing the ability of these algorithms to fit datasets of varying quality, we provide the experimentalist with a guide to how the algorithms perform across a range of noisy data. These results demonstrate the improved performance of the new set of parameter estimation algorithms and facilitate meaningful integration of model parameters and predictions in our understanding of the molecular mechanisms of gene regulation.
Affiliation(s)
- Michael J Gaiewski
  - Department of Mathematics and Computer Science, Clark University, Worcester, MA, USA; Department of Mathematics, University of Connecticut, Storrs, CT, USA
7
Moqtaderi Z, Brown S, Bender W. Genome-wide oscillations in G + C density and sequence conservation. Genome Res 2021; 31:2050-2057. [PMID: 34649930 PMCID: PMC8559709 DOI: 10.1101/gr.274332.120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2020] [Accepted: 09/01/2021] [Indexed: 11/25/2022]
Abstract
Eukaryotic genomes typically show a uniform G + C content among chromosomes, but on smaller scales, many species have a G + C density that fluctuates with a characteristic wavelength. This oscillation is evident in many insect species, with wavelengths ranging between 700 bp and 4 kb. Measures of evolutionary conservation oscillate in phase with G + C content, with conserved regions having higher G + C. Loci with large regulatory regions show more regular oscillations; coding sequences and heterochromatic regions show little or no oscillation. There is little oscillation in vertebrate genomes in regions with densely distributed mobile repetitive elements. However, species with few repeats show oscillation in both G + C density and sequence conservation. These oscillations may reflect optimal spacing of cis-regulatory elements.
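A local G + C density profile of the kind discussed in this abstract can be computed with a simple sliding window. The sketch below is a minimal illustration only; the function name `gc_density`, the window and step sizes, and the toy sequence are hypothetical, and the abstract does not state how the authors computed their profiles.

```python
def gc_density(sequence, window, step):
    """Return the G + C fraction in successive windows along a DNA sequence."""
    sequence = sequence.upper()
    densities = []
    # Slide a fixed-width window along the sequence; a trailing partial
    # window is skipped so that every value uses the same denominator.
    for start in range(0, len(sequence) - window + 1, step):
        chunk = sequence[start:start + window]
        gc_count = chunk.count("G") + chunk.count("C")
        densities.append(gc_count / window)
    return densities


# Toy sequence alternating GC-rich and AT-rich blocks of 4 bp.
profile = gc_density("GGCCAATTGGCCAATT", window=4, step=4)
print(profile)
```

On real genomic sequence the window would be far larger, and a periodicity estimate (for example from an autocorrelation or a Fourier transform of the profile) would be needed to recover a characteristic wavelength.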
Affiliation(s)
- Zarmik Moqtaderi
  - Department of Biological Chemistry and Molecular Pharmacology, Blavatnik Institute, Harvard Medical School, Boston, Massachusetts 02115, USA
- Susan Brown
  - Department of Biology, Kansas State University, Manhattan, Kansas 66506, USA
- Welcome Bender
  - Department of Biological Chemistry and Molecular Pharmacology, Blavatnik Institute, Harvard Medical School, Boston, Massachusetts 02115, USA
8
Dibaeinia P, Sinha S. Deciphering enhancer sequence using thermodynamics-based models and convolutional neural networks. Nucleic Acids Res 2021; 49:10309-10327. [PMID: 34508359 PMCID: PMC8501998 DOI: 10.1093/nar/gkab765] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/16/2021] [Revised: 08/18/2021] [Accepted: 08/25/2021] [Indexed: 11/18/2022]
Abstract
Deciphering the sequence-function relationship encoded in enhancers holds the key to interpreting non-coding variants and understanding mechanisms of transcriptomic variation. Several quantitative models exist for predicting enhancer function and underlying mechanisms; however, there has been no systematic comparison of these models characterizing their relative strengths and shortcomings. Here, we interrogated a rich data set of neuroectodermal enhancers in Drosophila, representing cis- and trans- sources of expression variation, with a suite of biophysical and machine learning models. We performed rigorous comparisons of thermodynamics-based models implementing different mechanisms of activation, repression and cooperativity. Moreover, we developed a convolutional neural network (CNN) model, called CoNSEPT, that learns enhancer 'grammar' in an unbiased manner. CoNSEPT is the first general-purpose CNN tool for predicting enhancer function in varying conditions, such as different cell types and experimental conditions, and we show that such complex models can suggest interpretable mechanisms. We found model-based evidence for mechanisms previously established for the studied system, including cooperative activation and short-range repression. The data also favored one hypothesized activation mechanism over another and suggested an intriguing role for a direct, distance-independent repression mechanism. Our modeling shows that while fundamentally different models can yield similar fits to data, they vary in their utility for mechanistic inference. CoNSEPT is freely available at: https://github.com/PayamDiba/CoNSEPT.
Affiliation(s)
- Payam Dibaeinia
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Saurabh Sinha
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
  - Cancer Center at Illinois, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
9
Abstract
Motivation: The universal expressibility assumption of Deep Neural Networks (DNNs) is the key motivation behind recent works in the systems biology community to employ DNNs to solve important problems in functional genomics and molecular genetics. Typically, such investigations have taken a ‘black box’ approach in which the internal structure of the model used is set purely by machine learning considerations with little consideration of representing the internal structure of the biological system by the mathematical structure of the DNN. DNNs have not yet been applied to the detailed modeling of transcriptional control in which mRNA production is controlled by the binding of specific transcription factors to DNA, in part because such models are in part formulated in terms of specific chemical equations that appear different in form from those used in neural networks. Results: In this paper, we give an example of a DNN which can model the detailed control of transcription in a precise and predictive manner. Its internal structure is fully interpretable and is faithful to underlying chemistry of transcription factor binding to DNA. We derive our DNN from a systems biology model that was not previously recognized as having a DNN structure. Although we apply our DNN to data from the early embryo of the fruit fly Drosophila, this system serves as a test bed for analysis of much larger datasets obtained by systems biology studies on a genomic scale. Availability and implementation: The implementation and data for the models used in this paper are in a zip file in the supplementary material. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yi Liu
  - Department of Statistics, Ecology and Evolution, Molecular Genetics & Cell Biology, Institute of Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA
- Kenneth Barr
  - Department of Human Genetics, Ecology and Evolution, Molecular Genetics & Cell Biology, Institute of Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA
- John Reinitz
  - Departments of Statistics, Ecology and Evolution, Molecular Genetics & Cell Biology, Institute of Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA
10
Le Poul Y, Xin Y, Ling L, Mühling B, Jaenichen R, Hörl D, Bunk D, Harz H, Leonhardt H, Wang Y, Osipova E, Museridze M, Dharmadhikari D, Murphy E, Rohs R, Preibisch S, Prud'homme B, Gompel N. Regulatory encoding of quantitative variation in spatial activity of a Drosophila enhancer. Science Advances 2020; 6:6/49/eabe2955. [PMID: 33268361 PMCID: PMC7821883 DOI: 10.1126/sciadv.abe2955] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 08/12/2020] [Accepted: 10/20/2020] [Indexed: 06/12/2023]
Abstract
Developmental enhancers control the expression of genes prefiguring morphological patterns. The activity of an enhancer varies among cells of a tissue, but collectively, expression levels in individual cells constitute a spatial pattern of gene expression. How the spatial and quantitative regulatory information is encoded in an enhancer sequence is elusive. To link spatial pattern and activity levels of an enhancer, we used systematic mutations of the yellow spot enhancer, active in developing Drosophila wings, and tested their effect in a reporter assay. Moreover, we developed an analytic framework based on the comprehensive quantification of spatial reporter activity. We show that the quantitative enhancer activity results from densely packed regulatory information along the sequence, and that a complex interplay between activators and multiple tiers of repressors carves the spatial pattern. Our results shed light on how an enhancer reads and integrates trans-regulatory landscape information to encode a spatial quantitative pattern.
Affiliation(s)
- Yann Le Poul
  - Evolutionary Ecology, Ludwig-Maximilians Universität München, Fakultät für Biologie, Biozentrum, Grosshaderner Strasse 2, 82152 Planegg-Martinsried, Germany
- Yaqun Xin
  - Evolutionary Ecology, Ludwig-Maximilians Universität München, Fakultät für Biologie, Biozentrum, Grosshaderner Strasse 2, 82152 Planegg-Martinsried, Germany
- Liucong Ling
  - Evolutionary Ecology, Ludwig-Maximilians Universität München, Fakultät für Biologie, Biozentrum, Grosshaderner Strasse 2, 82152 Planegg-Martinsried, Germany
- Bettina Mühling
  - Evolutionary Ecology, Ludwig-Maximilians Universität München, Fakultät für Biologie, Biozentrum, Grosshaderner Strasse 2, 82152 Planegg-Martinsried, Germany
- Rita Jaenichen
  - Evolutionary Ecology, Ludwig-Maximilians Universität München, Fakultät für Biologie, Biozentrum, Grosshaderner Strasse 2, 82152 Planegg-Martinsried, Germany
- David Hörl
  - Human Biology and Bioimaging, Ludwig-Maximilians Universität München, Fakultät für Biologie, Biozentrum, Grosshaderner Strasse 2, 82152 Planegg-Martinsried, Germany
- David Bunk
  - Human Biology and Bioimaging, Ludwig-Maximilians Universität München, Fakultät für Biologie, Biozentrum, Grosshaderner Strasse 2, 82152 Planegg-Martinsried, Germany
- Hartmann Harz
  - Human Biology and Bioimaging, Ludwig-Maximilians Universität München, Fakultät für Biologie, Biozentrum, Grosshaderner Strasse 2, 82152 Planegg-Martinsried, Germany
- Heinrich Leonhardt
  - Human Biology and Bioimaging, Ludwig-Maximilians Universität München, Fakultät für Biologie, Biozentrum, Grosshaderner Strasse 2, 82152 Planegg-Martinsried, Germany
- Yingfei Wang
  - Quantitative and Computational Biology, Departments of Biological Sciences, Chemistry, Physics and Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Elena Osipova
  - Evolutionary Ecology, Ludwig-Maximilians Universität München, Fakultät für Biologie, Biozentrum, Grosshaderner Strasse 2, 82152 Planegg-Martinsried, Germany
- Mariam Museridze
  - Evolutionary Ecology, Ludwig-Maximilians Universität München, Fakultät für Biologie, Biozentrum, Grosshaderner Strasse 2, 82152 Planegg-Martinsried, Germany
- Deepak Dharmadhikari
  - Evolutionary Ecology, Ludwig-Maximilians Universität München, Fakultät für Biologie, Biozentrum, Grosshaderner Strasse 2, 82152 Planegg-Martinsried, Germany
- Eamonn Murphy
  - Evolutionary Ecology, Ludwig-Maximilians Universität München, Fakultät für Biologie, Biozentrum, Grosshaderner Strasse 2, 82152 Planegg-Martinsried, Germany
- Remo Rohs
  - Quantitative and Computational Biology, Departments of Biological Sciences, Chemistry, Physics and Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Stephan Preibisch
  - Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Robert-Rössle-Str. 10, 13092 Berlin, Germany
  - Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA
- Benjamin Prud'homme
  - Aix-Marseille Université, CNRS, IBDM, Institut de Biologie du Développement de Marseille, Campus de Luminy Case 907, 13288 Marseille Cedex 9, France
- Nicolas Gompel
  - Evolutionary Ecology, Ludwig-Maximilians Universität München, Fakultät für Biologie, Biozentrum, Grosshaderner Strasse 2, 82152 Planegg-Martinsried, Germany
11
Binary Expression Enhances Reliability of Messaging in Gene Networks. Entropy 2020; 22:e22040479. [PMID: 33286254 PMCID: PMC7516962 DOI: 10.3390/e22040479] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/04/2020] [Revised: 04/20/2020] [Accepted: 04/20/2020] [Indexed: 01/31/2023]
Abstract
The promoter state of a gene and its expression levels are modulated by the amounts of transcription factors interacting with its regulatory regions. Hence, one may interpret a gene network as a communicating system in which the state of the promoter of a gene (the source) is communicated by the amounts of transcription factors that it expresses (the message) to modulate the state of the promoter and expression levels of another gene (the receptor). The reliability of the gene network dynamics can be quantified by Shannon's entropy of the message and the mutual information between the message and the promoter state. Here we consider a stochastic model for a binary gene and use its exact steady state solutions to calculate the entropy and mutual information. We show that a slow switching promoter with long and equally standing ON and OFF states maximizes the mutual information and reduces entropy. That is a binary gene expression regime generating a high variance message governed by a bimodal probability distribution with peaks of the same height. Our results indicate that Shannon's theory can be a powerful framework for understanding how bursty gene expression conciliates with the striking spatio-temporal precision exhibited in pattern formation of developing organisms.
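The two quantities named in this abstract, Shannon entropy and mutual information, can be computed from a discrete joint distribution as sketched below. The example uses a toy two-by-two table rather than the exact steady-state solutions of the authors' stochastic model; the function names and the probability tables are hypothetical.

```python
from math import log2


def entropy(probabilities):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


def mutual_information(joint):
    """Mutual information in bits between promoter state and message.

    joint[i][j] is the probability of promoter state i together with
    message level j; all entries must sum to 1.
    """
    p_state = [sum(row) for row in joint]
    p_message = [sum(column) for column in zip(*joint)]
    total = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                total += p * log2(p / (p_state[i] * p_message[j]))
    return total


# Toy case 1: message level perfectly tracks an ON/OFF promoter that
# spends equal time in each state.
informative = mutual_information([[0.5, 0.0], [0.0, 0.5]])

# Toy case 2: message level is independent of the promoter state.
uninformative = mutual_information([[0.25, 0.25], [0.25, 0.25]])

print(entropy([0.5, 0.5]), informative, uninformative)
```

In the first toy table the message carries one full bit about the promoter state, while in the second it carries none, which is the contrast the mutual information is meant to capture.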
12
The 3D Genome Shapes the Regulatory Code of Developmental Genes. J Mol Biol 2020; 432:712-723. [DOI: 10.1016/j.jmb.2019.10.017] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 05/28/2019] [Revised: 10/11/2019] [Accepted: 10/24/2019] [Indexed: 02/06/2023]
13
Barr K, Reinitz J, Radulescu O. An in silico analysis of robust but fragile gene regulation links enhancer length to robustness. PLoS Comput Biol 2019; 15:e1007497. [PMID: 31730659 PMCID: PMC6881076 DOI: 10.1371/journal.pcbi.1007497] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/21/2019] [Revised: 11/27/2019] [Accepted: 10/22/2019] [Indexed: 12/31/2022]
Abstract
Organisms must ensure that expression of genes is directed to the appropriate tissues at the correct times, while simultaneously ensuring that these gene regulatory systems are robust to perturbation. This idea is captured by a mathematical concept called r-robustness, which says that a system is robust to a perturbation in up to r - 1 randomly chosen parameters. r-robustness implies that the biological system has a small number of sensitive parameters and that this number can be used as a robustness measure. In this work we use this idea to investigate the robustness of gene regulation using a sequence level model of the Drosophila melanogaster gene even-skipped. We consider robustness with respect to mutations of the enhancer sequence and with respect to changes of the transcription factor concentrations. We find that gene regulation is r-robust with respect to mutations in the enhancer sequence and identify a number of sensitive nucleotides. In both natural and in silico predicted enhancers, the number of nucleotides that are sensitive to mutation correlates negatively with the length of the sequence, meaning that longer sequences are more robust. The exact degree of robustness obtained is dependent not only on DNA sequence, but also on the local concentration of regulatory factors. We find that gene regulation can be remarkably sensitive to changes in transcription factor concentrations at the boundaries of expression features, while it is robust to perturbation elsewhere.
Affiliation(s)
- Kenneth Barr: Department of Genetic Medicine, University of Chicago, Chicago, Illinois, United States of America
- John Reinitz: Departments of Statistics, Ecology & Evolution, Molecular Genetics & Cell Biology, University of Chicago, Chicago, Illinois, United States of America
- Ovidiu Radulescu: LPHI UMR CNRS 5235, University of Montpellier, Montpellier, France
|
14
|
Park J, Estrada J, Johnson G, Vincent BJ, Ricci-Tam C, Bragdon MDJ, Shulgina Y, Cha A, Wunderlich Z, Gunawardena J, DePace AH. Dissecting the sharp response of a canonical developmental enhancer reveals multiple sources of cooperativity. eLife 2019; 8:e41266. [PMID: 31223115 PMCID: PMC6588347 DOI: 10.7554/elife.41266] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 08/20/2018] [Accepted: 03/04/2019] [Indexed: 12/19/2022]
Abstract
Developmental enhancers integrate graded concentrations of transcription factors (TFs) to create sharp gene expression boundaries. Here we examine the hunchback P2 (HbP2) enhancer, which drives a sharp expression pattern in the Drosophila blastoderm embryo in response to the transcriptional activator Bicoid (Bcd). We systematically interrogate cis and trans factors that influence the shape and position of expression driven by HbP2, and find that the prevailing model, based on pairwise cooperative binding of Bcd to HbP2, is not adequate. We demonstrate that other proteins, such as pioneer factors, Mediator, and histone modifiers, influence the shape and position of the HbP2 expression pattern. Comparing our results to theory reveals how higher-order cooperativity and energy expenditure impact boundary location and sharpness. Our results emphasize that the bacterial view of transcription regulation, where pairwise interactions between regulatory proteins dominate, must be reexamined in animals, where multiple molecular mechanisms collaborate to shape the gene regulatory function.
Affiliation(s)
- Jeehae Park: Department of Systems Biology, Harvard Medical School, Boston, United States
- Javier Estrada: Department of Systems Biology, Harvard Medical School, Boston, United States
- Gemma Johnson: Department of Systems Biology, Harvard Medical School, Boston, United States
- Ben J Vincent: Department of Systems Biology, Harvard Medical School, Boston, United States
- Chiara Ricci-Tam: Department of Systems Biology, Harvard Medical School, Boston, United States
- Meghan DJ Bragdon: Department of Systems Biology, Harvard Medical School, Boston, United States
- Anna Cha: Department of Systems Biology, Harvard Medical School, Boston, United States
- Zeba Wunderlich: Department of Systems Biology, Harvard Medical School, Boston, United States
- Angela H DePace: Department of Systems Biology, Harvard Medical School, Boston, United States
|
15
|
Repele A, Krueger S, Bhattacharyya T, Tuineau MY. The regulatory control of Cebpa enhancers and silencers in the myeloid and red-blood cell lineages. PLoS One 2019; 14:e0217580. [PMID: 31181110 PMCID: PMC6557489 DOI: 10.1371/journal.pone.0217580] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 01/18/2019] [Accepted: 05/14/2019] [Indexed: 12/31/2022]
Abstract
Cebpa encodes a transcription factor (TF) that plays an instructive role in the development of multiple myeloid lineages. The expression of Cebpa itself is finely modulated, as Cebpa is expressed at high and intermediate levels in neutrophils and macrophages respectively and downregulated in non-myeloid lineages. The cis-regulatory logic underlying the lineage-specific modulation of Cebpa's expression level is yet to be fully characterized. Previously, we had identified 6 new cis-regulatory modules (CRMs) in a 78 kb region surrounding Cebpa. We had also inferred the TFs that regulate each CRM by fitting a sequence-based thermodynamic model to a comprehensive reporter activity dataset. Here, we report the cis-regulatory logic of Cebpa CRMs at the resolution of individual binding sites. We tested the binding sites and functional roles of inferred TFs by designing and constructing mutated CRMs and comparing theoretical predictions of their activity against empirical measurements in a myeloid cell line. The enhancers were confirmed to be activated by combinations of PU.1, C/EBP family TFs, Egr1, and Gfi1 as predicted by the model. We show that silencers repress the activity of the proximal promoter in a dominant manner in G1ME cells, which are derived from the red-blood cell lineage. Dominant repression in G1ME cells can be traced to binding sites for GATA and Myb, a motif shared by all of the silencers. Finally, we demonstrate that GATA and Myb act redundantly to silence the proximal promoter. These results indicate that dominant repression is a novel mechanism for resolving hematopoietic lineages. Furthermore, Cebpa has a fail-safe cis-regulatory architecture, featuring several functionally similar CRMs, each of which contains redundant binding sites for multiple TFs. Lastly, by experimentally demonstrating the predictive ability of our sequence-based thermodynamic model, this work highlights the utility of this computational approach for understanding mammalian gene regulation.
Affiliation(s)
- Andrea Repele: Department of Biology, University of North Dakota, Grand Forks, ND, United States of America
- Shawn Krueger: Department of Biology, University of North Dakota, Grand Forks, ND, United States of America
- Tapas Bhattacharyya: Department of Biology, University of North Dakota, Grand Forks, ND, United States of America
- Michelle Y Tuineau: Department of Biology, University of North Dakota, Grand Forks, ND, United States of America
|
16
|
Thermodynamic model of gene regulation for the Or59b olfactory receptor in Drosophila. PLoS Comput Biol 2019; 15:e1006709. [PMID: 30653495 PMCID: PMC6353224 DOI: 10.1371/journal.pcbi.1006709] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/03/2018] [Revised: 01/30/2019] [Accepted: 12/07/2018] [Indexed: 12/22/2022]
Abstract
Complex eukaryotic promoters normally contain multiple cis-regulatory sequences for different transcription factors (TFs). The binding patterns of the TFs to these sites, as well as the way the TFs interact with each other and with the RNA polymerase (RNAp), lead to combinatorial problems rarely understood in detail, especially under varying epigenetic conditions. The aim of this paper is to build a model describing how the main regulatory cluster of the olfactory receptor Or59b drives transcription of this gene in Drosophila. The cluster-driven expression of this gene is represented as the equilibrium probability of RNAp being bound to the promoter region, using a statistical thermodynamic approach. The RNAp equilibrium probability is computed in terms of the occupancy probabilities of the single TFs of the cluster at their corresponding binding sites, and of the interaction rules among TFs and RNAp, using experimental data on Or59b expression to tune the model parameters. The model correctly reproduces the changes in RNAp binding probability induced by various mutations of specific sites and by epigenetic modifications. Some of its predictions have also been validated in novel experiments.
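The statistical thermodynamic approach described above, in which expression is read out as the equilibrium probability of RNAp being bound, can be sketched generically as follows. This is a minimal illustration, not the Or59b model itself: the function name, the independent-site assumption, the restriction to TF-RNAp interactions, and all numerical weights are hypothetical.

```python
from itertools import product


def rnap_binding_probability(tf_weights, rnap_weight, tf_rnap_interaction):
    """Equilibrium probability that RNAp is bound to the promoter.

    tf_weights          : statistical weight of each TF site being occupied
                          (concentration times affinity; hypothetical values)
    rnap_weight         : statistical weight of RNAp bound with no TF present
    tf_rnap_interaction : factor applied when a bound TF and RNAp coexist
                          (>1 activates, <1 represses, 1 means no interaction)
    """
    z_bound = 0.0  # sum of weights over states with RNAp bound
    z_total = 0.0  # partition function over all states
    for state in product((0, 1), repeat=len(tf_weights)):
        tf_term = 1.0
        interaction = 1.0
        for occupied, q, w in zip(state, tf_weights, tf_rnap_interaction):
            if occupied:
                tf_term *= q
                interaction *= w
        with_rnap = tf_term * rnap_weight * interaction
        z_total += tf_term + with_rnap
        z_bound += with_rnap
    return z_bound / z_total


# Usage: one activator site raises the binding probability above the basal level.
print(rnap_binding_probability([], 0.1, []))
print(rnap_binding_probability([1.0], 0.1, [10.0]))
```

Enumerating all occupancy states is exponential in the number of sites, which is acceptable for a small regulatory cluster; TF-TF interaction terms, which the abstract also mentions, are omitted here for brevity.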
|
17
|
Vincent BJ, Staller MV, Lopez-Rivera F, Bragdon MDJ, Pym ECG, Biette KM, Wunderlich Z, Harden TT, Estrada J, DePace AH. Hunchback is counter-repressed to regulate even-skipped stripe 2 expression in Drosophila embryos. PLoS Genet 2018; 14:e1007644. [PMID: 30192762 PMCID: PMC6145585 DOI: 10.1371/journal.pgen.1007644] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 10/06/2017] [Revised: 09/19/2018] [Accepted: 08/17/2018] [Indexed: 01/18/2023]
Abstract
Hunchback is a bifunctional transcription factor that can activate and repress gene expression in Drosophila development. We investigated the regulatory DNA sequence features that control Hunchback function by perturbing enhancers for one of its target genes, even-skipped (eve). While Hunchback directly represses the eve stripe 3+7 enhancer, we found that in the eve stripe 2+7 enhancer, Hunchback repression is prevented by nearby sequences; this phenomenon is called counter-repression. We also found evidence that Caudal binding sites are responsible for counter-repression, and that this interaction may be a conserved feature of eve stripe 2 enhancers. Our results alter the textbook view of eve stripe 2 regulation wherein Hb is described as a direct activator. Instead, to generate stripe 2, Hunchback repression must be counteracted. We discuss how counter-repression may influence eve stripe 2 regulation and evolution.
Affiliation(s)
- Ben J. Vincent: Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, United States of America
- Max V. Staller: Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, United States of America
- Francheska Lopez-Rivera: Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, United States of America
- Meghan D. J. Bragdon: Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, United States of America
- Edward C. G. Pym: Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, United States of America
- Kelly M. Biette: Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, United States of America
- Zeba Wunderlich: Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, United States of America
- Timothy T. Harden: Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, United States of America
- Javier Estrada: Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, United States of America
- Angela H. DePace: Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, United States of America
|
18
|
Samee MAH, Lydiard-Martin T, Biette KM, Vincent BJ, Bragdon MD, Eckenrode KB, Wunderlich Z, Estrada J, Sinha S, DePace AH. Quantitative Measurement and Thermodynamic Modeling of Fused Enhancers Support a Two-Tiered Mechanism for Interpreting Regulatory DNA. Cell Rep 2017; 21:236-245. [PMID: 28978476 DOI: 10.1016/j.celrep.2017.09.033] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 03/22/2017] [Revised: 07/30/2017] [Accepted: 09/08/2017] [Indexed: 02/07/2023]
Abstract
Computational models of enhancer function generally assume that transcription factors (TFs) exert their regulatory effects independently, modeling an enhancer as a "bag of sites." These models fail on endogenous loci that harbor multiple enhancers, and a "two-tier" model appears better suited: in each enhancer TFs work independently, and the total expression is a weighted sum of their expression readouts. Here, we test these two opposing views on how cis-regulatory information is integrated. We fused two Drosophila blastoderm enhancers, measured their readouts, and applied the above two models to these data. The two-tier mechanism better fits these readouts, suggesting that these fused enhancers comprise multiple independent modules, despite having sequence characteristics typical of single enhancers. We show that short-range TF-TF interactions are not sufficient to designate such modules, suggesting unknown underlying mechanisms. Our results underscore that mechanisms of how modules are defined and how their outputs are combined remain to be elucidated.
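The contrast between the "bag of sites" view and the "two-tier" view can be made concrete with a toy calculation. The sketch below is illustrative only: the logistic readout, the `bias` value, the equal weights, and the site contributions are all assumptions introduced here, not the thermodynamic models fitted in the paper.

```python
import math


def enhancer_readout(site_contributions, bias=-2.0):
    """Toy single-enhancer readout: logistic function of summed site contributions
    (positive values stand for activator sites, negative for repressor sites)."""
    return 1.0 / (1.0 + math.exp(-(sum(site_contributions) + bias)))


def bag_of_sites(enhancer_a, enhancer_b):
    """Treat the fused construct as one enhancer: pool all sites, read out once."""
    return enhancer_readout(enhancer_a + enhancer_b)


def two_tier(enhancer_a, enhancer_b, weight_a=0.5, weight_b=0.5):
    """Read out each enhancer independently, then take a weighted sum."""
    return weight_a * enhancer_readout(enhancer_a) + weight_b * enhancer_readout(enhancer_b)


# Usage: a repressor-dominated enhancer silences the pooled construct,
# but not the independently read activator-rich enhancer.
activating = [3.0, 1.0]
repressing = [-4.0]
print(bag_of_sites(activating, repressing))
print(two_tier(activating, repressing))
```

The two functions diverge whenever the sites of one enhancer would cancel those of the other if pooled, which is the kind of case a fused-enhancer experiment can discriminate.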
Affiliation(s)
- Md Abul Hassan Samee: Gladstone Institutes, University of California San Francisco, San Francisco, CA 94158, USA
- Tara Lydiard-Martin: Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Kelly M Biette: Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Ben J Vincent: Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Meghan D Bragdon: Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Kelly B Eckenrode: Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Zeba Wunderlich: Department of Developmental and Cell Biology, University of California, Irvine, CA 92697, USA
- Javier Estrada: Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Saurabh Sinha: Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA; Carl R. Woese Institute of Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Angela H DePace: Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
|
19
|
|
20
|
Crocker J, Ilsley GR. Using synthetic biology to study gene regulatory evolution. Curr Opin Genet Dev 2017; 47:91-101. [DOI: 10.1016/j.gde.2017.09.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 06/05/2017] [Revised: 09/06/2017] [Accepted: 09/11/2017] [Indexed: 12/21/2022]
|
21
|
Barr KA, Martinez C, Moran JR, Kim AR, Ramos AF, Reinitz J. Synthetic enhancer design by in silico compensatory evolution reveals flexibility and constraint in cis-regulation. BMC Syst Biol 2017; 11:116. [PMID: 29187214 PMCID: PMC5708098 DOI: 10.1186/s12918-017-0485-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 07/05/2017] [Accepted: 11/09/2017] [Indexed: 11/12/2022]
Abstract
BACKGROUND Models that incorporate specific chemical mechanisms have been successful in describing the activity of Drosophila developmental enhancers as a function of underlying transcription factor binding motifs. Despite this, the minimum set of mechanisms required to reconstruct an enhancer from its constituent parts is not known. Synthetic biology offers the potential to test the sufficiency of known mechanisms to describe the activity of enhancers, as well as to uncover constraints on the number, order, and spacing of motifs. RESULTS Using a functional model and in silico compensatory evolution, we generated putative synthetic even-skipped stripe 2 enhancers with varying degrees of similarity to the natural enhancer. These elements represent the evolutionary trajectories of the natural stripe 2 enhancer towards two synthetic enhancers designed ab initio. In the first trajectory, spatially regulated expression was maintained, even after more than a third of binding sites were lost. In the second, sequences with high similarity to the natural element did not drive expression, but a highly diverged sequence about half the length of the minimal stripe 2 enhancer drove ten times greater expression. Additionally, homotypic clusters of Zelda or Stat92E motifs, but not Bicoid, drove expression in developing embryos. CONCLUSIONS Here, we present a functional model of gene regulation to test the degree to which the known transcription factors and their interactions explain the activity of the Drosophila even-skipped stripe 2 enhancer. Initial success in the first trajectory showed that the gene regulation model explains much of the function of the stripe 2 enhancer. Cases where expression deviated from prediction indicate that undescribed factors likely act to modulate expression. We also showed that activation driven by Bicoid and Hunchback is highly sensitive to the spatial organization of binding motifs. In contrast, Zelda and Stat92E drive expression from simple homotypic clusters, suggesting that activation driven by these factors is less constrained. Collectively, the 40 sequences generated in this work provide a powerful training set for building future models of gene regulation.
Affiliation(s)
- Kenneth A Barr: Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Zoology 111, 1101 E 57th St, Chicago, 60637, Illinois, USA; Department of Ecology and Evolution, The University of Chicago, Chicago, 60637, Illinois, USA
- Carlos Martinez: Department Biochemistry and Molecular Genetics, Northwestern University, Chicago, 60611, Illinois, USA
- Jennifer R Moran: Department Human Genetics, The University of Chicago, Chicago, 60637, Illinois, USA; Institute for Genomics & Systems Biology, The University of Chicago, Chicago, 60637, Illinois, USA
- Ah-Ram Kim: School of Life Science, Handong Global University, Pohang, 37554, Gyeongbuk, South Korea
- Alexandre F Ramos: Departamento de Radiologia - Faculdade de Medicina, Universidade de São Paulo & Instituto do Câncer do Estado de São Paulo, São Paulo, SP CEP, 05403-911, Brazil; Escola de Artes, Ciências e Humanidades & Núcleo de Estudos Interdisciplinares em Sistemas Complexos, Universidade de São Paulo, Av. Arlindo Béttio, São Paulo, 1000 CEP 03828-000, SP, Brazil
- John Reinitz: Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Zoology 111, 1101 E 57th St, Chicago, 60637, Illinois, USA; Department of Ecology and Evolution, The University of Chicago, Chicago, 60637, Illinois, USA; Institute for Genomics & Systems Biology, The University of Chicago, Chicago, 60637, Illinois, USA; Department Statistics, The University of Chicago, 5747 S. Ellis Avenue Jones 312, Chicago, 60637, IL, USA
|
22
|
Gursky VV, Kozlov KN, Kulakovskiy IV, Zubair A, Marjoram P, Lawrie DS, Nuzhdin SV, Samsonova MG. Translating natural genetic variation to gene expression in a computational model of the Drosophila gap gene regulatory network. PLoS One 2017; 12:e0184657. [PMID: 28898266 PMCID: PMC5595321 DOI: 10.1371/journal.pone.0184657] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 07/05/2017] [Accepted: 08/28/2017] [Indexed: 11/18/2022]
Abstract
Annotating the genotype-phenotype relationship, and developing a proper quantitative description of the relationship, requires understanding the impact of natural genomic variation on gene expression. We apply a sequence-level model of gap gene expression in the early development of Drosophila to analyze single nucleotide polymorphisms (SNPs) in a panel of natural sequenced D. melanogaster lines. Using a thermodynamic modeling framework, we provide both analytical and computational descriptions of how single-nucleotide variants affect gene expression. The analysis reveals that the sequence variants increase (decrease) gene expression if located within binding sites of repressors (activators). We show that the sign of SNP influence (activation or repression) may change in time and space and elucidate the origin of this change in specific examples. The thermodynamic modeling approach predicts non-local and non-linear effects arising from SNPs, and combinations of SNPs, in individual fly genotypes. Simulation of individual fly genotypes using our model reveals that this non-linearity reduces to almost additive inputs from multiple SNPs. Further, we see signatures of the action of purifying selection in the gap gene regulatory regions. To infer the specific targets of purifying selection, we analyze the patterns of polymorphism in the data at two phenotypic levels: the strengths of binding and expression. We find that combinations of SNPs show evidence of being under selective pressure, while individual SNPs do not. The model predicts that SNPs appear to accumulate in the genotypes of the natural population in a way biased towards small increases in activating action on the expression pattern. Taken together, these results provide a systems-level view of how genetic variation translates to the level of gene regulatory networks via combinatorial SNP effects.
Affiliation(s)
- Vitaly V. Gursky: Theoretical Department, Ioffe Institute, Saint Petersburg, Russia; Systems Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
- Konstantin N. Kozlov: Systems Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
- Ivan V. Kulakovskiy: Engelhardt Institute of Molecular Biology, Moscow, Russia; Vavilov Institute of General Genetics, Moscow, Russia; Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, Moscow, Russia
- Asif Zubair: Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Paul Marjoram: Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- David S. Lawrie: Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Sergey V. Nuzhdin: Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Maria G. Samsonova: Systems Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
|
23
|
Barr KA, Reinitz J. A sequence level model of an intact locus predicts the location and function of nonadditive enhancers. PLoS One 2017; 12:e0180861. [PMID: 28715438 PMCID: PMC5513433 DOI: 10.1371/journal.pone.0180861] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 11/07/2016] [Accepted: 06/22/2017] [Indexed: 01/24/2023]
Abstract
Metazoan gene expression is controlled through the action of long stretches of noncoding DNA that contain enhancers, shorter sequences responsible for controlling a single aspect of a gene's expression pattern. Models built on thermodynamics have shown how enhancers interpret protein concentration in order to determine specific levels of gene expression, but the emergent regulatory logic of a complete regulatory locus shows qualitative and quantitative differences from isolated enhancers. Such differences may arise from steric competition limiting the quantity of DNA that can simultaneously influence the transcription machinery. We incorporated this competition into a mechanistic model of gene regulation, generated efficient algorithms for this computation, and applied it to the regulation of Drosophila even-skipped (eve). This model finds the location of enhancers and identifies which factors control the boundaries of eve expression. This model predicts a new enhancer that, when assayed in vivo, drives expression in a non-eve pattern. Incorporation of chromatin accessibility eliminates this inconsistency.
Affiliation(s)
- Kenneth A. Barr: Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, Illinois, United States of America
- John Reinitz: Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, Illinois, United States of America; Department of Statistics, University of Chicago, Chicago, Illinois, United States of America; Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America; Department of Molecular Genetics and Cell Biology, University of Chicago, Chicago, Illinois, United States of America; Institute for Genomics and Systems Biology, University of Chicago, Chicago, Illinois, United States of America
|
24
|
Shamarova E, Chertovskih R, Ramos AF, Aguiar P. Backward-stochastic-differential-equation approach to modeling of gene expression. Phys Rev E 2017; 95:032418. [PMID: 28415290 DOI: 10.1103/physreve.95.032418] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 08/01/2016] [Indexed: 11/07/2022]
Abstract
In this article, we introduce a backward method to model stochastic gene expression and protein-level dynamics. The protein amount is regarded as a diffusion process and is described by a backward stochastic differential equation (BSDE). Unlike many other SDE techniques proposed in the literature, the BSDE method is backward in time; that is, instead of initial conditions it requires the specification of end-point ("final") conditions, in addition to the model parametrization. To validate our approach we employ Gillespie's stochastic simulation algorithm (SSA) to generate (forward) benchmark data, according to predefined gene network models. Numerical simulations show that the BSDE method is able to correctly infer the protein-level distributions that preceded a known final condition, obtained originally from the forward SSA. This makes the BSDE method a powerful systems biology tool for time-reversed simulations, allowing, for example, the assessment of the biological conditions (e.g., protein concentrations) that preceded an experimentally measured event of interest (e.g., mitosis, apoptosis, etc.).
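The forward benchmark step mentioned above relies on Gillespie's stochastic simulation algorithm. As a rough illustration of that algorithm only, the sketch below simulates a birth-death model of protein copy number. The model (constant synthesis, first-order degradation), the function name, and the rate values are assumptions made for this example; the abstract does not specify the gene network models that were actually used, and the BSDE method itself is not shown.

```python
import random


def gillespie_birth_death(k_syn, k_deg, n0, t_end, rng):
    """Simulate protein copy number with synthesis at rate k_syn and
    degradation at rate k_deg * n, using Gillespie's direct method."""
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_deg = k_deg * n          # propensity of degradation
        a_total = k_syn + a_deg    # total propensity
        if a_total == 0.0:
            break                  # no reaction can fire
        t += rng.expovariate(a_total)  # exponential waiting time to next event
        if t > t_end:
            break
        # Pick the reaction with probability proportional to its propensity.
        if rng.random() * a_total < a_deg:
            n -= 1
        else:
            n += 1
        times.append(t)
        counts.append(n)
    return times, counts


# Usage: one trajectory with hypothetical rates and a fixed seed.
times, counts = gillespie_birth_death(10.0, 1.0, 0, 5.0, random.Random(0))
print(len(times), counts[-1])
```

Repeating the simulation over many seeds yields an empirical distribution of copy numbers at the final time, which is the kind of end-point data a backward-in-time method would take as its starting condition.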
Affiliation(s)
- Evelina Shamarova: Departamento de Matemática, Universidade Federal da Paraíba, 58051-900 João Pessoa, Brazil
- Roman Chertovskih: Samara National Research University, Moskovskoe shosse 34, 443086 Samara, Russian Federation
- Alexandre F Ramos: Escola de Artes, Ciências e Humanidades, Universidade de São Paulo, Av. Arlindo Béttio 1000, 03828-00 São Paulo, SP, Brazil
- Paulo Aguiar: INEB, Instituto de Engenharia Biomédica i3S, Instituto de Investigação e Inovação em Saúde, Rua Alfredo Allen 208, 4200-135 Porto, Portugal
|
25
|
Chertkova AA, Schiffman JS, Nuzhdin SV, Kozlov KN, Samsonova MG, Gursky VV. In silico evolution of the Drosophila gap gene regulatory sequence under elevated mutational pressure. BMC Evol Biol 2017; 17:4. [PMID: 28251865 PMCID: PMC5333172 DOI: 10.1186/s12862-016-0866-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 11/10/2022]
Abstract
BACKGROUND Cis-regulatory sequences are often composed of many low-affinity transcription factor binding sites (TFBSs). Determining the evolutionary and functional importance of regulatory sequence composition is impeded without a detailed knowledge of the genotype-phenotype map. RESULTS We simulate the evolution of regulatory sequences involved in Drosophila melanogaster embryo segmentation during early development. Natural selection evaluates gene expression dynamics produced by a computational model of the developmental network. We observe a dramatic decrease in the total number of transcription factor binding sites through the course of evolution. Despite a decrease in average sequence binding energies through time, the regulatory sequences tend towards organisations containing increased high affinity transcription factor binding sites. Additionally, the binding energies of separate sequence segments demonstrate ubiquitous mutual correlations through time. Fewer than 10% of initial TFBSs are maintained throughout the entire simulation, deemed 'core' sites. These sites have increased functional importance as assessed under wild-type conditions and their binding energy distributions are highly conserved. Furthermore, TFBSs within close proximity of core sites exhibit increased longevity, reflecting functional regulatory interactions with core sites. CONCLUSION In response to elevated mutational pressure, evolution tends to sample regulatory sequence organisations with fewer, albeit on average, stronger functional transcription factor binding sites. These organisations are also shaped by the regulatory interactions among core binding sites with sites in their local vicinity.
Affiliation(s)
- Aleksandra A. Chertkova: Systems Biology and Bioinformatics Laboratory, Peter the Great St. Petersburg Polytechnic University, Polytechnicheskaya, 29, St. Petersburg, 195251 Russia
- Joshua S. Schiffman: Molecular and Computational Biology, University of Southern California, Los Angeles, 90089 CA USA
- Sergey V. Nuzhdin: Systems Biology and Bioinformatics Laboratory, Peter the Great St. Petersburg Polytechnic University, Polytechnicheskaya, 29, St. Petersburg, 195251 Russia; Molecular and Computational Biology, University of Southern California, Los Angeles, 90089 CA USA
- Konstantin N. Kozlov: Systems Biology and Bioinformatics Laboratory, Peter the Great St. Petersburg Polytechnic University, Polytechnicheskaya, 29, St. Petersburg, 195251 Russia
- Maria G. Samsonova: Systems Biology and Bioinformatics Laboratory, Peter the Great St. Petersburg Polytechnic University, Polytechnicheskaya, 29, St. Petersburg, 195251 Russia
- Vitaly V. Gursky: Systems Biology and Bioinformatics Laboratory, Peter the Great St. Petersburg Polytechnic University, Polytechnicheskaya, 29, St. Petersburg, 195251 Russia; Theoretical Department, Ioffe Institute, Polytechnicheskaya, 26, St. Petersburg, 194021 Russia
|
26
|
A Looping-Based Model for Quenching Repression. PLoS Comput Biol 2017; 13:e1005337. [PMID: 28085884 PMCID: PMC5279812 DOI: 10.1371/journal.pcbi.1005337] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/14/2016] [Revised: 01/30/2017] [Accepted: 12/29/2016] [Indexed: 12/18/2022]
Abstract
We model the regulatory role of proteins bound to looped DNA using a simulation in which dsDNA is represented as a self-avoiding chain, and proteins as spherical protrusions. We simulate long self-avoiding chains using a sequential importance sampling Monte-Carlo algorithm, and compute the probabilities for chain looping with and without a protrusion. We find that a protrusion near one of the chain’s termini reduces the probability of looping, even for chains much longer than the protrusion–chain-terminus distance. This effect increases with protrusion size, and decreases with protrusion-terminus distance. The reduced probability of looping can be explained via an eclipse-like model, which provides a novel inhibitory mechanism. We test the eclipse model on two possible transcription-factor occupancy states of the D. melanogaster eve 3/7 enhancer, and show that it provides a possible explanation for the experimentally observed eve stripe 3 and 7 expression patterns. Biological regulation-at-a-distance, whereby a transcription factor (TF) is able to generate substantial regulatory effects on gene expression even though it may be bound a large distance away from its target (500 bp–1 Mbp), is only partially understood. Using a biophysical model and a computer simulation that take dsDNA and TF volumes into account, we identify a downregulatory mechanism which functions at large distances, whereby a TF bound within ∼150 bp from an activator decreases the probability of looping-based interaction between the activator and the distant core promoter. This “eclipse” mechanism provides insight into the question of how enhancer architecture dictates gene expression.
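Sequential importance sampling of self-avoiding chains can be illustrated with the classic Rosenbluth scheme on a square lattice. This is a deliberately simplified sketch: the lattice geometry, the looping criterion (chain end adjacent to the origin), and the function names are assumptions made here, and the protrusions and realistic chain geometry of the study are not modelled.

```python
import random


def rosenbluth_walk(n_steps, rng):
    """Grow one self-avoiding walk on the square lattice.

    Returns (path, weight), where weight is the product of the number of
    free neighbours at each growth step, or 0.0 if the walk gets trapped."""
    path = [(0, 0)]
    visited = {(0, 0)}
    weight = 1.0
    for _ in range(n_steps):
        x, y = path[-1]
        neighbours = ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
        free = [p for p in neighbours if p not in visited]
        if not free:
            return path, 0.0  # dead end: this sample contributes nothing
        weight *= len(free)   # importance weight corrects the growth bias
        step = rng.choice(free)
        path.append(step)
        visited.add(step)
    return path, weight


def looping_probability(n_steps, n_samples, rng):
    """Weighted fraction of sampled walks whose end lies next to the origin."""
    total = 0.0
    looped = 0.0
    for _ in range(n_samples):
        path, weight = rosenbluth_walk(n_steps, rng)
        total += weight
        x, y = path[-1]
        if weight > 0.0 and abs(x) + abs(y) == 1:
            looped += weight
    return looped / total if total > 0.0 else 0.0


# Usage: estimate the looping probability of a 15-step chain.
print(looping_probability(15, 5000, random.Random(0)))
```

Comparing such an estimate with and without excluded sites near one terminus would mimic, in a very crude way, the with-protrusion versus without-protrusion comparison described in the abstract.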
|
27
|
Hang S, Gergen JP. Different modes of enhancer-specific regulation by Runt and Even-skipped during Drosophila segmentation. Mol Biol Cell 2017; 28:681-691. [PMID: 28077616 PMCID: PMC5328626 DOI: 10.1091/mbc.e16-09-0630] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 09/02/2016] [Revised: 12/13/2016] [Accepted: 01/04/2017] [Indexed: 12/04/2022]
Abstract
Expression of the Drosophila slp1 gene depends on nonadditive interactions between two cis-regulatory enhancers. These enhancers are repressed by preventing either Pol II recruitment or release of promoter-proximal paused Pol II in a manner that is both enhancer and transcription factor specific and can account for their nonadditive interaction. The initial metameric expression of the Drosophila sloppy paired 1 (slp1) gene is controlled by two distinct cis-regulatory DNA elements that interact in a nonadditive manner to integrate inputs from transcription factors encoded by the pair-rule segmentation genes. We performed chromatin immunoprecipitation on reporter genes containing these elements in different embryonic genotypes to investigate the mechanism of their regulation. The distal early stripe element (DESE) mediates both activation and repression by Runt. We find that the differential response of DESE to Runt is due to an inhibitory effect of Fushi tarazu (Ftz) on P-TEFb recruitment and the regulation of RNA polymerase II (Pol II) pausing. The proximal early stripe element (PESE) is also repressed by Runt, but in this case, Runt prevents PESE-dependent Pol II recruitment and preinitiation complex (PIC) assembly. PESE is also repressed by Even-skipped (Eve), but, of interest, this repression involves regulation of P-TEFb recruitment and promoter-proximal Pol II pausing. These results demonstrate that the mode of slp1 repression by Runt is enhancer specific, whereas the mode of repression of the slp1 PESE enhancer is transcription factor specific. We propose a model based on these differential regulatory interactions that accounts for the nonadditive interactions between the PESE and DESE enhancers during Drosophila segmentation.
Affiliation(s)
- Saiyu Hang
- Department of Biochemistry and Cell Biology and Center for Developmental Genetics and.,Graduate Program in Biochemistry and Structural Biology, Stony Brook University, Stony Brook, NY 11794
| | - J Peter Gergen
- Department of Biochemistry and Cell Biology and Center for Developmental Genetics and
| |
|
28
|
Lee H, Cho DY, Whitworth C, Eisman R, Phelps M, Roote J, Kaufman T, Cook K, Russell S, Przytycka T, Oliver B. Effects of Gene Dose, Chromatin, and Network Topology on Expression in Drosophila melanogaster. PLoS Genet 2016; 12:e1006295. [PMID: 27599372 PMCID: PMC5012587 DOI: 10.1371/journal.pgen.1006295] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 11/20/2015] [Accepted: 08/10/2016] [Indexed: 11/18/2022]
Abstract
Deletions, commonly referred to as deficiencies by Drosophila geneticists, are valuable tools for mapping genes and for genetic pathway discovery via dose-dependent suppressor and enhancer screens. More recently, it has become clear that deviations from normal gene dosage are associated with multiple disorders in a range of species including humans. While we are beginning to understand some of the transcriptional effects brought about by gene dosage changes and the chromosome rearrangement breakpoints associated with them, much of this work relies on isolated examples. We have systematically examined deficiencies of the left arm of chromosome 2 and characterize gene-by-gene dosage responses that vary from collapsed expression through modest partial dosage compensation to full or even over compensation. We found negligible long-range effects of creating novel chromosome domains at deletion breakpoints, suggesting that cases of gene regulation due to altered nuclear architecture are rare. These rare cases include trans de-repression when deficiencies delete chromatin characterized as repressive in other studies. Generally, effects of breakpoints on expression are promoter proximal (~100bp) or in the gene body. Effects of deficiencies genome-wide are in genes with regulatory relationships to genes within the deleted segments, highlighting the subtle expression network defects in these sensitized genetic backgrounds.
Affiliation(s)
- Hangnoh Lee
- Section of Developmental Genomics, Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Kidney and Digestive Diseases, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Dong-Yeon Cho
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Cale Whitworth
- Section of Developmental Genomics, Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Kidney and Digestive Diseases, National Institutes of Health, Bethesda, Maryland, United States of America
- Department of Biology, Indiana University, Bloomington, Indiana, United States of America
| | - Robert Eisman
- Department of Biology, Indiana University, Bloomington, Indiana, United States of America
| | - Melissa Phelps
- Department of Biology, Indiana University, Bloomington, Indiana, United States of America
| | - John Roote
- Department of Genetics and Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
| | - Thomas Kaufman
- Department of Biology, Indiana University, Bloomington, Indiana, United States of America
| | - Kevin Cook
- Department of Biology, Indiana University, Bloomington, Indiana, United States of America
| | - Steven Russell
- Department of Genetics and Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
| | - Teresa Przytycka
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Brian Oliver
- Section of Developmental Genomics, Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Kidney and Digestive Diseases, National Institutes of Health, Bethesda, Maryland, United States of America
| |
|
29
|
Lou Z, Reinitz J. Parallel Simulated Annealing Using an Adaptive Resampling Interval. PARALLEL COMPUTING 2016; 53:23-31. [PMID: 26941469 PMCID: PMC4770898 DOI: 10.1016/j.parco.2016.02.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 06/05/2023]
Abstract
This paper presents a parallel simulated annealing algorithm that is able to achieve 90% parallel efficiency in iteration on up to 192 processors and up to 40% parallel efficiency in time when applied to a 5000-dimension Rastrigin function. Our algorithm breaks scalability barriers in the method of Chu et al. (1999) by abandoning adaptive cooling based on variance. The resulting gains in parallel efficiency are much larger than the loss of serial efficiency from lack of adaptive cooling. Our algorithm resamples the states across processors periodically. The resampling interval is tuned according to the success rate for each specific number of processors. We further present an adaptive method to determine the resampling interval based on the adoption rate. This adaptive method is able to achieve nearly identical parallel efficiency but higher success rates compared to the fixed interval one using the best interval found.
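The resampling idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the chains are emulated serially rather than on separate processors, the resampling interval is fixed rather than adaptive, and the function names, the geometric cooling schedule, the Gaussian move size, and the Boltzmann resampling weights are all assumptions made for illustration.

```python
import math
import random


def rastrigin(x):
    """Rastrigin test function; its global minimum is 0 at the origin."""
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)


def anneal_with_resampling(dim=5, n_chains=8, n_steps=4000, resample_every=200,
                           t_start=10.0, t_end=0.01, seed=0):
    """Run several annealing chains and periodically resample their states.

    Hypothetical illustration: chains stand in for processors, and the
    resampling interval is a fixed parameter (assumption).
    """
    rng = random.Random(seed)
    chains = [[rng.uniform(-5.12, 5.12) for _ in range(dim)] for _ in range(n_chains)]
    energies = [rastrigin(c) for c in chains]
    best = min(energies)
    for step in range(n_steps):
        # Geometric cooling from t_start to t_end (assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / max(1, n_steps - 1))
        for i in range(n_chains):
            cand = list(chains[i])
            j = rng.randrange(dim)
            cand[j] = min(5.12, max(-5.12, cand[j] + rng.gauss(0.0, 0.5)))
            e = rastrigin(cand)
            # Metropolis acceptance rule.
            if e <= energies[i] or rng.random() < math.exp((energies[i] - e) / temp):
                chains[i], energies[i] = cand, e
                best = min(best, e)
        if (step + 1) % resample_every == 0:
            # Redistribute states across chains with Boltzmann weights.
            e_min = min(energies)
            weights = [math.exp(-(e - e_min) / temp) for e in energies]
            picks = rng.choices(range(n_chains), weights=weights, k=n_chains)
            chains = [list(chains[p]) for p in picks]
            energies = [energies[p] for p in picks]
    return best
```

Calling `anneal_with_resampling(resample_every=...)` with different intervals gives a rough feel for the trade-off the abstract describes between frequent state exchange and independent exploration.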
Affiliation(s)
- Zhihao Lou
- Department of Computer Science, the University of Chicago, Chicago, Illinois, USA
| | - John Reinitz
- Department of Statistics, Department of Ecology and Evolution, Department of Molecular Genetics and Cell Biology, Institute of Genomics and Systems Biology, the University of Chicago, Chicago, Illinois, USA
| |
|
30
|
Prata GN, Hornos JEM, Ramos AF. Stochastic model for gene transcription on Drosophila melanogaster embryos. Phys Rev E 2016; 93:022403. [PMID: 26986358 DOI: 10.1103/physreve.93.022403] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 11/10/2014] [Indexed: 02/04/2023]
Abstract
We examine immunostaining experimental data for the formation of stripe 2 of even-skipped (eve) transcripts on D. melanogaster embryos. An estimate of the factor converting immunofluorescence intensity units into molecular numbers is given. The analysis of the eve dynamics at the region of stripe 2 suggests that the promoter site of the gene has two distinct regimes: an earlier phase when it is predominantly activated until a critical time when it becomes mainly repressed. That suggests proposing a stochastic binary model for gene transcription on D. melanogaster embryos. Our model has two random variables: the transcripts number and the state of the source of mRNAs given as active or repressed. We are able to reproduce available experimental data for the average number of transcripts. An analysis of the random fluctuations on the number of eves and their consequences on the spatial precision of stripe 2 is presented. We show that the position of the anterior or posterior borders fluctuate around their average position by ∼1% of the embryo length, which is similar to what is found experimentally. The fitting of data by such a simple model suggests that it can be useful to understand the functions of randomness during developmental processes.
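A stochastic binary model of this kind, with a transcript count and an active or repressed source, can be sketched as a Gillespie-style simulation. This is only an illustration under stated assumptions: the rate names `k_on`, `k_off`, `k_tx`, and `k_deg` are hypothetical, the rates are held constant in time (the abstract describes a switch between two regimes at a critical time, which is not modeled here), and no parameter values come from the paper.

```python
import random


def simulate_binary_gene(k_on, k_off, k_tx, k_deg, t_end, seed=0):
    """Simulate a two-state gene until time t_end.

    Returns (active, n): the final promoter state and transcript count.
    Requires k_on > 0 and k_off > 0 so the total rate never vanishes.
    """
    rng = random.Random(seed)
    t, active, n = 0.0, False, 0
    while True:
        rates = [
            k_off if active else k_on,  # promoter switches state
            k_tx if active else 0.0,    # transcription, only when active
            k_deg * n,                  # degradation of one transcript
        ]
        total = sum(rates)
        t += rng.expovariate(total)     # waiting time to the next event
        if t > t_end:
            return active, n
        r = rng.random() * total        # choose which event fires
        if r < rates[0]:
            active = not active
        elif r < rates[0] + rates[1]:
            n += 1
        else:
            n -= 1
```

With constant rates, the long-time mean transcript number of this sketch is `(k_tx / k_deg) * k_on / (k_on + k_off)`, which offers a simple check when averaging over many runs.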
Affiliation(s)
- Guilherme N Prata
- Escola de Artes, Ciências e Humanidades, Universidade de São Paulo, Avenida Arlindo Béttio, 1000, Ermelino Matarazzo, São Paulo, SP CEP 03828-000, Brazil
| | - José Eduardo M Hornos
- Instituto de Física de São Carlos, Universidade de São Paulo, Av. Trabalhador São-Carlense, 400, São Carlos, SP CEP 13566-590, Brazil
| | - Alexandre F Ramos
- Escola de Artes, Ciências e Humanidades, Universidade de São Paulo, Avenida Arlindo Béttio, 1000, Ermelino Matarazzo, São Paulo, SP CEP 03828-000, Brazil.,Departamento de Radiologia - Faculdade de Medicina, Universidade de São Paulo, São Carlos, SP CEP 13566-590, Brazil.,Núcleo de Estudos Interdisciplinares em Sistemas Complexos, Universidade de São Paulo, São Carlos, SP CEP 13566-590, Brazil
| |
|
31
|
Vincent BJ, Estrada J, DePace AH. The appeasement of Doug: a synthetic approach to enhancer biology. Integr Biol (Camb) 2016; 8:475-84. [DOI: 10.1039/c5ib00321k] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Indexed: 12/13/2022]
Affiliation(s)
- Ben J. Vincent
- Department of Systems Biology, Harvard Medical School, 200 Longwood Avenue, Boston, MA 02115, USA
| | - Javier Estrada
- Department of Systems Biology, Harvard Medical School, 200 Longwood Avenue, Boston, MA 02115, USA
| | - Angela H. DePace
- Department of Systems Biology, Harvard Medical School, 200 Longwood Avenue, Boston, MA 02115, USA
| |
|
32
|
Bertolino E, Reinitz J, Manu. The analysis of novel distal Cebpa enhancers and silencers using a transcriptional model reveals the complex regulatory logic of hematopoietic lineage specification. Dev Biol 2016; 413:128-44. [PMID: 26945717 DOI: 10.1016/j.ydbio.2016.02.030] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 10/16/2015] [Revised: 01/13/2016] [Accepted: 02/15/2016] [Indexed: 11/25/2022]
Abstract
C/EBPα plays an instructive role in the macrophage-neutrophil cell-fate decision and its expression is necessary for neutrophil development. How Cebpa itself is regulated in the myeloid lineage is not known. We decoded the cis-regulatory logic of Cebpa, and two other myeloid transcription factors, Egr1 and Egr2, using a combined experimental-computational approach. With a reporter design capable of detecting both distal enhancers and silencers, we analyzed 46 putative cis-regulatory modules (CRMs) in cells representing myeloid progenitors, and derived early macrophages or neutrophils. In addition to novel enhancers, this analysis revealed a surprisingly large number of silencers. We determined the regulatory roles of 15 potential transcriptional regulators by testing 32,768 alternative sequence-based transcriptional models against CRM activity data. This comprehensive analysis allowed us to infer the cis-regulatory logic for most of the CRMs. Silencer-mediated repression of Cebpa was found to be effected mainly by TFs expressed in non-myeloid lineages, highlighting a previously unappreciated contribution of long-distance silencing to hematopoietic lineage resolution. The repression of Cebpa by multiple factors expressed in alternative lineages suggests that hematopoietic genes are organized into densely interconnected repressive networks instead of hierarchies of mutually repressive pairs of pivotal TFs. More generally, our results demonstrate that de novo cis-regulatory dissection is feasible on a large scale with the aid of transcriptional modeling.
Affiliation(s)
- Eric Bertolino
- Department of Molecular Genetics and Cell Biology, The University of Chicago, Chicago, IL 60637, USA.
| | - John Reinitz
- Department of Molecular Genetics and Cell Biology, The University of Chicago, Chicago, IL 60637, USA; Department of Statistics, The University of Chicago, Chicago, IL 60637, USA; Department of Ecology and Evolution and Institute of Genomics and Systems Biology, The University of Chicago, Chicago, IL 60637, USA
| | - Manu
- Department of Ecology and Evolution and Institute of Genomics and Systems Biology, The University of Chicago, Chicago, IL 60637, USA; Department of Biology, University of North Dakota, 10 Cornell Street, Stop 9019, Grand Forks, ND 58202-9019, USA.
| |
|
33
|
Hoermann A, Cicin-Sain D, Jaeger J. A quantitative validated model reveals two phases of transcriptional regulation for the gap gene giant in Drosophila. Dev Biol 2016; 411:325-338. [DOI: 10.1016/j.ydbio.2016.01.005] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 09/24/2015] [Revised: 12/22/2015] [Accepted: 01/08/2016] [Indexed: 01/05/2023]
|
34
|
Quantitatively predictable control of Drosophila transcriptional enhancers in vivo with engineered transcription factors. Nat Genet 2016; 48:292-8. [DOI: 10.1038/ng.3509] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 08/30/2015] [Accepted: 01/15/2016] [Indexed: 12/13/2022]
|
35
|
Peng PC, Hassan Samee MA, Sinha S. Incorporating chromatin accessibility data into sequence-to-expression modeling. Biophys J 2015; 108:1257-67. [PMID: 25762337 DOI: 10.1016/j.bpj.2014.12.037] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 08/18/2014] [Revised: 12/01/2014] [Accepted: 12/11/2014] [Indexed: 01/30/2023]
Abstract
Prediction of gene expression levels from regulatory sequences is one of the major challenges of genomic biology today. A particularly promising approach to this problem is that taken by thermodynamics-based models that interpret an enhancer sequence in a given cellular context specified by transcription factor concentration levels and predict precise expression levels driven by that enhancer. Such models have so far not accounted for the effect of chromatin accessibility on interactions between transcription factor and DNA and consequently on gene-expression levels. Here, we extend a thermodynamics-based model of gene expression, called GEMSTAT (Gene Expression Modeling Based on Statistical Thermodynamics), to incorporate chromatin accessibility data and quantify its effect on accuracy of expression prediction. In the new model, called GEMSTAT-A, accessibility at a binding site is assumed to affect the transcription factor's binding strength at the site, whereas all other aspects are identical to the GEMSTAT model. We show that this modification results in significantly better fits in a data set of over 30 enhancers regulating spatial expression patterns in the blastoderm-stage Drosophila embryo. It is important to note that the improved fits result not from an overall elevated accessibility in active enhancers but from the variation of accessibility levels within an enhancer. With whole-genome DNA accessibility measurements becoming increasingly popular, our work demonstrates how such data may be useful for sequence-to-expression models. It also calls for future advances in modeling accessibility levels from sequence and the transregulatory context, so as to predict accurately the effect of cis and trans perturbations on gene expression.
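The central assumption stated in this abstract, that accessibility at a site modulates the transcription factor's binding strength there, can be made concrete with a minimal single-site sketch. The functional form below (accessibility as a multiplicative factor on the binding weight) and the names `site_occupancy`, `k_assoc`, and `accessibility` are assumptions chosen for illustration; they are not taken from GEMSTAT-A.

```python
def site_occupancy(tf_conc, k_assoc, accessibility):
    """Equilibrium occupancy of one binding site.

    tf_conc: transcription factor concentration (arbitrary units).
    k_assoc: association constant of the site (inverse of those units).
    accessibility: value in [0, 1] that scales binding strength (assumed form).
    """
    if not 0.0 <= accessibility <= 1.0:
        raise ValueError("accessibility must lie in [0, 1]")
    q = k_assoc * tf_conc * accessibility  # statistical weight of the bound state
    return q / (1.0 + q)


def mean_enhancer_occupancy(tf_conc, sites):
    """Average occupancy over independent sites given as (k_assoc, accessibility)."""
    return sum(site_occupancy(tf_conc, k, a) for k, a in sites) / len(sites)
```

Under this form, two enhancers with the same average accessibility can still differ in predicted occupancy if accessibility is distributed differently across strong and weak sites, which is in the spirit of the within-enhancer variation the abstract emphasizes.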
Affiliation(s)
- Pei-Chen Peng
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois
| | - Md Abul Hassan Samee
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois
| | - Saurabh Sinha
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois; Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois.
| |
|
36
|
Demidov GM, Samsonova MG, Gursky VV. A stochastic model of the formation of the molecular configuration of an enhancer site. Biophysics (Nagoya-shi) 2016. [DOI: 10.1134/s0006350916010073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
37
|
Samee MAH, Lim B, Samper N, Lu H, Rushlow CA, Jiménez G, Shvartsman SY, Sinha S. A Systematic Ensemble Approach to Thermodynamic Modeling of Gene Expression from Sequence Data. Cell Syst 2015; 1:396-407. [PMID: 27136354 DOI: 10.1016/j.cels.2015.12.002] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 08/10/2015] [Revised: 10/19/2015] [Accepted: 12/02/2015] [Indexed: 11/17/2022]
Abstract
To understand the relationship between an enhancer DNA sequence and quantitative gene expression, thermodynamics-driven mathematical models of transcription are often employed. These "sequence-to-expression" models can describe an incomplete or even incorrect set of regulatory relationships if the parameter space is not searched systematically. Here, we focus on an enhancer of the Drosophila gene ind and demonstrate how a systematic search of parameter space can reveal a more comprehensive picture of a gene's regulatory mechanisms, resolve outstanding ambiguities, and suggest testable hypotheses. We describe an approach that generates an ensemble of ind models; all of these models are technically acceptable solutions to the sequence-to-expression problem in light of wild-type data, and some represent mechanistically distinct hypotheses about the regulation of ind. This ensemble can be restricted to biologically plausible models using requirements gleaned from in vivo perturbation experiments. Biologically plausible models make unique predictions about how specific ind enhancer sequences affect ind expression; we validate these predictions in vivo through site mutagenesis in transgenic Drosophila embryos.
Affiliation(s)
- Md Abul Hassan Samee
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Bomyi Lim
- Department of Chemical and Biological Engineering and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| | - Núria Samper
- Department of Developmental Biology, Instituto de Biología Molecular de Barcelona, Consejo Superior de Investigaciones Científicas (CSIC), Barcelona 08208, Spain
| | - Hang Lu
- School of Chemical and Biomolecular Engineering and Parker H. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, GA 30332, USA
| | | | - Gerardo Jiménez
- Department of Developmental Biology, Instituto de Biología Molecular de Barcelona, Consejo Superior de Investigaciones Científicas (CSIC), Barcelona 08208, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona 08010, Spain
| | - Stanislav Y Shvartsman
- Department of Chemical and Biological Engineering and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| | - Saurabh Sinha
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.
| |
|
38
|
Kreamer NN, Phillips R, Newman DK, Boedicker JQ. Predicting the impact of promoter variability on regulatory outputs. Sci Rep 2015; 5:18238. [PMID: 26675057 PMCID: PMC4682146 DOI: 10.1038/srep18238] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 08/27/2015] [Accepted: 11/16/2015] [Indexed: 11/24/2022]
Abstract
The increased availability of whole genome sequences calls for quantitative models of global gene expression, yet predicting gene expression patterns directly from genome sequence remains a challenge. We examine the contributions of an individual regulator, the ferrous iron-responsive regulatory element, BqsR, on global patterns of gene expression in Pseudomonas aeruginosa. The position weight matrix (PWM) derived for BqsR uncovered hundreds of likely binding sites throughout the genome. Only a subset of these potential binding sites had a regulatory consequence, suggesting that BqsR/DNA interactions were not captured within the PWM or that the broader regulatory context at each promoter played a greater role in setting promoter outputs. The architecture of the BqsR operator was systematically varied to understand how binding site parameters influence expression. We found that BqsR operator affinity was predicted by the PWM well. At many promoters the surrounding regulatory context, including overlapping operators of BqsR or the presence of RhlR binding sites, were influential in setting promoter outputs. These results indicate more comprehensive models that include local regulatory contexts are needed to develop a predictive understanding of global regulatory outputs.
Affiliation(s)
- Naomi N Kreamer
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA.,Department of Chemistry, California Institute of Technology, Pasadena, CA 91125, USA
| | - Rob Phillips
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA.,Department of Applied Physics, California Institute of Technology, Pasadena, CA 91125, USA
| | - Dianne K Newman
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA.,Division of Geological and Planetary Sciences, California Institute of Technology, Pasadena, CA 91125, USA
| | - James Q Boedicker
- Department of Physics and Astronomy, University of Southern California, Los Angeles, CA 90089, USA.,Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
| |
|
39
|
Jiang P, Ludwig MZ, Kreitman M, Reinitz J. Natural variation of the expression pattern of the segmentation gene even-skipped in melanogaster. Dev Biol 2015; 405:173-81. [PMID: 26129990 PMCID: PMC4529771 DOI: 10.1016/j.ydbio.2015.06.019] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 10/29/2014] [Revised: 06/23/2015] [Accepted: 06/24/2015] [Indexed: 11/28/2022]
Abstract
The evolution of canalized traits is a central question in evolutionary biology. Natural variation in highly conserved traits can provide clues about their evolutionary potential. Here we investigate natural variation in a conserved trait, even-skipped (eve) expression at the cellular blastoderm stage of embryonic development in Drosophila melanogaster. Expression of the pair-rule gene eve was quantitatively measured in three inbred lines derived from a natural population of D. melanogaster. One line showed marked differences in the spacing, amplitude and timing of formation of the characteristic seven-striped pattern over a 50-min period prior to the onset of gastrulation. Stripe 5 amplitude and the width of the interstripe between stripes 4 and 5 were both reduced in this line, while the interstripe distance between stripes 3 and 4 was increased. Engrailed expression in stage 10 embryos revealed a statistically significant increase in the length of parasegment 6 and a decrease in the length of parasegments 8 and 9. These changes are larger than those previously reported between D. melanogaster and D. pseudoobscura, two species that are thought to have diverged from a common ancestor over 25 million years ago. This line harbors a rare 448 bp deletion in the first intron of knirps (kni). This finding suggested that reduced Kni levels caused the deviant eve expression, and indeed we observed lower levels of Kni protein at early cycle 14A in L2 compared to the other two lines. A second of the three lines displayed an approximately 20% greater level of expression for all seven eve stripes. The three lines are each viable and fertile, and none display a segmentation defect as adults, suggesting that early-acting variation in eve expression is ameliorated by developmental buffering mechanisms acting later in development.
Canalization of the segmentation pathway may reduce the fitness consequences of genetic variation, thus allowing the persistence of mutations with unexpectedly strong gene expression phenotypes.
Affiliation(s)
- Pengyao Jiang
- Department of Ecology & Evolution, University of Chicago, IL 60637, USA.
| | - Michael Z Ludwig
- Department of Ecology & Evolution, University of Chicago, IL 60637, USA; Institute for Genomics & Systems Biology, Chicago, IL 60637, USA
| | - Martin Kreitman
- Department of Ecology & Evolution, University of Chicago, IL 60637, USA; Institute for Genomics & Systems Biology, Chicago, IL 60637, USA
| | - John Reinitz
- Department of Ecology & Evolution, University of Chicago, IL 60637, USA; Institute for Genomics & Systems Biology, Chicago, IL 60637, USA; Department of Statistics, University of Chicago, IL 60637, USA; Department of Molecular Genetics and Cell Biology, University of Chicago, IL 60637, USA
| |
|
40
|
Ma X, Ezer D, Navarro C, Adryan B. Reliable scaling of position weight matrices for binding strength comparisons between transcription factors. BMC Bioinformatics 2015; 16:265. [PMID: 26289072 PMCID: PMC4545934 DOI: 10.1186/s12859-015-0666-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/17/2015] [Accepted: 07/08/2015] [Indexed: 01/05/2023]
Abstract
Background: Scoring DNA sequences against Position Weight Matrices (PWMs) is a widely adopted method to identify putative transcription factor binding sites. While common bioinformatics tools produce scores that can reflect the binding strength between a specific transcription factor and the DNA, these scores are not directly comparable between different transcription factors. Other methods, including p-value associated approaches (Touzet H, Varré J-S. Efficient and accurate p-value computation for position weight matrices. Algorithms Mol Biol. 2007;2:15), provide more rigorous ways to identify potential binding sites, but their results are difficult to interpret in terms of binding energy, which is essential for the modeling of transcription factor binding dynamics and enhancer activities. Results: Here, we provide two different ways to find the scaling parameter λ that allows us to infer binding energy from a PWM score. The first approach uses a PWM and background genomic sequence as input to estimate λ for a specific transcription factor, which we applied to show that λ distributions for different transcription factor families correspond with their DNA binding properties. Our second method can reliably convert λ between different PWMs of the same transcription factor, which allows us to directly compare PWMs that were generated by different approaches. Conclusion: These two approaches provide computationally efficient ways to scale PWM scores and estimate the strength of transcription factor binding sites in quantitative studies of binding dynamics. Their results are consistent with each other and with previous reports in most cases.
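To make the role of the scaling parameter λ concrete, the sketch below scores a candidate site against a log-odds PWM and converts the score to a relative binding energy. The helper names, the pseudocount, the uniform background, and especially the convention E = -score / λ are assumptions for illustration; the abstract does not give the exact relation, and the paper's methods for estimating λ are not reproduced here.

```python
import math

BASES = "ACGT"


def log_odds_pwm(counts, background=None, pseudocount=1.0):
    """Build a log-odds PWM from per-position base counts (list of dicts)."""
    background = background or {b: 0.25 for b in BASES}
    pwm = []
    for col in counts:
        total = sum(col.get(b, 0) for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log(((col.get(b, 0) + pseudocount) / total) / background[b])
            for b in BASES
        })
    return pwm


def pwm_score(pwm, site):
    """Sum the log-odds contributions of each base in the site."""
    if len(site) != len(pwm):
        raise ValueError("site length must match PWM length")
    return sum(col[base] for col, base in zip(pwm, site))


def relative_energy(score, lam):
    """Relative binding energy under the assumed convention E = -score / lam."""
    return -score / lam
```

For example, with counts that favor the motif `ACG`, `pwm_score(pwm, "ACG")` exceeds `pwm_score(pwm, "TTT")`, and for any positive λ the favored site receives the lower (more favorable) relative energy.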
Affiliation(s)
- Xiaoyan Ma
- Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK.,Cambridge Systems Biology Center, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK.
| | - Daphne Ezer
- Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK.,Cambridge Systems Biology Center, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK.
| | - Carmen Navarro
- Cambridge Systems Biology Center, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK.,Department of Computer Science and Artificial Intelligence, University of Granada, Periodista Daniel Saucedo Aranda, Granada, Spain.
| | - Boris Adryan
- Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK.,Cambridge Systems Biology Center, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK.
| |
|
41
|
Duque T, Sinha S. What does it take to evolve an enhancer? A simulation-based study of factors influencing the emergence of combinatorial regulation. Genome Biol Evol 2015; 7:1415-31. [PMID: 25956793 PMCID: PMC4494070 DOI: 10.1093/gbe/evv080] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 11/13/2022]
Abstract
There is widespread interest today in understanding enhancers, which are regulatory elements typically harboring several transcription factor binding sites and mediating the combinatorial effect of transcription factors on gene expression. The evolution of enhancers poses interesting unanswered questions, for example, the evolutionary time taken for a typical enhancer to emerge or the factors shaping its evolution. Existing approaches to cis-regulatory evolution have often ignored the combinatorial nature and varied biochemical mechanisms of gene regulation encoded in enhancers. We report on our investigation of enhancer evolution through the use of PEBCRES, a framework for evolutionary simulation of enhancers that employs a mechanistic and well-supported sequence-to-expression model to assign fitness to the evolving enhancer genotype. We estimated the time necessary to evolve, from genomic background, enhancers capable of driving complex gene expression patterns similar to those involved in early development in Drosophila. We found the time-to-evolve to range between 0.5 and 10 Myr, and to vary greatly with the target expression pattern, complexity of the real enhancer known to encode that pattern, and the strength of input from specific transcription factors. To our knowledge, this is the first estimate of waiting times for realistic enhancers to evolve. The in silico evolved enhancers had, with a few interesting exceptions, site compositions similar to those seen in real enhancers for the same patterns. Our simulations also revealed that certain features of an enhancer might evolve not due to their biological function but as aids to the evolutionary process itself.
Affiliation(s)
- Thyago Duque
  - Department of Computer Science, University of Illinois at Urbana-Champaign
- Saurabh Sinha
  - Department of Computer Science, University of Illinois at Urbana-Champaign
  - Institute for Genomic Biology, University of Illinois at Urbana-Champaign
|
42
|
Staller MV, Fowlkes CC, Bragdon MDJ, Wunderlich Z, Estrada J, DePace AH. A gene expression atlas of a bicoid-depleted Drosophila embryo reveals early canalization of cell fate. Development 2015; 142:587-96. [PMID: 25605785 PMCID: PMC4302997 DOI: 10.1242/dev.117796] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 09/11/2014] [Accepted: 12/01/2014] [Indexed: 01/31/2023]
Abstract
In developing embryos, gene regulatory networks drive cells towards discrete terminal fates, a process called canalization. We studied the behavior of the anterior-posterior segmentation network in Drosophila melanogaster embryos by depleting a key maternal input, bicoid (bcd), and measuring gene expression patterns of the network at cellular resolution. This method results in a gene expression atlas containing the levels of mRNA or protein expression of 13 core patterning genes over six time points for every cell of the blastoderm embryo. This is the first cellular resolution dataset of a genetically perturbed Drosophila embryo that captures all cells in 3D. We describe the technical developments required to build this atlas and how the method can be employed and extended by others. We also analyze this novel dataset to characterize the degree and timing of cell fate canalization in the segmentation network. We find that in two layers of this gene regulatory network, following depletion of bcd, individual cells rapidly canalize towards normal cell fates. This result supports the hypothesis that the segmentation network directly canalizes cell fate, rather than an alternative hypothesis whereby cells are initially mis-specified and later eliminated by apoptosis. Our gene expression atlas provides a high resolution picture of a classic perturbation and will enable further computational modeling of canalization and gene regulation in this transcriptional network.
Affiliation(s)
- Max V Staller
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Charless C Fowlkes
  - Department of Computer Science, University of California Irvine, Irvine, CA 92697, USA
- Meghan D J Bragdon
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Zeba Wunderlich
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Javier Estrada
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Angela H DePace
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
|
43
|
Shadow enhancers enable Hunchback bifunctionality in the Drosophila embryo. Proc Natl Acad Sci U S A 2015; 112:785-90. [PMID: 25564665 DOI: 10.1073/pnas.1413877112] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Indexed: 12/22/2022]
Abstract
Hunchback (Hb) is a bifunctional transcription factor that activates and represses distinct enhancers. Here, we investigate the hypothesis that Hb can activate and repress the same enhancer. Computational models predicted that Hb bifunctionally regulates the even-skipped (eve) stripe 3+7 enhancer (eve3+7) in Drosophila blastoderm embryos. We measured and modeled eve expression at cellular resolution under multiple genetic perturbations and found that the eve3+7 enhancer could not explain endogenous eve stripe 7 behavior. Instead, we found that eve stripe 7 is controlled by two enhancers: the canonical eve3+7 and a sequence encompassing the minimal eve stripe 2 enhancer (eve2+7). Hb bifunctionally regulates eve stripe 7, but it executes these two activities on different pieces of regulatory DNA--it activates the eve2+7 enhancer and represses the eve3+7 enhancer. These two "shadow enhancers" use different regulatory logic to create the same pattern.
|
44
|
Naturally occurring deletions of hunchback binding sites in the even-skipped stripe 3+7 enhancer. PLoS One 2014; 9:e91924. [PMID: 24786295 PMCID: PMC4006794 DOI: 10.1371/journal.pone.0091924] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 09/19/2013] [Accepted: 02/18/2014] [Indexed: 11/23/2022]
Abstract
Changes in regulatory DNA contribute to phenotypic differences within and between taxa. Comparative studies show that many transcription factor binding sites (TFBS) are conserved between species, whereas functional studies reveal that some mutations segregating within species alter TFBS function. Consistently, in this analysis of 13 regulatory elements in Drosophila melanogaster populations, single base and insertion/deletion polymorphisms are rare in characterized regulatory elements. Experimentally defined TFBS are nearly devoid of segregating mutations and, as has been shown before, are quite conserved. For instance, 8 of 11 Hunchback binding sites in the stripe 3+7 enhancer of even-skipped are conserved between D. melanogaster and Drosophila virilis. Oddly, we found a 72 bp deletion that removes one of these binding sites (Hb8), segregating within D. melanogaster. Furthermore, a 45 bp deletion polymorphism in the spacer between the stripe 3+7 and stripe 2 enhancers removes another predicted Hunchback site. These two deletions are separated by ∼250 bp, sit on distinct haplotypes, and segregate at appreciable frequency. The Hb8Δ is at 5 to 35% frequency in the New World, but also shows cosmopolitan distribution. There is depletion of sequence variation on the Hb8Δ-carrying haplotype. Quantitative genetic tests indicate that Hb8Δ affects developmental time, but not viability of offspring. The Eve expression pattern differs between inbred lines, but the stripe 3 and 7 boundaries seem unaffected by Hb8Δ. The data reveal segregating variation in regulatory elements, which may reflect evolutionary turnover of characterized TFBS due to drift or co-evolution.
|
45
|
Samee MAH, Sinha S. Quantitative modeling of a gene's expression from its intergenic sequence. PLoS Comput Biol 2014; 10:e1003467. [PMID: 24604095 PMCID: PMC3945089 DOI: 10.1371/journal.pcbi.1003467] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 12/26/2012] [Accepted: 12/18/2013] [Indexed: 11/18/2022]
Abstract
Modeling a gene's expression from its intergenic locus and trans-regulatory context is a fundamental goal in computational biology. Owing to the distributed nature of cis-regulatory information and the poorly understood mechanisms that integrate such information, gene locus modeling is a more challenging task than modeling individual enhancers. Here we report the first quantitative model of a gene's expression pattern as a function of its locus. We model the expression readout of a locus in two tiers: 1) combinatorial regulation by transcription factors bound to each enhancer is predicted by a thermodynamics-based model and 2) independent contributions from multiple enhancers are linearly combined to fit the gene expression pattern. The model does not require any prior knowledge about enhancers contributing toward a gene's expression. We demonstrate that the model captures the complex multi-domain expression patterns of anterior-posterior patterning genes in the early Drosophila embryo. Altogether, we model the expression patterns of 27 genes; these include several gap genes, pair-rule genes, and anterior, posterior, trunk, and terminal genes. We find that the model-selected enhancers for each gene overlap strongly with its experimentally characterized enhancers. Our findings also suggest the presence of sequence-segments in the locus that would contribute ectopic expression patterns and hence were "shut down" by the model. We applied our model to identify the transcription factors responsible for forming the stripe boundaries of the studied genes. The resulting network of regulatory interactions exhibits a high level of agreement with known regulatory influences on the target genes. Finally, we analyzed whether and why our assumption of enhancer independence was necessary for the genes we studied. We found a deterioration of expression when binding sites in one enhancer were allowed to influence the readout of another enhancer. Thus, interference between enhancer activities was a possible factor necessitating enhancer independence in our model.
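The two-tier idea in this abstract (a thermodynamics-based readout per enhancer, then a linear combination across enhancers) can be illustrated with a toy sketch. This is not the authors' model: the single-activator, single-repressor readout, the function names `site_occupancy`, `enhancer_output`, and `locus_expression`, the binding constants, and the weights are all assumptions chosen for illustration.

```python
def site_occupancy(conc, k):
    """Equilibrium occupancy of one binding site: K[TF] / (1 + K[TF])."""
    return k * conc / (1.0 + k * conc)


def enhancer_output(activator, repressor, k_act=1.0, k_rep=1.0):
    """Tier 1 (toy): an enhancer is active when its activator site is bound
    and its repressor site is free. Hypothetical one-site-each simplification."""
    return site_occupancy(activator, k_act) * (1.0 - site_occupancy(repressor, k_rep))


def locus_expression(enhancer_profiles, weights):
    """Tier 2: independent enhancer outputs are linearly combined per position."""
    n = len(enhancer_profiles[0])
    return [sum(w * p[i] for w, p in zip(weights, enhancer_profiles)) for i in range(n)]


# Made-up TF concentrations at three positions along an axis.
activator = [3.0, 1.0, 0.0]
repressor = [0.0, 1.0, 3.0]

# Two hypothetical enhancers that read the same two TFs with swapped roles.
anterior = [enhancer_output(a, r) for a, r in zip(activator, repressor)]
posterior = [enhancer_output(r, a) for a, r in zip(activator, repressor)]

# Illustrative weights; in a fitted model these would be estimated from data.
expression = locus_expression([anterior, posterior], [1.0, 0.5])
print(expression)
```

Because each enhancer's readout is computed without reference to the other, binding sites in one enhancer cannot influence the other's contribution, which is the independence assumption the abstract discusses.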
Affiliation(s)
- Md. Abul Hassan Samee
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Saurabh Sinha
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
  - Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
|
46
|
Martinez C, Rest JS, Kim AR, Ludwig M, Kreitman M, White K, Reinitz J. Ancestral resurrection of the Drosophila S2E enhancer reveals accessible evolutionary paths through compensatory change. Mol Biol Evol 2014; 31:903-16. [PMID: 24408913 DOI: 10.1093/molbev/msu042] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 11/12/2022]
Abstract
Upstream regulatory sequences that control gene expression evolve rapidly, yet the expression patterns and functions of most genes are typically conserved. To address this paradox, we have reconstructed computationally and resurrected in vivo the cis-regulatory regions of the ancestral Drosophila eve stripe 2 element and evaluated its evolution using a mathematical model of promoter function. Our feed-forward transcriptional model predicts gene expression patterns directly from enhancer sequence. We used this functional model along with phylogenetics to generate a set of possible ancestral eve stripe 2 sequences for the common ancestors of 1) D. simulans and D. sechellia; 2) D. melanogaster, D. simulans, and D. sechellia; and 3) D. erecta and D. yakuba. These ancestral sequences were synthesized and resurrected in vivo. Using a combination of quantitative and computational analysis, we find clear support for functional compensation between the binding sites for Bicoid, Giant, and Krüppel over the course of 40-60 My of Drosophila evolution. We show that this compensation is driven by a coupling interaction between Bicoid activation and repression at the anterior and posterior border necessary for proper placement of the anterior stripe 2 border. A multiplicity of mechanisms for binding site turnover, exemplified by Bicoid, Giant, and Krüppel sites, explains how rapid sequence change may occur while maintaining the function of the cis-regulatory element.
Affiliation(s)
- Carlos Martinez
  - Institute for Genomics and Systems Biology, University of Chicago
|
47
|
Ilsley GR, Fisher J, Apweiler R, DePace AH, Luscombe NM. Cellular resolution models for even skipped regulation in the entire Drosophila embryo. eLife 2013; 2:e00522. [PMID: 23930223 PMCID: PMC3736529 DOI: 10.7554/elife.00522] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 01/07/2013] [Accepted: 06/17/2013] [Indexed: 12/14/2022]
Abstract
Transcriptional control ensures genes are expressed in the right amounts at the correct times and locations. Understanding quantitatively how regulatory systems convert input signals to appropriate outputs remains a challenge. For the first time, we successfully model even skipped (eve) stripes 2 and 3+7 across the entire fly embryo at cellular resolution. A straightforward statistical relationship explains how transcription factor (TF) concentrations define eve's complex spatial expression, without the need for pairwise interactions or cross-regulatory dynamics. Simulating thousands of TF combinations, we recover known regulators and suggest new candidates. Finally, we accurately predict the intricate effects of perturbations including TF mutations and misexpression. Our approach imposes minimal assumptions about regulatory function; instead we infer underlying mechanisms from models that best fit the data, like the lack of TF-specific thresholds and the positional value of homotypic interactions. Our study provides a general and quantitative method for elucidating the regulation of diverse biological systems. DOI:http://dx.doi.org/10.7554/eLife.00522.001.
Affiliation(s)
- Garth R Ilsley
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
  - Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Jasmin Fisher
  - Microsoft Research Cambridge, Cambridge, United Kingdom
  - Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Rolf Apweiler
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
- Angela H DePace
  - Department of Systems Biology, Harvard Medical School, Boston, United States
- Nicholas M Luscombe
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
  - Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
  - UCL Genetics Institute, Department of Genetics, Evolution, and Environment, University College London, London, United Kingdom
  - London Research Institute, Cancer Research UK, London, United Kingdom
|
48
|
Dresch JM, Richards M, Ay A. A primer on thermodynamic-based models for deciphering transcriptional regulatory logic. Biochimica et Biophysica Acta - Gene Regulatory Mechanisms 2013; 1829:946-53. [PMID: 23643643 DOI: 10.1016/j.bbagrm.2013.04.011] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 12/26/2012] [Revised: 04/24/2013] [Accepted: 04/25/2013] [Indexed: 11/27/2022]
Abstract
A rigorous analysis of transcriptional regulation at the DNA level is crucial to the understanding of many biological systems. Mathematical modeling has offered researchers a new approach to understanding this central process. In particular, thermodynamic-based modeling represents the most biophysically informed approach aimed at connecting DNA level regulatory sequences to the expression of specific genes. The goal of this review is to give biologists a thorough description of the steps involved in building, analyzing, and implementing a thermodynamic-based model of transcriptional regulation. The data requirements for this modeling approach are described, the derivation for a specific regulatory region is shown, and the challenges and future directions for the quantitative modeling of gene regulation are discussed.
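As a minimal illustration of what a thermodynamic-based derivation looks like, consider the simplest possible case, which is an assumption made here and not the regulatory region derived in the review: a single binding site for one transcription factor with association constant $K$ and concentration $[\mathrm{TF}]$. The unbound state has statistical weight 1 and the bound state has weight $K[\mathrm{TF}]$, so the probability of finding the site bound is the bound weight divided by the sum over all states. Models of this kind generalize the same ratio to many sites by enumerating configurations $s$ with weights $W_s$ and summing over those assumed to drive expression.

```latex
% Single-site case: partition function and occupancy
Z = 1 + K[\mathrm{TF}], \qquad
P_{\text{bound}} = \frac{K[\mathrm{TF}]}{1 + K[\mathrm{TF}]}

% General form: expression is taken as proportional to the probability
% of being in an "active" configuration
P_{\text{active}} = \frac{\sum_{s \in \text{active}} W_s}{\sum_{s} W_s}
```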
|