1
Tabe-Bordbar S, Song YJ, Lunt BJ, Alavi Z, Prasanth KV, Sinha S. Mechanistic analysis of enhancer sequences in the estrogen receptor transcriptional program. Commun Biol 2024; 7:719. [PMID: 38862711 PMCID: PMC11167054 DOI: 10.1038/s42003-024-06400-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2022] [Accepted: 05/30/2024] [Indexed: 06/13/2024] Open
Abstract
Estrogen Receptor α (ERα) is a major lineage-determining transcription factor (TF) in mammary gland development. Dysregulation of the ERα-mediated transcriptional program results in cancer. Transcriptomic and epigenomic profiling of breast cancer cell lines has revealed large numbers of enhancers involved in this regulatory program, but how these enhancers encode function in their sequence remains poorly understood. A subset of ERα-bound enhancers are transcribed into short bidirectional RNA (enhancer RNA or eRNA), and this property is believed to be a reliable marker of active enhancers. We therefore analyze thousands of ERα-bound enhancers and build quantitative, mechanism-aware models to discriminate eRNA-producing enhancers from non-transcribing enhancers based on their sequence. Our thermodynamics-based models provide insights into the roles of specific TFs in the ERα-mediated transcriptional program, many of which are supported by the literature. We use in silico perturbations to predict TF-enhancer regulatory relationships and integrate these findings with experimentally determined enhancer-promoter interactions to construct a gene regulatory network. We also demonstrate that the model can prioritize breast cancer-related sequence variants while providing mechanistic explanations for their function. Finally, we experimentally validate the model-proposed mechanisms underlying three such variants.
Affiliation(s)
- Shayan Tabe-Bordbar
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- You Jin Song
  - Department of Cell and Developmental Biology, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Bryan J Lunt
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Zahra Alavi
  - Department of Physics, Loyola Marymount University, Los Angeles, CA, USA
- Kannanganattu V Prasanth
  - Department of Cell and Developmental Biology, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Saurabh Sinha
  - Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA, USA
2
Duk MA, Gursky VV, Samsonova MG, Surkova SY. Modeling the Flowering Activation Motif during Vernalization in Legumes: A Case Study of M. trancatula. Life (Basel) 2023; 14:26. [PMID: 38255642 PMCID: PMC10817331 DOI: 10.3390/life14010026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 12/04/2023] [Accepted: 12/19/2023] [Indexed: 01/24/2024] Open
Abstract
In many plant species, flowering is promoted by cold treatment, or vernalization. The mechanism of vernalization-induced flowering has been extensively studied in Arabidopsis but remains largely unknown in legumes. The orthologs of the FLC gene, a major regulator of vernalization response in Arabidopsis, are absent or non-functional in the vernalization-sensitive legume species. Nevertheless, the legume integrator genes FT and SOC1 are involved in the transition of the vernalization signal to meristem identity genes, including PIM (AP1 ortholog). However, the regulatory contribution of these genes to PIM activation in legumes remains elusive. Here, we presented theoretical and data-driven analyses of a feed-forward regulatory motif that includes a vernalization-responsive FT gene and several SOC1 genes, which independently activate PIM and thereby mediate floral transition. Our theoretical model showed that the multiple regulatory branches in this regulatory motif facilitated the elimination of nonsense signals and amplified useful signals from the upstream regulator. We further developed and analyzed four data-driven models of PIM activation in Medicago truncatula in vernalized and non-vernalized conditions in wild-type and fta1-1 mutants. The model with FTa1 providing both direct activation and indirect activation via three intermediate activators, SOC1a, SOC1b, and SOC1c, resulted in the most relevant PIM dynamics. In this model, the difference between regulatory inputs of SOC1 genes was nonessential. As a result, in the M. truncatula model, the cumulative action of SOC1a, SOC1b, and SOC1c was favored. Overall, in this study, we presented the first in silico analysis of vernalization-induced flowering in legumes. The considered vernalization network motif can be supplemented with additional regulatory branches as new experimental data become available.
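As a rough illustration of the feed-forward motif described in this abstract, the sketch below integrates a linear two-branch model in which an upstream input activates a target both directly and through one intermediate. It is not the authors' model: the function name `simulate_ffl`, the linear rate laws, and all parameter values are assumptions chosen for illustration, and a single intermediate stands in for SOC1a, SOC1b, and SOC1c.

```python
def simulate_ffl(ft_level, t_end=50.0, dt=0.01, a=1.0, b=0.5, c=0.5, d=1.0):
    """Forward-Euler integration of a toy feed-forward motif (hypothetical).

    ft_level : constant level of the upstream input (stands in for FT)
    a        : activation rate of the intermediate by the input
    b        : direct activation rate of the target by the input
    c        : indirect activation rate of the target by the intermediate
    d        : first-order decay rate shared by both products
    Returns the final (intermediate, target) levels.
    """
    soc1, pim = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        d_soc1 = a * ft_level - d * soc1
        d_pim = b * ft_level + c * soc1 - d * pim
        soc1 += dt * d_soc1
        pim += dt * d_pim
    return soc1, pim


# Compare the target level with and without the indirect branch.
_, pim_both = simulate_ffl(1.0)
_, pim_direct_only = simulate_ffl(1.0, c=0.0)
print(pim_both, pim_direct_only)
```

Under these assumed parameters the indirect branch adds to the direct one at steady state, which is the kind of cumulative action the abstract refers to; a nonlinear (for example, saturating) rate law would be needed to reproduce signal filtering.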
Affiliation(s)
- Maria A. Duk
  - Mathematical Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, 195251 St. Petersburg, Russia
  - Theoretical Department, Ioffe Institute, 194021 St. Petersburg, Russia
- Vitaly V. Gursky
  - Theoretical Department, Ioffe Institute, 194021 St. Petersburg, Russia
- Maria G. Samsonova
  - Mathematical Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, 195251 St. Petersburg, Russia
- Svetlana Yu. Surkova
  - Mathematical Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, 195251 St. Petersburg, Russia
3
Massonis G, Villaverde AF, Banga JR. Improving dynamic predictions with ensembles of observable models. Bioinformatics 2022; 39:6842325. [PMID: 36416122 PMCID: PMC9805594 DOI: 10.1093/bioinformatics/btac755] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/30/2022] [Revised: 10/20/2022] [Accepted: 11/22/2022] [Indexed: 11/24/2022] Open
Abstract
MOTIVATION: Dynamic mechanistic modelling in systems biology has been hampered by the complexity and variability associated with the underlying interactions, and by uncertain and sparse experimental measurements. Ensemble modelling, a concept initially developed in statistical mechanics, has been introduced in biological applications with the aim of mitigating those issues. Ensemble modelling uses a collection of different models compatible with the observed data to describe the phenomena of interest. However, since systems biology models often suffer from a lack of identifiability and observability, ensembles of models are particularly unreliable when predicting non-observable states.
RESULTS: We present a strategy to assess and improve the reliability of a class of model ensembles. In particular, we consider kinetic models described using ordinary differential equations with a fixed structure. Our approach builds an ensemble with a selection of the parameter vectors found when performing parameter estimation with a global optimization metaheuristic. This technique enforces diversity during the sampling of parameter space and it can quantify the uncertainty in the predictions of state trajectories. We couple this strategy with structural identifiability and observability analysis, and when these tests detect possible prediction issues we obtain model reparameterizations that surmount them. The end result is an ensemble of models with the ability to predict the internal dynamics of a biological process. We demonstrate our approach with models of glucose regulation, cell division, circadian oscillations and the JAK-STAT signalling pathway.
AVAILABILITY AND IMPLEMENTATION: The code that implements the methodology and reproduces the results is available at https://doi.org/10.5281/zenodo.6782638.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
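To make the ensemble idea concrete, here is a minimal sketch, assuming a one-state decay model and a hand-picked list of rate constants in place of the parameter vectors an optimizer would return. It shows how the spread of predictions across ensemble members gives a rough uncertainty estimate. The names `simulate` and `ensemble_prediction` and all numbers are hypothetical; the authors' actual implementation is the one at the Zenodo DOI given in the abstract.

```python
import statistics


def simulate(k, x0=1.0, t_end=5.0, dt=0.01):
    """Integrate dx/dt = -k * x with forward Euler and return x(t_end)."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        x += dt * (-k * x)
    return x


def ensemble_prediction(param_vectors, **kwargs):
    """Return (mean, sample stdev) of the prediction over all ensemble members."""
    preds = [simulate(k, **kwargs) for k in param_vectors]
    return statistics.mean(preds), statistics.stdev(preds)


# Hypothetical ensemble: rate constants that all fit some data about equally well.
ensemble = [0.45, 0.50, 0.55, 0.60]
mean, spread = ensemble_prediction(ensemble)
print(f"x(5) = {mean:.4f} +/- {spread:.4f}")
```

A real ensemble would hold full parameter vectors for a multi-state ODE model, and the spread would be reported for each state trajectory, including states that are not directly measured.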
Affiliation(s)
- Gemma Massonis
  - Computational Biology Lab, MBG-CSIC (Spanish National Research Council), Pontevedra, Galicia 36143, Spain
4
Emmert-Streib F, Yli-Harja O. What Is a Digital Twin? Experimental Design for a Data-Centric Machine Learning Perspective in Health. Int J Mol Sci 2022; 23:13149. [PMID: 36361936 PMCID: PMC9653941 DOI: 10.3390/ijms232113149] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/19/2022] [Revised: 10/25/2022] [Accepted: 10/27/2022] [Indexed: 08/08/2023] Open
Abstract
The idea of a digital twin has recently gained widespread attention. While, so far, it has been used predominantly for problems in engineering and manufacturing, it is believed that a digital twin also holds great promise for applications in medicine and health. However, a problem that severely hampers progress in these fields is the lack of a solid definition of the concept behind a digital twin that would be directly amenable to such big data-driven fields requiring statistical data analysis. In this paper, we address this problem. We will see that the term 'digital twin', as used in the literature, is like a Matryoshka doll. For this reason, we unstack the concept via a data-centric machine learning perspective, allowing us to define its main components. As a consequence, we suggest using the term Digital Twin System instead of digital twin because this highlights its complex interconnected substructure. In addition, we address ethical concerns that result from treatment suggestions for patients based on simulated data and a possible lack of explainability of the underlying models.
Affiliation(s)
- Frank Emmert-Streib
  - Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, 33100 Tampere, Finland
- Olli Yli-Harja
  - Computational Systems Biology, Faculty of Medicine and Health Technology, Tampere University, 33720 Tampere, Finland
  - Institute for Systems Biology, Seattle, WA 98195, USA
5
Oliveira SMD, Densmore D. Hardware, Software, and Wetware Codesign Environment for Synthetic Biology. BIODESIGN RESEARCH 2022; 2022:9794510. [PMID: 37850136 PMCID: PMC10521664 DOI: 10.34133/2022/9794510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2022] [Accepted: 08/10/2022] [Indexed: 10/19/2023] Open
Abstract
Synthetic biology is the process of forward engineering living systems. These systems can be used to produce biobased materials, agriculture, medicine, and energy. One approach to designing these systems is to employ techniques from the design of embedded electronics. These techniques include abstraction, standards, modularity, automated design, and formal semantic models of computation. Together, these elements form the foundation of "biodesign automation," where software, robotics, and microfluidic devices combine to create exciting biological systems of the future. This paper describes a "hardware, software, wetware" codesign vision where software tools can be made to act as "genetic compilers" that transform high-level specifications into engineered "genetic circuits" (wetware). This is followed by a process where automation equipment, well-defined experimental workflows, and microfluidic devices are explicitly designed to house, execute, and test these circuits (hardware). These systems can be used as either massively parallel experimental platforms or distributed bioremediation and biosensing devices. Next, scheduling and control algorithms (software) manage these systems' actual execution and data analysis tasks. A distinguishing feature of this approach is how all three of these aspects (hardware, software, and wetware) may be derived from the same basic specification in parallel and generated to fulfill specific cost, performance, and structural requirements.
Affiliation(s)
- Samuel M. D. Oliveira
  - Department of Electrical and Computer Engineering, Boston University, MA 02215, USA
  - Biological Design Center, Boston University, MA 02215, USA
- Douglas Densmore
  - Department of Electrical and Computer Engineering, Boston University, MA 02215, USA
  - Biological Design Center, Boston University, MA 02215, USA
6
Bhogale S, Sinha S. Thermodynamics-based modeling reveals regulatory effects of indirect transcription factor-DNA binding. iScience 2022; 25:104152. [PMID: 35465052 PMCID: PMC9018382 DOI: 10.1016/j.isci.2022.104152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2021] [Revised: 12/28/2021] [Accepted: 03/21/2022] [Indexed: 11/30/2022] Open
Abstract
Transcription factors (TFs) influence gene expression by binding to DNA, yet experimental data suggests that they also frequently bind regulatory DNA indirectly by interacting with other DNA-bound proteins. Here, we used a data modeling approach to test if such indirect binding by TFs plays a significant role in gene regulation. We first incorporated regulatory function of indirectly bound TFs into a thermodynamics-based model for predicting enhancer-driven expression from its sequence. We then fit the new model to a rich data set comprising hundreds of enhancers and their regulatory activities during mesoderm specification in Drosophila embryogenesis and showed that the newly incorporated mechanism results in significantly better agreement with data. In the process, we derived the first sequence-level model of this extensively characterized regulatory program. We further showed that allowing indirect binding of a TF explains its localization at enhancers more accurately than with direct binding only. Our model also provided a simple explanation of how a TF may switch between activating and repressive roles depending on context.
- Inclusion of indirect DNA binding of transcription factor improves enhancer function prediction
- Context specific activating or repressive roles of TFs
- Indirect binding improves fits to experimental TF-DNA binding data
- Role of Tinman depends on its DNA-binding mode (direct or indirect)
7
Gaiewski MJ, Drewell RA, Dresch JM. Fitting thermodynamic-based models: Incorporating parameter sensitivity improves the performance of an evolutionary algorithm. Math Biosci 2021; 342:108716. [PMID: 34687735 DOI: 10.1016/j.mbs.2021.108716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2021] [Revised: 09/10/2021] [Accepted: 09/17/2021] [Indexed: 11/30/2022]
Abstract
A detailed comprehension of transcriptional regulation is critical to understanding the genetic control of development and disease across many different organisms. To more fully investigate the complex molecular interactions controlling the precise expression of genes, many groups have constructed mathematical models to complement their experimental approaches. A critical step in such studies is choosing the most appropriate parameter estimation algorithm to enable detailed analysis of the parameters that contribute to the models. In this study, we develop a novel set of evolutionary algorithms that use a pseudo-random Sobol Set to construct the initial population and incorporate parameter sensitivities into the adaptation of mutation rates, using local, global, and hybrid strategies. Comparison of the performance of these new algorithms to a number of current state-of-the-art global parameter estimation algorithms on a range of continuous test functions, as well as synthetic biological data representing models of gene regulatory systems, reveals improved performance of the new algorithms in terms of runtime, error and reproducibility. In addition, by analyzing the ability of these algorithms to fit datasets of varying quality, we provide the experimentalist with a guide to how the algorithms perform across a range of noisy data. These results demonstrate the improved performance of the new set of parameter estimation algorithms and facilitate meaningful integration of model parameters and predictions in our understanding of the molecular mechanisms of gene regulation.
Affiliation(s)
- Michael J Gaiewski
  - Department of Mathematics and Computer Science, Clark University, Worcester, MA, USA
  - Department of Mathematics, University of Connecticut, Storrs, CT, USA
8
Dibaeinia P, Sinha S. Deciphering enhancer sequence using thermodynamics-based models and convolutional neural networks. Nucleic Acids Res 2021; 49:10309-10327. [PMID: 34508359 PMCID: PMC8501998 DOI: 10.1093/nar/gkab765] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/16/2021] [Revised: 08/18/2021] [Accepted: 08/25/2021] [Indexed: 11/18/2022] Open
Abstract
Deciphering the sequence-function relationship encoded in enhancers holds the key to interpreting non-coding variants and understanding mechanisms of transcriptomic variation. Several quantitative models exist for predicting enhancer function and underlying mechanisms; however, there has been no systematic comparison of these models characterizing their relative strengths and shortcomings. Here, we interrogated a rich data set of neuroectodermal enhancers in Drosophila, representing cis- and trans- sources of expression variation, with a suite of biophysical and machine learning models. We performed rigorous comparisons of thermodynamics-based models implementing different mechanisms of activation, repression and cooperativity. Moreover, we developed a convolutional neural network (CNN) model, called CoNSEPT, that learns enhancer 'grammar' in an unbiased manner. CoNSEPT is the first general-purpose CNN tool for predicting enhancer function in varying conditions, such as different cell types and experimental conditions, and we show that such complex models can suggest interpretable mechanisms. We found model-based evidence for mechanisms previously established for the studied system, including cooperative activation and short-range repression. The data also favored one hypothesized activation mechanism over another and suggested an intriguing role for a direct, distance-independent repression mechanism. Our modeling shows that while fundamentally different models can yield similar fits to data, they vary in their utility for mechanistic inference. CoNSEPT is freely available at: https://github.com/PayamDiba/CoNSEPT.
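For readers unfamiliar with thermodynamics-based models, the following minimal sketch shows the standard equilibrium calculation of site occupancy, with an optional cooperativity factor between two sites. It is a textbook-style illustration, not the model family evaluated in the paper: the function names, the two-site setup, and the parameter `omega` are assumptions made here for illustration.

```python
def occupancy_single(conc, k_assoc):
    """Equilibrium probability that one site is bound.

    The bound state has statistical weight q = conc * k_assoc relative to
    the empty site, so P(bound) = q / (1 + q).
    """
    q = conc * k_assoc
    return q / (1.0 + q)


def occupancy_pair(conc_a, k_a, conc_b, k_b, omega=1.0):
    """Probability that site A is bound when a second site B is present.

    The four configurations (empty, A only, B only, both) have weights
    1, q_a, q_b and omega * q_a * q_b; omega > 1 models cooperative binding.
    """
    q_a = conc_a * k_a
    q_b = conc_b * k_b
    z = 1.0 + q_a + q_b + omega * q_a * q_b  # partition function
    return (q_a + omega * q_a * q_b) / z


# With omega = 1 the neighbouring site has no effect; with omega > 1 it helps.
print(occupancy_pair(1.0, 1.0, 1.0, 1.0, omega=1.0))
print(occupancy_pair(1.0, 1.0, 1.0, 1.0, omega=10.0))
```

Sequence-based models of this kind typically derive the association constants from motif match scores and then map occupancies to expression; those steps are omitted here.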
Affiliation(s)
- Payam Dibaeinia
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Saurabh Sinha
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
  - Cancer Center at Illinois, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
9
Garbuzov FE, Gursky VV. Nonequilibrium model of short-range repression in gene transcription regulation. Phys Rev E 2021; 104:014407. [PMID: 34412298 DOI: 10.1103/physreve.104.014407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2020] [Accepted: 06/24/2021] [Indexed: 11/07/2022]
Abstract
Transcription factors are proteins that regulate gene activity by activating or repressing gene transcription. A special class of transcriptional repressors operates via a short-range mechanism, making local DNA regions inaccessible to binding by activators, and thus providing an indirect repressive action on the target gene. This mechanism is commonly modeled assuming that repressors interact with DNA under thermodynamic equilibrium and neglecting some configurations of the gene regulatory region. We elaborate on a more general nonequilibrium model of short-range repression using the graph formalism for transitions between gene states, and we apply analytical calculations to compare it with the equilibrium model in terms of the repression strength and expression noise. In contrast to the equilibrium approach, the new model allows us to separate two basic mechanisms of short-range repression. The first mechanism is associated with the recruiting of factors that mediate chromatin condensation, and the second one concerns the blocking of factors that mediate chromatin loosening. The nonequilibrium model demonstrates better performance on previously published gene expression data obtained for transcription factors controlling Drosophila development, and furthermore it predicts that the first repression mechanism is the most favorable in this system. The presented approach can be scaled to larger gene networks and can be used to infer specific modes and parameters of transcriptional regulation from gene expression data.
Affiliation(s)
- F E Garbuzov
  - Ioffe Institute, 26 Polytekhnicheskaya, St. Petersburg 194021, Russia
- V V Gursky
  - Ioffe Institute, 26 Polytekhnicheskaya, St. Petersburg 194021, Russia
10
Gautam P, Kumar Sinha S. Anticipating response function in gene regulatory networks. J R Soc Interface 2021; 18:20210206. [PMID: 34062105 DOI: 10.1098/rsif.2021.0206] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/23/2022] Open
Abstract
The origin of an ordered genetic response of a complex and noisy biological cell is intimately related to the detailed mechanism of protein-DNA interactions present in a wide variety of gene regulatory (GR) systems. However, the quantitative prediction of genetic response and the correlation between the mechanism and the response curve are poorly understood. Here, we report in silico binding studies of GR systems to show that the transcription factor (TF) binds to multiple DNA sites with high cooperativity, spreads from specific binding sites into adjacent non-specific DNA, and bends the DNA. Our analysis is not limited to the isolated model system but can also be applied to a system containing multiple interacting genes. The controlling role of TF oligomerization, TF-ligand interactions, and DNA looping for gene expression has also been characterized. The predictions are validated against detailed grand canonical Monte Carlo simulations and published data for the lac operon system. Overall, our study reveals that the expression of target genes can be quantitatively controlled by modulating TF-ligand interactions and the bending energy of DNA.
Affiliation(s)
- Pankaj Gautam
  - Theoretical and Computational Biophysical Chemistry Group, Department of Chemistry, Indian Institute of Technology, Ropar 140001, India
- Sudipta Kumar Sinha
  - Theoretical and Computational Biophysical Chemistry Group, Department of Chemistry, Indian Institute of Technology, Ropar 140001, India
11
Pavlinova P, Samsonova MG, Gursky VV. Dynamical Modeling of the Core Gene Network Controlling Transition to Flowering in Pisum sativum. Front Genet 2021; 12:614711. [PMID: 33777095 PMCID: PMC7990781 DOI: 10.3389/fgene.2021.614711] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/06/2020] [Accepted: 01/28/2021] [Indexed: 11/29/2022] Open
Abstract
Transition to flowering is an important stage of plant development. Many regulatory modules that control floral transition are conserved across plants. This process is best studied for the model plant Arabidopsis thaliana. The homologues of Arabidopsis genes responsible for flowering initiation in legumes have been identified, and available data on their expression provide a good basis for gene network modeling. In this study, we developed several dynamical models of a gene network controlling transition to flowering in pea (Pisum sativum) using two different approaches. We used differential equations for modeling a previously proposed gene regulation scheme of floral initiation in pea and tested possible alternative hypotheses about some of the regulatory interactions. As the second approach, we applied neural networks to infer interactions between genes in the network directly from gene expression data. All models were verified on previously published experimental data on the dynamic expression of the main genes in the wild type and in three mutant genotypes. Based on modeling results, we made conclusions about the functionality of the previously proposed interactions in the gene network and about the influence of different growing conditions on the network architecture. It was shown that regulation of the PIM, FTa1, and FTc genes in pea does not correspond to the previously proposed hypotheses. The modeling suggests that short- and long-day growing conditions are characterized by different gene network architectures. Overall, the results obtained can be used to plan new experiments and create more accurate models to study flowering initiation in pea and, in a broader context, in legumes.
Affiliation(s)
- Polina Pavlinova
  - Mathematical Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
- Maria G Samsonova
  - Mathematical Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
- Vitaly V Gursky
  - Theoretical Department, Ioffe Institute, Saint Petersburg, Russia
12
Abstract
Large amounts of effort have been invested in trying to understand how a single genome is able to specify the identity of hundreds of cell types. Inspired by some aspects of Caenorhabditis elegans biology, we implemented an in silico evolutionary strategy to produce gene regulatory networks (GRNs) that drive cell-specific gene expression patterns, mimicking the process of terminal cell differentiation. Dynamics of the gene regulatory networks are governed by a thermodynamic model of gene expression, which uses DNA sequences and transcription factor degenerate position weight matrices as input. In a version of the model, we included chromatin accessibility. Experimentally, it has been determined that cell-specific and broadly expressed genes are regulated differently. In our in silico evolved GRNs, broadly expressed genes are regulated very redundantly and the architecture of their cis-regulatory modules is different, in accordance with what has been found in C. elegans and also in other systems. Finally, we found differences in topological positions in GRNs between these two classes of genes, which help to explain why broadly expressed genes are so resilient to mutations. Overall, our results offer an explanatory hypothesis on why broadly expressed genes are regulated so redundantly compared to cell-specific genes, which can be extrapolated to phenomena such as ChIP-seq HOT regions.
Affiliation(s)
- Carlos Mora-Martinez
  - Evo-devo Helsinki community, Centre of Excellence in Experimental and Computational Developmental Biology, Institute of Biotechnology, University of Helsinki, Helsinki, Finland
13
Makashov AA, Myasnikova EM, Spirov AV. Fuzzy Linguistic Modeling of the Regulation of Drosophila Segmentation Genes. Biophysics (Nagoya-shi) 2021. [DOI: 10.1134/s0006350921010073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
14
Eck E, Liu J, Kazemzadeh-Atoufi M, Ghoreishi S, Blythe SA, Garcia HG. Quantitative dissection of transcription in development yields evidence for transcription-factor-driven chromatin accessibility. eLife 2020; 9:e56429. [PMID: 33074101 PMCID: PMC7738189 DOI: 10.7554/elife.56429] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 02/27/2020] [Accepted: 10/16/2020] [Indexed: 12/28/2022] Open
Abstract
Thermodynamic models of gene regulation can predict transcriptional regulation in bacteria, but in eukaryotes, chromatin accessibility and energy expenditure may call for a different framework. Here, we systematically tested the predictive power of models of DNA accessibility based on the Monod-Wyman-Changeux (MWC) model of allostery, which posits that chromatin fluctuates between accessible and inaccessible states. We dissected the regulatory dynamics of hunchback by the activator Bicoid and the pioneer-like transcription factor Zelda in living Drosophila embryos and showed that no thermodynamic or non-equilibrium MWC model can recapitulate hunchback transcription. Therefore, we explored a model where DNA accessibility is not the result of thermal fluctuations but is catalyzed by Bicoid and Zelda, possibly through histone acetylation, and found that this model can predict hunchback dynamics. Thus, our theory-experiment dialogue uncovered potential molecular mechanisms of transcriptional regulatory dynamics, a key step toward reaching a predictive understanding of developmental decision-making.
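The two-state picture mentioned in this abstract can be sketched with the generic MWC form, in which a factor binds only the accessible state and thereby shifts the equilibrium toward it. This is an assumed textbook simplification, not the specific models tested in the paper; the function name `p_accessible` and its parameters are hypothetical.

```python
def p_accessible(conc, k_d, n_sites, l_inaccessible):
    """Probability of the accessible state in a generic MWC-style model.

    conc           : transcription factor concentration
    k_d            : dissociation constant for each site (same units as conc)
    n_sites        : number of identical, independent binding sites
    l_inaccessible : statistical weight of the inaccessible state relative
                     to the empty accessible state (binding assumed impossible
                     in the inaccessible state)
    """
    accessible_weight = (1.0 + conc / k_d) ** n_sites
    return accessible_weight / (l_inaccessible + accessible_weight)


# Accessibility rises with concentration, more steeply with more sites.
for c in (0.0, 1.0, 2.0, 4.0):
    print(c, round(p_accessible(c, k_d=1.0, n_sites=3, l_inaccessible=9.0), 3))
```

In this equilibrium form, accessibility is driven only by thermal fluctuations and binding energies; the abstract reports that models of this class could not recapitulate the measured hunchback transcription.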
Affiliation(s)
- Elizabeth Eck
  - Biophysics Graduate Group, University of California at Berkeley, Berkeley, United States
- Jonathan Liu
  - Department of Physics, University of California at Berkeley, Berkeley, United States
- Sydney Ghoreishi
  - Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, United States
- Shelby A Blythe
  - Department of Molecular Biosciences, Northwestern University, Evanston, United States
- Hernan G Garcia
  - Biophysics Graduate Group, University of California at Berkeley, Berkeley, United States
  - Department of Physics, University of California at Berkeley, Berkeley, United States
  - Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, United States
  - Institute for Quantitative Biosciences-QB3, University of California at Berkeley, Berkeley, United States
15
Abstract
Terminal regions of the early Drosophila embryo are patterned by the highly conserved ERK cascade, giving rise to the nonsegmented terminal structures of the future larva. In less than an hour, this signaling event establishes several gene expression boundaries and sets in motion a sequence of elaborate morphogenetic events. Genetic studies of terminal patterning discovered signaling components and transcription factors that are involved in numerous developmental contexts and deregulated in human diseases. This review summarizes current understanding of signaling and morphogenesis during terminal patterning and discusses several open questions that can now be rigorously investigated using live imaging, omics, and optogenetic approaches. The anatomical simplicity of the terminal patterning system and its amenability to a broad range of increasingly sophisticated genetic perturbations will continue to make it a premier quantitative model for studying multiple aspects of tissue patterning by dynamically controlled cell signaling pathways.
16
Garcia HG, Berrocal A, Kim YJ, Martini G, Zhao J. Lighting up the central dogma for predictive developmental biology. Curr Top Dev Biol 2019; 137:1-35. [PMID: 32143740 DOI: 10.1016/bs.ctdb.2019.10.010] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 12/13/2022]
Abstract
Although the last 30 years have witnessed the mapping of the wiring diagrams of the gene regulatory networks that dictate cell fate and animal body plans, specific understanding building on such network diagrams that shows how DNA regulatory regions control gene expression lags far behind. These networks have yet to yield the predictive power necessary to, for example, calculate how the concentration dynamics of input transcription factors and DNA regulatory sequence prescribe output patterns of gene expression that, in turn, determine body plans themselves. Here, we argue that reaching a predictive understanding of developmental decision-making calls for an interplay between theory and experiment aimed at revealing how the regulation of the processes of the central dogma dictates network connections and how network topology guides cells toward their ultimate developmental fate. To make this possible, it is crucial to break free from the snapshot-based understanding of embryonic development facilitated by fixed-tissue approaches and embrace new technologies that capture the dynamics of developmental decision-making at the single cell level, in living embryos.
Affiliation(s)
- Hernan G Garcia
- Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA, United States; Department of Physics, University of California at Berkeley, Berkeley, CA, United States; Biophysics Graduate Group, University of California at Berkeley, Berkeley, CA, United States; Quantitative Biosciences-QB3, University of California at Berkeley, Berkeley, CA, United States.
- Augusto Berrocal
- Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA, United States
- Yang Joon Kim
- Biophysics Graduate Group, University of California at Berkeley, Berkeley, CA, United States
- Gabriella Martini
- Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA, United States
- Jiaxi Zhao
- Department of Physics, University of California at Berkeley, Berkeley, CA, United States
17
Wan C, Chang W, Zhang Y, Shah F, Lu X, Zang Y, Zhang A, Cao S, Fishel ML, Ma Q, Zhang C. LTMG: a novel statistical modeling of transcriptional expression states in single-cell RNA-Seq data. Nucleic Acids Res 2019; 47:e111. [PMID: 31372654 PMCID: PMC6765121 DOI: 10.1093/nar/gkz655] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 06/08/2019] [Revised: 07/11/2019] [Accepted: 07/16/2019] [Indexed: 12/14/2022] Open
Abstract
A key challenge in modeling single-cell RNA-seq data is to capture the diversity of gene expression states regulated by different transcriptional regulatory inputs across individual cells, which is further complicated by the large number of observed zero and low expression values. We developed a left truncated mixture Gaussian (LTMG) model from the kinetic relationships among transcriptional regulatory inputs, mRNA metabolism, and mRNA abundance in single cells. LTMG infers the multimodality of expression across single cells while treating dropouts and low expression values as left truncated. We demonstrated that LTMG has significantly better goodness of fit on an extensive number of scRNA-seq data sets, compared with three other state-of-the-art models. Our biological assumption about low non-zero expression values, the rationality of the multimodality setting, and the capability of LTMG to extract expression states specific to cell types or functions are validated on independent experimental data sets. A differential gene expression test and a co-regulation module identification method are further developed. We experimentally validated that our differential expression test has higher sensitivity and specificity than five other popular methods. The co-regulation analysis is capable of retrieving gene co-regulation modules corresponding to perturbed transcriptional regulations. A user-friendly R package with all the analysis power is available at https://github.com/zy26/LTMGSCA.
Affiliation(s)
- Changlin Wan
- Department of Medical and Molecular Genetics, Indiana University, School of Medicine, Indianapolis, IN 46202, USA
- Department of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907, USA
- Department of Electrical and Computer Engineering, Purdue University, Indianapolis, IN 46202, USA
- Wennan Chang
- Department of Medical and Molecular Genetics, Indiana University, School of Medicine, Indianapolis, IN 46202, USA
- Department of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907, USA
- Department of Electrical and Computer Engineering, Purdue University, Indianapolis, IN 46202, USA
- Yu Zhang
- Department of Medical and Molecular Genetics, Indiana University, School of Medicine, Indianapolis, IN 46202, USA
- Colleges of Computer Science and Technology, Jilin University, Changchun 130012, China
- Fenil Shah
- Department of Pediatrics and Herman B Wells Center for Pediatric Research, Indiana University, School of Medicine, Indianapolis, IN 46202, USA
- Xiaoyu Lu
- Department of Medical and Molecular Genetics, Indiana University, School of Medicine, Indianapolis, IN 46202, USA
- Yong Zang
- Department of Biostatistics, Indiana University, School of Medicine, Indianapolis, IN 46202, USA
- Anru Zhang
- Department of Statistics, University of Wisconsin–Madison, Madison, WI 53706, USA
- Sha Cao
- Department of Medical and Molecular Genetics, Indiana University, School of Medicine, Indianapolis, IN 46202, USA
- Department of Biostatistics, Indiana University, School of Medicine, Indianapolis, IN 46202, USA
- Melissa L Fishel
- Department of Pediatrics and Herman B Wells Center for Pediatric Research, Indiana University, School of Medicine, Indianapolis, IN 46202, USA
- Department of Pharmacology and Toxicology, Indiana University, School of Medicine, Indianapolis, IN 46202, USA
- Qin Ma
- Department of Biomedical Informatics, the Ohio State University, Columbus, OH 43210, USA
- Chi Zhang
- Department of Medical and Molecular Genetics, Indiana University, School of Medicine, Indianapolis, IN 46202, USA
- Department of Electrical and Computer Engineering, Purdue University, Indianapolis, IN 46202, USA
18
Simón-Carrasco L, Jiménez G, Barbacid M, Drosten M. The Capicua tumor suppressor: a gatekeeper of Ras signaling in development and cancer. Cell Cycle 2019; 17:702-711. [PMID: 29578365 DOI: 10.1080/15384101.2018.1450029] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Indexed: 10/17/2022] Open
Abstract
The transcriptional repressor Capicua (CIC) has emerged as an important rheostat of cell growth regulated by RAS/MAPK signaling. Cic was originally discovered in Drosophila, where it was shown to be inactivated by MAPK signaling downstream of the RTKs Torso and EGFR, which results in signal-dependent responses that are required for normal cell fate specification, proliferation and survival of developing and adult tissues. CIC is highly conserved in mammals, where it is also negatively regulated by MAPK signaling. Here, we review the roles of CIC during mammalian development, tissue homeostasis, tumor formation and therapy resistance. Available data indicate that CIC is involved in multiple biological processes, including lung development, liver homeostasis, autoimmunity and neurobehavioral processes. Moreover, CIC has been shown to act as a tumor suppressor in tumor development, in both human and mouse models. Finally, several lines of evidence implicate CIC as a determinant of sensitivity to EGFR and MAPK pathway inhibitors, suggesting that CIC may play a broader role in human cancer than originally anticipated.
Affiliation(s)
- Lucía Simón-Carrasco
- Molecular Oncology Programme, Centro Nacional de Investigaciones Oncológicas (CNIO), Melchor Fernández Almagro 3, Madrid, Spain
- Gerardo Jiménez
- Institut de Biologia Molecular de Barcelona-CSIC, Parc Científic de Barcelona, Barcelona, Spain; ICREA, Pg. Lluís Companys 23, Barcelona, Spain
- Mariano Barbacid
- Molecular Oncology Programme, Centro Nacional de Investigaciones Oncológicas (CNIO), Melchor Fernández Almagro 3, Madrid, Spain
- Matthias Drosten
- Molecular Oncology Programme, Centro Nacional de Investigaciones Oncológicas (CNIO), Melchor Fernández Almagro 3, Madrid, Spain
19
Tsigkinopoulou A, Hawari A, Uttley M, Breitling R. Defining informative priors for ensemble modeling in systems biology. Nat Protoc 2019; 13:2643-2663. [PMID: 30353176 DOI: 10.1038/s41596-018-0056-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 02/06/2023]
Abstract
Ensemble modeling in molecular systems biology requires the reproducible translation of kinetic parameter data into informative probability distributions (priors), as well as approaches that sample parameters from these distributions without violating the thermodynamic consistency of the overall model. Although a number of pioneering frameworks for ensemble modeling have been published, the issue of generating informative priors has not yet been addressed. Here, we present a protocol that aims to fill this gap. This protocol discusses the collection of parameter values from a diverse range of sources (literature, databases and experiments), assessment of their plausibility, and creation of log-normal probability distributions that can be used as informative priors in ensemble modeling. Furthermore, the protocol enables sampling from the generated distributions while maintaining thermodynamic consistency. Once all parameter values have been retrieved from literature and databases, the protocol can be implemented within ~5-10 min per parameter. The aim of this protocol is to facilitate the design and use of informative distributions for ensemble modeling, especially in fields such as synthetic biology and systems medicine.
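As a rough illustration of turning a collected parameter value into a log-normal prior, the sketch below treats the most plausible value as the median and a multiplicative uncertainty factor as the bound of a central ~95% interval. This parameterization, the function names, and the example numbers are assumptions for illustration and are not taken from the protocol itself; the thermodynamic-consistency step is not covered.

```python
import math
import random


def lognormal_prior(median, spread_factor, z=1.96):
    """Return (mu, sigma) of a log-normal prior.

    Assumption: about 95% of the prior mass should lie between
    median / spread_factor and median * spread_factor.
    """
    if median <= 0 or spread_factor <= 1:
        raise ValueError("median must be > 0 and spread_factor must be > 1")
    mu = math.log(median)                  # log-normal median is exp(mu)
    sigma = math.log(spread_factor) / z    # central interval half-width in log space
    return mu, sigma


def sample_prior(mu, sigma, n, seed=0):
    """Draw n parameter values from the prior (seeded for reproducibility)."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


if __name__ == "__main__":
    # Hypothetical kinetic constant: plausible value 10, uncertain within 10-fold.
    mu, sigma = lognormal_prior(median=10.0, spread_factor=10.0)
    print(sample_prior(mu, sigma, n=5))
```

Working in log space keeps every sampled value positive, which is one practical reason log-normal distributions suit kinetic parameters.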
Affiliation(s)
- Areti Tsigkinopoulou
- Manchester Institute of Biotechnology, School of Chemistry, University of Manchester, Manchester, United Kingdom
- Aliah Hawari
- Manchester Institute of Biotechnology, School of Chemistry, University of Manchester, Manchester, United Kingdom
- Megan Uttley
- Division of Pharmacy and Optometry, School of Health Sciences, University of Manchester, Manchester, United Kingdom
- Rainer Breitling
- Manchester Institute of Biotechnology, School of Chemistry, University of Manchester, Manchester, United Kingdom.
20
Yildirim N, Aktas ME, Ozcan SN, Akbas E, Ay A. Differential transcriptional regulation by alternatively designed mechanisms: A mathematical modeling approach. In Silico Biol 2019; 12:95-127. [PMID: 27497472 DOI: 10.3233/isb-160467] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/15/2022]
Abstract
Cells maintain homeostasis by employing different regulatory mechanisms to respond to external stimuli. We study two groups of signal-dependent transcriptional regulatory mechanisms. In the first group, we assume that repressor and activator proteins compete for binding to the same regulatory site on DNA (competitive mechanisms). In the second group, they can bind to different regulatory regions in a noncompetitive fashion (noncompetitive mechanisms). For both competitive and noncompetitive mechanisms, we studied the gene expression dynamics by increasing the repressor or decreasing the activator abundance (inhibition mechanisms), or by decreasing the repressor or increasing the activator abundance (activation mechanisms). We employed delay differential equation models. Our simulation results show that the competitive and noncompetitive inhibition mechanisms exhibit comparable repression effectiveness. However, response time is fastest in the noncompetitive inhibition mechanism due to increased repressor abundance, and slowest in the competitive inhibition mechanism by increased repressor level. The competitive and noncompetitive inhibition mechanisms through decreased activator abundance show comparable and moderate response times, while the competitive and noncompetitive activation mechanisms by increased activator protein level display more effective and faster response. Our study exemplifies the importance of mathematical modeling and computer simulation in the analysis of gene expression dynamics.
Affiliation(s)
- Necmettin Yildirim
- Division of Natural Sciences, New College of Florida, Bayshore Road, Sarasota, FL, USA
- Mehmet Emin Aktas
- Department of Mathematics, Florida State University, W College Ave, Tallahassee, FL, USA
- Seyma Nur Ozcan
- Department of Mathematics, North Carolina State University, Raleigh, NC, USA
- Esra Akbas
- Department of Computer Science, Florida State University, W College Ave, Tallahassee, FL, USA
- Ahmet Ay
- Departments of Biology and Mathematics, Colgate University, Oak Drive, Hamilton, NY, USA
21
Gursky VV, Kozlov KN, Nuzhdin SV, Samsonova MG. Dynamical Modeling of the Core Gene Network Controlling Flowering Suggests Cumulative Activation From the FLOWERING LOCUS T Gene Homologs in Chickpea. Front Genet 2018; 9:547. [PMID: 30524469 PMCID: PMC6262361 DOI: 10.3389/fgene.2018.00547] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 08/25/2018] [Accepted: 10/26/2018] [Indexed: 11/13/2022] Open
Abstract
Initiation of flowering moves plants from vegetative to reproductive development. The time when this transition happens (flowering time), an important indicator of productivity, depends on both endogenous and environmental factors. The core genetic regulatory network canalizing the flowering signals to the decision to flower has been studied extensively in the model plant Arabidopsis thaliana and has been shown to preserve its main regulatory blocks in other species. It integrates activation from the FLOWERING LOCUS T (FT) gene or its homologs to the flowering decision expressed as high expression of the meristem identity genes, including AP1. We elaborated a dynamical model of this flowering gene regulatory network and applied it to the previously published expression data from two cultivars of domesticated chickpea (Cicer arietinum), obtained for two photoperiod durations. Due to a large number of free parameters in the model, we used an ensemble approach analyzing the model solutions at many parameter sets that provide equally good fit to data. Testing several alternative hypotheses about regulatory roles of the five FT homologs present in chickpea revealed no preference in segregating individual FT copies as singled-out activators with their own regulatory parameters, thus favoring the hypothesis that the five genes possess similar regulatory properties and provide cumulative activation in the network. The analysis reveals that different levels of activation from AP1 can explain a small difference observed in the expression of the two homologs of the repressor gene TFL1. Finally, the model predicts highly reduced activation between LFY and AP1, thus suggesting that this regulatory block is not conserved in chickpea and needs other mechanisms. Overall, this study provides the first attempt to quantitatively test the flowering time gene network in chickpea based on data-driven modeling.
Affiliation(s)
- Vitaly V Gursky
- Theoretical Department, Ioffe Institute, Saint Petersburg, Russia; Systems Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
- Konstantin N Kozlov
- Systems Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
- Sergey V Nuzhdin
- Systems Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia; Molecular and Computational Biology, University of Southern California, Los Angeles, CA, United States
- Maria G Samsonova
- Systems Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
22
Meinecke L, Sharma PP, Du H, Zhang L, Nie Q, Schilling TF. Modeling craniofacial development reveals spatiotemporal constraints on robust patterning of the mandibular arch. PLoS Comput Biol 2018; 14:e1006569. [PMID: 30481168 PMCID: PMC6258504 DOI: 10.1371/journal.pcbi.1006569] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 06/28/2018] [Accepted: 10/16/2018] [Indexed: 12/11/2022] Open
Abstract
How does pattern formation occur accurately when confronted with tissue growth and stochastic fluctuations (noise) in gene expression? Dorso-ventral (D-V) patterning of the mandibular arch specifies upper versus lower jaw skeletal elements through a combination of Bone morphogenetic protein (Bmp), Endothelin-1 (Edn1), and Notch signaling, and this system is highly robust. We combine NanoString experiments of early D-V gene expression with live imaging of arch development in zebrafish to construct a computational model of the D-V mandibular patterning network. The model recapitulates published genetic perturbations in arch development. Patterning is most sensitive to changes in Bmp signaling, and the temporal order of gene expression modulates the response of the patterning network to noise. Thus, our integrated systems biology approach reveals non-intuitive features of the complex signaling system crucial for craniofacial development, including novel insights into roles of gene expression timing and stochasticity in signaling and gene regulation.
Affiliation(s)
- Lina Meinecke
- Department of Mathematics, University of California, Irvine, CA, United States of America
- Center for Complex Biological Systems, University of California, Irvine, CA, United States of America
- Praveer P. Sharma
- Center for Complex Biological Systems, University of California, Irvine, CA, United States of America
- Department of Developmental and Cell Biology, University of California, Irvine, CA, United States of America
- Huijing Du
- Department of Mathematics, University of Nebraska, Lincoln, NE, United States of America
- Lei Zhang
- Beijing International Center for Mathematical Research, Peking University, Beijing, China
- Center for Quantitative Biology, Peking University, Beijing, China
- Qing Nie
- Department of Mathematics, University of California, Irvine, CA, United States of America
- Center for Complex Biological Systems, University of California, Irvine, CA, United States of America
- Department of Developmental and Cell Biology, University of California, Irvine, CA, United States of America
- Thomas F. Schilling
- Center for Complex Biological Systems, University of California, Irvine, CA, United States of America
- Department of Developmental and Cell Biology, University of California, Irvine, CA, United States of America
23
An information theoretic treatment of sequence-to-expression modeling. PLoS Comput Biol 2018; 14:e1006459. [PMID: 30256780 PMCID: PMC6175532 DOI: 10.1371/journal.pcbi.1006459] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 02/01/2018] [Revised: 10/08/2018] [Accepted: 08/24/2018] [Indexed: 11/23/2022] Open
Abstract
Studying a gene’s regulatory mechanisms is a tedious process that involves identification of candidate regulators by transcription factor (TF) knockout or over-expression experiments, delineation of enhancers by reporter assays, and demonstration of direct TF influence by site mutagenesis, among other approaches. Such experiments are often chosen based on the biologist’s intuition, from several testable hypotheses. We pursue the goal of making this process systematic by using ideas from information theory to reason about experiments in gene regulation, in the hope of ultimately enabling rigorous experiment design strategies. For this, we make use of a state-of-the-art mathematical model of gene expression, which provides a way to formalize our current knowledge of cis- as well as trans-regulatory mechanisms of a gene. Ambiguities in such knowledge can be expressed as uncertainties in the model, which we capture formally by building an ensemble of plausible models that fit the existing data and defining a probability distribution over the ensemble. We then characterize the impact of a new experiment on our understanding of the gene’s regulation based on how the ensemble of plausible models and its probability distribution change when challenged with results from that experiment. This allows us to assess the ‘value’ of the experiment retroactively as the reduction in entropy of the distribution (information gain) resulting from the experiment’s results. We fully formalize this novel approach to reasoning about gene regulation experiments and use it to evaluate a variety of perturbation experiments on two developmental genes of D. melanogaster. We also provide objective and ‘biologist-friendly’ descriptions of the information gained from each such experiment. The rigorously defined information theoretic approaches presented here can be used in the future to formulate systematic strategies for experiment design pertaining to studies of gene regulatory mechanisms.
In-depth studies of gene regulatory mechanisms employ a variety of experimental approaches such as identifying a gene’s enhancer(s) and testing its variants through reporter assays, followed by transcription factor mis-expression or knockouts, site mutagenesis, etc. The biologist is often faced with the challenging problem of selecting the ideal next experiment to perform so that its results provide novel mechanistic insights, and has to rely on their intuition about what is currently known on the topic and which experiments may add to that knowledge. We seek to make this intuition-based process more systematic, by borrowing ideas from the mature statistical field of experiment design. Towards this goal, we use the language of mathematical models to formally describe what is known about a gene’s regulatory mechanisms, and how an experiment’s results enhance that knowledge. We use information theoretic ideas to assign a ‘value’ to an experiment as well as explain objectively what is learned from that experiment. We demonstrate use of this novel approach on two extensively studied developmental genes in the fruit fly. We expect our work to lead to systematic strategies for selecting the most informative experiments in a study of gene regulation.
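The notion of an experiment's value as a reduction in entropy over an ensemble of models can be sketched as follows. The Bayesian reweighting step, the function names, and the toy four-model ensemble are assumptions made for illustration; the abstract does not specify how the distribution over the ensemble is updated.

```python
import math


def entropy_bits(probs):
    """Shannon entropy (in bits) of a discrete distribution over models."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def update_ensemble(prior, likelihoods):
    """Reweight each model by how well it explains the new result.

    Assumed update rule: posterior is proportional to prior * likelihood.
    """
    unnormalized = [p * lk for p, lk in zip(prior, likelihoods)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("no model in the ensemble is compatible with the result")
    return [u / total for u in unnormalized]


def information_gain(prior, posterior):
    """Entropy reduction attributed to the experiment."""
    return entropy_bits(prior) - entropy_bits(posterior)


if __name__ == "__main__":
    # Toy ensemble: four equally plausible models; the experiment rules out two.
    prior = [0.25, 0.25, 0.25, 0.25]
    posterior = update_ensemble(prior, likelihoods=[1.0, 1.0, 0.0, 0.0])
    print(information_gain(prior, posterior))
```

An experiment that leaves the distribution unchanged yields zero gain under this definition, which matches the intuition that it taught nothing new about the gene's regulation.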
24
Abstract
The extracellular signal-regulated kinase (ERK) pathway leads to activation of the effector molecule ERK, which controls downstream responses by phosphorylating a variety of substrates, including transcription factors. Crucial insights into the regulation and function of this pathway came from studying embryos in which specific phenotypes arise from aberrant ERK activation. Despite decades of research, several important questions remain to be addressed for deeper understanding of this highly conserved signaling system and its function. Answering these questions will require quantifying the first steps of pathway activation, elucidating the mechanisms of transcriptional interpretation and measuring the quantitative limits of ERK signaling within which the system must operate to avoid developmental defects.
Affiliation(s)
- Aleena L Patel
- Lewis Sigler Institute for Integrative Genomics, Department of Chemical Engineering, Princeton University, Princeton, NJ 08544, USA
- Stanislav Y Shvartsman
- Lewis Sigler Institute for Integrative Genomics, Department of Chemical Engineering, Princeton University, Princeton, NJ 08544, USA
25
Samee MAH, Lydiard-Martin T, Biette KM, Vincent BJ, Bragdon MD, Eckenrode KB, Wunderlich Z, Estrada J, Sinha S, DePace AH. Quantitative Measurement and Thermodynamic Modeling of Fused Enhancers Support a Two-Tiered Mechanism for Interpreting Regulatory DNA. Cell Rep 2018; 21:236-245. [PMID: 28978476 DOI: 10.1016/j.celrep.2017.09.033] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 03/22/2017] [Revised: 07/30/2017] [Accepted: 09/08/2017] [Indexed: 02/07/2023] Open
Abstract
Computational models of enhancer function generally assume that transcription factors (TFs) exert their regulatory effects independently, modeling an enhancer as a "bag of sites." These models fail on endogenous loci that harbor multiple enhancers, and a "two-tier" model appears better suited: in each enhancer TFs work independently, and the total expression is a weighted sum of their expression readouts. Here, we test these two opposing views on how cis-regulatory information is integrated. We fused two Drosophila blastoderm enhancers, measured their readouts, and applied the above two models to these data. The two-tier mechanism better fits these readouts, suggesting that these fused enhancers comprise multiple independent modules, despite having sequence characteristics typical of single enhancers. We show that short-range TF-TF interactions are not sufficient to designate such modules, suggesting unknown underlying mechanisms. Our results underscore that mechanisms of how modules are defined and how their outputs are combined remain to be elucidated.
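The contrast between the two views can be made concrete with a toy calculation. Everything here is an illustrative assumption rather than the authors' thermodynamic model: the saturating `toy_readout` function, the site "strength" values, and the weights are hypothetical and only show how the two schemes combine the same sites differently.

```python
def toy_readout(site_strengths):
    """Hypothetical per-module readout: saturating in the summed site strength."""
    x = sum(site_strengths)
    return x / (1.0 + x)


def bag_of_sites(modules, readout=toy_readout):
    """Single-tier view: pool all sites from all modules into one enhancer."""
    pooled = [s for module in modules for s in module]
    return readout(pooled)


def two_tier(modules, weights, readout=toy_readout):
    """Two-tier view: each module is read out on its own, then the
    module readouts are combined as a weighted sum."""
    if len(modules) != len(weights):
        raise ValueError("need exactly one weight per module")
    return sum(w * readout(m) for m, w in zip(modules, weights))


if __name__ == "__main__":
    fused = [[1.0], [1.0]]  # two fused enhancers, one site each (toy values)
    print(bag_of_sites(fused), two_tier(fused, weights=[0.5, 0.5]))
```

Because the readout is nonlinear, pooling sites and summing module outputs generally give different predictions, which is what allows measured readouts of fused enhancers to discriminate between the two schemes.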
Affiliation(s)
- Md Abul Hassan Samee
- Gladstone Institutes, University of California San Francisco, San Francisco, CA 94158, USA
- Tara Lydiard-Martin
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Kelly M Biette
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Ben J Vincent
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Meghan D Bragdon
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Kelly B Eckenrode
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Zeba Wunderlich
- Department of Developmental and Cell Biology, University of California, Irvine, CA 92697, USA
- Javier Estrada
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Saurabh Sinha
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA; Carl R. Woese Institute of Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA.
- Angela H DePace
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA.
26
Abstract
A common path to the formation of complex 3D structures starts with an epithelial sheet that is patterned by inductive cues that control the spatiotemporal activities of transcription factors. These activities are then interpreted by the cis-regulatory regions of the genes involved in cell differentiation and tissue morphogenesis. Although this general strategy has been documented in multiple developmental contexts, the range of experimental models in which each of the steps can be examined in detail and evaluated for its effect on the final structure remains very limited. Studies of Drosophila eggshell patterning provide unique insights into the multiscale mechanisms that connect gene regulation and 3D epithelial morphogenesis. Here we review the current understanding of this system, emphasizing how the recent identification of cis-regulatory regions of genes within the eggshell patterning network enables mechanistic analysis of its spatiotemporal dynamics and evolutionary diversification. It appears that cis-regulatory changes can account for only some aspects of the morphological diversity of Drosophila eggshells, such as the prominent differences in the number of the respiratory dorsal appendages. Other changes, such as the appearance of the respiratory eggshell ridges, are caused by changes in the spatial distribution of inductive signals. Both types of mechanisms are at play in this rapidly evolving system, which provides an excellent model of developmental patterning and morphogenesis.
27
Gursky VV, Kozlov KN, Kulakovskiy IV, Zubair A, Marjoram P, Lawrie DS, Nuzhdin SV, Samsonova MG. Translating natural genetic variation to gene expression in a computational model of the Drosophila gap gene regulatory network. PLoS One 2017; 12:e0184657. [PMID: 28898266 PMCID: PMC5595321 DOI: 10.1371/journal.pone.0184657] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 07/05/2017] [Accepted: 08/28/2017] [Indexed: 11/18/2022] Open
Abstract
Annotating the genotype-phenotype relationship, and developing a proper quantitative description of the relationship, requires understanding the impact of natural genomic variation on gene expression. We apply a sequence-level model of gap gene expression in the early development of Drosophila to analyze single nucleotide polymorphisms (SNPs) in a panel of natural sequenced D. melanogaster lines. Using a thermodynamic modeling framework, we provide both analytical and computational descriptions of how single-nucleotide variants affect gene expression. The analysis reveals that the sequence variants increase (decrease) gene expression if located within binding sites of repressors (activators). We show that the sign of SNP influence (activation or repression) may change in time and space and elucidate the origin of this change in specific examples. The thermodynamic modeling approach predicts non-local and non-linear effects arising from SNPs, and combinations of SNPs, in individual fly genotypes. Simulation of individual fly genotypes using our model reveals that this non-linearity reduces to almost additive inputs from multiple SNPs. Further, we see signatures of the action of purifying selection in the gap gene regulatory regions. To infer the specific targets of purifying selection, we analyze the patterns of polymorphism in the data at two phenotypic levels: the strengths of binding and expression. We find that combinations of SNPs show evidence of being under selective pressure, while individual SNPs do not. The model predicts that SNPs appear to accumulate in the genotypes of the natural population in a way biased towards small increases in activating action on the expression pattern. Taken together, these results provide a systems-level view of how genetic variation translates to the level of gene regulatory networks via combinatorial SNP effects.
Affiliation(s)
- Vitaly V. Gursky
  - Theoretical Department, Ioffe Institute, Saint Petersburg, Russia
  - Systems Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
- Konstantin N. Kozlov
  - Systems Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
- Ivan V. Kulakovskiy
  - Engelhardt Institute of Molecular Biology, Moscow, Russia
  - Vavilov Institute of General Genetics, Moscow, Russia
  - Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, Moscow, Russia
- Asif Zubair
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Paul Marjoram
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- David S. Lawrie
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Sergey V. Nuzhdin
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Maria G. Samsonova
  - Systems Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
|
28
|
Wei Y, Gokhale RH, Sonnenschein A, Montgomery KM, Ingersoll A, Arnosti DN. Complex cis-regulatory landscape of the insulin receptor gene underlies the broad expression of a central signaling regulator. Development 2016; 143:3591-3603. [PMID: 27702787 DOI: 10.1242/dev.138073] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 04/01/2016] [Accepted: 08/10/2016] [Indexed: 12/16/2022]
Abstract
Insulin signaling plays key roles in development, growth and metabolism through dynamic control of glucose uptake, global protein translation and transcriptional regulation. Altered levels of insulin signaling are known to play key roles in development and disease, yet the molecular basis of such differential signaling remains obscure. Expression of the insulin receptor (InR) gene itself appears to play an important role, but the nature of the molecular wiring controlling InR transcription has not been elucidated. We characterized the regulatory elements driving Drosophila InR expression and found that the generally broad expression of this gene is belied by complex individual switch elements, the dynamic regulation of which reflects direct and indirect contributions of FOXO, EcR, Rbf and additional transcription factors through redundant elements dispersed throughout ∼40 kb of non-coding regions. The control of InR transcription in response to nutritional and tissue-specific inputs represents an integration of multiple cis-regulatory elements, the structure and function of which may have been sculpted by evolutionary selection to provide a highly tailored set of signaling responses on developmental and tissue-specific levels.
Affiliation(s)
- Yiliang Wei
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
- Rewatee H Gokhale
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
- Anne Sonnenschein
  - Genetics Program, Michigan State University, East Lansing, MI 48824, USA
- Kelly Mone't Montgomery
  - Pharmaceutical Sciences, School of Pharmacy, University of Wisconsin-Madison, Madison, WI 53705, USA
- Andrew Ingersoll
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
- David N Arnosti
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
  - Genetics Program, Michigan State University, East Lansing, MI 48824, USA
|
29
|
Sayal R, Dresch JM, Pushel I, Taylor BR, Arnosti DN. Quantitative perturbation-based analysis of gene expression predicts enhancer activity in early Drosophila embryo. eLife 2016; 5. [PMID: 27152947 PMCID: PMC4859806 DOI: 10.7554/elife.08445] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 04/30/2015] [Accepted: 04/04/2016] [Indexed: 01/02/2023] Open
Abstract
Enhancers constitute one of the major components of regulatory machinery of metazoans. Although several genome-wide studies have focused on finding and locating enhancers in the genomes, the fundamental principles governing their internal architecture and cis-regulatory grammar remain elusive. Here, we describe an extensive, quantitative perturbation analysis targeting the dorsal-ventral patterning gene regulatory network (GRN) controlled by Drosophila NF-κB homolog Dorsal. To understand transcription factor interactions on enhancers, we employed an ensemble of mathematical models, testing effects of cooperativity, repression, and factor potency. Models trained on the dataset correctly predict activity of evolutionarily divergent regulatory regions, providing insights into spatial relationships between repressor and activator binding sites. Importantly, the collective predictions of sets of models were effective at novel enhancer identification and characterization. Our study demonstrates how experimental dataset and modeling can be effectively combined to provide quantitative insights into cis-regulatory information on a genome-wide scale. DOI: http://dx.doi.org/10.7554/eLife.08445.001
DNA contains regions known as genes, which may be “transcribed” to produce the RNA molecules that act as templates for building proteins and regulate cell activity. Proteins called transcription factors can bind to specific sequences of DNA to influence whether nearby genes are transcribed. For example, so-called enhancer regions of DNA contain several binding sites for transcription factors, and this binding activates gene transcription. Little is known about how the transcription factor binding sites are organized in enhancer regions, which makes it difficult to use DNA sequence information alone to predict the regulation of genes. A transcription factor called Dorsal controls the activity of a network of genes that plays a crucial role in the development of fruit fly embryos. Dorsal binds to the enhancer region of a gene called rhomboid, which has been well studied and is known to be a fairly typical example of an enhancer region. To understand the regulatory information encoded in the DNA sequences of enhancers, Sayal, Dresch et al. have now used a technique called perturbation analysis to investigate the interactions that are likely to occur between Dorsal and other transcription factors as they bind to the rhomboid enhancer. This technique involves systematically mutating the enhancer to remove different combinations of transcription factor binding sites and quantitatively investigating the effect this has on gene activity. A large set of mathematical models were then trained using this data and shown to correctly predict the activity of a range of other gene regulatory regions. The collective predictions of the models identified new enhancer regions and revealed details about how different types of transcription factor binding sites are arranged within enhancers. As we enter an era where the DNA sequences of entire human populations are increasingly accessible, we would like to know the functional significance of changes in gene regulatory regions. Sayal, Dresch et al. show that the regulatory properties of specific control proteins are accessible by employing quantitative experiments and mathematical models. Similar studies will be required to learn how mutations found across the genome may alter gene expression, leading to better diagnosis and treatment of disease. DOI: http://dx.doi.org/10.7554/eLife.08445.002
Affiliation(s)
- Rupinder Sayal
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, United States
  - Department of Biochemistry, DAV University, Jalandhar, India
- Jacqueline M Dresch
  - Department of Mathematics, Michigan State University, East Lansing, United States
  - Department of Mathematics and Computer Science, Clark University, Worcester, United States
- Irina Pushel
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, United States
  - Stowers Institute for Medical Research, Kansas City, United States
- Benjamin R Taylor
  - Department of Computer Science and Engineering, Michigan State University, East Lansing, United States
  - School of Computer Science, Georgia Institute of Technology, Atlanta, United States
- David N Arnosti
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, United States
|
30
|
Dresch JM, Arnosti DN. The Wisdom of Crowds: Can Mathematical Models Crack the cis Regulatory Code? Cell Syst 2015; 1:379-80. [PMID: 27136351 DOI: 10.1016/j.cels.2015.12.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
Abstract
Genomic information includes not just a "parts list" of encoded proteins and RNAs, but also the information on regulation and function. To understand this more complex, deeper layer of biological information, recent efforts have turned to mathematical models as discovery engines of the cis regulatory code.
Affiliation(s)
- Jacqueline M Dresch
  - Mathematics and Computer Science Department, Clark University, 950 Main Street, Worcester, MA 01610, USA
- David N Arnosti
  - Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
|