1
Carilli M, Gorin G, Choi Y, Chari T, Pachter L. Biophysical modeling with variational autoencoders for bimodal, single-cell RNA sequencing data. Nat Methods 2024; 21:1466-1469. [PMID: 39054391 DOI: 10.1038/s41592-024-02365-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Accepted: 06/27/2024] [Indexed: 07/27/2024]
Abstract
Here we present biVI, which combines the variational autoencoder framework of scVI with biophysical models describing the transcription and splicing kinetics of RNA molecules. We demonstrate on simulated and experimental single-cell RNA sequencing data that biVI retains the variational autoencoder's ability to capture cell type structure in a low-dimensional space while further enabling genome-wide exploration of the biophysical mechanisms, such as system burst sizes and degradation rates, that underlie observations.
Affiliation(s)
- Maria Carilli
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Gennady Gorin
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA, USA
  - Fauna Bio, Emeryville, CA, USA
- Yongin Choi
  - Department of Biomedical Engineering, University of California, Davis, Davis, CA, USA
  - Genome Center, University of California, Davis, Davis, CA, USA
- Tara Chari
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA.
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA.
2
Ma M, Szavits-Nossan J, Singh A, Grima R. Analysis of a detailed multi-stage model of stochastic gene expression using queueing theory and model reduction. Math Biosci 2024; 373:109204. [PMID: 38710441 PMCID: PMC11536769 DOI: 10.1016/j.mbs.2024.109204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 04/03/2024] [Accepted: 04/29/2024] [Indexed: 05/08/2024]
Abstract
We introduce a biologically detailed, stochastic model of gene expression describing the multiple rate-limiting steps of transcription, nuclear pre-mRNA processing, nuclear mRNA export, cytoplasmic mRNA degradation and translation of mRNA into protein. The processes in sub-cellular compartments are described by an arbitrary number of processing stages, thus accounting for a significantly finer molecular description of gene expression than conventional models such as the telegraph, two-stage and three-stage models of gene expression. We use two distinct tools, queueing theory and model reduction using the slow-scale linear-noise approximation, to derive exact or approximate analytic expressions for the moments or distributions of nuclear mRNA, cytoplasmic mRNA and protein fluctuations, as well as lower bounds for their Fano factors in steady-state conditions. We use these to study the phase diagram of the stochastic model; in particular we derive parametric conditions determining three types of transitions in the properties of mRNA fluctuations: from sub-Poissonian to super-Poissonian noise, from high noise in the nucleus to high noise in the cytoplasm, and from a monotonic increase to a monotonic decrease of the Fano factor with the number of processing stages. In contrast, protein fluctuations are always super-Poissonian and show weak dependence on the number of mRNA processing stages. Our results delineate the region of parameter space where conventional models give qualitatively incorrect results and provide insight into how the number of processing stages, e.g. the number of rate-limiting steps in initiation, splicing and mRNA degradation, shapes stochastic gene expression by modulation of molecular memory.
Affiliation(s)
- Muhan Ma
  - School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3BF, UK
- Abhyudai Singh
  - Department of Electrical and Computer Engineering, University of Delaware, Newark DE 19716, USA
- Ramon Grima
  - School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3BF, UK.
3
Ham L, Coomer MA, Öcal K, Grima R, Stumpf MPH. A stochastic vs deterministic perspective on the timing of cellular events. Nat Commun 2024; 15:5286. [PMID: 38902228 PMCID: PMC11190182 DOI: 10.1038/s41467-024-49624-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Accepted: 06/12/2024] [Indexed: 06/22/2024] Open
Abstract
Cells are the fundamental units of life, and like all life forms, they change over time. Changes in cell state are driven by molecular processes; many of these are initiated when molecule numbers reach and exceed specific thresholds, a characteristic that can be described as "digital cellular logic". Here we show how molecular and cellular noise profoundly influence the time to cross a critical threshold (the first-passage time) and map out scenarios in which stochastic dynamics result in shorter or longer average first-passage times compared to noise-less dynamics. We illustrate the dependence of the mean first-passage time on noise for a set of exemplar models of gene expression, auto-regulatory feedback control, and enzyme-mediated catalysis. Our theory provides intuitive insight into the origin of these effects and underscores two important insights: (i) deterministic predictions for cellular event timing can be highly inaccurate when molecule numbers are within the range known for many cells; (ii) molecular noise can significantly shift mean first-passage times, particularly within auto-regulatory genetic feedback circuits.
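The contrast between stochastic and deterministic threshold-crossing times can be illustrated with a minimal sketch. It assumes a generic birth-death process (constant production rate `k`, per-molecule decay rate `d`, starting from zero molecules), which is not taken from the paper; the function names are hypothetical. The stochastic mean first-passage time uses the standard recurrence for one-step processes, and the deterministic time comes from solving dx/dt = k - d·x.

```python
import math

def stochastic_mfpt(k, d, threshold):
    """Mean first-passage time from 0 to `threshold` molecules for a
    birth-death process with birth rate k and death rate d * n."""
    total = 0.0
    step = 0.0  # mean time to climb from n-1 to n (unused at n = 0)
    for n in range(threshold):
        # One-step recurrence: t_n = 1/k + (d*n/k) * t_{n-1}
        step = 1.0 / k + (d * n / k) * step
        total += step
    return total

def deterministic_time(k, d, threshold):
    """Time for dx/dt = k - d*x with x(0) = 0 to reach `threshold`.
    Returns infinity if the threshold is at or above the steady state k/d."""
    if d == 0:
        return threshold / k
    if threshold >= k / d:
        return math.inf
    return -math.log(1.0 - d * threshold / k) / d

if __name__ == "__main__":
    for n in (5, 9, 10):
        print(n, stochastic_mfpt(10.0, 1.0, n), deterministic_time(10.0, 1.0, n))
```

In this toy model the deterministic trajectory never reaches a threshold at or above k/d, whereas the stochastic mean first-passage time stays finite, which is one simple way the two descriptions can disagree.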
Affiliation(s)
- Lucy Ham
  - School of BioSciences, University of Melbourne, Parkville, Australia
  - School of Mathematics and Statistics, University of Melbourne, Parkville, Australia
- Megan A Coomer
  - School of BioSciences, University of Melbourne, Parkville, Australia
  - School of Mathematics and Statistics, University of Melbourne, Parkville, Australia
- Kaan Öcal
  - School of Informatics, University of Edinburgh, Edinburgh, UK
  - School of BioSciences, University of Melbourne, Parkville, Australia
- Ramon Grima
  - School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Michael P H Stumpf
  - School of BioSciences, University of Melbourne, Parkville, Australia.
  - School of Mathematics and Statistics, University of Melbourne, Parkville, Australia.
4
Grima R, Esmenjaud PM. Quantifying and correcting bias in transcriptional parameter inference from single-cell data. Biophys J 2024; 123:4-30. [PMID: 37885177 PMCID: PMC10808030 DOI: 10.1016/j.bpj.2023.10.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 09/12/2023] [Accepted: 10/19/2023] [Indexed: 10/28/2023] Open
Abstract
The snapshot distribution of mRNA counts per cell can be measured using single-molecule fluorescence in situ hybridization or single-cell RNA sequencing. These distributions are often fit to the steady-state distribution of the two-state telegraph model to estimate the three transcriptional parameters for a gene of interest: mRNA synthesis rate, the switching on rate (the on state being the active transcriptional state), and the switching off rate. This model assumes no extrinsic noise, i.e., parameters do not vary between cells, and thus estimated parameters are to be understood as approximating the average values in a population. The accuracy of this approximation is currently unclear. Here, we develop a theory that explains the size and sign of estimation bias when inferring parameters from single-cell data using the standard telegraph model. We find specific bias signatures depending on the source of extrinsic noise (which parameter is most variable across cells) and the mode of transcriptional activity. If gene expression is not bursty then the population averages of all three parameters are overestimated if extrinsic noise is in the synthesis rate; underestimation occurs if extrinsic noise is in the switching on rate; both underestimation and overestimation can occur if extrinsic noise is in the switching off rate. We find that some estimated parameters tend to infinity as the size of extrinsic noise approaches a critical threshold. In contrast when gene expression is bursty, we find that in all cases the mean burst size (ratio of the synthesis rate to the switching off rate) is overestimated while the mean burst frequency (the switching on rate) is underestimated. We estimate the size of extrinsic noise from the covariance matrix of sequencing data and use this together with our theory to correct published estimates of transcriptional parameters for mammalian genes.
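As a rough illustration of the quantities named in this abstract, the sketch below evaluates the well-known steady-state mean and Fano factor of the two-state telegraph model, together with the burst size and burst frequency as defined above. The function and parameter names (`k_on`, `k_off`, `rho`, and the mRNA degradation rate `d`) are assumptions introduced here for illustration and are not taken from the paper.

```python
def telegraph_moments(k_on, k_off, rho, d):
    """Steady-state mean and Fano factor of mRNA counts in the telegraph model.

    k_on, k_off: switching-on and switching-off rates of the gene
    rho: mRNA synthesis rate in the active state
    d: mRNA degradation rate
    """
    p_on = k_on / (k_on + k_off)  # fraction of time in the active state
    mean = (rho / d) * p_on
    fano = 1.0 + rho * k_off / ((k_on + k_off) * (k_on + k_off + d))
    return mean, fano

def burst_parameters(k_on, k_off, rho):
    """Mean burst size (rho / k_off) and burst frequency (k_on)."""
    return rho / k_off, k_on

if __name__ == "__main__":
    # Bursty regime: k_off much larger than k_on and d.
    mean, fano = telegraph_moments(0.1, 100.0, 1000.0, 1.0)
    size, freq = burst_parameters(0.1, 100.0, 1000.0)
    print(mean, fano, size, freq)
```

In the bursty regime the Fano factor approaches one plus the mean burst size, which is why burst size and burst frequency are the natural parameters to report there.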
Affiliation(s)
- Ramon Grima
  - School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom.
- Pierre-Marie Esmenjaud
  - Biology Department, Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
5
Wang Y, Yu Z, Grima R, Cao Z. Exact solution of a three-stage model of stochastic gene expression including cell-cycle dynamics. J Chem Phys 2023; 159:224102. [PMID: 38063222 DOI: 10.1063/5.0173742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Accepted: 10/04/2023] [Indexed: 12/18/2023] Open
Abstract
The classical three-stage model of stochastic gene expression predicts the statistics of single-cell mRNA and protein number fluctuations as a function of the rates of promoter switching, transcription, translation, degradation and dilution. While this model is easily simulated, its analytical solution remains an unsolved problem. Here we modify this model to explicitly include cell-cycle dynamics and then derive an exact solution for the time-dependent joint distribution of mRNA and protein numbers. We show large differences between this model and the classical model which captures cell-cycle effects implicitly via effective first-order dilution reactions. In particular we find that the Fano factor of protein numbers calculated from a population snapshot measurement is underestimated by the classical model whereas the correlation between mRNA and protein can be either over- or underestimated, depending on the timescales of mRNA degradation and promoter switching relative to the mean cell-cycle duration time.
Affiliation(s)
- Yiling Wang
  - Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai 200237, China
- Zhenhua Yu
  - Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai 200237, China
- Ramon Grima
  - School of Biological Sciences, The University of Edinburgh, Max Born Crescent, Edinburgh EH9 3BF, Scotland, United Kingdom
- Zhixing Cao
  - Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai 200237, China
  - Department of Chemical Engineering, Queen's University, Kingston, Ontario K7L 3N6, Canada
6
Gorin G, Yoshida S, Pachter L. Assessing Markovian and Delay Models for Single-Nucleus RNA Sequencing. Bull Math Biol 2023; 85:114. [PMID: 37828255 DOI: 10.1007/s11538-023-01213-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/22/2022] [Accepted: 09/11/2023] [Indexed: 10/14/2023]
Abstract
The serial nature of reactions involved in the RNA life-cycle motivates the incorporation of delays in models of transcriptional dynamics. The models couple a transcriptional process to a fairly general set of delayed monomolecular reactions with no feedback. We provide numerical strategies for calculating the RNA copy number distributions induced by these models, and solve several systems with splicing, degradation, and catalysis. An analysis of single-cell and single-nucleus RNA sequencing data using these models reveals that the kinetics of nuclear export do not appear to require invocation of a non-Markovian waiting time.
Affiliation(s)
- Gennady Gorin
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
- Shawn Yoshida
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA.
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, 91125, USA.
7
Szavits-Nossan J, Grima R. Uncovering the effect of RNA polymerase steric interactions on gene expression noise: Analytical distributions of nascent and mature RNA numbers. Phys Rev E 2023; 108:034405. [PMID: 37849194 DOI: 10.1103/physreve.108.034405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Accepted: 08/24/2023] [Indexed: 10/19/2023]
Abstract
The telegraph model is the standard model of stochastic gene expression, which can be solved exactly to obtain the distribution of mature RNA numbers per cell. A modification of this model also leads to an analytical distribution of nascent RNA numbers. These solutions are routinely used for the analysis of single-cell data, including the inference of transcriptional parameters. However, these models neglect important mechanistic features of transcription elongation, such as the stochastic movement of RNA polymerases and their steric (excluded-volume) interactions. Here we construct a model of gene expression describing promoter switching between inactive and active states, binding of RNA polymerases in the active state, their stochastic movement including steric interactions along the gene, and their unbinding leading to a mature transcript that subsequently decays. We derive the steady-state distributions of the nascent and mature RNA numbers in two important limiting cases: constitutive expression and slow promoter switching. We show that RNA fluctuations are suppressed by steric interactions between RNA polymerases, and that this suppression can in some instances even lead to sub-Poissonian fluctuations; these effects are most pronounced for nascent RNA and less prominent for mature RNA, since the latter is not a direct sensor of transcription. We find a relationship between the parameters of our microscopic mechanistic model and those of the standard models that ensures excellent consistency in their prediction of the first and second RNA number moments over vast regions of parameter space, encompassing slow, intermediate, and rapid promoter switching, provided the RNA number distributions are Poissonian or super-Poissonian. Furthermore, we identify the limitations of inference from mature RNA data, specifically showing that it cannot differentiate between highly distinct RNA polymerase traffic patterns on a gene.
Affiliation(s)
- Juraj Szavits-Nossan
  - School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, United Kingdom
- Ramon Grima
  - School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, United Kingdom
8
Wagner V, Radde N. The impossible challenge of estimating non-existent moments of the Chemical Master Equation. Bioinformatics 2023; 39:i440-i447. [PMID: 37387158 PMCID: PMC10311328 DOI: 10.1093/bioinformatics/btad205] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 07/01/2023] Open
Abstract
MOTIVATION: The Chemical Master Equation (CME) is a set of linear differential equations that describes the evolution of the probability distribution on all possible configurations of a (bio-)chemical reaction system. Since the number of configurations and therefore the dimension of the CME rapidly increases with the number of molecules, its applicability is restricted to small systems. A widely applied remedy for this challenge is moment-based approaches which consider the evolution of the first few moments of the distribution as summary statistics for the complete distribution. Here, we investigate the performance of two moment-estimation methods for reaction systems whose equilibrium distributions encounter fat-tailedness and do not possess statistical moments.
RESULTS: We show that estimation via stochastic simulation algorithm (SSA) trajectories loses consistency over time and estimated moment values span a wide range of values even for large sample sizes. In comparison, the method of moments returns smooth moment estimates but is not able to indicate the non-existence of the allegedly predicted moments. We furthermore analyze the negative effect of a CME solution's fat-tailedness on SSA run times and explain inherent difficulties. While moment-estimation techniques are a commonly applied tool in the simulation of (bio-)chemical reaction networks, we conclude that they should be used with care, as neither the system definition nor the moment-estimation techniques themselves reliably indicate the potential fat-tailedness of the CME's solution.
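For readers unfamiliar with moment estimation from SSA trajectories, the following is a minimal sketch of the procedure on a deliberately simple birth-death system (production at rate `k`, decay at rate `d * x`). This system is an assumption chosen for illustration: its stationary distribution is Poisson, so its moments exist and the estimates converge, unlike the fat-tailed systems the paper studies. Function names are hypothetical.

```python
import random

def ssa_birth_death(k, d, t_end, rng):
    """Run one Gillespie trajectory of 0 -> X (rate k), X -> 0 (rate d*x),
    starting from x = 0, and return the copy number at time t_end."""
    t, x = 0.0, 0
    while True:
        total_rate = k + d * x
        t += rng.expovariate(total_rate)  # waiting time to the next reaction
        if t > t_end:
            return x
        # Choose the reaction in proportion to its propensity.
        if rng.random() * total_rate < k:
            x += 1
        else:
            x -= 1

def estimate_moments(k, d, t_end, n_traj, seed=0):
    """Sample mean and unbiased sample variance over independent trajectories."""
    rng = random.Random(seed)
    samples = [ssa_birth_death(k, d, t_end, rng) for _ in range(n_traj)]
    mean = sum(samples) / n_traj
    var = sum((s - mean) ** 2 for s in samples) / (n_traj - 1)
    return mean, var

if __name__ == "__main__":
    # For this light-tailed system both estimates should be near k / d = 20.
    print(estimate_moments(20.0, 1.0, 10.0, 500))
```

For a system whose stationary distribution lacks finite moments, the same sample averages would still return numbers, which is the pitfall the abstract warns about.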
Affiliation(s)
- Vincent Wagner
  - Institute for Systems Theory and Automatic Control, University of Stuttgart, Stuttgart 70569, Germany
- Nicole Radde
  - Institute for Systems Theory and Automatic Control, University of Stuttgart, Stuttgart 70569, Germany
  - Stuttgart Center for Simulation Science, University of Stuttgart, Stuttgart 70569, Germany
9
Jia C, Grima R. Coupling gene expression dynamics to cell size dynamics and cell cycle events: Exact and approximate solutions of the extended telegraph model. iScience 2023; 26:105746. [PMID: 36619980 PMCID: PMC9813732 DOI: 10.1016/j.isci.2022.105746] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 06/17/2022] [Revised: 11/02/2022] [Accepted: 12/02/2022] [Indexed: 12/12/2022] Open
Abstract
The standard model describing the fluctuations of mRNA numbers in single cells is the telegraph model which includes synthesis and degradation of mRNA, and switching of the gene between active and inactive states. While commonly used, this model does not describe how fluctuations are influenced by the cell cycle phase, cellular growth and division, and other crucial aspects of cellular biology. Here, we derive the analytical time-dependent solution of an extended telegraph model that explicitly considers the doubling of gene copy numbers upon DNA replication, dependence of the mRNA synthesis rate on cellular volume, gene dosage compensation, partitioning of molecules during cell division, cell-cycle duration variability, and cell-size control strategies. Based on the time-dependent solution, we obtain the analytical distributions of transcript numbers for lineage and population measurements in steady-state growth and also find a linear relation between the Fano factor of mRNA fluctuations and cell volume fluctuations. We show that generally the lineage and population distributions in steady-state growth cannot be accurately approximated by the steady-state solution of extrinsic noise models, i.e. a telegraph model with parameters drawn from probability distributions. This is because the mRNA lifetime is often not small enough compared to the cell cycle duration to erase the memory of division and replication. Accurate approximations are possible when this memory is weak, e.g. for genes with bursty expression and for which there is sufficient gene dosage compensation when replication occurs.
Affiliation(s)
- Chen Jia
  - Applied and Computational Mathematics Division, Beijing Computational Science Research Center, Beijing 100193, China
- Ramon Grima
  - School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, UK
10
Gorin G, Pachter L. Length biases in single-cell RNA sequencing of pre-mRNA. BIOPHYSICAL REPORTS 2022; 3:100097. [PMID: 36660179 PMCID: PMC9843228 DOI: 10.1016/j.bpr.2022.100097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Accepted: 12/22/2022] [Indexed: 12/28/2022]
Abstract
Single-cell RNA sequencing data can be modeled using Markov chains to yield genome-wide insights into transcriptional physics. However, quantitative inference with such data requires careful assessment of noise sources. We find that long pre-mRNA transcripts are over-represented in sequencing data. To explain this trend, we propose a length-based model of capture bias, which may produce false-positive observations. We solve this model and use it to find concordant parameter trends as well as systematic, mechanistically interpretable technical and biological differences in paired data sets.
Affiliation(s)
- Gennady Gorin
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California
- Lior Pachter
  - Division of Biology and Biological Engineering, Pasadena, California
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California
  - Corresponding author
11
Boe RH, Ayyappan V, Schuh L, Raj A. Allelic correlation is a marker of trade-offs between barriers to transmission of expression variability and signal responsiveness in genetic networks. Cell Syst 2022; 13:1016-1032.e6. [PMID: 36450286 PMCID: PMC9811561 DOI: 10.1016/j.cels.2022.10.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2021] [Revised: 06/28/2022] [Accepted: 10/28/2022] [Indexed: 12/03/2022]
Abstract
Genetic networks should respond to signals but prevent the transmission of spontaneous fluctuations. Limited data from mammalian cells suggest that noise transmission is uncommon, but systematic claims about noise transmission have been limited by the inability to directly measure it. Here, we build a mathematical framework modeling allelic correlation and noise transmission, showing that allelic correlation and noise transmission correspond across model parameters and network architectures. Limiting noise transmission comes with the trade-off of being unresponsive to signals, and within responsive regimes, there is a further trade-off between response time and basal noise transmission. Analysis of allele-specific single-cell RNA-sequencing data revealed that genes encoding upstream factors in signaling pathways and cell-type-specific factors have higher allelic correlation than downstream factors, suggesting they are more subject to regulation. Overall, our findings suggest that some noise transmission must result from signal responsiveness, but it can be minimized by trading off for a slower response. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Ryan H Boe
  - Genetics and Epigenetics, Cell and Molecular Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Vinay Ayyappan
  - Department of Bioengineering, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, PA, USA
- Lea Schuh
  - Institute of AI for Health, Helmholtz Zentrum München, German Research Center for Environmental Health, 85764 Neuherberg, Germany; Department of Mathematics, Technical University of Munich, Garching 85748, Germany
- Arjun Raj
  - Department of Bioengineering, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, PA, USA; Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
12
Gorin G, Vastola JJ, Fang M, Pachter L. Interpretable and tractable models of transcriptional noise for the rational design of single-molecule quantification experiments. Nat Commun 2022; 13:7620. [PMID: 36494337 PMCID: PMC9734650 DOI: 10.1038/s41467-022-34857-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 02/25/2022] [Accepted: 11/09/2022] [Indexed: 12/13/2022] Open
Abstract
The question of how cell-to-cell differences in transcription rate affect RNA count distributions is fundamental for understanding biological processes underlying transcription. Answering this question requires quantitative models that are both interpretable (describing concrete biophysical phenomena) and tractable (amenable to mathematical analysis). This enables the identification of experiments which best discriminate between competing hypotheses. As a proof of principle, we introduce a simple but flexible class of models involving a continuous stochastic transcription rate driving a discrete RNA transcription and splicing process, and compare and contrast two biologically plausible hypotheses about transcription rate variation. One assumes variation is due to DNA experiencing mechanical strain, while the other assumes it is due to regulator number fluctuations. We introduce a framework for numerically and analytically studying such models, and apply Bayesian model selection to identify candidate genes that show signatures of each model in single-cell transcriptomic data from mouse glutamatergic neurons.
Affiliation(s)
- Gennady Gorin
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
- John J Vastola
  - Department of Neurobiology, Harvard Medical School, Boston, MA, 02115, USA
- Meichen Fang
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA.
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, 91125, USA.
13
Molecular Origins of Transcriptional Heterogeneity in Diazotrophic Klebsiella oxytoca. mSystems 2022; 7:e0059622. [PMID: 36073804 PMCID: PMC9600154 DOI: 10.1128/msystems.00596-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/04/2023] Open
Abstract
Phenotypic heterogeneity in clonal bacterial batch cultures has been shown for a range of bacterial systems; however, the molecular origins of such heterogeneity and its magnitude are not well understood. Under conditions of extreme low-nitrogen stress in the model diazotroph Klebsiella oxytoca, we found remarkably high heterogeneity of nifHDK gene expression, which codes for the structural genes of nitrogenase, one key enzyme of the global nitrogen cycle. This heterogeneity limited the bulk observed nitrogen-fixing capacity of the population. Using dual-probe, single-cell RNA fluorescent in situ hybridization, we correlated nifHDK expression with that of nifLA and glnK-amtB, which code for the main upstream regulatory components. Through stochastic transcription models and mutual information analysis, we revealed likely molecular origins for heterogeneity in nitrogenase expression. In the wild type and regulatory variants, we found that nifHDK transcription was inherently bursty, but we established that noise propagation through signaling was also significant. The regulatory gene glnK had the highest discernible effect on nifHDK variance, while noise from factors outside the regulatory pathway was negligible. Understanding the basis of inherent heterogeneity of nitrogenase expression and its origins can inform biotechnology strategies seeking to enhance biological nitrogen fixation. Finally, we speculate on potential benefits of diazotrophic heterogeneity in natural soil environments.
IMPORTANCE: Nitrogen is an essential micronutrient for both plant and animal life and naturally exists in both reactive and inert chemical forms. Modern agriculture is heavily reliant on nitrogen that has been "fixed" into a reactive form via the energetically expensive Haber-Bosch process, with significant environmental consequences. Nitrogen-fixing bacteria provide an alternative source of fixed nitrogen for use in both biotechnological and agricultural settings, but this relies on a firm understanding of how the fixation process is regulated within individual bacterial cells. We examined the cell-to-cell variability in the nitrogen-fixing behavior of Klebsiella oxytoca, a free-living bacterium. The significance of our research is in identifying not only the presence of marked variability but also the specific mechanisms that give rise to it. This understanding gives insight into both the evolutionary advantages of variable behavior as well as strategies for biotechnological applications.
14
Fu X, Patel HP, Coppola S, Xu L, Cao Z, Lenstra TL, Grima R. Quantifying how post-transcriptional noise and gene copy number variation bias transcriptional parameter inference from mRNA distributions. eLife 2022; 11:e82493. [PMID: 36250630 PMCID: PMC9648968 DOI: 10.7554/elife.82493] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 08/07/2022] [Accepted: 10/14/2022] [Indexed: 11/13/2022] Open
Abstract
Transcriptional rates are often estimated by fitting the distribution of mature mRNA numbers measured using smFISH (single molecule fluorescence in situ hybridization) with the distribution predicted by the telegraph model of gene expression, which defines two promoter states of activity and inactivity. However, fluctuations in mature mRNA numbers are strongly affected by processes downstream of transcription. In addition, the telegraph model assumes one gene copy but in experiments, cells may have two gene copies as cells replicate their genome during the cell cycle. While it is often presumed that post-transcriptional noise and gene copy number variation affect transcriptional parameter estimation, the size of the error introduced remains unclear. To address this issue, here we measure both mature and nascent mRNA distributions of GAL10 in yeast cells using smFISH and classify each cell according to its cell cycle phase. We infer transcriptional parameters from mature and nascent mRNA distributions, with and without accounting for cell cycle phase and compare the results to live-cell transcription measurements of the same gene. We find that: (i) correcting for cell cycle dynamics decreases the promoter switching rates and the initiation rate, and increases the fraction of time spent in the active state, as well as the burst size; (ii) additional correction for post-transcriptional noise leads to further increases in the burst size and to a large reduction in the errors in parameter estimation. Furthermore, we outline how to correctly adjust for measurement noise in smFISH due to uncertainty in transcription site localisation when introns cannot be labelled. Simulations with parameters estimated from nascent smFISH data, which is corrected for cell cycle phases and measurement noise, lead to autocorrelation functions that agree with those obtained from live-cell imaging.
Affiliation(s)
- Xiaoming Fu
- Key Laboratory of Smart Manufacturing in Energy Chemical Process, East China University of Science and Technology, Shanghai, China
- School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Center for Advanced Systems Understanding, Helmholtz-Zentrum Dresden-Rossendorf, Görlitz, Germany
- Heta P Patel
- The Netherlands Cancer Institute, Oncode Institute, Division of Gene Regulation, Amsterdam, Netherlands
- Stefano Coppola
- The Netherlands Cancer Institute, Oncode Institute, Division of Gene Regulation, Amsterdam, Netherlands
- Libin Xu
- Key Laboratory of Smart Manufacturing in Energy Chemical Process, East China University of Science and Technology, Shanghai, China
- Zhixing Cao
- Key Laboratory of Smart Manufacturing in Energy Chemical Process, East China University of Science and Technology, Shanghai, China
- Tineke L Lenstra
- The Netherlands Cancer Institute, Oncode Institute, Division of Gene Regulation, Amsterdam, Netherlands
- Ramon Grima
- School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
|
15
|
Concentration fluctuations in growing and dividing cells: Insights into the emergence of concentration homeostasis. PLoS Comput Biol 2022; 18:e1010574. [PMID: 36194626 PMCID: PMC9565450 DOI: 10.1371/journal.pcbi.1010574] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/07/2022] [Revised: 10/14/2022] [Accepted: 09/14/2022] [Indexed: 11/19/2022]
Abstract
Intracellular reaction rates depend on concentrations and hence their levels are often regulated. However, classical models of stochastic gene expression lack a cell size description and cannot be used to predict noise in concentrations. Here, we construct a model of gene product dynamics that includes a description of cell growth, cell division, size-dependent gene expression, gene dosage compensation, and size control mechanisms that can vary with the cell cycle phase. We obtain expressions for the approximate distributions and power spectra of concentration fluctuations which lead to insight into the emergence of concentration homeostasis. We find that (i) the conditions necessary to suppress cell division-induced concentration oscillations are difficult to achieve; (ii) mRNA concentration and number distributions can have different numbers of modes; (iii) two-layer size control strategies such as sizer-timer or adder-timer are ideal because they maintain constant mean concentrations whilst minimising concentration noise; (iv) accurate concentration homeostasis requires a fine tuning of dosage compensation, replication timing, and size-dependent gene expression; (v) deviations from perfect concentration homeostasis show up as deviations of the concentration distribution from a gamma distribution. Some of these predictions are confirmed using data for E. coli, fission yeast, and budding yeast.
|
16
|
Gorin G, Fang M, Chari T, Pachter L. RNA velocity unraveled. PLoS Comput Biol 2022; 18:e1010492. [PMID: 36094956 PMCID: PMC9499228 DOI: 10.1371/journal.pcbi.1010492] [Citation(s) in RCA: 58] [Impact Index Per Article: 29.0] [Received: 03/17/2022] [Revised: 09/22/2022] [Accepted: 08/14/2022] [Indexed: 11/24/2022]
Abstract
We perform a thorough analysis of RNA velocity methods, with a view towards understanding the suitability of the various assumptions underlying popular implementations. In addition to providing a self-contained exposition of the underlying mathematics, we undertake simulations and perform controlled experiments on biological datasets to assess workflow sensitivity to parameter choices and underlying biology. Finally, we argue for a more rigorous approach to RNA velocity, and present a framework for Markovian analysis that points to directions for improvement and mitigation of current problems.
Affiliation(s)
- Gennady Gorin
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California, United States of America
- Meichen Fang
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
- Tara Chari
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
- Lior Pachter
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
- Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California, United States of America
|
17
|
Low protein expression enhances phenotypic evolvability by intensifying selection on folding stability. Nat Ecol Evol 2022; 6:1155-1164. [PMID: 35798838 PMCID: PMC7613228 DOI: 10.1038/s41559-022-01797-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/17/2021] [Accepted: 05/19/2022] [Indexed: 01/09/2023]
Abstract
Protein abundance affects the evolution of protein genotypes, but we do not know how it affects the evolution of protein phenotypes. Here we investigate the role of protein abundance in the evolvability of green fluorescent protein (GFP) towards the novel phenotype of cyan fluorescence. We evolve GFP in E. coli through multiple cycles of mutation and selection and show that low GFP expression facilitates the evolution of cyan fluorescence. A computational model whose predictions we test experimentally helps explain why: lowly expressed proteins are under stronger selection for proper folding, which facilitates their evolvability on short evolutionary time scales. The reason is that high fluorescence can be achieved by either few proteins that fold well or by many proteins that fold less well. In other words, we observe a synergy between a protein's scarcity and its stability. Because many proteins meet the essential requirements for this scarcity-stability synergy, it may be a widespread mechanism by which low expression helps proteins evolve new phenotypes and functions.
|
18
|
Ham L, Coomer M, Stumpf M. The chemical Langevin equation for biochemical systems in dynamic environments. J Chem Phys 2022; 157:094105. [DOI: 10.1063/5.0095840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
Modelling and simulation of complex biochemical reaction networks form cornerstones of modern biophysics. Many of the approaches developed so far capture temporal fluctuations due to the inherent stochasticity of the biophysical processes, referred to as intrinsic noise. Stochastic fluctuations, however, predominantly stem from the interplay of the network with many other, mostly unknown, fluctuating processes, as well as with various random signals arising from the extracellular world; these sources contribute extrinsic noise. Here we provide a computational simulation method to probe the stochastic dynamics of biochemical systems subject to both intrinsic and extrinsic noise. We develop an extrinsic chemical Langevin equation (CLE), a physically motivated extension of the chemical Langevin equation, to model intrinsically noisy reaction networks embedded in a stochastically fluctuating environment. The extrinsic CLE is a continuous approximation to the Chemical Master Equation (CME) with time-varying propensities. In our approach, noise is incorporated at the level of the CME, and can account for the full dynamics of the exogenous noise process, irrespective of timescales and their mismatches. We show that our method accurately captures the first two moments of the stationary probability density when compared with exact stochastic simulation methods, while reducing the computational runtime by several orders of magnitude. Our approach provides a method that is practical, computationally efficient and physically accurate to study systems that are simultaneously subject to a variety of noise sources.
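To make the idea of a chemical Langevin equation concrete, here is a minimal sketch of the standard (intrinsic-noise-only) CLE for a birth-death process, integrated with the Euler-Maruyama scheme. It is an assumption-laden illustration and not the extrinsic CLE proposed in the paper: the propensities here are constant in time, and the function name, the rate values, and the clamping of negative copy numbers are all choices made for this example.

```python
import math
import random


def simulate_cle_birth_death(k_prod, k_deg, x0, dt, n_steps, seed=0):
    """Euler-Maruyama integration of the CLE for 0 -> X (rate k_prod), X -> 0 (rate k_deg * x)."""
    rng = random.Random(seed)
    x = float(x0)
    path = [x]
    sqrt_dt = math.sqrt(dt)
    for _ in range(n_steps):
        a_prod = k_prod
        a_deg = k_deg * max(x, 0.0)  # clamp so the propensity stays non-negative
        drift = a_prod - a_deg
        # One independent Gaussian increment per reaction channel.
        noise = (math.sqrt(a_prod) * rng.gauss(0.0, 1.0)
                 - math.sqrt(a_deg) * rng.gauss(0.0, 1.0))
        x += drift * dt + noise * sqrt_dt
        path.append(x)
    return path


# Illustrative rates only (assumed values).
path = simulate_cle_birth_death(k_prod=100.0, k_deg=1.0, x0=100.0, dt=0.01, n_steps=50_000)
mean = sum(path) / len(path)
var = sum((x - mean) ** 2 for x in path) / len(path)
print(mean, var)
```

For this simple system the stationary mean and variance are both expected to be close to `k_prod / k_deg`, mirroring the Poisson statistics of the underlying master equation. Extending the sketch towards the abstract's setting would mean letting `k_prod` or `k_deg` themselves fluctuate in time.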
Affiliation(s)
- Lucy Ham
- The University of Melbourne, University of Melbourne, Australia
|
19
|
Filatova T, Popović N, Grima R. Modulation of nuclear and cytoplasmic mRNA fluctuations by time-dependent stimuli: Analytical distributions. Math Biosci 2022; 347:108828. [DOI: 10.1016/j.mbs.2022.108828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2022] [Revised: 04/15/2022] [Accepted: 04/15/2022] [Indexed: 10/18/2022]
|
20
|
Gorin G, Pachter L. Modeling bursty transcription and splicing with the chemical master equation. Biophys J 2022; 121:1056-1069. [PMID: 35143775 PMCID: PMC8943761 DOI: 10.1016/j.bpj.2022.02.004] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 07/30/2021] [Revised: 11/29/2021] [Accepted: 02/03/2022] [Indexed: 11/16/2022]
Abstract
Splicing cascades that alter gene products posttranscriptionally also affect expression dynamics. We study a class of processes and associated distributions that emerge from models of bursty promoters coupled to directed acyclic graphs of splicing. These solutions provide full time-dependent joint distributions for an arbitrary number of species with general noise behaviors and transient phenomena, offering qualitative and quantitative insights about how splicing can regulate expression dynamics. Finally, we derive a set of quantitative constraints on the minimum complexity necessary to reproduce gene coexpression patterns using synchronized burst models. We validate these findings by analyzing long-read sequencing data, where we find evidence of expression patterns largely consistent with these constraints.
Affiliation(s)
- Gennady Gorin
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California
- Lior Pachter
- Division of Biology and Biological Engineering & Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California.
|
21
|
Szavits-Nossan J, Grima R. Mean-field theory accurately captures the variation of copy number distributions across the mRNA life cycle. Phys Rev E 2022; 105:014410. [PMID: 35193216 DOI: 10.1103/physreve.105.014410] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/24/2021] [Accepted: 11/15/2021] [Indexed: 06/14/2023]
Abstract
We consider a stochastic model where a gene switches between two states, an mRNA transcript is released in the active state, and subsequently it undergoes an arbitrary number of sequential unimolecular steps before being degraded. The reactions effectively describe various stages of the mRNA life cycle such as initiation, elongation, termination, splicing, export, and degradation. We construct a mean-field approach that leads to closed-form steady-state distributions for the number of transcript molecules at each stage of the mRNA life cycle. By comparison with stochastic simulations, we show that the approximation is highly accurate over all the parameter space, independent of the type of expression (constitutive or bursty) and of the shape of the distribution (unimodal, bimodal, and nearly bimodal). The theory predicts that in a population of identical cells, any bimodality is gradually washed away as the mRNA progresses through its life cycle.
Affiliation(s)
- Juraj Szavits-Nossan
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, United Kingdom
- Ramon Grima
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, United Kingdom
|
22
|
Rocca A, Kholodenko BN. Can Systems Biology Advance Clinical Precision Oncology? Cancers (Basel) 2021; 13:6312. [PMID: 34944932 PMCID: PMC8699328 DOI: 10.3390/cancers13246312] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/04/2021] [Accepted: 12/10/2021] [Indexed: 12/13/2022]
Abstract
Precision oncology is perceived as a way forward to treat individual cancer patients. However, knowing particular cancer mutations is not enough for optimal therapeutic treatment, because cancer genotype-phenotype relationships are nonlinear and dynamic. Systems biology studies biological processes at the systems level, using an array of techniques, ranging from statistical methods to network reconstruction and analysis, to mathematical modeling. Its goal is to reconstruct the complex and often counterintuitive dynamic behavior of biological systems and quantitatively predict their responses to environmental perturbations. In this paper, we review the impact of systems biology on precision oncology. We show examples of how the analysis of signal transduction networks allows one to dissect resistance to targeted therapies and to inform the choice of combinations of targeted drugs based on tumor molecular alterations. Patient-specific biomarkers based on dynamical models of signaling networks can have a greater prognostic value than conventional biomarkers. These examples support systems biology models as valuable tools to advance clinical and translational oncological research.
Affiliation(s)
- Andrea Rocca
- Hygiene and Public Health, Local Health Unit of Romagna, 47121 Forlì, Italy
- Boris N. Kholodenko
- Systems Biology Ireland, School of Medicine, University College Dublin, Belfield, D04 V1W8 Dublin, Ireland
- Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Belfield, D04 V1W8 Dublin, Ireland
- Department of Pharmacology, Yale University School of Medicine, New Haven, CT 06520, USA
|
23
|
Rommelfanger MK, MacLean AL. A single-cell resolved cell-cell communication model explains lineage commitment in hematopoiesis. Development 2021; 148:273837. [PMID: 34935903 PMCID: PMC8722395 DOI: 10.1242/dev.199779] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/03/2021] [Accepted: 11/06/2021] [Indexed: 01/29/2023]
Abstract
Cells do not make fate decisions independently. Arguably, every cell-fate decision occurs in response to environmental signals. In many cases, cell-cell communication alters the dynamics of the internal gene regulatory network of a cell to initiate cell-fate transitions, yet models rarely take this into account. Here, we have developed a multiscale perspective to study the granulocyte-monocyte versus megakaryocyte-erythrocyte fate decisions. This transition is dictated by the GATA1-PU.1 network: a classical example of a bistable cell-fate system. We show that, for a wide range of cell communication topologies, even subtle changes in signaling can have pronounced effects on cell-fate decisions. We go on to show how cell-cell coupling through signaling can spontaneously break the symmetry of a homogenous cell population. Noise, both intrinsic and extrinsic, shapes the decision landscape profoundly, and affects the transcriptional dynamics underlying this important hematopoietic cell-fate decision-making system. This article has an associated ‘The people behind the papers’ interview. Summary: Through theory and computational modeling, cell-cell communication is revealed to be a crucial and under-appreciated determinant of cell-fate decision-making during hematopoiesis.
Affiliation(s)
- Megan K Rommelfanger
- Department of Quantitative and Computational Biology, University of Southern California, 1050 Childs Way, Los Angeles, CA 90089, USA
- Adam L MacLean
- Department of Quantitative and Computational Biology, University of Southern California, 1050 Childs Way, Los Angeles, CA 90089, USA
|
24
|
Ham L, Jackson M, Stumpf MPH. Pathway dynamics can delineate the sources of transcriptional noise in gene expression. eLife 2021; 10:e69324. [PMID: 34636320 PMCID: PMC8608387 DOI: 10.7554/elife.69324] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 04/12/2021] [Accepted: 10/11/2021] [Indexed: 11/25/2022]
Abstract
Single-cell expression profiling opens up new vistas on cellular processes. Extensive cell-to-cell variability at the transcriptomic and proteomic level has been one of the stand-out observations. Because most experimental analyses are destructive we only have access to snapshot data of cellular states. This loss of temporal information presents significant challenges for inferring dynamics, as well as causes of cell-to-cell variability. In particular, we typically cannot separate dynamic variability from within cells ('intrinsic noise') from variability across the population ('extrinsic noise'). Here, we make this non-identifiability mathematically precise, allowing us to identify new experimental set-ups that can assist in resolving this non-identifiability. We show that multiple generic reporters from the same biochemical pathways (e.g. mRNA and protein) can infer magnitudes of intrinsic and extrinsic transcriptional noise, identifying sources of heterogeneity. Stochastic simulations support our theory, and demonstrate that 'pathway-reporters' compare favourably to the well-known, but often difficult to implement, dual-reporter method.
Affiliation(s)
- Lucy Ham
- School of BioSciences, University of Melbourne, Melbourne, Australia
- Marcel Jackson
- Department of Mathematics and Statistics, La Trobe University, Melbourne, Australia
- Michael PH Stumpf
- School of Mathematics and Statistics, University of Melbourne, Melbourne, Australia
|
25
|
Noise distorts the epigenetic landscape and shapes cell-fate decisions. Cell Syst 2021; 13:83-102.e6. [PMID: 34626539 DOI: 10.1016/j.cels.2021.09.002] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 01/20/2021] [Revised: 06/21/2021] [Accepted: 09/02/2021] [Indexed: 12/24/2022]
Abstract
The Waddington epigenetic landscape has become an iconic representation of the cellular differentiation process. Recent single-cell transcriptomic data provide new opportunities for quantifying this originally conceptual tool, offering insight into the gene regulatory networks underlying cellular development. While many methods for constructing the landscape have been proposed, by far the most commonly employed approach is based on computing the landscape as the negative logarithm of the steady-state probability distribution. Here, we use simple models to highlight the complexities and limitations that arise when reconstructing the potential landscape in the presence of stochastic fluctuations. We consider how the landscape changes in accordance with different stochastic systems and show that it is the subtle interplay between the deterministic and stochastic components of the system that ultimately shapes the landscape. We further discuss how the presence of noise has important implications for the identifiability of the regulatory dynamics from experimental data. A record of this paper's transparent peer review process is included in the supplemental information.
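The construction mentioned in the abstract above, taking the landscape as the negative logarithm of the steady-state probability distribution, can be sketched in a few lines. This is a minimal illustration under simplifying assumptions: the state space is discrete, the steady-state distribution is estimated from empirical frequencies, and both the function name `quasi_potential` and the toy sample are hypothetical.

```python
import math
from collections import Counter


def quasi_potential(samples):
    """Return {state: -ln P(state)}, with P estimated from empirical frequencies."""
    counts = Counter(samples)
    n = len(samples)
    return {state: -math.log(c / n) for state, c in counts.items()}


# Toy bimodal sample (made-up data): states 0 and 3 are visited most often.
samples = [0] * 40 + [1] * 10 + [2] * 10 + [3] * 40
landscape = quasi_potential(samples)
for state in sorted(landscape):
    print(state, round(landscape[state], 3))
```

Frequently visited states sit in the wells (low values) and rarely visited states on the barrier between them. States that never appear in the sample have no entry, which hints at one practical limitation of estimating the landscape from finite, noisy data.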
|
26
|
Ham L, Schnoerr D, Brackston RD, Stumpf MPH. Exactly solvable models of stochastic gene expression. J Chem Phys 2020; 152:144106. [PMID: 32295361 DOI: 10.1063/1.5143540] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 01/14/2023]
Abstract
Stochastic models are key to understanding the intricate dynamics of gene expression. However, the simplest models that only account for active and inactive states of a gene fail to capture common observations in both prokaryotic and eukaryotic organisms. Here, we consider multistate models of gene expression that generalize the canonical Telegraph process and are capable of capturing the joint effects of transcription factors, heterochromatin state, and DNA accessibility (or, in prokaryotes, sigma-factor activity) on transcript abundance. We propose two approaches for solving classes of these generalized systems. The first approach offers a fresh perspective on a general class of multistate models and allows us to "decompose" more complicated systems into simpler processes, each of which can be solved analytically. This enables us to obtain a solution of any model from this class. Next, we develop an approximation method based on a power series expansion of the stationary distribution for an even broader class of multistate models of gene transcription. We further show that models from both classes cannot have a heavy-tailed distribution in the absence of extrinsic noise. The combination of analytical and computational solutions for these realistic gene expression models also holds the potential to design synthetic systems and control the behavior of naturally evolved gene expression systems in guiding cell-fate decisions.
Affiliation(s)
- Lucy Ham
- School of BioSciences and School of Mathematics and Statistics, University of Melbourne, Parkville, VIC 3010, Australia
- David Schnoerr
- Department of Life Sciences, Imperial College London, South Kensington, London SW7 2AZ, United Kingdom
- Rowan D Brackston
- Department of Life Sciences, Imperial College London, South Kensington, London SW7 2AZ, United Kingdom
- Michael P H Stumpf
- School of BioSciences and School of Mathematics and Statistics, University of Melbourne, Parkville, VIC 3010, Australia
|
27
|
Gorin G, Pachter L. Special function methods for bursty models of transcription. Phys Rev E 2020; 102:022409. [PMID: 32942485 DOI: 10.1103/physreve.102.022409] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 04/04/2020] [Accepted: 08/10/2020] [Indexed: 11/07/2022]
Abstract
We explore a Markov model used in the analysis of gene expression, involving the bursty production of pre-mRNA, its conversion to mature mRNA, and its consequent degradation. We demonstrate that the integration used to compute the solution of the stochastic system can be approximated by the evaluation of special functions. Furthermore, the form of the special function solution generalizes to a broader class of burst distributions. In light of the broader goal of biophysical parameter inference from transcriptomics data, we apply the method to simulated data, demonstrating effective control of precision and runtime. Finally, we propose and validate a non-Bayesian approach for parameter estimation based on the characteristic function of the target joint distribution of pre-mRNA and mRNA.
Affiliation(s)
- Gennady Gorin
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, USA
- Lior Pachter
- Division of Biology and Biological Engineering & Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California 91125, USA
|
28
|
Perez-Carrasco R, Beentjes C, Grima R. Effects of cell cycle variability on lineage and population measurements of messenger RNA abundance. J R Soc Interface 2020; 17:20200360. [PMID: 32634365 PMCID: PMC7423421 DOI: 10.1098/rsif.2020.0360] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 05/13/2020] [Accepted: 06/17/2020] [Indexed: 12/17/2022]
Abstract
Many models of gene expression do not explicitly incorporate a cell cycle description. Here, we derive a theory describing how messenger RNA (mRNA) fluctuations for constitutive and bursty gene expression are influenced by stochasticity in the duration of the cell cycle and the timing of DNA replication. Analytical expressions for the moments show that omitting cell cycle duration introduces an error in the predicted mean number of mRNAs that is a monotonically decreasing function of η, which is proportional to the ratio of the mean cell cycle duration and the mRNA lifetime. By contrast, the error in the variance of the mRNA distribution is highest for intermediate values of η consistent with genome-wide measurements in many organisms. Using eukaryotic cell data, we estimate the errors in the mean and variance to be at most 3% and 25%, respectively. Furthermore, we derive an accurate negative binomial mixture approximation to the mRNA distribution. This indicates that stochasticity in the cell cycle can introduce fluctuations in mRNA numbers that are similar to the effect of bursty transcription. Finally, we show that for real experimental data, disregarding cell cycle stochasticity can introduce errors in the inference of transcription rates larger than 10%.
Affiliation(s)
- Ruben Perez-Carrasco
- Department of Mathematics, University College London, London, UK
- Department of Life Sciences, Imperial College London, London, UK
- Ramon Grima
- School of Biological Sciences, University of Edinburgh, Edinburgh, UK
|
29
|
Schuh L, Saint-Antoine M, Sanford EM, Emert BL, Singh A, Marr C, Raj A, Goyal Y. Gene Networks with Transcriptional Bursting Recapitulate Rare Transient Coordinated High Expression States in Cancer. Cell Syst 2020; 10:363-378.e12. [PMID: 32325034 PMCID: PMC7293108 DOI: 10.1016/j.cels.2020.03.004] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 07/16/2019] [Revised: 02/03/2020] [Accepted: 03/24/2020] [Indexed: 10/24/2022]
Abstract
Non-genetic transcriptional variability is a potential mechanism for therapy resistance in melanoma. Specifically, rare subpopulations of cells occupy a transient pre-resistant state characterized by coordinated high expression of several genes and survive therapy. How might these rare states arise and disappear within the population? It is unclear whether the canonical models of probabilistic transcriptional pulsing can explain this behavior, or if it requires special, hitherto unidentified mechanisms. We show that a minimal model of transcriptional bursting and gene interactions can give rise to rare coordinated high expression states. These states occur more frequently in networks with low connectivity and depend on three parameters. While entry into these states is initiated by a long transcriptional burst that also triggers entry of other genes, the exit occurs through independent inactivation of individual genes. Together, we demonstrate that established principles of gene regulation are sufficient to describe this behavior and argue for its more general existence. A record of this paper's transparent peer review process is included in the Supplemental Information.
Affiliation(s)
- Lea Schuh
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA; Helmholtz Zentrum München-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg 85764, Germany; Department of Mathematics, Technical University of Munich, Garching 85748, Germany
- Michael Saint-Antoine
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19716, USA
- Eric M Sanford
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Benjamin L Emert
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Abhyudai Singh
- Electrical and Computer Engineering, University of Delaware, Newark, DE 19716, USA
- Carsten Marr
- Helmholtz Zentrum München-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg 85764, Germany
- Arjun Raj
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Yogesh Goyal
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA.
|