1
Torres A, Cockerell S, Phillips M, Balázsi G, Ghosh K. MaxCal can infer models from coupled stochastic trajectories of gene expression and cell division. Biophys J 2023; 122:2623-2635. [PMID: 37218129 PMCID: PMC10397576 DOI: 10.1016/j.bpj.2023.05.017] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/24/2023] [Revised: 05/03/2023] [Accepted: 05/18/2023] [Indexed: 05/24/2023]
Abstract
Gene expression is inherently noisy due to small numbers of proteins and nucleic acids inside a cell. Likewise, cell division is stochastic, particularly when tracking at the level of a single cell. The two can be coupled when gene expression affects the rate of cell division. Single-cell time-lapse experiments can measure both fluctuations by simultaneously recording protein levels inside a cell and its stochastic division. These information-rich noisy trajectory data sets can be harnessed to learn about the underlying molecular and cellular details that are often not known a priori. A critical question is: How can we infer a model given data where fluctuations at two levels (gene expression and cell division) are intricately convoluted? We show the principle of maximum caliber (MaxCal), integrated within a Bayesian framework, can be used to infer several cellular and molecular details (division rates, protein production, and degradation rates) from these coupled stochastic trajectories (CSTs). We demonstrate this proof of concept using synthetic data generated from a known model. An additional challenge in data analysis is that trajectories are often not in protein numbers, but in noisy fluorescence that depends on protein number in a probabilistic manner. We again show that MaxCal can infer important molecular and cellular rates even when data are in fluorescence, another example of CST with three confounding factors (gene expression noise, cell division noise, and fluorescence distortion), all coupled. Our approach will provide guidance to build models in synthetic biology experiments as well as general biological systems where examples of CSTs are abundant.
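The abstract above mentions testing inference on synthetic data generated from a known model. As a rough illustration of what generating such a trajectory can look like, here is a minimal sketch that assumes a plain birth-death model (constant production, per-molecule degradation) simulated with the Gillespie algorithm. The model, the function name `gillespie_birth_death`, and the parameter values are illustrative assumptions; the model used in the paper also couples expression to cell division, which this sketch does not attempt.

```python
import random


def gillespie_birth_death(k_prod, k_deg, n0, t_end, seed=0):
    """Simulate a birth-death process and return (event times, protein counts).

    Hypothetical helper: k_prod is a constant production rate and k_deg a
    per-molecule degradation rate; both are assumptions for illustration.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        rate_prod = k_prod        # propensity of producing one protein
        rate_deg = k_deg * n      # propensity of degrading one protein
        total = rate_prod + rate_deg
        if total == 0.0:
            break                 # no reaction can fire any more
        t += rng.expovariate(total)   # exponential waiting time to next event
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * total < rate_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


times, counts = gillespie_birth_death(k_prod=10.0, k_deg=0.1, n0=0, t_end=200.0)
```

Each event changes the count by exactly one molecule, and the count cannot go negative because the degradation propensity vanishes at zero copies.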
Affiliation(s)
- Andrew Torres
  - Department of Physics and Astronomy, University of Denver, Denver, Colorado
- Spencer Cockerell
  - Department of Physics and Astronomy, University of Denver, Denver, Colorado
- Michael Phillips
  - Department of Physics and Astronomy, University of Denver, Denver, Colorado
- Gábor Balázsi
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York
- Kingshuk Ghosh
  - Molecular and Cellular Biophysics, University of Denver, Denver, Colorado
  - Department of Physics and Astronomy, University of Denver, Denver, Colorado
2
Wang Y, He S. Inference on autoregulation in gene expression with variance-to-mean ratio. J Math Biol 2023; 86:87. [PMID: 37131095 PMCID: PMC10154285 DOI: 10.1007/s00285-023-01924-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/16/2022] [Revised: 04/14/2023] [Accepted: 04/18/2023] [Indexed: 05/04/2023]
Abstract
Some genes can promote or repress their own expressions, which is called autoregulation. Although gene regulation is a central topic in biology, autoregulation is much less studied. In general, it is extremely difficult to determine the existence of autoregulation with direct biochemical approaches. Nevertheless, some papers have observed that certain types of autoregulations are linked to noise levels in gene expression. We generalize these results by two propositions on discrete-state continuous-time Markov chains. These two propositions form a simple but robust method to infer the existence of autoregulation from gene expression data. This method only needs to compare the mean and variance of the gene expression level. Compared to other methods for inferring autoregulation, our method only requires non-interventional one-time data, and does not need to estimate parameters. Besides, our method has few restrictions on the model. We apply this method to four groups of experimental data and find some genes that might have autoregulation. Some inferred autoregulations have been verified by experiments or other theoretical works.
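The method described above only needs the mean and variance of the expression level. A minimal sketch of computing the variance-to-mean ratio (VMR) for a set of single-cell counts follows. The function name, the sample data, and the use of the population variance are illustrative assumptions. The value 1 is used only as a reference point because a Poisson distribution has VMR equal to 1; the precise conditions under which a given VMR indicates autoregulation are stated in the paper's propositions and are not reproduced here.

```python
from statistics import mean, pvariance


def variance_to_mean_ratio(counts):
    """Return variance / mean for a sequence of expression counts.

    Uses the population variance; this is an assumption of the sketch,
    and a sample variance could be substituted.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("VMR is undefined when the mean is zero")
    return pvariance(counts) / m


# Hypothetical expression counts from eight cells (made-up numbers).
sample = [3, 5, 4, 6, 2, 5, 4, 3]
vmr = variance_to_mean_ratio(sample)

# Compare against the Poisson reference value of 1.
relation = "below" if vmr < 1 else "at or above"
print(f"VMR = {vmr:.3f}, {relation} the Poisson reference of 1")
```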
Affiliation(s)
- Yue Wang
  - Department of Computational Medicine, University of California, Los Angeles, CA, 90095, USA
  - Institut des Hautes Études Scientifiques (IHÉS), Bures-sur-Yvette, 91440, Essonne, France
- Siqi He
  - Simons Center for Geometry and Physics, Stony Brook University, Stony Brook, NY, 11794, USA
3
Critical Comparison of MaxCal and Other Stochastic Modeling Approaches in Analysis of Gene Networks. Entropy 2021; 23:e23030357. [PMID: 33802879 PMCID: PMC8002683 DOI: 10.3390/e23030357] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2021] [Revised: 03/09/2021] [Accepted: 03/10/2021] [Indexed: 11/24/2022]
Abstract
Learning the underlying details of a gene network with feedback is critical in designing new synthetic circuits. Yet, quantitative characterization of these circuits remains limited. This is due to the fact that experiments can only measure partial information from which the details of the circuit must be inferred. One potentially useful avenue is to harness hidden information from single-cell stochastic gene expression time trajectories measured for long periods of time—recorded at frequent intervals—over multiple cells. This raises the feasibility vs. accuracy dilemma while deciding between different models of mining these stochastic trajectories. We demonstrate that inference based on the Maximum Caliber (MaxCal) principle is the method of choice by critically evaluating its computational efficiency and accuracy against two other typical modeling approaches: (i) a detailed model (DM) with explicit consideration of multiple molecules including protein-promoter interaction, and (ii) a coarse-grain model (CGM) using Hill type functions to model feedback. MaxCal provides a reasonably accurate model while being significantly more computationally efficient than DM and CGM. Furthermore, MaxCal requires minimal assumptions since it is a top-down approach and allows systematic model improvement by including constraints of higher order, in contrast to traditional bottom-up approaches that require more parameters or ad hoc assumptions. Thus, based on efficiency, accuracy, and ability to build minimal models, we propose MaxCal as a superior alternative to traditional approaches (DM, CGM) when inferring underlying details of gene circuits with feedback from limited data.
4
Weistuch C, Agozzino L, Mujica-Parodi LR, Dill KA. Inferring a network from dynamical signals at its nodes. PLoS Comput Biol 2020; 16:e1008435. [PMID: 33253160 PMCID: PMC7728228 DOI: 10.1371/journal.pcbi.1008435] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/07/2020] [Revised: 12/10/2020] [Accepted: 10/12/2020] [Indexed: 12/26/2022]
Abstract
We give an approximate solution to the difficult inverse problem of inferring the topology of an unknown network from given time-dependent signals at the nodes. For example, we measure signals from individual neurons in the brain, and infer how they are inter-connected. We use Maximum Caliber as an inference principle. The combinatorial challenge of high-dimensional data is handled using two different approximations to the pairwise couplings. We show two proofs of principle: in a nonlinear genetic toggle switch circuit, and in a toy neural network.
Affiliation(s)
- Corey Weistuch
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
  - Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA
- Luca Agozzino
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
- Lilianne R. Mujica-Parodi
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
  - Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York, USA
  - Department of Biomedical Engineering, Stony Brook University, Stony Brook, New York, USA
  - Program in Neuroscience, Stony Brook University, Stony Brook, New York, USA
  - Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Ken A. Dill
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
  - Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York, USA
  - Department of Chemistry, Stony Brook University, Stony Brook, New York, USA
5
Abstract
Ever since Clausius in 1865 and Boltzmann in 1877, the concepts of entropy and of its maximization have been the foundations for predicting how material equilibria derive from microscopic properties. But, despite much work, there has been no equally satisfactory general variational principle for nonequilibrium situations. However, in 1980, a new avenue was opened by E.T. Jaynes and by Shore and Johnson. We review here maximum caliber, which is a maximum-entropy-like principle that can infer distributions of flows over pathways, given dynamical constraints. This approach is providing new insights, particularly into few-particle complex systems, such as gene circuits, protein conformational reaction coordinates, network traffic, bird flocking, cell motility, and neuronal firing.
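For readers unfamiliar with the principle, the textbook form of the maximum caliber variational problem can be written out as follows. This is a sketch in generic notation, not taken from the review itself: the symbols $\Gamma$ (a trajectory), $p_\Gamma$ (its probability), $A_i(\Gamma)$ (a path observable whose average is constrained), and $\lambda_i$, $\mu$ (Lagrange multipliers) are assumptions introduced here for illustration.

```latex
% Caliber: path entropy plus constraint terms
\mathcal{C}[p] = -\sum_{\Gamma} p_\Gamma \ln p_\Gamma
  + \sum_i \lambda_i \Big( \sum_{\Gamma} p_\Gamma A_i(\Gamma) - \langle A_i \rangle \Big)
  + \mu \Big( \sum_{\Gamma} p_\Gamma - 1 \Big)

% Stationarity with respect to each p_\Gamma
\frac{\partial \mathcal{C}}{\partial p_\Gamma}
  = -\ln p_\Gamma - 1 + \sum_i \lambda_i A_i(\Gamma) + \mu = 0

% Resulting path distribution and dynamical partition function
p_\Gamma = \frac{1}{Z} \exp\Big( \sum_i \lambda_i A_i(\Gamma) \Big),
\qquad
Z = \sum_{\Gamma} \exp\Big( \sum_i \lambda_i A_i(\Gamma) \Big)
```

The structure mirrors equilibrium maximum entropy, with trajectories taking the place of microstates and the multipliers $\lambda_i$ fixed by matching the constrained path averages $\langle A_i \rangle$.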
Affiliation(s)
- Kingshuk Ghosh
  - Department of Physics and Astronomy, University of Denver, Denver, Colorado 80209, USA
- Purushottam D. Dixit
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA
  - Department of Physics, University of Florida, Gainesville, Florida 32611, USA
- Luca Agozzino
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, USA
- Ken A. Dill
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, USA
6
Tavakoli M, Tsekouras K, Day R, Dunn KW, Pressé S. Quantitative Kinetic Models from Intravital Microscopy: A Case Study Using Hepatic Transport. J Phys Chem B 2019; 123:7302-7312. [PMID: 31298856 PMCID: PMC6857640 DOI: 10.1021/acs.jpcb.9b04729] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Abstract
The liver performs critical physiological functions, including metabolizing and removing substances, such as toxins and drugs, from the bloodstream. Hepatotoxicity itself is intimately linked to abnormal hepatic transport, and hepatotoxicity remains the primary reason drugs in development fail and approved drugs are withdrawn from the market. For this reason, we propose to analyze, across liver compartments, the transport kinetics of fluorescein (a fluorescent marker used as a proxy for drug molecules) using intravital microscopy data. To resolve the transport kinetics quantitatively from fluorescence data, we account for the effect that different liver compartments (with different chemical properties) have on fluorescein's emission rate. To do so, we develop ordinary differential equation transport models from the data where the kinetics is related to the observable fluorescence levels by "measurement parameters" that vary across different liver compartments. On account of the steep non-linearities in the kinetics and stochasticity inherent to the model, we infer kinetic and measurement parameters by generalizing the method of parameter cascades. For this application, the method of parameter cascades ensures fast and precise parameter estimates from noisy time traces.
Affiliation(s)
- Meysam Tavakoli
  - Department of Physics, Indiana University-Purdue University, Indianapolis, Indiana 46202, United States
- Richard Day
  - Department of Cellular and Integrative Physiology, Indiana University School of Medicine, Indianapolis, Indiana 46202, United States
- Kenneth W. Dunn
  - Department of Medicine and Biochemistry, Indiana University School of Medicine, Indianapolis, Indiana 46202, United States
- Steve Pressé
  - Center for Biological Physics, Arizona State University, Tempe, Arizona 85287, United States
  - School of Molecular Sciences, Arizona State University, Tempe, Arizona 85287, United States
7
Firman T, Amgalan A, Ghosh K. Maximum Caliber Can Build and Infer Models of Oscillation in a Three-Gene Feedback Network. J Phys Chem B 2019; 123:343-355. [PMID: 30507199 DOI: 10.1021/acs.jpcb.8b07465] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/16/2022]
Abstract
Single-cell protein expression time trajectories provide rich temporal data quantifying cellular variability and its role in dictating fitness. However, theoretical models to analyze and fully extract information from these measurements remain limited for three reasons: (i) gene expression profiles are noisy, rendering models of averages inapplicable, (ii) experiments typically measure only a few protein species while leaving other molecular actors (necessary to build traditional bottom-up models) unnoticed, and (iii) measured data are in fluorescence, not particle number. We recently addressed these challenges in an alternate top-down approach using the principle of Maximum Caliber (MaxCal) to model genetic switches with one and two protein species. In the present work we address scalability and broader applicability of MaxCal by extending to a three-gene (A, B, C) feedback network that exhibits oscillation, commonly known as the repressilator. We test MaxCal's inferential power by using synthetic data of noisy protein number time traces (serving as a proxy for experimental data) generated from a known underlying model. We notice that the minimal MaxCal model (accounting for production, degradation, and only one type of symmetric coupling between all three species) reasonably infers several underlying features of the circuit such as the effective production rate, degradation rate, frequency of oscillation, and protein number distribution. Next, we build models of higher complexity including different levels of coupling between A, B, and C and rigorously assess their relative performance. While the minimal model (with four parameters) performs remarkably well, we note that the most complex model (with six parameters) allowing all possible forms of crosstalk between A, B, and C slightly improves prediction of rates, but avoids ad hoc assumption of all the other models. It is also the model of choice based on Bayesian information criteria.
We further analyzed time trajectories in arbitrary fluorescence (using synthetic trajectories) to mimic realistic data. We conclude that even with a three-protein system including both fluorescence noise and intrinsic gene expression fluctuations, MaxCal can faithfully infer underlying details of the network, opening future directions to model other network motifs with many species.
8
Firman T, Amgalan A, Ghosh K. Maximum Caliber Can Build and Infer Models of Oscillation in a Three-Gene Feedback Network. J Phys Chem A 2018. [DOI: 10.1021/acs.jpca.8b07465] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
9
Lakhani V, Tan L, Mukherjee S, Stewart WCL, Swords WE, Das J. Mutations in bacterial genes induce unanticipated changes in the relationship between bacterial pathogens in experimental otitis media. Royal Society Open Science 2018; 5:180810. [PMID: 30564392 PMCID: PMC6281918 DOI: 10.1098/rsos.180810] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/01/2018] [Accepted: 10/19/2018] [Indexed: 05/09/2023]
Abstract
Otitis media (OM) is a common polymicrobial infection of the middle ear in children under the age of 15 years. A widely used experimental strategy to analyse roles of specific phenotypes of bacterial pathogens of OM is to study changes in co-infection kinetics of bacterial populations in animal models when a wild-type bacterial strain is replaced by a specific isogenic mutant strain in the co-inoculating mixtures. As relationships between the OM bacterial pathogens within the host are regulated by many interlinked processes, connecting the changes in the co-infection kinetics to a bacterial phenotype can be challenging. We investigated middle ear co-infections in adult chinchillas (Chinchilla lanigera) by two major OM pathogens: non-typeable Haemophilus influenzae (NTHi) and Moraxella catarrhalis (Mcat), as well as isogenic mutant strains in each bacterial species. We analysed the infection kinetic data using Lotka-Volterra population dynamics, maximum entropy inference and Akaike information criteria (AIC)-based model selection. We found that changes in relationships between the bacterial pathogens that were not anticipated in the design of the co-infection experiments involving mutant strains are common and were strong regulators of the co-infecting bacterial populations. The framework developed here allows for a systematic analysis of host-host variations of bacterial populations and small sizes of animal cohorts in co-infection experiments to quantify the role of specific mutant strains in changing the infection kinetics. Our combined approach can be used to analyse the functional footprint of mutant strains in regulating co-infection kinetics in models of experimental OM and other polymicrobial diseases.
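The abstract above names AIC-based model selection as one step of the analysis. A minimal sketch of that step follows, using the standard definition AIC = 2k - 2 ln L, where k is the number of fitted parameters and ln L the maximized log-likelihood. The candidate model names, log-likelihood values, and parameter counts are made-up placeholders, not results from the paper.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is preferred)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical fits: (maximized log-likelihood, number of parameters).
candidates = {
    "independent_growth": (-120.0, 4),
    "with_interaction": (-112.5, 6),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)  # model with the lowest AIC

for name, score in scores.items():
    print(f"{name}: AIC = {score:.1f}")
print("preferred model:", best)
```

In this made-up case the extra two parameters are justified because the gain in log-likelihood outweighs the penalty of 2 per parameter.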
Affiliation(s)
- Vinal Lakhani
  - Battelle Center for Mathematical Medicine, The Research Institute at the Nationwide Children's Hospital, 700 Children's Drive, Columbus, OH 43205, USA
- Li Tan
  - Department of Microbiology and Immunology, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
- Sayak Mukherjee
  - Battelle Center for Mathematical Medicine, The Research Institute at the Nationwide Children's Hospital, 700 Children's Drive, Columbus, OH 43205, USA
- William C. L. Stewart
  - Battelle Center for Mathematical Medicine, The Research Institute at the Nationwide Children's Hospital, 700 Children's Drive, Columbus, OH 43205, USA
- W. Edward Swords
  - Department of Microbiology and Immunology, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
  - Division of Pulmonary, Allergy & Critical Care Medicine, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Jayajit Das
  - Battelle Center for Mathematical Medicine, The Research Institute at the Nationwide Children's Hospital, 700 Children's Drive, Columbus, OH 43205, USA
  - Department of Pediatrics, The Ohio State University, Columbus, OH 43210, USA
  - Department of Physics, The Ohio State University, Columbus, OH 43210, USA
  - Department of Biophysics Graduate Program, The Ohio State University, Columbus, OH 43210, USA