1
Tang J, Du W, Shu Z, Cao Z. A generative benchmark for evaluating the performance of fluorescent cell image segmentation. Synth Syst Biotechnol 2024; 9:627-637. [PMID: 38798889 PMCID: PMC11127598 DOI: 10.1016/j.synbio.2024.05.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/25/2023] [Revised: 04/13/2024] [Accepted: 05/08/2024] [Indexed: 05/29/2024] Open
Abstract
Fluorescent cell imaging technology is fundamental in life science research, offering a rich source of image data crucial for understanding cell spatial positioning, differentiation, and decision-making mechanisms. As the volume of this data expands, precise image analysis becomes increasingly critical. Cell segmentation, a key analysis step, significantly influences quantitative analysis outcomes. However, selecting the most effective segmentation method is challenging, hindered by existing evaluation methods' inaccuracies, lack of graded evaluation, and narrow assessment scope. Addressing this, we developed a novel framework with two modules: StyleGAN2-based contour generation and Pix2PixHD-based image rendering, producing diverse, graded-density cell images. Using this dataset, we evaluated three leading cell segmentation methods: DeepCell, CellProfiler, and CellPose. Our comprehensive comparison revealed CellProfiler's superior accuracy in segmenting cytoplasm and nuclei. Our framework diversifies cell image data generation and systematically addresses evaluation challenges in cell segmentation technologies, establishing a solid foundation for advancing research and applications in cell image analysis.
Affiliation(s)
- Jun Tang
  - State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, Shanghai, 200237, China
  - MOE Key Laboratory of Smart Manufacturing in Energy Chemical Process, East China University of Science and Technology, Shanghai, 200237, China
- Wei Du
  - MOE Key Laboratory of Smart Manufacturing in Energy Chemical Process, East China University of Science and Technology, Shanghai, 200237, China
- Zhanpeng Shu
  - College of Electrical Engineering, Shanghai Dianji University, Shanghai, 201306, China
- Zhixing Cao
  - State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, Shanghai, 200237, China
2
Cao Z, Chen R, Xu L, Zhou X, Fu X, Zhong W, Grima R. Efficient and scalable prediction of stochastic reaction-diffusion processes using graph neural networks. Math Biosci 2024; 375:109248. [PMID: 38986837 DOI: 10.1016/j.mbs.2024.109248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2024] [Revised: 05/07/2024] [Accepted: 07/03/2024] [Indexed: 07/12/2024]
Abstract
The dynamics of locally interacting particles that are distributed in space give rise to a multitude of complex behaviours. However, the simulation of reaction-diffusion processes, which model such systems, is highly computationally expensive, with the cost increasing rapidly with the size of the space. Here, we devise a graph neural network-based approach that uses cheap Monte Carlo simulations of reaction-diffusion processes in a small space to cast predictions of the dynamics of the same processes in a much larger and more complex space, including spaces modelled by networks with heterogeneous topology. By applying the method to two biological examples, we show that it leads to accurate results in a small fraction of the computation time of standard stochastic simulation methods. The scalability and accuracy of the method suggest it is a promising approach for studying reaction-diffusion processes in complex spatial domains such as those modelling biochemical reactions, population evolution and epidemic spreading.
Affiliation(s)
- Zhixing Cao
  - State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, Shanghai 200237, China; Department of Chemical Engineering, Queen's University, Kingston, Canada K7L 3N6
- Rui Chen
  - Shanghai Jiao Tong University School of Medicine, Shanghai 200127, China
- Libin Xu
  - State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, Shanghai 200237, China
- Xinyi Zhou
  - State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, Shanghai 200237, China
- Xiaoming Fu
  - State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, Shanghai 200237, China
- Weimin Zhong
  - Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai 200237, China
- Ramon Grima
  - School of Biological Sciences, the University of Edinburgh, Max Born Crescent, Edinburgh, EH9 3BF, Scotland, United Kingdom
3
Fang Z, Gupta A, Kumar S, Khammash M. Advanced methods for gene network identification and noise decomposition from single-cell data. Nat Commun 2024; 15:4911. [PMID: 38851792 PMCID: PMC11162465 DOI: 10.1038/s41467-024-49177-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Accepted: 05/24/2024] [Indexed: 06/10/2024] Open
Abstract
Central to analyzing noisy gene expression systems is solving the Chemical Master Equation (CME), which characterizes the probability evolution of the reacting species' copy numbers. Solving CMEs for high-dimensional systems suffers from the curse of dimensionality. Here, we propose a computational method for improved scalability through a divide-and-conquer strategy that optimally decomposes the whole system into a leader system and several conditionally independent follower subsystems. The CME is solved by combining Monte Carlo estimation for the leader system with stochastic filtering procedures for the follower subsystems. We demonstrate this method with high-dimensional numerical examples and apply it to identify a yeast transcription system at the single-cell resolution, leveraging mRNA time-course experimental data. The identification results enable an accurate examination of the heterogeneity in rate parameters among isogenic cells. To validate this result, we develop a noise decomposition technique exploiting time-course data but requiring no supplementary components, e.g., dual-reporters.
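For orientation, the CME referred to in this abstract is commonly written in the general form below. The notation is an assumption for illustration and is not taken from the paper: $P(\mathbf{n}, t)$ is the probability of copy-number state $\mathbf{n}$ at time $t$, and reaction $r$ (of $R$ reactions) has propensity $a_r$ and stoichiometric vector $\boldsymbol{\nu}_r$.

```latex
\frac{\partial P(\mathbf{n}, t)}{\partial t}
  = \sum_{r=1}^{R} \Big[ a_r(\mathbf{n} - \boldsymbol{\nu}_r)\, P(\mathbf{n} - \boldsymbol{\nu}_r, t)
  - a_r(\mathbf{n})\, P(\mathbf{n}, t) \Big]
```

There is one such equation per state $\mathbf{n}$, so the number of coupled equations grows exponentially with the number of species, which is the curse of dimensionality the abstract mentions.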
Affiliation(s)
- Zhou Fang
  - Department of Biosystems Science and Engineering, ETH Zurich, CH-4056, Basel, Switzerland
- Ankit Gupta
  - Department of Biosystems Science and Engineering, ETH Zurich, CH-4056, Basel, Switzerland
- Sant Kumar
  - Department of Biosystems Science and Engineering, ETH Zurich, CH-4056, Basel, Switzerland
- Mustafa Khammash
  - Department of Biosystems Science and Engineering, ETH Zurich, CH-4056, Basel, Switzerland
4
Miles CE, McKinley SA, Ding F, Lehoucq RB. Inferring Stochastic Rates from Heterogeneous Snapshots of Particle Positions. Bull Math Biol 2024; 86:74. [PMID: 38740619 DOI: 10.1007/s11538-024-01301-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Accepted: 04/20/2024] [Indexed: 05/16/2024]
Abstract
Many imaging techniques for biological systems, like fixation of cells coupled with fluorescence microscopy, provide sharp spatial resolution in reporting locations of individuals at a single moment in time but also destroy the dynamics they intend to capture. These snapshot observations contain no information about individual trajectories, but still encode information about movement and demographic dynamics, especially when combined with a well-motivated biophysical model. The relationship between spatially evolving populations and single-moment representations of their collective locations is well-established with partial differential equations (PDEs) and their inverse problems. However, experimental data is commonly a set of locations whose number is insufficient to approximate a continuous-in-space PDE solution. Here, motivated by popular subcellular imaging data of gene expression, we embrace the stochastic nature of the data and investigate the mathematical foundations of parametrically inferring demographic rates from snapshots of particles undergoing birth, diffusion, and death in a nuclear or cellular domain. Toward inference, we rigorously derive a connection between individual particle paths and their presentation as a Poisson spatial process. Using this framework, we investigate the properties of the resulting inverse problem and study factors that affect quality of inference. One pervasive feature of this experimental regime is the presence of cell-to-cell heterogeneity. Rather than being a hindrance, we show that cell-to-cell geometric heterogeneity can increase the quality of inference on dynamics for certain parameter regimes. Altogether, the results serve as a basis for more detailed investigations of subcellular spatial patterns of RNA molecules and other stochastically evolving populations that can only be observed for single instants in their time evolution.
Affiliation(s)
- Scott A McKinley
  - Department of Mathematics, Tulane University, New Orleans, LA, USA
- Fangyuan Ding
  - Departments of Biomedical Engineering, Developmental and Cell Biology, University of California, Irvine, Irvine, USA
- Richard B Lehoucq
  - Discrete Math and Optimization, Sandia National Laboratories, Albuquerque, NM, USA
5
Szavits-Nossan J, Grima R. Solving stochastic gene-expression models using queueing theory: A tutorial review. Biophys J 2024; 123:1034-1057. [PMID: 38594901 PMCID: PMC11079947 DOI: 10.1016/j.bpj.2024.04.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 02/12/2024] [Accepted: 04/02/2024] [Indexed: 04/11/2024] Open
Abstract
Stochastic models of gene expression are typically formulated using the chemical master equation, which can be solved exactly or approximately using a repertoire of analytical methods. Here, we provide a tutorial review of an alternative approach based on queueing theory that has rarely been used in the literature of gene expression. We discuss the interpretation of six types of infinite-server queues from the angle of stochastic single-cell biology and provide analytical expressions for the stationary and nonstationary distributions and/or moments of mRNA/protein numbers and bounds on the Fano factor. This approach may enable the solution of complex models that have hitherto evaded analytical solution.
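As a rough illustration of the queueing view described in this abstract, the sketch below simulates the simplest infinite-server queue, M/M/∞, read as constitutive mRNA production: arrivals are transcription events and each molecule is degraded ("served") independently. The model choice, the function name `simulate_mm_inf`, and all parameter values are assumptions made here for illustration; they are not taken from the review, which covers six more general queue types.

```python
import random


def simulate_mm_inf(arrival_rate, service_rate, t_end, rng):
    """Gillespie simulation of an M/M/infinity queue (hypothetical helper).

    Arrivals model transcription; each of the n molecules present is
    degraded independently at `service_rate`.
    Returns the time-averaged mean and variance of the copy number.
    """
    t, n = 0.0, 0
    w_sum = wn_sum = wn2_sum = 0.0
    while True:
        total_rate = arrival_rate + service_rate * n
        dt = rng.expovariate(total_rate)
        # Weight the current state by the time spent in it (clipped at t_end).
        hold = min(dt, t_end - t)
        w_sum += hold
        wn_sum += hold * n
        wn2_sum += hold * n * n
        if t + dt >= t_end:
            break
        t += dt
        # Pick the event: arrival with probability arrival_rate / total_rate.
        if rng.random() * total_rate < arrival_rate:
            n += 1
        else:
            n -= 1
    mean = wn_sum / w_sum
    var = wn2_sum / w_sum - mean * mean
    return mean, var


rng = random.Random(0)
mean, var = simulate_mm_inf(arrival_rate=10.0, service_rate=1.0, t_end=5000.0, rng=rng)
fano = var / mean
print(f"mean={mean:.2f}, Fano factor={fano:.2f}")
```

For this queue the stationary copy-number distribution is Poisson with mean `arrival_rate / service_rate`, so the estimated Fano factor (variance divided by mean) should come out close to 1.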
Affiliation(s)
- Juraj Szavits-Nossan
  - School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Ramon Grima
  - School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
6
Jiang Q, Jiang J, Wang W, Pan C, Zhong W. Partial Cross Mapping Based on Sparse Variable Selection for Direct Fault Root Cause Diagnosis for Industrial Processes. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:6218-6230. [PMID: 37022853 DOI: 10.1109/tnnls.2023.3242361] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/19/2023]
Abstract
Root cause diagnosis in the process industry is important for ensuring safe production and improving production efficiency. Conventional contribution plot methods struggle with root cause diagnosis because of the smearing effect. Other traditional root cause diagnosis methods, such as Granger causality (GC) and transfer entropy, perform unsatisfactorily for complex industrial processes because of indirect causality. In this work, a regularization and partial cross mapping (PCM)-based root cause diagnosis framework is proposed for efficient direct causality inference and fault propagation path tracing. First, generalized Lasso-based variable selection is performed. The Hotelling T² statistic is formulated and Lasso-based fault reconstruction is applied to select candidate root cause variables. Second, the root cause is diagnosed through PCM, and the propagation path is drawn according to the diagnosis result. The proposed framework is studied in four cases to verify its rationality and effectiveness: a numerical example, the Tennessee Eastman benchmark process, a wastewater treatment process (WWTP), and the decarburization process of high-speed wire rod spring steel.
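To make the Hotelling T² statistic mentioned in this abstract concrete, here is a minimal sketch of how it is usually computed from normal-operation data. It covers only the textbook statistic; the Lasso-based fault reconstruction and PCM steps of the paper are not reproduced, and the function name `hotelling_t2` and the synthetic data are assumptions for illustration.

```python
import numpy as np


def hotelling_t2(train, samples):
    """Hotelling T^2 of each row of `samples` relative to `train`.

    train:   (n, p) array of normal-operation data (hypothetical).
    samples: (m, p) array of observations to score.
    """
    mean = train.mean(axis=0)
    cov = np.cov(train, rowvar=False)   # sample covariance (ddof=1)
    cov_inv = np.linalg.pinv(cov)       # pseudo-inverse guards against singular cov
    centered = samples - mean
    # Quadratic form (x - mean)^T cov^{-1} (x - mean), evaluated row by row.
    return np.einsum("ij,jk,ik->i", centered, cov_inv, centered)


rng = np.random.default_rng(0)
normal = rng.normal(size=(500, 3))
# Hypothetical bias fault: shift variable 0 by six standard deviations.
faulty = normal[:5] + np.array([6.0, 0.0, 0.0])

t2_normal = hotelling_t2(normal, normal)
t2_faulty = hotelling_t2(normal, faulty)
print(t2_normal.mean(), t2_faulty.mean())
```

Samples whose T² exceeds a control limit fitted on normal data would be flagged as faulty; choosing that limit is left out of this sketch.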
7
Liu C, Wang J. Distilling dynamical knowledge from stochastic reaction networks. Proc Natl Acad Sci U S A 2024; 121:e2317422121. [PMID: 38530895 PMCID: PMC10998579 DOI: 10.1073/pnas.2317422121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 02/20/2024] [Indexed: 03/28/2024] Open
Abstract
Stochastic reaction networks are widely used in the modeling of stochastic systems across diverse domains such as biology, chemistry, physics, and ecology. However, the comprehension of the dynamic behaviors inherent in stochastic reaction networks is a formidable undertaking, primarily due to the exponential growth in the number of possible states or trajectories as the state space dimension increases. In this study, we introduce a knowledge distillation method based on reinforcement learning principles, aimed at compressing the dynamical knowledge encoded in stochastic reaction networks into a singular neural network construct. The trained neural network possesses the capability to accurately predict the state conditional joint probability distribution that corresponds to the given query contexts, when prompted with rate parameters, initial conditions, and time values. This obviates the need to track the dynamical process, enabling the direct estimation of normalized state and trajectory probabilities, without necessitating the integration over the complete state space. By applying our method to representative examples, we have observed a high degree of accuracy in both multimodal and high-dimensional systems. Additionally, the trained neural network can serve as a foundational model for developing efficient algorithms for parameter inference and trajectory ensemble generation. These results collectively underscore the efficacy of our approach as a universal means of distilling knowledge from stochastic reaction networks. Importantly, our methodology also spotlights the potential utility in harnessing a singular, pretrained, large-scale model to encapsulate the solution space underpinning a wide spectrum of stochastic dynamical systems.
Affiliation(s)
- Chuanbo Liu
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin 130022, People’s Republic of China
- Jin Wang
  - Center for Theoretical Interdisciplinary Sciences, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, People’s Republic of China
  - Department of Chemistry and of Physics and Astronomy, State University of New York at Stony Brook, NY 11794-3400
8
Jo H, Hong H, Hwang HJ, Chang W, Kim JK. Density physics-informed neural networks reveal sources of cell heterogeneity in signal transduction. PATTERNS (NEW YORK, N.Y.) 2024; 5:100899. [PMID: 38370126 PMCID: PMC10873160 DOI: 10.1016/j.patter.2023.100899] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Revised: 11/05/2023] [Accepted: 11/24/2023] [Indexed: 02/20/2024]
Abstract
The transduction time between signal initiation and final response provides valuable information on the underlying signaling pathway, including its speed and precision. Furthermore, multi-modality in a transduction-time distribution indicates that the response is regulated by multiple pathways with different transduction speeds. Here, we developed a method called density physics-informed neural networks (Density-PINNs) to infer the transduction-time distribution from measurable final stress response time traces. We applied Density-PINNs to single-cell gene expression data from sixteen promoters regulated by unknown pathways in response to antibiotic stresses. We found that promoters with slower signaling initiation and transduction exhibit larger cell-to-cell heterogeneity in response intensity. However, this heterogeneity was greatly reduced when the response was regulated by slow and fast pathways together. This suggests a strategy for identifying effective signaling pathways for consistent cellular responses to disease treatments. Density-PINNs can also be applied to understand other time delay systems, including infectious diseases.
Affiliation(s)
- Hyeontae Jo
  - Biomedical Mathematics Group, Pioneer Research Center for Mathematical and Computational Sciences, Institute for Basic Science, Daejeon 34126, Republic of Korea
- Hyukpyo Hong
  - Biomedical Mathematics Group, Pioneer Research Center for Mathematical and Computational Sciences, Institute for Basic Science, Daejeon 34126, Republic of Korea
  - Department of Mathematical Sciences, KAIST, Daejeon 34141, Republic of Korea
- Hyung Ju Hwang
  - Department of Mathematics, Pohang University of Science and Technology, Pohang 37673, Republic of Korea
- Won Chang
  - Division of Statistics and Data Science, University of Cincinnati, Cincinnati, OH 45221, USA
- Jae Kyoung Kim
  - Biomedical Mathematics Group, Pioneer Research Center for Mathematical and Computational Sciences, Institute for Basic Science, Daejeon 34126, Republic of Korea
  - Department of Mathematical Sciences, KAIST, Daejeon 34141, Republic of Korea
9
Peng C, Ying X, ZhiQi H. Industrial Process Monitoring Based on Dynamic Overcomplete Broad Learning Network. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:1761-1772. [PMID: 35802548 DOI: 10.1109/tnnls.2022.3185167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Most industrial processes feature high nonlinearity, non-Gaussianity, and time correlation. Models based on the overcomplete broad learning system (OBLS) have been successfully applied to fault monitoring and can handle nonlinear and non-Gaussian characteristics to some extent. However, these models do not take time correlation fully into account, which limits further improvement of the network's monitoring accuracy. Therefore, this article proposes an effective dynamic overcomplete broad learning system (DOBLS) based on matrix extension, which extends the raw batch-process data using the idea of "time lag". Subsequently, the OBLS monitoring network is employed to analyze the extended dynamic input data. Finally, a monitoring model is established to tackle the coexistence of nonlinearity, non-Gaussianity, and time correlation in process data. To illustrate its superiority and feasibility, the proposed model is tested on the penicillin fermentation simulation platform; the experimental results show that the model extracts the features of process data more comprehensively and self-updates more efficiently. With shorter training time and higher monitoring accuracy, the proposed model improves average monitoring accuracy over 26 process fault types by 3.69% and 1.26% compared with the state-of-the-art fault monitoring methods BLS and OBLS, respectively.
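The "time lag" matrix extension described in this abstract can be pictured with the generic construction below, in which each sample is augmented with its preceding samples. This is a common way to build a dynamic data matrix and is offered only as a sketch; the abstract does not give the exact construction used in DOBLS, and the function name `time_lag_extend` and the toy data are hypothetical.

```python
import numpy as np


def time_lag_extend(data, lags):
    """Augment each sample with its `lags` previous samples.

    data: array of shape (n_samples, n_vars), rows ordered in time.
    Returns an array of shape (n_samples - lags, n_vars * (lags + 1));
    row i is [x(t), x(t-1), ..., x(t-lags)] with t = i + lags.
    """
    n_samples = data.shape[0]
    if lags < 0 or lags >= n_samples:
        raise ValueError("lags must satisfy 0 <= lags < n_samples")
    # Block k holds the samples delayed by k time steps.
    blocks = [data[lags - k : n_samples - k] for k in range(lags + 1)]
    return np.hstack(blocks)


raw = np.arange(10, dtype=float).reshape(5, 2)  # 5 time steps, 2 variables
extended = time_lag_extend(raw, lags=2)
print(extended.shape)
print(extended[0])
```

The extended matrix can then be passed to any static monitoring model, which is how a lagged input lets such a model see time correlation.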
10
Wang Y, Yu Z, Grima R, Cao Z. Exact solution of a three-stage model of stochastic gene expression including cell-cycle dynamics. J Chem Phys 2023; 159:224102. [PMID: 38063222 DOI: 10.1063/5.0173742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Accepted: 10/04/2023] [Indexed: 12/18/2023] Open
Abstract
The classical three-stage model of stochastic gene expression predicts the statistics of single cell mRNA and protein number fluctuations as a function of the rates of promoter switching, transcription, translation, degradation and dilution. While this model is easily simulated, its analytical solution remains an unsolved problem. Here, we modify this model to explicitly include cell-cycle dynamics and then derive an exact solution for the time-dependent joint distribution of mRNA and protein numbers. We show large differences between this model and the classical model, which captures cell-cycle effects implicitly via effective first-order dilution reactions. In particular, we find that the Fano factor of protein numbers calculated from a population snapshot measurement is underestimated by the classical model, whereas the correlation between mRNA and protein can be either over- or underestimated, depending on the timescales of mRNA degradation and promoter switching relative to the mean cell-cycle duration time.
Affiliation(s)
- Yiling Wang
  - Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai 200237, China
- Zhenhua Yu
  - Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai 200237, China
- Ramon Grima
  - School of Biological Sciences, The University of Edinburgh, Max Born Crescent, Edinburgh EH9 3BF, Scotland, United Kingdom
- Zhixing Cao
  - Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai 200237, China
  - Department of Chemical Engineering, Queen's University, Kingston, Ontario K7L 3N6, Canada
11
Miles CE, McKinley SA, Ding F, Lehoucq RB. Inferring stochastic rates from heterogeneous snapshots of particle positions. ARXIV 2023:arXiv:2311.04880v1. [PMID: 37986720 PMCID: PMC10659442] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]
Abstract
Many imaging techniques for biological systems - like fixation of cells coupled with fluorescence microscopy - provide sharp spatial resolution in reporting locations of individuals at a single moment in time but also destroy the dynamics they intend to capture. These snapshot observations contain no information about individual trajectories, but still encode information about movement and demographic dynamics, especially when combined with a well-motivated biophysical model. The relationship between spatially evolving populations and single-moment representations of their collective locations is well-established with partial differential equations (PDEs) and their inverse problems. However, experimental data is commonly a set of locations whose number is insufficient to approximate a continuous-in-space PDE solution. Here, motivated by popular subcellular imaging data of gene expression, we embrace the stochastic nature of the data and investigate the mathematical foundations of parametrically inferring demographic rates from snapshots of particles undergoing birth, diffusion, and death in a nuclear or cellular domain. Toward inference, we rigorously derive a connection between individual particle paths and their presentation as a Poisson spatial process. Using this framework, we investigate the properties of the resulting inverse problem and study factors that affect quality of inference. One pervasive feature of this experimental regime is the presence of cell-to-cell heterogeneity. Rather than being a hindrance, we show that cell-to-cell geometric heterogeneity can increase the quality of inference on dynamics for certain parameter regimes. Altogether, the results serve as a basis for more detailed investigations of subcellular spatial patterns of RNA molecules and other stochastically evolving populations that can only be observed for single instants in their time evolution.
Affiliation(s)
- Fangyuan Ding
  - Department of Biomedical Engineering, University of California, Irvine
12
Hong H, Cortez MJ, Cheng YY, Kim HJ, Choi B, Josić K, Kim JK. Inferring delays in partially observed gene regulation processes. Bioinformatics 2023; 39:btad670. [PMID: 37935426 PMCID: PMC10660296 DOI: 10.1093/bioinformatics/btad670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Revised: 10/25/2023] [Accepted: 11/02/2023] [Indexed: 11/09/2023] Open
Abstract
MOTIVATION: Cell function is regulated by gene regulatory networks (GRNs) defined by protein-mediated interaction between constituent genes. Despite advances in experimental techniques, we can still measure only a fraction of the processes that govern GRN dynamics. To infer the properties of GRNs using partial observation, unobserved sequential processes can be replaced with distributed time delays, yielding non-Markovian models. Inference methods based on the resulting model suffer from the curse of dimensionality. RESULTS: We develop a simulation-based Bayesian MCMC method employing an approximate likelihood for the efficient and accurate inference of GRN parameters when only some of their products are observed. We illustrate our approach using a two-step activation model: an activation signal leads to the accumulation of an unobserved regulatory protein, which triggers the expression of observed fluorescent proteins. With prior information about observed fluorescent protein synthesis, our method successfully infers the dynamics of the unobserved regulatory protein. We can estimate the delay and kinetic parameters characterizing target regulation including transcription, translation, and target searching of an unobserved protein from experimental measurements of the products of its target gene. Our method is scalable and can be used to analyze non-Markovian models with hidden components. AVAILABILITY AND IMPLEMENTATION: Our code is implemented in R and is freely available with a simple example dataset at https://github.com/Mathbiomed/SimMCMC.
Affiliation(s)
- Hyukpyo Hong
  - Department of Mathematical Sciences, KAIST, Daejeon 34141, Korea
  - Biomedical Mathematics Group, Pioneer Research Center for Mathematical and Computational Sciences, Institute for Basic Science, Daejeon 34126, Korea
- Mark Jayson Cortez
  - Institute of Mathematical Sciences and Physics, University of the Philippines Los Baños, Laguna 4031, Philippines
- Yu-Yu Cheng
  - Department of Biochemistry, University of Wisconsin–Madison, Madison, WI 53706, United States
- Hang Joon Kim
  - Division of Statistics and Data Science, University of Cincinnati, Cincinnati, OH 45221, United States
- Boseung Choi
  - Biomedical Mathematics Group, Pioneer Research Center for Mathematical and Computational Sciences, Institute for Basic Science, Daejeon 34126, Korea
  - Division of Big Data Science, Korea University Sejong Campus, Sejong 30019, Korea
  - College of Public Health, The Ohio State University, Columbus, OH 43210, United States
- Krešimir Josić
  - Department of Mathematics, University of Houston, Houston, TX 77204, United States
  - Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, United States
- Jae Kyoung Kim
  - Department of Mathematical Sciences, KAIST, Daejeon 34141, Korea
  - Biomedical Mathematics Group, Pioneer Research Center for Mathematical and Computational Sciences, Institute for Basic Science, Daejeon 34126, Korea
13
Gorin G, Vastola JJ, Pachter L. Studying stochastic systems biology of the cell with single-cell genomics data. Cell Syst 2023; 14:822-843.e22. [PMID: 37751736 PMCID: PMC10725240 DOI: 10.1016/j.cels.2023.08.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 05/29/2023] [Revised: 08/16/2023] [Accepted: 08/25/2023] [Indexed: 09/28/2023]
Abstract
Recent experimental developments in genome-wide RNA quantification hold considerable promise for systems biology. However, rigorously probing the biology of living cells requires a unified mathematical framework that accounts for single-molecule biological stochasticity in the context of technical variation associated with genomics assays. We review models for a variety of RNA transcription processes, as well as the encapsulation and library construction steps of microfluidics-based single-cell RNA sequencing, and present a framework to integrate these phenomena by the manipulation of generating functions. Finally, we use simulated scenarios and biological data to illustrate the implications and applications of the approach.
Affiliation(s)
- Gennady Gorin
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- John J Vastola
  - Department of Neurobiology, Harvard Medical School, Boston, MA 02115, USA
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA; Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, USA
14
Shi C, Yang X, Zhang J, Zhou T. Stochastic modeling of the mRNA life process: A generalized master equation. Biophys J 2023; 122:4023-4041. [PMID: 37653725 PMCID: PMC10598292 DOI: 10.1016/j.bpj.2023.08.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2023] [Revised: 06/29/2023] [Accepted: 08/29/2023] [Indexed: 09/02/2023] Open
Abstract
The mRNA life cycle is a complex biochemical process, involving transcription initiation, elongation, termination, splicing, and degradation. Each of these molecular events is multistep and can create a memory. The effect of this molecular memory on gene expression is not clear, although there are many related yet scattered experimental reports. To address this important issue, we develop a general theoretical framework formulated as a master equation in the sense of queueing theory, which can reduce to multiple previously studied gene models in limiting cases. This framework allows us to interpret experimental observations, extract kinetic parameters from experimental data, and identify how the mRNA kinetics vary under regulatory influences. Notably, it allows us to evaluate the influences of elongation processes on mature RNA distribution; e.g., we find that a non-exponential elongation time can induce bimodal mRNA expression and that there is an optimal elongation noise intensity at which the mature RNA noise reaches its lowest level. In short, our framework can not only provide insight into complex mRNA life processes but also open a dialogue between theoretical studies and experimental data.
Affiliation(s)
- Changhong Shi
- State Key Laboratory of Respiratory Disease, School of Public Health, Guangzhou Medical University, Guangzhou, China
| | - Xiyan Yang
- School of Financial Mathematics and Statistics, Guangdong University of Finance, Guangzhou, China
| | - Jiajun Zhang
- School of Mathematics and Computational Science and Guangdong Province Key Laboratory of Computational Science, Sun Yat-Sen University, Guangzhou, China.
| | - Tianshou Zhou
- School of Mathematics and Computational Science and Guangdong Province Key Laboratory of Computational Science, Sun Yat-Sen University, Guangzhou, China.
| |
|
15
|
Gorin G, Yoshida S, Pachter L. Assessing Markovian and Delay Models for Single-Nucleus RNA Sequencing. Bull Math Biol 2023; 85:114. [PMID: 37828255 DOI: 10.1007/s11538-023-01213-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/22/2022] [Accepted: 09/11/2023] [Indexed: 10/14/2023]
Abstract
The serial nature of reactions involved in the RNA life-cycle motivates the incorporation of delays in models of transcriptional dynamics. The models couple a transcriptional process to a fairly general set of delayed monomolecular reactions with no feedback. We provide numerical strategies for calculating the RNA copy number distributions induced by these models, and solve several systems with splicing, degradation, and catalysis. An analysis of single-cell and single-nucleus RNA sequencing data using these models reveals that the kinetics of nuclear export do not appear to require invocation of a non-Markovian waiting time.
Affiliation(s)
- Gennady Gorin
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
| | - Shawn Yoshida
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
| | - Lior Pachter
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA.
- Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, 91125, USA.
| |
|
16
|
Szavits-Nossan J, Grima R. Uncovering the effect of RNA polymerase steric interactions on gene expression noise: Analytical distributions of nascent and mature RNA numbers. Phys Rev E 2023; 108:034405. [PMID: 37849194 DOI: 10.1103/physreve.108.034405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Accepted: 08/24/2023] [Indexed: 10/19/2023]
Abstract
The telegraph model is the standard model of stochastic gene expression, which can be solved exactly to obtain the distribution of mature RNA numbers per cell. A modification of this model also leads to an analytical distribution of nascent RNA numbers. These solutions are routinely used for the analysis of single-cell data, including the inference of transcriptional parameters. However, these models neglect important mechanistic features of transcription elongation, such as the stochastic movement of RNA polymerases and their steric (excluded-volume) interactions. Here we construct a model of gene expression describing promoter switching between inactive and active states, binding of RNA polymerases in the active state, their stochastic movement including steric interactions along the gene, and their unbinding leading to a mature transcript that subsequently decays. We derive the steady-state distributions of the nascent and mature RNA numbers in two important limiting cases: constitutive expression and slow promoter switching. We show that RNA fluctuations are suppressed by steric interactions between RNA polymerases, and that this suppression can in some instances even lead to sub-Poissonian fluctuations; these effects are most pronounced for nascent RNA and less prominent for mature RNA, since the latter is not a direct sensor of transcription. We find a relationship between the parameters of our microscopic mechanistic model and those of the standard models that ensures excellent consistency in their prediction of the first and second RNA number moments over vast regions of parameter space, encompassing slow, intermediate, and rapid promoter switching, provided the RNA number distributions are Poissonian or super-Poissonian. Furthermore, we identify the limitations of inference from mature RNA data, specifically showing that it cannot differentiate between highly distinct RNA polymerase traffic patterns on a gene.
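As a rough illustration of the telegraph model that this abstract takes as its starting point, the following minimal Python sketch simulates promoter switching, transcription in the active state, and first-order decay with the Gillespie algorithm. It covers only the standard model, not the authors' extension with polymerase movement and steric interactions. The function name and the parameter names (`sigma_on`, `sigma_off`, `rho`, `d`) are assumptions made for this example and do not come from the paper.

```python
import random


def simulate_telegraph(sigma_on, sigma_off, rho, d, t_end, rng):
    """Gillespie simulation of the telegraph model (hypothetical helper).

    sigma_on / sigma_off: promoter activation / inactivation rates
    rho: transcription rate while the promoter is active
    d: first-order decay rate of mature RNA
    Returns the mature RNA count at time t_end, starting from an
    inactive promoter and zero RNA.
    """
    t, gene_on, m = 0.0, False, 0
    while True:
        rates = [
            sigma_off if gene_on else sigma_on,  # promoter switching
            rho if gene_on else 0.0,             # transcription
            d * m,                               # RNA decay
        ]
        total = sum(rates)
        t += rng.expovariate(total)  # waiting time to the next reaction
        if t > t_end:
            return m
        u = rng.random() * total     # pick which reaction fires
        if u < rates[0]:
            gene_on = not gene_on
        elif u < rates[0] + rates[1]:
            m += 1
        else:
            m -= 1


if __name__ == "__main__":
    rng = random.Random(1)
    counts = [simulate_telegraph(1.0, 1.0, 10.0, 1.0, 20.0, rng) for _ in range(2000)]
    # The steady-state mean of the telegraph model is
    # rho * sigma_on / ((sigma_on + sigma_off) * d), which is 5 for these values.
    print("sample mean:", sum(counts) / len(counts))
```

Repeating the simulation many times gives an empirical RNA number distribution that can be compared with the analytical solutions the abstract refers to.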
Affiliation(s)
- Juraj Szavits-Nossan
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, United Kingdom
| | - Ramon Grima
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, United Kingdom
| |
|
17
|
Kilic Z, Schweiger M, Moyer C, Pressé S. Monte Carlo samplers for efficient network inference. PLoS Comput Biol 2023; 19:e1011256. [PMID: 37463156 PMCID: PMC10353823 DOI: 10.1371/journal.pcbi.1011256] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/04/2023] [Accepted: 06/09/2023] [Indexed: 07/20/2023] Open
Abstract
Accessing information on an underlying network driving a biological process often involves interrupting the process and collecting snapshot data. When snapshot data are stochastic, the data's structure necessitates a probabilistic description to infer underlying reaction networks. As an example, we may imagine wanting to learn gene state networks from the type of data collected in single molecule RNA fluorescence in situ hybridization (RNA-FISH). In the networks we consider, nodes represent network states, and edges represent biochemical reaction rates linking states. Simultaneously estimating the number of nodes and constituent parameters from snapshot data remains a challenging task in part on account of data uncertainty and timescale separations between kinetic parameters mediating the network. While parametric Bayesian methods learn parameters given a network structure (with known node numbers) with rigorously propagated measurement uncertainty, learning the number of nodes and parameters with potentially large timescale separations remain open questions. Here, we propose a Bayesian nonparametric framework and describe a hybrid Bayesian Markov Chain Monte Carlo (MCMC) sampler directly addressing these challenges. In particular, in our hybrid method, Hamiltonian Monte Carlo (HMC) leverages local posterior geometries in inference to explore the parameter space; Adaptive Metropolis Hastings (AMH) learns correlations between plausible parameter sets to efficiently propose probable models; and Parallel Tempering takes into account multiple models simultaneously with tempered information content to augment sampling efficiency. We apply our method to synthetic data mimicking single molecule RNA-FISH, a popular snapshot method in probing transcriptional networks to illustrate the identified challenges inherent to learning dynamical models from these snapshots and how our method addresses them.
Affiliation(s)
- Zeliha Kilic
- Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, Tennessee, United States of America
| | - Max Schweiger
- Center for Biological Physics, ASU, Tempe, Arizona, United States of America
- Department of Physics ASU, Tempe, Arizona, United States of America
| | - Camille Moyer
- Center for Biological Physics, ASU, Tempe, Arizona, United States of America
- School of Mathematics and Statistical Sciences, ASU, Tempe, Arizona, United States of America
| | - Steve Pressé
- Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, Tennessee, United States of America
- Center for Biological Physics, ASU, Tempe, Arizona, United States of America
- School of Molecular Sciences, ASU, Tempe, Arizona, United States of America
| |
|
18
|
Gorin G, Vastola JJ, Pachter L. Studying stochastic systems biology of the cell with single-cell genomics data. bioRxiv 2023:2023.05.17.541250. [PMID: 37292934 PMCID: PMC10245677 DOI: 10.1101/2023.05.17.541250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Recent experimental developments in genome-wide RNA quantification hold considerable promise for systems biology. However, rigorously probing the biology of living cells requires a unified mathematical framework that accounts for single-molecule biological stochasticity in the context of technical variation associated with genomics assays. We review models for a variety of RNA transcription processes, as well as the encapsulation and library construction steps of microfluidics-based single-cell RNA sequencing, and present a framework to integrate these phenomena by the manipulation of generating functions. Finally, we use simulated scenarios and biological data to illustrate the implications and applications of the approach.
Affiliation(s)
- Gennady Gorin
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA, 91125
| | - John J. Vastola
- Department of Neurobiology, Harvard Medical School, Boston, MA, 02115
| | - Lior Pachter
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125
- Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, 91125
| |
|
19
|
Carilli M, Gorin G, Choi Y, Chari T, Pachter L. Biophysical modeling with variational autoencoders for bimodal, single-cell RNA sequencing data. bioRxiv 2023:2023.01.13.523995. [PMID: 36712140 PMCID: PMC9882246 DOI: 10.1101/2023.01.13.523995] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 01/15/2023]
Abstract
We motivate and present biVI, which combines the variational autoencoder framework of scVI with biophysically motivated, bivariate models for nascent and mature RNA distributions. While previous approaches to integrate bimodal data via the variational autoencoder framework ignore the causal relationship between measurements, biVI models the biophysical processes that give rise to observations. We demonstrate through simulated benchmarking that biVI captures cell type structure in a low-dimensional space and accurately recapitulates parameter values and copy number distributions. On biological data, biVI provides a scalable route for identifying the biophysical mechanisms underlying gene expression. This analytical approach outlines a generalizable strategy for treating multimodal datasets generated by high-throughput, single-cell genomic assays.
Affiliation(s)
- Maria Carilli
- Division of Biology and Biological Engineering, California Institute of Technology
| | - Gennady Gorin
- Division of Chemistry and Chemical Engineering, California Institute of Technology
| | - Yongin Choi
- Biomedical Engineering Graduate Group, University of California, Davis
- Genome Center, University of California, Davis
| | - Tara Chari
- Division of Biology and Biological Engineering, California Institute of Technology
| | - Lior Pachter
- Division of Biology and Biological Engineering, California Institute of Technology
- Department of Computing and Mathematical Sciences, California Institute of Technology
| |
|
20
|
Tang Y, Weng J, Zhang P. Neural-network solutions to stochastic reaction networks. Nat Mach Intell 2023. [DOI: 10.1038/s42256-023-00632-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/18/2023]
|
21
|
Kilic Z, Schweiger M, Moyer C, Shepherd D, Pressé S. Gene expression model inference from snapshot RNA data using Bayesian non-parametrics. Nat Comput Sci 2023; 3:174-183. [PMID: 38125199 PMCID: PMC10732567 DOI: 10.1038/s43588-022-00392-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/16/2022] [Accepted: 12/15/2022] [Indexed: 12/23/2023]
Abstract
Gene expression models, which are key towards understanding cellular regulatory response, underlie observations of single-cell transcriptional dynamics. Although RNA expression data encode information on gene expression models, existing computational frameworks do not perform simultaneous Bayesian inference of gene expression models and parameters from such data. Rather, gene expression models, composed of gene states, their connectivities and associated parameters, are currently deduced by pre-specifying gene state numbers and connectivity before learning associated rate parameters. Here we propose a method to learn full distributions over gene states, state connectivities and associated rate parameters, simultaneously and self-consistently from single-molecule RNA counts. We propagate noise from fluctuating RNA counts over models by treating models themselves as random variables. We achieve this within a Bayesian non-parametric paradigm. We demonstrate our method on the Escherichia coli lacZ pathway and the Saccharomyces cerevisiae STL1 pathway, and verify its robustness on synthetic data.
Affiliation(s)
- Zeliha Kilic
- Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, TN, USA
- These authors contributed equally: Zeliha Kilic, Max Schweiger
| | - Max Schweiger
- Center for Biological Physics, ASU, Tempe, AZ, USA
- Department of Physics, ASU, Tempe, AZ, USA
- These authors contributed equally: Zeliha Kilic, Max Schweiger
| | - Camille Moyer
- Center for Biological Physics, ASU, Tempe, AZ, USA
- School of Mathematics and Statistical Sciences, ASU, Tempe, AZ, USA
| | - Douglas Shepherd
- Center for Biological Physics, ASU, Tempe, AZ, USA
- Department of Physics, ASU, Tempe, AZ, USA
| | - Steve Pressé
- Center for Biological Physics, ASU, Tempe, AZ, USA
- Department of Physics, ASU, Tempe, AZ, USA
- School of Molecular Sciences, ASU, Tempe, AZ, USA
| |
|
22
|
Du L, Jin W, Wang Y, Jiang Q. Dynamic Batch Process Monitoring Based on Time-Slice Latent Variable Correlation Analysis. ACS Omega 2022; 7:41069-41081. [PMID: 36406484 PMCID: PMC9670696 DOI: 10.1021/acsomega.2c04445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2022] [Accepted: 09/19/2022] [Indexed: 06/16/2023]
Abstract
Batch processes are generally characterized by complex dynamics and remarkable data collinearity, thereby rendering the monitoring of such processes necessary but challenging. This paper proposes a data-driven time-slice latent variable correlation analysis-based model predictive fault detection framework to ensure accurate fault detection in dynamic batch processes. The three-way batch process data are first unfolded into the two-way time slice. For each single time slice, process data are mapped to both major latent variables and residual subspaces to deal with the variable-wise data collinearity and extract dominant data information. A measurement status is then determined with a canonical correlation analysis of the major latent variables and correlated variables, using both the time and batch perspectives. Prediction-based residuals are generated, which provide the basis for identifying the property of faults detected, namely, static or dynamic. Based on experiments using a simulated penicillin production and an industrial injection molding process, the proposed monitoring scheme has been proven feasible and effective.
Affiliation(s)
- Le Du
- Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai 200237, P. R. China
- Key Laboratory of Complex System Safety and Control, Ministry of Education, Chongqing University, Chongqing 400044, P. R. China
| | - Wenhao Jin
- Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai 200237, P. R. China
| | - Yang Wang
- School of Electric Engineering, Shanghai Dianji University, Shanghai 200240, P. R. China
| | - Qingchao Jiang
- Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai 200237, P. R. China
- Key Laboratory of Complex System Safety and Control, Ministry of Education, Chongqing University, Chongqing 400044, P. R. China
| |
|
23
|
Jaleel EA, Anzar SM, Rehannara Beegum T, Mohamed Shahid PA. System identification and control of heat integrated distillation column using artificial bee colony based support vector regression. Chem Eng Commun 2022. [DOI: 10.1080/00986445.2021.1974409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Affiliation(s)
| | - S. M. Anzar
- Department of Electronics and Communication Engineering, TKM College of Engineering, Kollam, Kerala, India
| | - T. Rehannara Beegum
- Department of Computer Science and Engineering, TKM College of Engineering, Kollam, Kerala, India
| | - P. A. Mohamed Shahid
- Department of Mechanical Engineering, TKM College of Engineering, Kollam, Kerala, India
| |
|
24
|
Sukys A, Öcal K, Grima R. Approximating solutions of the Chemical Master equation using neural networks. iScience 2022; 25:105010. [PMID: 36117994 PMCID: PMC9474291 DOI: 10.1016/j.isci.2022.105010] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 05/02/2022] [Revised: 06/13/2022] [Accepted: 08/18/2022] [Indexed: 10/27/2022] Open
Abstract
The Chemical Master Equation (CME) provides an accurate description of stochastic biochemical reaction networks in well-mixed conditions, but it cannot be solved analytically for most systems of practical interest. Although Monte Carlo methods provide a principled means to probe system dynamics, the large number of simulations typically required can render the estimation of molecule number distributions and other quantities infeasible. In this article, we aim to leverage the representational power of neural networks to approximate the solutions of the CME and propose a framework for the Neural Estimation of Stochastic Simulations for Inference and Exploration (Nessie). Our approach is based on training neural networks to learn the distributions predicted by the CME from relatively few stochastic simulations. We show on biologically relevant examples that simple neural networks with one hidden layer can capture highly complex distributions across parameter space, thereby accelerating computationally intensive tasks such as parameter exploration and inference.
Affiliation(s)
- Augustinas Sukys
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, UK
- The Alan Turing Institute, London NW1 2DB, UK
| | - Kaan Öcal
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, UK
- School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, UK
| | - Ramon Grima
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, UK
| |
|
25
|
Fu X, Zhou X, Gu D, Cao Z, Grima R. DelaySSAToolkit.jl: stochastic simulation of reaction systems with time delays in Julia. Bioinformatics 2022; 38:4243-4245. [PMID: 35799359 DOI: 10.1093/bioinformatics/btac472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2022] [Revised: 06/02/2022] [Accepted: 07/06/2022] [Indexed: 12/24/2022] Open
Abstract
SUMMARY: DelaySSAToolkit.jl is a Julia package for modelling reaction systems with non-Markovian dynamics, specifically those with time delays. These delays implicitly capture multiple intermediate reaction steps and hence serve as an effective model reduction technique for complex systems in biology, chemistry, ecology and genetics. The package implements a variety of exact formulations of the delay stochastic simulation algorithm. AVAILABILITY AND IMPLEMENTATION: The source code and documentation of DelaySSAToolkit.jl are available at https://github.com/palmtree2013/DelaySSAToolkit.jl.
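To make the idea of a delayed reaction concrete, here is a small Python sketch of one simple delay stochastic simulation scheme. It is not the DelaySSAToolkit.jl API, and it is written in Python rather than Julia purely for illustration. The toy model is an assumption: molecules are produced at a constant rate `rho` and each is removed a fixed time `tau` after its creation. Scheduled removals sit in a priority queue, and the waiting time to the next ordinary reaction is redrawn after every delayed completion, which is valid because exponential waiting times are memoryless.

```python
import heapq
import random


def simulate_delayed_production(rho, tau, t_end, rng):
    """Delay-SSA sketch for a toy model (hypothetical helper).

    Molecules are created at constant rate rho; each one is removed
    exactly tau time units after its creation.
    Returns the number of molecules present at time t_end.
    """
    t, n = 0.0, 0
    pending = []  # min-heap of scheduled removal times
    while True:
        t_next = t + rng.expovariate(rho)  # candidate time of the next creation
        # A scheduled removal that falls before the candidate reaction (and
        # before t_end) fires first; the waiting time is then redrawn.
        if pending and pending[0] <= min(t_next, t_end):
            t = heapq.heappop(pending)
            n -= 1
            continue
        if t_next > t_end:
            return n
        t = t_next
        n += 1
        heapq.heappush(pending, t + tau)  # schedule the delayed removal


if __name__ == "__main__":
    rng = random.Random(1)
    counts = [simulate_delayed_production(5.0, 2.0, 10.0, rng) for _ in range(2000)]
    # For t_end > tau the copy number is Poisson distributed with mean rho * tau.
    print("sample mean:", sum(counts) / len(counts))
```

In this toy model the creation rate does not depend on the state, so redrawing the waiting time has no visible effect. The same loop structure carries over to models whose propensities change when a delayed event completes.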
Affiliation(s)
- Xiaoming Fu
- MOE Key Laboratory of Smart Manufacturing in Energy Chemical Process, East China University of Science and Technology, Shanghai, China
- School of Biological Sciences, The University of Edinburgh, Edinburgh, UK
| | - Xinyi Zhou
- MOE Key Laboratory of Smart Manufacturing in Energy Chemical Process, East China University of Science and Technology, Shanghai, China
| | - Dongyang Gu
- MOE Key Laboratory of Smart Manufacturing in Energy Chemical Process, East China University of Science and Technology, Shanghai, China
| | - Zhixing Cao
- MOE Key Laboratory of Smart Manufacturing in Energy Chemical Process, East China University of Science and Technology, Shanghai, China
| | - Ramon Grima
- School of Biological Sciences, The University of Edinburgh, Edinburgh, UK
| |
|
26
|
Öcal K, Gutmann MU, Sanguinetti G, Grima R. Inference and uncertainty quantification of stochastic gene expression via synthetic models. J R Soc Interface 2022; 19:20220153. [PMID: 35858045 PMCID: PMC9277240 DOI: 10.1098/rsif.2022.0153] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/25/2022] [Accepted: 06/21/2022] [Indexed: 12/26/2022] Open
Abstract
Estimating uncertainty in model predictions is a central task in quantitative biology. Biological models at the single-cell level are intrinsically stochastic and nonlinear, creating formidable challenges for their statistical estimation which inevitably has to rely on approximations that trade accuracy for tractability. Despite intensive interest, a sweet spot in this trade-off has not been found yet. We propose a flexible procedure for uncertainty quantification in a wide class of reaction networks describing stochastic gene expression including those with feedback. The method is based on creating a tractable coarse-graining of the model that is learned from simulations, a synthetic model, to approximate the likelihood function. We demonstrate that synthetic models can substantially outperform state-of-the-art approaches on a number of non-trivial systems and datasets, yielding an accurate and computationally viable solution to uncertainty quantification in stochastic models of gene expression.
Affiliation(s)
- Kaan Öcal
- School of Informatics, University of Edinburgh, Edinburgh EH9 3JH, UK
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, UK
| | | | - Guido Sanguinetti
- Scuola Internazionale Superiore di Studi Avanzati, 34136 Trieste, Italy
| | - Ramon Grima
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, UK
| |
|
27
|
Simpson MJ, Baker RE, Buenzli PR, Nicholson R, Maclaren OJ. Reliable and efficient parameter estimation using approximate continuum limit descriptions of stochastic models. J Theor Biol 2022; 549:111201. [PMID: 35752285 DOI: 10.1016/j.jtbi.2022.111201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2022] [Revised: 06/08/2022] [Accepted: 06/10/2022] [Indexed: 11/28/2022]
Abstract
Stochastic individual-based mathematical models are attractive for modelling biological phenomena because they naturally capture the stochasticity and variability that is often evident in biological data. Such models also allow us to track the motion of individuals within the population of interest. Unfortunately, capturing this microscopic detail means that simulation and parameter inference can become computationally expensive. One approach for overcoming this computational limitation is to coarse-grain the stochastic model to provide an approximate continuum model that can be solved using far less computational effort. However, coarse-grained continuum models can be biased or inaccurate, particularly for certain parameter regimes. In this work, we combine stochastic and continuum mathematical models in the context of lattice-based models of two-dimensional cell biology experiments by demonstrating how to simulate two commonly used experiments: cell proliferation assays and barrier assays. Our approach involves building a simple statistical model of the discrepancy between the expensive stochastic model and the associated computationally inexpensive coarse-grained continuum model. We form this statistical model based on a limited number of expensive stochastic model evaluations at design points sampled from a user-chosen distribution, corresponding to a computer experiment design problem. With straightforward design point selection schemes, we show that using the statistical model of the discrepancy in tandem with the computationally inexpensive continuum model allows us to carry out prediction and inference while correcting for biases and inaccuracies due to the continuum approximation. 
We demonstrate this approach by simulating a proliferation assay, where the continuum limit model is the well-known logistic ordinary differential equation, as well as a barrier assay where the continuum limit model is closely related to the well-known Fisher-KPP partial differential equation. We construct an approximate likelihood function for parameter inference, both with and without discrepancy correction terms. Using maximum likelihood estimation, we provide point estimates of the unknown parameters, and use the profile likelihood to characterise the uncertainty in these estimates and form approximate confidence intervals. For the range of inference problems considered, working with the continuum limit model alone leads to biased parameter estimation and confidence intervals with poor coverage. In contrast, incorporating correction terms arising from the statistical model of the model discrepancy allows us to recover the parameters accurately with minimal computational overhead. The main tradeoff is that the associated confidence intervals are typically broader, reflecting the additional uncertainty introduced by the approximation process. All algorithms required to replicate the results in this work are written in the open source Julia language and are available at GitHub.
Affiliation(s)
- Matthew J Simpson
- School of Mathematical Sciences, Queensland University of Technology, Brisbane QLD 4001, Australia.
| | - Ruth E Baker
- Mathematical Institute, University of Oxford, Oxford, UK
| | - Pascal R Buenzli
- School of Mathematical Sciences, Queensland University of Technology, Brisbane QLD 4001, Australia
| | - Ruanui Nicholson
- Department of Engineering Science, University of Auckland, Auckland, 1142, New Zealand
| | - Oliver J Maclaren
- Department of Engineering Science, University of Auckland, Auckland, 1142, New Zealand
| |
|
28
|
Xu L, Zhong W, Lu J, Gao F, Qian F, Cao Z. Learning of Iterative Learning Control for Flexible Manufacturing of Batch Processes. ACS Omega 2022; 7:19939-19947. [PMID: 35721960 PMCID: PMC9202061 DOI: 10.1021/acsomega.2c01741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2022] [Accepted: 05/13/2022] [Indexed: 06/15/2023]
Abstract
Flexible manufacturing as an essential component of smart manufacturing implements the customized production mode, thereby requesting fast controller adaptation for producing different goods but still with high precision. This problem becomes even more acute for batch processes. Here we present a solution called learning of iterative learning control (ILC) based on neural networks. It is able to recommend control parameters for ILC controllers accordingly, so as to yield fast tracking error convergence and smaller steady-state error for disparate set-point profiles, which is deemed an abstraction of different production needs. The method substantially outperforms a benchmark ILC on a variety of systems and cases, thereby showing its potential for deployment in the industrial Internet of Things.
Affiliation(s)
- Libin Xu
- MOE Key Laboratory of Smart Manufacturing in Energy Chemical Process, East China University of Science and Technology, Shanghai 200237, China
| | - Weimin Zhong
- MOE Key Laboratory of Smart Manufacturing in Energy Chemical Process, East China University of Science and Technology, Shanghai 200237, China
| | - Jingyi Lu
- MOE Key Laboratory of Smart Manufacturing in Energy Chemical Process, East China University of Science and Technology, Shanghai 200237, China
- Department of Electrical Engineering and Information Technology, Paderborn University, 33098, Paderborn, Germany
| | - Furong Gao
- Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
| | - Feng Qian
- MOE Key Laboratory of Smart Manufacturing in Energy Chemical Process, East China University of Science and Technology, Shanghai 200237, China
| | - Zhixing Cao
- MOE Key Laboratory of Smart Manufacturing in Energy Chemical Process, East China University of Science and Technology, Shanghai 200237, China
| |
|
29
|
Kim DW, Hong H, Kim JK. Systematic inference identifies a major source of heterogeneity in cell signaling dynamics: The rate-limiting step number. Sci Adv 2022; 8:eabl4598. [PMID: 35302852 PMCID: PMC8932658 DOI: 10.1126/sciadv.abl4598] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 07/14/2021] [Accepted: 01/26/2022] [Indexed: 06/14/2023]
Abstract
Identifying the sources of cell-to-cell variability in signaling dynamics is essential to understand drug response variability and develop effective therapeutics. However, it is challenging because not all signaling intermediate reactions can be experimentally measured simultaneously. This can be overcome by replacing them with a single random time delay, but the resulting process is non-Markovian, making it difficult to infer cell-to-cell heterogeneity in reaction rates and time delays. To address this, we developed an efficient and scalable moment-based Bayesian inference method (MBI) with a user-friendly computational package that infers cell-to-cell heterogeneity in the non-Markovian signaling process. We applied MBI to single-cell expression profiles from promoters responding to antibiotics and discovered a major source of cell-to-cell variability in antibiotic stress response: the number of rate-limiting steps in signaling cascades. This knowledge can help identify effective therapies that destroy all pathogenic or cancer cells, and the approach can be applied to precision medicine.
Affiliation(s)
- Dae Wook Kim
- Department of Mathematical Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Biomedical Mathematics Group, Institute for Basic Science, Daejeon 34126, Republic of Korea
| | - Hyukpyo Hong
- Department of Mathematical Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Biomedical Mathematics Group, Institute for Basic Science, Daejeon 34126, Republic of Korea
| | - Jae Kyoung Kim
- Department of Mathematical Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Biomedical Mathematics Group, Institute for Basic Science, Daejeon 34126, Republic of Korea
| |
|
30
|
Xue B, Chen X, Wang X, Li C, Liu J, He Q, Liu E. Application of multivariate statistical analysis and network pharmacology to explore the mechanism of Danggui Liuhuang Tang in treating perimenopausal syndrome. JOURNAL OF ETHNOPHARMACOLOGY 2022; 284:114543. [PMID: 34428521 DOI: 10.1016/j.jep.2021.114543] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/04/2021] [Revised: 08/17/2021] [Accepted: 08/18/2021] [Indexed: 06/13/2023]
Abstract
ETHNOPHARMACOLOGICAL RELEVANCE Danggui Liuhuang Tang (DGLHT), first recorded in "Lan-Shi-Mi-Cang" (written in 1276 AD), is a famous classical formula. In 2018, it was listed in the Catalogue of Ancient Classic and Famous Prescriptions (First Batch) formulated by the National Administration of Traditional Chinese Medicine and the National Medical Products Administration. Perimenopausal syndrome (PMS) refers to a series of syndromes with autonomic nervous system dysfunction and neuropsychological symptoms. The treatment of PMS demands non-hormonal drugs. Natural products are considered to be effective substitutes for the treatment of PMS. It is reported that DGLHT has not only good therapeutic effects but also higher safety and fewer side effects in the treatment of PMS. However, the mechanism of DGLHT in treating PMS is not clear. AIM OF THE STUDY To explore the chemical basis and the mechanism of DGLHT in treating PMS. MATERIALS AND METHODS Multivariate statistical analysis was used to analyze the differences in the components of the supernatant before and after compatibility of DGLHT based on LC-MS data. Qualitative analysis of the precipitate formed in the decocting process was performed using LC-MS, and quantitative analysis of the potential markers was performed using LC-UV. Then, the potential markers were analyzed by network pharmacology. The regulatory effect of DGLHT on FSH, P and E2 was evaluated in PMS rats. RESULTS Five potential markers, epiberberine, coptisine, palmatine, berberine and baicalin, were screened from the analysis of compounds in the supernatant. Four complexes, composed of potential marker monomers, were identified in the sediment, including two that have not been reported. The key targets of potential markers include TNF, NOS3, EGFR, ESR1, PTGS2, AR, CDC42 and RPS6KB1. The top signaling pathways include the cGMP-PKG signaling pathway, PI3K-Akt signaling pathway and estrogen signaling pathway. DGLHT restored the hormone levels of P and E2 in PMS rats. CONCLUSION The active ingredients of DGLHT (epiberberine, coptisine, palmatine, berberine and baicalin) contribute substantially to its therapeutic effect, and DGLHT acts by regulating hormones secreted by the ovary.
Affiliation(s)
- Beibei Xue
- State Key Laboratory of Component-based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China.
| | - Xiaopeng Chen
- State Key Laboratory of Component-based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China.
| | - Xiaoli Wang
- State Key Laboratory of Component-based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China.
| | - Chunxia Li
- State Key Laboratory of Component-based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China.
| | - Jing Liu
- State Key Laboratory of Component-based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China.
| | - Qiaoyu He
- State Key Laboratory of Component-based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China.
| | - Erwei Liu
- State Key Laboratory of Component-based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin, 300193, China.
| |
|
31
|
Chen L, Lin G, Jiao F. Using average transcription level to understand the regulation of stochastic gene activation. ROYAL SOCIETY OPEN SCIENCE 2022; 9:211757. [PMID: 35223065 PMCID: PMC8847896 DOI: 10.1098/rsos.211757] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/08/2021] [Accepted: 01/24/2022] [Indexed: 05/03/2023]
Abstract
Gene activation is a random process, modelled as a framework of multiple rate-limiting steps arranged sequentially, in parallel or in combination. Together with suitably assumed processes of gene inactivation, transcript birth and death, the step numbers and parameters in activation frameworks can be estimated by fitting single-cell transcription data. However, current algorithms require computing master equations that are tightly correlated with prior hypothetical frameworks of gene activation. We found that prior estimation of the framework can be facilitated by the traditional dynamical data of mRNA average level M(t), which present distinguishing dynamical features. Rigorous theory regarding M(t) profiles allows one to confidently rule out the frameworks that fail to capture M(t) features and to test potentially suitable frameworks by fitting M(t) data. We implemented this procedure for a large number of mouse fibroblast genes under tumour necrosis factor induction and determined exactly the 'cross-talking n-state' framework; the cross-talk between the signalling and basal pathways is crucial to trigger the first peak of M(t), while the following damped gentle M(t) oscillation is regulated by the multi-step basal pathway. This framework can be used to fit sophisticated single-cell data and may facilitate a more accurate understanding of stochastic activation of mouse fibroblast genes.
Affiliation(s)
- Liang Chen
- Guangzhou Center for Applied Mathematics, Guangzhou University, Guangzhou, People’s Republic of China
- School of Mathematics and Information Sciences, Guangzhou University, Guangzhou, People’s Republic of China
| | - Genghong Lin
- Guangzhou Center for Applied Mathematics, Guangzhou University, Guangzhou, People’s Republic of China
- School of Mathematics and Information Sciences, Guangzhou University, Guangzhou, People’s Republic of China
| | - Feng Jiao
- Guangzhou Center for Applied Mathematics, Guangzhou University, Guangzhou, People’s Republic of China
- School of Mathematics and Information Sciences, Guangzhou University, Guangzhou, People’s Republic of China
| |
|
32
|
Zhu QX, Xu TX, Xu Y, He YL. Improved Virtual Sample Generation Method Using Enhanced Conditional Generative Adversarial Networks with Cycle Structures for Soft Sensors with Limited Data. Ind Eng Chem Res 2021. [DOI: 10.1021/acs.iecr.1c03197] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/26/2022]
Affiliation(s)
- Qun-Xiong Zhu
- College of Information Science & Technology, Beijing University of Chemical Technology, Beijing 100029, China
- Engineering Research Center of Intelligent PSE, Ministry of Education of China, Beijing 100029, P. R. China
| | - Tian-xiang Xu
- College of Information Science & Technology, Beijing University of Chemical Technology, Beijing 100029, China
- Engineering Research Center of Intelligent PSE, Ministry of Education of China, Beijing 100029, P. R. China
| | - Yuan Xu
- College of Information Science & Technology, Beijing University of Chemical Technology, Beijing 100029, China
- Engineering Research Center of Intelligent PSE, Ministry of Education of China, Beijing 100029, P. R. China
| | - Yan-Lin He
- College of Information Science & Technology, Beijing University of Chemical Technology, Beijing 100029, China
- Engineering Research Center of Intelligent PSE, Ministry of Education of China, Beijing 100029, P. R. China
| |
|
33
|
A Novel Approach for Calculating Exact Forms of mRNA Distribution in Single-Cell Measurements. MATHEMATICS 2021. [DOI: 10.3390/math10010027] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/21/2022]
Abstract
Gene transcription is a stochastic process manifested by fluctuations in mRNA copy numbers in individual isogenic cells. Together with mathematical models of stochastic transcription, the massive mRNA distribution data that can be used to quantify fluctuations in mRNA levels can be fitted by Pm(t), which is the probability of producing m mRNA molecules at time t in a single cell. Tremendous efforts have been made to derive analytical forms of Pm(t), which rely on solving infinite arrays of the master equations of models. However, current approaches focus on the steady-state (t→∞) or require several parameters to be zero or infinity. Here, we present an approach for calculating Pm(t) with time, where all parameters are positive and finite. Our approach was successfully implemented for the classical two-state model and the widely used three-state model and may be further developed for different models with constant kinetic rates of transcription. Furthermore, the direct computations of Pm(t) for the two-state model and three-state model showed that the different regulations of gene activation can generate discriminated dynamical bimodal features of mRNA distribution under the same kinetic rates and similar steady-state mRNA distribution.
|
34
|
Zhou C, Liu T, Zhu H, Li Y, Li F. Nonstationary and Multirate Process Monitoring by Using Common Trends and Multiple Probability Principal Component Analysis. Ind Eng Chem Res 2021. [DOI: 10.1021/acs.iecr.1c03178] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Affiliation(s)
- Can Zhou
- School of Automation, Central South University, Changsha 410083, China
- Pengcheng Laboratory, Shenzhen 518000, China
| | - Tianhao Liu
- School of Automation, Central South University, Changsha 410083, China
| | - Hongqiu Zhu
- School of Automation, Central South University, Changsha 410083, China
| | - Yonggang Li
- School of Automation, Central South University, Changsha 410083, China
| | - Fanbiao Li
- School of Automation, Central South University, Changsha 410083, China
| |
|
35
|
Braichenko S, Holehouse J, Grima R. Distinguishing between models of mammalian gene expression: telegraph-like models versus mechanistic models. J R Soc Interface 2021; 18:20210510. [PMID: 34610262 PMCID: PMC8492181 DOI: 10.1098/rsif.2021.0510] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 12/18/2022] Open
Abstract
Two-state models (telegraph-like models) have a successful history of predicting distributions of cellular and nascent mRNA numbers that fit experimental data well. These models exclude key rate-limiting steps, and hence it is unclear why they are able to accurately predict the number distributions. To answer this question, here we compare these models to a novel stochastic mechanistic model of transcription in mammalian cells that presents a unified description of transcription factor, polymerase and mature mRNA dynamics. We show that there is a large region of parameter space where the first, second and third moments of the distributions of the waiting times between two consecutively produced transcripts (nascent or mature) of two-state and mechanistic models exactly match. In this region: (i) one can uniquely express the two-state model parameters in terms of those of the mechanistic model, (ii) the models are practically indistinguishable by comparison of their transcript number distributions, and (iii) they are distinguishable from the shape of their waiting time distributions. Our results clarify the relationship between different gene expression models and identify a means to select between them from experimental data.
Affiliation(s)
- Svitlana Braichenko
- School of Biological Sciences, University of Edinburgh, Edinburgh, UK; School of Informatics, University of Edinburgh, Edinburgh, UK
| | - James Holehouse
- School of Biological Sciences, University of Edinburgh, Edinburgh, UK
| | - Ramon Grima
- School of Biological Sciences, University of Edinburgh, Edinburgh, UK
| |
|
36
|
Khaled H, Abu-Elnasr O, Elmougy S, Tolba AS. Intelligent system for human activity recognition in IoT environment. COMPLEX INTELL SYST 2021; 9:1-12. [PMID: 34777979 PMCID: PMC8422064 DOI: 10.1007/s40747-021-00508-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/18/2021] [Accepted: 08/14/2021] [Indexed: 11/26/2022]
Abstract
In recent years, the adoption of machine learning has grown steadily in different fields affecting the day-to-day decisions of individuals. This paper presents an intelligent system for recognizing human daily activities in a complex IoT environment. An enhanced capsule neural network model called 1D-HARCapsNet is proposed. The proposed model consists of a convolution layer, a primary capsule layer, an activity capsules flat layer and an output layer. It is validated using the WISDM dataset collected via smart devices and normalized using the random-SMOTE algorithm to handle the imbalanced behavior of the dataset. The experimental results indicate the potential and strengths of the proposed 1D-HARCapsNet, which achieved an accuracy of 98.67%, precision of 98.66%, recall of 98.67%, and F1-measure of 0.987, a major performance enhancement compared to the conventional CapsNet (accuracy 90.11%, precision 91.88%, recall 89.94%, and F1-measure 0.93).
Affiliation(s)
- Hassan Khaled
- Computer Science Department, Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
| | - Osama Abu-Elnasr
- Computer Science Department, Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
| | - Samir Elmougy
- Computer Science Department, Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
| | - A. S. Tolba
- Computer Science Department, Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
| |
|
37
|
Cortez MJ, Hong H, Choi B, Kim JK, Josić K. Hierarchical Bayesian models of transcriptional and translational regulation processes with delays. Bioinformatics 2021; 38:187-195. [PMID: 34450624 PMCID: PMC8696106 DOI: 10.1093/bioinformatics/btab618] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/21/2021] [Revised: 08/19/2021] [Accepted: 08/25/2021] [Indexed: 02/05/2023] Open
Abstract
MOTIVATION Simultaneous recordings of gene network dynamics across large populations have revealed that cell characteristics vary considerably even in clonal lines. Inferring the variability of parameters that determine gene dynamics is key to understanding cellular behavior. However, this is complicated by the fact that the outcomes and effects of many reactions are not observable directly. Unobserved reactions can be replaced with time delays to reduce model dimensionality and simplify inference. However, the resulting models are non-Markovian, and require the development of new inference techniques. RESULTS We propose a non-Markovian, hierarchical Bayesian inference framework for quantifying the variability of cellular processes within and across cells in a population. We illustrate our approach using a delayed birth-death process. In general, a distributed delay model, rather than a popular fixed delay model, is needed for inference, even if only mean reaction delays are of interest. Using in silico and experimental data we show that the proposed hierarchical framework is robust and leads to improved estimates compared to its non-hierarchical counterpart. We apply our method to data obtained using time-lapse microscopy and infer the parameters that describe the dynamics of protein production at the single cell and population level. The mean delays in protein production are larger than previously reported, have a coefficient of variation of around 0.2 across the population, and are not strongly correlated with protein production or growth rates. AVAILABILITY AND IMPLEMENTATION Accompanying code in Python is available at https://github.com/mvcortez/Bayesian-Inference. CONTACT kresimir.josic@gmail.com or jaekkim@kaist.ac.kr or cbskust@korea.ac.kr. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mark Jayson Cortez
- Department of Mathematics, University of Houston, Houston, TX 77204, USA; Institute of Mathematical Sciences and Physics, University of the Philippines Los Baños, Laguna 4031, Philippines
| | - Hyukpyo Hong
- Department of Mathematical Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea; Biomedical Mathematics Group, Institute for Basic Science, Daejeon 34126, Korea
| | - Boseung Choi
- To whom correspondence should be addressed.
| |
|
38
|
Zhang Z, Deng Q, Wang Z, Chen Y, Zhou T. Exact results for queuing models of stochastic transcription with memory and crosstalk. Phys Rev E 2021; 103:062414. [PMID: 34271765 DOI: 10.1103/physreve.103.062414] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/18/2021] [Accepted: 06/03/2021] [Indexed: 11/07/2022]
Abstract
Gene transcription is a complex multistep biochemical process, which can create memory between individual reaction events. On the other hand, many inducible genes, when activated by external cues, are often coregulated by several competitive pathways with crosstalk. This raises an unexplored question: how do molecular memory and crosstalk together affect gene expression? To address this question, we introduce a queuing model of stochastic transcription, where two crossing signaling pathways are used to direct gene activation in response to external signals, and memory functions are used to model the multistep reaction processes involved in transcription. We first establish, based on the total probability principle, the chemical master equation for this queuing model, and then we derive, based on the binomial moment approach, exact expressions for statistical quantities (including distributions) of mRNA, which provide insights into the roles of crosstalk and memory in controlling the mRNA level and noise. We find that molecular memory of gene activation decreases the mRNA level but increases the mRNA noise, and double activation pathways always reduce the mRNA noise in contrast to a single pathway. In addition, we find that molecular memory can make the mRNA bimodality disappear.
Affiliation(s)
- Zhenquan Zhang
- Guangdong Province Key Laboratory of Computational Science, School of Mathematics, Sun Yat-sen University, Guangzhou 510275, China
| | - Qiqi Deng
- Guangdong Province Key Laboratory of Computational Science, School of Mathematics, Sun Yat-sen University, Guangzhou 510275, China
| | - Zihao Wang
- Guangdong Province Key Laboratory of Computational Science, School of Mathematics, Sun Yat-sen University, Guangzhou 510275, China
| | - Yiren Chen
- College of Mathematics and Statistics, Shenzhen University, Shenzhen 518060, China
| | - Tianshou Zhou
- Guangdong Province Key Laboratory of Computational Science, School of Mathematics, Sun Yat-sen University, Guangzhou 510275, China
| |
|