1
Wu J, Stewart WCL, Jayaprakash C, Das J. BioNetGMMFit: estimating parameters of a BioNetGen model from time-stamped snapshots of single cells. NPJ Syst Biol Appl 2023; 9:46. [PMID: 37736766 PMCID: PMC10516955 DOI: 10.1038/s41540-023-00299-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2023] [Accepted: 07/31/2023] [Indexed: 09/23/2023]
Abstract
Mechanistic models are commonly employed to describe signaling and gene regulatory kinetics in single cells and cell populations. Recent advances in single-cell technologies have produced multidimensional datasets in which snapshots of copy numbers (or abundances) of a large number of proteins and mRNAs are measured across time in single cells. The availability of such datasets presents an attractive scenario in which mechanistic models are validated against experiments, and estimated model parameters enable quantitative predictions of signaling or gene regulatory kinetics. To empower the systems biology community to estimate parameters easily and accurately from multidimensional single-cell data, we have merged BioNetGen, a widely used rule-based modeling software package that provides a user-friendly way to code mechanistic models describing biochemical reactions, and the recently introduced CyGMM, which uses cell-to-cell differences to improve parameter estimation for such networks, into a single software package: BioNetGMMFit. Given a model supplied by the user in the BioNetGen markup language (BNGL), BioNetGMMFit provides the parameter estimates that yield the best fit to the observed single-cell, time-stamped data of cellular components. Furthermore, to quantify the precision of these estimates, our software generates confidence intervals around each model parameter. BioNetGMMFit is capable of fitting datasets of increasing cell population sizes for any mechanistic model specified in the BioNetGen markup language. By streamlining the process of developing mechanistic models for large single-cell datasets, BioNetGMMFit provides an easily accessible modeling framework designed for scale and for the broader biochemical signaling community.
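The idea of fitting a rate constant to single-cell snapshots by matching population moments can be sketched in a few lines. The sketch below is not BioNetGMMFit or CyGMM code; it assumes that "GMM" refers to the generalized method of moments, and the decay model, the lognormal initial abundances, the weights, and the grid search are all hypothetical choices made only for illustration.

```python
import math
import random

def simulate_snapshot(k, initial_abundances, t):
    """Abundance of species A in each cell at time t for first-order decay A -> 0."""
    return [a0 * math.exp(-k * t) for a0 in initial_abundances]

def moment_vector(samples):
    """Mean and (population) variance of one snapshot."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return [mean, var]

def gmm_cost(k, initial_abundances, observed_moments, t, weights):
    """Weighted squared distance between simulated and observed moments."""
    predicted = moment_vector(simulate_snapshot(k, initial_abundances, t))
    return sum(w * (p - o) ** 2
               for w, p, o in zip(weights, predicted, observed_moments))

rng = random.Random(0)
true_k, t_obs = 0.3, 2.0

# Cell-to-cell variability enters through the initial abundances (hypothetical).
x0_data = [rng.lognormvariate(4.0, 0.3) for _ in range(1000)]
x0_model = [rng.lognormvariate(4.0, 0.3) for _ in range(1000)]

# Synthetic "observed" snapshot and its moments.
observed_moments = moment_vector(simulate_snapshot(true_k, x0_data, t_obs))

# Scale each moment by its observed magnitude so both contribute comparably.
weights = [1.0 / m ** 2 for m in observed_moments]

# Coarse grid search over k; a real fit would use a global optimizer.
grid = [0.005 * i for i in range(1, 201)]
k_hat = min(grid, key=lambda k: gmm_cost(k, x0_model, observed_moments, t_obs, weights))
print(f"estimated k = {k_hat:.3f} (true value {true_k})")
```

The weights here are a simple normalization; a full generalized-method-of-moments fit would typically use a weight matrix estimated from the data.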
Affiliation(s)
- John Wu
  - Department of Computer Science, The Ohio State University, 281 W Lane Ave, Columbus, OH, 43210, USA
  - Steve and Cindy Rasmussen Institute for Genomics, The Abigail Wexner Research Institute, Nationwide Children's Hospital, 700 Children's Drive, Columbus, OH, 43205, USA
- Ciriyam Jayaprakash
  - Department of Physics, The Ohio State University, 191 W Woodruff Ave, Columbus, OH, 43210, USA
- Jayajit Das
  - Steve and Cindy Rasmussen Institute for Genomics, The Abigail Wexner Research Institute, Nationwide Children's Hospital, 700 Children's Drive, Columbus, OH, 43205, USA
  - Departments of Pediatrics, Biomedical Informatics, Pelotonia Institute of Immuno-Oncology, College of Medicine, and Biophysics Program, The Ohio State University, 370 W 9th Ave, Columbus, OH, 43210, USA
2
Kochen MA, Wiley HS, Feng S, Sauro HM. SBbadger: biochemical reaction networks with definable degree distributions. Bioinformatics 2022; 38:5064-5072. [PMID: 36111865 PMCID: PMC9665861 DOI: 10.1093/bioinformatics/btac630] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2022] [Revised: 07/24/2022] [Accepted: 09/15/2022] [Indexed: 12/24/2022]
Abstract
MOTIVATION: An essential step in developing computational tools for the inference, optimization and simulation of biochemical reaction networks is gauging tool performance against earlier efforts using an appropriate set of benchmarks. General strategies for the assembly of benchmark models include collection from the literature, creation via subnetwork extraction and de novo generation. However, with respect to biochemical reaction networks, these approaches and their associated tools are either poorly suited to generate models that reflect the wide range of properties found in natural biochemical networks or to do so in numbers that enable rigorous statistical analysis.
RESULTS: In this work, we present SBbadger, a Python-based software tool for the generation of synthetic biochemical reaction or metabolic networks with user-defined degree distributions, multiple available kinetic formalisms and a host of other definable properties. SBbadger thus enables the creation of benchmark model sets that reflect properties of biological systems and generate the kinetics and model structures typically targeted by computational analysis and inference software. Here, we detail the computational and algorithmic workflow of SBbadger, demonstrate its performance under various settings, provide sample outputs and compare it to currently available biochemical reaction network generation software.
AVAILABILITY AND IMPLEMENTATION: SBbadger is implemented in Python and is freely available at https://github.com/sys-bio/SBbadger and via PyPI at https://pypi.org/project/SBbadger/. Documentation can be found at https://SBbadger.readthedocs.io.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Michael A Kochen
  - Department of Bioengineering, University of Washington, Seattle, WA 98105, USA
- H Steven Wiley
  - Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA, 99354, USA
- Song Feng
  - Biological Science Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
- Herbert M Sauro
  - Department of Bioengineering, University of Washington, Seattle, WA 98105, USA
3
Neumann J, Lin YT, Mallela A, Miller EF, Colvin J, Duprat AT, Chen Y, Hlavacek WS, Posner RG. Implementation of a practical Markov chain Monte Carlo sampling algorithm in PyBioNetFit. Bioinformatics 2022; 38:1770-1772. [PMID: 34986226 DOI: 10.1093/bioinformatics/btac004] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/21/2021] [Revised: 11/30/2021] [Accepted: 01/03/2022] [Indexed: 02/03/2023]
Abstract
SUMMARY: Bayesian inference in biological modeling commonly relies on Markov chain Monte Carlo (MCMC) sampling of a multidimensional and non-Gaussian posterior distribution that is not analytically tractable. Here, we present the implementation of a practical MCMC method in the open-source software package PyBioNetFit (PyBNF), which is designed to support parameterization of mathematical models for biological systems. The new MCMC method, am, incorporates an adaptive move proposal distribution. For warm starts, sampling can be initiated at a specified location in parameter space and with a multivariate Gaussian proposal distribution defined initially by a specified covariance matrix. Multiple chains can be generated in parallel using a computer cluster. We demonstrate that am can be used to successfully solve real-world Bayesian inference problems, including forecasting of new Coronavirus Disease 2019 case detection with Bayesian quantification of forecast uncertainty.
AVAILABILITY AND IMPLEMENTATION: PyBNF version 1.1.9, the first stable release with am, is available at PyPI and can be installed using the pip package-management system on platforms that have a working installation of Python 3. PyBNF relies on libRoadRunner and BioNetGen for simulations (e.g. numerical integration of ordinary differential equations defined in SBML or BNGL files) and Dask.Distributed for task scheduling on Linux computer clusters. The Python source code can be freely downloaded/cloned from GitHub and used and modified under terms of the BSD-3 license (https://github.com/lanl/pybnf). Online documentation covering installation/usage is available (https://pybnf.readthedocs.io/en/latest/). A tutorial video is available on YouTube (https://www.youtube.com/watch?v=2aRqpqFOiS4&t=63s).
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
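An adaptive move proposal of the general kind described above can be illustrated with a one-dimensional adaptive Metropolis sampler. This is a minimal sketch and not PyBNF's am implementation: the Gaussian target, the 2.38² variance scaling (a common choice in the adaptive Metropolis literature), and all function and parameter names are assumptions made for illustration.

```python
import math
import random

def log_posterior(theta):
    """Hypothetical 1-D log-posterior: Gaussian with mean 3.0 and s.d. 0.5."""
    return -0.5 * ((theta - 3.0) / 0.5) ** 2

def adaptive_metropolis(log_post, start, n_steps, init_sd=1.0, adapt_after=200, seed=1):
    """Random-walk Metropolis whose proposal width is tuned from the chain history."""
    rng = random.Random(seed)
    chain = [start]
    current_lp = log_post(start)
    mean, m2 = start, 0.0  # running mean and sum of squared deviations (Welford)
    for step in range(1, n_steps):
        if step > adapt_after:
            history_var = m2 / (step - 1)  # sample variance of the chain so far
            sd = math.sqrt(2.38 ** 2 * history_var + 1e-8)
        else:
            sd = init_sd  # fixed proposal during the initial, non-adaptive phase
        proposal = chain[-1] + rng.gauss(0.0, sd)
        lp = log_post(proposal)
        # Metropolis acceptance rule for a symmetric proposal.
        if rng.random() < math.exp(min(0.0, lp - current_lp)):
            chain.append(proposal)
            current_lp = lp
        else:
            chain.append(chain[-1])
        # Update the running statistics with the newly recorded state.
        x = chain[-1]
        delta = x - mean
        mean += delta / (step + 1)
        m2 += delta * (x - mean)
    return chain

chain = adaptive_metropolis(log_posterior, start=0.0, n_steps=20000)
kept = chain[5000:]  # discard burn-in
post_mean = sum(kept) / len(kept)
post_sd = math.sqrt(sum((x - post_mean) ** 2 for x in kept) / len(kept))
print(f"posterior mean ~ {post_mean:.2f}, s.d. ~ {post_sd:.2f}")
```

In more than one dimension the same idea uses the empirical covariance matrix of the chain in place of the scalar variance, which matches the abstract's mention of a multivariate Gaussian proposal with a specified initial covariance.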
Affiliation(s)
- Jacob Neumann
  - Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ 86011, USA
- Yen Ting Lin
  - Information Sciences Group, Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Abhishek Mallela
  - Department of Mathematics, University of California, Davis, CA 95616, USA
- Ely F Miller
  - Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ 86011, USA
- Joshua Colvin
  - Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ 86011, USA
- Abell T Duprat
  - Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ 86011, USA
- Ye Chen
  - Department of Mathematics and Statistics, Northern Arizona University, Flagstaff, AZ 86011, USA
- William S Hlavacek
  - Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Richard G Posner
  - Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ 86011, USA
4
Budde K, Smith J, Wilsdorf P, Haack F, Uhrmacher AM. Relating simulation studies by provenance-Developing a family of Wnt signaling models. PLoS Comput Biol 2021; 17:e1009227. [PMID: 34351901 PMCID: PMC8407594 DOI: 10.1371/journal.pcbi.1009227] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/15/2021] [Revised: 08/31/2021] [Accepted: 06/29/2021] [Indexed: 12/28/2022]
Abstract
For many biological systems, a variety of simulation models exist. A new simulation model is rarely developed from scratch, but rather revises and extends an existing one. A key challenge, however, is to decide which model might be an appropriate starting point for a particular problem and why. To answer this question, we need to identify entities and activities that contributed to the development of a simulation model. Therefore, we exploit the provenance data model, PROV-DM, of the World Wide Web Consortium and, building on previous work, continue developing a PROV ontology for simulation studies. Based on a case study of 19 Wnt/β-catenin signaling models, we identify crucial entities and activities as well as useful metadata to both capture the provenance information from individual simulation studies and relate these forming a family of models. The approach is implemented in WebProv, a web application for inserting and querying provenance information. Our specialization of PROV-DM contains the entities Research Question, Assumption, Requirement, Qualitative Model, Simulation Model, Simulation Experiment, Simulation Data, and Wet-lab Data as well as activities referring to building, calibrating, validating, and analyzing a simulation model. We show that most Wnt simulation models are connected to other Wnt models by using (parts of) these models. However, the overlap, especially regarding the Wet-lab Data used for calibration or validation of the models is small. Making these aspects of developing a model explicit and queryable is an important step for assessing and reusing simulation models more effectively. Exposing this information helps to integrate a new simulation model within a family of existing ones and may lead to the development of more robust and valid simulation models. We hope that our approach becomes part of a standardization effort and that modelers adopt the benefits of provenance when considering or creating simulation models. 
We revise a provenance ontology for simulation studies of cellular biochemical models. Provenance information is useful for understanding the creation of a simulation model because it not only contains information about the entities and activities that have led to a simulation model but also their relations, all of which can be visualized. It provides additional structure by explicitly recording research questions, assumptions, and requirements and relating them along with data, qualitative models, simulation models, and simulation experiments through a small set of predefined but extensible activities. We have applied our concept to a family of 19 Wnt signaling models and implemented a web-based tool (WebProv) to store the provenance information from these studies. The resulting provenance graph visualizes the story line of simulation studies and demonstrates the creation and calibration of simulation models, the successive attempts of validation and extension, and shows, beyond an individual simulation study, how the Wnt models are related. Thereby, the steps and sources that contributed to a simulation model are made explicit. Our approach complements other approaches aimed at facilitating the reuse and assessment of simulation products in systems biology such as model repositories as well as annotation and documentation guidelines.
Affiliation(s)
- Kai Budde
  - Institute for Visual and Analytic Computing, University of Rostock, Rostock, Germany
- Jacob Smith
  - Faculty of Computer Science, University of New Brunswick, Fredericton, Canada
- Pia Wilsdorf
  - Institute for Visual and Analytic Computing, University of Rostock, Rostock, Germany
- Fiete Haack
  - Institute for Visual and Analytic Computing, University of Rostock, Rostock, Germany
- Adelinde M. Uhrmacher
  - Institute for Visual and Analytic Computing, University of Rostock, Rostock, Germany
5
Rukhlenko OS, Kholodenko BN. Modeling the Nonlinear Dynamics of Intracellular Signaling Networks. Bio Protoc 2021; 11:e4089. [PMID: 34395728 PMCID: PMC8329461 DOI: 10.21769/bioprotoc.4089] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/20/2020] [Revised: 04/09/2021] [Accepted: 05/28/2021] [Indexed: 11/17/2022]
Abstract
This protocol illustrates a pipeline for modeling the nonlinear behavior of intracellular signaling pathways. At fixed spatial points, nonlinear signaling dynamics are described by ordinary differential equations (ODEs). At constant parameters, these ODEs may have multiple attractors, such as multiple steady states or limit cycles. Standard optimization procedures fine-tune the parameters for the system trajectories localized within the basin of attraction of only one attractor, usually a stable steady state. The suggested protocol samples the parameter space and captures the overall dynamic behavior by analyzing the number and stability of steady states and the shapes of the assembly of nullclines, which are determined as projections of quasi-steady-state trajectories into different 2D spaces of system variables. Our pipeline makes it possible to identify the main qualitative features of the model behavior, perform bifurcation analysis, and determine the borders separating the different dynamical regimes within the assembly of 2D parametric planes. Partial differential equation (PDE) systems describing the nonlinear spatiotemporal behavior are derived by coupling fixed-point dynamics with species diffusion.
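Counting steady states and classifying their stability, the basic step named above, can be sketched for a one-variable ODE. The rate law, its parameter values, and the helper names below are hypothetical and are not taken from the protocol; the sketch only shows how sign changes of dx/dt locate steady states and how the local slope decides stability.

```python
def rate(x, s=0.04, d=0.5):
    """Hypothetical 1-D signaling ODE: basal input + positive feedback - linear decay."""
    return s + x * x / (1.0 + x * x) - d * x

def bisect(f, lo, hi, tol=1e-10):
    """Refine a root of f bracketed by [lo, hi], where f changes sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def steady_states(f, x_max=5.0, n_grid=5000, h=1e-6):
    """Return (x*, is_stable) for every sign change of f on a uniform grid over [0, x_max]."""
    states = []
    xs = [x_max * i / n_grid for i in range(n_grid + 1)]
    for a, b in zip(xs, xs[1:]):
        if f(a) * f(b) < 0.0:
            root = bisect(f, a, b)
            # A steady state of dx/dt = f(x) is stable when f'(x*) < 0.
            slope = (f(root + h) - f(root - h)) / (2.0 * h)
            states.append((root, slope < 0.0))
    return states

states = steady_states(rate)
for x_star, stable in states:
    print(f"x* = {x_star:.4f}  {'stable' if stable else 'unstable'}")
```

Repeating the call while sweeping a parameter such as `s` and recording where the number of steady states changes gives a crude picture of the borders between dynamical regimes.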
Affiliation(s)
- Oleksii S Rukhlenko
  - Systems Biology Ireland, School of Medicine and Medical Science, University College Dublin, Belfield, Dublin 4, Ireland
- Boris N Kholodenko
  - Systems Biology Ireland, School of Medicine and Medical Science, University College Dublin, Belfield, Dublin 4, Ireland
  - Conway Institute of Biomolecular & Biomedical Research, University College Dublin, Belfield, Dublin 4, Ireland
  - Department of Pharmacology, Yale University School of Medicine, New Haven, USA
6
Zhang L, Liu G, Kong M, Li T, Wu D, Zhou X, Yang C, Xia L, Yang Z, Chen L. Revealing dynamic regulations and the related key proteins of myeloma-initiating cells by integrating experimental data into a systems biological model. Bioinformatics 2021; 37:1554-1561. [PMID: 31350562 DOI: 10.1093/bioinformatics/btz542] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/27/2018] [Revised: 06/17/2019] [Accepted: 07/19/2019] [Indexed: 12/24/2022]
Abstract
MOTIVATION: The growth and survival of myeloma cells are greatly affected by their surrounding microenvironment. To understand the molecular mechanism and the impact of stiffness on the fate of myeloma-initiating cells (MICs), we develop a systems biological model to reveal the dynamic regulations by integrating reverse-phase protein array data and the stiffness-associated pathway.
RESULTS: We not only develop a stiffness-associated signaling pathway to describe the dynamic regulations of the MICs, but also clearly identify three critical proteins governing MIC proliferation and death, namely FAK, mTORC1 and NFκB, which are validated as related to multiple myeloma by our immunohistochemistry experiment, computation and manually reviewed evidence. Moreover, we demonstrate that the systematic model performs better than widely used parameter estimation algorithms for the complicated signaling pathway.
AVAILABILITY AND IMPLEMENTATION: We can not only use the systems biological model to infer the stiffness-associated genetic signaling pathway and locate the critical proteins, but also investigate the important pathways, proteins or genes for other types of cancer. Thus, it holds universal scientific significance.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Le Zhang
  - College of Computer Science
  - Medical Big Data Center, Sichuan University, Chengdu 610065, China
  - Chongqing Zhongdi Medical Information Technology Co., Ltd, Chongqing 401320, China
- Guangdi Liu
  - College of Computer and Information Science, Southwest University, Chongqing 400715, China
  - Library of Chengdu University, Chengdu University, Chengdu 610106, China
- Meijing Kong
  - College of Computer and Information Science, Southwest University, Chongqing 400715, China
- Tingting Li
  - College of Mathematics and Statistics, Southwest University, Chongqing 400715, China
- Dan Wu
  - Department of Radiology, Wake Forest University School of Medicine, Winston-Salem, NC 27157, USA
- Xiaobo Zhou
  - Department of Radiology, Wake Forest University School of Medicine, Winston-Salem, NC 27157, USA
- Chuanwei Yang
  - Systems Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Lei Xia
  - Cancer Center, Research Institute of Surgery, Daping Hospital, Third Military Medical University, Chongqing 400042, China
- Zhenzhou Yang
  - Cancer Center, Research Institute of Surgery, Daping Hospital, Third Military Medical University, Chongqing 400042, China
- Luonan Chen
  - Key Laboratory of Systems Biology, CAS Center for Excellence in Molecular Cell Science, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
  - Shanghai Research Center for Brain Science and Brain-Inspired Intelligence, Shanghai 201210, China
7
Mitra ED, Hlavacek WS. Bayesian inference using qualitative observations of underlying continuous variables. Bioinformatics 2020; 36:3177-3184. [PMID: 32049328 PMCID: PMC7214020 DOI: 10.1093/bioinformatics/btaa084] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/29/2019] [Revised: 01/08/2020] [Accepted: 02/03/2020] [Indexed: 01/28/2023]
Abstract
MOTIVATION: Recent work has demonstrated the feasibility of using non-numerical, qualitative data to parameterize mathematical models. However, uncertainty quantification (UQ) of such parameterized models has remained challenging because of a lack of a statistical interpretation of the objective functions used in optimization.
RESULTS: We formulated likelihood functions suitable for performing Bayesian UQ using qualitative observations of underlying continuous variables or a combination of qualitative and quantitative data. To demonstrate the resulting UQ capabilities, we analyzed a published model for immunoglobulin E (IgE) receptor signaling using synthetic qualitative and quantitative datasets. Remarkably, estimates of parameter values derived from the qualitative data were nearly as consistent with the assumed ground-truth parameter values as estimates derived from the lower throughput quantitative data. These results provide further motivation for leveraging qualitative data in biological modeling.
AVAILABILITY AND IMPLEMENTATION: The likelihood functions presented here are implemented in a new release of PyBioNetFit, an open-source application for analyzing Systems Biology Markup Language- and BioNetGen Language-formatted models, available online at www.github.com/lanl/PyBNF.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
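One way to give a qualitative observation a statistical interpretation is to treat it as a thresholded version of a noisy continuous quantity. The sketch below is an illustrative assumption and not necessarily the likelihood used in the paper or in PyBioNetFit: it scores an ordering statement "A < B" with a Gaussian noise model on the difference, and the function names and the noise parameter `sigma` are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def log_likelihood_ordering(pred_a, pred_b, a_less_than_b, sigma):
    """Log-probability of the qualitative observation 'A < B' (or 'A >= B').

    Assumes, for illustration only, that the observed difference B - A equals
    the model-predicted difference plus Gaussian noise with s.d. sigma.
    """
    p_less = normal_cdf((pred_b - pred_a) / sigma)
    p = p_less if a_less_than_b else 1.0 - p_less
    return math.log(max(p, 1e-300))  # floor avoids log(0) for extreme predictions

# Observation: "A < B". Compare three hypothetical model predictions.
consistent = log_likelihood_ordering(1.0, 3.0, True, sigma=1.0)
tie = log_likelihood_ordering(2.0, 2.0, True, sigma=1.0)
inconsistent = log_likelihood_ordering(3.0, 1.0, True, sigma=1.0)
print(consistent, tie, inconsistent)
```

Because each qualitative observation contributes a proper log-probability, such terms can be summed with ordinary Gaussian log-likelihoods for quantitative data and passed to an MCMC sampler.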
Affiliation(s)
- Eshan D Mitra
  - Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- William S Hlavacek
  - Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
8
Hastings JF, O'Donnell YEI, Fey D, Croucher DR. Applications of personalised signalling network models in precision oncology. Pharmacol Ther 2020; 212:107555. [PMID: 32320730 DOI: 10.1016/j.pharmthera.2020.107555] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 02/12/2020] [Accepted: 04/07/2020] [Indexed: 02/07/2023]
Abstract
As our ability to provide in-depth, patient-specific characterisation of the molecular alterations within tumours rapidly improves, it is becoming apparent that new approaches will be required to leverage the power of this data and derive the full benefit for each individual patient. Systems biology approaches are beginning to emerge within this field as a potential method of incorporating large volumes of network level data and distilling a coherent, clinically-relevant prediction of drug response. However, the initial promise of this developing field is yet to be realised. Here we argue that in order to develop these precise models of individual drug response and tailor treatment accordingly, we will need to develop mathematical models capable of capturing both the dynamic nature of drug-response signalling networks and key patient-specific information such as mutation status or expression profiles. We also review the modelling approaches commonly utilised within this field, and outline recent examples of their use in furthering the application of systems biology for a precision medicine approach to cancer treatment.
Affiliation(s)
- Jordan F Hastings
  - The Kinghorn Cancer Centre, Garvan Institute of Medical Research, Sydney, Australia
- Dirk Fey
  - Systems Biology Ireland, University College Dublin, Belfield, Dublin 4, Ireland
  - School of Medicine and Medical Science, University College Dublin, Belfield, Dublin 4, Ireland
- David R Croucher
  - The Kinghorn Cancer Centre, Garvan Institute of Medical Research, Sydney, Australia
  - School of Medicine and Medical Science, University College Dublin, Belfield, Dublin 4, Ireland
  - St Vincent's Hospital Clinical School, University of New South Wales, Sydney, NSW 2052, Australia
9
Salazar-Cavazos E, Nitta CF, Mitra ED, Wilson BS, Lidke KA, Hlavacek WS, Lidke DS. Multisite EGFR phosphorylation is regulated by adaptor protein abundances and dimer lifetimes. Mol Biol Cell 2020; 31:695-708. [PMID: 31913761 PMCID: PMC7202077 DOI: 10.1091/mbc.e19-09-0548] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 11/11/2022]
Abstract
Differential epidermal growth factor receptor (EGFR) phosphorylation is thought to couple receptor activation to distinct signaling pathways. However, the molecular mechanisms responsible for biased signaling are unresolved due to a lack of insight into the phosphorylation patterns of full-length EGFR. We extended a single-molecule pull-down technique previously used to study protein-protein interactions to allow for robust measurement of receptor phosphorylation. We found that EGFR is predominantly phosphorylated at multiple sites, yet phosphorylation at specific tyrosines is variable and only a subset of receptors share phosphorylation at the same site, even with saturating ligand concentrations. We found distinct populations of receptors as soon as 1 min after ligand stimulation, indicating early diversification of function. To understand this heterogeneity, we developed a mathematical model. The model predicted that variations in phosphorylation are dependent on the abundances of signaling partners, while phosphorylation levels are dependent on dimer lifetimes. The predictions were confirmed in studies of cell lines with different expression levels of signaling partners, and in experiments comparing low- and high-affinity ligands and oncogenic EGFR mutants. These results reveal how ligand-regulated receptor dimerization dynamics and adaptor protein concentrations play critical roles in EGFR signaling.
Affiliation(s)
- Eshan D Mitra
  - Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545
- Keith A Lidke
  - Comprehensive Cancer Center, and
  - Department of Physics and Astronomy, University of New Mexico, Albuquerque, NM 87131
- William S Hlavacek
  - Comprehensive Cancer Center, and
  - Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545
- Diane S Lidke
  - Department of Pathology
  - Comprehensive Cancer Center, and
10
Santibáñez R, Garrido D, Martin AJM. Pleione: A tool for statistical and multi-objective calibration of Rule-based models. Sci Rep 2019; 9:15104. [PMID: 31641245 PMCID: PMC6805871 DOI: 10.1038/s41598-019-51546-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/07/2019] [Accepted: 09/24/2019] [Indexed: 11/17/2022]
Abstract
Mathematical models based on Ordinary Differential Equations (ODEs) are frequently used to describe and simulate biological systems. Nevertheless, such models are often difficult to understand. Unlike ODE models, Rule-Based Models (RBMs) utilise formal language to describe reactions as a cumulative number of statements that are easier to understand and correct. They are also gaining popularity because of their conciseness and simulation flexibility. However, RBMs generally lack tools to perform further analysis that requires simulation. This situation arises because exact and approximate simulations are computationally intensive. Translating RBMs into ODEs is commonly used to reduce simulation time, but this technique may be prohibitive due to combinatorial explosion. Here, we present the software called Pleione to calibrate RBMs. Parameter calibration is essential given the incomplete experimental determination of reaction rates and the goal of using models to reproduce experimental data. The software distributes stochastic simulations and calculations and incorporates equivalence tests to determine the fitness of RBMs compared with data. The primary features of Pleione were thoroughly tested on a model of gene regulation in Escherichia coli. Pleione yielded satisfactory results regarding calculation time and error reduction for multiple simulators, models, parameter search strategies, and computing infrastructures.
Affiliation(s)
- Rodrigo Santibáñez
  - Network Biology Lab, Centro de Genómica y Bioinformática, Facultad de Ciencias, Universidad Mayor, Santiago, 8580745, Chile
  - Department of Chemical and Bioprocess Engineering, School of Engineering, Pontificia Universidad Católica de Chile, Santiago, 7820436, Chile
- Daniel Garrido
  - Department of Chemical and Bioprocess Engineering, School of Engineering, Pontificia Universidad Católica de Chile, Santiago, 7820436, Chile
- Alberto J M Martin
  - Network Biology Lab, Centro de Genómica y Bioinformática, Facultad de Ciencias, Universidad Mayor, Santiago, 8580745, Chile
11
Mitra ED, Suderman R, Colvin J, Ionkov A, Hu A, Sauro HM, Posner RG, Hlavacek WS. PyBioNetFit and the Biological Property Specification Language. iScience 2019; 19:1012-1036. [PMID: 31522114 PMCID: PMC6744527 DOI: 10.1016/j.isci.2019.08.045] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 04/01/2019] [Revised: 06/21/2019] [Accepted: 08/22/2019] [Indexed: 02/07/2023]
Abstract
In systems biology modeling, important steps include model parameterization, uncertainty quantification, and evaluation of agreement with experimental observations. To help modelers perform these steps, we developed the software PyBioNetFit, which in addition supports checking models against known system properties and solving design problems. PyBioNetFit introduces Biological Property Specification Language (BPSL) for the formal declaration of system properties. BPSL allows qualitative data to be used alone or in combination with quantitative data. PyBioNetFit performs parameterization with parallelized metaheuristic optimization algorithms that work directly with existing model definition standards: BioNetGen Language (BNGL) and Systems Biology Markup Language (SBML). We demonstrate PyBioNetFit's capabilities by solving various example problems, including the challenging problem of parameterizing a 153-parameter model of cell cycle control in yeast based on both quantitative and qualitative data. We demonstrate the model checking and design applications of PyBioNetFit and BPSL by analyzing a model of targeted drug interventions in autophagy signaling.
Affiliation(s)
- Eshan D Mitra
  - Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Ryan Suderman
  - Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Joshua Colvin
  - Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ, USA
- Alexander Ionkov
  - Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Andrew Hu
  - Department of Bioengineering, University of Washington, Seattle, WA, USA
- Herbert M Sauro
  - Department of Bioengineering, University of Washington, Seattle, WA, USA
- Richard G Posner
  - Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ, USA
- William S Hlavacek
  - Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
12
Lin YT, Feng S, Hlavacek WS. Scaling methods for accelerating kinetic Monte Carlo simulations of chemical reaction networks. J Chem Phys 2019; 150:244101. [PMID: 31255063 DOI: 10.1063/1.5096774] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 01/21/2023] Open
Abstract
Various kinetic Monte Carlo algorithms become inefficient when some of the population sizes in a system are large, which gives rise to a large number of reaction events per unit time. Here, we present a new acceleration algorithm based on adaptive and heterogeneous scaling of reaction rates and stoichiometric coefficients. The algorithm is conceptually related to the commonly used idea of accelerating a stochastic simulation by considering a subvolume λΩ (0 < λ < 1) within a system of interest, which reduces the number of reaction events per unit time occurring in a simulation by a factor 1/λ at the cost of greater error in unbiased estimates of first moments and biased overestimates of second moments. Our new approach offers two unique benefits. First, scaling is adaptive and heterogeneous, which eliminates the pitfall of overaggressive scaling. Second, there is no need for an a priori classification of populations as discrete or continuous (as in a hybrid method), which is problematic when discreteness of a chemical species changes during a simulation. The method requires specification of only a single algorithmic parameter, Nc, a global critical population size above which populations are effectively scaled down to increase simulation efficiency. The method, which we term partial scaling, is implemented in the open-source BioNetGen software package. We demonstrate that partial scaling can significantly accelerate simulations without significant loss of accuracy for several published models of biological systems. These models characterize activation of the mitogen-activated protein kinase ERK, prion protein aggregation, and T-cell receptor signaling.
Affiliation(s)
- Yen Ting Lin
  - Center for Nonlinear Studies and Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
- Song Feng
  - Center for Nonlinear Studies and Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
- William S Hlavacek
  - Center for Nonlinear Studies and Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
13
Sorokin A, Sorokina O, Douglas Armstrong J. RKappa: Software for Analyzing Rule-Based Models. Methods Mol Biol 2019; 1945:363-390. [PMID: 30945256 DOI: 10.1007/978-1-4939-9102-0_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/11/2023]
Abstract
RKappa is a framework for the development, simulation, and analysis of rule-based models within the mature, statistically empowered R environment. It is designed for model editing, parameter identification, simulation, sensitivity analysis, and visualization. The framework is optimized for high-performance computing platforms and facilitates analysis of large-scale systems biology models where knowledge of exact mechanisms is limited and parameter values are uncertain. The RKappa software is an open-source (GPL3 license) package for R, which is freely available online (https://github.com/lptolik/R4Kappa).
Affiliation(s)
- Anatoly Sorokin
  - Institute of Cell Biophysics, Russian Academy of Sciences, Moscow Region, Russia
  - Moscow Institute of Physics and Technology, Moscow Region, Russia
- Oksana Sorokina
  - School of Informatics, University of Edinburgh, Edinburgh, UK
14
Shockley EM, Vrugt JA, Lopez CF. PyDREAM: high-dimensional parameter inference for biological models in python. Bioinformatics 2019; 34:695-697. [PMID: 29028896 PMCID: PMC5860607 DOI: 10.1093/bioinformatics/btx626] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 02/07/2017] [Accepted: 10/03/2017] [Indexed: 11/22/2022] Open
Abstract
Summary: Biological models contain many parameters whose values are difficult to measure directly via experimentation and therefore require calibration against experimental data. Markov chain Monte Carlo (MCMC) methods are suitable to estimate multivariate posterior model parameter distributions, but these methods may exhibit slow or premature convergence in high-dimensional search spaces. Here, we present PyDREAM, a Python implementation of the (Multiple-Try) Differential Evolution Adaptive Metropolis [DREAM(ZS)] algorithm developed by Vrugt and ter Braak (2008) and Laloy and Vrugt (2012). PyDREAM achieves excellent performance for complex, parameter-rich models and takes full advantage of distributed computing resources, facilitating parameter inference and uncertainty estimation of CPU-intensive biological models.
Availability and implementation: PyDREAM is freely available under the GNU GPLv3 license from the Lopez lab GitHub repository at http://github.com/LoLab-VU/PyDREAM.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Erin M Shockley
  - Department of Biochemistry, Vanderbilt University, 2215 Garland Avenue, Nashville, TN 37212, USA
- Jasper A Vrugt
  - Department of Civil and Environmental Engineering, University of California Irvine, 4130 Engineering Gateway, Irvine, CA 92697-2175, USA
  - Department of Earth System Science, University of California Irvine, 3200 Croul Hall St, Irvine, CA 92697-2175, USA
- Carlos F Lopez
  - Department of Biochemistry, Vanderbilt University, 2215 Garland Avenue, Nashville, TN 37212, USA
15
Erickson KE, Rukhlenko OS, Shahinuzzaman M, Slavkova KP, Lin YT, Suderman R, Stites EC, Anghel M, Posner RG, Barua D, Kholodenko BN, Hlavacek WS. Modeling cell line-specific recruitment of signaling proteins to the insulin-like growth factor 1 receptor. PLoS Comput Biol 2019; 15:e1006706. [PMID: 30653502 PMCID: PMC6353226 DOI: 10.1371/journal.pcbi.1006706] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/21/2018] [Revised: 01/30/2019] [Accepted: 12/09/2018] [Indexed: 12/27/2022] Open
Abstract
Receptor tyrosine kinases (RTKs) typically contain multiple autophosphorylation sites in their cytoplasmic domains. Once activated, these autophosphorylation sites can recruit downstream signaling proteins containing Src homology 2 (SH2) and phosphotyrosine-binding (PTB) domains, which recognize phosphotyrosine-containing short linear motifs (SLiMs). These domains and SLiMs have polyspecific or promiscuous binding activities. Thus, multiple signaling proteins may compete for binding to a common SLiM and vice versa. To investigate the effects of competition on RTK signaling, we used a rule-based modeling approach to develop and analyze models for ligand-induced recruitment of SH2/PTB domain-containing proteins to autophosphorylation sites in the insulin-like growth factor 1 (IGF1) receptor (IGF1R). Models were parameterized using published datasets reporting protein copy numbers and site-specific binding affinities. Simulations were facilitated by a novel application of model restructuration, to reduce redundancy in rule-derived equations. We compare predictions obtained via numerical simulation of the model to those obtained through simple prediction methods, such as through an analytical approximation, or ranking by copy number and/or KD value, and find that the simple methods are unable to recapitulate the predictions of numerical simulations. We created 45 cell line-specific models that demonstrate how early events in IGF1R signaling depend on the protein abundance profile of a cell. Simulations, facilitated by model restructuration, identified pairs of IGF1R binding partners that are recruited in anti-correlated and correlated fashions, despite no inclusion of cooperativity in our models. This work shows that the outcome of competition depends on the physicochemical parameters that characterize pairwise interactions, as well as network properties, including network connectivity and the relative abundances of competitors. 
Cells rely on networks of interacting biomolecules to sense and respond to environmental perturbations and signals. However, it is unclear how information is processed to generate appropriate and specific responses to signals, especially given that these networks tend to share many components. For example, receptors that detect distinct ligands and regulate distinct cellular activities commonly interact with overlapping sets of downstream signaling proteins. Here, to investigate the downstream signaling of a well-studied receptor tyrosine kinase (RTK), the insulin-like growth factor 1 (IGF1) receptor (IGF1R), we formulated and analyzed 45 cell line-specific mathematical models, which account for recruitment of 18 different binding partners to six sites of receptor autophosphorylation in IGF1R. The models were parameterized using available protein copy number and site-specific affinity measurements, and restructured to allow for network generation. We find that recruitment is influenced by the protein abundance profile of a cell, with different patterns of recruitment in different cell lines. Furthermore, in a given cell line, we find that pairs of IGF1R binding partners may be recruited in a correlated or anti-correlated fashion. We demonstrate that the simulations of the model have greater predictive power than protein copy number and/or binding affinity data, and that even a simple analytical model cannot reproduce the predicted recruitment ranking obtained via simulations. These findings represent testable predictions and indicate that the outputs of IGF1R signaling depend on cell line-specific properties in addition to the properties that are intrinsic to the biomolecules involved.
Affiliation(s)
- Keesha E. Erickson
  - Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- Md Shahinuzzaman
  - Department of Chemical and Biochemical Engineering, University of Missouri Science and Technology, Rolla, Missouri, United States of America
- Kalina P. Slavkova
  - Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
  - Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- Yen Ting Lin
  - Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
  - Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- Ryan Suderman
  - Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
  - Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- Edward C. Stites
  - The Salk Institute for Biological Studies, La Jolla, California, United States of America
- Marian Anghel
  - Information Sciences Group, Computer, Computational and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- Richard G. Posner
  - Department of Biological Sciences, Northern Arizona University, Flagstaff, Arizona, United States of America
- Dipak Barua
  - Department of Chemical and Biochemical Engineering, University of Missouri Science and Technology, Rolla, Missouri, United States of America
- Boris N. Kholodenko
  - Systems Biology Ireland, University College Dublin, Belfield, Dublin, Ireland
  - School of Medicine and Medical Science and Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Belfield, Dublin, Ireland
- William S. Hlavacek
  - Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
  - Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
16
Hlavacek WS, Csicsery-Ronay JA, Baker LR, Ramos Álamo MDC, Ionkov A, Mitra ED, Suderman R, Erickson KE, Dias R, Colvin J, Thomas BR, Posner RG. A Step-by-Step Guide to Using BioNetFit. Methods Mol Biol 2019; 1945:391-419. [PMID: 30945257 DOI: 10.1007/978-1-4939-9102-0_18] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/09/2023]
Abstract
BioNetFit is a software tool designed for solving parameter identification problems that arise in the development of rule-based models. It solves these problems through curve fitting (i.e., nonlinear regression). BioNetFit is compatible with deterministic and stochastic simulators that accept BioNetGen language (BNGL)-formatted files as inputs, such as those available within the BioNetGen framework. BioNetFit can be used on a laptop or stand-alone multicore workstation as well as on many Linux clusters, such as those that use the Slurm Workload Manager to schedule jobs. BioNetFit implements a metaheuristic population-based global optimization procedure, an evolutionary algorithm (EA), to minimize a user-defined objective function, such as a residual sum of squares (RSS) function. BioNetFit also implements a bootstrapping procedure for determining confidence intervals for parameter estimates. Here, we provide step-by-step instructions for using BioNetFit to estimate the values of parameters of a BNGL-encoded model and to define bootstrap confidence intervals. The process entails the use of several plain-text files, which are processed by BioNetFit and BioNetGen. In general, these files include (1) one or more EXP files, which each contains (experimental) data to be used in parameter identification/bootstrapping; (2) a BNGL file containing a model section, which defines a (rule-based) model, and an actions section, which defines simulation protocols that generate GDAT and/or SCAN files with model predictions corresponding to the data in the EXP file(s); and (3) a CONF file that configures the fitting/bootstrapping job and that defines algorithmic parameter settings.
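To illustrate the idea of bootstrap confidence intervals mentioned in this abstract, here is a minimal, generic percentile-bootstrap sketch. It is not BioNetFit's implementation: the function name `percentile_bootstrap_ci`, its parameters, and the use of a simple estimator in place of a full model fit are all assumptions made for illustration.

```python
import random

def percentile_bootstrap_ci(data, estimator, n_resamples=1000, alpha=0.05, seed=0):
    """Return a (lower, upper) percentile bootstrap interval for estimator(data).

    Hypothetical helper: in a real fitting workflow, `estimator` would refit
    the model to each resampled dataset and return one parameter estimate.
    """
    rng = random.Random(seed)
    n = len(data)
    # Resample the data with replacement and re-estimate each time.
    estimates = sorted(
        estimator([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    # Take the alpha/2 and 1 - alpha/2 empirical quantiles of the estimates.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]

def mean(values):
    return sum(values) / len(values)

if __name__ == "__main__":
    sample = [1.2, 0.9, 1.1, 1.4, 0.8, 1.0, 1.3, 1.1]
    print(percentile_bootstrap_ci(sample, mean))
```

The interval narrows as the data become more informative about the estimated quantity; with a model fit as the estimator, each resample costs one full optimization run.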
Affiliation(s)
- William S Hlavacek
  - Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Jennifer A Csicsery-Ronay
  - Theoretical Biology and Biophysics Group, Theoretical Division and Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM, USA
- Lewis R Baker
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
  - Department of Applied Mathematics, University of Colorado, Boulder, CO, USA
- María Del Carmen Ramos Álamo
  - Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Alexander Ionkov
  - Theoretical Biology and Biophysics Group, Theoretical Division and Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM, USA
- Eshan D Mitra
  - Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Ryan Suderman
  - Theoretical Biology and Biophysics Group, Theoretical Division and Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM, USA
  - Immunetrics, Inc., Pittsburgh, PA, USA
- Keesha E Erickson
  - Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Raquel Dias
  - Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ, USA
- Joshua Colvin
  - Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ, USA
- Brandon R Thomas
  - Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ, USA
- Richard G Posner
  - Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ, USA
17
Suderman R, Deeds EJ. Intrinsic limits of information transmission in biochemical signalling motifs. Interface Focus 2018; 8:20180039. [PMID: 30443336 DOI: 10.1098/rsfs.2018.0039] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Accepted: 09/05/2018] [Indexed: 12/22/2022] Open
Abstract
All living things have evolved to sense changes in their environment in order to respond in adaptive ways. At the cellular level, these sensing systems generally involve receptor molecules at the cell surface, which detect changes outside the cell and relay those changes to the appropriate response elements downstream. With the advent of experimental technologies that can track signalling at the single-cell level, it has become clear that many signalling systems exhibit significant levels of 'noise,' manifesting as differential responses of otherwise identical cells to the same environment. This noise has a large impact on the capacity of cell signalling networks to transmit information from the environment. Application of information theory to experimental data has found that all systems studied to date encode less than 2.5 bits of information, with the majority transmitting significantly less than 1 bit. Given the growing interest in applying information theory to biological data, it is crucial to understand whether the low values observed to date represent some sort of intrinsic limit on information flow given the inherently stochastic nature of biochemical signalling events. In this work, we used a series of computational models to explore how much information a variety of common 'signalling motifs' can encode. We found that the majority of these motifs, which serve as the basic building blocks of cell signalling networks, can encode far more information (4-6 bits) than has ever been observed experimentally. In addition to providing a consistent framework for estimating information-theoretic quantities from experimental data, our findings suggest that the low levels of information flow observed so far in living systems are not necessarily due to intrinsic limitations. Further experimental work will be needed to understand whether certain cell signalling systems actually can approach the intrinsic limits described here, and to understand the sources and purpose of the variation that reduces information flow in living cells.
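The "bits of information" quoted in this abstract are information-theoretic quantities. As a minimal sketch of how such a number arises, the following computes the mutual information between a discrete signal and a discrete response from their joint probability table. The function name `mutual_information_bits` and the toy tables are assumptions for illustration; the paper's own estimation framework is not reproduced here, and a channel capacity would additionally maximize this quantity over the input distribution.

```python
import math

def mutual_information_bits(joint):
    """Mutual information I(S; R) in bits.

    `joint` is a 2D list where joint[i][j] = p(signal=i, response=j)
    and all entries sum to 1.
    """
    p_signal = [sum(row) for row in joint]          # marginal over responses
    p_response = [sum(col) for col in zip(*joint)]  # marginal over signals
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # terms with p = 0 contribute nothing
                mi += p * math.log2(p / (p_signal[i] * p_response[j]))
    return mi

if __name__ == "__main__":
    # Noise-free binary channel: the response fully identifies the signal.
    print(mutual_information_bits([[0.5, 0.0], [0.0, 0.5]]))
    # Fully noisy channel: the response is independent of the signal.
    print(mutual_information_bits([[0.25, 0.25], [0.25, 0.25]]))
```

A noise-free channel with two equally likely inputs carries 1 bit, so values of 4-6 bits correspond to reliably distinguishing roughly 16 to 64 input levels.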
Affiliation(s)
- Ryan Suderman
  - Center for Computational Biology, University of Kansas, Lawrence, KS 66047, USA
- Eric J Deeds
  - Center for Computational Biology, University of Kansas, Lawrence, KS 66047, USA
  - Department of Molecular Biosciences, University of Kansas, Lawrence, KS 66047, USA
18
Using both qualitative and quantitative data in parameter identification for systems biology models. Nat Commun 2018; 9:3901. [PMID: 30254246 PMCID: PMC6156341 DOI: 10.1038/s41467-018-06439-z] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 04/05/2018] [Accepted: 09/04/2018] [Indexed: 11/28/2022] Open
Abstract
In systems biology, qualitative data are often generated, but rarely used to parameterize models. We demonstrate an approach in which qualitative and quantitative data can be combined for parameter identification. In this approach, qualitative data are converted into inequality constraints imposed on the outputs of the model. These inequalities are used along with quantitative data points to construct a single scalar objective function that accounts for both datasets. To illustrate the approach, we estimate parameters for a simple model describing Raf activation. We then apply the technique to a more elaborate model characterizing cell cycle regulation in yeast. We incorporate both quantitative time courses (561 data points) and qualitative phenotypes of 119 mutant yeast strains (1647 inequalities) to perform automated identification of 153 model parameters. We quantify parameter uncertainty using a profile likelihood approach. Our results indicate the value of combining qualitative and quantitative data to parameterize systems biology models. Much of the data generated in biology is qualitative, but exploiting such data to inform models of biological systems remains a challenge. Here, the authors demonstrate an approach that allows use of both quantitative and qualitative data for parameterising dynamical models.
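The combination of inequality constraints and quantitative data points into one scalar objective can be sketched as follows. This is a minimal illustration that assumes a weighted sum of squared residuals plus a linear static penalty for each violated inequality; the function name `combined_objective`, the penalty form, and the weights are assumptions, not necessarily the formulation used in the paper.

```python
import numpy as np

def combined_objective(y_model, y_data, sigma, constraint_values, penalty_weights):
    """Scalar objective mixing quantitative and qualitative data.

    Quantitative part: sum of squared, sigma-scaled residuals.
    Qualitative part: each inequality is written as g <= 0, where g is a
    function of model outputs; a violated inequality (g > 0) adds
    weight * g to the objective, and a satisfied one adds nothing.
    """
    residuals = (np.asarray(y_model, dtype=float) - np.asarray(y_data, dtype=float))
    residuals = residuals / np.asarray(sigma, dtype=float)
    quantitative = float(np.sum(residuals ** 2))

    g = np.asarray(constraint_values, dtype=float)
    w = np.asarray(penalty_weights, dtype=float)
    qualitative = float(np.sum(w * np.maximum(0.0, g)))
    return quantitative + qualitative

if __name__ == "__main__":
    # Perfect quantitative fit; one satisfied and one violated inequality.
    print(combined_objective([1.0, 2.0], [1.0, 2.0], [1.0, 1.0],
                             [-1.0, 0.5], [2.0, 2.0]))
```

For example, the qualitative observation "mutant output is lower than wild-type output" would be encoded as g = y_mutant - y_wildtype, so the penalty is zero whenever the model reproduces the ordering.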
19
Gupta S, Hainsworth L, Hogg JS, Lee REC, Faeder JR. Evaluation of Parallel Tempering to Accelerate Bayesian Parameter Estimation in Systems Biology. PROCEEDINGS. EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING 2018; 2018:690-697. [PMID: 30175326 DOI: 10.1109/pdp2018.2018.00114] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 11/06/2022]
Abstract
Models of biological systems often have many unknown parameters that must be determined in order for model behavior to match experimental observations. Commonly used methods for parameter estimation that return point estimates of the best-fit parameters are insufficient when models are high dimensional and under-constrained. As a result, Bayesian methods, which treat model parameters as random variables and attempt to estimate their probability distributions given data, have become popular in systems biology. Bayesian parameter estimation often relies on Markov Chain Monte Carlo (MCMC) methods to sample model parameter distributions, but the slow convergence of MCMC sampling can be a major bottleneck. One approach to improving performance is parallel tempering (PT), a physics-based method that uses swapping between multiple Markov chains run in parallel at different temperatures to accelerate sampling. The temperature of a Markov chain determines the probability of accepting an unfavorable move, so swapping with higher-temperature chains enables the sampling chain to escape from local minima. In this work we compared the MCMC performance of PT and the commonly used Metropolis-Hastings (MH) algorithm on six biological models of varying complexity. We found that for simpler models PT accelerated convergence and sampling, and that for more complex models, PT often converged in cases where MH became trapped in non-optimal local minima. We also developed a freely available MATLAB package for Bayesian parameter estimation called PTEMPEST (http://github.com/RuleWorld/ptempest), which is closely integrated with the popular BioNetGen software for rule-based modeling of biological systems.
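The chain-swapping step described in this abstract can be sketched with the standard parallel tempering acceptance rule. This is a generic illustration in Python, not code from the MATLAB package PTEMPEST; the function name `swap_probability` is hypothetical, and "energy" is assumed to mean the negative log posterior of a chain's current parameter set.

```python
import math

def swap_probability(energy_i, energy_j, temp_i, temp_j):
    """Probability of accepting a state swap between two tempered chains.

    Chain i runs at temperature temp_i and currently holds a state with
    energy energy_i (likewise for chain j). The standard rule is
    min(1, exp((1/temp_i - 1/temp_j) * (energy_i - energy_j))).
    """
    beta_i = 1.0 / temp_i
    beta_j = 1.0 / temp_j
    log_ratio = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)

if __name__ == "__main__":
    # The cold chain (T=1) is stuck at a worse state than the hot chain (T=2):
    # the swap is always accepted, handing the better state to the cold chain.
    print(swap_probability(5.0, 3.0, 1.0, 2.0))
    # The cold chain already holds the better state: the swap is accepted
    # only with probability exp(-1).
    print(swap_probability(3.0, 5.0, 1.0, 2.0))
```

Because the rule preserves detailed balance for the joint distribution over all chains, the chain at temperature 1 still samples the target posterior while benefiting from the broader exploration of the hotter chains.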
Affiliation(s)
- Sanjana Gupta
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, USA
- Liam Hainsworth
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, USA
- Justin S Hogg
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, USA
- Robin E C Lee
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, USA
- James R Faeder
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, USA
20
Gyori BM, Bachman JA, Subramanian K, Muhlich JL, Galescu L, Sorger PK. From word models to executable models of signaling networks using automated assembly. Mol Syst Biol 2017; 13:954. [PMID: 29175850 PMCID: PMC5731347 DOI: 10.15252/msb.20177651] [Citation(s) in RCA: 83] [Impact Index Per Article: 11.9] [Indexed: 12/23/2022] Open
Abstract
Word models (natural language descriptions of molecular mechanisms) are a common currency in spoken and written communication in biomedicine but are of limited use in predicting the behavior of complex biological networks. We present an approach to building computational models directly from natural language using automated assembly. Molecular mechanisms described in simple English are read by natural language processing algorithms, converted into an intermediate representation, and assembled into executable or network models. We have implemented this approach in the Integrated Network and Dynamical Reasoning Assembler (INDRA), which draws on existing natural language processing systems as well as pathway information in Pathway Commons and other online resources. We demonstrate the use of INDRA and natural language to model three biological processes of increasing scope: (i) p53 dynamics in response to DNA damage, (ii) adaptive drug resistance in BRAF‐V600E‐mutant melanomas, and (iii) the RAS signaling pathway. The use of natural language makes the task of developing a model more efficient and it increases model transparency, thereby promoting collaboration with the broader biology community.
Affiliation(s)
- Benjamin M Gyori
  - Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA
- John A Bachman
  - Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA
- Kartik Subramanian
  - Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA
- Jeremy L Muhlich
  - Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA
- Lucian Galescu
  - Institute for Human and Machine Cognition, Pensacola, FL, USA
- Peter K Sorger
  - Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA
21
Timescale Separation of Positive and Negative Signaling Creates History-Dependent Responses to IgE Receptor Stimulation. Sci Rep 2017; 7:15586. [PMID: 29138425 PMCID: PMC5686181 DOI: 10.1038/s41598-017-15568-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 02/17/2017] [Accepted: 10/26/2017] [Indexed: 02/02/2023] Open
Abstract
The high-affinity receptor for IgE expressed on the surface of mast cells and basophils interacts with antigens, via bound IgE antibody, and triggers secretion of inflammatory mediators that contribute to allergic reactions. To understand how past inputs (memory) influence future inflammatory responses in mast cells, a microfluidic device was used to precisely control exposure of cells to alternating stimulatory and non-stimulatory inputs. We determined that the response to subsequent stimulation depends on the interval of signaling quiescence. For shorter intervals of signaling quiescence, the second response is blunted relative to the first response, whereas longer intervals of quiescence induce an enhanced second response. Through an iterative process of computational modeling and experimental tests, we found that these memory-like phenomena arise from a confluence of rapid, short-lived positive signals driven by the protein tyrosine kinase Syk; slow, long-lived negative signals driven by the lipid phosphatase Ship1; and slower degradation of Ship1 co-factors. This work advances our understanding of mast cell signaling and represents a generalizable approach for investigating the dynamics of signaling systems.
22
Abstract
Molecular self-assembly is the dominant form of chemical reaction in living systems, yet efforts at systems biology modeling are only beginning to appreciate the need for and challenges to accurate quantitative modeling of self-assembly. Self-assembly reactions are essential to nearly every important process in cell and molecular biology and handling them is thus a necessary step in building comprehensive models of complex cellular systems. They present exceptional challenges, however, to standard methods for simulating complex systems. While the general systems biology world is just beginning to deal with these challenges, there is an extensive literature dealing with them for more specialized self-assembly modeling. This review will examine the challenges of self-assembly modeling, nascent efforts to deal with these challenges in the systems modeling community, and some of the solutions offered in prior work on self-assembly specifically. The review concludes with some consideration of the likely role of self-assembly in the future of complex biological system models more generally.
Affiliation(s)
- Marcus Thomas
  - Computational Biology Department, Carnegie Mellon University, 4400 Fifth Avenue, Pittsburgh, PA 15213, United States of America
  - Joint Carnegie Mellon University/University of Pittsburgh Ph.D. Program in Computational Biology, 4400 Fifth Avenue, Pittsburgh, PA 15213, United States of America
23
Modeling of Receptor Tyrosine Kinase Signaling: Computational and Experimental Protocols. Methods Mol Biol 2017; 1636:417-453. [PMID: 28730495 DOI: 10.1007/978-1-4939-7154-1_27] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 01/28/2023]
Abstract
The advent of systems biology has convincingly demonstrated that the integration of experiments and dynamic modelling is a powerful approach to understanding cellular network biology. Here we present experimental and computational protocols that are necessary for applying this integrative approach to the quantitative studies of receptor tyrosine kinase (RTK) signaling networks. Signaling by RTKs controls multiple cellular processes, including the regulation of cell survival, motility, proliferation, differentiation, glucose metabolism, and apoptosis. We describe methods of model building and training on experimentally obtained quantitative datasets, as well as experimental methods of obtaining quantitative dose-response and temporal dependencies of protein phosphorylation and activities. The presented methods make possible (1) both the fine-grained modeling of complex signaling dynamics and identification of salient, coarse-grained network structures (such as feedback loops) that bring about intricate dynamics, and (2) experimental validation of dynamic models.
24
Mahajan A, Youssef LA, Cleyrat C, Grattan R, Lucero SR, Mattison CP, Erasmus MF, Jacobson B, Tapia L, Hlavacek WS, Schuyler M, Wilson BS. Allergen Valency, Dose, and FcεRI Occupancy Set Thresholds for Secretory Responses to Pen a 1 and Motivate Design of Hypoallergens. THE JOURNAL OF IMMUNOLOGY 2016; 198:1034-1046. [PMID: 28039304 DOI: 10.4049/jimmunol.1601334] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 08/02/2016] [Accepted: 11/30/2016] [Indexed: 11/19/2022]
Abstract
Ag-mediated crosslinking of IgE-FcεRI complexes activates mast cells and basophils, initiating the allergic response. Of 34 recruited donors with self-reported shrimp allergy, only 35% had significant levels of shrimp-specific IgE in serum and measurable basophil secretory responses to rPen a 1 (shrimp tropomyosin). We report that degranulation is linked to the number of FcεRI occupied with allergen-specific IgE, as well as the dose and valency of Pen a 1. Using clustered regularly interspaced short palindromic repeat (CRISPR)-based gene editing, human RBLrαKO cells were created that exclusively express the human FcεRIα subunit. Pen a 1-specific IgE was affinity purified from shrimp-positive plasma. Cells primed with a range of Pen a 1-specific IgE and challenged with Pen a 1 showed a bell-shaped dose response for secretion, with optimal Pen a 1 doses of 0.1-10 ng/ml. Mathematical modeling provided estimates of receptor aggregation kinetics based on FcεRI occupancy with IgE and allergen dose. Maximal degranulation was elicited when ∼2700 IgE-FcεRI complexes were occupied with specific IgE and challenged with Pen a 1 (IgE epitope valency of ≥8), although measurable responses were achieved when only a few hundred FcεRI were occupied. Prolonged periods of pepsin-mediated Pen a 1 proteolysis, which simulates gastric digestion, were required to diminish secretory responses. Recombinant fragments (60-79 aa), which together span the entire length of tropomyosin, were weak secretagogues. These fragments have reduced dimerization capacity, compete with intact Pen a 1 for binding to IgE-FcεRI complexes, and represent a starting point for the design of promising hypoallergens for immunotherapy.
Affiliation(s)
- Avanika Mahajan
  - Department of Pathology, University of New Mexico, Albuquerque, NM 87131
- Lama A Youssef
  - Department of Pharmaceutics and Pharmaceutical Technology, School of Pharmacy, Damascus University, Damascus, Syria; National Commission for Biotechnology, Damascus, Syria
- Cédric Cleyrat
  - Department of Pathology, University of New Mexico, Albuquerque, NM 87131
- Rachel Grattan
  - Department of Pathology, University of New Mexico, Albuquerque, NM 87131
- Shayna R Lucero
  - Department of Pathology, University of New Mexico, Albuquerque, NM 87131
- Christopher P Mattison
  - Southern Regional Research Center, Agricultural Research Service, U.S. Department of Agriculture, New Orleans, LA 70124
- M Frank Erasmus
  - Department of Pathology, University of New Mexico, Albuquerque, NM 87131
- Bruna Jacobson
  - Department of Computer Sciences, University of New Mexico, Albuquerque, NM 87131
- Lydia Tapia
  - Department of Computer Sciences, University of New Mexico, Albuquerque, NM 87131
- William S Hlavacek
  - Theoretical Biology and Biophysics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545; Theoretical Biology and Biophysics Group, Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545
- Mark Schuyler
  - Department of Medicine, University of New Mexico, Albuquerque, NM 87131
- Bridget S Wilson
  - Department of Pathology, University of New Mexico, Albuquerque, NM 87131
|
25
|
Harris LA, Hogg JS, Tapia JJ, Sekar JAP, Gupta S, Korsunsky I, Arora A, Barua D, Sheehan RP, Faeder JR. BioNetGen 2.2: advances in rule-based modeling. Bioinformatics 2016; 32:3366-3368. [PMID: 27402907 DOI: 10.1093/bioinformatics/btw469] [Citation(s) in RCA: 121] [Impact Index Per Article: 15.1] [Received: 07/01/2015] [Accepted: 06/27/2016] [Indexed: 12/18/2022] Open
Abstract
BioNetGen is an open-source software package for rule-based modeling of complex biochemical systems. Version 2.2 of the software introduces numerous new features for both model specification and simulation. Here, we report on these additions, discussing how they facilitate the construction, simulation and analysis of larger and more complex models than previously possible. AVAILABILITY AND IMPLEMENTATION: Stable BioNetGen releases (Linux, Mac OS/X and Windows), with documentation, are available at http://bionetgen.org. Source code is available at http://github.com/RuleWorld/bionetgen. CONTACT: bionetgen.help@gmail.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Leonard A Harris
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Justin S Hogg
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- José-Juan Tapia
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- John A P Sekar
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Sanjana Gupta
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Ilya Korsunsky
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Arshi Arora
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Dipak Barua
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Robert P Sheehan
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- James R Faeder
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
|