1. Gorin G, Vastola JJ, Pachter L. Studying stochastic systems biology of the cell with single-cell genomics data. Cell Syst 2023; 14:822-843.e22. [PMID: 37751736 PMCID: PMC10725240 DOI: 10.1016/j.cels.2023.08.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 05/29/2023] [Revised: 08/16/2023] [Accepted: 08/25/2023] [Indexed: 09/28/2023]
Abstract
Recent experimental developments in genome-wide RNA quantification hold considerable promise for systems biology. However, rigorously probing the biology of living cells requires a unified mathematical framework that accounts for single-molecule biological stochasticity in the context of technical variation associated with genomics assays. We review models for a variety of RNA transcription processes, as well as the encapsulation and library construction steps of microfluidics-based single-cell RNA sequencing, and present a framework to integrate these phenomena by the manipulation of generating functions. Finally, we use simulated scenarios and biological data to illustrate the implications and applications of the approach.
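As a hedged illustration of what "manipulation of generating functions" can look like, consider the textbook case of a Poisson-distributed true RNA count (rate `lam`) where each molecule is captured independently with probability `p`. Composing the probability generating functions gives G_obs(z) = G_bio(1 - p + p z), which for a Poisson law is again Poisson with mean `p * lam`. The Poisson and binomial choices, and the names `lam`, `p`, and `n_max`, are assumptions made for this sketch and are not taken from the paper's specific models.

```python
import math

def poisson_pmf(k, lam):
    """P(N = k) for N ~ Poisson(lam), computed in log space for stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def binomial_pmf(k, n, p):
    """P(K = k) when each of n molecules is captured independently with probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def observed_pmf(k, lam, p, n_max=150):
    """Marginal P(observed = k): sum over the hidden true count n (truncated at n_max)."""
    return sum(poisson_pmf(n, lam) * binomial_pmf(k, n, p) for n in range(k, n_max))

# Illustrative values only: mean of 10 molecules per cell, 20% capture probability.
lam, p = 10.0, 0.2
for k in range(5):
    # The two columns should agree up to truncation and rounding error.
    print(k, observed_pmf(k, lam, p), poisson_pmf(k, p * lam))
```

The brute-force sum and the closed form obtained from the generating-function substitution should agree, which is the kind of shortcut such a framework provides for more elaborate models.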
Affiliation(s)
- Gennady Gorin
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- John J Vastola
  - Department of Neurobiology, Harvard Medical School, Boston, MA 02115, USA
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, USA
2. Gorin G, Vastola JJ, Pachter L. Studying stochastic systems biology of the cell with single-cell genomics data. bioRxiv 2023:2023.05.17.541250. [PMID: 37292934 PMCID: PMC10245677 DOI: 10.1101/2023.05.17.541250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Recent experimental developments in genome-wide RNA quantification hold considerable promise for systems biology. However, rigorously probing the biology of living cells requires a unified mathematical framework that accounts for single-molecule biological stochasticity in the context of technical variation associated with genomics assays. We review models for a variety of RNA transcription processes, as well as the encapsulation and library construction steps of microfluidics-based single-cell RNA sequencing, and present a framework to integrate these phenomena by the manipulation of generating functions. Finally, we use simulated scenarios and biological data to illustrate the implications and applications of the approach.
Affiliation(s)
- Gennady Gorin
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA, 91125
- John J. Vastola
  - Department of Neurobiology, Harvard Medical School, Boston, MA, 02115
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, 91125
3. Modeling of Bioprocesses via MINLP-based Symbolic Regression of S-system Formalisms. Comput Chem Eng 2022. [DOI: 10.1016/j.compchemeng.2022.108108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
4. Gabel M, Hohl T, Imle A, Fackler OT, Graw F. FAMoS: A Flexible and dynamic Algorithm for Model Selection to analyse complex systems dynamics. PLoS Comput Biol 2019; 15:e1007230. [PMID: 31419221 PMCID: PMC6697322 DOI: 10.1371/journal.pcbi.1007230] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 01/27/2019] [Accepted: 06/30/2019] [Indexed: 01/12/2023]
Abstract
Most biological systems are difficult to analyse due to a multitude of interacting components and the concomitant lack of information about the essential dynamics. Finding appropriate models that provide a systematic description of such biological systems and that help to identify their relevant factors and processes can be challenging given the sheer number of possibilities. Model selection algorithms that evaluate the performance of a multitude of different models against experimental data provide a useful tool to identify appropriate model structures. However, many algorithms addressing the analysis of complex dynamical systems, as they are often used in biology, compare a preselected number of models or rely on exhaustive searches of the total model space which might be unfeasible dependent on the number of possibilities. Therefore, we developed an algorithm that is able to perform model selection on complex systems and searches large model spaces in a dynamical way. Our algorithm includes local and newly developed non-local search methods that can prevent the algorithm from ending up in local minima of the model space by accounting for structurally similar processes. We tested and validated the algorithm based on simulated data and showed its flexibility for handling different model structures. We also used the algorithm to analyse experimental data on the cell proliferation dynamics of CD4+ and CD8+ T cells that were cultured under different conditions. Our analyses indicated dynamical changes within the proliferation potential of cells that was reduced within tissue-like 3D ex vivo cultures compared to suspension. Due to the flexibility in handling various model structures, the algorithm is applicable to a large variety of different biological problems and represents a useful tool for the data-oriented evaluation of complex model spaces. 
Identifying the systematic interactions of multiple components within a complex biological system can be challenging due to the number of potential processes and the concomitant lack of information about the essential dynamics. Selection algorithms that allow an automated evaluation of a large number of different models provide a useful tool in identifying the systematic relationships between experimental data. However, many of the existing model selection algorithms are not able to address complex model structures, such as systems of differential equations, and partly rely on local or exhaustive search methods which are inappropriate for the analysis of various biological systems. Therefore, we developed a flexible model selection algorithm that performs a robust and dynamical search of large model spaces to identify complex systems dynamics and applied it to the analysis of T cell proliferation dynamics within different culture conditions. The algorithm, which is available as an R-package, provides an advanced tool for the analysis of complex systems behaviour and, due to its flexible structure, can be applied to a large variety of biological problems.
Affiliation(s)
- Michael Gabel
  - Center for Modelling and Simulation in the Biosciences, BioQuant-Center, Heidelberg University, Heidelberg, Germany
- Tobias Hohl
  - Center for Modelling and Simulation in the Biosciences, BioQuant-Center, Heidelberg University, Heidelberg, Germany
- Andrea Imle
  - Department of Infectious Diseases, Centre for Integrative Infectious Disease Research (CIID), Integrative Virology, University Hospital Heidelberg, Heidelberg, Germany
- Oliver T. Fackler
  - Department of Infectious Diseases, Centre for Integrative Infectious Disease Research (CIID), Integrative Virology, University Hospital Heidelberg, Heidelberg, Germany
- Frederik Graw
  - Center for Modelling and Simulation in the Biosciences, BioQuant-Center, Heidelberg University, Heidelberg, Germany
5. Yuan M, Hong W, Li P. Identification of regulatory variables for state transition of biological networks. Biosystems 2019; 181:71-81. [PMID: 31071365 DOI: 10.1016/j.biosystems.2019.05.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 11/26/2018] [Revised: 03/04/2019] [Accepted: 05/05/2019] [Indexed: 01/02/2023]
Abstract
Attractors represent steady states of biological networks. Recent studies have shown that regulatory variables can be used to steer a network state transition from an undesired attractor, such as a cancerous state, to a desired healthy one. Therefore, it is important to identify the regulatory variables and determine their time-dependent profile for state transition of a given network. However, this is a challenging task since regulatory variables have to be identified among numerous candidates in a large-scale biological network. In this study, we developed a new method for identifying regulatory variables in large-scale biological networks for the purpose of state transition. As a result, a set of optimal regulatory variables can be determined based on formulating and solving a mixed-integer nonlinear dynamic optimization problem. A relaxation scheme is used to overcome the difficulties in solving this complex problem containing a large number of binary variables. The solution to this problem simultaneously identifies the optimal regulatory variables, provides strength of regulatory interactions, and obtains the minimal control time to realize the required state transition. In addition, by adjusting the objective function, various combinations of the strength of regulatory interactions and the transition time can be achieved according to the requirement for disease therapy. Results of three case studies (a myeloid differentiation regulatory network, a cancer gene regulatory network, and a T-LGL signaling network) demonstrate the efficacy of the proposed approach. Therefore, this study establishes an appropriate framework for identifying the regulatory variables for state transition of complex biological networks.
Affiliation(s)
- Meichen Yuan
  - College of Energy Engineering, Zhejiang University, Hangzhou, 310027, China
  - Process Optimization Group, Institute of Automation and Systems Engineering, Technische Universität Ilmenau, Ilmenau, 98684, Germany
- Weirong Hong
  - College of Energy Engineering, Zhejiang University, Hangzhou, 310027, China
- Pu Li
  - Process Optimization Group, Institute of Automation and Systems Engineering, Technische Universität Ilmenau, Ilmenau, 98684, Germany
6. Construction of Boolean logic gates based on dual-vector circuits of multiple gene regulatory elements. Mol Genet Genomics 2019; 294:277-286. [PMID: 30374564 DOI: 10.1007/s00438-018-1502-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/22/2018] [Accepted: 10/11/2018] [Indexed: 12/16/2022]
Abstract
Gene circuits are constructed to run complex logical operations for the precise regulation of biological metabolic processes. At present, the implementation of most genetic circuits is based on the regulatory mechanism of various circuit components, but we hope to realize complex logic gates through biological metabolic pathways of organisms. In this study, we matched the regulatory elements of different functional mechanisms to build a Boolean logic gate model by means of a dual-vector circuit. In Escherichia coli, we made 12 circuit logic gate modules and validated the functions of four of the logic gates, including "AND", "NAND", "OR" and "NOR" by the expression and analysis of a reporter gene. The inputs were converted into outputs by an intermediate product of the host metabolism. The results indicated that these logic gate circuits had the expected efficacy and regulatory characteristics. Our study provides new ideas for designing genetic circuits and precisely controlling metabolic pathways.
7. Fröhlich F, Loos C, Hasenauer J. Scalable Inference of Ordinary Differential Equation Models of Biochemical Processes. Methods Mol Biol 2019; 1883:385-422. [PMID: 30547409 DOI: 10.1007/978-1-4939-8882-2_16] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 12/12/2022]
Abstract
Ordinary differential equation models have become a standard tool for the mechanistic description of biochemical processes. If parameters are inferred from experimental data, such mechanistic models can provide accurate predictions about the behavior of latent variables or the process under new experimental conditions. Complementarily, inference of model structure can be used to identify the most plausible model structure from a set of candidates, and, thus, gain novel biological insight. Several toolboxes can infer model parameters and structure for small- to medium-scale mechanistic models out of the box. However, models for highly multiplexed datasets can require hundreds to thousands of state variables and parameters. For the analysis of such large-scale models, most algorithms require intractably high computation times. This chapter provides an overview of the state-of-the-art methods for parameter and model inference, with an emphasis on scalability.
Affiliation(s)
- Fabian Fröhlich
  - Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
  - Center for Mathematics, Technische Universität München, Garching, Germany
- Carolin Loos
  - Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
  - Center for Mathematics, Technische Universität München, Garching, Germany
- Jan Hasenauer
  - Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
  - Center for Mathematics, Technische Universität München, Garching, Germany
8. Xing L, Guo M, Liu X, Wang C, Zhang L. Gene Regulatory Networks Reconstruction Using the Flooding-Pruning Hill-Climbing Algorithm. Genes (Basel) 2018; 9:E342. [PMID: 29986472 PMCID: PMC6071145 DOI: 10.3390/genes9070342] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/23/2018] [Revised: 06/28/2018] [Accepted: 07/02/2018] [Indexed: 11/17/2022]
Abstract
The explosion of genomic data provides new opportunities to improve the task of gene regulatory network reconstruction. Because of its inherent probability character, the Bayesian network is one of the most promising methods. However, excessive computation time and the requirements of a large number of biological samples reduce its effectiveness and application to gene regulatory network reconstruction. In this paper, Flooding-Pruning Hill-Climbing algorithm (FPHC) is proposed as a novel hybrid method based on Bayesian networks for gene regulatory networks reconstruction. On the basis of our previous work, we propose the concept of DPI Level based on data processing inequality (DPI) to better identify neighbors of each gene on the lack of enough biological samples. Then, we use the search-and-score approach to learn the final network structure in the restricted search space. We first analyze and validate the effectiveness of FPHC in theory. Then, extensive comparison experiments are carried out on known Bayesian networks and biological networks from the DREAM (Dialogue on Reverse Engineering Assessment and Methods) challenge. The results show that the FPHC algorithm, under recommended parameters, outperforms, on average, the original hill climbing and Max-Min Hill-Climbing (MMHC) methods with respect to the network structure and running time. In addition, our results show that FPHC is more suitable for gene regulatory network reconstruction with limited data.
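For readers unfamiliar with the data processing inequality (DPI), the following is a minimal sketch of generic DPI-based edge pruning on a mutual-information graph, in the style popularised by ARACNE-like methods. It is not the FPHC algorithm or its "DPI Level" concept, whose details the abstract does not give; the function name `dpi_prune` and the toy mutual-information values are hypothetical.

```python
from itertools import combinations

def dpi_prune(mi):
    """Drop the weakest edge of every fully connected gene triple.

    mi maps frozenset({gene_a, gene_b}) to a mutual-information estimate.
    By the DPI, if a and c interact only through b, then
    I(a; c) <= min(I(a; b), I(b; c)), so the weakest edge of a triangle
    is treated here as a likely indirect interaction.
    """
    nodes = sorted({gene for pair in mi for gene in pair})
    removed = set()
    for a, b, c in combinations(nodes, 3):
        edges = [frozenset(pair) for pair in ((a, b), (a, c), (b, c))]
        if not all(edge in mi for edge in edges):
            continue  # not a closed triangle, nothing to test
        weakest = min(edges, key=lambda edge: mi[edge])
        others = [edge for edge in edges if edge != weakest]
        # Remove only when the weakest edge is strictly weaker than both others.
        if all(mi[weakest] < mi[edge] for edge in others):
            removed.add(weakest)
    return {edge: value for edge, value in mi.items() if edge not in removed}

# Toy values (hypothetical): A-C is much weaker than A-B and B-C.
toy_mi = {
    frozenset(("A", "B")): 0.9,
    frozenset(("B", "C")): 0.8,
    frozenset(("A", "C")): 0.3,
}
kept = dpi_prune(toy_mi)
print(sorted(tuple(sorted(edge)) for edge in kept))
```

Pruning of this kind restricts the candidate neighbours of each gene before any search-and-score step, which is the general role the abstract assigns to its DPI-based stage.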
Affiliation(s)
- Linlin Xing
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Maozu Guo
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
  - School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China
  - Beijing Key Laboratory of Intelligent Processing for Building Big Data, Beijing 100044, China
- Xiaoyan Liu
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Chunyu Wang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Lei Zhang
  - School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China
9. Xing L, Guo M, Liu X, Wang C, Wang L, Zhang Y. An improved Bayesian network method for reconstructing gene regulatory network based on candidate auto selection. BMC Genomics 2017; 18:844. [PMID: 29219084 PMCID: PMC5773867 DOI: 10.1186/s12864-017-4228-y] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Indexed: 11/10/2022]
Abstract
Background: The reconstruction of gene regulatory networks (GRNs) from gene expression data can reveal regulatory relationships among genes and give deep insight into the complicated regulation mechanisms of life. However, it is still a great challenge in systems biology and bioinformatics. In recent years, numerous computational approaches have been developed for this goal, and Bayesian network (BN) methods have drawn the most attention among them because of their inherent probabilistic characteristics. However, Bayesian network methods are time consuming and cannot handle large-scale networks due to their high computational complexity, while mutual information-based methods are highly efficient but directionless and have a high false-positive rate. Results: To solve these problems, we propose a Candidate Auto Selection algorithm (CAS) based on mutual information and breakpoint detection to restrict the search space in order to accelerate the learning process of the Bayesian network. First, the proposed CAS algorithm automatically selects the neighbor candidates of each node before searching for the best structure of the GRN. Then, based on the CAS algorithm, we propose a globally optimal greedy search method (CAS + G), which focuses on finding the highest rated network structure, and a local learning method (CAS + L), which focuses on learning the structure faster with little loss of quality. Conclusion: Results show that the proposed CAS algorithm can effectively reduce the search space of Bayesian networks by identifying the neighbor candidates of each node. In our experiments, the CAS + G method outperforms the state-of-the-art method on simulation data for inferring GRNs, and the CAS + L method is significantly faster than the state-of-the-art method with little loss of accuracy. Hence, the CAS-based methods effectively decrease the computational complexity of Bayesian networks and are more suitable for GRN inference.
Affiliation(s)
- Linlin Xing
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Maozu Guo
  - School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
- Xiaoyan Liu
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Chunyu Wang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Lei Wang
  - Institute of Health Service and Medical Information, Academy of Military Medical Sciences, Beijing, China
- Yin Zhang
  - Institute of Health Service and Medical Information, Academy of Military Medical Sciences, Beijing, China
10. A parallel metaheuristic for large mixed-integer dynamic optimization problems, with applications in computational biology. PLoS One 2017; 12:e0182186. [PMID: 28813442 PMCID: PMC5557587 DOI: 10.1371/journal.pone.0182186] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 03/24/2017] [Accepted: 07/13/2017] [Indexed: 11/24/2022]
Abstract
Background: We consider a general class of global optimization problems dealing with nonlinear dynamic models. Although this class is relevant to many areas of science and engineering, here we are interested in applying this framework to the reverse engineering problem in computational systems biology, which yields very large mixed-integer dynamic optimization (MIDO) problems. In particular, we consider the framework of logic-based ordinary differential equations (ODEs). Methods: We present saCeSS2, a parallel method for the solution of this class of problems. This method is based on a parallel cooperative scatter search metaheuristic, with new mechanisms of self-adaptation and specific extensions to handle large mixed-integer problems. We have paid special attention to the avoidance of convergence stagnation using adaptive cooperation strategies tailored to this class of problems. Results: We illustrate its performance with a set of three very challenging case studies from the domain of dynamic modelling of cell signaling. The simplest case study considers a synthetic signaling pathway and has 84 continuous and 34 binary decision variables. A second case study considers the dynamic modeling of signaling in liver cancer using high-throughput data, and has 135 continuous and 109 binary decision variables. The third case study is an extremely difficult problem related to breast cancer, involving 690 continuous and 138 binary decision variables. We report computational results obtained in different infrastructures, including a local cluster, a large supercomputer and a public cloud platform. Interestingly, the results show how the cooperation of individual parallel searches modifies the systemic properties of the sequential algorithm, achieving superlinear speedups compared to an individual search (e.g. speedups of 15 with 10 cores), and significantly improving (by more than 60%) the performance with respect to a non-cooperative parallel scheme. The scalability of the method is also good (tests were performed using up to 300 cores). Conclusions: These results demonstrate that saCeSS2 can be used to successfully reverse engineer large dynamic models of complex biological pathways. Further, these results open up new possibilities for other MIDO-based large-scale applications in the life sciences such as metabolic engineering, synthetic biology, and drug scheduling.
11. Eduati F, Doldàn-Martelli V, Klinger B, Cokelaer T, Sieber A, Kogera F, Dorel M, Garnett MJ, Blüthgen N, Saez-Rodriguez J. Drug Resistance Mechanisms in Colorectal Cancer Dissected with Cell Type-Specific Dynamic Logic Models. Cancer Res 2017; 77:3364-3375. [PMID: 28381545 DOI: 10.1158/0008-5472.can-17-0078] [Citation(s) in RCA: 67] [Impact Index Per Article: 9.6] [Received: 01/10/2017] [Revised: 03/17/2017] [Accepted: 03/31/2017] [Indexed: 12/20/2022]
Abstract
Genomic features are used as biomarkers of sensitivity to kinase inhibitors used widely to treat human cancer, but effective patient stratification based on these principles remains limited in impact. Insofar as kinase inhibitors interfere with signaling dynamics, and, in turn, signaling dynamics affects inhibitor responses, we investigated associations in this study between cell-specific dynamic signaling pathways and drug sensitivity. Specifically, we measured 14 phosphoproteins under 43 different perturbed conditions (combinations of 5 stimuli and 7 inhibitors) in 14 colorectal cancer cell lines, building cell line-specific dynamic logic models of underlying signaling networks. Model parameters representing pathway dynamics were used as features to predict sensitivity to a panel of 27 drugs. Specific parameters of signaling dynamics correlated strongly with drug sensitivity for 14 of the drugs, 9 of which had no genomic biomarker. Following one of these associations, we validated a drug combination predicted to overcome resistance to MEK inhibitors by coblockade of GSK3, which was not found based on associations with genomic data. These results suggest that to better understand the cancer resistance and move toward personalized medicine, it is essential to consider signaling network dynamics that cannot be inferred from static genotypes. Cancer Res; 77(12); 3364-75. ©2017 AACR.
Affiliation(s)
- Federica Eduati
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, United Kingdom
- Victoria Doldàn-Martelli
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, United Kingdom
  - Departamento de Física de la Materia Condensada, Condensed Matter Physics Center (IFIMAC) and Instituto Nicolás Cabrera, Facultad de Ciencias, Universidad Autónoma de Madrid, Madrid, Spain
- Bertram Klinger
  - Institute of Pathology, Charité - Universitätsmedizin Berlin, Berlin, Germany
  - Integrative Research Institute (IRI) Life Sciences and Institute for Theoretical Biology, Humboldt-Universität zu Berlin, Berlin, Germany
- Thomas Cokelaer
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, United Kingdom
- Anja Sieber
  - Institute of Pathology, Charité - Universitätsmedizin Berlin, Berlin, Germany
  - Integrative Research Institute (IRI) Life Sciences and Institute for Theoretical Biology, Humboldt-Universität zu Berlin, Berlin, Germany
- Fiona Kogera
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, United Kingdom
- Mathurin Dorel
  - Institute of Pathology, Charité - Universitätsmedizin Berlin, Berlin, Germany
  - Integrative Research Institute (IRI) Life Sciences and Institute for Theoretical Biology, Humboldt-Universität zu Berlin, Berlin, Germany
  - Berlin Institute of Health (BIH), Berlin, Germany
- Mathew J Garnett
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, United Kingdom
- Nils Blüthgen
  - Institute of Pathology, Charité - Universitätsmedizin Berlin, Berlin, Germany
  - Integrative Research Institute (IRI) Life Sciences and Institute for Theoretical Biology, Humboldt-Universität zu Berlin, Berlin, Germany
  - Berlin Institute of Health (BIH), Berlin, Germany
- Julio Saez-Rodriguez
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, United Kingdom
  - Joint Research Centre for Computational Biomedicine (JRC-COMBINE), RWTH Aachen University, Faculty of Medicine, Aachen, Germany
12. Newman RH, Zhang J. Integrated Strategies to Gain a Systems-Level View of Dynamic Signaling Networks. Methods Enzymol 2017; 589:133-170. [PMID: 28336062 DOI: 10.1016/bs.mie.2017.01.016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 02/06/2023]
Abstract
In order to survive and function properly in the face of an ever changing environment, cells must be able to sense changes in their surroundings and respond accordingly. Cells process information about their environment through complex signaling networks composed of many discrete signaling molecules. Individual pathways within these networks are often tightly integrated and highly dynamic, allowing cells to respond to a given stimulus (or, as is typically the case under physiological conditions, a combination of stimuli) in a specific and appropriate manner. However, due to the size and complexity of many cellular signaling networks, it is often difficult to predict how cellular signaling networks will respond under a particular set of conditions. Indeed, crosstalk between individual signaling pathways may lead to responses that are nonintuitive (or even counterintuitive) based on examination of the individual pathways in isolation. Therefore, to gain a more comprehensive view of cell signaling processes, it is important to understand how signaling networks behave at the systems level. This requires integrated strategies that combine quantitative experimental data with computational models. In this chapter, we first examine some of the progress that has recently been made toward understanding the systems-level regulation of cellular signaling networks, with a particular emphasis on phosphorylation-dependent signaling networks. We then discuss how genetically targetable fluorescent biosensors are being used together with computational models to gain unique insights into the spatiotemporal regulation of signaling networks within single, living cells.
Affiliation(s)
- Robert H Newman
  - North Carolina Agricultural and Technical State University, Greensboro, NC, United States
- Jin Zhang
  - University of California, San Diego, San Diego, CA, United States
13. Henriques D, Villaverde AF, Rocha M, Saez-Rodriguez J, Banga JR. Data-driven reverse engineering of signaling pathways using ensembles of dynamic models. PLoS Comput Biol 2017; 13:e1005379. [PMID: 28166222 PMCID: PMC5319798 DOI: 10.1371/journal.pcbi.1005379] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 05/11/2016] [Revised: 02/21/2017] [Accepted: 01/24/2017] [Indexed: 11/19/2022]
Abstract
Despite significant efforts and remarkable progress, the inference of signaling networks from experimental data remains very challenging. The problem is particularly difficult when the objective is to obtain a dynamic model capable of predicting the effect of novel perturbations not considered during model training. The problem is ill-posed due to the nonlinear nature of these systems, the fact that only a fraction of the involved proteins and their post-translational modifications can be measured, and limitations on the technologies used for growing cells in vitro, perturbing them, and measuring their variations. As a consequence, there is a pervasive lack of identifiability. To overcome these issues, we present a methodology called SELDOM (enSEmbLe of Dynamic lOgic-based Models), which builds an ensemble of logic-based dynamic models, trains them to experimental data, and combines their individual simulations into an ensemble prediction. It also includes a model reduction step to prune spurious interactions and mitigate overfitting. SELDOM is a data-driven method, in the sense that it does not require any prior knowledge of the system: the interaction networks that act as scaffolds for the dynamic models are inferred from data using mutual information. We have tested SELDOM on a number of experimental and in silico signal transduction case-studies, including the recent HPN-DREAM breast cancer challenge. We found that its performance is highly competitive compared to state-of-the-art methods for the purpose of recovering network topology. More importantly, the utility of SELDOM goes beyond basic network inference (i.e. uncovering static interaction networks): it builds dynamic (based on ordinary differential equation) models, which can be used for mechanistic interpretations and reliable dynamic predictions in new experimental conditions (i.e. not used in the training). 
For this task, SELDOM's ensemble prediction is not only consistently better than predictions from individual models, but also often outperforms the state of the art represented by the methods used in the HPN-DREAM challenge.
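The abstract does not say how the individual simulations are combined, so the following is only a minimal sketch under the assumption that the ensemble prediction is a pointwise average. The toy models, the `true_signal` curve, and all parameter ranges are hypothetical and are not taken from SELDOM. It illustrates a general property: for squared error, the averaged prediction is never worse than the average error of the individual models.

```python
import math
import random


def true_signal(t):
    """Hypothetical ground-truth time course (illustrative, not from the paper)."""
    return 1.0 - math.exp(-0.5 * t)


def make_model(rate, offset):
    """Build one toy model: same functional form, perturbed parameters."""
    return lambda t: offset + 1.0 - math.exp(-rate * t)


def mse(pred, ref):
    """Mean squared error between two equally long sequences."""
    return sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref)


rng = random.Random(0)
times = [0.5 * i for i in range(21)]
reference = [true_signal(t) for t in times]

# An ensemble of individually imperfect models (assumed parameter spread).
models = [make_model(rng.uniform(0.2, 0.9), rng.gauss(0.0, 0.1)) for _ in range(25)]
simulations = [[m(t) for t in times] for m in models]

# Assumed combination rule: pointwise average over all model simulations.
ensemble = [sum(column) / len(column) for column in zip(*simulations)]

individual_errors = [mse(sim, reference) for sim in simulations]
mean_individual_error = sum(individual_errors) / len(individual_errors)
ensemble_error = mse(ensemble, reference)

print(f"mean individual MSE: {mean_individual_error:.4f}")
print(f"ensemble MSE:        {ensemble_error:.4f}")
```

The inequality between the two printed values follows from the convexity of the squared error. It does not guarantee that the ensemble beats the single best model.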
Affiliation(s)
- David Henriques: Bioprocess Engineering Group, Spanish National Research Council, IIM-CSIC, Vigo, Spain
- Alejandro F. Villaverde: Bioprocess Engineering Group, Spanish National Research Council, IIM-CSIC, Vigo, Spain; Centre of Biological Engineering, University of Minho, Braga, Portugal
- Miguel Rocha: Centre of Biological Engineering, University of Minho, Braga, Portugal
- Julio Saez-Rodriguez: Joint Research Center for Computational Biomedicine, RWTH-Aachen University, Aachen, Germany; European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, United Kingdom
- Julio R. Banga: Bioprocess Engineering Group, Spanish National Research Council, IIM-CSIC, Vigo, Spain
14
Penas DR, González P, Egea JA, Doallo R, Banga JR. Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy. BMC Bioinformatics 2017; 18:52. [PMID: 28109249 PMCID: PMC5251293 DOI: 10.1186/s12859-016-1452-4] [Citation(s) in RCA: 52] [Impact Index Per Article: 7.4] [Received: 02/24/2016] [Accepted: 12/24/2016] [Indexed: 12/02/2022]
Abstract
Background: The development of large-scale kinetic models is one of the current key issues in computational systems biology and bioinformatics. Here we consider the problem of parameter estimation in nonlinear dynamic models. Global optimization methods can be used to solve this type of problem, but the associated computational cost is very large. Moreover, many of these methods need the tuning of a number of adjustable search parameters, requiring a number of initial exploratory runs and therefore further increasing the computation times. Here we present a novel parallel method, self-adaptive cooperative enhanced scatter search (saCeSS), to accelerate the solution of this class of problems. The method is based on the scatter search optimization metaheuristic and incorporates several key new mechanisms: (i) asynchronous cooperation between parallel processes, (ii) coarse and fine-grained parallelism, and (iii) self-tuning strategies.
Results: The performance and robustness of saCeSS are illustrated by solving a set of challenging parameter estimation problems, including medium and large-scale kinetic models of the bacterium E. coli, baker's yeast S. cerevisiae, the vinegar fly D. melanogaster, Chinese Hamster Ovary cells, and a generic signal transduction network. The results consistently show that saCeSS is a robust and efficient method, allowing very significant reduction of computation times with respect to several previous state-of-the-art methods (from days to minutes, in several cases) even when only a small number of processors is used.
Conclusions: The new parallel cooperative method presented here allows the solution of medium and large-scale parameter estimation problems in reasonable computation times and with small hardware requirements. Further, the method includes self-tuning mechanisms which facilitate its use by non-experts. We believe that this new method can play a key role in the development of large-scale and even whole-cell dynamic models.
Electronic supplementary material The online version of this article (doi:10.1186/s12859-016-1452-4) contains supplementary material, which is available to authorized users.
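To make the idea of cooperation between searchers concrete, here is a deliberately simplified sketch. It is not scatter search and not saCeSS: it runs several greedy random searchers with different step sizes in a single process and, after each round, synchronously copies the best solution to the others. The paper's method is asynchronous, parallel, and self-tuning. The `cost` function, the worker settings, and all names are hypothetical.

```python
import random


def cost(params):
    """Hypothetical least-squares-style cost with its minimum at (1.0, -2.0)."""
    a, b = params
    return (a - 1.0) ** 2 + (b + 2.0) ** 2


def local_step(params, current_cost, step, rng):
    """Propose a Gaussian perturbation; keep it only if it lowers the cost."""
    candidate = [p + rng.gauss(0.0, step) for p in params]
    candidate_cost = cost(candidate)
    if candidate_cost < current_cost:
        return candidate, candidate_cost
    return params, current_cost


def cooperative_search(n_workers=4, rounds=20, steps_per_round=50, seed=0):
    rng = random.Random(seed)
    workers = []
    for i in range(n_workers):
        start = [rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0)]
        # Each worker uses a different step size (a stand-in for distinct settings).
        workers.append({"params": start, "cost": cost(start), "step": 0.1 * (i + 1)})

    for _ in range(rounds):
        # Independent search phase (run sequentially here, parallel in practice).
        for w in workers:
            for _ in range(steps_per_round):
                w["params"], w["cost"] = local_step(w["params"], w["cost"], w["step"], rng)
        # Cooperation phase: broadcast the current best solution to every worker.
        best = min(workers, key=lambda w: w["cost"])
        for w in workers:
            if w is not best:
                w["params"], w["cost"] = list(best["params"]), best["cost"]

    best = min(workers, key=lambda w: w["cost"])
    return best["params"], best["cost"]


best_params, best_cost = cooperative_search()
print(best_params, best_cost)
```

Sharing the incumbent lets workers with small step sizes refine a solution that a worker with larger steps found, which is the intuition behind cooperative strategies.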
Affiliation(s)
- David R Penas: BioProcess Engineering Group, IIM-CSIC, Eduardo Cabello 6, Vigo, 36208, Spain
- Patricia González: Computer Architecture Group, Universidade da Coruña, Campus de Elviña s/n, A Coruña, 15071, Spain
- Jose A Egea: Department of Applied Mathematics and Statistics, Universidad Politécnica de Cartagena, c/ Dr. Fleming s/n, Cartagena, 30202, Spain
- Ramón Doallo: Computer Architecture Group, Universidade da Coruña, Campus de Elviña s/n, A Coruña, 15071, Spain
- Julio R Banga: BioProcess Engineering Group, IIM-CSIC, Eduardo Cabello 6, Vigo, 36208, Spain
15
Tanevski J, Todorovski L, Džeroski S. Learning stochastic process-based models of dynamical systems from knowledge and data. BMC SYSTEMS BIOLOGY 2016; 10:30. [PMID: 27005698 PMCID: PMC4802653 DOI: 10.1186/s12918-016-0273-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 12/01/2015] [Accepted: 03/06/2016] [Indexed: 01/02/2023]
Abstract
Background: Identifying a proper model structure, using methods that address both structural and parameter uncertainty, is a crucial problem within the systems approach to biology. And yet, it has a marginal presence in the recent literature. While many existing approaches integrate methods for simulation and parameter estimation of a single model to address parameter uncertainty, only a few of them address structural uncertainty at the same time. The methods for handling structural uncertainty often oversimplify the problem by allowing the human modeler to explicitly enumerate a relatively small number of alternative model structures. On the other hand, process-based modeling methods provide flexible modular formalisms for specifying large classes of plausible model structures, but their scope is limited to deterministic models. Here, we aim at extending the scope of process-based modeling methods to inductively learn stochastic models from knowledge and data.
Results: We combine the flexibility of process-based modeling in terms of addressing structural uncertainty with the benefits of stochastic modeling. The proposed method combines search through the space of plausible model structures, the parsimony principle, and parameter estimation to identify a model with optimal structure and parameters. We illustrate the utility of the proposed method on four stochastic modeling tasks in two domains: gene regulatory networks and epidemiology. Within the first domain, using synthetically generated data, the method successfully recovers the structure and parameters of known regulatory networks from simulations. In the epidemiology domain, the method successfully reconstructs previously established models of epidemic outbreaks from real, sparse and noisy measurement data.
Conclusions: The method represents a unified approach to modeling dynamical systems that allows for flexible formalization of the space of candidate model structures, deterministic and stochastic interpretation of model dynamics, and automated induction of model structure and parameters from data. The method is able to reconstruct models of dynamical systems from synthetic and real data.
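For readers unfamiliar with the stochastic interpretation of model dynamics, the following is a minimal sketch of a standard Gillespie-style stochastic simulation of a birth-death process (constant production, first-order degradation). The abstract does not state which simulation algorithm the authors use, so this choice, the function name, and the rate values are assumptions made purely for illustration.

```python
import random


def gillespie_birth_death(k_prod, k_deg, x0, t_end, rng):
    """Simulate one trajectory of: 0 -> X at rate k_prod, X -> 0 at rate k_deg * x."""
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_prod = k_prod          # propensity of the production reaction
        a_deg = k_deg * x        # propensity of the degradation reaction
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break                # no reaction can fire
        t += rng.expovariate(a_total)  # exponentially distributed waiting time
        if t > t_end:
            break
        # Choose which reaction fires, proportionally to its propensity.
        if rng.random() * a_total < a_prod:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


# Assumed rates; the deterministic reading dx/dt = k_prod - k_deg * x
# has the steady state k_prod / k_deg = 10 for these values.
times, counts = gillespie_birth_death(10.0, 1.0, 0, 20.0, random.Random(1))
print(f"{len(times) - 1} reaction events, final count = {counts[-1]}")
```

The same reaction list can be read deterministically as an ordinary differential equation or stochastically as above, and the stochastic trajectory fluctuates around the deterministic steady state.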
Affiliation(s)
- Jovan Tanevski: Jožef Stefan Institute, Jamova cesta 39, Ljubljana, 1000, Slovenia; Jožef Stefan International Postgraduate School, Jamova cesta 39, Ljubljana, 1000, Slovenia
- Ljupčo Todorovski: University of Ljubljana, Gosarjeva ulica 5, Ljubljana, 1000, Slovenia
- Sašo Džeroski: Jožef Stefan Institute, Jamova cesta 39, Ljubljana, 1000, Slovenia; Jožef Stefan International Postgraduate School, Jamova cesta 39, Ljubljana, 1000, Slovenia