1
Tryptophan Production Maximization in a Fed-Batch Bioreactor with Modified E. coli Cells, by Optimizing Its Operating Policy Based on an Extended Structured Cell Kinetic Model. Bioengineering (Basel) 2021; 8:bioengineering8120210. [PMID: 34940363 PMCID: PMC8698263 DOI: 10.3390/bioengineering8120210] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/18/2021] [Revised: 12/04/2021] [Accepted: 12/06/2021] [Indexed: 11/17/2022]
Abstract
Hybrid kinetic models, which link structured cell metabolic processes to the dynamics of the macroscopic variables of the bioreactor, are increasingly used in engineering evaluations to derive more precise predictions of the process dynamics under variable operating conditions. Depending on the complexity of the cell model, such a mathematical tool can be used to evaluate the metabolic fluxes in relation to the bioreactor operating conditions, thus suggesting ways to genetically modify the microorganism for certain purposes. Even though the development of such an extended dynamic model requires more experimental and computational effort, its use is advantageous. The example examined here is a model simulating the dynamics of nanoscale variables from several pathways of the central carbon metabolism (CCM) of Escherichia coli cells, linked to the macroscopic state variables of a fed-batch bioreactor (FBR) used for tryptophan (TRP) production. The E. coli strain used was modified to replace the PTS system for glucose (GLC) uptake with a more efficient one. The study presents two elements of novelty: (i) the experimentally validated modular model itself, and (ii) its efficiency in computationally deriving an optimal operation policy of the FBR.
2
Maria G. A CCM-based modular and hybrid kinetic model to simulate the tryptophan synthesis in a fed-batch bioreactor using modified E. coli cells. Comput Chem Eng 2021. [DOI: 10.1016/j.compchemeng.2021.107450] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
3
Maria G. In silico Determination of Some Conditions Leading to Glycolytic Oscillations and Their Interference With Some Other Processes in E. coli Cells. Front Chem 2020; 8:526679. [PMID: 33195042 PMCID: PMC7655968 DOI: 10.3389/fchem.2020.526679] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/14/2020] [Accepted: 09/23/2020] [Indexed: 01/05/2023]
Abstract
Autonomous oscillations of species levels in glycolysis express the self-control of this essential cellular pathway belonging to the central carbon metabolism (CCM), and this phenomenon takes place in a large number of bacteria. Oscillations of glycolytic intermediates in living cells occur according to the environmental conditions and to the cell characteristics, especially the adenosine triphosphate (ATP) recovery system. Determining the conditions that lead to the occurrence and maintenance of the glycolytic oscillations can have immediate practical applications. Such a model-based analysis allows the in silico design of genetically modified microorganisms (GMO) with certain characteristics of interest for the biosynthesis industry, medicine, etc. Based on our kinetic model validated in previous works, this paper aims to identify in silico the operating parameters and cell factors leading to the occurrence of stable glycolytic oscillations in Escherichia coli cells. Since most of the glycolytic intermediates are involved in various cellular metabolic pathways belonging to the CCM, evaluating the dynamics and average levels of these intermediates is of high importance for further applied analyses. As an example, by using a lumped kinetic model for tryptophan (TRP) synthesis from the literature, and our own kinetic model for the oscillatory glycolysis, this paper highlights the influence of glycolytic oscillations on the oscillatory TRP synthesis through the PEP (phosphoenolpyruvate) glycolytic node shared by the two oscillatory processes. The numerical analysis allows further TRP production maximization in a fed-batch bioreactor (FBR).
Affiliation(s)
- Gheorghe Maria
- Department of Chemical and Biochemical Engineering, University POLITEHNICA of Bucharest, Bucharest, Romania
- Chemical Sciences Section, Romanian Academy, Bucharest, Romania
4
Zhao Q, Zhang Y. Ensemble Method of Feature Selection and Reverse Construction of Gene Logical Network Based on Information Entropy. Int J Pattern Recogn 2019. [DOI: 10.1142/s0218001420590041] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/18/2022]
Abstract
In this paper, we propose a novel ensemble gene selection method to obtain a gene subset. We then provide a method for the reverse construction of a gene network from the expression profile data of that gene subset. The uncertainty coefficient, based on information entropy, is used to define the existence of logical relations among these genes. If the uncertainty coefficient between some genes exceeds predefined thresholds, the gene nodes are connected by directed edges. Thus, a gene network is generated, which we define as a gene logical network. This method is applied to breast cancer data including a control group and an experimental group, with comparisons of the 2nd-order logic type distribution, the average degree, and the average path length of the networks. It is found that the structures of the different networks are quite distinct. By comparing the degree difference between the control group and the experimental group, the key genes are picked out. By defining the dynamic evolution rules of state transition based on the logical regulation among the key genes in the network, the dynamic behaviors of normal breast cells and of cancer cells at different stages are simulated numerically. According to the literature, some of the key genes are highly related to the development of breast cancer. The study may provide useful insight into the biological mechanism of the formation and development of cancer.
Affiliation(s)
- Qingfeng Zhao
- College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, Shandong 266590, P. R. China
- Shandong Province Key Laboratory of Wisdom Mine Information Technology, Shandong University of Science and Technology, Qingdao 266590, P. R. China
- Yulin Zhang
- College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao, Shandong 266590, P. R. China
5
In silico optimization of a bioreactor with an E. coli culture for tryptophan production by using a structured model coupling the oscillating glycolysis and tryptophan synthesis. Chem Eng Res Des 2018. [DOI: 10.1016/j.cherd.2018.05.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022]
6
Maria G, Gijiu CL, Maria C, Tociu C. Interference of the oscillating glycolysis with the oscillating tryptophan synthesis in the E. coli cells. Comput Chem Eng 2018. [DOI: 10.1016/j.compchemeng.2017.10.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/01/2022]
7
De Souza Jacomini R, Martins DC, Da Silva FL, Costa AHR. GeNICE: A Novel Framework for Gene Network Inference by Clustering, Exhaustive Search, and Multivariate Analysis. J Comput Biol 2017. [PMID: 28636461 DOI: 10.1089/cmb.2017.0022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/25/2023]
Abstract
Gene network (GN) inference from temporal gene expression data is a crucial and challenging problem in systems biology. Expression data sets usually consist of dozens of temporal samples, while networks consist of thousands of genes, thus rendering many inference methods unfeasible in practice. To improve the scalability of GN inference methods, we propose a novel framework called GeNICE, based on probabilistic GNs; the main novelty is the introduction of a clustering procedure to group genes with related expression profiles and to provide an approximate solution with reduced computational complexity. We use the defined clusters to perform an exhaustive search to retrieve the best predictor gene subsets for each target gene, according to multivariate criterion functions. GeNICE greatly reduces the search space because predictor candidates are restricted to one gene per cluster. Finally, a multivariate analysis is performed for each defined predictor subset to retrieve minimal subsets and to simplify the network. In our experiments with in silico generated data sets, GeNICE achieved substantial computational time reduction when compared to solutions without the clustering step, while preserving the gene expression prediction accuracy even when the number of clusters is small (about 50) relative to the number of genes (order of thousands). For a Plasmodium falciparum microarray data set, the prediction accuracy achieved by GeNICE was roughly 97%, while the respective topologies involving glycolytic and apicoplast seed genes had a very large intramodularity, very small interconnection between modules, and some module hub genes, reflecting small-world and scale-free topological properties, as expected.
8
Rubiolo M, Milone DH, Stegmayer G. Mining Gene Regulatory Networks by Neural Modeling of Expression Time-Series. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2015; 12:1365-1373. [PMID: 26671808 DOI: 10.1109/tcbb.2015.2420551] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 06/05/2023]
Abstract
Discovering gene regulatory networks from data has been one of the most studied topics in recent years. Neural networks can be successfully used to infer an underlying gene network by modeling expression profiles as time series. This work proposes a novel method based on a pool of neural networks for obtaining a gene regulatory network from a gene expression dataset. The networks are used to model each possible interaction between pairs of genes in the dataset, and a set of mining rules is applied to accurately detect the underlying relations among genes. The results obtained on artificial and real datasets confirm the method's effectiveness for discovering regulatory networks through proper modeling of the temporal dynamics of gene expression profiles.
9
Yin W, Garimalla S, Moreno A, Galinski MR, Styczynski MP. A tree-like Bayesian structure learning algorithm for small-sample datasets from complex biological model systems. BMC Systems Biology 2015; 9:49. [PMID: 26310492 PMCID: PMC4551520 DOI: 10.1186/s12918-015-0194-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 04/30/2015] [Accepted: 08/06/2015] [Indexed: 11/10/2022]
Abstract
Background There are increasing efforts to bring high-throughput systems biology techniques to bear on complex animal model systems, often with a goal of learning about underlying regulatory network structures (e.g., gene regulatory networks). However, complex animal model systems typically have significant limitations on cohort sizes, number of samples, and the ability to perform follow-up and validation experiments. These constraints are particularly problematic for many current network learning approaches, which require large numbers of samples and may predict many more regulatory relationships than actually exist. Results Here, we test the idea that by leveraging the accuracy and efficiency of classifiers, we can construct high-quality networks that capture important interactions between variables in datasets with few samples. We start from a previously-developed tree-like Bayesian classifier and generalize its network learning approach to allow for arbitrary depth and complexity of tree-like networks. Using four diverse sample networks, we demonstrate that this approach performs consistently better at low sample sizes than the Sparse Candidate Algorithm, a representative approach for comparison because it is known to generate Bayesian networks with high positive predictive value. We develop and demonstrate a resampling-based approach to enable the identification of a viable root for the learned tree-like network, important for cases where the root of a network is not known a priori. We also develop and demonstrate an integrated resampling-based approach to the reduction of variable space for the learning of the network. Finally, we demonstrate the utility of this approach via the analysis of a transcriptional dataset of a malaria challenge in a non-human primate model system, Macaca mulatta, suggesting the potential to capture indicators of the earliest stages of cellular differentiation during leukopoiesis. 
Conclusions We demonstrate that by starting from effective and efficient approaches for creating classifiers, we can identify interesting tree-like network structures with significant ability to capture the relationships in the training data. This approach represents a promising strategy for inferring networks with high positive predictive value under the constraint of small numbers of samples, meeting a need that will only continue to grow as more high-throughput studies are applied to complex model systems.
Affiliation(s)
- Weiwei Yin
- Key Laboratory for Biomedical Engineering of Education Ministry, Department of Biomedical Engineering, Zhejiang University, Hangzhou, P. R. China.
- School of Chemical & Biomolecular Engineering, Georgia Institute of Technology, 311 Ferst Drive NW, Atlanta, GA, 30332-0100, USA.
- Swetha Garimalla
- School of Biology, Georgia Institute of Technology, Atlanta, GA, USA.
- Alberto Moreno
- Division of Infectious Diseases, Emory Vaccine Center, Yerkes National Primate Research Center, Emory University School of Medicine, Emory University, Atlanta, GA, USA.
- Mary R Galinski
- Division of Infectious Diseases, Emory Vaccine Center, Yerkes National Primate Research Center, Emory University School of Medicine, Emory University, Atlanta, GA, USA.
- Mark P Styczynski
- School of Chemical & Biomolecular Engineering, Georgia Institute of Technology, 311 Ferst Drive NW, Atlanta, GA, 30332-0100, USA.
10
Borelli FF, de Camargo RY, Martins DC, Rozante LCS. Gene regulatory networks inference using a multi-GPU exhaustive search algorithm. BMC Bioinformatics 2013; 14 Suppl 18:S5. [PMID: 24564268 PMCID: PMC3817808 DOI: 10.1186/1471-2105-14-s18-s5] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 01/02/2023]
Abstract
BACKGROUND Gene regulatory networks (GRN) inference is an important bioinformatics problem in which the gene interactions need to be deduced from gene expression data, such as microarray data. Feature selection methods can be applied to this problem. A feature selection technique is composed of two parts: a search algorithm and a criterion function. Among the search algorithms already proposed, there is the exhaustive search, which returns the best feature subset, although its computational cost makes it infeasible in almost all situations. The objective of this work is the development of a low-cost parallel solution based on GPU architectures for exhaustive search with a viable cost-benefit. We use CUDA™, a general purpose parallel programming platform that allows the usage of NVIDIA® GPUs to solve complex problems in an efficient way. RESULTS We developed a parallel algorithm for GRN inference based on multiple GPU cards and obtained encouraging speedups (order of hundreds), when assuming that each target gene has two multivariate predictors. Also, experiments using single and multiple GPUs were performed, indicating that the speedup grows almost linearly with the number of GPUs. CONCLUSION In this work, we present a proof of principle, showing that it is possible to parallelize the exhaustive search algorithm in GPUs with encouraging results. Although our focus in this paper is on the GRN inference problem, the exhaustive search technique based on GPU developed here can be applied (with minor adaptations) to other combinatorial problems.
Affiliation(s)
- Fabrizio F Borelli
- Center for Mathematics, Computing and Cognition, Federal University of ABC, Av. do Estados, 5001, Santo André -SP, Brazil
- Raphael Y de Camargo
- Center for Mathematics, Computing and Cognition, Federal University of ABC, Av. do Estados, 5001, Santo André -SP, Brazil
- David C Martins
- Center for Mathematics, Computing and Cognition, Federal University of ABC, Av. do Estados, 5001, Santo André -SP, Brazil
- Luiz CS Rozante
- Center for Mathematics, Computing and Cognition, Federal University of ABC, Av. do Estados, 5001, Santo André -SP, Brazil
11
Maria G, Luta I. Structured cell simulator coupled with a fluidized bed bioreactor model to predict the adaptive mercury uptake by E. coli cells. Comput Chem Eng 2013. [DOI: 10.1016/j.compchemeng.2013.06.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/17/2022]
12
Trairatphisan P, Mizera A, Pang J, Tantar AA, Schneider J, Sauter T. Recent development and biomedical applications of probabilistic Boolean networks. Cell Commun Signal 2013; 11:46. [PMID: 23815817 PMCID: PMC3726340 DOI: 10.1186/1478-811x-11-46] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.2] [Received: 03/29/2013] [Accepted: 06/22/2013] [Indexed: 12/13/2022]
Abstract
Probabilistic Boolean network (PBN) modelling is a semi-quantitative approach widely used for the study of the topology and dynamic aspects of biological systems. The combined use of rule-based representation and probability makes PBN appealing for large-scale modelling of biological networks where degrees of uncertainty need to be considered. A considerable expansion of our knowledge in the field of theoretical research on PBN can be observed over the past few years, with a focus on network inference, network intervention and control. With respect to areas of applications, PBN is mainly used for the study of gene regulatory networks though with an increasing emergence in signal transduction, metabolic, and also physiological networks. At the same time, a number of computational tools, facilitating the modelling and analysis of PBNs, are continuously developed. A concise yet comprehensive review of the state-of-the-art on PBN modelling is offered in this article, including a comparative discussion on PBN versus similar models with respect to concepts and biomedical applications. Due to their many advantages, we consider PBN to stand as a suitable modelling framework for the description and analysis of complex biological systems, ranging from molecular to physiological levels.
Affiliation(s)
- Andrzej Mizera
- Computer Science and Communications Research Unit, University of Luxembourg, Luxembourg
- Jun Pang
- Computer Science and Communications Research Unit, University of Luxembourg, Luxembourg
- Alexandru Adrian Tantar
- Computer Science and Communications Research Unit, University of Luxembourg, Luxembourg
- Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg, Luxembourg
- Jochen Schneider
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Luxembourg
- Saarland University Medical Center, Department of Internal Medicine II, Homburg, Saarland, Germany
- Thomas Sauter
- Life Sciences Research Unit, University of Luxembourg, Luxembourg
13
14
Schmidt MD, Vallabhajosyula RR, Jenkins JW, Hood JE, Soni AS, Wikswo JP, Lipson H. Automated refinement and inference of analytical models for metabolic networks. Phys Biol 2011; 8:055011. [PMID: 21832805 DOI: 10.1088/1478-3975/8/5/055011] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Indexed: 01/25/2023]
Abstract
The reverse engineering of metabolic networks from experimental data is traditionally a labor-intensive task requiring a priori systems knowledge. Using a proven model as a test system, we demonstrate an automated method to simplify this process by modifying an existing or related model--suggesting nonlinear terms and structural modifications--or even constructing a new model that agrees with the system's time series observations. In certain cases, this method can identify the full dynamical model from scratch without prior knowledge or structural assumptions. The algorithm selects between multiple candidate models by designing experiments to make their predictions disagree. We performed computational experiments to analyze a nonlinear seven-dimensional model of yeast glycolytic oscillations. This approach corrected mistakes reliably in both approximated and overspecified models. The method performed well to high levels of noise for most states, could identify the correct model de novo, and make better predictions than ordinary parametric regression and neural network models. We identified an invariant quantity in the model, which accurately derived kinetics and the numerical sensitivity coefficients of the system. Finally, we compared the system to dynamic flux estimation and discussed the scaling and application of this methodology to automated experiment design and control in biological systems in real time.
Affiliation(s)
- Michael D Schmidt
- Cornell Computational Systems Laboratory, Cornell University, Ithaca, NY, USA
15
Lopes FM, Cesar RM, Costa LDF. Gene expression complex networks: synthesis, identification, and analysis. J Comput Biol 2011; 18:1353-67. [PMID: 21548810 DOI: 10.1089/cmb.2010.0118] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Indexed: 01/25/2023]
Abstract
Thanks to recent advances in molecular biology, allied to an ever increasing amount of experimental data, the functional state of thousands of genes can now be extracted simultaneously by using methods such as cDNA microarrays and RNA-Seq. Particularly important related investigations are the modeling and identification of gene regulatory networks from expression data sets. Such knowledge is fundamental for many applications, such as disease treatment, therapeutic intervention strategies and drug design, as well as for planning new high-throughput experiments. Methods have been developed for gene network modeling and identification from expression profiles. However, an important open problem is how to validate such approaches and their results. This work presents an objective approach for validation of gene network modeling and identification which comprises the following three main aspects: (1) Artificial Gene Networks (AGNs) model generation through theoretical models of complex networks, which is used to simulate temporal expression data; (2) a computational method for gene network identification from the simulated data, which is founded on a feature selection approach where a target gene is fixed and the expression profile is observed for all other genes in order to identify a relevant subset of predictors; and (3) validation of the identified AGN-based network through comparison with the original network. The proposed framework allows several types of AGNs to be generated and used in order to simulate temporal expression data. The results of the network identification method can then be compared to the original network in order to estimate its properties and accuracy. Some of the most important theoretical models of complex networks have been assessed: the uniformly-random Erdős-Rényi (ER), the small-world Watts-Strogatz (WS), the scale-free Barabási-Albert (BA), and geographical networks (GG).
The experimental results indicate that the inference method was sensitive to variation in the average degree <k>, with its network recovery rate decreasing as <k> increased. The signal size was important for the inference method to achieve better accuracy in the network identification rate, presenting very good results with small expression profiles. However, the adopted inference method was not sensitive enough to recognize distinct structures of interaction among genes, presenting a similar behavior when applied to different network topologies. In summary, the proposed framework, though simple, was adequate for the validation of the inferred networks by identifying some properties of the evaluated method, and it can be extended to other inference methods.
Affiliation(s)
- Fabrício M Lopes
- Federal University of Technology-Paraná and Institute of Mathematics and Statistics, University of São Paulo, Brazil.
16
Gallo CA, Carballido JA, Ponzoni I. Discovering time-lagged rules from microarray data using gene profile classifiers. BMC Bioinformatics 2011; 12:123. [PMID: 21524308 PMCID: PMC3111372 DOI: 10.1186/1471-2105-12-123] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 10/08/2010] [Accepted: 04/27/2011] [Indexed: 02/05/2023]
Abstract
BACKGROUND Gene regulatory networks have an essential role in every process of life. In this regard, the amount of genome-wide time series data is becoming increasingly available, providing the opportunity to discover the time-delayed gene regulatory networks that govern the majority of these molecular processes. RESULTS This paper aims at reconstructing gene regulatory networks from multiple genome-wide microarray time series datasets. In this sense, a new model-free algorithm called GRNCOP2 (Gene Regulatory Network inference by Combinatorial OPtimization 2), which is a significant evolution of the GRNCOP algorithm, was developed using combinatorial optimization of gene profile classifiers. The method is capable of inferring potential time-delay relationships with any span of time between genes from various time series datasets given as input. The proposed algorithm was applied to time series data composed of twenty yeast genes that are highly relevant for the cell-cycle study, and the results were compared against several related approaches. The outcomes have shown that GRNCOP2 outperforms the contrasted methods in terms of the proposed metrics, and that the results are consistent with previous biological knowledge. Additionally, a genome-wide study on multiple publicly available time series data was performed. In this case, the experimentation has exhibited the soundness and scalability of the new method which inferred highly-related statistically-significant gene associations. CONCLUSIONS A novel method for inferring time-delayed gene regulatory networks from genome-wide time series datasets is proposed in this paper. The method was carefully validated with several publicly available data sets. The results have demonstrated that the algorithm constitutes a usable model-free approach capable of predicting meaningful relationships between genes, revealing the time-trends of gene regulation.
Affiliation(s)
- Cristian A Gallo
- Laboratorio de Investigación y Desarrollo en Computación Científica (LIDeCC), Departamento de Ciencias e Ingeniería de la Computación, Universidad Nacional del Sur, Av. Alem 1253, 8000, Bahía Blanca, Argentina
17
Street NR, Jansson S, Hvidsten TR. A systems biology model of the regulatory network in Populus leaves reveals interacting regulators and conserved regulation. BMC Plant Biology 2011; 11:13. [PMID: 21232107 PMCID: PMC3030533 DOI: 10.1186/1471-2229-11-13] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 10/08/2010] [Accepted: 01/13/2011] [Indexed: 05/23/2023]
Abstract
BACKGROUND Green plant leaves have always fascinated biologists as hosts for photosynthesis and providers of basic energy to many food webs. Today, comprehensive databases of gene expression data enable us to apply increasingly advanced computational methods for reverse-engineering the regulatory network of leaves, and to begin to understand the gene interactions underlying complex emergent properties related to stress-response and development. These new systems biology methods are now also being applied to organisms such as Populus, a woody perennial tree, in order to understand the specific characteristics of these species. RESULTS We present a systems biology model of the regulatory network of Populus leaves. The network is reverse-engineered from promoter information and expression profiles of leaf-specific genes measured over a large set of conditions related to stress and development. The network model incorporates interactions between regulators, such as synergistic and competitive relationships, by evaluating increasingly complex regulatory mechanisms, and is therefore able to identify new regulators of leaf development not found by traditional genomics methods based on pair-wise expression similarity. The approach is shown to explain available gene function information and to provide robust prediction of expression levels in new data. We also use the predictive capability of the model to identify condition-specific regulation as well as conserved regulation between Populus and Arabidopsis. CONCLUSIONS We outline a computationally inferred model of the regulatory network of Populus leaves, and show how treating genes as interacting, rather than individual, entities identifies new regulators compared to traditional genomics analysis.
Although systems biology models should be used with care considering the complexity of regulatory programs and the limitations of current genomics data, methods describing interactions can provide hypotheses about the underlying cause of emergent properties and are needed if we are to identify target genes other than those constituting the "low hanging fruit" of genomic analysis.
Affiliation(s)
- Nathaniel Robert Street
- Umeå Plant Science Centre, Department of Plant Physiology, Umeå University, 901 87 Umeå, Sweden
- Stefan Jansson
- Umeå Plant Science Centre, Department of Plant Physiology, Umeå University, 901 87 Umeå, Sweden
- Torgeir R Hvidsten
- Umeå Plant Science Centre, Department of Plant Physiology, Umeå University, 901 87 Umeå, Sweden
- Computational Life Science Cluster (CLiC), Umeå University, 901 87 Umeå, Sweden
18
|
Zhang L, Xiao M, Wang Y, Zhang W. Reverse engineering large-scale genetic networks: synthetic versus real data. J Genet 2010; 89:73-80. [PMID: 20505249 DOI: 10.1007/s12041-010-0013-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 01/22/2023]
Abstract
Development of microarray technology has resulted in an exponential rise in gene expression data. Linear computational methods are of great assistance in identifying molecular interactions and elucidating the functional properties of gene networks. They overcome the weaknesses of in vivo experiments, including high cost, large noise, and poor repeatability. In this paper, we propose an easily applied system, Stepwise Network Inference (SWNI), which integrates a deterministic linear model with statistical analysis and has been tested effectively on both simulated experiments and real gene expression data sets. The study illustrates that connections of gene networks can be detected via SWNI with high confidence when single gene perturbation experiments are performed in compliance with the algorithm requirements. In particular, our algorithm is efficient and outperforms the existing ones presented in this paper when dealing with large-scale sparse networks without any prior knowledge.
Affiliation(s)
- Luwen Zhang
- School of Computer Engineering and Science, Shanghai University, 149 Yanchang Road, Zhabei District, Shanghai 200072, People's Republic of China
|
19
|
Plato's cave algorithm: inferring functional signaling networks from early gene expression shadows. PLoS Comput Biol 2010; 6:e1000828. [PMID: 20585619 PMCID: PMC2891706 DOI: 10.1371/journal.pcbi.1000828] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 10/07/2009] [Accepted: 05/21/2010] [Indexed: 11/19/2022] Open
Abstract
Improving the ability to reverse engineer biochemical networks is a major goal of systems biology. Lesions in signaling networks lead to alterations in gene expression, which in principle should allow network reconstruction. However, the information about the activity levels of signaling proteins conveyed in overall gene expression is limited by the complexity of gene expression dynamics and of regulatory network topology. Two observations provide the basis for overcoming this limitation: (a) genes induced without de novo protein synthesis (early genes) show a linear accumulation of product in the first hour after the change in the cell's state; (b) the signaling components in the network largely function in the linear range of their stimulus-response curves. Therefore, unlike most genes or most time points, expression profiles of early genes at an early time point provide direct biochemical assays that represent the activity levels of upstream signaling components. Such expression data provide the basis for an efficient algorithm (Plato's Cave algorithm; PLACA) to reverse engineer functional signaling networks. Unlike conventional reverse engineering algorithms that use steady state values, PLACA uses stimulated early gene expression measurements associated with systematic perturbations of signaling components, without measuring the signaling components themselves. Besides the reverse engineered network, PLACA also identifies the genes detecting the functional interaction, thereby facilitating validation of the predicted functional network. Using simulated datasets, the algorithm is shown to be robust to experimental noise. Using experimental data obtained from gonadotropes, PLACA reverse engineered the interaction network of six perturbed signaling components. The network recapitulated many known interactions and identified novel functional interactions that were validated by further experiment. 
PLACA uses the results of experiments that are feasible for any signaling network to predict the functional topology of the network and to identify novel relationships.
|
20
|
Maria G. Lumped dynamic model for a bistable genetic regulatory circuit within a variable-volume whole-cell modelling framework. ASIA-PAC J CHEM ENG 2009. [DOI: 10.1002/apj.297] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/11/2022]
|
21
|
Lee WP, Yang KC. Applying intelligent computing techniques to modeling biological networks from expression data. GENOMICS PROTEOMICS & BIOINFORMATICS 2008; 6:111-20. [PMID: 18973867 PMCID: PMC5054112 DOI: 10.1016/s1672-0229(08)60026-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/30/2022]
Abstract
Constructing biological networks is one of the most important issues in systems biology. However, constructing a network from data manually takes a considerable amount of time; therefore, an automated procedure is advocated. To automate the procedure of network construction, in this work we use two intelligent computing techniques, genetic programming and neural computation, to infer two kinds of network models that use continuous variables. To verify the presented approaches, experiments have been conducted, and the preliminary results show that both approaches can be used to infer networks successfully.
Affiliation(s)
- Wei-Po Lee
- Department of Information Management, National Sun Yat-sen University, Kaohsiung, Chinese Taipei.
|
22
|
Schilstra MJ, Nehaniv CL. Bio-logic: gene expression and the laws of combinatorial logic. ARTIFICIAL LIFE 2008; 14:121-133. [PMID: 18171135 DOI: 10.1162/artl.2008.14.1.121] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 05/25/2023]
Abstract
At the heart of the development of fertilized eggs into fully formed organisms and the adaptation of cells to changed conditions are genetic regulatory networks (GRNs). In higher multicellular organisms, signal selection and multiplexing are performed at the cis-regulatory domains of genes, where combinations of transcription factors (TFs) regulate the rates at which the genes are transcribed into mRNA. To be able to act as activators or repressors of gene transcription, TFs must first bind to target sequences on the regulatory domains. Two TFs that act in concert may bind entirely independently of each other, but more often binding of the first one will alter the affinity of the other for its binding site. This article presents a systematic investigation into the effect of TF binding dependences on the predicted regulatory function of this bio-logic. Four extreme scenarios, commonly used to classify enzyme activation and inhibition patterns, for the binding of two TFs were explored: independent (the TFs bind without affecting each other's affinities), competitive (the TFs compete for the same binding site), ordered (the TFs bind in a compulsory order), and joint binding (the TFs either bind as a preformed complex, or binding of one is virtually impossible in the absence of the other). The conclusions are: (1) the laws of combinatorial logic hold only for systems with independently binding TFs; (2) systems formed according to the other scenarios can mimic the functions of their Boolean logical counterparts, but cannot be combined or decomposed in the same way; and (3) the continuously scaled output of systems consisting of competitively binding activators and repressors can be controlled more robustly than that of single TF or (quasi-)logical multi-TF systems.
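The difference between the independent and competitive binding scenarios described in this abstract can be illustrated with a minimal equilibrium-occupancy sketch. This is not the article's model: the function names, the single-site dissociation constants `k_a` and `k_b`, and the assumption of simple mass-action equilibrium are illustrative choices made here.

```python
def occupancy_independent(a, b, k_a, k_b):
    """State probabilities when TFs A and B bind separate sites independently.

    a, b     : free TF concentrations (hypothetical units)
    k_a, k_b : dissociation constants of each TF for its own site
    """
    p_a = a / (a + k_a)  # probability that A's site is occupied
    p_b = b / (b + k_b)  # probability that B's site is occupied
    return {
        "none": (1 - p_a) * (1 - p_b),
        "A": p_a * (1 - p_b),
        "B": (1 - p_a) * p_b,
        "AB": p_a * p_b,  # joint state factorises only in this scenario
    }


def occupancy_competitive(a, b, k_a, k_b):
    """State probabilities when A and B compete for one shared site."""
    w_a = a / k_a          # statistical weight of the A-bound state
    w_b = b / k_b          # statistical weight of the B-bound state
    z = 1.0 + w_a + w_b    # partition sum over empty, A-bound, B-bound
    return {"none": 1.0 / z, "A": w_a / z, "B": w_b / z, "AB": 0.0}


if __name__ == "__main__":
    # Both TFs present at their dissociation constants.
    print(occupancy_independent(1.0, 1.0, 1.0, 1.0))
    print(occupancy_competitive(1.0, 1.0, 1.0, 1.0))
```

In the independent case the doubly bound state has probability `p_a * p_b`, which is what allows AND-like behaviour to be composed from single-TF responses; in the competitive case that state does not exist, so the same composition is not available.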
Affiliation(s)
- Maria J Schilstra
- Biological and Neural Computation Group, Science and Technology Research Institute, University of Hertfordshire, College Lane, Hatfield, Hertfordshire AL10 9AB, United Kingdom.
|
23
|
A network analysis of the human T-cell activation gene network identifies JAGGED1 as a therapeutic target for autoimmune diseases. PLoS One 2007; 2:e1222. [PMID: 18030350 PMCID: PMC2077806 DOI: 10.1371/journal.pone.0001222] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.5] [Received: 08/21/2007] [Accepted: 10/30/2007] [Indexed: 12/16/2022] Open
Abstract
Understanding complex diseases will benefit from the recognition of the properties of the gene networks that control biological functions. Here, we set out to model the gene network that controls T-cell activation in humans, which is critical for the development of autoimmune diseases such as Multiple Sclerosis (MS). The network was established on the basis of the quantitative expression from 104 individuals of 20 genes of the immune system, as well as on biological information from the Ingenuity database and Bayesian inference. Of the 31 links (gene interactions) identified in the network, 18 were identified in the Ingenuity database and 13 were new, and we validated 7 of 8 interactions experimentally. In the MS patients network, we found an increase in the weight of gene interactions related to Th1 function and a decrease in those related to Treg and Th2 function. Indeed, we found that IFN-β therapy induces changes in gene interactions related to T cell proliferation and adhesion, although these gene interactions were not restored to levels similar to controls. Finally, we identify JAG1 as a new therapeutic target whose differential behaviour in the MS network was not modified by immunomodulatory therapy. In vitro treatment with a Jagged1 agonist peptide modulated the T-cell activation network in PBMCs from patients with MS. Moreover, treatment of mice with experimental autoimmune encephalomyelitis with the Jagged1 agonist ameliorated the disease course, and modulated Th2, Th1 and Treg function. This study illustrates how network analysis can predict therapeutic targets for immune intervention and identifies the immunomodulatory properties of Jagged1, making it a new therapeutic target for MS and other autoimmune diseases.
|
24
|
Ponzoni I, Azuaje F, Augusto J, Glass D. Inferring adaptive regulation thresholds and association rules from gene expression data through combinatorial optimization learning. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2007; 4:624-634. [PMID: 17975273 DOI: 10.1109/tcbb.2007.1049] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 05/25/2023]
Abstract
There is a need to design computational methods to support the prediction of gene regulatory networks. Such models should offer both biologically-meaningful and computationally-accurate predictions, which in combination with other techniques may improve large-scale, integrative studies. This paper presents a new machine learning method for the prediction of putative regulatory associations from expression data, which exhibits properties never or only partially addressed by other recently published techniques. The method was tested on a Saccharomyces cerevisiae gene expression dataset. The results were statistically validated and compared with the relationships inferred by two machine learning approaches to gene regulatory network prediction. Furthermore, the resulting predictions were assessed using domain knowledge. The proposed algorithm may be able to accurately predict relevant biological associations between genes. One of the most relevant features of this new method is the prediction of adaptive regulation thresholds for the discretization of gene expression values, which is required prior to the rule association learning process. Moreover, an important advantage is its low computational cost for inferring association rules. The proposed system may significantly support exploratory, large-scale studies of automated identification of potentially-relevant gene expression associations.
|
25
|
Peercy BE, Cox SJ, Shalel-Levanon S, San KY, Bennett G. A kinetic model of oxygen regulation of cytochrome production in Escherichia coli. J Theor Biol 2006; 242:547-63. [PMID: 16750836 DOI: 10.1016/j.jtbi.2006.04.006] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 08/25/2005] [Revised: 03/20/2006] [Accepted: 04/05/2006] [Indexed: 11/16/2022]
Abstract
Recent experimental work has identified the principal components arrayed by Escherichia coli in its sensing of, and response to, varying levels of oxygen. This apparatus may be leveraged/modified by the metabolic engineer to identify nonuniform oxygen and glucose regimens that deliver better yields than their uniform counterparts. Toward this end we build and analyse a mathematical model that captures the role played by oxygen in the regulation of cytochrome production in E. coli.
Affiliation(s)
- Bradford E Peercy
- Computational and Applied Mathematics, Rice University, 6100 Main Str., MS 134, Houston, TX 77005, USA.
|
26
|
Kell DB. Theodor Bücher Lecture. Metabolomics, modelling and machine learning in systems biology - towards an understanding of the languages of cells. Delivered on 3 July 2005 at the 30th FEBS Congress and the 9th IUBMB conference in Budapest. FEBS J 2006; 273:873-94. [PMID: 16478464 DOI: 10.1111/j.1742-4658.2006.05136.x] [Citation(s) in RCA: 130] [Impact Index Per Article: 7.2] [Indexed: 11/28/2022]
Abstract
The newly emerging field of systems biology involves a judicious interplay between high-throughput 'wet' experimentation, computational modelling and technology development, coupled to the world of ideas and theory. This interplay involves iterative cycles, such that systems biology is not at all confined to hypothesis-dependent studies, with intelligent, principled, hypothesis-generating studies being of high importance and consequently very far from aimless fishing expeditions. I seek to illustrate each of these facets. Novel technology development in metabolomics can increase substantially the dynamic range and number of metabolites that one can detect, and these can be exploited as disease markers and in the consequent and principled generation of hypotheses that are consistent with the data and achieve this in a value-free manner. Much of classical biochemistry and signalling pathway analysis has concentrated on the analyses of changes in the concentrations of intermediates, with 'local' equations - such as that of Michaelis and Menten, $v = V_{\max} S / (S + K_m)$ - that describe individual steps being based solely on the instantaneous values of these concentrations. Recent work using single cells (that are not subject to the intellectually unsupportable averaging of the variable displayed by heterogeneous cells possessing nonlinear kinetics) has led to the recognition that some protein signalling pathways may encode their signals not (just) as concentrations (AM or amplitude-modulated in a radio analogy) but via changes in the dynamics of those concentrations (the signals are FM or frequency-modulated). This contributes in principle to a straightforward solution of the crosstalk problem, leads to a profound reassessment of how to understand the downstream effects of dynamic changes in the concentrations of elements in these pathways, and stresses the role of signal processing (and not merely the intermediates) in biological signalling. 
It is this signal processing that lies at the heart of understanding the languages of cells. The resolution of many of the modern and postgenomic problems of biochemistry requires the development of a myriad of new technologies (and maybe a new culture), and thus regular input from the physical sciences, engineering, mathematics and computer science. One solution, that we are adopting in the Manchester Interdisciplinary Biocentre (http://www.mib.ac.uk/) and the Manchester Centre for Integrative Systems Biology (http://www.mcisb.org/), is thus to colocate individuals with the necessary combinations of skills. Novel disciplines that require such an integrative approach continue to emerge. These include fields such as chemical genomics, synthetic biology, distributed computational environments for biological data and modelling, single cell diagnostics/bionanotechnology, and computational linguistics/text mining.
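The Michaelis and Menten rate law quoted in this abstract depends only on the instantaneous substrate concentration, which is what the author means by a 'local' equation. A minimal sketch of that rate law follows; the function name and the numerical values are illustrative assumptions, not taken from the lecture.

```python
def michaelis_menten_rate(s, v_max, k_m):
    """Reaction rate v = Vmax * S / (S + Km).

    s     : instantaneous substrate concentration (must be >= 0)
    v_max : maximal rate at saturating substrate
    k_m   : Michaelis constant (must be > 0)
    """
    if s < 0 or k_m <= 0:
        raise ValueError("require s >= 0 and k_m > 0")
    return v_max * s / (s + k_m)


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    v_max, k_m = 1.0, 0.5
    for s in (0.0, 0.5, 5.0, 50.0):
        print(s, michaelis_menten_rate(s, v_max, k_m))
```

At `s == k_m` the rate is exactly half of `v_max`, and it approaches `v_max` as the substrate concentration grows; nothing in the expression depends on the history of `s`, only on its current value.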
Affiliation(s)
- Douglas B Kell
- School of Chemistry, Faraday Building, The University of Manchester, UK.
|
27
|
|
28
|
Klamt S, Saez-Rodriguez J, Lindquist JA, Simeoni L, Gilles ED. A methodology for the structural and functional analysis of signaling and regulatory networks. BMC Bioinformatics 2006; 7:56. [PMID: 16464248 PMCID: PMC1458363 DOI: 10.1186/1471-2105-7-56] [Citation(s) in RCA: 201] [Impact Index Per Article: 11.2] [Received: 07/28/2005] [Accepted: 02/07/2006] [Indexed: 12/15/2022] Open
Abstract
Background Structural analysis of cellular interaction networks contributes to a deeper understanding of network-wide interdependencies, causal relationships, and basic functional capabilities. While the structural analysis of metabolic networks is a well-established field, similar methodologies have been scarcely developed and applied to signaling and regulatory networks. Results We propose formalisms and methods, relying on adapted and partially newly introduced approaches, which facilitate a structural analysis of signaling and regulatory networks with focus on functional aspects. We use two different formalisms to represent and analyze interaction networks: interaction graphs and (logical) interaction hypergraphs. We show that, in interaction graphs, the determination of feedback cycles and of all the signaling paths between any pair of species is equivalent to the computation of elementary modes known from metabolic networks. Knowledge of the set of signaling paths and feedback loops facilitates the computation of intervention strategies and the classification of compounds into activators, inhibitors, ambivalent factors, and non-affecting factors with respect to a certain species. In some cases, qualitative effects induced by perturbations can be unambiguously predicted from the network scheme. Interaction graphs, however, are not able to capture AND relationships, which frequently occur in interaction networks. The consequent logical concatenation of all the arcs pointing into a species leads to Boolean networks. For a Boolean representation of cellular interaction networks we propose a formalism based on logical (or signed) interaction hypergraphs, which facilitates in particular a logical steady state analysis (LSSA). LSSA enables studies on the logical processing of signals and the identification of optimal intervention points (targets) in cellular networks. 
LSSA also reveals network regions whose parametrization and initial states are crucial for the dynamic behavior. We have implemented these methods in our software tool CellNetAnalyzer (successor of FluxAnalyzer) and illustrate their applicability using a logical model of T-Cell receptor signaling providing non-intuitive results regarding feedback loops, essential elements, and (logical) signal processing upon different stimuli. Conclusion The methods and formalisms we propose herein are another step towards the comprehensive functional analysis of cellular interaction networks. Their potential, shown on a realistic T-cell signaling model, makes them a promising tool.
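The idea of a logical steady state in a Boolean network with AND relationships can be sketched on a toy example. The network below, its species names, and the brute-force enumeration are assumptions made for illustration; this is not the algorithm implemented in CellNetAnalyzer and not the T-cell model from the article.

```python
from itertools import product

# Hypothetical toy network: A, B and E are external inputs.
#   C is active only if A AND B are active (an AND relationship).
#   D is active if C is active AND the inhibitor E is absent.
RULES = {
    "C": lambda s: s["A"] and s["B"],
    "D": lambda s: s["C"] and not s["E"],
}


def logical_steady_states(inputs, rules):
    """Return every state in which each regulated node equals its rule.

    inputs : dict of fixed Boolean values for the input nodes
    rules  : dict mapping a regulated node to a function of the full state
    """
    regulated = sorted(rules)
    steady = []
    # Brute force over all assignments of the regulated nodes.
    for values in product([False, True], repeat=len(regulated)):
        state = dict(inputs)
        state.update(zip(regulated, values))
        if all(rules[node](state) == state[node] for node in regulated):
            steady.append(state)
    return steady


if __name__ == "__main__":
    stimulated = {"A": True, "B": True, "E": False}
    inhibited = {"A": True, "B": True, "E": True}
    print(logical_steady_states(stimulated, RULES))
    print(logical_steady_states(inhibited, RULES))
```

Exhaustive enumeration is exponential in the number of regulated nodes, so it is only practical for small examples like this one; it is used here because it makes the definition of a steady state explicit.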
Affiliation(s)
- Steffen Klamt
- Max-Planck Institute for Dynamics of Complex Technical Systems, Sandtorstrasse 1, D-39106 Magdeburg, Germany
- Julio Saez-Rodriguez
- Max-Planck Institute for Dynamics of Complex Technical Systems, Sandtorstrasse 1, D-39106 Magdeburg, Germany
- Jonathan A Lindquist
- Institute for Immunology, University of Magdeburg, Leipziger Strasse 44, D-39120 Magdeburg, Germany
- Luca Simeoni
- Institute for Immunology, University of Magdeburg, Leipziger Strasse 44, D-39120 Magdeburg, Germany
- Ernst D Gilles
- Max-Planck Institute for Dynamics of Complex Technical Systems, Sandtorstrasse 1, D-39106 Magdeburg, Germany
|