1
|
Tan Y, Neto FBL, Neto UB. PALLAS: Penalized mAximum LikeLihood and pArticle Swarms for Inference of Gene Regulatory Networks From Time Series Data. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:1807-1816. [PMID: 33170782 DOI: 10.1109/tcbb.2020.3037090] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
We present PALLAS, a practical method for gene regulatory network (GRN) inference from time series data, which employs penalized maximum likelihood and particle swarms for optimization. PALLAS is based on the Partially-Observable Boolean Dynamical System (POBDS) model and thus does not require ad-hoc binarization of the data. The penalty in the likelihood is a LASSO regularization term, which encourages the resulting network to be sparse. PALLAS is able to scale to networks of realistic size under no prior knowledge, by virtue of a novel continuous-discrete Fish School Search particle swarm algorithm for efficient simultaneous maximization of the penalized likelihood over the discrete space of networks and the continuous space of observational parameters. The performance of PALLAS is demonstrated by a comprehensive set of experiments using synthetic data generated from real and artificial networks, as well as real time series microarray and RNA-seq data, where it is compared to several other well-known methods for gene regulatory network inference. The results show that PALLAS can infer GRNs more accurately than other methods, while being capable of working directly on gene expression data, without need of ad-hoc binarization. PALLAS is a fully-fledged program, written in python, and available on GitHub (https://github.com/yukuntan92/PALLAS).
Collapse
|
2
|
The identifiability of gene regulatory networks: the role of observation data. J Biol Phys 2022; 48:93-110. [PMID: 34988715 PMCID: PMC8866611 DOI: 10.1007/s10867-021-09595-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2020] [Accepted: 11/07/2021] [Indexed: 10/19/2022] Open
Abstract
Identifying gene regulatory networks (GRN) from observation data is significant to understand biological systems. Conventional studies focus on improving the performance of identification algorithms. However, besides algorithm performance, the GRN identification is strongly depended on the observation data. In this work, for three GRN S-system models, three observation data collection schemes are used to perform the identifiability test procedure. A modified genetic algorithm-particle swarm optimization algorithm is proposed to implement this task, including the multi-level mutation operation and velocity limitation strategy. The results show that, in scheme 1 (starting from a special initial condition), the GRN systems are of identifiability using the sufficient transient observation data. In scheme 2, the observation data are short of sufficient system dynamic. The GRN systems are not of identifiability even though the state trajectories can be reproduced. As a special case of scheme 2, i.e., the steady-state observation data, the equilibrium point analysis is given to explain why it is infeasible for GRN identification. In schemes 1 and 2, the observation data are obtained from zero-input GRN systems, which will evolve to the steady state at last. The sufficient transient observation data in scheme 1 can be obtained by changing the experimental conditions. Additionally, the valid observation data can be also obtained by means of adding impulse excitation signal into GRN systems (scheme 3). Consequently, the GRN systems are identifiable using scheme 3. Owing to its universality and simplicity, these results provide a guide for biologists to collect valid observation data for identifying GRNs and to further understand GRN dynamics.
Collapse
|
3
|
Tsai MJ, Wang JR, Ho SJ, Shu LS, Huang WL, Ho SY. GREMA: modelling of emulated gene regulatory networks with confidence levels based on evolutionary intelligence to cope with the underdetermined problem. Bioinformatics 2020; 36:3833-3840. [DOI: 10.1093/bioinformatics/btaa267] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2019] [Revised: 04/14/2020] [Accepted: 05/09/2020] [Indexed: 11/12/2022] Open
Abstract
AbstractMotivationNon-linear ordinary differential equation (ODE) models that contain numerous parameters are suitable for inferring an emulated gene regulatory network (eGRN). However, the number of experimental measurements is usually far smaller than the number of parameters of the eGRN model that leads to an underdetermined problem. There is no unique solution to the inference problem for an eGRN using insufficient measurements.ResultsThis work proposes an evolutionary modelling algorithm (EMA) that is based on evolutionary intelligence to cope with the underdetermined problem. EMA uses an intelligent genetic algorithm to solve the large-scale parameter optimization problem. An EMA-based method, GREMA, infers a novel type of gene regulatory network with confidence levels for every inferred regulation. The higher the confidence level is, the more accurate the inferred regulation is. GREMA gradually determines the regulations of an eGRN with confidence levels in descending order using either an S-system or a Hill function-based ODE model. The experimental results showed that the regulations with high-confidence levels are more accurate and robust than regulations with low-confidence levels. Evolutionary intelligence enhanced the mean accuracy of GREMA by 19.2% when using the S-system model with benchmark datasets. An increase in the number of experimental measurements may increase the mean confidence level of the inferred regulations. GREMA performed well compared with existing methods that have been previously applied to the same S-system, DREAM4 challenge and SOS DNA repair benchmark datasets.Availability and implementationAll of the datasets that were used and the GREMA-based tool are freely available at https://nctuiclab.github.io/GREMA.Supplementary informationSupplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Ming-Ju Tsai
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan
| | - Jyun-Rong Wang
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan
| | - Shinn-Jang Ho
- Department of Automation Engineering, National Formosa University, Yunlin 632, Taiwan
| | - Li-Sun Shu
- Department of Information Management, Overseas Chinese University, Taichung 407, Taiwan
| | - Wen-Lin Huang
- Department of Industrial Engineering and Management, Minghsin University of Science and Technology, Xinfeng 304, Taiwan
| | - Shinn-Ying Ho
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan
- Department of Biological Science and Technology
- Center For Intelligent Drug Systems and Smart Bio-devices (IDS2B), National Chiao Tung University, Hsinchu 300, Taiwan
| |
Collapse
|
4
|
Kizhakkethil Youseph AS, Chetty M, Karmakar G. Reverse engineering genetic networks using nonlinear saturation kinetics. Biosystems 2019; 182:30-41. [PMID: 31185246 DOI: 10.1016/j.biosystems.2019.103977] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2018] [Revised: 04/25/2019] [Accepted: 05/27/2019] [Indexed: 01/01/2023]
Abstract
A gene regulatory network (GRN) represents a set of genes along with their regulatory interactions. Cellular behavior is driven by genetic level interactions. Dynamics of such systems show nonlinear saturation kinetics which can be best modeled by Michaelis-Menten (MM) and Hill equations. Although MM equation is being widely used for modeling biochemical processes, it has been applied rarely for reverse engineering GRNs. In this paper, we develop a complete framework for a novel model for GRN inference using MM kinetics. A set of coupled equations is first proposed for modeling GRNs. In the coupled model, Michaelis-Menten constant associated with regulation by a gene is made invariant irrespective of the gene being regulated. The parameter estimation of the proposed model is carried out using an evolutionary optimization method, namely, trigonometric differential evolution (TDE). Subsequently, the model is further improved and the regulations of different genes by a given gene are made distinct by allowing varying values of Michaelis-Menten constants for each regulation. Apart from making the model more relevant biologically, the improvement results in a decoupled GRN model with fast estimation of model parameters. Further, to enhance exploitation of the search, we propose a local search algorithm based on hill climbing heuristics. A novel mutation operation is also proposed to avoid population stagnation and premature convergence. Real life benchmark data sets generated in vivo are used for validating the proposed model. Further, we also analyze realistic in silico datasets generated using GeneNetweaver. The comparison of the performance of proposed model with other existing methods shows the potential of the proposed model.
Collapse
Affiliation(s)
| | - Madhu Chetty
- School of Science, Engineering and Information Technology, Federation University Australia, Gippsland 3842, Australia
| | - Gour Karmakar
- School of Science, Engineering and Information Technology, Federation University Australia, Gippsland 3842, Australia
| |
Collapse
|
5
|
Simulink-Based Analysis for Coupled Metabolic Systems. APPLIED COMPUTATIONAL INTELLIGENCE AND SOFT COMPUTING 2018. [DOI: 10.1155/2018/8075051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Stability analysis and dynamic simulation are important for researchers to capture the performance and the properties of underling systems. S-systems have good potential for characterizing dynamic interactive behaviour of large scale metabolic and genetic systems. It is important to develop a platform to achieve timely dynamic behaviour of S-systems to various situations. In this study, we first set up the respective block diagrams of S-systems for module-based simulation. We then derive reasonable theorems to examine the stability of S-systems and find out what kinds of environmental situations will make systems stable. Three canonical systems are used to examine the results which are carried out in the Matlab/Simulink environments.
Collapse
|
6
|
Kimura S, Kitazawa K, Tokuhisa M, Okada-Hatakeyama M. Using A Priori Knowledge after Genetic Network Inference: Integrating Multiple Kinds of Knowledge. CHEM-BIO INFORMATICS JOURNAL 2017. [DOI: 10.1273/cbij.17.53] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
|
7
|
Jereesh AS, Govindan VK. Immuno-hybrid algorithm: a novel hybrid approach for GRN reconstruction. 3 Biotech 2016; 6:222. [PMID: 28330294 PMCID: PMC5065543 DOI: 10.1007/s13205-016-0536-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2016] [Accepted: 10/03/2016] [Indexed: 11/28/2022] Open
Abstract
Bio-inspired algorithms are widely used to optimize the model parameters of GRN. In this paper, focus is given to develop improvised versions of bio-inspired algorithm for the specific problem of reconstruction of gene regulatory network. The approach is applied to the data set that was developed by the DNA microarray technology through biological experiments in the lab. This paper introduced a novel hybrid method, which combines the clonal selection algorithm and BFGS Quasi-Newton algorithm. The proposed approach implemented for real world E. coli data set and identified most of the relations. The results are also compared with the existing methods and proven to be efficient.
Collapse
Affiliation(s)
- A. S. Jereesh
- Department of Computer Science, Cochin University of Science and Technology, Cochin, Kerala India
| | - V. K. Govindan
- Department of Computer Science and Engineering, Indian Institute of Information Technology Pala, Kottayam, Kerala India
| |
Collapse
|
8
|
Kimura S, Tokuhisa M, Okada-Hatakeyama M. Genetic Network Inference Using Hierarchical Structure. Front Physiol 2016; 7:57. [PMID: 26941653 PMCID: PMC4763037 DOI: 10.3389/fphys.2016.00057] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2015] [Accepted: 02/05/2016] [Indexed: 12/28/2022] Open
Abstract
Many methods for inferring genetic networks have been proposed, but the regulations they infer often include false-positives. Several researchers have attempted to reduce these erroneous regulations by proposing the use of a priori knowledge about the properties of genetic networks such as their sparseness, scale-free structure, and so on. This study focuses on another piece of a priori knowledge, namely, that biochemical networks exhibit hierarchical structures. Based on this idea, we propose an inference approach that uses the hierarchical structure in a target genetic network. To obtain a reasonable hierarchical structure, the first step of the proposed approach is to infer multiple genetic networks from the observed gene expression data. We take this step using an existing method that combines a genetic network inference method with a bootstrap method. The next step is to extract a hierarchical structure from the inferred networks that is consistent with most of the networks. Third, we use the hierarchical structure obtained to assign confidence values to all candidate regulations. Numerical experiments are also performed to demonstrate the effectiveness of using the hierarchical structure in the genetic network inference. The improvement accomplished by the use of the hierarchical structure is small. However, the hierarchical structure could be used to improve the performances of many existing inference methods.
Collapse
Affiliation(s)
- Shuhei Kimura
- Department of Information and Electronics, Graduate School of Engineering, Tottori University Tottori, Japan
| | - Masato Tokuhisa
- Department of Information and Electronics, Graduate School of Engineering, Tottori University Tottori, Japan
| | - Mariko Okada-Hatakeyama
- Laboratory for Integrated Cellular Systems, RIKEN Center for Integrative Medical Sciences Yokohama, Japan
| |
Collapse
|
9
|
Liu LZ, Wu FX, Zhang WJ. Properties of sparse penalties on inferring gene regulatory networks from time-course gene expression data. IET Syst Biol 2015; 9:16-24. [PMID: 25569860 DOI: 10.1049/iet-syb.2013.0060] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022] Open
Abstract
Genes regulate each other and form a gene regulatory network (GRN) to realise biological functions. Elucidating GRN from experimental data remains a challenging problem in systems biology. Numerous techniques have been developed and sparse linear regression methods become a promising approach to infer accurate GRNs. However, most linear methods are either based on steady-state gene expression data or their statistical properties are not analysed. Here, two sparse penalties, adaptive least absolute shrinkage and selection operator and smoothly clipped absolute deviation, are proposed to infer GRNs from time-course gene expression data based on an auto-regressive model and their Oracle properties are proved under mild conditions. The effectiveness of those methods is demonstrated by applications to in silico and real biological data.
Collapse
Affiliation(s)
- Li-Zhi Liu
- Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
| | - Fang-Xiang Wu
- Department of Mechanical Engineering, Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada.
| | - Wen-Jun Zhang
- Department of Mechanical Engineering, Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
| |
Collapse
|
10
|
Liu LZ, Wu FX, Zhang WJ. Properties of sparse penalties on inferring gene regulatory networks from time-course gene expression data. IET Syst Biol 2015. [PMID: 25569860 DOI: 10.1049/iet‐syb.2013.0060] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022] Open
Abstract
Genes regulate each other and form a gene regulatory network (GRN) to realise biological functions. Elucidating GRN from experimental data remains a challenging problem in systems biology. Numerous techniques have been developed and sparse linear regression methods become a promising approach to infer accurate GRNs. However, most linear methods are either based on steady-state gene expression data or their statistical properties are not analysed. Here, two sparse penalties, adaptive least absolute shrinkage and selection operator and smoothly clipped absolute deviation, are proposed to infer GRNs from time-course gene expression data based on an auto-regressive model and their Oracle properties are proved under mild conditions. The effectiveness of those methods is demonstrated by applications to in silico and real biological data.
Collapse
Affiliation(s)
- Li-Zhi Liu
- Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
| | - Fang-Xiang Wu
- Department of Mechanical Engineering, Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada.
| | - Wen-Jun Zhang
- Department of Mechanical Engineering, Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
| |
Collapse
|
11
|
Liu LZ, Wu FX, Zhang WJ. A group LASSO-based method for robustly inferring gene regulatory networks from multiple time-course datasets. BMC SYSTEMS BIOLOGY 2014; 8 Suppl 3:S1. [PMID: 25350697 PMCID: PMC4243122 DOI: 10.1186/1752-0509-8-s3-s1] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
BACKGROUND As an abstract mapping of the gene regulations in the cell, gene regulatory network is important to both biological research study and practical applications. The reverse engineering of gene regulatory networks from microarray gene expression data is a challenging research problem in systems biology. With the development of biological technologies, multiple time-course gene expression datasets might be collected for a specific gene network under different circumstances. The inference of a gene regulatory network can be improved by integrating these multiple datasets. It is also known that gene expression data may be contaminated with large errors or outliers, which may affect the inference results. RESULTS A novel method, Huber group LASSO, is proposed to infer the same underlying network topology from multiple time-course gene expression datasets as well as to take the robustness to large error or outliers into account. To solve the optimization problem involved in the proposed method, an efficient algorithm which combines the ideas of auxiliary function minimization and block descent is developed. A stability selection method is adapted to our method to find a network topology consisting of edges with scores. The proposed method is applied to both simulation datasets and real experimental datasets. It shows that Huber group LASSO outperforms the group LASSO in terms of both areas under receiver operating characteristic curves and areas under the precision-recall curves. CONCLUSIONS The convergence analysis of the algorithm theoretically shows that the sequence generated from the algorithm converges to the optimal solution of the problem. The simulation and real data examples demonstrate the effectiveness of the Huber group LASSO in integrating multiple time-course gene expression datasets and improving the resistance to large errors or outliers.
Collapse
|
12
|
Inference of Vohradský's models of genetic networks by solving two-dimensional function optimization problems. PLoS One 2014; 8:e83308. [PMID: 24386175 PMCID: PMC3875442 DOI: 10.1371/journal.pone.0083308] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2013] [Accepted: 11/01/2013] [Indexed: 11/21/2022] Open
Abstract
The inference of a genetic network is a problem in which mutual interactions among genes are inferred from time-series of gene expression levels. While a number of models have been proposed to describe genetic networks, this study focuses on a mathematical model proposed by Vohradský. Because of its advantageous features, several researchers have proposed the inference methods based on Vohradský's model. When trying to analyze large-scale networks consisting of dozens of genes, however, these methods must solve high-dimensional non-linear function optimization problems. In order to resolve the difficulty of estimating the parameters of the Vohradský's model, this study proposes a new method that defines the problem as several two-dimensional function optimization problems. Through numerical experiments on artificial genetic network inference problems, we showed that, although the computation time of the proposed method is not the shortest, the method has the ability to estimate parameters of Vohradský's models more effectively with sufficiently short computation times. This study then applied the proposed method to an actual inference problem of the bacterial SOS DNA repair system, and succeeded in finding several reasonable regulations.
Collapse
|
13
|
Chowdhury AR, Chetty M, Vinh NX. Incorporating time-delays in S-System model for reverse engineering genetic networks. BMC Bioinformatics 2013; 14:196. [PMID: 23777625 PMCID: PMC3839642 DOI: 10.1186/1471-2105-14-196] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2013] [Accepted: 06/07/2013] [Indexed: 11/10/2022] Open
Abstract
Background In any gene regulatory network (GRN), the complex interactions occurring amongst transcription factors and target genes can be either instantaneous or time-delayed. However, many existing modeling approaches currently applied for inferring GRNs are unable to represent both these interactions simultaneously. As a result, all these approaches cannot detect important interactions of the other type. S-System model, a differential equation based approach which has been increasingly applied for modeling GRNs, also suffers from this limitation. In fact, all S-System based existing modeling approaches have been designed to capture only instantaneous interactions, and are unable to infer time-delayed interactions. Results In this paper, we propose a novel Time-Delayed S-System (TDSS) model which uses a set of delay differential equations to represent the system dynamics. The ability to incorporate time-delay parameters in the proposed S-System model enables simultaneous modeling of both instantaneous and time-delayed interactions. Furthermore, the delay parameters are not limited to just positive integer values (corresponding to time stamps in the data), but can also take fractional values. Moreover, we also propose a new criterion for model evaluation exploiting the sparse and scale-free nature of GRNs to effectively narrow down the search space, which not only reduces the computation time significantly but also improves model accuracy. The evaluation criterion systematically adapts the max-min in-degrees and also systematically balances the effect of network accuracy and complexity during optimization. Conclusion The four well-known performance measures applied to the experimental studies on synthetic networks with various time-delayed regulations clearly demonstrate that the proposed method can capture both instantaneous and delayed interactions correctly with high precision. The experiments carried out on two well-known real-life networks, namely IRMA and SOS DNA repair network in Escherichia coli show a significant improvement compared with other state-of-the-art approaches for GRN modeling.
Collapse
Affiliation(s)
- Ahsan Raja Chowdhury
- Gippsland School of Information Technology, Monash University, Churchill, Victoria-3842, Australia.
| | | | | |
Collapse
|
14
|
Abstract
Biochemical systems theory (BST) is the foundation for a set of analytical andmodeling tools that facilitate the analysis of dynamic biological systems. This paper depicts major developments in BST up to the current state of the art in 2012. It discusses its rationale, describes the typical strategies and methods of designing, diagnosing, analyzing, and utilizing BST models, and reviews areas of application. The paper is intended as a guide for investigators entering the fascinating field of biological systems analysis and as a resource for practitioners and experts.
Collapse
|
15
|
Nakajima N, Tamura T, Yamanishi Y, Horimoto K, Akutsu T. Network completion using dynamic programming and least-squares fitting. ScientificWorldJournal 2012; 2012:957620. [PMID: 23213307 PMCID: PMC3504398 DOI: 10.1100/2012/957620] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2012] [Accepted: 09/26/2012] [Indexed: 11/24/2022] Open
Abstract
We consider the problem of network completion, which is to make
the minimum amount of modifications to a given network so that the resulting network
is most consistent with the observed data. We employ here a certain type of differential
equations as gene regulation rules in a genetic network, gene expression time series data
as observed data, and deletions and additions of edges as basic modification operations.
In addition, we assume that the numbers of deleted and added edges are specified. For
this problem, we present a novel method using dynamic programming and least-squares
fitting and show that it outputs a network with the minimum sum squared error in
polynomial time if the maximum indegree of the network is bounded by a constant. We
also perform computational experiments using both artificially generated and real gene
expression time series data.
Collapse
Affiliation(s)
- Natsu Nakajima
- Bioinformatics Center, Institute for Chemical Research, Kyoto University Gokasho, Uji, Kyoto 611-0011, Japan
| | | | | | | | | |
Collapse
|
16
|
Chemmangattuvalappil N, Task K, Banerjee I. An integer optimization algorithm for robust identification of non-linear gene regulatory networks. BMC SYSTEMS BIOLOGY 2012; 6:119. [PMID: 22937832 PMCID: PMC3444924 DOI: 10.1186/1752-0509-6-119] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/08/2012] [Accepted: 08/27/2012] [Indexed: 11/16/2022]
Abstract
Background Reverse engineering gene networks and identifying regulatory interactions are integral to understanding cellular decision making processes. Advancement in high throughput experimental techniques has initiated innovative data driven analysis of gene regulatory networks. However, inherent noise associated with biological systems requires numerous experimental replicates for reliable conclusions. Furthermore, evidence of robust algorithms directly exploiting basic biological traits are few. Such algorithms are expected to be efficient in their performance and robust in their prediction. Results We have developed a network identification algorithm to accurately infer both the topology and strength of regulatory interactions from time series gene expression data in the presence of significant experimental noise and non-linear behavior. In this novel formulism, we have addressed data variability in biological systems by integrating network identification with the bootstrap resampling technique, hence predicting robust interactions from limited experimental replicates subjected to noise. Furthermore, we have incorporated non-linearity in gene dynamics using the S-system formulation. The basic network identification formulation exploits the trait of sparsity of biological interactions. Towards that, the identification algorithm is formulated as an integer-programming problem by introducing binary variables for each network component. The objective function is targeted to minimize the network connections subjected to the constraint of maximal agreement between the experimental and predicted gene dynamics. The developed algorithm is validated using both in silico and experimental data-sets. These studies show that the algorithm can accurately predict the topology and connection strength of the in silico networks, as quantified by high precision and recall, and small discrepancy between the actual and predicted kinetic parameters. Furthermore, in both the in silico and experimental case studies, the predicted gene expression profiles are in very close agreement with the dynamics of the input data. Conclusions Our integer programming algorithm effectively utilizes bootstrapping to identify robust gene regulatory networks from noisy, non-linear time-series gene expression data. With significant noise and non-linearities being inherent to biological systems, the present formulism, with the incorporation of network sparsity, is extremely relevant to gene regulatory networks, and while the formulation has been validated against in silico and E. Coli data, it can be applied to any biological system.
Collapse
Affiliation(s)
- Nishanth Chemmangattuvalappil
- Department of Chemical Engineering, University of Pittsburgh, 1249 Benedum Hall, 3700 O'Hara Street, Pittsburgh, PA 15261, USA
| | | | | |
Collapse
|
17
|
Sambo F, de Oca MAM, Di Camillo B, Toffolo G, Stützle T. MORE: mixed optimization for reverse engineering--an application to modeling biological networks response via sparse systems of nonlinear differential equations. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2012; 9:1459-1471. [PMID: 22837424 DOI: 10.1109/tcbb.2012.56] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/01/2023]
Abstract
Reverse engineering is the problem of inferring the structure of a network of interactions between biological variables from a set of observations. In this paper, we propose an optimization algorithm, called MORE, for the reverse engineering of biological networks from time series data. The model inferred by MORE is a sparse system of nonlinear differential equations, complex enough to realistically describe the dynamics of a biological system. MORE tackles separately the discrete component of the problem, the determination of the biological network topology, and the continuous component of the problem, the strength of the interactions. This approach allows us both to enforce system sparsity, by globally constraining the number of edges, and to integrate a priori information about the structure of the underlying interaction network. Experimental results on simulated and real-world networks show that the mixed discrete/continuous optimization approach of MORE significantly outperforms standard continuous optimization and that MORE is competitive with the state of the art in terms of accuracy of the inferred networks.
Collapse
Affiliation(s)
- Francesco Sambo
- Department of Information Engineering, University of Padova, Padova, Italy.
| | | | | | | | | |
Collapse
|
18
|
Yang X, Dent JE, Nardini C. An S-System Parameter Estimation Method (SPEM) for biological networks. J Comput Biol 2012; 19:175-87. [PMID: 22300319 DOI: 10.1089/cmb.2011.0269] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
Advances in experimental biology, coupled with advances in computational power, bring new challenges to the interdisciplinary field of computational biology. One such broad challenge lies in the reverse engineering of gene networks, and goes from determining the structure of static networks, to reconstructing the dynamics of interactions from time series data. Here, we focus our attention on the latter area, and in particular, on parameterizing a dynamic network of oriented interactions between genes. By basing the parameterizing approach on a known power-law relationship model between connected genes (S-system), we are able to account for non-linearity in the network, without compromising the ability to analyze network characteristics. In this article, we introduce the S-System Parameter Estimation Method (SPEM). SPEM, a freely available R software package (http://www.picb.ac.cn/ClinicalGenomicNTW/temp3.html), takes gene expression data in time series and returns the network of interactions as a set of differential equations. The methods, which are presented and tested here, are shown to provide accurate results not only on synthetic data, but more importantly on real and therefore noisy by nature, biological data. In summary, SPEM shows high sensitivity and positive predicted values, as well as free availability and expansibility (because based on open source software). We expect these characteristics to make it a useful and broadly applicable software in the challenging reconstruction of dynamic gene networks.
Collapse
Affiliation(s)
- Xinyi Yang
- Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, PR China
| | | | | |
Collapse
|
19
|
Hsiao YT, Lee WP. Inferring robust gene networks from expression data by a sensitivity-based incremental evolution method. BMC Bioinformatics 2012; 13 Suppl 7:S8. [PMID: 22595005 PMCID: PMC3348052 DOI: 10.1186/1471-2105-13-s7-s8] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/17/2023] Open
Abstract
Background Reconstructing gene regulatory networks (GRNs) from expression data is one of the most important challenges in systems biology research. Many computational models and methods have been proposed to automate the process of network reconstruction. Inferring robust networks with desired behaviours remains challenging, however. This problem is related to network dynamics but has yet to be investigated using network modeling. Results We propose an incremental evolution approach for inferring GRNs that takes network robustness into consideration and can deal with a large number of network parameters. Our approach includes a sensitivity analysis procedure to iteratively select the most influential network parameters, and it uses a swarm intelligence procedure to perform parameter optimization. We have conducted a series of experiments to evaluate the external behaviors and internal robustness of the networks inferred by the proposed approach. The results and analyses have verified the effectiveness of our approach. Conclusions Sensitivity analysis is crucial to identifying the most sensitive parameters that govern the network dynamics. It can further be used to derive constraints for network parameters in the network reconstruction process. The experimental results show that the proposed approach can successfully infer robust GRNs with desired system behaviors.
Collapse
Affiliation(s)
- Yu-Ting Hsiao
- Department of Information Management, National Sun Yat-sen University, 70, Lienhai Road, Kaohsiung, Taiwan
| | | |
Collapse
|
20
|
Saeki Y, Nagashima T, Kimura S, Okada-Hatakeyama M. An ErbB receptor-mediated AP-1 regulatory network is modulated by STAT3 and c-MYC during calcium-dependent keratinocyte differentiation. Exp Dermatol 2012; 21:293-8. [DOI: 10.1111/j.1600-0625.2012.01453.x] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|
21
|
Kimura S, Araki D, Matsumura K, Okada-Hatakeyama M. Inference of S-system models of genetic networks by solving one-dimensional function optimization problems. Math Biosci 2012; 235:161-70. [PMID: 22155075 DOI: 10.1016/j.mbs.2011.11.008] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2011] [Revised: 10/21/2011] [Accepted: 11/22/2011] [Indexed: 11/17/2022]
Affiliation(s)
- S Kimura
- Graduate School of Engineering, Tottori University, 4-101, Koyama-minami, Tottori 680-8552, Japan.
| | | | | | | |
Collapse
|
22
|
KIMURA SHUHEI, SHIRAISHI YUICHI, OKADA MARIKO. INFERENCE OF GENETIC NETWORKS USING LPMs: ASSESSMENT OF CONFIDENCE VALUES OF REGULATIONS. J Bioinform Comput Biol 2011. [DOI: 10.1142/s0219720010004859] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
When we apply inference methods based on a set of differential equations into actual genetic network inference problems, we often end up with a large number of false-positive regulations. However, as we must check the inferred regulations through biochemical experiments, fewer false-positive regulations are preferable. In order to reduce the number of regulations checked, this study proposes a new method that assigns confidence values to all of the regulations contained in the target network. For this purpose, we combine a residual bootstrap method with the existing method, i.e. the inference method using linear programming machines (LPMs). Through numerical experiments on an artificial genetic network inference problem, we confirmed that most of the regulations with high confidence values are actually present in the target networks. We then used the proposed method to analyze the bacterial SOS DNA repair system, and succeeded in assigning reasonable confidence values to its regulations. Although this study combined the bootstrap method with the inference method using the LPMs, the proposed bootstrap approach could be combined with any method that has an ability to infer a genetic network from time-series of gene expression levels.
Collapse
Affiliation(s)
- SHUHEI KIMURA
- Graduate School of Engineering, Tottori University, 4-101, Koyama-minami, Tottori 680-8552, Japan
| | - YUICHI SHIRAISHI
- Research Center for Allergy and Immunology, RIKEN, 1-7-22 Suehiro, Tsurumi, Yokohama 230-0045, Japan
| | - MARIKO OKADA
- Research Center for Allergy and Immunology, RIKEN, 1-7-22 Suehiro, Tsurumi, Yokohama 230-0045, Japan
| |
Collapse
|
23
|
Kentzoglanakis K, Poole M. A swarm intelligence framework for reconstructing gene networks: searching for biologically plausible architectures. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2011; 9:358-371. [PMID: 21576756 DOI: 10.1109/tcbb.2011.87] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/30/2023]
Abstract
In this paper, we investigate the problem of reverse engineering the topology of gene regulatory networks from temporal gene expression data. We adopt a computational intelligence approach comprising swarm intelligence techniques, namely particle swarm optimization (PSO) and ant colony optimization (ACO). In addition, the recurrent neural network (RNN) formalism is employed for modeling the dynamical behavior of gene regulatory systems. More specifically, ACO is used for searching the discrete space of network architectures and PSO for searching the corresponding continuous space of RNN model parameters. We propose a novel solution construction process in the context of ACO for generating biologically plausible candidate architectures. The objective is to concentrate the search effort into areas of the structure space that contain architectures which are feasible in terms of their topological resemblance to real-world networks. The proposed framework is initially applied to the reconstruction of a small artificial network that has previously been studied in the context of gene network reverse engineering. Subsequently, we consider an artificial data set with added noise for reconstructing a subnetwork of the genetic interaction network of S. cerevisiae (yeast). Finally, the framework is applied to a real-world data set for reverse engineering the SOS response system of the bacterium Escherichia coli. Results demonstrate the relative advantage of utilizing problem-specific knowledge regarding biologically plausible structural properties of gene networks over conducting a problem-agnostic search in the vast space of network architectures.
Collapse
|
24
|
|
25
|
Shiraishi Y, Kimura S, Okada M. Inferring cluster-based networks from differently stimulated multiple time-course gene expression data. ACTA ACUST UNITED AC 2010; 26:1073-81. [PMID: 20223837 PMCID: PMC2853688 DOI: 10.1093/bioinformatics/btq094] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]
Abstract
Motivation: Clustering and gene network inference often help to predict the biological functions of gene subsets. Recently, researchers have accumulated a large amount of time-course transcriptome data collected under different treatment conditions to understand the physiological states of cells in response to extracellular stimuli and to identify drug-responsive genes. Although a variety of statistical methods for clustering and inferring gene networks from expression profiles have been proposed, most of these are not tailored to simultaneously treat expression data collected under multiple stimulation conditions. Results: We propose a new statistical method for analyzing temporal profiles under multiple experimental conditions. Our method simultaneously performs clustering of temporal expression profiles and inference of regulatory relationships among gene clusters. We applied this method to MCF7 human breast cancer cells treated with epidermal growth factor and heregulin which induce cellular proliferation and differentiation, respectively. The results showed that the method is useful for extracting biologically relevant information. Availability: A MATLAB implementation of the method is available from http://csb.gsc.riken.jp/yshira/software/clusterNetwork.zip Contact:yshira@riken.jp Supplementary information:Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Yuichi Shiraishi
- RIKEN Research Center for Allergy and Immunology, 1-7-22 Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan.
| | | | | |
Collapse
|
26
|
Kabir M, Noman N, Iba H. Reverse engineering gene regulatory network from microarray data using linear time-variant model. BMC Bioinformatics 2010; 11 Suppl 1:S56. [PMID: 20122231 PMCID: PMC3009529 DOI: 10.1186/1471-2105-11-s1-s56] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Background Gene regulatory network is an abstract mapping of gene regulations in living cells that can help to predict the system behavior of living organisms. Such prediction capability can potentially lead to the development of improved diagnostic tests and therapeutics. DNA microarrays, which measure the expression level of thousands of genes in parallel, constitute the numeric seed for the inference of gene regulatory networks. In this paper, we have proposed a new approach for inferring gene regulatory networks from time-series gene expression data using linear time-variant model. Here, Self-Adaptive Differential Evolution, a versatile and robust Evolutionary Algorithm, is used as the learning paradigm. Results To assess the potency of the proposed work, a well known nonlinear synthetic network has been used. The reconstruction method has inferred this synthetic network topology and the associated regulatory parameters with high accuracy from both the noise-free and noisy time-series data. For validation purposes, the proposed approach is also applied to the simulated expression dataset of cAMP oscillations in Dictyostelium discoideum and has proved it's strength in finding the correct regulations. The strength of this work has also been verified by analyzing the real expression dataset of SOS DNA repair system in Escherichia coli and it has succeeded in finding more correct and reasonable regulations as compared to various existing works. Conclusion By the proposed approach, the gene interaction networks have been inferred in an efficient manner from both the synthetic, simulated cAMP oscillation expression data and real expression data. The computational time of this approach is also considerably smaller, which makes it to be more suitable for larger network reconstruction. Thus the proposed approach can serve as an initiate for the future researches regarding the associated area.
Collapse
Affiliation(s)
- Mitra Kabir
- Department of Computer Science and Engineering, University of Dhaka, Dhaka, Bangladesh.
| | | | | |
Collapse
|