1
|
Mandal S, Dutta P. A Review of Computational Approach for S-system-based Modeling of Gene Regulatory Network. Methods Mol Biol 2024; 2719:133-152. [PMID: 37803116 DOI: 10.1007/978-1-0716-3461-5_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/08/2023]
Abstract
Inference of gene regulatory networks (GRNs) from time-series microarray data remains a fascinating task for computer science researchers seeking to understand the complex biological processes that occur inside a cell. Among the popular models for inferring GRNs, the S-system is considered one of the most promising nonlinear mathematical tools for modeling the dynamics of gene expression as well as for inferring the GRN. The S-system is based on biochemical system theory and the power-law formalism. By observing the values of the kinetic parameters of an S-system model, it is possible to extract the regulatory relationships among genes. This review explains several existing intelligent methods proposed for inference of S-system-based GRNs. Finding the most suitable and efficient optimization technique for accurate inference of all kinds of networks (in silico, in vivo, etc.) with low computational complexity remains an open research problem. This paper may help beginners and researchers who want to continue their research in computational biology and bioinformatics.
Affiliation(s)
- Sudip Mandal
  - Department of Electronics and Communication Engineering, Jalpaiguri Government Engineering College, Jalpaiguri, West Bengal, India
- Pijush Dutta
  - Department of Electronics and Communication Engineering, Greater Kolkata College of Engineering and Management, Baruipur, India
|
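The power-law form that the abstract above refers to can be sketched as follows. This is a minimal illustration of the standard S-system equation, dX_i/dt = alpha_i * prod_j X_j^g_ij - beta_i * prod_j X_j^h_ij, and of reading regulations off the kinetic orders; the function names, the threshold, and the two-gene parameter values are hypothetical and are not taken from any of the listed papers.

```python
import math

def s_system_rhs(x, alpha, beta, g, h):
    """Return dX_i/dt for every gene i of an S-system at state x."""
    n = len(x)
    dx = []
    for i in range(n):
        # Production and degradation terms are products of power laws.
        production = alpha[i] * math.prod(x[j] ** g[i][j] for j in range(n))
        degradation = beta[i] * math.prod(x[j] ** h[i][j] for j in range(n))
        dx.append(production - degradation)
    return dx

def regulators(g, threshold=1e-6):
    """List (source, target, sign) edges implied by the production exponents g."""
    edges = []
    for i, row in enumerate(g):
        for j, gij in enumerate(row):
            if abs(gij) > threshold:
                edges.append((j, i, "activates" if gij > 0 else "represses"))
    return edges

# Hypothetical two-gene toy network (illustrative values only).
alpha = [2.0, 3.0]
beta = [1.0, 1.0]
g = [[0.0, -1.0],   # gene 1 represses production of gene 0
     [0.5, 0.0]]    # gene 0 activates production of gene 1
h = [[1.0, 0.0],    # each gene degrades in proportion to its own level
     [0.0, 1.0]]

derivatives = s_system_rhs([4.0, 2.0], alpha, beta, g, h)
network = regulators(g)
```

In this sketch a positive exponent g[i][j] is read as activation of gene i by gene j and a negative one as repression, which is the sense in which kinetic parameters encode the network.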
2
|
Yang B, Bao W, Chen B. PGRNIG: novel parallel gene regulatory network identification algorithm based on GPU. Brief Funct Genomics 2022; 21:441-454. [PMID: 36064791 DOI: 10.1093/bfgp/elac028] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/09/2022] [Revised: 07/30/2022] [Accepted: 08/03/2022] [Indexed: 12/14/2022]
Abstract
Molecular biology has revealed that complex life phenomena can be treated as the result of many gene interactions. Investigating these interactions and understanding the intrinsic mechanisms of biological systems using gene expression data have attracted a lot of attention. As a typical gene regulatory network (GRN) inference method, the S-system has been utilized to deal with small-scale network identification. However, it is extremely difficult to optimize it to infer medium-to-large networks. This paper proposes a novel parallel swarm intelligent algorithm, PGRNIG, to optimize the parameters of the S-system. We employed the clone selection strategy to improve the whale optimization algorithm (CWOA). To enhance the time efficiency of CWOA optimization, we utilized a parallel CWOA (PCWOA) based on the compute unified device architecture (CUDA) platform. Decomposition strategy and L1 regularization were utilized to reduce the search space and complexity of GRN inference. We applied the PGRNIG algorithm on three synthetic datasets and two real time-series expression datasets of the species of Escherichia coli and Saccharomyces cerevisiae. Experimental results show that PGRNIG could infer the gene regulatory network more accurately than other state-of-the-art methods with a convincing computational speed-up. Our findings show that CWOA and PCWOA have faster convergence performances than WOA.
Affiliation(s)
- Bin Yang
  - School of Information Science and Engineering, Zaozhuang University, Zaozhuang 277160, China
- Wenzheng Bao
  - School of Information and Electrical Engineering, Xuzhou University of Technology, Xuzhou 221018, China
- Baitong Chen
  - Xuzhou First People's Hospital, Xuzhou 221000, China
|
3
|
|
4
|
Yang B, Chen Y, Zhang W, Lv J, Bao W, Huang DS. HSCVFNT: Inference of Time-Delayed Gene Regulatory Network Based on Complex-Valued Flexible Neural Tree Model. Int J Mol Sci 2018; 19:E3178. [PMID: 30326663 PMCID: PMC6214043 DOI: 10.3390/ijms19103178] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/11/2018] [Revised: 10/08/2018] [Accepted: 10/10/2018] [Indexed: 11/17/2022]
Abstract
Gene regulatory network (GRN) inference helps to explain the growth and development of animals and plants and to reveal underlying biological mechanisms. Many computational approaches have been proposed to infer GRNs. However, these inference approaches have hardly met modeling needs, and redundancy-reduction methods based on a single information-theoretic measure generalize poorly and are unstable. To overcome these limitations, this paper proposes a novel algorithm, named HSCVFNT, to infer gene regulatory networks with time-delayed regulations by utilizing a hybrid scoring method and a complex-valued flexible neural tree (CVFNT). The regulations of each target gene can be obtained by iteratively performing HSCVFNT. For each target gene, the HSCVFNT algorithm utilizes a novel scoring method based on time-delayed mutual information (TDMI), time-delayed maximum information coefficient (TDMIC) and time-delayed correlation coefficient (TDCC) to reduce the redundancy of regulatory relationships and obtain the candidate regulatory factor set. Then, the TDCC method is utilized to create a time-delayed gene expression time-series matrix. Finally, a complex-valued flexible neural tree model is proposed to infer the time-delayed regulations of each target gene from the time-delayed time-series matrix. Three real time-series expression datasets, from the SOS (Save Our Soul) DNA repair system in E. coli and from Saccharomyces cerevisiae, are utilized to evaluate the performance of the HSCVFNT algorithm. HSCVFNT obtains F-scores of 0.923, 0.8 and 0.625 for SOS network and IRMA (In vivo Reverse-Engineering and Modeling Assessment) network inference, respectively, which are 5.5%, 14.3% and 72.2% higher than the best performance of other state-of-the-art GRN inference methods and time-delayed methods.
Affiliation(s)
- Bin Yang
  - School of Information Science and Engineering, Zaozhuang University, Zaozhuang 277100, China
- Yuehui Chen
  - School of Information Science and Engineering, University of Jinan, Jinan 250002, China
- Wei Zhang
  - School of Information Science and Engineering, Zaozhuang University, Zaozhuang 277100, China
- Jiaguo Lv
  - School of Information Science and Engineering, Zaozhuang University, Zaozhuang 277100, China
- Wenzheng Bao
  - School of Computer Science, China University of Mining and Technology, Xuzhou 221000, China
- De-Shuang Huang
  - Institute of Machine Learning and Systems Biology, Tongji University, Shanghai 200092, China
|
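The idea of a time-delayed correlation score mentioned in the HSCVFNT abstract can be illustrated with a plain lagged Pearson correlation. This is only a sketch under that assumption: the abstract does not give the exact TDCC definition or how it is combined with TDMI and TDMIC, and the helper names `pearson` and `best_delay` are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def best_delay(regulator, target, max_delay):
    """Return (delay, r) maximising |r| between regulator[t] and target[t + delay].

    Assumes both series have the same length and that max_delay leaves
    at least two overlapping time points.
    """
    best = (0, 0.0)
    for d in range(max_delay + 1):
        # Pair regulator at time t with target at time t + d.
        r = pearson(regulator[:len(regulator) - d], target[d:])
        if abs(r) > abs(best[1]):
            best = (d, r)
    return best

# Toy series: the target repeats the regulator two time steps later.
regulator = [0, 1, 0, 2, 0, 3, 1, 0]
target = [5, 5, 0, 1, 0, 2, 0, 3]
delay, score = best_delay(regulator, target, max_delay=3)
```

Scanning a small range of delays and keeping the strongest score is one simple way to obtain both a candidate regulator ranking and the delay used to build a time-shifted expression matrix.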
5
|
Barbosa S, Niebel B, Wolf S, Mauch K, Takors R. A guide to gene regulatory network inference for obtaining predictive solutions: Underlying assumptions and fundamental biological and data constraints. Biosystems 2018; 174:37-48. [PMID: 30312740 DOI: 10.1016/j.biosystems.2018.10.008] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 08/10/2018] [Revised: 10/05/2018] [Accepted: 10/08/2018] [Indexed: 02/07/2023]
Abstract
The study of biological systems at a system level has become a reality due to increasingly powerful computational approaches able to handle increasingly large datasets. Uncovering the dynamic nature of gene regulatory networks in order to attain a system-level understanding and improve the predictive power of biological models is an important research field in systems biology. The task itself presents several challenges, since the problem is combinatorial in nature and depends strongly on several biological constraints and on the intended application. Given the intrinsically interdisciplinary nature of gene regulatory network inference, we present a review of the currently available approaches, their challenges and limitations. We propose guidelines to select the most appropriate method considering the underlying assumptions and fundamental biological and data constraints.
Affiliation(s)
- Sara Barbosa
  - Insilico Biotechnology AG, Meitnerstrasse 9, 70563 Stuttgart, Germany
- Bastian Niebel
  - Insilico Biotechnology AG, Meitnerstrasse 9, 70563 Stuttgart, Germany
- Sebastian Wolf
  - Insilico Biotechnology AG, Meitnerstrasse 9, 70563 Stuttgart, Germany
- Klaus Mauch
  - Insilico Biotechnology AG, Meitnerstrasse 9, 70563 Stuttgart, Germany
- Ralf Takors
  - Institute of Biochemical Engineering, University of Stuttgart, Allmandring 31, 70569 Stuttgart, Germany
|
6
|
Cell signaling heterogeneity is modulated by both cell-intrinsic and -extrinsic mechanisms: An integrated approach to understanding targeted therapy. PLoS Biol 2018. [PMID: 29522507 PMCID: PMC5844524 DOI: 10.1371/journal.pbio.2002930] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Indexed: 02/06/2023]
Abstract
During the last decade, our understanding of cancer cell signaling networks has significantly improved, leading to the development of various targeted therapies that have elicited profound but, unfortunately, short-lived responses. This is, in part, due to the fact that these targeted therapies ignore context and average out heterogeneity. Here, we present a mathematical framework that addresses the impact of signaling heterogeneity on targeted therapy outcomes. We employ a simplified oncogenic rat sarcoma (RAS)-driven mitogen-activated protein kinase (MAPK) and phosphoinositide 3-kinase-protein kinase B (PI3K-AKT) signaling pathway in lung cancer as an experimental model system and develop a network model of the pathway. We measure how inhibition of the pathway modulates protein phosphorylation as well as cell viability under different microenvironmental conditions. Training the model on this data using Monte Carlo simulation results in a suite of in silico cells whose relative protein activities and cell viability match experimental observation. The calibrated model predicts distributional responses to kinase inhibitors and suggests drug resistance mechanisms that can be exploited in drug combination strategies. The suggested combination strategies are validated using in vitro experimental data. The validated in silico cells are further interrogated through an unsupervised clustering analysis and then integrated into a mathematical model of tumor growth in a homogeneous and resource-limited microenvironment. We assess posttreatment heterogeneity and predict vast differences across treatments with similar efficacy, further emphasizing that heterogeneity should modulate treatment strategies. The signaling model is also integrated into a hybrid cellular automata (HCA) model of tumor growth in a spatially heterogeneous microenvironment. 
As a proof of concept, we simulate tumor responses to targeted therapies in a spatially segregated tissue structure containing tumor and stroma (derived from patient tissue) and predict complex cell signaling responses that suggest a novel combination treatment strategy.
A signaling pathway is a network of molecules in a cell that is typically initiated by stimuli (e.g., microenvironmental cues) acting on receptors and internal signaling molecules to determine cell fate. Signaling pathways in cancer cells are different from those in normal cells, and this difference helps cancer cells to grow and thrive indefinitely. Drugs that target the aberrant signaling pathways in cancer cells (often referred to as targeted therapy) are promising for improving treatment outcomes of many different cancers in patients. However, most patients eventually develop resistance to these drugs. Resistance may already be present in the tumor or may emerge via mutation or via microenvironmental mediation. Tumor heterogeneity, which is characterized by subtle or dramatic differences among tumor cells, plays a key role in the development of drug resistance. Some tumor cells respond well to therapy, while others may adapt to the stress induced by the drug within the microenvironment. Moreover, removal of drug-sensitive cells may result in the competitive release of drug-resistant cells. Here, we present mathematical models to assess the impact of heterogeneity in signaling pathways within tumor cells on the outcomes of targeted therapy. We consider a simplified version of two well-known signaling pathways that modulate the growth of lung cancer cells. By using different targeted therapies, we quantify the effect of pathway inhibition on protein activity and cell viability and develop a mathematical model of the network, which is trained to reproduce these data and to develop a panel of heterogeneous in silico cells. The model predicts potential mechanisms of drug resistance and proposes combination therapies that are effective across the panel. We validate these combination therapies experimentally using the lung cancer cells and integrate the in silico cells into a computational lung tissue model that explicitly captures the microenvironment of lung cancer. Our results suggest that heterogeneity in both the tumor and microenvironment impacts treatment response in different ways and suggest a novel combination therapy for a better response.
|
7
|
Inference of Biochemical S-Systems via Mixed-Variable Multiobjective Evolutionary Optimization. Computational and Mathematical Methods in Medicine 2017; 2017:3020326. [PMID: 28607576 PMCID: PMC5457779 DOI: 10.1155/2017/3020326] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/30/2016] [Accepted: 04/27/2017] [Indexed: 12/18/2022]
Abstract
Inference of biochemical systems (BSs) from experimental data is important for understanding how biochemical components interact with each other in vivo. However, it is not a trivial task because BSs usually exhibit complex, nonlinear dynamics. As a popular ordinary differential equation (ODE) model, the S-System describes the dynamical properties of BSs by incorporating the power law of biochemical reactions, but it is challenging to fit because it has many parameters to be determined. This work proposes a general method for inferring S-Systems from experimental data, using a biobjective optimization (BOO) model and a specially designed mixed-variable multiobjective evolutionary algorithm (mv-MOEA). Because BSs are generally sparse, we introduce binary variables indicating network connections to eliminate the difficulty of presetting a threshold, and take the data-fitting error and the L0-norm as the two objectives to be minimized in the BOO model. Then, a selection procedure that automatically trades off the two objectives is employed to choose the final inference results from the nondominated solutions obtained by the mv-MOEA. Inference results for the investigated networks demonstrate that our method can identify their dynamical properties well, although the automatic selection procedure sometimes ignores some weak connections in BSs.
|
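The two objectives named in the abstract above, data-fitting error and the L0-norm, lead to a set of nondominated (Pareto-optimal) candidates rather than a single best model. The following is a minimal sketch of that filtering step only; it is not the mv-MOEA itself, and the function names and candidate scores are hypothetical.

```python
def l0_norm(params, tol=0.0):
    """Count parameters whose magnitude exceeds tol (number of active connections)."""
    return sum(1 for p in params if abs(p) > tol)

def nondominated(candidates):
    """Keep (fit_error, l0) pairs that no other candidate beats on both objectives.

    Candidate b dominates a when b is no worse on both objectives and
    strictly better on at least one (both objectives are minimised).
    """
    front = []
    for i, a in enumerate(candidates):
        dominated = any(
            b[0] <= a[0] and b[1] <= a[1] and (b[0] < a[0] or b[1] < a[1])
            for j, b in enumerate(candidates)
            if j != i
        )
        if not dominated:
            front.append(a)
    return front

# Hypothetical (fit_error, number_of_connections) scores for five candidate models.
candidates = [(0.10, 5), (0.12, 3), (0.30, 2), (0.15, 4), (0.10, 6)]
front = nondominated(candidates)
```

A selection rule of the kind the abstract describes would then pick one point from `front`, trading a small increase in error for a sparser network.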
8
|
Jereesh AS, Govindan VK. Immuno-hybrid algorithm: a novel hybrid approach for GRN reconstruction. 3 Biotech 2016; 6:222. [PMID: 28330294 PMCID: PMC5065543 DOI: 10.1007/s13205-016-0536-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2016] [Accepted: 10/03/2016] [Indexed: 11/28/2022]
Abstract
Bio-inspired algorithms are widely used to optimize the model parameters of GRNs. This paper focuses on developing improved versions of bio-inspired algorithms for the specific problem of reconstructing gene regulatory networks. The approach is applied to a data set produced by DNA microarray technology in laboratory experiments. The paper introduces a novel hybrid method that combines the clonal selection algorithm and the BFGS quasi-Newton algorithm. The proposed approach was applied to a real-world E. coli data set and identified most of the relations. The results are also compared with existing methods and shown to be efficient.
Affiliation(s)
- A. S. Jereesh
  - Department of Computer Science, Cochin University of Science and Technology, Cochin, Kerala, India
- V. K. Govindan
  - Department of Computer Science and Engineering, Indian Institute of Information Technology Pala, Kottayam, Kerala, India
|
9
|
Yang B, Liu S, Zhang W. Reverse engineering of gene regulatory network using restricted gene expression programming. J Bioinform Comput Biol 2016; 14:1650021. [PMID: 27338130 DOI: 10.1142/s0219720016500219] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]
Abstract
Inference of gene regulatory networks has become a major area of interest in systems biology over the past decade. In this paper, we present a novel representation of the S-system model, named restricted gene expression programming (RGEP), to infer gene regulatory networks. A new hybrid evolutionary algorithm based on a structure-based evolutionary algorithm and cuckoo search (CS) is proposed to optimize the architecture and the corresponding parameters of the model, respectively. Two synthetic benchmark datasets and one real biological dataset from the SOS DNA repair network in E. coli are used to test the validity of our method. Experimental results demonstrate that our proposed method performs better than previously proposed popular methods.
Affiliation(s)
- Bin Yang
  - School of Information Science and Engineering, Zaozhuang University, Zaozhuang 277160, China
- Sanrong Liu
  - School of Information Science and Engineering, Zaozhuang University, Zaozhuang 277160, China
- Wei Zhang
  - School of Information Science and Engineering, Zaozhuang University, Zaozhuang 277160, China
|
10
|
Mandal S, Khan A, Saha G, Pal RK. Reverse engineering of gene regulatory networks based on S-systems and Bat algorithm. J Bioinform Comput Biol 2016; 14:1650010. [DOI: 10.1142/s0219720016500104] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 11/18/2022]
Abstract
The correct inference of gene regulatory networks for the understanding of the intricacies of the complex biological regulations remains an intriguing task for researchers. With the availability of large dimensional microarray data, relationships among thousands of genes can be simultaneously extracted. Among the prevalent models of reverse engineering genetic networks, S-system is considered to be an efficient mathematical tool. In this paper, Bat algorithm, based on the echolocation of bats, has been used to optimize the S-system model parameters. A decoupled S-system has been implemented to reduce the complexity of the algorithm. Initially, the proposed method has been successfully tested on an artificial network with and without the presence of noise. Based on the fact that a real-life genetic network is sparsely connected, a novel Accumulative Cardinality based decoupled S-system has been proposed. The cardinality has been varied from zero up to a maximum value, and this model has been implemented for the reconstruction of the DNA SOS repair network of Escherichia coli. The obtained results have shown significant improvements in the detection of a greater number of true regulations, and in the minimization of false detections compared to other existing methods.
Affiliation(s)
- Sudip Mandal
  - Electronics and Communication Engineering Department, Global Institute of Management and Technology, Krishnanagar, West Bengal 741102, India
- Abhinandan Khan
  - Computer Science and Engineering Department, University of Calcutta, Acharya Prafulla Chandra Siksha Prangan, JD-2, Sector – III, Salt Lake, Kolkata 700098, India
- Goutam Saha
  - Information Technology Department, North Eastern Hill University, Umshing, Mawkynroh, Shillong, Meghalaya 793022, India
- Rajat Kumar Pal
  - Computer Science and Engineering Department, University of Calcutta, Acharya Prafulla Chandra Siksha Prangan, JD-2, Sector – III, Salt Lake, Kolkata 700098, India
|
11
|
Berrones A, Jiménez E, Alcorta-García MA, Almaguer FJ, Peña B. Parameter inference of general nonlinear dynamical models of gene regulatory networks from small and noisy time series. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2015.10.095] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022]
|
12
|
Liu LZ, Wu FX, Zhang WJ. Properties of sparse penalties on inferring gene regulatory networks from time-course gene expression data. IET Syst Biol 2015; 9:16-24. [PMID: 25569860 DOI: 10.1049/iet-syb.2013.0060] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/17/2022]
Abstract
Genes regulate each other and form a gene regulatory network (GRN) to realise biological functions. Elucidating GRNs from experimental data remains a challenging problem in systems biology. Numerous techniques have been developed, and sparse linear regression methods have become a promising approach to inferring accurate GRNs. However, most linear methods are either based on steady-state gene expression data or lack an analysis of their statistical properties. Here, two sparse penalties, the adaptive least absolute shrinkage and selection operator and smoothly clipped absolute deviation, are proposed to infer GRNs from time-course gene expression data based on an auto-regressive model, and their Oracle properties are proved under mild conditions. The effectiveness of those methods is demonstrated by applications to in silico and real biological data.
Affiliation(s)
- Li-Zhi Liu
  - Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
- Fang-Xiang Wu
  - Department of Mechanical Engineering, Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
- Wen-Jun Zhang
  - Department of Mechanical Engineering, Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
|
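The two sparse penalties named in the abstract above can be written down per coefficient. The sketch below uses the commonly cited textbook forms of the SCAD penalty (with the conventional default a = 3.7) and of the adaptive LASSO weight; the function names are hypothetical, and the paper's own estimator and tuning choices are not reproduced here.

```python
def scad_penalty(theta, lam, a=3.7):
    """Smoothly clipped absolute deviation penalty for one coefficient (lam > 0, a > 2)."""
    t = abs(theta)
    if t <= lam:
        # Small coefficients are penalised like the LASSO.
        return lam * t
    if t <= a * lam:
        # Quadratic transition region.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients receive a constant penalty, so they are not shrunk further.
    return lam * lam * (a + 1) / 2

def adaptive_lasso_penalty(theta, lam, initial_estimate, gamma=1.0):
    """LASSO penalty reweighted by 1 / |initial_estimate|**gamma.

    Assumes initial_estimate is non-zero, e.g. from an ordinary least-squares fit.
    """
    return lam * abs(theta) / abs(initial_estimate) ** gamma

small = scad_penalty(0.5, lam=1.0)
large = scad_penalty(10.0, lam=1.0)
weighted = adaptive_lasso_penalty(2.0, lam=0.5, initial_estimate=4.0)
```

Both penalties shrink small coefficients toward zero while penalising large ones less heavily than the plain LASSO does, which is the behaviour behind the Oracle properties the abstract mentions.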
13
|
Clustering and Differential Alignment Algorithm: Identification of Early Stage Regulators in the Arabidopsis thaliana Iron Deficiency Response. PLoS One 2015; 10:e0136591. [PMID: 26317202 PMCID: PMC4552565 DOI: 10.1371/journal.pone.0136591] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 05/08/2015] [Accepted: 08/05/2015] [Indexed: 11/25/2022]
Abstract
Time course transcriptome datasets are commonly used to predict key gene regulators associated with stress responses and to explore gene functionality. Techniques developed to extract causal relationships between genes from high throughput time course expression data are limited by low signal levels coupled with noise and sparseness in time points. We deal with these limitations by proposing the Cluster and Differential Alignment Algorithm (CDAA). This algorithm was designed to process transcriptome data by first grouping genes based on stages of activity and then using similarities in gene expression to predict influential connections between individual genes. Regulatory relationships are assigned based on pairwise alignment scores generated using the expression patterns of two genes and some inferred delay between the regulator and the observed activity of the target. We applied the CDAA to an iron deficiency time course microarray dataset to identify regulators that influence 7 target transcription factors known to participate in the Arabidopsis thaliana iron deficiency response. The algorithm predicted that 7 regulators previously unlinked to iron homeostasis influence the expression of these known transcription factors. We validated over half of predicted influential relationships using qRT-PCR expression analysis in mutant backgrounds. One predicted regulator-target relationship was shown to be a direct binding interaction according to yeast one-hybrid (Y1H) analysis. These results serve as a proof of concept emphasizing the utility of the CDAA for identifying unknown or missing nodes in regulatory cascades, providing the fundamental knowledge needed for constructing predictive gene regulatory networks. We propose that this tool can be used successfully for similar time course datasets to extract additional information and infer reliable regulatory connections for individual genes.
|
15
|
Youseph ASK, Chetty M, Karmakar G. Decoupled Modeling of Gene Regulatory Networks Using Michaelis-Menten Kinetics. Neural Information Processing 2015:497-505. [DOI: 10.1007/978-3-319-26555-1_56] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 09/01/2023]
|
16
|
Hsiao YT, Lee WP. Reverse engineering gene regulatory networks: coupling an optimization algorithm with a parameter identification technique. BMC Bioinformatics 2014; 15 Suppl 15:S8. [PMID: 25474560 PMCID: PMC4271569 DOI: 10.1186/1471-2105-15-s15-s8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/23/2022]
Abstract
Background: To infer gene regulatory networks from time series gene profiles, two important tasks that are related to biological systems must be undertaken. One task is to determine a valid network structure that has topological properties that can influence the network dynamics profoundly. The other task is to optimize the network parameters to minimize the accumulated discrepancy between the gene expression data and the values produced by the inferred network model. Though the above two tasks must be conducted simultaneously, most existing work addresses only one of the tasks.
Results: We propose an iterative approach that couples parameter identification and parameter optimization techniques, to address the two tasks simultaneously during network inference. This approach first identifies the most influential parameters against internal perturbations; this identification is based on sensitivity measurements. Then, a hybrid GA-PSO optimization method infers parameters in accordance with their criticalities. The proposed approach has been applied to several datasets, including subsets of the SOS DNA repair system in E. coli, the Rat central nervous system (CNS), and the protein glycosylation system of yeast S. cerevisiae. The result and analysis show that our approach can infer solutions to satisfy both the requirements of network structure and network behavior.
Conclusions: Network structure is an important though challenging issue to address in inferring sophisticated networks with biological details. In need of prior structural knowledge, we turn to measure parameter sensitivity instead to account for the network structure in an indirect way. By developing an integrated approach for considering both the network structure and behavior in the inference process, we can successfully infer critical gene interactions as well as valid time expression profiles.
|
17
|
Mahdevar G, Nowzari-Dalini A, Sadeghi M. Inferring gene correlation networks from transcription factor binding sites. Genes Genet Syst 2014; 88:301-9. [PMID: 24694393 DOI: 10.1266/ggs.88.301] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/23/2022]
Abstract
Gene expression is a highly regulated biological process that is fundamental to the existence of phenotypes of any living organism. The regulatory relations are usually modeled as a network: every gene is modeled as a node, and relations are shown as edges between two related genes. This paper presents a novel method for inferring correlation networks, i.e., networks constructed by connecting co-expressed genes, by predicting co-expression levels from gene promoter sequences. According to the results, this method works well on biological data, and its outcome is comparable to that of methods that use microarray data as input. The method is written in C++ and is available upon request from the corresponding author.
Affiliation(s)
- Ghasem Mahdevar
  - Department of Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran
|
18
|
Liu LZ, Wu FX, Zhang WJ. A group LASSO-based method for robustly inferring gene regulatory networks from multiple time-course datasets. BMC Systems Biology 2014; 8 Suppl 3:S1. [PMID: 25350697 PMCID: PMC4243122 DOI: 10.1186/1752-0509-8-s3-s1] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Indexed: 11/10/2022]
Abstract
BACKGROUND As an abstract mapping of the gene regulations in the cell, gene regulatory network is important to both biological research study and practical applications. The reverse engineering of gene regulatory networks from microarray gene expression data is a challenging research problem in systems biology. With the development of biological technologies, multiple time-course gene expression datasets might be collected for a specific gene network under different circumstances. The inference of a gene regulatory network can be improved by integrating these multiple datasets. It is also known that gene expression data may be contaminated with large errors or outliers, which may affect the inference results. RESULTS A novel method, Huber group LASSO, is proposed to infer the same underlying network topology from multiple time-course gene expression datasets as well as to take the robustness to large error or outliers into account. To solve the optimization problem involved in the proposed method, an efficient algorithm which combines the ideas of auxiliary function minimization and block descent is developed. A stability selection method is adapted to our method to find a network topology consisting of edges with scores. The proposed method is applied to both simulation datasets and real experimental datasets. It shows that Huber group LASSO outperforms the group LASSO in terms of both areas under receiver operating characteristic curves and areas under the precision-recall curves. CONCLUSIONS The convergence analysis of the algorithm theoretically shows that the sequence generated from the algorithm converges to the optimal solution of the problem. The simulation and real data examples demonstrate the effectiveness of the Huber group LASSO in integrating multiple time-course gene expression datasets and improving the resistance to large errors or outliers.
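The shape of the objective described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm (which uses auxiliary function minimization and block descent); it only evaluates a Huber data-fit term plus a group penalty. The function names, the `delta` and `lam` parameters, and the assumption that each group holds the coefficients of one candidate edge across all datasets are illustrative choices, not taken from the paper.

```python
import math

def huber_loss(residual, delta=1.0):
    """Huber loss: quadratic for small residuals, linear for large ones (outliers)."""
    r = abs(residual)
    if r <= delta:
        return 0.5 * r * r
    return delta * (r - 0.5 * delta)

def group_penalty(groups, lam=1.0):
    """Group LASSO penalty: lam times the sum of Euclidean norms of the groups.

    Assumption: each group collects the coefficients of one candidate edge
    across all datasets, so an edge is kept or dropped in all datasets together.
    """
    return lam * sum(math.sqrt(sum(w * w for w in g)) for g in groups)

def objective(residuals, groups, delta=1.0, lam=1.0):
    """Robust data-fit term plus sparsity-inducing group penalty."""
    return sum(huber_loss(r, delta) for r in residuals) + group_penalty(groups, lam)
```

With `delta=1.0`, a residual of 3.0 contributes 2.5 rather than the 4.5 a squared-error loss would give, which is how large errors are down-weighted.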
|
19
|
Lee WP, Hsiao YT, Hwang WC. Designing a parallel evolutionary algorithm for inferring gene networks on the cloud computing environment. BMC SYSTEMS BIOLOGY 2014; 8:5. [PMID: 24428926 PMCID: PMC3900469 DOI: 10.1186/1752-0509-8-5] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 06/04/2013] [Accepted: 01/06/2014] [Indexed: 11/10/2022]
Abstract
Background To improve the tedious task of reconstructing gene networks through testing experimentally the possible interactions between genes, it becomes a trend to adopt the automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks by the evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, it is advocated to adopt the mechanism of cloud computing as a promising solution: most popular is the method of MapReduce programming model, a fault-tolerant framework to implement parallel algorithms for inferring large gene networks. Results This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use a well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Conclusions Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation. By coupling the parallel model population-based optimization method and the parallel computational framework, high quality solutions can be obtained within relatively short time. This integrated approach is a promising way for inferring large networks.
Affiliation(s)
- Wei-Po Lee
- Department of Information Management, National Sun Yat-sen University, Kaohsiung, Taiwan.
|
20
|
Inference of Vohradský's models of genetic networks by solving two-dimensional function optimization problems. PLoS One 2014; 8:e83308. [PMID: 24386175 PMCID: PMC3875442 DOI: 10.1371/journal.pone.0083308] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 07/08/2013] [Accepted: 11/01/2013] [Indexed: 11/21/2022]
Abstract
The inference of a genetic network is a problem in which mutual interactions among genes are inferred from time-series of gene expression levels. While a number of models have been proposed to describe genetic networks, this study focuses on a mathematical model proposed by Vohradský. Because of its advantageous features, several researchers have proposed inference methods based on Vohradský's model. When trying to analyze large-scale networks consisting of dozens of genes, however, these methods must solve high-dimensional non-linear function optimization problems. In order to resolve the difficulty of estimating the parameters of Vohradský's model, this study proposes a new method that defines the problem as several two-dimensional function optimization problems. Through numerical experiments on artificial genetic network inference problems, we showed that, although the computation time of the proposed method is not the shortest, the method has the ability to estimate parameters of Vohradský's models more effectively with sufficiently short computation times. This study then applied the proposed method to an actual inference problem of the bacterial SOS DNA repair system, and succeeded in finding several reasonable regulations.
|
21
|
Reconstructing biological gene regulatory networks: where optimization meets big data. EVOLUTIONARY INTELLIGENCE 2013. [DOI: 10.1007/s12065-013-0098-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 10/26/2022]
|
22
|
Recognition of multiple imbalanced cancer types based on DNA microarray data using ensemble classifiers. BIOMED RESEARCH INTERNATIONAL 2013; 2013:239628. [PMID: 24078908 PMCID: PMC3770038 DOI: 10.1155/2013/239628] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 04/07/2013] [Revised: 07/08/2013] [Accepted: 07/17/2013] [Indexed: 11/24/2022]
Abstract
DNA microarray technology can measure the activities of tens of thousands of genes simultaneously, which provides an efficient way to diagnose cancer at the molecular level. Although this strategy has attracted significant research attention, most studies neglect an important problem, namely, that most DNA microarray datasets are skewed, which causes traditional learning algorithms to produce inaccurate results. Some studies have considered this problem, yet they focus only on binary-class problems. In this paper, we dealt with the multiclass imbalanced classification problem, as encountered in cancer DNA microarray data, by using ensemble learning. We utilized a one-against-all coding strategy to transform the multiclass problem into multiple binary-class problems, each of which carries out feature subspace, an evolving version of random subspace that generates multiple diverse training subsets. Next, we introduced one of two different correction technologies, namely, decision threshold adjustment or random undersampling, into each training subset to alleviate the damage of class imbalance. Specifically, a support vector machine was used as the base classifier, and a novel voting rule called counter voting was presented for making a final decision. Experimental results on eight skewed multiclass cancer microarray datasets indicate that, unlike many traditional classification approaches, our methods are insensitive to class imbalance.
|
23
|
Chowdhury AR, Chetty M, Vinh NX. Incorporating time-delays in S-System model for reverse engineering genetic networks. BMC Bioinformatics 2013; 14:196. [PMID: 23777625 PMCID: PMC3839642 DOI: 10.1186/1471-2105-14-196] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 01/11/2013] [Accepted: 06/07/2013] [Indexed: 11/10/2022]
Abstract
Background In any gene regulatory network (GRN), the complex interactions occurring amongst transcription factors and target genes can be either instantaneous or time-delayed. However, many existing modeling approaches currently applied for inferring GRNs are unable to represent both these interactions simultaneously. As a result, all these approaches cannot detect important interactions of the other type. S-System model, a differential equation based approach which has been increasingly applied for modeling GRNs, also suffers from this limitation. In fact, all S-System based existing modeling approaches have been designed to capture only instantaneous interactions, and are unable to infer time-delayed interactions. Results In this paper, we propose a novel Time-Delayed S-System (TDSS) model which uses a set of delay differential equations to represent the system dynamics. The ability to incorporate time-delay parameters in the proposed S-System model enables simultaneous modeling of both instantaneous and time-delayed interactions. Furthermore, the delay parameters are not limited to just positive integer values (corresponding to time stamps in the data), but can also take fractional values. Moreover, we also propose a new criterion for model evaluation exploiting the sparse and scale-free nature of GRNs to effectively narrow down the search space, which not only reduces the computation time significantly but also improves model accuracy. The evaluation criterion systematically adapts the max-min in-degrees and also systematically balances the effect of network accuracy and complexity during optimization. Conclusion The four well-known performance measures applied to the experimental studies on synthetic networks with various time-delayed regulations clearly demonstrate that the proposed method can capture both instantaneous and delayed interactions correctly with high precision. The experiments carried out on two well-known real-life networks, namely IRMA and SOS DNA repair network in Escherichia coli show a significant improvement compared with other state-of-the-art approaches for GRN modeling.
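For orientation, the standard S-system form and one plausible way to introduce delays can be written as follows. The first equation is the usual power-law formulation; the second is only a sketch of how delay terms might enter, and the symbols $\tau^{g}_{ij}$ and $\tau^{h}_{ij}$ are illustrative notation assumed here, not necessarily the notation used in the TDSS paper.

```latex
% Standard S-system for N genes: production minus degradation, both power laws
\frac{dX_i}{dt} = \alpha_i \prod_{j=1}^{N} X_j(t)^{g_{ij}}
                - \beta_i \prod_{j=1}^{N} X_j(t)^{h_{ij}},
\qquad i = 1, \dots, N

% Assumed delayed variant: each regulator is read at an earlier time;
% a delay of zero recovers an instantaneous interaction
\frac{dX_i}{dt} = \alpha_i \prod_{j=1}^{N} X_j\!\left(t - \tau^{g}_{ij}\right)^{g_{ij}}
                - \beta_i \prod_{j=1}^{N} X_j\!\left(t - \tau^{h}_{ij}\right)^{h_{ij}}
```

Here $\alpha_i, \beta_i \ge 0$ are rate constants and $g_{ij}, h_{ij}$ are kinetic orders; a nonzero $g_{ij}$ or $h_{ij}$ indicates that gene $j$ regulates gene $i$.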
Affiliation(s)
- Ahsan Raja Chowdhury
- Gippsland School of Information Technology, Monash University, Churchill, Victoria-3842, Australia.
|
24
|
Chemmangattuvalappil N, Task K, Banerjee I. An integer optimization algorithm for robust identification of non-linear gene regulatory networks. BMC SYSTEMS BIOLOGY 2012; 6:119. [PMID: 22937832 PMCID: PMC3444924 DOI: 10.1186/1752-0509-6-119] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 03/08/2012] [Accepted: 08/27/2012] [Indexed: 11/16/2022]
Abstract
Background Reverse engineering gene networks and identifying regulatory interactions are integral to understanding cellular decision making processes. Advancement in high throughput experimental techniques has initiated innovative data driven analysis of gene regulatory networks. However, inherent noise associated with biological systems requires numerous experimental replicates for reliable conclusions. Furthermore, robust algorithms that directly exploit basic biological traits are few. Such algorithms are expected to be efficient in their performance and robust in their prediction. Results We have developed a network identification algorithm to accurately infer both the topology and strength of regulatory interactions from time series gene expression data in the presence of significant experimental noise and non-linear behavior. In this novel formalism, we have addressed data variability in biological systems by integrating network identification with the bootstrap resampling technique, hence predicting robust interactions from limited experimental replicates subjected to noise. Furthermore, we have incorporated non-linearity in gene dynamics using the S-system formulation. The basic network identification formulation exploits the trait of sparsity of biological interactions. Towards that, the identification algorithm is formulated as an integer-programming problem by introducing binary variables for each network component. The objective function is targeted to minimize the network connections subject to the constraint of maximal agreement between the experimental and predicted gene dynamics. The developed algorithm is validated using both in silico and experimental data-sets. These studies show that the algorithm can accurately predict the topology and connection strength of the in silico networks, as quantified by high precision and recall, and small discrepancy between the actual and predicted kinetic parameters. Furthermore, in both the in silico and experimental case studies, the predicted gene expression profiles are in very close agreement with the dynamics of the input data. Conclusions Our integer programming algorithm effectively utilizes bootstrapping to identify robust gene regulatory networks from noisy, non-linear time-series gene expression data. With significant noise and non-linearities being inherent to biological systems, the present formalism, with the incorporation of network sparsity, is extremely relevant to gene regulatory networks, and while the formulation has been validated against in silico and E. coli data, it can be applied to any biological system.
Affiliation(s)
- Nishanth Chemmangattuvalappil
- Department of Chemical Engineering, University of Pittsburgh, 1249 Benedum Hall, 3700 O'Hara Street, Pittsburgh, PA 15261, USA
|
25
|
Morshed N, Chetty M, Nguyen XV. Simultaneous learning of instantaneous and time-delayed genetic interactions using novel information theoretic scoring technique. BMC SYSTEMS BIOLOGY 2012; 6:62. [PMID: 22691450 PMCID: PMC3529704 DOI: 10.1186/1752-0509-6-62] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 12/17/2011] [Accepted: 06/06/2012] [Indexed: 11/10/2022]
Abstract
Background Understanding gene interactions is a fundamental question in systems biology. Currently, modeling of gene regulations using the Bayesian Network (BN) formalism assumes that genes interact either instantaneously or with a certain amount of time delay. However in reality, biological regulations, both instantaneous and time-delayed, occur simultaneously. A framework that can detect and model both these two types of interactions simultaneously would represent gene regulatory networks more accurately. Results In this paper, we introduce a framework based on the Bayesian Network (BN) formalism that can represent both instantaneous and time-delayed interactions between genes simultaneously. A novel scoring metric having firm mathematical underpinnings is also proposed that, unlike other recent methods, can score both interactions concurrently and takes into account the reality that multiple regulators can regulate a gene jointly, rather than in an isolated pair-wise manner. Further, a gene regulatory network (GRN) inference method employing an evolutionary search that makes use of the framework and the scoring metric is also presented. Conclusion By taking into consideration the biological fact that both instantaneous and time-delayed regulations can occur among genes, our approach models gene interactions with greater accuracy. The proposed framework is efficient and can be used to infer gene networks having multiple orders of instantaneous and time-delayed regulations simultaneously. Experiments are carried out using three different synthetic networks (with three different mechanisms for generating synthetic data) as well as real life networks of Saccharomyces cerevisiae, E. coli and cyanobacteria gene expression data. The results show the effectiveness of our approach.
Affiliation(s)
- Nizamul Morshed
- Gippsland School of Information Technology, Faculty of Information Technology, Monash University, VIC 3842, Northways Road, Australia.
|
26
|
Issues impacting genetic network reverse engineering algorithm validation using small networks. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2012; 1824:1434-41. [PMID: 22683439 DOI: 10.1016/j.bbapap.2012.05.017] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 02/23/2012] [Revised: 05/15/2012] [Accepted: 05/31/2012] [Indexed: 11/22/2022]
Abstract
Genetic network reverse engineering has been an area of intensive research within the systems biology community during the last decade. With many techniques currently available, the task of validating them and choosing the best one for a certain problem is a complex issue. Current practice has been to validate an approach on in-silico synthetic data sets, and, wherever possible, on real data sets with known ground-truth. In this study, we highlight a major issue that the validation of reverse engineering algorithms on small benchmark networks very often results in networks which are not statistically better than a randomly picked network. Another important issue highlighted is that with short time series, a small variation in the pre-processing procedure might yield large differences in the inferred networks. To demonstrate these issues, we have selected as our case study the IRMA in-vivo synthetic yeast network recently published in Cell. Using Fisher's exact test, we show that many results reported in the literature on reverse-engineering this network are not significantly better than random. The discussion is further extended to some other networks commonly used for validation purposes in the literature. The results presented in this study emphasize that studies carried out using small genetic networks are likely to be trivial, making it imperative that larger real networks be used for validating and benchmarking purposes. If smaller networks are considered, then the results should be interpreted carefully to avoid over confidence. This article is part of a Special Issue entitled: Computational Methods for Protein Interaction and Structural Prediction.
|
27
|
Yang X, Dent JE, Nardini C. An S-System Parameter Estimation Method (SPEM) for biological networks. J Comput Biol 2012; 19:175-87. [PMID: 22300319 DOI: 10.1089/cmb.2011.0269] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 11/12/2022]
Abstract
Advances in experimental biology, coupled with advances in computational power, bring new challenges to the interdisciplinary field of computational biology. One such broad challenge lies in the reverse engineering of gene networks, and goes from determining the structure of static networks, to reconstructing the dynamics of interactions from time series data. Here, we focus our attention on the latter area, and in particular, on parameterizing a dynamic network of oriented interactions between genes. By basing the parameterizing approach on a known power-law relationship model between connected genes (S-system), we are able to account for non-linearity in the network, without compromising the ability to analyze network characteristics. In this article, we introduce the S-System Parameter Estimation Method (SPEM). SPEM, a freely available R software package (http://www.picb.ac.cn/ClinicalGenomicNTW/temp3.html), takes gene expression data in time series and returns the network of interactions as a set of differential equations. The methods, which are presented and tested here, are shown to provide accurate results not only on synthetic data but, more importantly, on real biological data, which are noisy by nature. In summary, SPEM shows high sensitivity and positive predictive values, as well as free availability and expansibility (because it is based on open-source software). We expect these characteristics to make it a useful and broadly applicable software package for the challenging reconstruction of dynamic gene networks.
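To make the "set of differential equations" concrete, the following is a minimal sketch of an S-system right-hand side and a forward-Euler simulation. It is not SPEM (which is an R package and estimates parameters rather than simulating); the function names, the Euler scheme, the step size, and the small positive floor on concentrations are all assumptions made for illustration.

```python
def s_system_rhs(x, alpha, beta, g, h):
    """dX_i/dt = alpha_i * prod_j X_j**g[i][j] - beta_i * prod_j X_j**h[i][j]."""
    n = len(x)
    dx = []
    for i in range(n):
        production = alpha[i]
        degradation = beta[i]
        for j in range(n):
            production *= x[j] ** g[i][j]
            degradation *= x[j] ** h[i][j]
        dx.append(production - degradation)
    return dx

def simulate(x0, alpha, beta, g, h, dt=0.01, steps=1000):
    """Forward-Euler integration; returns the list of states, including x0."""
    x = list(x0)
    trajectory = [list(x)]
    for _ in range(steps):
        dx = s_system_rhs(x, alpha, beta, g, h)
        # Floor at a tiny positive value so negative kinetic orders stay defined.
        x = [max(xi + dt * di, 1e-12) for xi, di in zip(x, dx)]
        trajectory.append(list(x))
    return trajectory
```

For a single gene with constant production (`alpha=[2.0]`, `g=[[0.0]]`) and first-order degradation (`beta=[1.0]`, `h=[[1.0]]`), the equation reduces to dX/dt = 2 - X, so the simulated level settles near 2. In a multi-gene model, a positive `g[i][j]` would be read as gene j activating gene i and a negative one as repression.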
Affiliation(s)
- Xinyi Yang
- Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, PR China
|
28
|
Hsiao YT, Lee WP. Inferring robust gene networks from expression data by a sensitivity-based incremental evolution method. BMC Bioinformatics 2012; 13 Suppl 7:S8. [PMID: 22595005 PMCID: PMC3348052 DOI: 10.1186/1471-2105-13-s7-s8] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 01/17/2023]
Abstract
Background Reconstructing gene regulatory networks (GRNs) from expression data is one of the most important challenges in systems biology research. Many computational models and methods have been proposed to automate the process of network reconstruction. Inferring robust networks with desired behaviours remains challenging, however. This problem is related to network dynamics but has yet to be investigated using network modeling. Results We propose an incremental evolution approach for inferring GRNs that takes network robustness into consideration and can deal with a large number of network parameters. Our approach includes a sensitivity analysis procedure to iteratively select the most influential network parameters, and it uses a swarm intelligence procedure to perform parameter optimization. We have conducted a series of experiments to evaluate the external behaviors and internal robustness of the networks inferred by the proposed approach. The results and analyses have verified the effectiveness of our approach. Conclusions Sensitivity analysis is crucial to identifying the most sensitive parameters that govern the network dynamics. It can further be used to derive constraints for network parameters in the network reconstruction process. The experimental results show that the proposed approach can successfully infer robust GRNs with desired system behaviors.
Affiliation(s)
- Yu-Ting Hsiao
- Department of Information Management, National Sun Yat-sen University, 70, Lienhai Road, Kaohsiung, Taiwan
|
29
|
|
30
|
Kimura S, Araki D, Matsumura K, Okada-Hatakeyama M. Inference of S-system models of genetic networks by solving one-dimensional function optimization problems. Math Biosci 2012; 235:161-70. [PMID: 22155075 DOI: 10.1016/j.mbs.2011.11.008] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 07/21/2011] [Revised: 10/21/2011] [Accepted: 11/22/2011] [Indexed: 11/17/2022]
Affiliation(s)
- S Kimura
- Graduate School of Engineering, Tottori University, 4-101, Koyama-minami, Tottori 680-8552, Japan.
|
31
|
Nazri A, Lio P. Investigating meta-approaches for reconstructing gene networks in a mammalian cellular context. PLoS One 2012; 7:e28713. [PMID: 22253694 PMCID: PMC3253778 DOI: 10.1371/journal.pone.0028713] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 05/19/2011] [Accepted: 11/14/2011] [Indexed: 11/18/2022]
Abstract
The output of state-of-the-art reverse-engineering methods for biological networks is often based on the fitting of a mathematical model to the data. Typically, different datasets do not give single consistent network predictions but rather an ensemble of inconsistent networks inferred under the same reverse-engineering method that are only consistent with the specific experimentally measured data. Here, we focus on an alternative approach, called meta-analysis, for combining the information contained within such an ensemble of inconsistent gene networks, to make more accurate predictions and to estimate the reliability of these predictions. We review two existing meta-analysis approaches, the Fisher transformation combined coefficient test (FTCCT) and Fisher's inverse combined probability test (FICPT), and compare their performance with five well-known methods: ARACNe, Context Likelihood of Relatedness network (CLR), Maximum Relevance Minimum Redundancy (MRNET), Relevance Network (RN) and Bayesian Network (BN). We conducted in-depth numerical ensemble simulations and demonstrated for biological expression data that the meta-analysis approaches consistently outperformed the best gene regulatory network inference (GRNI) methods in the literature. Furthermore, the meta-analysis approaches have a low computational complexity. We conclude that the meta-analysis approaches are a powerful tool for integrating different datasets to give more accurate and reliable predictions for biological networks.
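As an illustration of the combined probability idea, the sketch below implements the classical Fisher method for pooling independent p-values for one candidate edge across datasets. Whether the paper's FICPT matches this textbook form in every detail is an assumption; the function name is hypothetical, and the p-values are assumed to be independent and strictly positive.

```python
import math

def fisher_combined_pvalue(pvalues):
    """Fisher's method: X = -2 * sum(ln p_i) follows chi-square with 2k d.o.f.

    Returns (statistic, combined_p). For even degrees of freedom 2k the
    chi-square survival function has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)**i / i!, so no SciPy is needed.
    """
    k = len(pvalues)
    statistic = -2.0 * sum(math.log(p) for p in pvalues)
    half = statistic / 2.0
    term = 1.0   # (x/2)**0 / 0!
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total
```

For example, two datasets that each give p = 0.05 for the same edge combine to a p-value of about 0.0175, so weak but consistent evidence is strengthened by pooling.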
Affiliation(s)
- Azree Nazri
- Department of Computer Science, Faculty of Computer Science & Information Technology, University Putra Malaysia, Malaysia, Selangor, Malaysia.
|
32
|
Kentzoglanakis K, Poole M. A swarm intelligence framework for reconstructing gene networks: searching for biologically plausible architectures. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2011; 9:358-371. [PMID: 21576756 DOI: 10.1109/tcbb.2011.87] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Indexed: 05/30/2023]
Abstract
In this paper, we investigate the problem of reverse engineering the topology of gene regulatory networks from temporal gene expression data. We adopt a computational intelligence approach comprising swarm intelligence techniques, namely particle swarm optimization (PSO) and ant colony optimization (ACO). In addition, the recurrent neural network (RNN) formalism is employed for modeling the dynamical behavior of gene regulatory systems. More specifically, ACO is used for searching the discrete space of network architectures and PSO for searching the corresponding continuous space of RNN model parameters. We propose a novel solution construction process in the context of ACO for generating biologically plausible candidate architectures. The objective is to concentrate the search effort into areas of the structure space that contain architectures which are feasible in terms of their topological resemblance to real-world networks. The proposed framework is initially applied to the reconstruction of a small artificial network that has previously been studied in the context of gene network reverse engineering. Subsequently, we consider an artificial data set with added noise for reconstructing a subnetwork of the genetic interaction network of S. cerevisiae (yeast). Finally, the framework is applied to a real-world data set for reverse engineering the SOS response system of the bacterium Escherichia coli. Results demonstrate the relative advantage of utilizing problem-specific knowledge regarding biologically plausible structural properties of gene networks over conducting a problem-agnostic search in the vast space of network architectures.
|