1. Zhang J, Hu C, Zhang Q. Gene regulatory network inference based on a nonhomogeneous dynamic Bayesian network model with an improved Markov Monte Carlo sampling. BMC Bioinformatics 2023; 24:264. [PMID: 37355560 DOI: 10.1186/s12859-023-05381-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2023] [Accepted: 06/07/2023] [Indexed: 06/26/2023]
Abstract
A nonhomogeneous dynamic Bayesian network model, which combines the dynamic Bayesian network with a multi-change point process, partly overcomes the limitations of the dynamic Bayesian network in modeling non-stationary gene expression data. However, problems persist, such as low network reconstruction accuracy and poor model convergence. We therefore propose an MD-birth move, based on the Manhattan distance of the data points, to make the multi-change point process more reasonable. The underlying concept of the MD-birth move is that the change point is assumed to move in the direction with the larger Manhattan distance between the variance and the mean of its left and right data points. To account for the instability of the data, we also propose a Markov chain Monte Carlo sampling method based on node-dependent particle filtering, in addition to the multi-change point process. The particle filter pushes candidate parent node sets that are close to the real state into the high-probability area and those that are far from the real state into the low-probability area before sampling. In reconstructing gene regulatory networks, the proposed model (FC-DBN) achieves better network reconstruction accuracy and model convergence speed than other corresponding models on the Saccharomyces cerevisiae data and RAF data.
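The abstract does not spell out how the Manhattan distance is computed, so the following is only a minimal sketch of one possible reading: for a candidate change point, summarise the data on each side by its (mean, variance) pair and take the L1 distance between the two pairs. The function names, the `min_len` parameter, and the toy series are hypothetical, and how such scores would enter the birth-move proposal of the sampler is not shown.

```python
import numpy as np

def manhattan_split_score(series, t):
    """L1 distance between the (mean, variance) summaries of series[:t] and series[t:].

    Assumption: this is one reading of "Manhattan distance between the variance
    and the mean of the left and right data points", not the authors' code.
    """
    left, right = series[:t], series[t:]
    return abs(left.mean() - right.mean()) + abs(left.var() - right.var())

def candidate_scores(series, min_len=2):
    """Score every split that leaves at least min_len points on each side."""
    series = np.asarray(series, dtype=float)
    return {
        t: manhattan_split_score(series, t)
        for t in range(min_len, len(series) - min_len + 1)
    }

# Toy expression profile with a level shift after the fourth observation.
series = np.array([0.0, 0.1, -0.1, 0.0, 5.0, 5.2, 4.9, 5.1])
scores = candidate_scores(series)
for t, score in scores.items():
    print(t, round(score, 4))
```

Because the variance term grows sharply when a segment straddles the level shift, the raw score is not simply maximised at the true shift; any real proposal scheme would need to decide how to weight or normalise the two terms.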
Affiliation(s)
- Jiayao Zhang
  - College of Artificial Intelligence and Big Data, Hefei University, Hefei, 230031, China
- Chunling Hu
  - College of Artificial Intelligence and Big Data, Hefei University, Hefei, 230031, China
- Qianqian Zhang
  - College of Artificial Intelligence and Big Data, Hefei University, Hefei, 230031, China
2. Wang Q, Guo M, Chen J, Duan R. A gene regulatory network inference model based on pseudo-siamese network. BMC Bioinformatics 2023; 24:163. [PMID: 37085776 PMCID: PMC10122305 DOI: 10.1186/s12859-023-05253-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2022] [Accepted: 03/24/2023] [Indexed: 04/23/2023]
Abstract
MOTIVATION Gene regulatory networks (GRNs) arise from the intricate interactions between transcription factors (TFs) and their target genes during the growth and development of organisms. The inference of GRNs can unveil the underlying gene interactions in living systems and facilitate the investigation of the relationship between gene expression patterns and phenotypic traits. Although several machine-learning models have been proposed for inferring GRNs from single-cell RNA sequencing (scRNA-seq) data, some of these models, such as Boolean and tree-based networks, suffer from sensitivity to noise and may encounter difficulties in handling the high noise and dimensionality of actual scRNA-seq data, as well as the sparse nature of gene regulation relationships. Thus, inferring large-scale information from GRNs remains a formidable challenge. RESULTS This study proposes a multilevel, multi-structure framework called a pseudo-Siamese GRN (PSGRN) for inferring large-scale GRNs from time-series expression datasets. Based on the pseudo-Siamese network, we applied a gated recurrent unit to capture the time features of each TF and target matrix and learn the spatial features of the matrices after merging by applying the DenseNet framework. Finally, we applied a sigmoid function to evaluate interactions. We constructed two maize sub-datasets, including gene expression levels and GRNs, using existing open-source maize multi-omics data and compared them to other GRN inference methods, including GENIE3, GRNBoost2, nonlinear ordinary differential equations, CNNC, and DGRNS. Our results show that PSGRN outperforms state-of-the-art methods. This study proposed a new framework: a PSGRN that allows GRNs to be inferred from scRNA-seq data, elucidating the temporal and spatial features of TFs and their target genes. The results show the model's robustness and generalization, laying a theoretical foundation for maize genotype-phenotype associations with implications for breeding work.
Affiliation(s)
- Qian Wang
  - School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
- Maozu Guo
  - School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
- Jian Chen
  - College of Agronomy and Biotechnology, China Agricultural University, Beijing, China
- Ran Duan
  - School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
3. Nabuco Leva Ferreira de Freitas JA, Bischof O. Dynamic modeling of the cellular senescence gene regulatory network. Heliyon 2023; 9:e14007. [PMID: 36938415 PMCID: PMC10015196 DOI: 10.1016/j.heliyon.2023.e14007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2022] [Revised: 02/13/2023] [Accepted: 02/17/2023] [Indexed: 02/27/2023]
Abstract
Cellular senescence is a cell fate that prominently impacts physiological and pathophysiological processes. Diverse cellular stresses induce it, and dramatic gene expression changes accompany it. However, determining the interactions comprising the gene regulatory network (GRN) governing senescence remains challenging. Recent advances in signal processing techniques provide opportunities to reconstruct GRNs. Here, we describe a GRN for senescence integrating time-series transcriptome and transcription factor depletion datasets. Specifically, we infer a set of differential equations using the "Sparse Identification of Nonlinear Dynamics" (SINDy) algorithm, discriminate genes with potential hidden regulators, validate the inferred GRN for time-points not included in the training data, and comprehensively benchmark our approach. Our work is a proof of concept for a data-driven GRN reconstruction method, consolidating an iterative, powerful mathematical platform for senescence modeling that can be used to test hypotheses in silico and has the potential for future discoveries of clinical impact.
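The core of SINDy is a sparse regression of measured derivatives onto a library of candidate terms. The sketch below uses sequentially thresholded least squares written directly in NumPy on a made-up two-gene system; the library terms, threshold, and coefficients are illustrative assumptions and have no connection to the senescence data or the authors' pipeline.

```python
import numpy as np

def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        for j in range(dxdt.shape[1]):
            keep = ~small[:, j]
            if keep.any():
                # Refit only the surviving library terms for state j.
                xi[keep, j] = np.linalg.lstsq(
                    theta[:, keep], dxdt[:, j], rcond=None
                )[0]
    return xi

# Hypothetical two-gene system: dx/dt = -0.5 x, dy/dt = 1.0 x - 0.3 y.
rng = np.random.default_rng(0)
states = rng.uniform(0.0, 2.0, size=(50, 2))
x, y = states[:, 0], states[:, 1]
dxdt = np.column_stack([-0.5 * x, 1.0 * x - 0.3 * y])

# Candidate library: constant, x, y, and the interaction term x*y.
theta = np.column_stack([np.ones_like(x), x, y, x * y])
xi = stlsq(theta, dxdt)
print(xi)  # rows follow the library order, columns the two genes
```

Each nonzero entry of `xi` corresponds to a term retained in the inferred differential equation, which is how a sparse fit of this kind can be read as a regulatory edge. With real expression data the derivatives would have to be estimated from noisy time points, which this sketch sidesteps by using exact derivatives.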
Affiliation(s)
- José Américo Nabuco Leva Ferreira de Freitas
  - IMRB, Mondor Institute for Biomedical Research, INSERM U955 – Université Paris Est Créteil, UPEC, Faculté de Médecine de Créteil 8, rue du Général Sarrail, 94010 Créteil
  - Sorbonne Université, UMR 8256, Biological Adaptation and Ageing B2A–IBPS, F-75005, Paris, France
  - INSERM U1164, F-75005, Paris, France
- Oliver Bischof
  - IMRB, Mondor Institute for Biomedical Research, INSERM U955 – Université Paris Est Créteil, UPEC, Faculté de Médecine de Créteil 8, rue du Général Sarrail, 94010 Créteil
  - Corresponding author.
4. Ma B, Fang M, Jiao X. Inference of gene regulatory networks based on nonlinear ordinary differential equations. Bioinformatics 2020; 36:4885-4893. [DOI: 10.1093/bioinformatics/btaa032] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 09/10/2019] [Revised: 12/30/2019] [Accepted: 01/15/2020] [Indexed: 01/05/2023]
Abstract
Motivation
Gene regulatory networks (GRNs) capture the regulatory interactions between genes, resulting from the fundamental biological processes of transcription and translation. In some cases, the topology of a GRN is not known and has to be inferred from gene expression data. Most existing GRN reconstruction algorithms are applied to either time-series data or steady-state data. Although time-series data include more information about the system dynamics, steady-state data imply stability of the underlying regulatory networks.
Results
In this article, we propose a method for inferring GRNs from time-series and steady-state data jointly. We use a non-linear ordinary differential equations framework to model dynamic gene regulation and an importance measurement strategy to infer all putative regulatory links efficiently. The proposed method is evaluated extensively on the artificial DREAM4 dataset and two real gene expression datasets of yeast and Escherichia coli. On public benchmark datasets, the proposed method outperforms other popular inference algorithms in terms of overall score. A comparison of performance on datasets of different scales shows that our method maintains good robustness and accuracy at a low computational complexity.
Availability and implementation
The proposed method is written in the Python language, and is available at: https://github.com/lab319/GRNs_nonlinear_ODEs
Supplementary information
Supplementary data are available at Bioinformatics online.
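To make the idea of an ODE framework paired with an importance measure concrete, here is a small sketch under stated assumptions. It discretises a hypothetical model dx_i/dt = -decay * x_i + f_i(x), fits f_i with a quadratic least-squares model, and ranks candidate regulators by permutation importance. The quadratic model and permutation importance are stand-ins chosen to keep the example dependency-free; the regressor, the importance measure, and the joint use of steady-state data in the published method are not reproduced here, and all names and parameters are illustrative.

```python
import numpy as np

def ode_targets(expr, dt, decay):
    """Discretise dx_i/dt = -decay * x_i + f_i(x).

    Returns the inputs x(t) and the finite-difference estimate of f_i(x(t)).
    """
    x_now, x_next = expr[:-1], expr[1:]
    return x_now, (x_next - x_now) / dt + decay * x_now

def features(x):
    """Constant, linear, and squared terms for every gene."""
    return np.column_stack([np.ones(len(x)), x, x ** 2])

def permutation_importance(x, target, rng, n_repeats=20):
    """Mean increase in squared error when one gene's column is shuffled."""
    coef = np.linalg.lstsq(features(x), target, rcond=None)[0]
    base = np.mean((features(x) @ coef - target) ** 2)
    scores = np.zeros(x.shape[1])
    for j in range(x.shape[1]):
        for _ in range(n_repeats):
            shuffled = x.copy()
            shuffled[:, j] = rng.permutation(shuffled[:, j])
            scores[j] += np.mean((features(shuffled) @ coef - target) ** 2) - base
    return scores / n_repeats

# Synthetic series: genes 0 and 1 fluctuate freely, gene 2 is driven by gene 0.
rng = np.random.default_rng(0)
n_steps, dt, decay = 80, 0.1, 0.5
expr = np.zeros((n_steps, 3))
expr[:, 0] = rng.uniform(0.0, 1.0, n_steps)
expr[:, 1] = rng.uniform(0.0, 1.0, n_steps)
for t in range(n_steps - 1):
    expr[t + 1, 2] = expr[t, 2] + dt * (-decay * expr[t, 2] + expr[t, 0] ** 2)

x_now, targets = ode_targets(expr, dt, decay)
scores = permutation_importance(x_now, targets[:, 2], rng)
print(scores)  # one importance value per candidate regulator of gene 2
```

Repeating the last two lines for every target gene would give a full regulator-by-target importance matrix, which can then be ranked to propose putative links.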
Affiliation(s)
- Baoshan Ma
  - College of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
- Mingkun Fang
  - College of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
- Xiangtian Jiao
  - College of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
5. Che D, Guo S, Jiang Q, Chen L. PFBNet: a priori-fused boosting method for gene regulatory network inference. BMC Bioinformatics 2020; 21:308. [PMID: 32664870 PMCID: PMC7362553 DOI: 10.1186/s12859-020-03639-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/29/2019] [Accepted: 07/02/2020] [Indexed: 02/08/2023]
Abstract
BACKGROUND Inferring gene regulatory networks (GRNs) from gene expression data remains a challenge in systems biology. In the past decade, numerous methods have been developed for the inference of GRNs. The task remains difficult because the data are noisy and high dimensional, and there are a large number of potential interactions. RESULTS We present a novel method, the priori-fused boosting network inference method (PFBNet), to infer GRNs from time-series expression data by using a non-linear boosting model together with a scheme for fusing prior information (e.g., knockout data). Specifically, PFBNet first calculates the confidences of the regulation relationships using the boosting-based model, which takes into account the accumulated impact of gene expression at previous time points. Then, a newly defined strategy fuses the information from the prior data by elevating the confidences of the regulation relationships from the corresponding regulators. CONCLUSIONS Experiments on the benchmark datasets from the DREAM challenge as well as the E. coli datasets show that PFBNet achieves significantly better performance than other state-of-the-art methods (Jump3, GENIE3-lag, HiDi, iRafNet and BiXGBoost).
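Two of the steps described above can be sketched in a few lines: accumulating expression from previous time points into a single feature matrix, and elevating the confidences of regulators supported by prior data. The decay weighting, the multiplicative `boost` factor, and both function names are hypothetical choices for illustration; the abstract does not give the actual formulas, and the boosting model that would produce the confidence matrix is assumed to exist elsewhere.

```python
import numpy as np

def accumulate_lags(expr, n_lags=2, decay=0.5):
    """Row for time t holds sum over k=1..n_lags of decay**(k-1) * expr[t-k].

    Assumption: geometric down-weighting of older time points.
    """
    n_steps = expr.shape[0]
    acc = np.zeros((n_steps - n_lags, expr.shape[1]))
    for k in range(1, n_lags + 1):
        acc += decay ** (k - 1) * expr[n_lags - k : n_steps - k]
    return acc

def fuse_prior(confidence, prior_regulators, boost=1.5):
    """Scale up the outgoing confidences of regulators flagged by prior data."""
    fused = confidence.copy()
    fused[prior_regulators, :] *= boost
    return fused

# One gene measured at four time points.
expr = np.array([[1.0], [2.0], [3.0], [4.0]])
lagged = accumulate_lags(expr)

# Rows are regulators, columns are targets; regulator 1 is supported by
# (hypothetical) knockout evidence.
confidence = np.array([[0.0, 0.2], [0.4, 0.0]])
fused = fuse_prior(confidence, prior_regulators=[1])
print(lagged)
print(fused)
```

In a fuller pipeline, `lagged` would be the input to the boosting model for each target gene, and `fuse_prior` would be applied to the resulting confidence matrix before ranking edges.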
Affiliation(s)
- Dandan Che
  - Shenzhen Key Lab for High Performance Data Mining, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518000 China
- Shun Guo
  - Shenzhen Key Lab for High Performance Data Mining, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518000 China
- Qingshan Jiang
  - Shenzhen Key Lab for High Performance Data Mining, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518000 China
- Lifei Chen
  - School of Mathematics and Computer Science, Fujian Normal University, Fujian, 350117 China
6. Law J, Ng K, Windram OPF. The Phenotype Paradox: Lessons From Natural Transcriptome Evolution on How to Engineer Plants. Frontiers in Plant Science 2020; 11:75. [PMID: 32133018 PMCID: PMC7040092 DOI: 10.3389/fpls.2020.00075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2019] [Accepted: 01/20/2020] [Indexed: 06/10/2023]
Abstract
Plants have evolved genome complexity through iterative rounds of single gene and whole genome duplication. This has led to substantial expansion in transcription factor numbers following preferential retention and subsequent functional divergence of these regulatory genes. Here we review how this simple evolutionary network rewiring process, regulatory gene duplication followed by functional divergence, can be used to inspire synthetic biology approaches that seek to develop novel phenotypic variation for future trait based breeding programs in plants.
Affiliation(s)
- Justin Law
  - Grand Challenges in Ecosystems and the Environment, Imperial College London, Ascot, United Kingdom
- Kangbo Ng
  - The Francis Crick Institute, London, United Kingdom
  - Institute for the Physics of Living Systems, University College London, London, United Kingdom
- Oliver P. F. Windram
  - Grand Challenges in Ecosystems and the Environment, Imperial College London, Ascot, United Kingdom
7. Wang H, Lian Y, Li C, Ma Y, Yan Z, Dong C. SIN-KNO: A method of gene regulatory network inference using single-cell transcription and gene knockout data. J Bioinform Comput Biol 2020; 17:1950035. [PMID: 32019417 DOI: 10.1142/s0219720019500355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
As a tool for interpreting and analyzing genetic data, a gene regulatory network (GRN) can reveal regulatory relationships between genes, proteins, and small molecules, and can help explain physiological activities and functions within cells, interactions in pathways, and how changes arise in the organism. Traditional GRN research analyzes regulatory relationships through the average of cellular gene expression, which makes it difficult to identify cell-to-cell heterogeneity in gene expression. Existing methods for inferring GRNs from single-cell transcriptional data lack expression information for genes at steady state, and the high dimensionality of single-cell data leads to high time and space complexity of the algorithms. To address these problems in traditional GRN inference methods, namely the lack of cellular heterogeneity information, the complexity of single-cell data, and the lack of steady-state information, we propose a method for GRN inference using single-cell transcription and gene knockout data, called SINgle-cell transcription data-KNOckout data (SIN-KNO), which combines the dynamic and steady-state information on regulatory relationships contained in gene expression. Capturing cell heterogeneity information helps explain differences in gene expression between cells, so changes in gene expression can be observed more accurately. Gene knockout data show the steady-state expression levels of all other genes when one gene is knocked out. Classifying the genes before analyzing the single-cell data can rule out a large number of non-existent regulatory relationships, greatly reducing the number of relationships that must be inferred. To demonstrate its efficiency, the proposed method was compared with several typical methods in this area, including GENIE3, JUMP3, and SINCERITIES. The evaluation results indicate that the proposed method can analyze the diverse information contained in the two types of data, establish a more accurate gene regulatory network, and improve computational efficiency. The method offers a new way of dealing with the large datasets and high computational complexity of single-cell data in GRN inference.
Affiliation(s)
- Huiqing Wang
  - College of Information and Computer, Taiyuan University of Technology, Taiyuan, Shanxi, China
- Yuanyuan Lian
  - College of Information and Computer, Taiyuan University of Technology, Taiyuan, Shanxi, China
- Chun Li
  - College of Information and Computer, Taiyuan University of Technology, Taiyuan, Shanxi, China
- Yue Ma
  - College of Information and Computer, Taiyuan University of Technology, Taiyuan, Shanxi, China
- Zhiliang Yan
  - College of Information and Computer, Taiyuan University of Technology, Taiyuan, Shanxi, China
- Chunlin Dong
  - Dryland Agriculture Research Center, Shanxi Academy of Agricultural Sciences, Taiyuan, Shanxi, China
8. Zenil H, Kiani NA, Marabita F, Deng Y, Elias S, Schmidt A, Ball G, Tegnér J. An Algorithmic Information Calculus for Causal Discovery and Reprogramming Systems. iScience 2019; 19:1160-1172. [PMID: 31541920 PMCID: PMC6831824 DOI: 10.1016/j.isci.2019.07.043] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 06/03/2018] [Revised: 04/27/2019] [Accepted: 07/26/2019] [Indexed: 12/26/2022]
Abstract
We introduce and develop a method that demonstrates that the algorithmic information content of a system can be used as a steering handle in the dynamical phase space, thus affording an avenue for controlling and reprogramming systems. The method consists of applying a series of controlled interventions to a networked system while estimating how the algorithmic information content is affected. We demonstrate the method by reconstructing the phase space and the generative rules of some discrete dynamical systems (cellular automata) serving as controlled case studies. Next, the model-based interventional or causal calculus is evaluated and validated using (1) a large set of small graphs, (2) a number of larger networks with different topologies, and finally (3) biological networks derived from a widely studied and validated genetic network (E. coli) as well as a significant number of differentiating (Th17) and differentiated human cells from curated biological network data.
Affiliation(s)
- Hector Zenil
  - Algorithmic Dynamics Lab, Center for Molecular Medicine, Karolinska Institutet, Stockholm 171 76, Sweden; Unit of Computational Medicine, Center for Molecular Medicine, Department of Medicine, Karolinska Institutet, Solna, Stockholm 171 76, Sweden; Oxford Immune Algorithmics, Reading RG1 3EU, UK; Science for Life Laboratory, Solna 171 65, Sweden; Algorithmic Nature Group, LABORES for the Natural and Digital Sciences, Paris 75006, France
- Narsis A Kiani
  - Algorithmic Dynamics Lab, Center for Molecular Medicine, Karolinska Institutet, Stockholm 171 76, Sweden; Unit of Computational Medicine, Center for Molecular Medicine, Department of Medicine, Karolinska Institutet, Solna, Stockholm 171 76, Sweden; Science for Life Laboratory, Solna 171 65, Sweden; Algorithmic Nature Group, LABORES for the Natural and Digital Sciences, Paris 75006, France
- Francesco Marabita
  - Unit of Computational Medicine, Center for Molecular Medicine, Department of Medicine, Karolinska Institutet, Solna, Stockholm 171 76, Sweden; Science for Life Laboratory, Solna 171 65, Sweden
- Yue Deng
  - Unit of Computational Medicine, Center for Molecular Medicine, Department of Medicine, Karolinska Institutet, Solna, Stockholm 171 76, Sweden
- Szabolcs Elias
  - Unit of Computational Medicine, Center for Molecular Medicine, Department of Medicine, Karolinska Institutet, Solna, Stockholm 171 76, Sweden; Science for Life Laboratory, Solna 171 65, Sweden
- Angelika Schmidt
  - Unit of Computational Medicine, Center for Molecular Medicine, Department of Medicine, Karolinska Institutet, Solna, Stockholm 171 76, Sweden; Science for Life Laboratory, Solna 171 65, Sweden
- Gordon Ball
  - Unit of Computational Medicine, Center for Molecular Medicine, Department of Medicine, Karolinska Institutet, Solna, Stockholm 171 76, Sweden; Science for Life Laboratory, Solna 171 65, Sweden
- Jesper Tegnér
  - Unit of Computational Medicine, Center for Molecular Medicine, Department of Medicine, Karolinska Institutet, Solna, Stockholm 171 76, Sweden; Science for Life Laboratory, Solna 171 65, Sweden; Biological and Environmental Sciences and Engineering Division, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
9. Computational methods for Gene Regulatory Networks reconstruction and analysis: A review. Artif Intell Med 2019; 95:133-145. [DOI: 10.1016/j.artmed.2018.10.006] [Citation(s) in RCA: 71] [Impact Index Per Article: 14.2] [Received: 06/21/2018] [Revised: 10/23/2018] [Accepted: 10/23/2018] [Indexed: 01/14/2023]