1
Fang WQ, Wu YL, Hwang MJ. A Noise-Tolerating Gene Association Network Uncovering an Oncogenic Regulatory Motif in Lymphoma Transcriptomics. Life (Basel) 2023; 13:1331. [PMID: 37374114 DOI: 10.3390/life13061331] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Revised: 05/24/2023] [Accepted: 05/26/2023] [Indexed: 06/29/2023]
Abstract
In cancer genomics research, gene expressions provide clues to gene regulations implicating patients' risk of survival. Gene expressions, however, fluctuate due to noises arising internally and externally, making their use to infer gene associations, hence regulation mechanisms, problematic. Here, we develop a new regression approach to model gene association networks while considering uncertain biological noises. In a series of simulation experiments accounting for varying levels of biological noises, the new method was shown to be robust and perform better than conventional regression methods, as judged by a number of statistical measures on unbiasedness, consistency and accuracy. Application to infer gene associations in germinal-center B cells led to the discovery of a three-by-two regulatory motif gene expression and a three-gene prognostic signature for diffuse large B-cell lymphoma.
Affiliation(s)
- Wei-Quan Fang
  - Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
  - Division of New Drug, Center for Drug Evaluation, Taipei 115, Taiwan
- Yu-Le Wu
  - Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- Ming-Jing Hwang
  - Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
2
Alali M, Imani M. Reinforcement Learning Data-Acquiring for Causal Inference of Regulatory Networks. Proceedings of the ... American Control Conference. American Control Conference 2023; 2023:3957-3964. [PMID: 37521901 PMCID: PMC10382224 DOI: 10.23919/acc55779.2023.10155867] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/01/2023]
Abstract
Gene regulatory networks (GRNs) consist of multiple interacting genes whose activities govern various cellular processes. The limitations in genomics data and the complexity of the interactions between components often pose huge uncertainties in the models of these biological systems. Meanwhile, inferring/estimating the interactions between components of the GRNs using data acquired from the normal condition of these biological systems is a challenging or, in some cases, an impossible task. Perturbation is a well-known genomics approach that aims to excite targeted components to gather useful data from these systems. This paper models GRNs using the Boolean network with perturbation, where the network uncertainty appears in terms of unknown interactions between genes. Unlike the existing heuristics and greedy data-acquiring methods, this paper provides an optimal Bayesian formulation of the data-acquiring process in the reinforcement learning context, where the actions are perturbations, and the reward measures step-wise improvement in the inference accuracy. We develop a semi-gradient reinforcement learning method with function approximation for learning near-optimal data-acquiring policy. The obtained policy yields near-exact Bayesian optimality with respect to the entire uncertainty in the regulatory network model, and allows learning the policy offline through planning. We demonstrate the performance of the proposed framework using the well-known p53-Mdm2 negative feedback loop gene regulatory network.
Affiliation(s)
- Mohammad Alali
  - Department of Electrical and Computer Engineering at Northeastern University
- Mahdi Imani
  - Department of Electrical and Computer Engineering at Northeastern University
3
Sriraja LO, Werhli A, Petsalaki E. Phosphoproteomics data-driven signalling network inference: Does it work? Comput Struct Biotechnol J 2022; 21:432-443. [PMID: 36618990 PMCID: PMC9798138 DOI: 10.1016/j.csbj.2022.12.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2022] [Revised: 11/16/2022] [Accepted: 12/06/2022] [Indexed: 12/23/2022]
Abstract
The advent of global phosphoproteome profiling has led to wide phosphosite coverage and therefore the opportunity to predict kinase-substrate associations from these datasets. However, the regulatory kinase is unknown for most substrates, due to biased and incomplete database annotations. In this study we compare the performance of six pairwise measures to predict kinase-substrate associations using a data driven approach on publicly available time resolved and perturbation mass spectrometry-based phosphoproteome data. First, we validated the performance of these measures using as a reference both a literature-based phosphosite-specific protein interaction network and a predicted kinase-substrate (KS) interactions set. The overall performance in predicting kinase-substrate associations using pairwise measures across both these reference sets was poor. To expand into the wider interactome space, we applied the approach on a network comprising pairs of substrates regulated by the same kinase (substrate-substrate associations) but found the performance to be equally poor. However, the addition of a sequence similarity filter for substrate-substrate associations led to a significant boost in performance. Our findings imply that the use of a filter to reduce the search space, such as a sequence similarity filter, can be used prior to the application of network inference methods to reduce noise and boost the signal. We also find that the current gold standard for reference sets is not adequate for evaluation as it is limited and context-agnostic. Therefore, there is a need for additional evaluation methods that have increased coverage and take into consideration the context-specific nature of kinase-substrate associations.
Affiliation(s)
- Lourdes O. Sriraja
  - European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Adriano Werhli
  - European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
  - Centro de Ciências Computacionais - Universidade Federal do Rio Grande - FURG, Avenida Itália, km 8, s/n, Campus Carreiros, 96203-900 Rio Grande, Rio Grande do Sul, Brazil
- Evangelia Petsalaki
  - European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
4
Tyloo M, Delabays R, Jacquod P. Reconstructing network structures from partial measurements. Chaos (Woodbury, N.Y.) 2021; 31:103117. [PMID: 34717331 DOI: 10.1063/5.0058739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2021] [Accepted: 09/28/2021] [Indexed: 06/13/2023]
Abstract
The dynamics of systems of interacting agents is determined by the structure of their coupling network. The knowledge of the latter is, therefore, highly desirable, for instance, to develop efficient control schemes, to accurately predict the dynamics, or to better understand inter-agent processes. In many important and interesting situations, the network structure is not known, however, and previous investigations have shown how it may be inferred from complete measurement time series on each and every agent. These methods implicitly presuppose that, even though the network is not known, all its nodes are. Here, we investigate the different problem of inferring network structures within the observed/measured agents. For symmetrically coupled dynamical systems close to a stable equilibrium, we establish analytically and illustrate numerically that velocity signal correlators encode not only direct couplings, but also geodesic distances in the coupling network within the subset of measurable agents. When dynamical data are accessible for all agents, our method is furthermore algorithmically more efficient than the traditional ones because it does not rely on matrix inversion.
Affiliation(s)
- Melvyn Tyloo
  - Department of Quantum Matter Physics, University of Geneva, CH-1211 Geneva, Switzerland
- Robin Delabays
  - Automatic Control Laboratory, ETH Zürich, CH-8092 Zürich, Switzerland
- Philippe Jacquod
  - Department of Quantum Matter Physics, University of Geneva, CH-1211 Geneva, Switzerland
5
Li Y, Liu D, Li T, Zhu Y. Bayesian differential analysis of gene regulatory networks exploiting genetic perturbations. BMC Bioinformatics 2020; 21:12. [PMID: 31918656 PMCID: PMC6953167 DOI: 10.1186/s12859-019-3314-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2019] [Accepted: 12/12/2019] [Indexed: 11/10/2022]
Abstract
BACKGROUND Gene regulatory networks (GRNs) can be inferred from both gene expression data and genetic perturbations. Under different conditions, the gene data of the same gene set may be different from each other, which results in different GRNs. Detecting structural difference between GRNs under different conditions is of great significance for understanding gene functions and biological mechanisms. RESULTS In this paper, we propose a Bayesian Fused algorithm to jointly infer differential structures of GRNs under two different conditions. The algorithm is developed for GRNs modeled with structural equation models (SEMs), which makes it possible to incorporate genetic perturbations into models to improve the inference accuracy, so we name it BFDSEM. Different from the naive approaches that separately infer pair-wise GRNs and identify the difference from the inferred GRNs, we first re-parameterize the two SEMs to form an integrated model that takes full advantage of the two groups of gene data, and then solve the re-parameterized model by developing a novel Bayesian fused prior following the criterion that separate GRNs and differential GRN are both sparse. CONCLUSIONS Computer simulations are run on synthetic data to compare BFDSEM to two state-of-the-art joint inference algorithms: FSSEM and ReDNet. The results demonstrate that the performance of BFDSEM is comparable to FSSEM, and is generally better than ReDNet. The BFDSEM algorithm is also applied to a real data set of lung cancer and adjacent normal tissues, the yielded normal GRN and differential GRN are consistent with the reported results in previous literatures. An open-source program implementing BFDSEM is freely available in Additional file 1.
Affiliation(s)
- Yan Li
  - College of Computer Science and Technology, Jilin University, Changchun, 130012 China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012 China
- Dayou Liu
  - College of Computer Science and Technology, Jilin University, Changchun, 130012 China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012 China
- Tengfei Li
  - College of Computer Science and Technology, Jilin University, Changchun, 130012 China
- Yungang Zhu
  - College of Computer Science and Technology, Jilin University, Changchun, 130012 China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012 China
6
Wang Y, Luo Y, Wang M, Miao H. Time-invariant biological networks with feedback loops: structural equation models and structural identifiability. IET Syst Biol 2018; 12:264-272. [PMID: 30472690 DOI: 10.1049/iet-syb.2018.5004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022]
Abstract
Quantitative analyses of biological networks such as key biological parameter estimation necessarily call for the use of graphical models. While biological networks with feedback loops are common in reality, the development of graphical model methods and tools that are capable of dealing with feedback loops is still in its infancy. Particularly, inadequate attention has been paid to the parameter identifiability problem for biological networks with feedback loops such that unreliable or even misleading parameter estimates may be obtained. In this study, the structural identifiability analysis problem of time-invariant linear structural equation models (SEMs) with feedback loops is addressed, resulting in a general and efficient solution. The key idea is to combine Mason's gain with Wright's path coefficient method to generate identifiability equations, from which identifiability matrices are then derived to examine the structural identifiability of every single unknown parameter. The proposed method does not involve symbolic or expensive numerical computations, and is applicable to a broad range of time-invariant linear SEMs with or without explicit latent variables, presenting a remarkable breakthrough in terms of generality. Finally, a subnetwork structure of the C. elegans neural network is used to illustrate the application of the authors' method in practice.
Affiliation(s)
- Yulin Wang
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, Sichuan, People's Republic of China
- Yu Luo
  - School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 610054, Sichuan, People's Republic of China
- Mingwen Wang
  - School of Mathematics, Southwest Jiaotong University, Chengdu 611756, Sichuan, People's Republic of China
- Hongyu Miao
  - Department of Biostatistics and Data Science, School of Public Health, University of Texas Health Science Center, Houston 77030, TX, USA.
7
Multiple Linear Regression for Reconstruction of Gene Regulatory Networks in Solving Cascade Error Problems. Adv Bioinformatics 2017; 2017:4827171. [PMID: 28250767 PMCID: PMC5303608 DOI: 10.1155/2017/4827171] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 06/29/2016] [Revised: 10/10/2016] [Accepted: 10/19/2016] [Indexed: 11/17/2022]
Abstract
Gene regulatory network (GRN) reconstruction is the process of identifying regulatory gene interactions from experimental data through computational analysis. One of the main reasons for the reduced performance of previous GRN methods had been inaccurate prediction of cascade motifs. Cascade error is defined as the wrong prediction of cascade motifs, where an indirect interaction is misinterpreted as a direct interaction. Despite the active research on various GRN prediction methods, the discussion on specific methods to solve problems related to cascade errors is still lacking. In fact, the experiments conducted by the past studies were not specifically geared towards proving the ability of GRN prediction methods in avoiding the occurrences of cascade errors. Hence, this research aims to propose Multiple Linear Regression (MLR) to infer GRN from gene expression data and to avoid wrongly inferring of an indirect interaction (A → B → C) as a direct interaction (A → C). Since the number of observations of the real experiment datasets was far less than the number of predictors, some predictors were eliminated by extracting the random subnetworks from global interaction networks via an established extraction method. In addition, the experiment was extended to assess the effectiveness of MLR in dealing with cascade error by using a novel experimental procedure that had been proposed in this work. The experiment revealed that the number of cascade errors had been very minimal. Apart from that, the Belsley collinearity test proved that multicollinearity did affect the datasets used in this experiment greatly. All the tested subnetworks obtained satisfactory results, with AUROC values above 0.5.
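The cascade error described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure or data: it assumes a hypothetical linear cascade A → B → C with Gaussian noise, and the coefficients (0.9, 0.8), noise level, sample size, and all function names are illustrative choices. It contrasts the pairwise Pearson correlation between A and C, which is high and would suggest a direct edge, with the multiple-regression coefficient of A once B is included as a predictor.

```python
import random


def simulate_cascade(n=2000, seed=0):
    """Draw samples from an assumed linear cascade A -> B -> C (no direct A -> C edge)."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [0.9 * x + rng.gauss(0, 0.3) for x in a]  # B depends on A
    c = [0.8 * y + rng.gauss(0, 0.3) for y in b]  # C depends only on B
    return a, b, c


def _center(v):
    mean = sum(v) / len(v)
    return [x - mean for x in v]


def _dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length samples."""
    u, v = _center(u), _center(v)
    return _dot(u, v) / (_dot(u, u) * _dot(v, v)) ** 0.5


def regress_on_two(y, x1, x2):
    """Least-squares slopes of y on x1 and x2 (centering absorbs the intercept)."""
    y, x1, x2 = _center(y), _center(x1), _center(x2)
    s11, s22, s12 = _dot(x1, x1), _dot(x2, x2), _dot(x1, x2)
    s1y, s2y = _dot(x1, y), _dot(x2, y)
    det = s11 * s22 - s12 * s12  # determinant of the 2x2 normal equations
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


a, b, c = simulate_cascade()
r_ac = pearson(a, c)                      # pairwise view: A and C look linked
beta_a, beta_b = regress_on_two(c, a, b)  # joint view: A's effect is explained by B
print(f"corr(A, C) = {r_ac:.2f}, beta_A = {beta_a:.2f}, beta_B = {beta_b:.2f}")
```

In this toy setting the coefficient on A shrinks toward zero while the coefficient on B stays near the simulated value, which is the behaviour the abstract attributes to multiple linear regression. Real expression data, with far fewer observations than predictors, may not behave this cleanly.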
8
Wang Y, Lu N, Miao H. Structural identifiability of cyclic graphical models of biological networks with latent variables. BMC Systems Biology 2016; 10:41. [PMID: 27296452 PMCID: PMC4906697 DOI: 10.1186/s12918-016-0287-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 02/09/2016] [Accepted: 06/06/2016] [Indexed: 12/16/2022]
Abstract
BACKGROUND Graphical models have long been used to describe biological networks for a variety of important tasks such as the determination of key biological parameters, and the structure of graphical model ultimately determines whether such unknown parameters can be unambiguously obtained from experimental observations (i.e., the identifiability problem). Limited by resources or technical capacities, complex biological networks are usually partially observed in experiment, which thus introduces latent variables into the corresponding graphical models. A number of previous studies have tackled the parameter identifiability problem for graphical models such as linear structural equation models (SEMs) with or without latent variables. However, the limited resolution and efficiency of existing approaches necessarily calls for further development of novel structural identifiability analysis algorithms. RESULTS An efficient structural identifiability analysis algorithm is developed in this study for a broad range of network structures. The proposed method adopts the Wright’s path coefficient method to generate identifiability equations in forms of symbolic polynomials, and then converts these symbolic equations to binary matrices (called identifiability matrix). Several matrix operations are introduced for identifiability matrix reduction with system equivalency maintained. Based on the reduced identifiability matrices, the structural identifiability of each parameter is determined. A number of benchmark models are used to verify the validity of the proposed approach. Finally, the network module for influenza A virus replication is employed as a real example to illustrate the application of the proposed approach in practice. CONCLUSIONS The proposed approach can deal with cyclic networks with latent variables. The key advantage is that it intentionally avoids symbolic computation and is thus highly efficient. Also, this method is capable of determining the identifiability of each single parameter and is thus of higher resolution in comparison with many existing approaches. Overall, this study provides a basis for systematic examination and refinement of graphical models of biological networks from the identifiability point of view, and it has a significant potential to be extended to more complex network structures or high-dimensional systems.
Affiliation(s)
- Yulin Wang
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Na Lu
  - State Key Laboratory for Manufacturing Systems Engineering, Systems Engineering Institute, Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Hongyu Miao
  - Department of Biostatistics, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, 77030, USA.
9
Zarkogianni K, Litsa E, Mitsis K, Wu PY, Kaddi CD, Cheng CW, Wang MD, Nikita KS. A Review of Emerging Technologies for the Management of Diabetes Mellitus. IEEE Trans Biomed Eng 2015; 62:2735-49. [PMID: 26292334 PMCID: PMC5859570 DOI: 10.1109/tbme.2015.2470521] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Indexed: 12/26/2022]
Abstract
OBJECTIVE High prevalence of diabetes mellitus (DM) along with the poor health outcomes and the escalated costs of treatment and care poses the need to focus on prevention, early detection and improved management of the disease. The aim of this paper is to present and discuss the latest accomplishments in sensors for glucose and lifestyle monitoring along with clinical decision support systems (CDSSs) facilitating self-disease management and supporting healthcare professionals in decision making. METHODS A critical literature review analysis is conducted focusing on advances in: 1) sensors for physiological and lifestyle monitoring, 2) models and molecular biomarkers for predicting the onset and assessing the progress of DM, and 3) modeling and control methods for regulating glucose levels. RESULTS Glucose and lifestyle sensing technologies are continuously evolving with current research focusing on the development of noninvasive sensors for accurate glucose monitoring. A wide range of modeling, classification, clustering, and control approaches have been deployed for the development of the CDSS for diabetes management. Sophisticated multiscale, multilevel modeling frameworks taking into account information from behavioral down to molecular level are necessary to reveal correlations and patterns indicating the onset and evolution of DM. CONCLUSION Integration of data originating from sensor-based systems and electronic health records combined with smart data analytics methods and powerful user centered approaches enable the shift toward preventive, predictive, personalized, and participatory diabetes care. SIGNIFICANCE The potential of sensing and predictive modeling approaches toward improving diabetes management is highlighted and related challenges are identified.
Affiliation(s)
- May D. Wang
  - Contact information for the corresponding author: Phone: 404-385-2954, Fax: 404-894-4243, Address: Suite 4106, UA Whitaker Building, 313 Ferst Drive, Atlanta, GA 30332, USA
10
Mohamed Salleh FH, Arif SM, Zainudin S, Firdaus-Raih M. Reconstructing gene regulatory networks from knock-out data using Gaussian Noise Model and Pearson Correlation Coefficient. Comput Biol Chem 2015; 59 Pt B:3-14. [DOI: 10.1016/j.compbiolchem.2015.04.012] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 09/09/2014] [Revised: 04/16/2015] [Accepted: 04/27/2015] [Indexed: 11/26/2022]