1
Zhang Y, Zhu L, Wang X. NEM-Tar: A Probabilistic Graphical Model for Cancer Regulatory Network Inference and Prioritization of Potential Therapeutic Targets From Multi-Omics Data. Front Genet 2021; 12:608042. [PMID: 33968127 PMCID: PMC8100334 DOI: 10.3389/fgene.2021.608042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2020] [Accepted: 03/22/2021] [Indexed: 11/13/2022]
Abstract
Targeted therapy has been widely adopted as an effective treatment strategy against cancer. However, cancers are not single disease entities but comprise multiple molecularly distinct subtypes, and this heterogeneity prevents precise selection of patients for optimized therapy. Dissecting cancer subtype-specific signaling pathways is crucial to pinpointing dysregulated genes for the prioritization of novel therapeutic targets. Nested effects models (NEMs) are a group of graphical models that encode subset relations between observed downstream effects under perturbations to upstream signaling genes, providing a prototype for mapping the inner workings of the cell. In this study, we developed NEM-Tar, which extends the original NEMs to predict drug targets by incorporating causal information of (epi)genetic aberrations for signaling pathway inference. An information theory-based score, weighted information gain (WIG), was proposed to assess the impact of signaling genes on a specific downstream biological process of interest. Subsequently, we conducted simulation studies to compare three inference methods and found that the greedy hill-climbing algorithm demonstrated the highest accuracy and robustness to noise. Furthermore, two case studies were conducted using multi-omics data for colorectal cancer (CRC) and gastric cancer (GC) in the TCGA database. Using NEM-Tar, we inferred signaling networks driving the poor-prognosis subtypes of CRC and GC, respectively. Our model prioritized not only potential individual drug targets such as HER2, for which FDA-approved inhibitors are available, but also combinations of multiple targets potentially useful for the design of combination therapies.
Affiliation(s)
- Yuchen Zhang
- Department of Biomedical Sciences, City University of Hong Kong, Hong Kong, China
- Lina Zhu
- Department of Biomedical Sciences, City University of Hong Kong, Hong Kong, China
- Xin Wang
- Department of Biomedical Sciences, City University of Hong Kong, Hong Kong, China
- Key Laboratory of Biochip Technology, Biotech and Health Centre, Shenzhen Research Institute, City University of Hong Kong, Shenzhen, China
2
Pirkl M, Beerenwinkel N. Inferring perturbation profiles of cancer samples. Bioinformatics 2021; 37:2441-2449. [PMID: 33617647 PMCID: PMC8388028 DOI: 10.1093/bioinformatics/btab113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2019] [Revised: 11/20/2020] [Accepted: 02/18/2021] [Indexed: 11/25/2022]
Abstract
Motivation: Cancer is one of the most prevalent diseases in the world. Tumors arise due to important genes changing their activity, e.g. when inhibited or over-expressed. But these gene perturbations are difficult to observe directly. Molecular profiles of tumors can provide indirect evidence of gene perturbations. However, inferring perturbation profiles from molecular alterations is challenging due to error-prone molecular measurements and incomplete coverage of all possible molecular causes of gene perturbations. Results: We have developed a novel mathematical method to analyze cancer driver genes and their patient-specific perturbation profiles. We combine genetic aberrations with gene expression data in a causal network derived across patients to infer unobserved perturbations. We show that our method can predict perturbations in simulations, CRISPR perturbation screens and breast cancer samples from The Cancer Genome Atlas. Availability and implementation: The method is available as the R-package nempi at https://github.com/cbg-ethz/nempi and http://bioconductor.org/packages/nempi. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Martin Pirkl
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, 4058, Switzerland
- Swiss Institute of Bioinformatics, Basel, 4058, Switzerland
- Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, 4058, Switzerland
- Swiss Institute of Bioinformatics, Basel, 4058, Switzerland
3
Tiuryn J, Szczurek E. Learning signaling networks from combinatorial perturbations by exploiting siRNA off-target effects. Bioinformatics 2020; 35:i605-i614. [PMID: 31510678 PMCID: PMC6612802 DOI: 10.1093/bioinformatics/btz334] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/12/2022]
Abstract
Motivation: Perturbation experiments constitute the central means to study cellular networks. Several confounding factors complicate computational modeling of signaling networks from these data. First, the technique of RNA interference (RNAi), designed and commonly used to knock down specific genes, suffers from off-target effects. As a result, each experiment is a combinatorial perturbation of multiple genes. Second, the perturbations propagate along unknown connections in the signaling network. Once the signal is blocked by perturbation, proteins downstream of the targeted proteins also become inactivated. Finally, all perturbed network members, either directly targeted by the experiment or affected by propagation in the network, contribute to the observed effect, either in a positive or negative manner. One of the key questions of computational inference of signaling networks from such data is: how many and what combinations of perturbations are required to uniquely and accurately infer the model? Results: Here, we introduce an enhanced version of linear effects models (LEMs), which extends the original by accounting for both negative and positive contributions of the perturbed network proteins to the observed phenotype. We prove that the enhanced LEMs are identified from data measured under perturbations of all single, pairs and triplets of network proteins. For small networks of up to five nodes, only perturbations of single and pairs of proteins are required for identifiability. Extensive simulations demonstrate that enhanced LEMs achieve excellent accuracy of parameter estimation and network structure learning, outperforming the previous version on realistic data. LEMs applied to Bartonella henselae infection RNAi screening data identified known interactions between eight nodes of the infection network, confirming the high specificity of our model, and suggested one new interaction. Availability and implementation: https://github.com/EwaSzczurek/LEM Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jerzy Tiuryn
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
- Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
4
Cardner M, Meyer-Schaller N, Christofori G, Beerenwinkel N. Inferring signalling dynamics by integrating interventional with observational data. Bioinformatics 2020; 35:i577-i585. [PMID: 31510686 PMCID: PMC6612850 DOI: 10.1093/bioinformatics/btz325] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/17/2022]
Abstract
Motivation: In order to infer a cell signalling network, we generally need interventional data from perturbation experiments. If the perturbation experiments are time-resolved, then signal progression through the network can be inferred. However, such designs are infeasible for large signalling networks, where it is more common to have steady-state perturbation data on the one hand, and a non-interventional time series on the other. Such was the design in a recent experiment investigating the coordination of epithelial–mesenchymal transition (EMT) in murine mammary gland cells. We aimed to infer the underlying signalling network of transcription factors and microRNAs coordinating EMT, as well as the signal progression during EMT. Results: In the context of nested effects models, we developed a method for integrating perturbation data with a non-interventional time series. We applied the model to RNA sequencing data obtained from an EMT experiment. Part of the network inferred from RNA interference was validated experimentally using luciferase reporter assays. Our model extension is formulated as an integer linear programme, which can be solved efficiently using heuristic algorithms. This extension allowed us to infer the signal progression through the network during an EMT time course, and thereby assess when each regulator is necessary for EMT to advance. Availability and implementation: R package at https://github.com/cbg-ethz/timeseriesNEM. The RNA sequencing data and microscopy images can be explored through a Shiny app at https://emt.bsse.ethz.ch. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mathias Cardner
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
5
Matthews ML, Wang JP, Sederoff R, Chiang VL, Williams CM. Modeling cross-regulatory influences on monolignol transcripts and proteins under single and combinatorial gene knockdowns in Populus trichocarpa. PLoS Comput Biol 2020; 16:e1007197. [PMID: 32275650 PMCID: PMC7147730 DOI: 10.1371/journal.pcbi.1007197] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 06/14/2019] [Accepted: 02/27/2020] [Indexed: 11/18/2022]
Abstract
Accurate manipulation of metabolites in monolignol biosynthesis is a key step for controlling lignin content, structure, and other wood properties important to the bioenergy and biomaterial industries. A crucial component of this strategy is predicting how single and combinatorial knockdowns of monolignol specific gene transcripts influence the abundance of monolignol proteins, which are the driving mechanisms of monolignol biosynthesis. Computational models have been developed to estimate protein abundances from transcript perturbations of monolignol specific genes. The accuracy of these models, however, is hindered by their inability to capture indirect regulatory influences on other pathway genes. Here, we examine the manifestation of these indirect influences on transgenic transcript and protein abundances, identifying putative indirect regulatory influences that occur when one or more specific monolignol pathway genes are perturbed. We created a computational model using sparse maximum likelihood to estimate the resulting monolignol transcript and protein abundances in transgenic Populus trichocarpa based on targeted knockdowns of specific monolignol genes. Using in-silico simulations of this model and root mean square error, we showed that our model more accurately estimated transcript and protein abundances, in comparison to previous models, when individual and families of monolignol genes were perturbed. We leveraged insight from the inferred network structure obtained from our model to identify potential genes, including PtrHCT, PtrCAD, and Ptr4CL, involved in post-transcriptional and/or post-translational regulation. Our model provides a useful computational tool for exploring the cascaded impact of single and combinatorial modifications of monolignol specific genes on lignin and other wood properties.
Affiliation(s)
- Megan L. Matthews
- Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, North Carolina, United States of America
- Jack P. Wang
- State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin, China
- Department of Forestry and Environmental Resources, Forest Biotechnology Group, North Carolina State University, Raleigh, North Carolina, United States of America
- Ronald Sederoff
- Department of Forestry and Environmental Resources, Forest Biotechnology Group, North Carolina State University, Raleigh, North Carolina, United States of America
- Vincent L. Chiang
- State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin, China
- Department of Forestry and Environmental Resources, Forest Biotechnology Group, North Carolina State University, Raleigh, North Carolina, United States of America
- Department of Forest Biomaterials, North Carolina State University, Raleigh, North Carolina, United States of America
- Cranos M. Williams
- Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, North Carolina, United States of America
6
Abstract
Motivation: New technologies allow for the elaborate measurement of different traits of single cells under genetic perturbations. These interventional data promise to elucidate intra-cellular networks in unprecedented detail and further help to improve treatment of diseases like cancer. However, cell populations can be very heterogeneous. Results: We developed a mixture of Nested Effects Models (M&NEM) for single-cell data to simultaneously identify different cellular subpopulations and their corresponding causal networks to explain the heterogeneity in a cell population. For inference, we assign each cell to a network with a certain probability and iteratively update the optimal networks and cell probabilities in an Expectation Maximization scheme. We validate our method in the controlled setting of a simulation study and apply it to three data sets of pooled CRISPR screens generated previously by two novel experimental techniques, namely Crop-Seq and Perturb-Seq. Availability and implementation: The mixture Nested Effects Model (M&NEM) is available as the R-package mnem at https://github.com/cbg-ethz/mnem/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Martin Pirkl
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
7
Srivatsa S, Kuipers J, Schmich F, Eicher S, Emmenlauer M, Dehio C, Beerenwinkel N. Improved pathway reconstruction from RNA interference screens by exploiting off-target effects. Bioinformatics 2018; 34:i519-i527. [PMID: 29950000 PMCID: PMC6022657 DOI: 10.1093/bioinformatics/bty240] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 01/28/2023]
Abstract
Motivation: Pathway reconstruction has proven to be an indispensable tool for analyzing the molecular mechanisms of signal transduction underlying cell function. Nested effects models (NEMs) are a class of probabilistic graphical models designed to reconstruct signalling pathways from high-dimensional observations resulting from perturbation experiments, such as RNA interference (RNAi). NEMs assume that the short interfering RNAs (siRNAs) designed to knockdown specific genes are always on-target. However, it has been shown that most siRNAs exhibit strong off-target effects, which further confound the data, resulting in unreliable reconstruction of networks by NEMs. Results: Here, we present an extension of NEMs called probabilistic combinatorial nested effects models (pc-NEMs), which capitalize on the ancillary siRNA off-target effects for network reconstruction from combinatorial gene knockdown data. Our model employs an adaptive simulated annealing search algorithm for simultaneous inference of network structure and error rates inherent to the data. Evaluation of pc-NEMs on simulated data with varying number of phenotypic effects and noise levels as well as real data demonstrates improved reconstruction compared to classical NEMs. Application to Bartonella henselae infection RNAi screening data yielded an eight node network largely in agreement with previous works, and revealed novel binary interactions of direct impact between established components. Availability and implementation: The software used for the analysis is freely available as an R package at https://github.com/cbg-ethz/pcNEM.git. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sumana Srivatsa
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Jack Kuipers
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Fabian Schmich
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
8
Szczurek E, Beerenwinkel N. Linear effects models of signaling pathways from combinatorial perturbation data. Bioinformatics 2017; 32:i297-i305. [PMID: 27307630 PMCID: PMC4908352 DOI: 10.1093/bioinformatics/btw268] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/17/2022]
Abstract
Motivation: Perturbations constitute the central means to study signaling pathways. Interrupting components of the pathway and analyzing observed effects of those interruptions can give insight into unknown connections within the signaling pathway itself, as well as the link from the pathway to the effects. Different pathway components may have different individual contributions to the measured perturbation effects, such as gene expression changes. Those effects will be observed in combination when the pathway components are perturbed. Extant approaches focus either on the reconstruction of pathway structure or on resolving how the pathway components control the downstream effects. Results: Here, we propose a linear effects model, which can be applied to solve both these problems from combinatorial perturbation data. We use simulated data to demonstrate the accuracy of learning the pathway structure as well as estimation of the individual contributions of pathway components to the perturbation effects. The practical utility of our approach is illustrated by an application to perturbations of the mitogen-activated protein kinase pathway in Saccharomyces cerevisiae. Availability and Implementation: lem is available as an R package at http://www.mimuw.edu.pl/~szczurek/lem. Contact: szczurek@mimuw.edu.pl; niko.beerenwinkel@bsse.ethz.ch Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
- Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics
9
Pirkl M, Diekmann M, van der Wees M, Beerenwinkel N, Fröhlich H, Markowetz F. Inferring modulators of genetic interactions with epistatic nested effects models. PLoS Comput Biol 2017; 13:e1005496. [PMID: 28406896 PMCID: PMC5407847 DOI: 10.1371/journal.pcbi.1005496] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 05/06/2016] [Revised: 04/27/2017] [Accepted: 04/03/2017] [Indexed: 12/27/2022]
Abstract
Maps of genetic interactions can dissect functional redundancies in cellular networks. Gene expression profiles as high-dimensional molecular readouts of combinatorial perturbations provide a detailed view of genetic interactions, but can be hard to interpret if different gene sets respond in different ways (called mixed epistasis). Here we test the hypothesis that mixed epistasis between a gene pair can be explained by the action of a third gene that modulates the interaction. We have extended the framework of Nested Effects Models (NEMs), a type of graphical model specifically tailored to analyze high-dimensional gene perturbation data, to incorporate logical functions that describe interactions between regulators on downstream genes and proteins. We benchmark our approach in the controlled setting of a simulation study and show high accuracy in inferring the correct model. In an application to data from deletion mutants of kinases and phosphatases in S. cerevisiae we show that epistatic NEMs can point to modulators of genetic interactions. Our approach is implemented in the R-package 'epiNEM' available from https://github.com/cbg-ethz/epiNEM and https://bioconductor.org/packages/epiNEM/.
Affiliation(s)
- Martin Pirkl
- ETH Zurich, Department of Biosystems Science and Engineering, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Madeline Diekmann
- ETH Zurich, Department of Biosystems Science and Engineering, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Niko Beerenwinkel
- ETH Zurich, Department of Biosystems Science and Engineering, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Holger Fröhlich
- Bonn-Aachen International Center for IT (B-IT), University of Bonn, Bonn, Germany
- UCB Biosciences GmbH, Monheim, Germany
- Florian Markowetz
- University of Cambridge, Cancer Research UK Cambridge Institute, Cambridge, United Kingdom
10
Pirkl M, Hand E, Kube D, Spang R. Analyzing synergistic and non-synergistic interactions in signalling pathways using Boolean Nested Effect Models. Bioinformatics 2016; 32:893-900. [PMID: 26581413 PMCID: PMC5939970 DOI: 10.1093/bioinformatics/btv680] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 08/07/2015] [Revised: 10/19/2015] [Accepted: 11/11/2015] [Indexed: 11/21/2022]
Abstract
Motivation: Understanding the structure and interplay of cellular signalling pathways is one of the great challenges in molecular biology. Boolean Networks can infer signalling networks from observations of protein activation. In situations where it is difficult to assess protein activation directly, Nested Effect Models are an alternative. They derive the network structure indirectly from downstream effects of pathway perturbations. To date, Nested Effect Models cannot resolve signalling details like the formation of signalling complexes or the activation of proteins by multiple alternative input signals. Here we introduce Boolean Nested Effect Models (B-NEM). B-NEMs combine the use of downstream effects with the higher resolution of signalling pathway structures in Boolean Networks. Results: We show that B-NEMs accurately reconstruct signal flows in simulated data. Using B-NEM we then resolve BCR signalling via PI3K and TAK1 kinases in BL2 lymphoma cell lines. Availability and implementation: R code is available at https://github.com/MartinFXP/B-NEM (github). The BCR signalling dataset is available at the GEO database (http://www.ncbi.nlm.nih.gov/geo/) through accession number GSE68761. Contact: martin-franz-xaver.pirkl@ukr.de, Rainer.Spang@ukr.de Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Martin Pirkl
- Statistical Bioinformatics Department, Institute of Functional Genomics, University of Regensburg, 93053 Regensburg
- Elisabeth Hand
- Department of Haematology and Oncology, University Medical Centre of the Georg-August University of Göttingen, 37073 Göttingen
- Dieter Kube
- Department of Haematology and Oncology, University Medical Centre of the Georg-August University of Göttingen, 37073 Göttingen
- Rainer Spang
- Statistical Bioinformatics Department, Institute of Functional Genomics, University of Regensburg, 93053 Regensburg
11
Fröhlich H. biRte: Bayesian inference of context-specific regulator activities and transcriptional networks. Bioinformatics 2015; 31:3290-8. [PMID: 26112290 DOI: 10.1093/bioinformatics/btv379] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 02/03/2015] [Accepted: 06/15/2015] [Indexed: 11/14/2022]
Abstract
In recent years there has been an increasing effort to computationally model and predict the influence of regulators (transcription factors, miRNAs) on gene expression. Here we introduce biRte as a computationally attractive approach combining Bayesian inference of regulator activities with network reverse engineering. biRte integrates target gene predictions with different omics data entities (e.g. miRNA and mRNA data) into a joint probabilistic framework. The utility of our method is tested in extensive simulation studies and demonstrated with applications from prostate cancer and Escherichia coli growth control. The resulting regulatory networks generally show a good agreement with the biological literature. Availability and implementation: biRte is available on Bioconductor (http://bioconductor.org). Contact: frohlich@bit.uni-bonn.de Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Holger Fröhlich
- University of Bonn, Institute for Computer Science, Römerstr. 164, 53117 Bonn, Germany
12
Siebourg-Polster J, Mudrak D, Emmenlauer M, Rämö P, Dehio C, Greber U, Fröhlich H, Beerenwinkel N. NEMix: single-cell nested effects models for probabilistic pathway stimulation. PLoS Comput Biol 2015; 11:e1004078. [PMID: 25879530 PMCID: PMC4400057 DOI: 10.1371/journal.pcbi.1004078] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 06/26/2014] [Accepted: 12/08/2014] [Indexed: 11/18/2022]
Abstract
Nested effects models have been used successfully for learning subcellular networks from high-dimensional perturbation effects that result from RNA interference (RNAi) experiments. Here, we further develop the basic nested effects model using high-content single-cell imaging data from RNAi screens of cultured cells infected with human rhinovirus. RNAi screens with single-cell readouts are becoming increasingly common, and they often reveal high cell-to-cell variation. As a consequence of this cellular heterogeneity, knock-downs result in variable effects among cells and lead to weak average phenotypes on the cell population level. To address this confounding factor in network inference, we explicitly model the stimulation status of a signaling pathway in individual cells. We extend the framework of nested effects models to probabilistic combinatorial knock-downs and propose NEMix, a nested effects mixture model that accounts for unobserved pathway activation. We analyzed the identifiability of NEMix and developed a parameter inference scheme based on the Expectation Maximization algorithm. In an extensive simulation study, we show that NEMix improves learning of pathway structures over classical NEMs significantly in the presence of hidden pathway stimulation. We applied our model to single-cell imaging data from RNAi screens monitoring human rhinovirus infection, where limited infection efficiency of the assay results in uncertain pathway stimulation. Using a subset of genes with known interactions, we show that the inferred NEMix network has high accuracy and outperforms the classical nested effects model without hidden pathway activity. NEMix is implemented as part of the R/Bioconductor package ‘nem’ and available at www.cbg.ethz.ch/software/NEMix.
Experiments monitoring individual cells show that cells can behave differently even under the same experimental conditions. Summarizing measurements over a population of cells can lead to weak and widely deviating signals, and subsequently applied modeling approaches, like network inference, will suffer from this information loss. Nested effects models, a method tailored to reconstruct signaling networks from high-dimensional read-outs of gene silencing experiments, have so far been applied only on the cell population level. These models assume the pathway under consideration to be activated in all cells. The signal flow is disrupted only when genes are silenced. However, if this assumption is not met, inference results can be incorrect, because observed effects are interpreted wrongly. We extended nested effects models to use the power of single-cell resolution data sets. We introduce a new unobserved factor, which describes the pathway activity of single cells. The pathway activity is learned for each cell during network inference. We apply our model to gene silencing screens, investigating human rhinovirus infection of single cells from microscopy imaging features. Comparing the learned network to the known KEGG pathway of the genes shows that our method recovers networks significantly better than classical nested effects models, which do not capture hidden signaling.
Affiliation(s)
- Juliane Siebourg-Polster
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Daria Mudrak
- Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- Pauli Rämö
- Biozentrum, University of Basel, Basel, Switzerland
- Urs Greber
- Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- Holger Fröhlich
- Algorithmic Bioinformatics, Bonn-Aachen International Center for IT, University of Bonn, Bonn, Germany
- Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
13
Sadeh MJ, Moffa G, Spang R. Considering unknown unknowns: reconstruction of nonconfoundable causal relations in biological networks. J Comput Biol 2014; 20:920-32. [PMID: 24195708 DOI: 10.1089/cmb.2013.0119] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 11/13/2022]
Abstract
Our current understanding of cellular networks is rather incomplete. We overlook important but so far unknown genes and mechanisms in the pathways. Moreover, we often only have a partial account of the molecular interactions and modifications of the known players. When analyzing the cell, we look through narrow windows, leaving potentially important events in blind spots. Network reconstruction is naturally confined to what we have observed. Little is known about how the incompleteness of our observations confounds our interpretation of the available data. Here we ask which features of a network can be confounded by incomplete observations and which cannot. In the context of nested effects models, we show that in the presence of missing observations or hidden factors a reliable reconstruction of the full network is not feasible. Nevertheless, we can show that certain characteristics of signaling networks, like the existence of cross-talk between certain branches of the network, can be inferred in a nonconfoundable way. We derive a test for inferring such nonconfoundable characteristics of signaling networks. Next, we introduce a new data structure to represent partially reconstructed signaling networks. Finally, we evaluate our method both on simulated data and in the context of a study on early stem cell differentiation in mice.
Affiliation(s)
- Mohammad J Sadeh
- Institute of Functional Genomics, Computational Diagnostics Group, University of Regensburg , Regensburg, Germany
|
14
|
Wang X, Yuan K, Hellmayr C, Liu W, Markowetz F. Reconstructing evolving signalling networks by hidden Markov nested effects models. Ann Appl Stat 2014. [DOI: 10.1214/13-aoas696] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 11/19/2022]
|
15
|
Inferring regulatory networks by combining perturbation screens and steady state gene expression profiles. PLoS One 2014; 9:e82393. [PMID: 24586224 PMCID: PMC3938831 DOI: 10.1371/journal.pone.0082393] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 05/26/2012] [Accepted: 11/01/2013] [Indexed: 11/19/2022] Open
Abstract
Reconstructing transcriptional regulatory networks is an important task in functional genomics. Data obtained from experiments that perturb genes by knockouts or RNA interference contain useful information for addressing this reconstruction problem. However, such data can be limited in size and/or are expensive to acquire. On the other hand, observational data of the organism in steady state (e.g., wild-type) are more readily available, but their informational content is inadequate for the task at hand. We develop a computational approach to appropriately utilize both data sources for estimating a regulatory network. The proposed approach is based on a three-step algorithm to estimate the underlying directed but cyclic network, that uses as input both perturbation screens and steady state gene expression data. In the first step, the algorithm determines causal orderings of the genes that are consistent with the perturbation data, by combining an exhaustive search method with a fast heuristic that in turn couples a Monte Carlo technique with a fast search algorithm. In the second step, for each obtained causal ordering, a regulatory network is estimated using a penalized likelihood based method, while in the third step a consensus network is constructed from the highest scored ones. Extensive computational experiments show that the algorithm performs well in reconstructing the underlying network and clearly outperforms competing approaches that rely only on a single data source. Further, it is established that the algorithm produces a consistent estimate of the regulatory network.
|
16
|
Dümcke S, Bräuer J, Anchang B, Spang R, Beerenwinkel N, Tresch A. Exact likelihood computation in Boolean networks with probabilistic time delays, and its application in signal network reconstruction. Bioinformatics 2013; 30:414-9. [PMID: 24292937 DOI: 10.1093/bioinformatics/btt696] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 01/01/2023]
Abstract
MOTIVATION For biological pathways, it is common to measure a gene expression time series after various knockdowns of genes that are putatively involved in the process of interest. These interventional time-resolved data are most suitable for the elucidation of dynamic causal relationships in signaling networks. Even with this kind of data it is still a major and largely unsolved challenge to infer the topology and interaction logic of the underlying regulatory network. RESULTS In this work, we present a novel model-based approach involving Boolean networks to reconstruct small to medium-sized regulatory networks. In particular, we solve the problem of exact likelihood computation in Boolean networks with probabilistic exponential time delays. Simulations demonstrate the high accuracy of our approach. We apply our method to data of Ivanova et al. (2006), where RNA interference knockdown experiments were used to build a network of the key regulatory genes governing mouse stem cell maintenance and differentiation. In contrast to previous analyses of that data set, our method can identify feedback loops and provides new insights into the interplay of some master regulators in embryonic stem cell development. AVAILABILITY AND IMPLEMENTATION The algorithm is implemented in the statistical language R. Code and documentation are available at Bioinformatics online. CONTACT duemcke@mpipz.mpg.de or tresch@mpipz.mpg.de SUPPLEMENTARY INFORMATION Supplementary Materials are available at Bioinformatics online.
Affiliation(s)
- Sebastian Dümcke
- Institute for Genetics, University of Cologne, 50674 Cologne, Max Planck Institute for Plant Breeding Research, 50829 Cologne, Gene Center, Department of Biochemistry, Ludwig-Maximilians University, 81379 Munich, Germany, Institute for Functional Genomics, 93053 Regensburg, Germany, Department of Radiology, Center for Cancer Systems Biology, Stanford University, Stanford, CA 94305-5488, USA and ETH Zürich, Department of Biosystems Science and Engineering, Mattenstrasse 26 4058 Basel, Switzerland
|
17
|
Knapp B, Kaderali L. Reconstruction of cellular signal transduction networks using perturbation assays and linear programming. PLoS One 2013; 8:e69220. [PMID: 23935958 PMCID: PMC3728289 DOI: 10.1371/journal.pone.0069220] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 12/17/2012] [Accepted: 06/06/2013] [Indexed: 12/23/2022] Open
Abstract
Perturbation experiments, for example using RNA interference (RNAi), offer an attractive way to elucidate gene function in a high-throughput fashion. The placement of hit genes in their functional context and the inference of underlying networks from such data, however, are challenging tasks. One of the problems in network inference is the exponential number of possible network topologies for a given number of genes. Here, we introduce a novel mathematical approach to address this problem. We formulate network inference as a linear optimization problem, which can be solved efficiently even for large-scale systems. We use simulated data to evaluate our approach, and show improved performance, in particular on larger networks, over state-of-the-art methods. We achieve increased sensitivity and specificity, as well as a significant reduction in computing time. Furthermore, we show superior performance on noisy data. We then apply our approach to study the intracellular signaling of human primary naïve CD4+ T-cells, as well as ErbB signaling in trastuzumab-resistant breast cancer cells. In both cases, our approach recovers known interactions and points to additional relevant processes. In ErbB signaling, our results predict an important role of negative and positive feedback in controlling cell cycle progression.
Affiliation(s)
- Bettina Knapp
- Institute for Medical Informatics and Biometry, Medical Faculty Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- ViroQuant Research Group Modeling, BioQuant, Heidelberg University, Heidelberg, Germany
- Lars Kaderali
- Institute for Medical Informatics and Biometry, Medical Faculty Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- ViroQuant Research Group Modeling, BioQuant, Heidelberg University, Heidelberg, Germany
|
18
|
Failmezger H, Praveen P, Tresch A, Fröhlich H. Learning gene network structure from time laps cell imaging in RNAi Knock downs. Bioinformatics 2013; 29:1534-40. [PMID: 23595660 DOI: 10.1093/bioinformatics/btt179] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION As RNA interference is becoming a standard method for targeted gene perturbation, computational approaches to reverse engineer parts of biological networks based on measurable effects of RNAi become increasingly relevant. The vast majority of these methods use gene expression data, but little attention has been paid so far to other data types. RESULTS Here we present a method, which can infer gene networks from high-dimensional phenotypic perturbation effects on single cells recorded by time-lapse microscopy. We use data from the Mitocheck project to extract multiple shape, intensity and texture features at each frame. Features from different cells and movies are then aligned along the cell cycle time. Subsequently we use Dynamic Nested Effects Models (dynoNEMs) to estimate parts of the network structure between perturbed genes via a Markov Chain Monte Carlo approach. Our simulation results indicate a high reconstruction quality of this method. A reconstruction based on 22 gene knock downs yielded a network, where all edges could be explained via the biological literature. AVAILABILITY The implementation of dynoNEMs is part of the Bioconductor R-package nem.
Affiliation(s)
- Henrik Failmezger
- Computational Biology and Regulatory Networks, Max-Planck Institute for Plant Breeding Research, Carl-von-Linne-Weg 10, 50829 Cologne, Germany
|
19
|
|