1
Michoel T, Zhang JD. Causal inference in drug discovery and development. Drug Discov Today 2023; 28:103737. [PMID: 37591410 DOI: 10.1016/j.drudis.2023.103737] [Received: 09/11/2022] [Revised: 07/31/2023] [Accepted: 08/10/2023] [Indexed: 08/19/2023]
Abstract
To discover new drugs is to seek and to prove causality. As an emerging approach leveraging human knowledge and creativity, data, and machine intelligence, causal inference holds the promise of reducing cognitive bias and improving decision-making in drug discovery. Although it has been applied across the value chain, the concepts and practice of causal inference remain obscure to many practitioners. This article offers a nontechnical introduction to causal inference, reviews its recent applications, and discusses opportunities and challenges of adopting the causal language in drug discovery and development.
Affiliation(s)
- Tom Michoel
- Computational Biology Unit, Department of Informatics, University of Bergen, Postboks 7803, 5020 Bergen, Norway
- Jitao David Zhang
- Pharma Early Research and Development, Roche Innovation Centre Basel, F. Hoffmann-La Roche, Grenzacherstrasse 124, 4070 Basel, Switzerland; Department of Mathematics and Computer Science, University of Basel, Spiegelgasse 1, 4051 Basel, Switzerland.
2
Lee S, Vu HM, Lee JH, Lim H, Kim MS. Advances in Mass Spectrometry-Based Single Cell Analysis. Biology 2023; 12:395. [PMID: 36979087 PMCID: PMC10045136 DOI: 10.3390/biology12030395] [Received: 12/30/2022] [Revised: 02/27/2023] [Accepted: 03/01/2023] [Indexed: 03/06/2023]
Abstract
Technological developments and improvements in single-cell isolation and analytical platforms allow for advanced molecular profiling at the single-cell level, which reveals cell-to-cell variation within mixed cell populations in complex biological or clinical systems. This helps to understand the cellular heterogeneity of normal or diseased tissues and organs. However, most studies have focused on the analysis of nucleic acids (e.g., DNA and RNA), and mass spectrometry (MS)-based analysis of proteins and metabolites in single cells lagged until recently. Undoubtedly, MS-based single-cell analysis will provide a deeper insight into cellular mechanisms related to health and disease. This review summarizes recent advances in MS-based single-cell analysis methods and their applications in biology and medicine.
Affiliation(s)
- Siheun Lee
- School of Undergraduate Studies, Daegu Gyeongbuk Institute of Science and Technology (DGIST), Daegu 42988, Republic of Korea
- Hung M. Vu
- Department of New Biology, Daegu Gyeongbuk Institute of Science and Technology (DGIST), Daegu 42988, Republic of Korea
- Jung-Hyun Lee
- Department of New Biology, Daegu Gyeongbuk Institute of Science and Technology (DGIST), Daegu 42988, Republic of Korea
- Heejin Lim
- Center for Scientific Instrumentation, Korea Basic Science Institute (KBSI), Cheongju 28119, Republic of Korea
- Min-Sik Kim
- Department of New Biology, Daegu Gyeongbuk Institute of Science and Technology (DGIST), Daegu 42988, Republic of Korea
- New Biology Research Center, Daegu Gyeongbuk Institute of Science and Technology (DGIST), Daegu 42988, Republic of Korea
- Center for Cell Fate Reprogramming and Control, Daegu Gyeongbuk Institute of Science and Technology (DGIST), Daegu 42988, Republic of Korea
3
Horowitz BB, Nanda S, Walhout AJ. A Transcriptional Cofactor Regulatory Network for the C. elegans Intestine. bioRxiv 2023:2023.01.05.522920. [PMID: 36711629 PMCID: PMC9881946 DOI: 10.1101/2023.01.05.522920] [Indexed: 01/09/2023]
Abstract
Chromatin modifiers and transcriptional cofactors (collectively referred to as CFs) work with DNA-binding transcription factors (TFs) to regulate gene expression. In multicellular eukaryotes, distinct tissues each execute their own gene expression program for accurate differentiation and subsequent functionality. While the function of TFs in differential gene expression has been studied in detail in many systems, the contribution of CFs has remained less explored. Here we uncovered the contributions of CFs to gene regulation in the Caenorhabditis elegans intestine. We first annotated 366 CFs encoded by the C. elegans genome and assembled a library of 335 RNAi clones. Using this library, we analyzed the effects of individually depleting these CFs on the expression of 19 fluorescent transcriptional reporters in the intestine and identified 216 regulatory interactions. We found that different CFs interact specifically with different promoters, and that both essential and intestinally expressed CFs exhibit the highest proportion of interactions. We did not find all members of CF complexes acting on the same set of reporters but instead found diversity in the promoter targets of each complex component. Finally, we found that previously identified activation mechanisms for the acdh-1 promoter use different CFs and TFs. Overall, we demonstrate that CFs function specifically rather than ubiquitously at intestinal promoters and provide an RNAi resource for reverse genetic screens.
4
Sahoo A, Pechmann S. Functional network motifs defined through integration of protein-protein and genetic interactions. PeerJ 2022; 10:e13016. [PMID: 35223214 PMCID: PMC8877332 DOI: 10.7717/peerj.13016] [Received: 11/11/2021] [Accepted: 02/06/2022] [Indexed: 01/11/2023]
Abstract
Cells are enticingly complex systems. The identification of feedback regulation is critically important for understanding this complexity. Network motifs defined as small graphlets that occur more frequently than expected by chance have revolutionized our understanding of feedback circuits in cellular networks. However, with their definition solely based on statistical over-representation, network motifs often lack biological context, which limits their usefulness. Here, we define functional network motifs (FNMs) through the systematic integration of genetic interaction data that directly inform on functional relationships between genes and encoded proteins. Occurring two orders of magnitude less frequently than conventional network motifs, we found FNMs significantly enriched in genes known to be functionally related. Moreover, our comprehensive analyses of FNMs in yeast showed that they are powerful at capturing both known and putative novel regulatory interactions, thus suggesting a promising strategy towards the systematic identification of feedback regulation in biological networks. Many FNMs appeared as excellent candidates for the prioritization of follow-up biochemical characterization, which is a recurring bottleneck in the targeting of complex diseases. More generally, our work highlights a fruitful avenue for integrating and harnessing genomic network data.
Affiliation(s)
- Amruta Sahoo
- Département de Biochimie, Université de Montréal, Montréal, QC, Canada
5
Lüönd F, Pirkl M, Hisano M, Prestigiacomo V, Kalathur RK, Beerenwinkel N, Christofori G. Hierarchy of TGFβ/SMAD, Hippo/YAP/TAZ, and Wnt/β-catenin signaling in melanoma phenotype switching. Life Sci Alliance 2021; 5:5/2/e202101010. [PMID: 34819356 PMCID: PMC8616544 DOI: 10.26508/lsa.202101010] [Received: 01/05/2021] [Revised: 11/11/2021] [Accepted: 11/12/2021] [Indexed: 12/13/2022]
Abstract
TGFβ, YAP/TAZ, and canonical Wnt/β-catenin signaling functionally interact in a hierarchical manner to induce the switching of melanoma cells from proliferative-to-invasive cell phenotype. In melanoma, a switch from a proliferative melanocytic to an invasive mesenchymal phenotype is based on dramatic transcriptional reprogramming which involves complex interactions between a variety of signaling pathways and their downstream transcriptional regulators. TGFβ/SMAD, Hippo/YAP/TAZ, and Wnt/β-catenin signaling pathways are major inducers of transcriptional reprogramming and converge at several levels. Here, we report that TGFβ/SMAD, YAP/TAZ, and β-catenin are all required for a proliferative-to-invasive phenotype switch. Loss and gain of function experimentation, global gene expression analysis, and computational nested effects models revealed the hierarchy between these signaling pathways and identified shared target genes. SMAD-mediated transcription at the top of the hierarchy leads to the activation of YAP/TAZ and of β-catenin, with YAP/TAZ governing an essential subprogram of TGFβ-induced phenotype switching. Wnt/β-catenin signaling is situated further downstream and exerts a dual role: it promotes the proliferative, differentiated melanoma cell phenotype and it is essential but not sufficient for SMAD or YAP/TAZ–induced phenotype switching. The results identify epistatic interactions among the signaling pathways underlying melanoma phenotype switching and highlight the priorities in targets for melanoma therapy.
Affiliation(s)
- Fabiana Lüönd
- Department of Biomedicine, University of Basel, Basel, Switzerland
- Martin Pirkl
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Mizue Hisano
- Department of Biomedicine, University of Basel, Basel, Switzerland
- Ravi Kr Kalathur
- Department of Biomedicine, University of Basel, Basel, Switzerland
- Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, Switzerland
6
Zhang Y, Zhu L, Wang X. NEM-Tar: A Probabilistic Graphical Model for Cancer Regulatory Network Inference and Prioritization of Potential Therapeutic Targets From Multi-Omics Data. Front Genet 2021; 12:608042. [PMID: 33968127 PMCID: PMC8100334 DOI: 10.3389/fgene.2021.608042] [Received: 09/18/2020] [Accepted: 03/22/2021] [Indexed: 11/13/2022]
Abstract
Targeted therapy has been widely adopted as an effective treatment strategy against cancer. However, cancers are not single disease entities but comprise multiple molecularly distinct subtypes, and this heterogeneity prevents precise selection of patients for optimized therapy. Dissecting cancer subtype-specific signaling pathways is crucial to pinpointing dysregulated genes for the prioritization of novel therapeutic targets. Nested effects models (NEMs) are a group of graphical models that encode subset relations between observed downstream effects under perturbations to upstream signaling genes, providing a prototype for mapping the inner workings of the cell. In this study, we developed NEM-Tar, which extends the original NEMs to predict drug targets by incorporating causal information of (epi)genetic aberrations for signaling pathway inference. An information theory-based score, weighted information gain (WIG), was proposed to assess the impact of signaling genes on a specific downstream biological process of interest. Subsequently, we conducted simulation studies to compare three inference methods and found that the greedy hill-climbing algorithm demonstrated the highest accuracy and robustness to noise. Furthermore, two case studies were conducted using multi-omics data for colorectal cancer (CRC) and gastric cancer (GC) in the TCGA database. Using NEM-Tar, we inferred signaling networks driving the poor-prognosis subtypes of CRC and GC, respectively. Our model prioritized not only potential individual drug targets such as HER2, for which FDA-approved inhibitors are available, but also combinations of multiple targets potentially useful for the design of combination therapies.
Affiliation(s)
- Yuchen Zhang
- Department of Biomedical Sciences, City University of Hong Kong, Hong Kong, China
- Lina Zhu
- Department of Biomedical Sciences, City University of Hong Kong, Hong Kong, China
- Xin Wang
- Department of Biomedical Sciences, City University of Hong Kong, Hong Kong, China; Key Laboratory of Biochip Technology, Biotech and Health Centre, Shenzhen Research Institute, City University of Hong Kong, Shenzhen, China
7
Hackett SR, Baltz EA, Coram M, Wranik BJ, Kim G, Baker A, Fan M, Hendrickson DG, Berndl M, McIsaac RS. Learning causal networks using inducible transcription factors and transcriptome-wide time series. Mol Syst Biol 2021; 16:e9174. [PMID: 32181581 PMCID: PMC7076914 DOI: 10.15252/msb.20199174] [Received: 08/13/2019] [Revised: 02/13/2020] [Accepted: 02/19/2020] [Indexed: 11/27/2022]
Abstract
We present IDEA (the Induction Dynamics gene Expression Atlas), a dataset constructed by independently inducing hundreds of transcription factors (TFs) and measuring timecourses of the resulting gene expression responses in budding yeast. Each experiment captures a regulatory cascade connecting a single induced regulator to the genes it causally regulates. We discuss the regulatory cascade of a single TF, Aft1, in detail; however, IDEA contains > 200 TF induction experiments with 20 million individual observations and 100,000 signal‐containing dynamic responses. As an application of IDEA, we integrate all timecourses into a whole‐cell transcriptional model, which is used to predict and validate multiple new and underappreciated transcriptional regulators. We also find that the magnitudes of coefficients in this model are predictive of genetic interaction profile similarities. In addition to being a resource for exploring regulatory connectivity between TFs and their target genes, our modeling approach shows that combining rapid perturbations of individual genes with genome‐scale time‐series measurements is an effective strategy for elucidating gene regulatory networks.
Affiliation(s)
- Griffin Kim
- Calico Life Sciences LLC, South San Francisco, CA, USA
- Adam Baker
- Calico Life Sciences LLC, South San Francisco, CA, USA
8
Tiuryn J, Szczurek E. Learning signaling networks from combinatorial perturbations by exploiting siRNA off-target effects. Bioinformatics 2020; 35:i605-i614. [PMID: 31510678 PMCID: PMC6612802 DOI: 10.1093/bioinformatics/btz334] [Indexed: 12/12/2022]
Abstract
Motivation: Perturbation experiments constitute the central means to study cellular networks. Several confounding factors complicate computational modeling of signaling networks from these data. First, the technique of RNA interference (RNAi), designed and commonly used to knock down specific genes, suffers from off-target effects. As a result, each experiment is a combinatorial perturbation of multiple genes. Second, the perturbations propagate along unknown connections in the signaling network. Once the signal is blocked by perturbation, proteins downstream of the targeted proteins also become inactivated. Finally, all perturbed network members, whether directly targeted by the experiment or affected by propagation in the network, contribute to the observed effect, either positively or negatively. One of the key questions in computational inference of signaling networks from such data is how many and what combinations of perturbations are required to uniquely and accurately infer the model. Results: Here, we introduce an enhanced version of linear effects models (LEMs), which extends the original by accounting for both negative and positive contributions of the perturbed network proteins to the observed phenotype. We prove that the enhanced LEMs are identified from data measured under perturbations of all single proteins, pairs and triplets of network proteins. For small networks of up to five nodes, only perturbations of single proteins and pairs of proteins are required for identifiability. Extensive simulations demonstrate that enhanced LEMs achieve excellent accuracy of parameter estimation and network structure learning, outperforming the previous version on realistic data. LEMs applied to Bartonella henselae infection RNAi screening data identified known interactions between eight nodes of the infection network, confirming the high specificity of our model, and suggested one new interaction.
Availability and implementation: https://github.com/EwaSzczurek/LEM. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jerzy Tiuryn
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
- Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
9
Cardner M, Meyer-Schaller N, Christofori G, Beerenwinkel N. Inferring signalling dynamics by integrating interventional with observational data. Bioinformatics 2020; 35:i577-i585. [PMID: 31510686 PMCID: PMC6612850 DOI: 10.1093/bioinformatics/btz325] [Indexed: 12/17/2022]
Abstract
Motivation: In order to infer a cell signalling network, we generally need interventional data from perturbation experiments. If the perturbation experiments are time-resolved, then signal progression through the network can be inferred. However, such designs are infeasible for large signalling networks, where it is more common to have steady-state perturbation data on the one hand, and a non-interventional time series on the other. Such was the design in a recent experiment investigating the coordination of epithelial–mesenchymal transition (EMT) in murine mammary gland cells. We aimed to infer the underlying signalling network of transcription factors and microRNAs coordinating EMT, as well as the signal progression during EMT. Results: In the context of nested effects models, we developed a method for integrating perturbation data with a non-interventional time series. We applied the model to RNA sequencing data obtained from an EMT experiment. Part of the network inferred from RNA interference was validated experimentally using luciferase reporter assays. Our model extension is formulated as an integer linear programme, which can be solved efficiently using heuristic algorithms. This extension allowed us to infer the signal progression through the network during an EMT time course, and thereby assess when each regulator is necessary for EMT to advance. Availability and implementation: R package at https://github.com/cbg-ethz/timeseriesNEM. The RNA sequencing data and microscopy images can be explored through a Shiny app at https://emt.bsse.ethz.ch. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mathias Cardner
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, Switzerland
10
Holding AN, Cook HV, Markowetz F. Data generation and network reconstruction strategies for single cell transcriptomic profiles of CRISPR-mediated gene perturbations. Biochim Biophys Acta Gene Regul Mech 2020; 1863:194441. [PMID: 31756390 DOI: 10.1016/j.bbagrm.2019.194441] [Received: 06/07/2019] [Revised: 10/01/2019] [Accepted: 10/01/2019] [Indexed: 02/05/2023]
Abstract
Recent advances in single-cell RNA-sequencing (scRNA-seq) in combination with CRISPR/Cas9 technologies have enabled the development of methods for large-scale perturbation studies with transcriptional readouts. These methods are highly scalable and have the potential to provide a wealth of information on the biological networks that underlie cellular response. Here we discuss how to overcome several key challenges to generate and analyse data for the confident reconstruction of models of the underlying cellular network. Some challenges are generic, and apply to analysing any single-cell transcriptomic data, while others are specific to combined single-cell CRISPR/Cas9 data, in particular barcode swapping, knockdown efficiency, multiplicity of infection and potential confounding factors. We also provide a curated collection of published data sets to aid the development of analysis strategies. Finally, we discuss several network reconstruction approaches, including co-expression networks and Bayesian networks, as well as their limitations, and highlight the potential of Nested Effects Models for network reconstruction from scRNA-seq data. This article is part of a Special Issue entitled: Transcriptional Profiles and Regulatory Gene Networks edited by Dr. Federico Manuel Giorgi and Dr. Shaun Mahony.
Affiliation(s)
- Andrew N Holding
- Department of Biology, University of York, York, UK; York Biomedical Research Institute, University of York, York, UK; CRUK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, UK; The Alan Turing Institute, 96 Euston Road, Kings Cross, London, UK
- Helen V Cook
- Department of Biology, University of York, York, UK
11
Sverchkov Y, Ho YH, Gasch A, Craven M. Context-Specific Nested Effects Models. J Comput Biol 2020; 27:403-417. [PMID: 32053004 DOI: 10.1089/cmb.2019.0459] [Indexed: 11/12/2022]
Abstract
Advances in systems biology have made clear the importance of network models for capturing knowledge about complex relationships in gene regulation, metabolism, and cellular signaling. A common approach to uncovering biological networks involves performing perturbations on elements of the network, such as gene knockdown experiments, and measuring how the perturbation affects some reporter of the process under study. In this article, we develop context-specific nested effects models (CSNEMs), an approach to inferring such networks that generalizes nested effects models (NEMs). The main contribution of this work is that CSNEMs explicitly model the participation of a gene in multiple contexts, meaning that a gene can appear in multiple places in the network. Biologically, the representation of regulators in multiple contexts may indicate that these regulators have distinct roles in different cellular compartments or cell cycle phases. We present an evaluation of the method on simulated data as well as on data from a study of the sodium chloride stress response in Saccharomyces cerevisiae.
Affiliation(s)
- Yuriy Sverchkov
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin
- Yi-Hsuan Ho
- Department of Genetics, University of Wisconsin-Madison, Madison, Wisconsin
- Audrey Gasch
- Department of Genetics, University of Wisconsin-Madison, Madison, Wisconsin
- Mark Craven
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin
12
13
Ritter CD, Faurby S, Bennett DJ, Naka LN, Ter Steege H, Zizka A, Haenel Q, Nilsson RH, Antonelli A. The pitfalls of biodiversity proxies: Differences in richness patterns of birds, trees and understudied diversity across Amazonia. Sci Rep 2019; 9:19205. [PMID: 31844092 PMCID: PMC6915760 DOI: 10.1038/s41598-019-55490-3] [Received: 08/27/2019] [Accepted: 11/25/2019] [Indexed: 01/09/2023]
Abstract
Most knowledge on biodiversity derives from the study of charismatic macro-organisms, such as birds and trees. However, the diversity of micro-organisms constitutes the majority of all life forms on Earth. Here, we ask if the patterns of richness inferred for macro-organisms are similar for micro-organisms. For this, we barcoded samples of soil, litter and insects from four localities on a west-to-east transect across Amazonia. We quantified richness as Operational Taxonomic Units (OTUs) in those samples using three molecular markers. We then compared OTU richness with species richness of two relatively well-studied organism groups in Amazonia: trees and birds. We find that OTU richness shows a declining west-to-east diversity gradient that is in agreement with the species richness patterns documented here and previously for birds and trees. These results suggest that most taxonomic groups respond to the same overall diversity gradients at large spatial scales. However, our results show a different pattern of richness in relation to habitat types, suggesting that the idiosyncrasies of each taxonomic group and peculiarities of the local environment frequently override large-scale diversity gradients. Our findings caution against using the diversity distribution of one taxonomic group as an indication of patterns of richness across all groups.
Affiliation(s)
- Camila D Ritter
- Department of Eukaryotic Microbiology, University of Duisburg-Essen, Universitätsstrasse 5 S05 R04 H83, D-45141, Essen, Germany; Gothenburg Global Biodiversity Centre, Box 461, SE-405 30, Göteborg, Sweden; Department of Biological and Environmental Sciences, University of Gothenburg, Box 463, SE-405 30, Göteborg, Sweden
- Søren Faurby
- Gothenburg Global Biodiversity Centre, Box 461, SE-405 30, Göteborg, Sweden; Department of Biological and Environmental Sciences, University of Gothenburg, Box 463, SE-405 30, Göteborg, Sweden
- Dominic J Bennett
- Gothenburg Global Biodiversity Centre, Box 461, SE-405 30, Göteborg, Sweden; Department of Biological and Environmental Sciences, University of Gothenburg, Box 463, SE-405 30, Göteborg, Sweden
- Luciano N Naka
- Laboratório de Ornitologia, Departamento de Zoologia, Universidade Federal de Pernambuco, Recife, PE, Brazil
- Hans Ter Steege
- Naturalis Biodiversity Center, Leiden, Netherlands; Systems Ecology, Free University, Amsterdam, Netherlands
- Alexander Zizka
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103, Leipzig, Germany
- Quiterie Haenel
- Zoological Institute, University of Basel, Vesalgasse 1, CH-4051, Basel, Switzerland
- R Henrik Nilsson
- Gothenburg Global Biodiversity Centre, Box 461, SE-405 30, Göteborg, Sweden; Department of Biological and Environmental Sciences, University of Gothenburg, Box 463, SE-405 30, Göteborg, Sweden
- Alexandre Antonelli
- Gothenburg Global Biodiversity Centre, Box 461, SE-405 30, Göteborg, Sweden; Department of Biological and Environmental Sciences, University of Gothenburg, Box 463, SE-405 30, Göteborg, Sweden; Royal Botanic Gardens, Kew, TW9 3AE, Richmond, Surrey, UK
14
Liu A, Trairatphisan P, Gjerga E, Didangelos A, Barratt J, Saez-Rodriguez J. From expression footprints to causal pathways: contextualizing large signaling networks with CARNIVAL. NPJ Syst Biol Appl 2019; 5:40. [PMID: 31728204 PMCID: PMC6848167 DOI: 10.1038/s41540-019-0118-z] [Received: 06/12/2019] [Accepted: 10/09/2019] [Indexed: 12/19/2022]
Abstract
While gene expression profiling is commonly used to gain an overview of cellular processes, the identification of upstream processes that drive expression changes remains a challenge. To address this issue, we introduce CARNIVAL, a causal network contextualization tool which derives network architectures from gene expression footprints. CARNIVAL (CAusal Reasoning pipeline for Network identification using Integer VALue programming) integrates different sources of prior knowledge including signed and directed protein-protein interactions, transcription factor targets, and pathway signatures. The use of prior knowledge in CARNIVAL enables capturing a broad set of upstream cellular processes and regulators, leading to a higher accuracy when benchmarked against related tools. Implementation as an integer linear programming (ILP) problem guarantees efficient computation. As a case study, we applied CARNIVAL to contextualize signaling networks from gene expression data in IgA nephropathy (IgAN), a condition that can lead to chronic kidney disease. CARNIVAL identified specific signaling pathways and associated mediators dysregulated in IgAN including Wnt and TGF-β, which we subsequently validated experimentally. These results demonstrated how CARNIVAL generates hypotheses on potential upstream alterations that propagate through signaling networks, providing insights into diseases.
Affiliation(s)
- Anika Liu
- Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute of Computational Biomedicine, Bioquant, 69120 Heidelberg, Germany
- RWTH Aachen University, Faculty of Medicine, Joint Research Centre for Computational Biomedicine (JRC-COMBINE), 52074 Aachen, Germany
- Panuwat Trairatphisan
- Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute of Computational Biomedicine, Bioquant, 69120 Heidelberg, Germany
- Enio Gjerga
- Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute of Computational Biomedicine, Bioquant, 69120 Heidelberg, Germany
- RWTH Aachen University, Faculty of Medicine, Joint Research Centre for Computational Biomedicine (JRC-COMBINE), 52074 Aachen, Germany
- Athanasios Didangelos
- Department of Infection, Immunity and Inflammation, University of Leicester, Leicester, UK
- Jonathan Barratt
- Department of Infection, Immunity and Inflammation, University of Leicester, Leicester, UK
- Julio Saez-Rodriguez
- Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute of Computational Biomedicine, Bioquant, 69120 Heidelberg, Germany
- RWTH Aachen University, Faculty of Medicine, Joint Research Centre for Computational Biomedicine (JRC-COMBINE), 52074 Aachen, Germany
| |
Collapse
|
15
|
Meyer AS, Heiser LM. Systems biology approaches to measure and model phenotypic heterogeneity in cancer. Current Opinion in Systems Biology 2019; 17:35-40. [PMID: 32864511 PMCID: PMC7449235 DOI: 10.1016/j.coisb.2019.09.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/15/2022]
Abstract
The recent widespread adoption of single-cell profiling technologies has revealed that individual cancers are not homogeneous collections of deregulated cells, but are instead composed of multiple genetically and phenotypically distinct cell subpopulations that exhibit a wide range of responses to extracellular signals and therapeutic insult. Such observations point to the urgent need to understand cancer as a complex, adaptive system. Cancer systems biology studies seek to develop the experimental and theoretical methods required to understand how biological components work together to determine how cancer cells function. Ultimately, such approaches will lead to improvements in how cancer is managed and treated. In this review, we discuss recent advances in cancer systems biology approaches to quantify, model, and elucidate mechanisms of heterogeneity.
Affiliation(s)
- Aaron S. Meyer
  - Department of Bioengineering, University of California Los Angeles, Los Angeles, CA, USA
- Laura M. Heiser
  - Department of Biomedical Engineering and OHSU Center for Spatial Systems Biomedicine, OHSU, Portland, OR, USA
|
16
|
Pacini C, Koziol MJ. Bioinformatics challenges and perspectives when studying the effect of epigenetic modifications on alternative splicing. Philos Trans R Soc Lond B Biol Sci 2019; 373:rstb.2017.0073. [PMID: 29685977 PMCID: PMC5915717 DOI: 10.1098/rstb.2017.0073] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Accepted: 09/14/2017] [Indexed: 02/07/2023]
Abstract
It is widely known that epigenetic modifications are important in regulating transcription, but several have also been reported in alternative splicing. The regulation of pre-mRNA splicing is important to explain proteomic diversity and the misregulation of splicing has been implicated in many diseases. Here, we give a brief overview of the role of epigenetics in alternative splicing and disease. We then discuss the bioinformatics methods that can be used to model interactions between epigenetic marks and regulators of splicing. These models can be used to identify alternative splicing and epigenetic changes across different phenotypes. This article is part of a discussion meeting issue ‘Frontiers in epigenetic chemical biology’.
Affiliation(s)
- Clare Pacini
  - Wellcome Trust Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN, UK
  - Department of Zoology, University of Cambridge, Downing Street, Cambridge, CB2 3EJ, UK
- Magdalena J Koziol
  - Wellcome Trust Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN, UK
  - Department of Zoology, University of Cambridge, Downing Street, Cambridge, CB2 3EJ, UK
|
17
|
Meyer-Schaller N, Cardner M, Diepenbruck M, Saxena M, Tiede S, Lüönd F, Ivanek R, Beerenwinkel N, Christofori G. A Hierarchical Regulatory Landscape during the Multiple Stages of EMT. Dev Cell 2019; 48:539-553.e6. [DOI: 10.1016/j.devcel.2018.12.023] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 07/03/2018] [Revised: 11/28/2018] [Accepted: 12/28/2018] [Indexed: 01/02/2023]
|
18
|
Franks AM, Markowetz F, Airoldi EM. Refining cellular pathway models using an ensemble of heterogeneous data sources. Ann Appl Stat 2018; 12:1361-1384. [PMID: 36506698 PMCID: PMC9733905 DOI: 10.1214/16-aoas915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022]
Abstract
Improving current models and hypotheses of cellular pathways is one of the major challenges of systems biology and functional genomics. There is a need for methods to build on established expert knowledge and reconcile it with results of new high-throughput studies. Moreover, the available sources of data are heterogeneous, and the data need to be integrated in different ways depending on which part of the pathway they are most informative for. In this paper, we introduce a compartment specific strategy to integrate edge, node and path data for refining a given network hypothesis. To carry out inference, we use a local-move Gibbs sampler for updating the pathway hypothesis from a compendium of heterogeneous data sources, and a new network regression idea for integrating protein attributes. We demonstrate the utility of this approach in a case study of the pheromone response MAPK pathway in the yeast S. cerevisiae.
Affiliation(s)
- Alexander M Franks
  - Department of Statistics and Applied Probability, University of California, Santa Barbara, South Hall, Santa Barbara, California 93106, USA
- Florian Markowetz
  - Cancer Research UK, Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Robinson Way, Cambridge, CB2 0RE, United Kingdom
- Edoardo M Airoldi
  - Fox School of Business, Department of Statistical Science, Temple University, Center for Data Science, 1810 Liacouras Walk, Philadelphia, Pennsylvania 19122, USA
|
19
|
Abstract
Motivation: New technologies allow for the elaborate measurement of different traits of single cells under genetic perturbations. These interventional data promise to elucidate intra-cellular networks in unprecedented detail and further help to improve treatment of diseases like cancer. However, cell populations can be very heterogeneous. Results: We developed a mixture of Nested Effects Models (M&NEM) for single-cell data to simultaneously identify different cellular subpopulations and their corresponding causal networks to explain the heterogeneity in a cell population. For inference, we assign each cell to a network with a certain probability and iteratively update the optimal networks and cell probabilities in an Expectation Maximization scheme. We validate our method in the controlled setting of a simulation study and apply it to three data sets of pooled CRISPR screens generated previously by two novel experimental techniques, namely Crop-Seq and Perturb-Seq. Availability and implementation: The mixture Nested Effects Model (M&NEM) is available as the R-package mnem at https://github.com/cbg-ethz/mnem/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Martin Pirkl
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Niko Beerenwinkel
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
|
20
|
Variable Selection and Joint Estimation of Mean and Covariance Models with an Application to eQTL Data. Computational and Mathematical Methods in Medicine 2018; 2018:4626307. [PMID: 30046352 PMCID: PMC6036858 DOI: 10.1155/2018/4626307] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/01/2017] [Revised: 12/25/2017] [Accepted: 04/18/2018] [Indexed: 11/24/2022]
Abstract
In genomic data analysis, the underlying regulatory relationships among multiple genes often cannot be ascertained because of unknown genetic complexity and epigenetic regulation. In this paper, we consider a joint mean and constant covariance model (JMCCM) that elucidates the conditional dependence structure of genes while controlling for potential genotype perturbations. To this end, the modified Cholesky decomposition is used to parametrize the entries of a precision matrix. The JMCCM maximizes the likelihood function to estimate the model parameters. We also develop a variable selection algorithm that selects explanatory variables and Cholesky factors by using a combination of GCV and BIC as benchmarks, together with Rao and Wald statistics. Importantly, sparse estimation of the precision matrix (or, equivalently, the gene network) is achieved effectively via the proposed variable selection scheme and helps to identify significant hub genes that are concordant with a priori biological evidence. In simulation studies, we confirm that our model selection efficiently identifies the true underlying networks. In an application to miRNA and SNP data from yeast (a.k.a. eQTL data), we demonstrate that the constructed gene networks reproduce validated biological and clinical knowledge about various pathways, including the cell cycle pathway.
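As a rough illustration of the parametrization mentioned in the abstract above, the sketch below computes a modified Cholesky decomposition with NumPy. It assumes the common form in which a unit lower-triangular matrix T and a positive diagonal matrix D satisfy T Σ Tᵀ = D, so that the precision matrix is Σ⁻¹ = Tᵀ D⁻¹ T. The function name `modified_cholesky` and the 3×3 covariance matrix are hypothetical and not taken from the paper; the JMCCM likelihood estimation and variable selection steps are not reproduced.

```python
import numpy as np


def modified_cholesky(sigma):
    """Return (T, d) with T unit lower-triangular and d > 0 such that
    T @ sigma @ T.T == diag(d), hence inv(sigma) == T.T @ diag(1/d) @ T."""
    L = np.linalg.cholesky(sigma)  # sigma = L @ L.T, with L lower-triangular
    s = np.diag(L)                 # positive diagonal entries of L
    C = L / s                      # scale column j by 1/s[j]: unit lower-triangular
    T = np.linalg.inv(C)           # inverse of C, again unit lower-triangular
    d = s ** 2                     # diagonal of D ("innovation variances")
    return T, d


# Hypothetical covariance matrix for three variables (assumed positive definite).
sigma = np.array([
    [4.0, 2.0, 0.4],
    [2.0, 2.0, 0.5],
    [0.4, 0.5, 1.0],
])

T, d = modified_cholesky(sigma)
precision = T.T @ np.diag(1.0 / d) @ T  # precision matrix rebuilt from (T, d)
```

In this parametrization the below-diagonal entries of T are the negated coefficients from regressing each variable on its predecessors, so setting some of them to zero is one way to obtain a sparser model.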
|
21
|
DRUG-NEM: Optimizing drug combinations using single-cell perturbation response to account for intratumoral heterogeneity. Proc Natl Acad Sci U S A 2018; 115:E4294-E4303. [PMID: 29654148 PMCID: PMC5939057 DOI: 10.1073/pnas.1711365115] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Indexed: 12/26/2022]
Abstract
Single-cell high-throughput technologies enable the ability to identify combination cancer therapies that account for intratumoral heterogeneity, a phenomenon that has been shown to influence the effectiveness of cancer treatment. We developed and applied an approach that identifies top-ranking drug combinations based on the single-cell perturbation response when an individual tumor sample is screened against a panel of single drugs. This approach optimizes drug combinations by choosing the minimum number of drugs that produce the maximal intracellular desired effects for an individual sample. An individual malignant tumor is composed of a heterogeneous collection of single cells with distinct molecular and phenotypic features, a phenomenon termed intratumoral heterogeneity. Intratumoral heterogeneity poses challenges for cancer treatment, motivating the need for combination therapies. Single-cell technologies are now available to guide effective drug combinations by accounting for intratumoral heterogeneity through the analysis of the signaling perturbations of an individual tumor sample screened by a drug panel. In particular, Mass Cytometry Time-of-Flight (CyTOF) is a high-throughput single-cell technology that enables the simultaneous measurements of multiple (>40) intracellular and surface markers at the level of single cells for hundreds of thousands of cells in a sample. We developed a computational framework, entitled Drug Nested Effects Models (DRUG-NEM), to analyze CyTOF single-drug perturbation data for the purpose of individualizing drug combinations. DRUG-NEM optimizes drug combinations by choosing the minimum number of drugs that produce the maximal desired intracellular effects based on nested effects modeling. We demonstrate the performance of DRUG-NEM using single-cell drug perturbation data from tumor cell lines and primary leukemia samples.
|
22
|
Deng Y, Zenil H, Tegnér J, Kiani NA. HiDi: an efficient reverse engineering schema for large-scale dynamic regulatory network reconstruction using adaptive differentiation. Bioinformatics 2017; 33:3964-3972. [DOI: 10.1093/bioinformatics/btx501] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 07/04/2016] [Accepted: 08/05/2017] [Indexed: 11/14/2022]
Affiliation(s)
- Yue Deng
  - Algorithmic Dynamics Lab, Karolinska Institute, Stockholm, Sweden
  - Unit of Computational Medicine, Center for Molecular Medicine, Department of Medicine, Solna and Science for Life Laboratory (SciLifeLab), Karolinska Institute, Stockholm, Sweden
- Hector Zenil
  - Algorithmic Dynamics Lab, Karolinska Institute, Stockholm, Sweden
  - Unit of Computational Medicine, Center for Molecular Medicine, Department of Medicine, Solna and Science for Life Laboratory (SciLifeLab), Karolinska Institute, Stockholm, Sweden
- Jesper Tegnér
  - Unit of Computational Medicine, Center for Molecular Medicine, Department of Medicine, Solna and Science for Life Laboratory (SciLifeLab), Karolinska Institute, Stockholm, Sweden
  - Biological and Environmental Sciences and Engineering Division, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
- Narsis A Kiani
  - Algorithmic Dynamics Lab, Karolinska Institute, Stockholm, Sweden
  - Unit of Computational Medicine, Center for Molecular Medicine, Department of Medicine, Solna and Science for Life Laboratory (SciLifeLab), Karolinska Institute, Stockholm, Sweden
|
23
|
Szczurek E, Beerenwinkel N. Linear effects models of signaling pathways from combinatorial perturbation data. Bioinformatics 2017; 32:i297-i305. [PMID: 27307630 PMCID: PMC4908352 DOI: 10.1093/bioinformatics/btw268] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/17/2022]
Abstract
Motivation: Perturbations constitute the central means to study signaling pathways. Interrupting components of the pathway and analyzing observed effects of those interruptions can give insight into unknown connections within the signaling pathway itself, as well as the link from the pathway to the effects. Different pathway components may have different individual contributions to the measured perturbation effects, such as gene expression changes. Those effects will be observed in combination when the pathway components are perturbed. Extant approaches focus either on the reconstruction of pathway structure or on resolving how the pathway components control the downstream effects. Results: Here, we propose a linear effects model, which can be applied to solve both these problems from combinatorial perturbation data. We use simulated data to demonstrate the accuracy of learning the pathway structure as well as estimation of the individual contributions of pathway components to the perturbation effects. The practical utility of our approach is illustrated by an application to perturbations of the mitogen-activated protein kinase pathway in Saccharomyces cerevisiae. Availability and Implementation: lem is available as an R package at http://www.mimuw.edu.pl/~szczurek/lem. Contact: szczurek@mimuw.edu.pl; niko.beerenwinkel@bsse.ethz.ch. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ewa Szczurek
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
- Niko Beerenwinkel
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics
|
24
|
Pirkl M, Diekmann M, van der Wees M, Beerenwinkel N, Fröhlich H, Markowetz F. Inferring modulators of genetic interactions with epistatic nested effects models. PLoS Comput Biol 2017; 13:e1005496. [PMID: 28406896 PMCID: PMC5407847 DOI: 10.1371/journal.pcbi.1005496] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 05/06/2016] [Revised: 04/27/2017] [Accepted: 04/03/2017] [Indexed: 12/27/2022]
Abstract
Maps of genetic interactions can dissect functional redundancies in cellular networks. Gene expression profiles as high-dimensional molecular readouts of combinatorial perturbations provide a detailed view of genetic interactions, but can be hard to interpret if different gene sets respond in different ways (called mixed epistasis). Here we test the hypothesis that mixed epistasis between a gene pair can be explained by the action of a third gene that modulates the interaction. We have extended the framework of Nested Effects Models (NEMs), a type of graphical model specifically tailored to analyze high-dimensional gene perturbation data, to incorporate logical functions that describe interactions between regulators on downstream genes and proteins. We benchmark our approach in the controlled setting of a simulation study and show high accuracy in inferring the correct model. In an application to data from deletion mutants of kinases and phosphatases in S. cerevisiae we show that epistatic NEMs can point to modulators of genetic interactions. Our approach is implemented in the R-package 'epiNEM' available from https://github.com/cbg-ethz/epiNEM and https://bioconductor.org/packages/epiNEM/.
Affiliation(s)
- Martin Pirkl
  - ETH Zurich, Department of Biosystems Science and Engineering, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Madeline Diekmann
  - ETH Zurich, Department of Biosystems Science and Engineering, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Niko Beerenwinkel
  - ETH Zurich, Department of Biosystems Science and Engineering, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Holger Fröhlich
  - Bonn-Aachen International Center for IT (B-IT), University of Bonn, Bonn, Germany
  - UCB Biosciences GmbH, Monheim, Germany
- Florian Markowetz
  - University of Cambridge, Cancer Research UK Cambridge Institute, Cambridge, United Kingdom
|
25
|
Trescher S, Münchmeyer J, Leser U. Estimating genome-wide regulatory activity from multi-omics data sets using mathematical optimization. BMC Systems Biology 2017; 11:41. [PMID: 28347313 PMCID: PMC5369021 DOI: 10.1186/s12918-017-0419-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 07/16/2016] [Accepted: 03/08/2017] [Indexed: 12/28/2022]
Abstract
Background: Gene regulation is one of the most important cellular processes, indispensable for the adaptability of organisms and closely interlinked with several classes of pathogenesis and their progression. Elucidation of regulatory mechanisms can be approached by a multitude of experimental methods, yet integration of the resulting heterogeneous, large, and noisy data sets into comprehensive and tissue or disease-specific cellular models requires rigorous computational methods. Recently, several algorithms have been proposed which model genome-wide gene regulation as sets of (linear) equations over the activity and relationships of transcription factors, genes and other factors. Subsequent optimization finds those parameters that minimize the divergence of predicted and measured expression intensities. In various settings, these methods produced promising results in terms of estimating transcription factor activity and identifying key biomarkers for specific phenotypes. However, despite their common root in mathematical optimization, they vastly differ in the types of experimental data being integrated, the background knowledge necessary for their application, the granularity of their regulatory model, the concrete paradigm used for solving the optimization problem and the data sets used for evaluation. Results: Here, we review five recent methods of this class in detail and compare them with respect to several key properties. Furthermore, we quantitatively compare the results of four of the presented methods based on publicly available data sets. Conclusions: The results show that all methods seem to find biologically relevant information. However, we also observe that the mutual result overlaps are very low, which contradicts biological intuition. Our aim is to raise further awareness of the power of these methods, yet also to identify common shortcomings and necessary extensions enabling focused research on the critical points.
Electronic supplementary material: The online version of this article (doi:10.1186/s12918-017-0419-z) contains supplementary material, which is available to authorized users.
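The linear-equation formulation described in the abstract above can be sketched in a few lines. This is only a toy least-squares example under assumed inputs, not any of the five reviewed methods: the gene-by-factor `connectivity` matrix, the `true_activity` vector, and the function name `estimate_activity` are all hypothetical, and the data are noise-free so that the fit is exact.

```python
import numpy as np

# Hypothetical toy network: rows are genes, columns are transcription factors.
# connectivity[g, t] is the assumed regulatory weight of factor t on gene g.
connectivity = np.array([
    [1.0, 0.0],
    [1.0, 1.0],
    [0.0, 1.0],
    [-1.0, 1.0],
])

# Assumed "true" factor activities, used only to simulate measurements.
true_activity = np.array([2.0, -1.0])
expression = connectivity @ true_activity  # simulated, noise-free expression


def estimate_activity(connectivity, expression):
    """Least-squares activities minimizing ||connectivity @ a - expression||^2."""
    activity, *_ = np.linalg.lstsq(connectivity, expression, rcond=None)
    return activity


estimated = estimate_activity(connectivity, expression)
```

With real data the measurements are noisy and the connectivity is uncertain, which is presumably one reason the reviewed methods differ in their regulatory models and optimization paradigms.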
Affiliation(s)
- Saskia Trescher
  - Knowledge Management in Bioinformatics, Computer Science Department, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099, Berlin, Germany
- Jannes Münchmeyer
  - Knowledge Management in Bioinformatics, Computer Science Department, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099, Berlin, Germany
- Ulf Leser
  - Knowledge Management in Bioinformatics, Computer Science Department, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099, Berlin, Germany
|
26
|
Expectation propagation for large scale Bayesian inference of non-linear molecular networks from perturbation data. PLoS One 2017; 12:e0171240. [PMID: 28166542 PMCID: PMC5293552 DOI: 10.1371/journal.pone.0171240] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 11/21/2016] [Accepted: 01/17/2017] [Indexed: 11/19/2022]
Abstract
Inferring the structure of molecular networks from time series protein or gene expression data provides valuable information about the complex biological processes of the cell. Causal network structure inference has been approached using different methods in the past. Most causal network inference techniques, such as Dynamic Bayesian Networks and ordinary differential equations, are limited by their computational complexity and thus make large scale inference infeasible. This is specifically true if a Bayesian framework is applied in order to deal with the unavoidable uncertainty about the correct model. We devise a novel Bayesian network reverse engineering approach using ordinary differential equations with the ability to include non-linearity. Besides modeling arbitrary, possibly combinatorial and time dependent perturbations with unknown targets, one of our main contributions is the use of Expectation Propagation, an algorithm for approximate Bayesian inference over large scale network structures in short computation time. We further explore the possibility of integrating prior knowledge into network inference. We evaluate the proposed model on DREAM4 and DREAM8 data and find it competitive against several state-of-the-art existing network inference methods.
|
27
|
Lu S, Cai C, Yan G, Zhou Z, Wan Y, Chen V, Chen L, Cooper GF, Obeid LM, Hannun YA, Lee AV, Lu X. Signal-Oriented Pathway Analyses Reveal a Signaling Complex as a Synthetic Lethal Target for p53 Mutations. Cancer Res 2016; 76:6785-6794. [PMID: 27758891 DOI: 10.1158/0008-5472.can-16-1740] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/27/2016] [Revised: 08/31/2016] [Accepted: 09/18/2016] [Indexed: 11/16/2022]
Abstract
Defining processes that are synthetic lethal with p53 mutations in cancer cells may reveal possible therapeutic strategies. In this study, we report the development of a signal-oriented computational framework for cancer pathway discovery in this context. We applied our bipartite graph-based functional module discovery algorithm to identify transcriptomic modules abnormally expressed in multiple tumors, such that the genes in a module were likely regulated by a common, perturbed signal. For each transcriptomic module, we applied our weighted k-path merge algorithm to search for a set of somatic genome alterations (SGA) that likely perturbed the signal, that is, the candidate members of the pathway that regulate the transcriptomic module. Computational evaluations indicated that our methods-identified pathways were perturbed by SGA. In particular, our analyses revealed that SGA affecting TP53, PTK2, YWHAZ, and MED1 perturbed a set of signals that promote cell proliferation, anchor-free colony formation, and epithelial-mesenchymal transition (EMT). These proteins formed a signaling complex that mediates these oncogenic processes in a coordinated fashion. Disruption of this signaling complex by knocking down PTK2, YWHAZ, or MED1 attenuated and reversed oncogenic phenotypes caused by mutant p53 in a synthetic lethal manner. This signal-oriented framework for searching pathways and therapeutic targets is applicable to all cancer types, thus potentially impacting precision medicine in cancer. Cancer Res; 76(23); 6785-94. ©2016 AACR.
Affiliation(s)
- Songjian Lu
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania
  - Center for Causal Discovery, University of Pittsburgh, Pittsburgh, Pennsylvania
- Chunhui Cai
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania
  - Center for Causal Discovery, University of Pittsburgh, Pittsburgh, Pennsylvania
- Gonghong Yan
  - University of Pittsburgh Cancer Institute, Pittsburgh, Pennsylvania
  - Department of Pharmacology and Chemical Biology, University of Pittsburgh, Pittsburgh, Pennsylvania
  - Magee-Womens Research Institute, Pittsburgh, Pennsylvania
- Zhuan Zhou
  - University of Pittsburgh Cancer Institute, Pittsburgh, Pennsylvania
  - Department of Cell Biology, University of Pittsburgh, Pittsburgh, Pennsylvania
- Yong Wan
  - University of Pittsburgh Cancer Institute, Pittsburgh, Pennsylvania
  - Department of Cell Biology, University of Pittsburgh, Pittsburgh, Pennsylvania
- Vicky Chen
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania
  - Center for Causal Discovery, University of Pittsburgh, Pittsburgh, Pennsylvania
- Lujia Chen
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania
  - Center for Causal Discovery, University of Pittsburgh, Pittsburgh, Pennsylvania
- Gregory F Cooper
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania
  - Center for Causal Discovery, University of Pittsburgh, Pittsburgh, Pennsylvania
- Lina M Obeid
  - Department of Medicine, the State University of New York at Stony Brook, Stony Brook, New York
- Yusuf A Hannun
  - Department of Medicine, the State University of New York at Stony Brook, Stony Brook, New York
- Adrian V Lee
  - Center for Causal Discovery, University of Pittsburgh, Pittsburgh, Pennsylvania
  - University of Pittsburgh Cancer Institute, Pittsburgh, Pennsylvania
  - Department of Pharmacology and Chemical Biology, University of Pittsburgh, Pittsburgh, Pennsylvania
  - Magee-Womens Research Institute, Pittsburgh, Pennsylvania
- Xinghua Lu
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania
  - Center for Causal Discovery, University of Pittsburgh, Pittsburgh, Pennsylvania
|
28
|
Deng Y, Altschuler SJ, Wu LF. PHOCOS: inferring multi-feature phenotypic crosstalk networks. Bioinformatics 2016; 32:i44-i51. [PMID: 27307643 PMCID: PMC4908335 DOI: 10.1093/bioinformatics/btw251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
Abstract
Motivation: Quantification of cellular changes to perturbations can provide a powerful approach to infer crosstalk among molecular components in biological networks. Existing crosstalk inference methods conduct network-structure learning based on a single phenotypic feature (e.g. abundance) of a biomarker. These approaches are insufficient for analyzing perturbation data that can contain information about multiple features (e.g. abundance, activity or localization) of each biomarker. Results: We propose a computational framework for inferring phenotypic crosstalk (PHOCOS) that is suitable for high-content microscopy or other modalities that capture multiple phenotypes per biomarker. PHOCOS uses a robust graph-learning paradigm to predict direct effects from potential indirect effects and identify errors owing to noise or missing links. The result is a multi-feature, sparse network that parsimoniously captures direct and strong interactions across phenotypic attributes of multiple biomarkers. We use simulated and biological data to demonstrate the ability of PHOCOS to recover multi-attribute crosstalk networks from cellular perturbation assays. Availability and implementation: PHOCOS is available in open source at https://github.com/AltschulerWu-Lab/PHOCOS. Contact: steven.altschuler@ucsf.edu or lani.wu@ucsf.edu
Affiliation(s)
- Yue Deng
  - Department of Pharmaceutical Chemistry, University of California at San Francisco, San Francisco, CA 94158, USA
- Steven J Altschuler
  - Department of Pharmaceutical Chemistry, University of California at San Francisco, San Francisco, CA 94158, USA
- Lani F Wu
  - Department of Pharmaceutical Chemistry, University of California at San Francisco, San Francisco, CA 94158, USA
|
29
|
Moffa G, Erdmann G, Voloshanenko O, Hundsrucker C, Sadeh MJ, Boutros M, Spang R. Refining Pathways: A Model Comparison Approach. PLoS One 2016; 11:e0155999. [PMID: 27248690 PMCID: PMC4889067 DOI: 10.1371/journal.pone.0155999] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 08/20/2015] [Accepted: 05/06/2016] [Indexed: 11/30/2022]
Abstract
Cellular signalling pathways consolidate multiple molecular interactions into working models of signal propagation, amplification, and modulation. They are described and visualized as networks. Adjusting network topologies to experimental data is a key goal of systems biology. While network reconstruction algorithms like nested effects models are well-established tools of computational biology, their data requirements can be prohibitive for their practical use. In this paper we suggest focussing on well-defined aspects of a pathway and develop the computational tools to do so. We adapt the framework of nested effects models to focus on a specific aspect of activated Wnt signalling in HCT116 colon cancer cells: Does the activation of Wnt target genes depend on the secretion of Wnt ligands, or do mutations in the signalling molecule β-catenin make this activation independent of them? We framed this question as two competing classes of models: models that depend on Wnt ligand secretion versus those that do not. The model classes translate into restrictions of the pathways in the network topology. Wnt-dependent models are more flexible than Wnt-independent models. Bayes factors are the standard Bayesian tool to compare different models fairly on the data evidence. In our analysis, the Bayes factors depend on the number of potential Wnt signalling target genes included in the models. Stability analysis with respect to this number showed that the data strongly favour Wnt ligand-dependent models for all realistic numbers of target genes.
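To make the role of Bayes factors in the abstract above concrete, here is a minimal toy sketch that compares a restricted model with a more flexible one. It does not use nested effects models or the paper's data: the two coin-toss models, the counts, and the function names are assumptions chosen only because their marginal likelihoods have simple closed forms.

```python
from math import comb


def marginal_likelihood_fixed(k, n, p=0.5):
    """Restricted model: success probability fixed at p (no free parameter)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def marginal_likelihood_uniform(k, n):
    """Flexible model: success probability has a Uniform(0, 1) prior.

    Integrating the binomial likelihood over that prior gives 1 / (n + 1).
    """
    return 1.0 / (n + 1)


def bayes_factor(k, n):
    """Bayes factor of the flexible model over the restricted model."""
    return marginal_likelihood_uniform(k, n) / marginal_likelihood_fixed(k, n)


# Hypothetical data: k successes out of n = 10 trials.
bf_balanced = bayes_factor(5, 10)  # data compatible with p = 0.5
bf_skewed = bayes_factor(9, 10)    # data far from p = 0.5
```

A value above 1 favours the flexible model and a value below 1 favours the restricted one. Because the marginal likelihood averages over the prior, the extra flexibility is penalized automatically when the data do not need it, which is the sense in which Bayes factors compare models of different flexibility fairly.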
Affiliation(s)
- Giusi Moffa
  - Department of Statistical Bioinformatics, Institute of Functional Genomics, University of Regensburg, Regensburg, Germany
- Gerrit Erdmann
  - Division of Signaling and Functional Genomics, German Cancer Research Center (DKFZ) and Department of Cell and Molecular Biology, Faculty of Medicine Mannheim, Heidelberg University, Heidelberg, Germany
- Oksana Voloshanenko
  - Division of Signaling and Functional Genomics, German Cancer Research Center (DKFZ) and Department of Cell and Molecular Biology, Faculty of Medicine Mannheim, Heidelberg University, Heidelberg, Germany
- Christian Hundsrucker
  - Department of Statistical Bioinformatics, Institute of Functional Genomics, University of Regensburg, Regensburg, Germany
- Mohammad J. Sadeh
  - Department of Statistical Bioinformatics, Institute of Functional Genomics, University of Regensburg, Regensburg, Germany
- Michael Boutros
  - Division of Signaling and Functional Genomics, German Cancer Research Center (DKFZ) and Department of Cell and Molecular Biology, Faculty of Medicine Mannheim, Heidelberg University, Heidelberg, Germany
- Rainer Spang
  - Department of Statistical Bioinformatics, Institute of Functional Genomics, University of Regensburg, Regensburg, Germany
|
30
|
Jahn K, Kuipers J, Beerenwinkel N. Tree inference for single-cell data. Genome Biol 2016; 17:86. [PMID: 27149953 PMCID: PMC4858868 DOI: 10.1186/s13059-016-0936-x] [Citation(s) in RCA: 181] [Impact Index Per Article: 20.1] [Received: 11/11/2015] [Accepted: 04/08/2016] [Indexed: 02/02/2023]
Abstract
Understanding the mutational heterogeneity within tumors is a keystone for the development of efficient cancer therapies. Here, we present SCITE, a stochastic search algorithm to identify the evolutionary history of a tumor from noisy and incomplete mutation profiles of single cells. SCITE comprises a flexible Markov chain Monte Carlo sampling scheme that allows the user to compute the maximum-likelihood mutation history, to sample from the posterior probability distribution, and to estimate the error rates of the underlying sequencing experiments. Evaluation on real cancer data and on simulation studies shows the scalability of SCITE to present-day single-cell sequencing data and improved reconstruction accuracy compared to existing approaches.
Affiliation(s)
- Katharina Jahn: Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB, Swiss Institute of Bioinformatics, Basel, Switzerland
- Jack Kuipers: Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB, Swiss Institute of Bioinformatics, Basel, Switzerland
- Niko Beerenwinkel: Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB, Swiss Institute of Bioinformatics, Basel, Switzerland
|
31
|
Chasman D, Fotuhi Siahpirani A, Roy S. Network-based approaches for analysis of complex biological systems. Curr Opin Biotechnol 2016; 39:157-166. [PMID: 27115495 DOI: 10.1016/j.copbio.2016.04.007] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 10/24/2015] [Revised: 04/04/2016] [Accepted: 04/05/2016] [Indexed: 12/22/2022]
Abstract
Cells function and respond to changes in their environment by the coordinated activity of their molecular components, including mRNAs, proteins and metabolites. At the heart of proper cellular function are molecular networks connecting these components to process extra-cellular environmental signals and drive dynamic, context-specific cellular responses. Network-based computational approaches aim to systematically integrate measurements from high-throughput experiments to gain a global understanding of cellular function under changing environmental conditions. We provide an overview of recent methodological developments toward solving two major computational problems within this field in the past two years (2013-2015): network reconstruction and network-based interpretation. Looking forward, we envision development of methods that can predict phenotypes with high accuracy as well as provide biologically plausible mechanistic hypotheses.
Affiliation(s)
- Deborah Chasman: Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, United States
- Alireza Fotuhi Siahpirani: Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, United States; Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, United States; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, United States
- Sushmita Roy: Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, United States; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, United States; Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, United States
|
32
|
Ross EM, Markowetz F. OncoNEM: inferring tumor evolution from single-cell sequencing data. Genome Biol 2016; 17:69. [PMID: 27083415 PMCID: PMC4832472 DOI: 10.1186/s13059-016-0929-9] [Citation(s) in RCA: 149] [Impact Index Per Article: 16.6] [Received: 03/29/2016] [Accepted: 03/30/2016] [Indexed: 11/17/2022]
Abstract
Single-cell sequencing promises a high-resolution view of genetic heterogeneity and clonal evolution in cancer. However, methods to infer tumor evolution from single-cell sequencing data lag behind methods developed for bulk-sequencing data. Here, we present OncoNEM, a probabilistic method for inferring intra-tumor evolutionary lineage trees from somatic single nucleotide variants of single cells. OncoNEM identifies homogeneous cellular subpopulations and infers their genotypes as well as a tree describing their evolutionary relationships. In simulation studies, we assess OncoNEM's robustness and benchmark its performance against competing methods. Finally, we show its applicability in case studies of muscle-invasive bladder cancer and essential thrombocythemia.
Affiliation(s)
- Edith M Ross: Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, UK
- Florian Markowetz: Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, UK
|
33
|
Pirkl M, Hand E, Kube D, Spang R. Analyzing synergistic and non-synergistic interactions in signalling pathways using Boolean Nested Effect Models. Bioinformatics 2016; 32:893-900. [PMID: 26581413 PMCID: PMC5939970 DOI: 10.1093/bioinformatics/btv680] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 08/07/2015] [Revised: 10/19/2015] [Accepted: 11/11/2015] [Indexed: 11/21/2022]
Abstract
MOTIVATION Understanding the structure and interplay of cellular signalling pathways is one of the great challenges in molecular biology. Boolean Networks can infer signalling networks from observations of protein activation. In situations where it is difficult to assess protein activation directly, Nested Effect Models are an alternative. They derive the network structure indirectly from downstream effects of pathway perturbations. To date, Nested Effect Models cannot resolve signalling details like the formation of signalling complexes or the activation of proteins by multiple alternative input signals. Here we introduce Boolean Nested Effect Models (B-NEM). B-NEMs combine the use of downstream effects with the higher resolution of signalling pathway structures in Boolean Networks. RESULTS We show that B-NEMs accurately reconstruct signal flows in simulated data. Using B-NEM we then resolve BCR signalling via PI3K and TAK1 kinases in BL2 lymphoma cell lines. AVAILABILITY AND IMPLEMENTATION R code is available at https://github.com/MartinFXP/B-NEM (github). The BCR signalling dataset is available at the GEO database (http://www.ncbi.nlm.nih.gov/geo/) through accession number GSE68761. CONTACT martin-franz-xaver.pirkl@ukr.de, Rainer.Spang@ukr.de SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Martin Pirkl: Statistical Bioinformatics Department, Institute of Functional Genomics, University of Regensburg, 93053 Regensburg
- Elisabeth Hand: Department of Haematology and Oncology, University Medical Centre of the Georg-August University of Göttingen, 37073 Göttingen
- Dieter Kube: Department of Haematology and Oncology, University Medical Centre of the Georg-August University of Göttingen, 37073 Göttingen
- Rainer Spang: Statistical Bioinformatics Department, Institute of Functional Genomics, University of Regensburg, 93053 Regensburg
|
34
|
MacNeil LT, Pons C, Arda HE, Giese GE, Myers CL, Walhout AJM. Transcription Factor Activity Mapping of a Tissue-Specific in vivo Gene Regulatory Network. Cell Syst 2015; 1:152-162. [PMID: 26430702 DOI: 10.1016/j.cels.2015.08.003] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Indexed: 12/18/2022]
Abstract
A wealth of physical interaction data between transcription factors (TFs) and DNA has been generated, but these interactions often do not have apparent regulatory consequences. Thus, equating physical interaction data with gene regulatory networks (GRNs) is problematic. Here, we comprehensively assay TF activity, rather than binding, to construct a network of gene regulatory interactions in the C. elegans intestine. By manually observing the in vivo tissue-specific knockdown of 921 TFs on a panel of 19 fluorescent transcriptional reporters, we identified a GRN of 411 interactions between 19 promoters and 177 TFs. This GRN shows only modest overlap with physical interactions, indicating that many regulatory interactions are indirect. We applied nested effects modeling to uncover information flow between TFs in the intestine that converges on a small set of physical TF-promoter interactions. We found numerous cell nonautonomous regulatory interactions, illustrating tissue-to-tissue communication. Altogether, our study illuminates the complexity of gene regulation in the context of a living animal.
Affiliation(s)
- Lesley T MacNeil: Program in Systems Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA; Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Carles Pons: Department of Computer Science and Engineering, University of Minnesota-Twin Cities, Minneapolis, MN 55455, USA
- H Efsun Arda: Program in Systems Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA; Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Gabrielle E Giese: Program in Systems Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Chad L Myers: Department of Computer Science and Engineering, University of Minnesota-Twin Cities, Minneapolis, MN 55455, USA
- Albertha J M Walhout: Program in Systems Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA; Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, MA 01605, USA
|
35
|
Budak G, Eren Ozsoy O, Aydin Son Y, Can T, Tuncbag N. Reconstruction of the temporal signaling network in Salmonella-infected human cells. Front Microbiol 2015; 6:730. [PMID: 26257716 PMCID: PMC4507143 DOI: 10.3389/fmicb.2015.00730] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 01/30/2015] [Accepted: 07/03/2015] [Indexed: 12/02/2022]
Abstract
Salmonella enterica is a bacterial pathogen that usually infects its host through food sources. Translocation of the pathogen proteins into the host cells leads to changes in the signaling mechanism by either activating or inhibiting the host proteins. Given that the bacterial infection modifies the response network of the host, a more coherent view of the underlying biological processes and the signaling networks can be obtained by using a network modeling approach based on reverse engineering principles. In this work, we have used a published temporal phosphoproteomic dataset of Salmonella-infected human cells and reconstructed the temporal signaling network of the human host by integrating the interactome and the phosphoproteomic dataset. We have combined two well-established network modeling frameworks, the Prize-collecting Steiner Forest (PCSF) approach and the Integer Linear Programming (ILP) based edge inference approach. The resulting network conserves the information on temporality and direction of interactions, while revealing hidden entities in the signaling, such as the SNARE binding, mTOR signaling, immune response, cytoskeleton organization, and apoptosis pathways. Targets of the Salmonella effectors in the host cells such as CDC42, RHOA, 14-3-3δ, Syntaxin family, Oxysterol-binding proteins were included in the reconstructed signaling network although they were not present in the initial phosphoproteomic data. We believe that integrated approaches, such as the one presented here, have a high potential for the identification of clinical targets in infectious diseases, especially in Salmonella infections.
Affiliation(s)
- Gungor Budak: Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
- Oyku Eren Ozsoy: Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
- Yesim Aydin Son: Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
- Tolga Can: Department of Computer Engineering, College of Engineering, Middle East Technical University, Ankara, Turkey
- Nurcan Tuncbag: Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
|
36
|
Fröhlich H. biRte: Bayesian inference of context-specific regulator activities and transcriptional networks. Bioinformatics 2015; 31:3290-8. [PMID: 26112290 DOI: 10.1093/bioinformatics/btv379] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 02/03/2015] [Accepted: 06/15/2015] [Indexed: 11/14/2022]
Abstract
In recent years there has been an increasing effort to computationally model and predict the influence of regulators (transcription factors, miRNAs) on gene expression. Here we introduce biRte as a computationally attractive approach combining Bayesian inference of regulator activities with network reverse engineering. biRte integrates target gene predictions with different omics data entities (e.g. miRNA and mRNA data) into a joint probabilistic framework. The utility of our method is tested in extensive simulation studies and demonstrated with applications from prostate cancer and Escherichia coli growth control. The resulting regulatory networks generally show a good agreement with the biological literature. AVAILABILITY AND IMPLEMENTATION biRte is available on Bioconductor (http://bioconductor.org). CONTACT frohlich@bit.uni-bonn.de SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Holger Fröhlich: University of Bonn, Institute for Computer Science, Römerstr. 164, 53117 Bonn, Germany
|
37
|
Rodriguez A, Crespo I, Fournier A, del Sol A. Discrete Logic Modelling Optimization to Contextualize Prior Knowledge Networks Using PRUNET. PLoS One 2015; 10:e0127216. [PMID: 26058016 PMCID: PMC4461287 DOI: 10.1371/journal.pone.0127216] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 09/05/2014] [Accepted: 04/13/2015] [Indexed: 01/09/2023]
Abstract
High-throughput technologies have led to the generation of an increasing amount of data in different areas of biology. Datasets capturing the cell's response to its intra- and extra-cellular microenvironment allow such data to be incorporated as signed and directed graphs or influence networks. These prior knowledge networks (PKNs) represent our current knowledge of the causality of cellular signal transduction. New signalling data is often examined and interpreted in conjunction with PKNs. However, different biological contexts, such as cell type or disease states, may have distinct variants of signalling pathways, resulting in the misinterpretation of new data. The identification of inconsistencies between measured data and signalling topologies, as well as the training of PKNs using context-specific datasets (PKN contextualization), are necessary conditions to construct reliable, predictive models, which are current challenges in the systems biology of cell signalling. Here we present PRUNET, a user-friendly software tool designed to address the contextualization of a PKN to specific experimental conditions. As the input, the algorithm takes a PKN and the expression profile of two given stable steady states or cellular phenotypes. The PKN is iteratively pruned using an evolutionary algorithm to perform an optimization process. This optimization rests on a match between predicted attractors in a discrete logic model (Boolean) and a Booleanized representation of the phenotypes, within a population of alternative subnetworks that evolves iteratively. We validated the algorithm by applying PRUNET to four biological examples and using the resulting contextualized networks to predict missing expression values and to simulate well-characterized perturbations. PRUNET constitutes a tool for the automatic curation of a PKN to make it suitable for describing biological processes under particular experimental conditions. The general applicability of the implemented algorithm makes PRUNET suitable for a variety of biological processes, for instance cellular reprogramming or transitions between healthy and disease states.
Affiliation(s)
- Ana Rodriguez: Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg
- Isaac Crespo: Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg
- Anna Fournier: Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg
- Antonio del Sol: Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg
|
38
|
Matos MRA, Knapp B, Kaderali L. lpNet: a linear programming approach to reconstruct signal transduction networks. Bioinformatics 2015; 31:3231-3. [PMID: 26026168 DOI: 10.1093/bioinformatics/btv327] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/17/2014] [Accepted: 05/19/2015] [Indexed: 11/14/2022]
Abstract
With the widespread availability of high-throughput experimental technologies it has become possible to study hundreds to thousands of cellular factors simultaneously, such as coding- or non-coding mRNA or protein concentrations. Still, extracting information about the underlying regulatory or signaling interactions from these data remains a difficult challenge. We present a flexible approach towards network inference based on linear programming. Our method reconstructs the interactions of factors from a combination of perturbation/non-perturbation and steady-state/time-series data. We show both on simulated and real data that our methods are able to reconstruct the underlying networks fast and efficiently, thus shedding new light on biological processes and, in particular, on disease mechanisms of action. We have implemented the approach as an R package available through Bioconductor. AVAILABILITY AND IMPLEMENTATION This R package is freely available under the GNU General Public License (GPL-3) from bioconductor.org (http://bioconductor.org/packages/release/bioc/html/lpNet.html) and is compatible with most operating systems (Windows, Linux, Mac OS) and hardware architectures. CONTACT bettina.knapp@helmholtz-muenchen.de SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Marta R A Matos: Institute for Medical Informatics and Biometry, Medical Faculty Carl Gustav Carus, Technische Universität Dresden, 01307 Dresden, Germany
- Bettina Knapp: Institute for Medical Informatics and Biometry, Medical Faculty Carl Gustav Carus, Technische Universität Dresden, 01307 Dresden, Germany; Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany
- Lars Kaderali: Institute for Medical Informatics and Biometry, Medical Faculty Carl Gustav Carus, Technische Universität Dresden, 01307 Dresden, Germany
|
39
|
Abstract
Large-scale genetic perturbation screens are a classical approach in biology and have been crucial for many discoveries. New technologies can now provide unbiased quantification of multiple molecular and phenotypic changes across tens of thousands of individual cells from large numbers of perturbed cell populations simultaneously. In this Review, we describe how these developments have enabled the discovery of new principles of intracellular and intercellular organization, novel interpretations of genetic perturbation effects and the inference of novel functional genetic interactions. These advances now allow more accurate and comprehensive analyses of gene function in cells using genetic perturbation screens.
|
40
|
Kiani NA, Kaderali L. Dynamic probabilistic threshold networks to infer signaling pathways from time-course perturbation data. BMC Bioinformatics 2014; 15:250. [PMID: 25047753 PMCID: PMC4133630 DOI: 10.1186/1471-2105-15-250] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 07/09/2013] [Accepted: 07/15/2014] [Indexed: 11/10/2022]
Abstract
BACKGROUND Network inference deals with the reconstruction of molecular networks from experimental data. Given N molecular species, the challenge is to find the underlying network. Due to data limitations, this typically is an ill-posed problem, and requires the integration of prior biological knowledge or strong regularization. We here focus on the situation when time-resolved measurements of a system's response after systematic perturbations are available. RESULTS We present a novel method to infer signaling networks from time-course perturbation data. We utilize dynamic Bayesian networks with probabilistic Boolean threshold functions to describe protein activation. The model posterior distribution is analyzed using evolutionary MCMC sampling and subsequent clustering, resulting in probability distributions over alternative networks. We evaluate our method on simulated data, and study its performance with respect to data set size and levels of noise. We then use our method to study EGF-mediated signaling in the ERBB pathway. CONCLUSIONS Dynamic Probabilistic Threshold Networks is a new method to infer signaling networks from time-series perturbation data. It exploits the dynamic response of a system after external perturbation for network reconstruction. On simulated data, we show that the approach outperforms current state of the art methods. On the ERBB data, our approach recovers a significant fraction of the known interactions, and predicts novel mechanisms in the ERBB pathway.
Affiliation(s)
- Narsis A Kiani: Technische Universität Dresden, Medical Faculty Carl Gustav Carus, Institute for Medical Informatics and Biometry, Fetscherstr. 74, 01307 Dresden, Germany
|
41
|
Martin F, Sewer A, Talikka M, Xiang Y, Hoeng J, Peitsch MC. Quantification of biological network perturbations for mechanistic insight and diagnostics using two-layer causal models. BMC Bioinformatics 2014; 15:238. [PMID: 25015298 PMCID: PMC4227138 DOI: 10.1186/1471-2105-15-238] [Citation(s) in RCA: 92] [Impact Index Per Article: 8.4] [Received: 04/04/2014] [Accepted: 06/26/2014] [Indexed: 12/30/2022]
Abstract
BACKGROUND High-throughput measurement technologies such as microarrays provide complex datasets reflecting mechanisms perturbed in an experiment, typically a treatment vs. control design. Analysis of these information-rich data can be guided by a priori knowledge, such as networks or sets of related proteins or genes. Among those, cause-and-effect network models are becoming increasingly popular and more than eighty such models, describing processes involved in cell proliferation, cell fate, cell stress, and inflammation have already been published. A meaningful systems toxicology approach to study the response of a cell system, or organism, exposed to bio-active substances requires a quantitative measure of dose-response at network level, to go beyond the differential expression of single genes. RESULTS We developed a method that quantifies network response in an interpretable manner. It fully exploits the (signed graph) structure of cause-and-effect network models to integrate and mine transcriptomics measurements. The presented approach also enables the extraction of network-based signatures for predicting a phenotype of interest. The obtained signatures are coherent with the underlying network perturbation and can lead to more robust predictions across independent studies. The value of the various components of our mathematically coherent approach is substantiated using several in vivo and in vitro transcriptomics datasets. As a proof-of-principle, our methodology was applied to unravel mechanisms related to the efficacy of a specific anti-inflammatory drug in patients suffering from ulcerative colitis. A plausible mechanistic explanation of the unequal efficacy of the drug is provided. Moreover, by utilizing the underlying mechanisms, an accurate and robust network-based diagnosis was built to predict the response to the treatment. CONCLUSION The presented framework efficiently integrates transcriptomics data and "cause and effect" network models to enable a mathematically coherent framework from quantitative impact assessment and data interpretation to patient stratification for diagnosis purposes.
Affiliation(s)
- Florian Martin: Philip Morris International, R&D, Biological Systems Research, Quai Jeanrenaud 5, 2000 Neuchatel, Switzerland
|
42
|
Sadeh MJ, Moffa G, Spang R. Considering unknown unknowns: reconstruction of nonconfoundable causal relations in biological networks. J Comput Biol 2014; 20:920-32. [PMID: 24195708 DOI: 10.1089/cmb.2013.0119] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022]
Abstract
Our current understanding of cellular networks is rather incomplete. We overlook important but so far unknown genes and mechanisms in the pathways. Moreover, we often only have a partial account of the molecular interactions and modifications of the known players. When analyzing the cell, we look through narrow windows leaving potentially important events in blind spots. Network reconstruction is naturally confined to what we have observed. Little is known on how the incompleteness of our observations confounds our interpretation of the available data. Here we ask which features of a network can be confounded by incomplete observations and which cannot. In the context of nested effects models, we show that in the presence of missing observations or hidden factors a reliable reconstruction of the full network is not feasible. Nevertheless, we can show that certain characteristics of signaling networks like the existence of cross-talk between certain branches of the network can be inferred in a nonconfoundable way. We derive a test for inferring such nonconfoundable characteristics of signaling networks. Next, we introduce a new data structure to represent partially reconstructed signaling networks. Finally, we evaluate our method both on simulated data and in the context of a study on early stem cell differentiation in mice.
Affiliation(s)
- Mohammad J Sadeh: Institute of Functional Genomics, Computational Diagnostics Group, University of Regensburg, Regensburg, Germany
|
43
|
Wang YXR, Huang H. Review on statistical methods for gene network reconstruction using expression data. J Theor Biol 2014; 362:53-61. [PMID: 24726980 DOI: 10.1016/j.jtbi.2014.03.040] [Citation(s) in RCA: 97] [Impact Index Per Article: 8.8] [Received: 01/26/2014] [Revised: 03/29/2014] [Accepted: 03/31/2014] [Indexed: 12/16/2022]
Abstract
Network modeling has proven to be a fundamental tool in analyzing the inner workings of a cell. It has revolutionized our understanding of biological processes and made significant contributions to the discovery of disease biomarkers. Much effort has been devoted to reconstructing various types of biochemical networks using functional genomic datasets generated by high-throughput technologies. This paper discusses statistical methods used to reconstruct gene regulatory networks using gene expression data. In particular, we highlight progress made and challenges yet to be met in the problems involved in estimating gene interactions, inferring causality and modeling temporal changes of regulation behaviors. As rapid advances in technologies have made available diverse, large-scale genomic data, we also survey methods of incorporating all these additional data to achieve better, more accurate inference of gene networks.
Affiliation(s)
- Y X Rachel Wang: Department of Statistics, University of California, Berkeley, CA 94720, USA
- Haiyan Huang: Department of Statistics, University of California, Berkeley, CA 94720, USA
|
44
|
Wang X, Yuan K, Hellmayr C, Liu W, Markowetz F. Reconstructing evolving signalling networks by hidden Markov nested effects models. Ann Appl Stat 2014. [DOI: 10.1214/13-aoas696] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 11/19/2022]
|
45
|
Inferring regulatory networks by combining perturbation screens and steady state gene expression profiles. PLoS One 2014; 9:e82393. [PMID: 24586224 PMCID: PMC3938831 DOI: 10.1371/journal.pone.0082393] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 05/26/2012] [Accepted: 11/01/2013] [Indexed: 11/19/2022]
Abstract
Reconstructing transcriptional regulatory networks is an important task in functional genomics. Data obtained from experiments that perturb genes by knockouts or RNA interference contain useful information for addressing this reconstruction problem. However, such data can be limited in size and/or are expensive to acquire. On the other hand, observational data of the organism in steady state (e.g., wild-type) are more readily available, but their informational content is inadequate for the task at hand. We develop a computational approach to appropriately utilize both data sources for estimating a regulatory network. The proposed approach is based on a three-step algorithm to estimate the underlying directed but cyclic network, that uses as input both perturbation screens and steady state gene expression data. In the first step, the algorithm determines causal orderings of the genes that are consistent with the perturbation data, by combining an exhaustive search method with a fast heuristic that in turn couples a Monte Carlo technique with a fast search algorithm. In the second step, for each obtained causal ordering, a regulatory network is estimated using a penalized likelihood based method, while in the third step a consensus network is constructed from the highest scored ones. Extensive computational experiments show that the algorithm performs well in reconstructing the underlying network and clearly outperforms competing approaches that rely only on a single data source. Further, it is established that the algorithm produces a consistent estimate of the regulatory network.
|
46
|
Sailem H, Bousgouni V, Cooper S, Bakal C. Cross-talk between Rho and Rac GTPases drives deterministic exploration of cellular shape space and morphological heterogeneity. Open Biol 2014; 4:130132. [PMID: 24451547 PMCID: PMC3909273 DOI: 10.1098/rsob.130132] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Indexed: 12/11/2022] Open
Abstract
One goal of cell biology is to understand how cells adopt different shapes in response to varying environmental and cellular conditions. Achieving a comprehensive understanding of the relationship between cell shape and environment requires a systems-level understanding of the signalling networks that respond to external cues and regulate the cytoskeleton. Classical biochemical and genetic approaches have identified thousands of individual components that contribute to cell shape, but it remains difficult to predict how cell shape is generated by the activity of these components using bottom-up approaches because of the complex nature of their interactions in space and time. Here, we describe the regulation of cellular shape by signalling systems using a top-down approach. We first exploit the shape diversity generated by systematic RNAi screening and comprehensively define the shape space a migratory cell explores. We suggest a simple Boolean model involving the activation of Rac and Rho GTPases in two compartments to explain the basis for all cell shapes in the dataset. Critically, we also generate a probabilistic graphical model to show how cells explore this space in a deterministic, rather than a stochastic, fashion. We validate the predictions made by our model using live-cell imaging. Our work explains how cross-talk between Rho and Rac can generate different cell shapes, and thus morphological heterogeneity, in genetically identical populations.
Affiliation(s)
- Heba Sailem
- Chester Beatty Laboratories, Division of Cancer Biology, Institute of Cancer Research, 237 Fulham Road, London SW3 6JB, UK
|
47
|
Snijder B, Liberali P, Frechin M, Stoeger T, Pelkmans L. Predicting functional gene interactions with the hierarchical interaction score. Nat Methods 2013; 10:1089-92. [PMID: 24097268 DOI: 10.1038/nmeth.2655] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 02/06/2013] [Accepted: 08/06/2013] [Indexed: 12/18/2022]
Abstract
Systems biology aims to unravel the vast network of functional interactions that govern biological systems. To date, the inference of gene interactions from large-scale 'omics data is typically achieved using correlations. We present the hierarchical interaction score (HIS) and show that the HIS outperforms commonly used methods in the inference of functional interactions between genes measured in large-scale experiments, making it a valuable statistic for systems biology.
Affiliation(s)
- Berend Snijder
- [1] Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland. [2]
|
48
|
Knapp B, Kaderali L. Reconstruction of cellular signal transduction networks using perturbation assays and linear programming. PLoS One 2013; 8:e69220. [PMID: 23935958 PMCID: PMC3728289 DOI: 10.1371/journal.pone.0069220] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 12/17/2012] [Accepted: 06/06/2013] [Indexed: 12/23/2022] Open
Abstract
Perturbation experiments, for example using RNA interference (RNAi), offer an attractive way to elucidate gene function in a high-throughput fashion. The placement of hit genes in their functional context and the inference of underlying networks from such data, however, are challenging tasks. One of the problems in network inference is the exponential number of possible network topologies for a given number of genes. Here, we introduce a novel mathematical approach to address this problem. We formulate network inference as a linear optimization problem, which can be solved efficiently even for large-scale systems. We use simulated data to evaluate our approach, and show improved performance, in particular on larger networks, over state-of-the-art methods. We achieve increased sensitivity and specificity, as well as a significant reduction in computing time. Furthermore, we show superior performance on noisy data. We then apply our approach to study the intracellular signaling of human primary naïve CD4+ T-cells, as well as ErbB signaling in trastuzumab-resistant breast cancer cells. In both cases, our approach recovers known interactions and points to additional relevant processes. In ErbB signaling, our results predict an important role of negative and positive feedback in controlling cell cycle progression.
Affiliation(s)
- Bettina Knapp
- Institute for Medical Informatics and Biometry, Medical Faculty Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- ViroQuant Research Group Modeling, BioQuant, Heidelberg University, Heidelberg, Germany
- Lars Kaderali
- Institute for Medical Informatics and Biometry, Medical Faculty Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- ViroQuant Research Group Modeling, BioQuant, Heidelberg University, Heidelberg, Germany
|
49
|
Ozsoy OE, Can T. A divide and conquer approach for construction of large-scale signaling networks from PPI and RNAi data using linear programming. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2013; 10:869-883. [PMID: 24334382 DOI: 10.1109/tcbb.2013.80] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/03/2023]
Abstract
Inference of the topology of signaling networks from perturbation experiments is a challenging problem. Recently, the inference problem has been formulated as a reference network editing problem, and it has been shown that finding the minimum number of edit operations on a reference network to comply with perturbation experiments is an NP-complete problem. In this paper, we propose an integer linear optimization (ILP) model for reconstruction of signaling networks from RNAi data and a reference network. The ILP model guarantees the optimal solution; however, it is practical only for small signaling networks of 10-15 genes due to computational complexity. To scale to large signaling networks, we propose a divide and conquer-based heuristic, in which a given reference network is divided into smaller subnetworks that are solved separately, and the solutions are merged to form the solution for the large network. We validate our proposed approach on real and synthetic data sets, and comparison with the state of the art shows that our proposed approach is able to scale better for large networks while attaining similar or better biological accuracy.
Affiliation(s)
- Tolga Can
- Middle East Technical University, Ankara
|
50
|
Abstract
High-throughput experimental technologies are generating increasingly massive and complex genomic data sets. The sheer enormity and heterogeneity of these data threaten to make the arising problems computationally infeasible. Fortunately, powerful algorithmic techniques lead to software that can answer important biomedical questions in practice. In this Review, we sample the algorithmic landscape, focusing on state-of-the-art techniques, the understanding of which will aid the bench biologist in analysing omics data. We spotlight specific examples that have facilitated and enriched analyses of sequence, transcriptomic and network data sets.
Affiliation(s)
- Bonnie Berger
- Department of Mathematics and Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA.
|