1
Pirak D, Sharan R. D'or: deep orienter of protein-protein interaction networks. Bioinformatics 2024; 40:btae355. [PMID: 38862241 PMCID: PMC11254290 DOI: 10.1093/bioinformatics/btae355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2023] [Revised: 04/19/2024] [Accepted: 06/06/2024] [Indexed: 06/13/2024] Open
Abstract
MOTIVATION: Protein-protein interactions (PPIs) provide the skeleton for signal transduction in the cell. Current PPI measurement techniques do not provide information on their directionality, which is critical for elucidating signaling pathways. To date, there are hundreds of thousands of known PPIs in public databases, yet only a small fraction of them have an assigned direction. This information gap calls for computational approaches for inferring the directionality of PPIs, also known as network orientation.
RESULTS: In this work, we propose a novel deep learning approach for PPI network orientation. Our method first generates a set of proximity scores between a protein interaction and sets of cause and effect proteins using a network propagation procedure. Each of these score sets is fed, one at a time, to a deep set encoder whose outputs are used as features for predicting the interaction's orientation. On a comprehensive dataset of oriented PPIs taken from five different sources, we achieve an area under the precision-recall curve of 0.89-0.92, outperforming previous methods. We further demonstrate the utility of the oriented network in prioritizing cancer driver genes and disease genes.
AVAILABILITY AND IMPLEMENTATION: D'or is implemented in Python and is publicly available at https://github.com/pirakd/DeepOrienter.
Affiliation(s)
- Daniel Pirak
- Department of Electrical Engineering, Tel Aviv University, Tel Aviv 69978, Israel
- Roded Sharan
- Department of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
2
Kim Y, Han Y, Hopper C, Lee J, Joo JI, Gong JR, Lee CK, Jang SH, Kang J, Kim T, Cho KH. A gray box framework that optimizes a white box logical model using a black box optimizer for simulating cellular responses to perturbations. Cell Reports Methods 2024; 4:100773. [PMID: 38744288 PMCID: PMC11133856 DOI: 10.1016/j.crmeth.2024.100773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2023] [Revised: 03/19/2024] [Accepted: 04/19/2024] [Indexed: 05/16/2024]
Abstract
Predicting cellular responses to perturbations requires interpretable insights into molecular regulatory dynamics to perform reliable cell fate control, despite the confounding non-linearity of the underlying interactions. There is a growing interest in developing machine learning-based perturbation response prediction models to handle the non-linearity of perturbation data, but their interpretation in terms of molecular regulatory dynamics remains a challenge. Alternatively, for meaningful biological interpretation, logical network models such as Boolean networks are widely used in systems biology to represent intracellular molecular regulation. However, determining the appropriate regulatory logic of large-scale networks remains an obstacle due to the high-dimensional and discontinuous search space. To tackle these challenges, we present a scalable derivative-free optimizer trained by meta-reinforcement learning for Boolean network models. The logical network model optimized by the trained optimizer successfully predicts anti-cancer drug responses of cancer cell lines, while simultaneously providing insight into their underlying molecular regulatory mechanisms.
Affiliation(s)
- Yunseong Kim
- Laboratory for Systems Biology and Bio-inspired Engineering, Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
- Younghyun Han
- Laboratory for Systems Biology and Bio-inspired Engineering, Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
- Corbin Hopper
- Laboratory for Systems Biology and Bio-inspired Engineering, Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
- Jonghoon Lee
- Laboratory for Systems Biology and Bio-inspired Engineering, Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
- Jae Il Joo
- Laboratory for Systems Biology and Bio-inspired Engineering, Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
- Jeong-Ryeol Gong
- Laboratory for Systems Biology and Bio-inspired Engineering, Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
- Chun-Kyung Lee
- Laboratory for Systems Biology and Bio-inspired Engineering, Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
- Seong-Hoon Jang
- Laboratory for Systems Biology and Bio-inspired Engineering, Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
- Junsoo Kang
- Laboratory for Systems Biology and Bio-inspired Engineering, Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
- Taeyoung Kim
- Laboratory for Systems Biology and Bio-inspired Engineering, Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
- Kwang-Hyun Cho
- Laboratory for Systems Biology and Bio-inspired Engineering, Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea.
3
Viswan NA, Bhalla US. Understanding molecular signaling cascades in neural disease using multi-resolution models. Curr Opin Neurobiol 2023; 83:102808. [PMID: 37972535 DOI: 10.1016/j.conb.2023.102808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Revised: 10/10/2023] [Accepted: 10/19/2023] [Indexed: 11/19/2023]
Abstract
If the genome defines the program for the operations of a cell, signaling networks execute it. These cascades of chemical, cell-biological, structural, and trafficking events span milliseconds (e.g., synaptic release) to potentially a lifetime (e.g., stabilization of dendritic spines). In principle almost every aspect of neuronal function, particularly at the synapse, depends on signaling. Thus dysfunction of these cascades, whether through mutations, local dysregulation, or infection, leads to disease. The sheer complexity of these pathways is matched by the range of diseases and the diversity of their phenotypes. In this review, we discuss how to build computational models, how these models are essential to tackle this complexity, and the benefits of using families of models at different levels of detail to understand signaling in health and disease.
Affiliation(s)
- Nisha Ann Viswan
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bellary Road, Bengaluru, 560065, India; The University of Trans-Disciplinary Health Sciences and Technology, Bangalore, India. https://twitter.com/nishanna
- Upinder Singh Bhalla
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bellary Road, Bengaluru, 560065, India.
4
Ahmadian Elmi M, Motamed N, Picard D. Proteomic Analyses of the G Protein-Coupled Estrogen Receptor GPER1 Reveal Constitutive Links to Endoplasmic Reticulum, Glycosylation, Trafficking, and Calcium Signaling. Cells 2023; 12:2571. [PMID: 37947649 PMCID: PMC10650109 DOI: 10.3390/cells12212571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Revised: 10/14/2023] [Accepted: 11/01/2023] [Indexed: 11/12/2023] Open
Abstract
The G protein-coupled estrogen receptor 1 (GPER1) has been proposed to mediate rapid responses to the steroid hormone estrogen. However, despite a strong interest in its potential role in cancer, whether it is indeed activated by estrogen and how this works remain controversial. To provide new tools to address these questions, we set out to determine the interactome of exogenously expressed GPER1. The combination of two orthogonal methods, namely APEX2-mediated proximity labeling and immunoprecipitation followed by mass spectrometry, gave us high-confidence results for 73 novel potential GPER1 interactors. We found that this GPER1 interactome is not affected by estrogen, a result that mirrors the constitutive activity of GPER1 in a functional assay with a Rac1 sensor. We specifically validated several hits highlighted by a gene ontology analysis. We demonstrate that CLPTM1 interacts with GPER1 and that PRKCSH and GANAB, the regulatory and catalytic subunits of α-glucosidase II, respectively, associate with CLPTM1 and potentially indirectly with GPER1. An imbalance in CLPTM1 levels induces nuclear association of GPER1, as does the overexpression of PRKCSH. Moreover, we show that the Ca2+ sensor STIM1 interacts with GPER1 and that upon STIM1 overexpression and depletion of Ca2+ stores, GPER1 becomes more nuclear. Thus, these new GPER1 interactors establish interesting connections with membrane protein maturation, trafficking, and calcium signaling.
Affiliation(s)
- Maryam Ahmadian Elmi
- Department of Cellular and Molecular Biology, School of Biology, College of Science, University of Tehran, Tehran 14155-6455, Iran
- Département de Biologie Moléculaire et Cellulaire, Université de Genève, Sciences III, Quai Ernest-Ansermet 30, CH-1211 Genève, Switzerland
- Nasrin Motamed
- Department of Cellular and Molecular Biology, School of Biology, College of Science, University of Tehran, Tehran 14155-6455, Iran
- Didier Picard
- Département de Biologie Moléculaire et Cellulaire, Université de Genève, Sciences III, Quai Ernest-Ansermet 30, CH-1211 Genève, Switzerland
5
Amgalan B, Wojtowicz D, Kim YA, Przytycka TM. Influence network model uncovers relations between biological processes and mutational signatures. Genome Med 2023; 15:15. [PMID: 36879282 PMCID: PMC9987115 DOI: 10.1186/s13073-023-01162-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2022] [Accepted: 02/08/2023] [Indexed: 03/08/2023] Open
Abstract
BACKGROUND: There has been a growing appreciation recently that mutagenic processes can be studied through the lenses of mutational signatures, which represent characteristic mutation patterns attributed to individual mutagens. However, the causal links between mutagens and observed mutation patterns as well as other types of interactions between mutagenic processes and molecular pathways are not fully understood, limiting the utility of mutational signatures.
METHODS: To gain insights into these relationships, we developed a network-based method, named GENESIGNET, that constructs an influence network among genes and mutational signatures. The approach leverages sparse partial correlation among other statistical techniques to uncover dominant influence relations between the activities of network nodes.
RESULTS: Applying GENESIGNET to cancer data sets, we uncovered important relations between mutational signatures and several cellular processes that can shed light on cancer-related processes. Our results are consistent with previous findings, such as the impact of homologous recombination deficiency on clustered APOBEC mutations in breast cancer. The network identified by GENESIGNET also suggests an interaction between APOBEC hypermutation and activation of regulatory T Cells (Tregs), as well as a relation between APOBEC mutations and changes in DNA conformation. GENESIGNET also exposed a possible link between the SBS8 signature of unknown etiology and the Nucleotide Excision Repair (NER) pathway.
CONCLUSIONS: GENESIGNET provides a new and powerful method to reveal the relation between mutational signatures and gene expression. The GENESIGNET method was implemented in Python; an installable package, the source code, and the data sets used for and generated during this study are available on GitHub at https://github.com/ncbi/GeneSigNet.
Affiliation(s)
- Bayarbaatar Amgalan
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, 20894, Bethesda, USA
- Damian Wojtowicz
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, 20894, Bethesda, USA. Current address: Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, ul. Banacha 2, 02-097, Warszawa, Poland
- Yoo-Ah Kim
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, 20894, Bethesda, USA
- Teresa M Przytycka
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, 20894, Bethesda, USA.
6
Redhu N, Thakur Z. Network biology and applications. Bioinformatics 2022. [DOI: 10.1016/b978-0-323-89775-4.00024-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022] Open
7
de Schaetzen van Brienen L, Miclotte G, Larmuseau M, Van den Eynden J, Marchal K. Network-Based Analysis to Identify Drivers of Metastatic Prostate Cancer Using GoNetic. Cancers (Basel) 2021; 13:5291. [PMID: 34771455 PMCID: PMC8582433 DOI: 10.3390/cancers13215291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2021] [Revised: 10/19/2021] [Accepted: 10/19/2021] [Indexed: 11/16/2022] Open
Abstract
Most known driver genes of metastatic prostate cancer are frequently mutated. To dig into the long tail of rarely mutated drivers, we performed network-based driver identification on the Hartwig Medical Foundation metastatic prostate cancer data set (HMF cohort). Hereto, we developed GoNetic, a method based on probabilistic pathfinding, to identify recurrently mutated subnetworks. In contrast to most state-of-the-art network-based methods, GoNetic can leverage sample-specific mutational information and the weights of the underlying prior network. When applied to the HMF cohort, GoNetic successfully recovered known primary and metastatic drivers of prostate cancer that are frequently mutated in the HMF cohort (TP53, RB1, and CTNNB1). In addition, the identified subnetworks contain frequently mutated genes, reflect processes related to metastatic prostate cancer, and contain rarely mutated driver candidates. To further validate these rarely mutated genes, we assessed whether the identified genes were more mutated in metastatic than in primary samples using an independent cohort. Then we evaluated their association with tumor evolution and with the lymph node status of the patients. This resulted in forwarding several novel putative driver genes for metastatic prostate cancer, some of which might be prognostic for disease evolution.
Affiliation(s)
- Louise de Schaetzen van Brienen
- Department of Plant Biotechnology and Bioinformatics, Faculty of Sciences, Ghent University, 9052 Ghent, Belgium; (L.d.S.v.B.); (G.M.); (M.L.)
- Department of Information Technology, Faculty of Engineering and Architecture, Ghent University-IMEC, 9052 Ghent, Belgium
- Giles Miclotte
- Department of Plant Biotechnology and Bioinformatics, Faculty of Sciences, Ghent University, 9052 Ghent, Belgium; (L.d.S.v.B.); (G.M.); (M.L.)
- Department of Information Technology, Faculty of Engineering and Architecture, Ghent University-IMEC, 9052 Ghent, Belgium
- Maarten Larmuseau
- Department of Plant Biotechnology and Bioinformatics, Faculty of Sciences, Ghent University, 9052 Ghent, Belgium; (L.d.S.v.B.); (G.M.); (M.L.)
- Department of Information Technology, Faculty of Engineering and Architecture, Ghent University-IMEC, 9052 Ghent, Belgium
- Jimmy Van den Eynden
- Department of Human Structure and Repair, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium
- Kathleen Marchal
- Department of Plant Biotechnology and Bioinformatics, Faculty of Sciences, Ghent University, 9052 Ghent, Belgium; (L.d.S.v.B.); (G.M.); (M.L.)
- Department of Information Technology, Faculty of Engineering and Architecture, Ghent University-IMEC, 9052 Ghent, Belgium
8
Iqbal S, Halim Z. Orienting Conflicted Graph Edges Using Genetic Algorithms to Discover Pathways in Protein-Protein Interaction Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:1970-1985. [PMID: 31944985 DOI: 10.1109/tcbb.2020.2966703] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 05/23/2023]
Abstract
Advanced computational techniques of the current era help to identify proteins from the complex biological network that interact with each other and with the cell's environment. Biological pathways are chains of molecular actions that lead to the creation of a new molecular product or alter the cellular state. These pathways are helpful in the prediction of many real-world issues. Rebuilding these pathways is a challenging task because protein interactions are undirected, whereas pathways are directed. To discover these pathways in protein-protein interaction data from a specified source and target, it is essential to orient protein interactions. Unfortunately, the edge orientation problem is NP-hard, which makes it challenging to develop effective algorithms. This work rebuilds biologically important pathways in a weighted network of yeast protein interactions. The proposed algorithm, the pseudo-guided multi-objective genetic algorithm (PGMOGA), rebuilds pathways by assigning orientation to the edges of the weighted network. Extending the past research, mathematical modeling of single-objective and multi-objective functions is performed. The PGMOGA is compared with four state-of-the-art approaches, namely, random orientation plus local search (ROLS), single-objective genetic algorithm (SOGA), multi-objective genetic algorithm (MOGA), and multi random search (MRS). The comparison is based on three general and four path-specific metrics. Results show that the current proposal performs better.
9
Ovens K, Eames BF, McQuillan I. Comparative Analyses of Gene Co-expression Networks: Implementations and Applications in the Study of Evolution. Front Genet 2021; 12:695399. [PMID: 34484293 PMCID: PMC8414652 DOI: 10.3389/fgene.2021.695399] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 04/14/2021] [Accepted: 07/19/2021] [Indexed: 11/13/2022] Open
Abstract
Similarities and differences in the associations of biological entities among species can provide us with a better understanding of evolutionary relationships. Often the evolution of new phenotypes results from changes to interactions in pre-existing biological networks and comparing networks across species can identify evidence of conservation or adaptation. Gene co-expression networks (GCNs), constructed from high-throughput gene expression data, can be used to understand evolution and the rise of new phenotypes. The increasing abundance of gene expression data makes GCNs a valuable tool for the study of evolution in non-model organisms. In this paper, we cover motivations for why comparing these networks across species can be valuable for the study of evolution. We also review techniques for comparing GCNs in the context of evolution, including local and global methods of graph alignment. While some protein-protein interaction (PPI) bioinformatic methods can be used to compare co-expression networks, they often disregard highly relevant properties, including the existence of continuous and negative values for edge weights. Also, the lack of comparative datasets in non-model organisms has hindered the study of evolution using PPI networks. We also discuss limitations and challenges associated with cross-species comparison using GCNs, and provide suggestions for utilizing co-expression network alignments as an indispensable tool for evolutionary studies going forward.
Affiliation(s)
- Katie Ovens
- Augmented Intelligence & Precision Health Laboratory (AIPHL), Research Institute of the McGill University Health Centre, Montreal, QC, Canada
- B. Frank Eames
- Department of Anatomy, Physiology, & Pharmacology, University of Saskatchewan, Saskatoon, SK, Canada
- Ian McQuillan
- Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada
10
Chen X, Gu J, Neuwald AF, Hilakivi-Clarke L, Clarke R, Xuan J. Identifying intracellular signaling modules and exploring pathways associated with breast cancer recurrence. Sci Rep 2021; 11:385. [PMID: 33432018 PMCID: PMC7801429 DOI: 10.1038/s41598-020-79603-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/12/2020] [Accepted: 11/18/2020] [Indexed: 11/09/2022] Open
Abstract
Exploring complex modularization of intracellular signal transduction pathways is critical to understanding aberrant cellular responses during disease development and drug treatment. IMPALA (Inferred Modularization of PAthway LAndscapes) integrates information from high-throughput gene expression experiments and genome-scale knowledge databases to identify aberrant pathway modules, thereby providing a powerful sampling strategy to reconstruct and explore pathway landscapes. Here IMPALA identifies pathway modules associated with breast cancer recurrence and Tamoxifen resistance. Focusing on estrogen-receptor (ER) signaling, IMPALA identifies alternative pathways from gene expression data of Tamoxifen-treated ER-positive breast cancer patient samples. These pathways were often interconnected through cytoplasmic genes such as IRS1/2, JAK1, YWHAZ, CSNK2A1, MAPK1 and HSP90AA1 and significantly enriched with ErbB, MAPK, and JAK-STAT signaling components. Characterization of the pathway landscape revealed key modules associated with ER signaling and with cell cycle and apoptosis signaling. We validated IMPALA-identified pathway modules using data from four different breast cancer cell lines, including models sensitive and resistant to Tamoxifen. Results showed that a majority of genes in cell cycle/apoptosis modules that were up-regulated in breast cancer patients with short survival (< 5 years) were also over-expressed in drug-resistant cell lines, whereas the transcription factors JUN, FOS, and STAT3 were down-regulated in both patient samples and drug-resistant cell lines. Hence, IMPALA-identified pathways were associated with Tamoxifen resistance and an increased risk of breast cancer recurrence. The IMPALA package is available at https://dlrl.ece.vt.edu/software/.
Affiliation(s)
- Xi Chen
- Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, 900 North Glebe Road, Arlington, VA 22203 USA; Center for Computational Biology, Flatiron Institute, Simons Foundation, 162 Fifth Avenue, New York, NY 10010 USA
- Jinghua Gu
- Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, 900 North Glebe Road, Arlington, VA 22203 USA
- Andrew F. Neuwald
- Institute for Genome Sciences and Department Biochemistry and Molecular Biology, University of Maryland School of Medicine, 670 W. Baltimore Street, Baltimore, MD 21201 USA
- Leena Hilakivi-Clarke
- Hormel Institute, University of Minnesota, 801 16th Ave NE, Austin, MN 55912 USA
- Robert Clarke
- Hormel Institute, University of Minnesota, 801 16th Ave NE, Austin, MN 55912 USA
- Jianhua Xuan
- Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, 900 North Glebe Road, Arlington, VA, 22203, USA.
11
Cardner M, Meyer-Schaller N, Christofori G, Beerenwinkel N. Inferring signalling dynamics by integrating interventional with observational data. Bioinformatics 2020; 35:i577-i585. [PMID: 31510686 PMCID: PMC6612850 DOI: 10.1093/bioinformatics/btz325] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/17/2022] Open
Abstract
Motivation: In order to infer a cell signalling network, we generally need interventional data from perturbation experiments. If the perturbation experiments are time-resolved, then signal progression through the network can be inferred. However, such designs are infeasible for large signalling networks, where it is more common to have steady-state perturbation data on the one hand, and a non-interventional time series on the other. Such was the design in a recent experiment investigating the coordination of epithelial–mesenchymal transition (EMT) in murine mammary gland cells. We aimed to infer the underlying signalling network of transcription factors and microRNAs coordinating EMT, as well as the signal progression during EMT.
Results: In the context of nested effects models, we developed a method for integrating perturbation data with a non-interventional time series. We applied the model to RNA sequencing data obtained from an EMT experiment. Part of the network inferred from RNA interference was validated experimentally using luciferase reporter assays. Our model extension is formulated as an integer linear programme, which can be solved efficiently using heuristic algorithms. This extension allowed us to infer the signal progression through the network during an EMT time course, and thereby assess when each regulator is necessary for EMT to advance.
Availability and implementation: R package at https://github.com/cbg-ethz/timeseriesNEM. The RNA sequencing data and microscopy images can be explored through a Shiny app at https://emt.bsse.ethz.ch.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mathias Cardner
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, Switzerland
12
Wagner MJ, Pratapa A, Murali TM. Reconstructing signaling pathways using regular language constrained paths. Bioinformatics 2020; 35:i624-i633. [PMID: 31510694 PMCID: PMC6612893 DOI: 10.1093/bioinformatics/btz360] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/31/2022] Open
Abstract
MOTIVATION: High-quality curation of the proteins and interactions in signaling pathways is slow and painstaking. As a result, many experimentally detected interactions are not annotated to any pathways. A natural question that arises is whether or not it is possible to automatically leverage existing pathway annotations to identify new interactions for inclusion in a given pathway.
RESULTS: We present RegLinker, an algorithm that achieves this purpose by computing multiple short paths from pathway receptors to transcription factors within a background interaction network. The key idea underlying RegLinker is the use of regular language constraints to control the number of non-pathway interactions that are present in the computed paths. We systematically evaluate RegLinker and five alternative approaches against a comprehensive set of 15 signaling pathways and demonstrate that RegLinker recovers withheld pathway proteins and interactions with the best precision and recall. We used RegLinker to propose new extensions to the pathways. We discuss the literature that supports the inclusion of these proteins in the pathways. These results show the broad potential of automated analysis to attenuate difficulties of traditional manual inquiry.
AVAILABILITY AND IMPLEMENTATION: https://github.com/Murali-group/RegLinker.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Aditya Pratapa
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
- T M Murali
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
13
Bag AK, Mandloi S, Jarmalavicius S, Mondal S, Kumar K, Mandal C, Walden P, Chakrabarti S, Mandal C. Connecting signaling and metabolic pathways in EGF receptor-mediated oncogenesis of glioblastoma. PLoS Comput Biol 2019; 15:e1007090. [PMID: 31386654 PMCID: PMC6684045 DOI: 10.1371/journal.pcbi.1007090] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 01/27/2018] [Accepted: 05/13/2019] [Indexed: 12/21/2022] Open
Abstract
As malignant transformation requires synchronization of growth-driving signaling (S) and metabolic (M) pathways, defining cancer-specific S-M interconnected networks (SMINs) could lead to better understanding of oncogenic processes. In a systems-biology approach, we developed a mathematical model for SMINs in mutated EGF receptor (EGFRvIII) compared to wild-type EGF receptor (EGFRwt) expressing glioblastoma multiforme (GBM). Starting with experimentally validated human protein-protein interactome data for S-M pathways, and incorporating proteomic data for EGFRvIII and EGFRwt GBM cells and patient transcriptomic data, we designed a dynamic model for EGFR-driven GBM-specific information flow. Key nodes and paths identified by in silico perturbation were validated experimentally when inhibition of signaling pathway proteins altered expression of metabolic proteins as predicted by the model. This demonstrated the capacity of the model to identify unknown connections between signaling and metabolic pathways, explain the robustness of oncogenic SMINs, predict drug escape, and assist identification of drug targets and the development of combination therapies.
Complex and highly dynamic interconnected networks allow cancer to take different routes and circumvent chemotherapy. Therefore, understanding these context-specific networks and their dynamics of molecular interactions driven by different oncogenic signaling and metabolic pathways is very much needed to predict drug targets and the effect of therapeutics. We incorporated high-throughput transcriptome and proteome data into mathematical models to deduce properties of cancer cells through a systems biology approach. Here we report the development, testing and validation of an integrated systems biology model of information flow between signaling and metabolic pathways to understand the regulation of the interconnection between them in cancer. Our model efficiently identified unique connections and key nodes important in signaling-metabolic information flow. We predicted some potential novel targets before performing actual drug tests. We have successfully applied this model to identify the interconnections altered in the constitutive signaling of the mutated EGFR by comparing EGF-dependent and wild-type EGFR signaling in glioblastoma multiforme.
Affiliation(s)
- Arup K. Bag
- Cancer Biology and Inflammatory Disorder Division, Indian Institute of Chemical Biology, Kolkata, India
| | - Sapan Mandloi
- Structural Biology and Bioinformatics Division, Indian Institute of Chemical Biology, Kolkata, India
| | - Saulius Jarmalavicius
- Department of Dermatology, Venerology and Allergology, Charité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
| | - Susmita Mondal
- Cancer Biology and Inflammatory Disorder Division, Indian Institute of Chemical Biology, Kolkata, India
| | - Krishna Kumar
- Structural Biology and Bioinformatics Division, Indian Institute of Chemical Biology, Kolkata, India
| | - Chhabinath Mandal
- National Institute of Pharmaceutical Education and Research, Kolkata, India
| | - Peter Walden
- Department of Dermatology, Venerology and Allergology, Charité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
- * E-mail: (PW); , (SC); , (CM)
| | - Saikat Chakrabarti
- Structural Biology and Bioinformatics Division, Indian Institute of Chemical Biology, Kolkata, India
| | - Chitra Mandal
- Cancer Biology and Inflammatory Disorder Division, Indian Institute of Chemical Biology, Kolkata, India
|
14
|
Silverbush D, Sharan R. A systematic approach to orient the human protein-protein interaction network. Nat Commun 2019; 10:3015. [PMID: 31289271 PMCID: PMC6617457 DOI: 10.1038/s41467-019-10887-6] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 02/10/2018] [Accepted: 06/06/2019] [Indexed: 11/16/2022] Open
Abstract
The protein-protein interaction (PPI) network of an organism serves as a skeleton for its signaling circuitry, which mediates cellular response to environmental and genetic cues. Understanding this circuitry could improve the prediction of gene function and cellular behavior in response to diverse signals. To realize this potential, one has to comprehensively map PPIs and their directions of signal flow. While the quality and the volume of identified human PPIs improved dramatically over the last decade, the directions of these interactions are still mostly unknown, thus precluding subsequent prediction and modeling efforts. Here we present a systematic approach to orient the human PPI network using drug response and cancer genomic data. We provide a diffusion-based method for the orientation task that significantly outperforms existing methods. The oriented network leads to improved prioritization of cancer driver genes and drug targets compared to the state-of-the-art unoriented network.
Affiliation(s)
- Dana Silverbush
- The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, 69978, Israel
| | - Roded Sharan
- The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, 69978, Israel.
|
15
|
Inostroza D, Hernández C, Seco D, Navarro G, Olivera-Nappa A. Cell cycle and protein complex dynamics in discovering signaling pathways. J Bioinform Comput Biol 2019; 17:1950011. [PMID: 31230498 DOI: 10.1142/s0219720019500112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Signaling pathways are responsible for the regulation of cell processes, such as monitoring the external environment, transmitting information across membranes, and making cell fate decisions. Given the increasing amount of biological data available and the recent discoveries showing that many diseases are related to the disruption of cellular signal transduction cascades, in silico discovery of signaling pathways in cell biology has become an active research topic in recent years. However, reconstruction of signaling pathways remains a challenge, mainly because of the need for systematic approaches for predicting causal relationships, such as edge direction and activation/inhibition among interacting proteins in the signal flow. We propose an approach for predicting signaling pathways that integrates protein interactions, gene expression, phenotypes, and protein complex information. Our method first finds candidate pathways using a directed-edge-based algorithm and then defines a graph model to include causal activation relationships among proteins in candidate pathways, using cell cycle gene expression and phenotypes to infer consistent pathways in yeast. Then, we incorporate protein complex coverage information for deciding on the final predicted signaling pathways. We show that our approach improves the predictive results of the state of the art using different ranking metrics.
Affiliation(s)
- Daniel Inostroza
- Computer Science Department, University of Concepción, Edmundo Larenas, Concepción 4030000, Chile
| | - Cecilia Hernández
- Computer Science Department, University of Concepción, Edmundo Larenas, Concepción 4030000, Chile
- Center for Biotechnology and Bioengineering (CeBiB), Santiago, Chile
| | - Diego Seco
- Computer Science Department, University of Concepción, Edmundo Larenas, Concepción 4030000, Chile
- IMFD - Millennium Institute for Foundational Research on Data, Chile
| | - Gonzalo Navarro
- Center for Biotechnology and Bioengineering (CeBiB), Department of Computer Science, University of Chile, Santiago, Chile
| | - Alvaro Olivera-Nappa
- Center for Biotechnology and Bioengineering (CeBiB), Department of Chemical Engineering and Biotechnology, University of Chile, Santiago, Chile
|
16
|
Li M, Zheng R, Li Y, Wu FX, Wang J. MGT-SM: A Method for Constructing Cellular Signal Transduction Networks. IEEE/ACM Trans Comput Biol Bioinform 2019; 16:417-424. [PMID: 28541220 DOI: 10.1109/tcbb.2017.2705143] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 06/07/2023]
Abstract
A cellular signal transduction network is an important means to describe biological responses to environmental stimuli and the exchange of biological signals. Constructing the cellular signal transduction network provides an important basis for the study of biological activities, disease mechanisms, drug targets and so on. Statistical approaches to network inference are popular in the literature. The Granger test has been used as an effective method for causality inference. Compared with bivariate Granger tests, multivariate Granger tests reduce indirect causality and have been widely used for the construction of cellular signal transduction networks. A multivariate Granger test requires that the number of time points in the time-series data exceed the number of nodes in the network. However, many real datasets have only a few time points, far fewer than the number of nodes in the network. In this study, we propose a new multivariate Granger test-based framework, called MGT-SM, to construct cellular signal transduction networks. MGT-SM uses singular value decomposition (SVD) to compute the coefficient matrix from gene expression data and adopts Monte Carlo simulation to estimate the significance of directed edges in the constructed networks. We apply MGT-SM to the Yeast Synthetic Network and MDA-MB-468, and evaluate its performance in terms of recall and AUC. The results show that MGT-SM achieves better results than other popular methods (CGC2SPR, PGC, and DBN).
|
17
|
Kabir MH, O'Connor MD. Stem cells, big data and compendium-based analyses for identifying cell types, signalling pathways and gene regulatory networks. Biophys Rev 2019; 11:41-50. [PMID: 30684132 DOI: 10.1007/s12551-018-0486-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/23/2018] [Accepted: 11/15/2018] [Indexed: 01/31/2023] Open
Abstract
Identification of new drug and cell therapy targets for disease treatment will be facilitated by a detailed molecular understanding of normal and disease development. Human pluripotent stem cells can provide a large in vitro source of human cell types and, in a growing number of instances, also three-dimensional multicellular tissues called organoids. The application of stem cell technology to discovery and development of new therapies will be aided by detailed molecular characterisation of cell identity, cell signalling pathways and target gene networks. Big data or 'omics' techniques, particularly transcriptomics and proteomics, facilitate cell and tissue characterisation using thousands to tens of thousands of genes or proteins. These gene and protein profiles are analysed using existing and/or emergent bioinformatics methods, including a growing number of methods that compare sample profiles against compendia of reference samples. This review assesses how compendium-based analyses can aid the application of stem cell technology for new therapy development, including through robust definition of differentiated stem cell identity and elucidation of complex signalling pathways and target gene networks involved in normal and diseased states.
Affiliation(s)
- Md Humayun Kabir
- School of Medicine, Western Sydney University, Campbelltown, NSW, Australia
- Department of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh
| | - Michael D O'Connor
- School of Medicine, Western Sydney University, Campbelltown, NSW, Australia
- Medical Sciences Research Group, Western Sydney University, Campbelltown, NSW, Australia
|
18
|
Siahpirani AF, Chasman D, Roy S. Integrative Approaches for Inference of Genome-Scale Gene Regulatory Networks. Methods Mol Biol 2019; 1883:161-194. [PMID: 30547400 DOI: 10.1007/978-1-4939-8882-2_7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/22/2022]
Abstract
Transcriptional regulatory networks specify the regulatory proteins of target genes that control the context-specific expression levels of genes. With our ability to profile the different types of molecular components of cells under different conditions, we are now uniquely positioned to infer regulatory networks in diverse biological contexts such as different cell types, tissues, and time points. In this chapter, we cover two main classes of computational methods to integrate different types of information to infer genome-scale transcriptional regulatory networks. The first class of methods focuses on integrative methods for specifically inferring connections between transcription factors and target genes by combining gene expression data with regulatory edge-specific knowledge. The second class of methods integrates upstream signaling networks with transcriptional regulatory networks by combining gene expression data with protein-protein interaction networks and proteomic datasets. We conclude with a section on practical applications of a network inference algorithm to infer a genome-scale regulatory network.
Affiliation(s)
- Alireza Fotuhi Siahpirani
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI, USA
| | - Deborah Chasman
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA
| | - Sushmita Roy
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
|
19
|
Kabir MH, Patrick R, Ho JWK, O'Connor MD. Identification of active signaling pathways by integrating gene expression and protein interaction data. BMC Syst Biol 2018; 12:120. [PMID: 30598083 PMCID: PMC6311899 DOI: 10.1186/s12918-018-0655-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 01/21/2023]
Abstract
Background: Signaling pathways are the key biological mechanisms that transduce extracellular signals to affect transcription factor mediated gene regulation within cells. A number of computational methods have been developed to identify the topological structure of a specific signaling pathway using protein-protein interaction data, but they are not designed for identifying active signaling pathways in an unbiased manner. On the other hand, there are statistical methods based on gene sets or pathway data that can prioritize likely active signaling pathways, but they do not make full use of the active pathway structure that links receptors, kinases and downstream transcription factors. Results: Here, we present a method to simultaneously predict the set of active signaling pathways, together with their pathway structure, by integrating protein-protein interaction network and gene expression data. We evaluated the capacity of our method to predict active signaling pathways for dental epithelial cells, ocular lens epithelial cells, human pluripotent stem cell-derived lens epithelial cells, and lens fiber cells. This analysis showed our approach could identify all the known active pathways that are associated with tooth formation and lens development. Conclusions: The results suggest that SPAGI can be a useful approach to identify the potential active signaling pathways given a gene expression profile. Our method is implemented as an open source R package, available via https://github.com/VCCRI/SPAGI/.
Affiliation(s)
- Md Humayun Kabir
- School of Medicine, Western Sydney University, Campbelltown, NSW, Australia
- Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia
- Department of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh
| | - Ralph Patrick
- Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia
- St. Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
- Stem Cells Australia, Melbourne Brain Centre, University of Melbourne, Parkville, VIC, 3010, Australia
| | - Joshua W K Ho
- Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia
- St. Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong, SAR, China
| | - Michael D O'Connor
- School of Medicine, Western Sydney University, Campbelltown, NSW, Australia
- Molecular Medicine Research Group, Western Sydney University, Campbelltown, NSW, Australia
|
20
|
Köksal AS, Beck K, Cronin DR, McKenna A, Camp ND, Srivastava S, MacGilvray ME, Bodík R, Wolf-Yadlin A, Fraenkel E, Fisher J, Gitter A. Synthesizing Signaling Pathways from Temporal Phosphoproteomic Data. Cell Rep 2018; 24:3607-3618. [PMID: 30257219 PMCID: PMC6295338 DOI: 10.1016/j.celrep.2018.08.085] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 07/18/2016] [Revised: 04/16/2018] [Accepted: 08/29/2018] [Indexed: 12/25/2022] Open
Abstract
We present a method for automatically discovering signaling pathways from time-resolved phosphoproteomic data. The Temporal Pathway Synthesizer (TPS) algorithm uses constraint-solving techniques first developed in the context of formal verification to explore paths in an interaction network. It systematically eliminates all candidate structures for a signaling pathway where a protein is activated or inactivated before its upstream regulators. The algorithm can model more than one hundred thousand dynamic phosphosites and can discover pathway members that are not differentially phosphorylated. By analyzing temporal data, TPS defines signaling cascades without needing to experimentally perturb individual proteins. It recovers known pathways and proposes pathway connections when applied to the human epidermal growth factor and yeast osmotic stress responses. Independent kinase mutant studies validate predicted substrates in the TPS osmotic stress pathway.
Affiliation(s)
- Ali Sinan Köksal
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
| | - Kirsten Beck
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Dylan R Cronin
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA; Department of Biological Sciences, Bowling Green State University, Bowling Green, OH, USA
| | - Aaron McKenna
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Nathan D Camp
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Saurabh Srivastava
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
| | | | - Rastislav Bodík
- Paul G. Allen Center for Computer Science and Engineering, University of Washington, Seattle, WA, USA
| | | | - Ernest Fraenkel
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Jasmin Fisher
- Microsoft Research, Cambridge, UK; Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Anthony Gitter
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA; Morgridge Institute for Research, Madison, WI, USA.
|
21
|
Alanis-Lobato G, Mier P, Andrade-Navarro M. The latent geometry of the human protein interaction network. Bioinformatics 2018; 34:2826-2834. [PMID: 29635317 PMCID: PMC6084611 DOI: 10.1093/bioinformatics/bty206] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 12/08/2017] [Revised: 02/16/2018] [Accepted: 04/03/2018] [Indexed: 11/21/2022] Open
Abstract
Motivation: A series of recently introduced algorithms and models advocates for the existence of a hyperbolic geometry underlying the network representation of complex systems. Since the human protein interaction network (hPIN) has a complex architecture, we hypothesized that uncovering its latent geometry could ease challenging problems in systems biology, translating them into measuring distances between proteins. Results: We embedded the hPIN into hyperbolic space and found that the inferred coordinates of nodes capture biologically relevant features, like protein age, function and cellular localization. This means that the representation of the hPIN in the two-dimensional hyperbolic plane offers a novel and informative way to visualize proteins and their interactions. We then used these coordinates to compute hyperbolic distances between proteins, which served as likelihood scores for the prediction of plausible protein interactions. Finally, we observed that proteins can efficiently communicate with each other via a greedy routing process, guided by the latent geometry of the hPIN. We show that these efficient communication channels can be used to determine the core members of signal transduction pathways and to study how system perturbations impact their efficiency. Availability and implementation: An R implementation of our network embedder is available at https://github.com/galanisl/NetHypGeom. Also, a web tool for the geometric analysis of the hPIN accompanies this text at http://cbdm-01.zdv.uni-mainz.de/~galanisl/gapi.
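The distance computation described in this abstract can be illustrated with a minimal sketch. It assumes each protein carries native polar coordinates (r, theta) in the hyperbolic plane of curvature -1 and applies the hyperbolic law of cosines; the function name and the sample coordinates are hypothetical, and this is not the NetHypGeom implementation.

```python
import math


def hyperbolic_distance(r1, theta1, r2, theta2):
    """Distance between two points given in native polar coordinates
    (radius, angle) on the hyperbolic plane of curvature -1 (assumed)."""
    # Angular separation folded into the range [0, pi].
    raw = abs(theta1 - theta2) % (2.0 * math.pi)
    dtheta = math.pi - abs(math.pi - raw)
    # Hyperbolic law of cosines.
    x = (math.cosh(r1) * math.cosh(r2)
         - math.sinh(r1) * math.sinh(r2) * math.cos(dtheta))
    # Guard against rounding pushing x slightly below 1.
    return math.acosh(max(1.0, x))


# Hypothetical coordinates for two proteins; smaller distances would be
# read as more plausible interactions under the scoring idea above.
d = hyperbolic_distance(2.0, 0.3, 5.0, 1.1)
print(f"hyperbolic distance: {d:.3f}")
```

Two points at the same angle are separated by the difference of their radii, and two points at opposite angles by the sum, which gives a quick sanity check on any set of coordinates.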
Affiliation(s)
- Gregorio Alanis-Lobato
- Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg Universität, Mainz, Germany
- Institute of Molecular Biology, Mainz, Germany
| | - Pablo Mier
- Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg Universität, Mainz, Germany
- Institute of Molecular Biology, Mainz, Germany
| | - Miguel Andrade-Navarro
- Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg Universität, Mainz, Germany
- Institute of Molecular Biology, Mainz, Germany
| |
Collapse
|
22
|
How JJ, Navlakha S. Evidence of Rentian Scaling of Functional Modules in Diverse Biological Networks. Neural Comput 2018; 30:2210-2244. [DOI: 10.1162/neco_a_01095] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 01/08/2023]
Abstract
Biological networks have long been known to be modular, containing sets of nodes that are highly connected internally. Less emphasis, however, has been placed on understanding how intermodule connections are distributed within a network. Here, we borrow ideas from engineered circuit design and study Rentian scaling, which states that the number of external connections between nodes in different modules is related to the number of nodes inside the modules by a power-law relationship. We tested this property in a broad class of molecular networks, including protein interaction networks for six species and gene regulatory networks for 41 human and 25 mouse cell types. Using evolutionarily defined modules corresponding to known biological processes in the cell, we found that all networks displayed Rentian scaling with a broad range of exponents. We also found evidence for Rentian scaling in functional modules in the Caenorhabditis elegans neural network, but, interestingly, not in three different social networks, suggesting that this property does not inevitably emerge. To understand how such scaling may have arisen evolutionarily, we derived a new graph model that can generate Rentian networks given a target Rent exponent and a module decomposition as inputs. Overall, our work uncovers a new principle shared by engineered circuits and biological networks.
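The power-law relationship mentioned in this abstract is commonly written E = k * N^p, where N is the number of nodes in a module, E its number of external connections, and p the Rent exponent. A minimal sketch, assuming this form and made-up module data, estimates p by ordinary least squares on log-transformed values; the function name is hypothetical and this is not the authors' procedure.

```python
import math


def fit_rent_exponent(module_sizes, external_edges):
    """Fit log E = log k + p * log N by least squares; return (k, p)."""
    xs = [math.log(n) for n in module_sizes]
    ys = [math.log(e) for e in external_edges]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx                       # slope = Rent exponent
    k = math.exp(mean_y - p * mean_x)   # intercept = log k
    return k, p


# Synthetic modules that follow E = 3 * N^0.6 exactly (illustrative data).
sizes = [4, 8, 16, 32, 64]
edges = [3.0 * n ** 0.6 for n in sizes]
k, p = fit_rent_exponent(sizes, edges)
print(f"k = {k:.3f}, Rent exponent p = {p:.3f}")
```

On real networks the points scatter around the line, so the quality of the log-log fit matters as much as the exponent itself.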
Affiliation(s)
- Javier J. How
- Salk Institute for Biological Studies, Integrative Biology Laboratory, La Jolla, CA 92037, U.S.A
| | - Saket Navlakha
- Salk Institute for Biological Studies, Integrative Biology Laboratory, La Jolla, CA 92037, U.S.A
|
23
|
Abstract
Motivation: A chief goal of systems biology is the reconstruction of large-scale executable models of cellular processes of interest. While accurate continuous models are still beyond reach, a powerful alternative is to learn a logical model of the processes under study, which predicts the logical state of any node of the model as a Boolean function of its incoming nodes. Key to learning such models is the functional annotation of the underlying physical interactions with activation/repression (sign) effects. Such annotations are pretty common for a few well-studied biological pathways. Results: Here we present a novel optimization framework for large-scale sign annotation that employs different plausible models of signaling and combines them in a rigorous manner. We apply our framework to two large-scale knockout datasets in yeast and evaluate its different components as well as the combined model to predict signs of different subsets of physical interactions. Overall, we obtain an accurate predictor that outperforms previous work by a considerable margin. Availability and implementation: The code is publicly available at https://github.com/spatkar94/NetworkAnnotation.git.
Affiliation(s)
- Sushant Patkar
- Computer Science, University of Maryland, College Park, MD, USA
| | - Roded Sharan
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
|
24
|
MacGilvray ME, Shishkova E, Chasman D, Place M, Gitter A, Coon JJ, Gasch AP. Network inference reveals novel connections in pathways regulating growth and defense in the yeast salt response. PLoS Comput Biol 2018; 13:e1006088. [PMID: 29738528 PMCID: PMC5940180 DOI: 10.1371/journal.pcbi.1006088] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 08/17/2017] [Accepted: 03/13/2018] [Indexed: 11/18/2022] Open
Abstract
Cells respond to stressful conditions by coordinating a complex, multi-faceted response that spans many levels of physiology. Much of the response is coordinated by changes in protein phosphorylation. Although the regulators of transcriptome changes during stress are well characterized in Saccharomyces cerevisiae, the upstream regulatory network controlling protein phosphorylation is less well dissected. Here, we developed a computational approach to infer the signaling network that regulates phosphorylation changes in response to salt stress. We developed an approach to link predicted regulators to groups of likely co-regulated phospho-peptides responding to stress, thereby creating new edges in a background protein interaction network. We then use integer linear programming (ILP) to integrate wild type and mutant phospho-proteomic data and predict the network controlling stress-activated phospho-proteomic changes. The network we inferred predicted new regulatory connections between stress-activated and growth-regulating pathways and suggested mechanisms coordinating metabolism, cell-cycle progression, and growth during stress. We confirmed several network predictions with co-immunoprecipitations coupled with mass-spectrometry protein identification and mutant phospho-proteomic analysis. Results show that the cAMP-phosphodiesterase Pde2 physically interacts with many stress-regulated transcription factors targeted by PKA, and that reduced phosphorylation of those factors during stress requires the Rck2 kinase that we show physically interacts with Pde2. Together, our work shows how a high-quality computational network model can facilitate discovery of new pathway interactions during osmotic stress.
Cells sense and respond to stressful environments by utilizing complex signaling networks that integrate diverse signals to coordinate a multi-faceted physiological response. Much of this response is controlled by post-translational protein phosphorylation. Although many regulators that mediate changes in protein phosphorylation are known, how these regulators inter-connect in a single regulatory network that can transmit cellular signals is not known. It is also unclear how regulators that promote growth and regulators that activate the stress response interconnect to reorganize resource allocation during stress. Here, we developed an integrated experimental and computational workflow to infer the signaling network that regulates phosphorylation changes during osmotic stress in the budding yeast Saccharomyces cerevisiae. The workflow integrates data measuring protein phosphorylation changes in response to osmotic stress with known physical interactions between yeast proteins from large-scale datasets, along with other information about how regulators recognize their targets. The resulting network suggested new signaling connections between regulators and pathways, including those involved in regulating growth and defense, and predicted new regulators involved in stress defense. Our work highlights the power of using network inference to deliver new insight on how cells coordinate a diverse adaptive strategy to stress.
Affiliation(s)
- Matthew E. MacGilvray
- Laboratory of Genetics, University of Wisconsin—Madison, Madison, WI, United States of America
| | - Evgenia Shishkova
- Department of Biomolecular Chemistry, University of Wisconsin—Madison, Madison, WI, United States of America
| | - Deborah Chasman
- Wisconsin Institute for Discovery, University of Wisconsin–Madison, Madison, WI, United States of America
| | - Michael Place
- Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI, United States of America
| | - Anthony Gitter
- Department of Biostatistics and Medical Informatics, University of Wisconsin -Madison, Madison, WI, United States of America
- Morgridge Institute for Research, Madison, WI, United States of America
| | - Joshua J. Coon
- Department of Biomolecular Chemistry, University of Wisconsin—Madison, Madison, WI, United States of America
- Morgridge Institute for Research, Madison, WI, United States of America
- Department of Chemistry, University of Wisconsin -Madison, Madison, WI, United States of America
- Genome Center of Wisconsin, Madison, WI, United States of America
| | - Audrey P. Gasch
- Laboratory of Genetics, University of Wisconsin—Madison, Madison, WI, United States of America
- Department of Chemistry, University of Wisconsin -Madison, Madison, WI, United States of America
|
25
|
Swings T, Weytjens B, Schalck T, Bonte C, Verstraeten N, Michiels J, Marchal K. Network-Based Identification of Adaptive Pathways in Evolved Ethanol-Tolerant Bacterial Populations. Mol Biol Evol 2018; 34:2927-2943. [PMID: 28961727 PMCID: PMC5850225 DOI: 10.1093/molbev/msx228] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 02/04/2023] Open
Abstract
Efficient production of ethanol for use as a renewable fuel requires organisms with a high level of ethanol tolerance. However, this trait is complex, and increased tolerance therefore requires mutations in multiple genes and pathways. Here, we use experimental evolution for a system-level analysis of adaptation of Escherichia coli to high ethanol stress. As adaptation to extreme stress often results in complex mutational data sets consisting of both causal and noncausal passenger mutations, identifying the true adaptive mutations in these settings is not trivial. Therefore, we developed a novel method named IAMBEE (Identification of Adaptive Mutations in Bacterial Evolution Experiments). IAMBEE exploits the temporal profile of the acquisition of mutations during evolution in combination with the functional implications of each mutation at the protein level. These data are mapped to a genome-wide interaction network to search for adaptive mutations at the level of pathways. The 16 evolved populations in our data set together harbored 2,286 mutated genes with 4,470 unique mutations. Analysis by IAMBEE significantly reduced this number and resulted in identification of 90 mutated genes and 345 unique mutations that are most likely to be adaptive. Moreover, IAMBEE not only enabled the identification of previously known pathways involved in ethanol tolerance, but also identified novel systems such as the AcrAB-TolC efflux pump and fatty acid biosynthesis, and even provided insight into the temporal profile of adaptation to ethanol stress. Furthermore, this method offers a solid framework for identifying the molecular underpinnings of other complex traits as well.
Affiliation(s)
- Toon Swings
- Department of Microbial and Molecular Systems, KU Leuven, Leuven, Belgium
- Bram Weytjens
- Department of Microbial and Molecular Systems, KU Leuven, Leuven, Belgium; Department of Information Technology, IDLab, IMEC, Ghent University, Gent, Belgium; Department of Plant Biotechnology and Bioinformatics, Ghent University, Gent, Belgium; Bioinformatics Institute Ghent, Gent, Belgium
- Thomas Schalck
- Department of Microbial and Molecular Systems, KU Leuven, Leuven, Belgium
- Camille Bonte
- Department of Microbial and Molecular Systems, KU Leuven, Leuven, Belgium
- Jan Michiels
- Department of Microbial and Molecular Systems, KU Leuven, Leuven, Belgium
- Kathleen Marchal
- Department of Information Technology, IDLab, IMEC, Ghent University, Gent, Belgium; Department of Plant Biotechnology and Bioinformatics, Ghent University, Gent, Belgium; Bioinformatics Institute Ghent, Gent, Belgium; Department of Genetics, University of Pretoria, Pretoria, South Africa
|
26
|
Peng X, Wang J, Peng W, Wu FX, Pan Y. Protein-protein interactions: detection, reliability assessment and applications. Brief Bioinform 2017; 18:798-819. [PMID: 27444371 DOI: 10.1093/bib/bbw066] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 03/22/2016] [Indexed: 01/06/2023] Open
Abstract
Protein-protein interactions (PPIs) participate in all important biological processes in living organisms, such as catalyzing metabolic reactions, DNA replication, DNA transcription, responding to stimuli and transporting molecules from one location to another. To reveal the functional mechanisms in cells, it is important to identify PPIs that take place in the living organism. A large number of PPIs have been discovered by high-throughput experiments and computational methods. However, false-positive PPIs have been introduced too. Therefore, to obtain reliable PPIs, many computational methods have been proposed. Generally, these methods can be classified into two categories. One category includes the methods that are designed to determine new reliable PPIs. The other one is designed to assess the reliability of existing PPIs and filter out the unreliable ones. In this article, we review the two kinds of methods for detecting reliable PPIs, and then focus on evaluating the performance of some of these typical methods. Later on, we also enumerate several PPI network-based applications that take a reliability assessment of the PPI data into consideration. Finally, we will discuss the challenges for obtaining reliable PPIs and future directions of the construction of reliable PPI networks. Our research will provide readers with some guidance for choosing appropriate methods and features for obtaining reliable PPIs.
|
27
|
Malik S, Sharma D, Khatri SK. Reconstructing phylogenetic tree using a protein-protein interaction technique. IET Nanobiotechnol 2017; 11:1005-1016. [PMID: 29155401 DOI: 10.1049/iet-nbt.2016.0177] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/19/2022] Open
Abstract
In this study, a novel substitution method for finding potential protein-protein interactions (PPIs) is discussed. This newly designed method for analyzing PPIs also aids in the comparison of evolutionary distances. The method handles various data sets and additionally performs a measurable assessment to determine PPIs. PPIs are biologically relevant and aid in building a better conceptual framework for phylogenetic profiling. The newly designed framework makes it possible to relate the topological properties of the system to the evolutionary behavior of datasets. Firstly, this study found that the most conserved protein motifs exist at the roots of the system, whereas newer motifs with mutations have a tendency to dwell on the branches. In-depth functional analysis revealed that the most conserved motifs have high specificity for improved structural procedures and pathway engagements, which may help identify their formative parts in cells. In conclusion, this study demonstrates several important aspects for future studies that aim to enhance phylogenetic profiling systems. Such strategies can also be used to develop new biological insights, which will further lead to understanding of disease mechanisms.
Affiliation(s)
- Shamita Malik
- Amity School of Engineering and Technology, Amity University, Uttar Pradesh, India
- Dolly Sharma
- Computer Science and Engineering Department, Shiv Nadar University, Uttar Pradesh, India
- Sunil Kumar Khatri
- Amity Institute of Information Technology, Amity University, Uttar Pradesh, India
|
28
|
Ruffalo M, Stojanov P, Pillutla VK, Varma R, Bar-Joseph Z. Reconstructing cancer drug response networks using multitask learning. BMC SYSTEMS BIOLOGY 2017; 11:96. [PMID: 29017547 PMCID: PMC5635550 DOI: 10.1186/s12918-017-0471-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 03/21/2017] [Accepted: 10/02/2017] [Indexed: 01/03/2023]
Abstract
BACKGROUND Translating in vitro results to clinical tests is a major challenge in systems biology. Here we present a new Multi-Task learning framework which integrates thousands of cell line expression experiments to reconstruct drug specific response networks in cancer. RESULTS The reconstructed networks correctly identify several shared key proteins and pathways while simultaneously highlighting many cell type specific proteins. We used top proteins from each drug network to predict survival for patients prescribed the drug. CONCLUSIONS Predictions based on proteins from the in-vitro derived networks significantly outperformed predictions based on known cancer genes indicating that Multi-Task learning can indeed identify accurate drug response networks.
Affiliation(s)
- Matthew Ruffalo
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Petar Stojanov
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Venkata Krishna Pillutla
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Rohan Varma
- Electrical and Computer Engineering, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Ziv Bar-Joseph
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA; Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
|
29
|
Fu C, Deng S, Jin G, Wang X, Yu ZG. Bayesian network model for identification of pathways by integrating protein interaction with genetic interaction data. BMC SYSTEMS BIOLOGY 2017; 11:81. [PMID: 28950903 PMCID: PMC5615243 DOI: 10.1186/s12918-017-0454-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 12/21/2022]
Abstract
Background Molecular interaction data at proteomic and genetic levels provide physical and functional insights into a molecular biosystem and are complementary sources for constructing pathway structures. Despite advances in inferring biological pathways using genetic interaction data, weaknesses remain in existing models, such as activity pathway networks (APN), when integrating data from the proteomic and genetic levels. It is necessary to develop new methods to infer pathway structure from both types of interaction data. Results We utilized a probabilistic graphical model to develop a new method that integrates genetic interaction and protein interaction data and infers exquisitely detailed pathway structure. We modeled the pathway network as a Bayesian network and applied this model to infer pathways for the coherent subsets of the global genetic interaction profiles, and the available data set of endoplasmic reticulum genes. The protein interaction data were derived from the BioGRID database. Our method can accurately reconstruct known cellular pathway structures, including SWR complex, ER-Associated Degradation (ERAD) pathway, N-Glycan biosynthesis pathway, Elongator complex, Retromer complex, and Urmylation pathway. By comparing the N-Glycan biosynthesis pathway and Urmylation pathway identified by our approach with those from APN, we found that our method is able to overcome its weakness (certain edges are inexplicable). Based on the underlying protein interaction network, we defined a simple scoring function that only adopts genetic interaction information to avoid the balance difficulty in the APN. Using an effective stochastic simulation algorithm, our proposed method achieves high performance. Conclusion We developed a new method based on a Bayesian network to infer detailed pathway structures from interaction data at proteomic and genetic levels. The results indicate that the developed method performs better in predicting signaling pathways than previously described models.
Affiliation(s)
- Changhe Fu
- School of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, China; School of Mathematics and System Science, Shenyang Normal University, Shenyang, 110034, China
- Su Deng
- School of Mathematics and System Science, Shenyang Normal University, Shenyang, 110034, China
- Guangxu Jin
- Center of Systems Biology and Bioinformatics, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA
- Xinxin Wang
- School of Mathematics and System Science, Shenyang Normal University, Shenyang, 110034, China
- Zu-Guo Yu
- School of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, China
|
30
|
Jain S, Arrais J, Venkatachari NJ, Ayyavoo V, Bar-Joseph Z. Reconstructing the temporal progression of HIV-1 immune response pathways. Bioinformatics 2017; 32:i253-i261. [PMID: 27307624 PMCID: PMC4908338 DOI: 10.1093/bioinformatics/btw254] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 12/20/2022] Open
Abstract
Motivation: Most methods for reconstructing response networks from high throughput data generate static models which cannot distinguish between early and late response stages. Results: We present TimePath, a new method that integrates time series and static datasets to reconstruct dynamic models of host response to stimulus. TimePath uses an Integer Programming formulation to select a subset of pathways that, together, explain the observed dynamic responses. Applying TimePath to study human response to HIV-1 led to accurate reconstruction of several known regulatory and signaling pathways and to novel mechanistic insights. We experimentally validated several of TimePath’s predictions, highlighting the usefulness of temporal models. Availability and Implementation: Data, Supplementary text and the TimePath software are available from http://sb.cs.cmu.edu/timepath Contact: zivbj@cs.cmu.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Siddhartha Jain
- Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, USA
- Joel Arrais
- Department of Computer Science, University of Coimbra, Coimbra, Portugal
- Velpandi Ayyavoo
- Department of Infectious Diseases, University of Pittsburgh, Pittsburgh, PA, USA
- Ziv Bar-Joseph
- Computational Biology and Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA, USA
|
31
|
A Systemic Analysis of Transcriptomic and Epigenomic Data To Reveal Regulation Patterns for Complex Disease. G3-GENES GENOMES GENETICS 2017; 7:2271-2279. [PMID: 28500050 PMCID: PMC5499134 DOI: 10.1534/g3.117.042408] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 12/26/2022]
Abstract
Integrating diverse genomics data can provide a global view of the complex biological processes related to the human complex diseases. Although substantial efforts have been made to integrate different omics data, there are at least three challenges for multi-omics integration methods: (i) How to simultaneously consider the effects of various genomic factors, since these factors jointly influence the phenotypes; (ii) How to effectively incorporate the information from publicly accessible databases and omics datasets to fully capture the interactions among (epi)genomic factors from diverse omics data; and (iii) Until present, the combination of more than two omics datasets has been poorly explored. Current integration approaches are not sufficient to address all of these challenges together. We proposed a novel integrative analysis framework by incorporating sparse model, multivariate analysis, Gaussian graphical model, and network analysis to address these three challenges simultaneously. Based on this strategy, we performed a systemic analysis for glioblastoma multiforme (GBM) integrating genome-wide gene expression, DNA methylation, and miRNA expression data. We identified three regulatory modules of genomic factors associated with GBM survival time and revealed a global regulatory pattern for GBM by combining the three modules, with respect to the common regulatory factors. Our method can not only identify disease-associated dysregulated genomic factors from different omics, but more importantly, it can incorporate the information from publicly accessible databases and omics datasets to infer a comprehensive interaction map of all these dysregulated genomic factors. Our work represents an innovative approach to enhance our understanding of molecular genomic mechanisms underlying human complex diseases.
|
32
|
Abstract
PathLinker is a graph-theoretic algorithm for reconstructing the interactions in a signaling pathway of interest. It efficiently computes multiple short paths within a background protein interaction network from the receptors to transcription factors (TFs) in a pathway. We originally developed PathLinker to complement manual curation of signaling pathways, which is slow and painstaking. The method can be used in general to connect any set of sources to any set of targets in an interaction network. The app presented here makes the PathLinker functionality available to Cytoscape users. We present an example where we used PathLinker to compute and analyze the network of interactions connecting proteins that are perturbed by the drug lovastatin.
Affiliation(s)
- Daniel P Gil
- Department of Computer Science, Virginia Tech, Blacksburg, USA
- Jeffrey N Law
- Genetics, Bioinformatics, and Computational Biology, Virginia Tech, Blacksburg, USA
- T M Murali
- Department of Computer Science, Virginia Tech, Blacksburg, USA; ICTAS Center for Systems Biology of Engineered Tissues, Virginia Tech, Blacksburg, USA
|
33
|
Azimzadeh Jamalkandi S, Mozhgani SH, Gholami Pourbadie H, Mirzaie M, Noorbakhsh F, Vaziri B, Gholami A, Ansari-Pour N, Jafari M. Systems Biomedicine of Rabies Delineates the Affected Signaling Pathways. Front Microbiol 2016; 7:1688. [PMID: 27872612 PMCID: PMC5098112 DOI: 10.3389/fmicb.2016.01688] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 08/10/2016] [Accepted: 10/07/2016] [Indexed: 12/16/2022] Open
Abstract
The prototypical neurotropic virus, rabies, is a member of the Rhabdoviridae family that causes lethal encephalomyelitis. Although there have been a plethora of studies investigating the etiological mechanism of the rabies virus and many precautionary methods have been implemented to avert the disease outbreak over the last century, the disease has surprisingly no definite remedy at its late stages. The psychological symptoms and the underlying etiology, as well as the rare survival rate from rabies encephalitis, still remain a mystery. We, therefore, undertook a systems biomedicine approach to identify the network of gene products implicated in rabies. This was done by meta-analyzing whole-transcriptome microarray datasets of the CNS infected by strain CVS-11, and integrating them with interactome data using computational and statistical methods. We first determined the differentially expressed genes (DEGs) in each study and horizontally integrated the results at the mRNA and microRNA levels separately. A total of 61 seed genes involved in the signal propagation system were obtained by means of unifying mRNA and microRNA detected integrated DEGs. We then reconstructed a refined protein–protein interaction network (PPIN) of infected cells to elucidate the rabies-implicated signal transduction network (RISN). To validate our findings, we confirmed differential expression of randomly selected genes in the network using Real-time PCR. In conclusion, the identification of seed genes and their network neighborhood within the refined PPIN can be useful for demonstrating signaling pathways including interferon circumvent, toward proliferation and survival, and neuropathological clue, explaining the intricate underlying molecular neuropathology of rabies infection and thus rendering a molecular framework for predicting potential drug targets.
Affiliation(s)
- Sayed-Hamidreza Mozhgani
- Department of Virology, School of Public Health, Tehran University of Medical Sciences Tehran, Iran
- Mehdi Mirzaie
- Department of Applied Mathematics, Faculty of Mathematical Sciences, Tarbiat Modares University Tehran, Iran
- Farshid Noorbakhsh
- Department of Immunology, School of Medicine, Tehran University of Medical Sciences Tehran, Iran
- Behrouz Vaziri
- Protein Chemistry and Proteomics Unit, Medical Biotechnology Department, Biotechnology Research Center, Pasteur Institute of Iran Tehran, Iran
- Alireza Gholami
- WHO Collaborating Center for Reference and Research on Rabies, Pasteur Institute of Iran Tehran, Iran
- Naser Ansari-Pour
- Faculty of New Sciences and Technology, University of Tehran, Tehran, Iran; Department of Genetics, Evolution and Environment, UCL Genetics Institute, University College London, London, UK
- Mohieddin Jafari
- Drug Design and Bioinformatics Unit, Medical Biotechnology Department, Biotechnology Research Center, Pasteur Institute of Iran Tehran, Iran
|
34
|
Multi-label ℓ2-regularized logistic regression for predicting activation/inhibition relationships in human protein-protein interaction networks. Sci Rep 2016; 6:36453. [PMID: 27819359 PMCID: PMC5098220 DOI: 10.1038/srep36453] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/13/2016] [Accepted: 10/17/2016] [Indexed: 11/30/2022] Open
Abstract
Protein-protein interaction (PPI) networks are naturally viewed as infrastructure to infer signalling pathways. The descriptors of signal events between two interacting proteins such as upstream/downstream signal flow, activation/inhibition relationship and protein modification are indispensable for inferring signalling pathways from PPI networks. However, such descriptors are not available in most cases as most PPI networks are seldom semantically annotated. In this work, we extend ℓ2-regularized logistic regression to the scenario of multi-label learning for predicting the activation/inhibition relationships in human PPI networks. The phenomenon that both activation and inhibition relationships exist between two interacting proteins is computationally modelled by multi-label learning framework. The problem of GO (gene ontology) sparsity is tackled by introducing the homolog knowledge as independent homolog instances. ℓ2-regularized logistic regression is accordingly adopted here to penalize the homolog noise and to reduce the computational complexity of the double-sized training data. Computational results show that the proposed method achieves satisfactory multi-label learning performance and outperforms the existing phenotype correlation method on the experimental data of Drosophila melanogaster. Several predictions have been validated against recent literature. The predicted activation/inhibition relationships in human PPI networks are provided in the supplementary file for further biomedical research.
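The multi-label setup described in this abstract can be illustrated with a minimal sketch. It assumes the simplest possible decomposition, one independent ℓ2-regularized logistic regression per label (activation, inhibition), trained by plain gradient descent; the abstract does not specify the authors' exact formulation, and the two binary features, the toy data, and the names `fit_l2_logreg`, `fit_multilabel` and `predict_multilabel` are hypothetical stand-ins, not the GO-based and homolog features used in the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_l2_logreg(X, y, lam=0.1, lr=0.5, epochs=500):
    """Gradient descent on mean log-loss + (lam / 2) * ||w||^2 (bias not penalized)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the l2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def fit_multilabel(X, Y, **kwargs):
    """Fit one binary model per label column of Y (an assumed decomposition)."""
    n_labels = len(Y[0])
    return [fit_l2_logreg(X, [row[k] for row in Y], **kwargs) for k in range(n_labels)]

def predict_multilabel(models, x, threshold=0.5):
    """Return a 0/1 decision per label; both labels may be 1 for the same pair."""
    return [int(sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= threshold)
            for w, b in models]

# Toy data (hypothetical): two binary features per interacting protein pair,
# label columns are [activation, inhibition].
X = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1]]
Y = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1]]
models = fit_multilabel(X, Y)
predictions = [predict_multilabel(models, x) for x in ([1, 0], [0, 1], [1, 1])]
```

Treating the labels independently lets a single protein pair receive both an activation and an inhibition label, which is the phenomenon the abstract says the multi-label framework is meant to capture.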
|
35
|
Huang XT, Zhu Y, Chan LLH, Zhao Z, Yan H. An integrative C. elegans protein-protein interaction network with reliability assessment based on a probabilistic graphical model. MOLECULAR BIOSYSTEMS 2016; 12:85-92. [PMID: 26555698 DOI: 10.1039/c5mb00417a] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 12/31/2022]
Abstract
In Caenorhabditis elegans, a large number of protein-protein interactions (PPIs) are identified by different experiments. However, a comprehensive weighted PPI network, which is essential for signaling pathway inference, is not yet available in this model organism. Therefore, we firstly construct an integrative PPI network in C. elegans with 12,951 interactions involving 5039 proteins from seven molecular interaction databases. Then, a reliability score based on a probabilistic graphical model (RSPGM) is proposed to assess PPIs. It assumes that the random number of interactions between two proteins comes from the Bernoulli distribution to avoid multi-links. The main parameter of the RSPGM score contains a few latent variables which can be considered as several common properties between two proteins. Validations on high-confidence yeast datasets show that RSPGM provides more accurate evaluation than other approaches, and the PPIs in the reconstructed PPI network have higher biological relevance than those in the original network in terms of gene ontology, gene expression, essentiality and the prediction of known protein complexes. Furthermore, this weighted integrative PPI network in C. elegans is also used to infer the interaction path of the canonical Wnt/β-catenin pathway. Most genes on the inferred interaction path have been validated to be Wnt pathway components. Therefore, RSPGM is essential and effective for evaluating PPIs and inferring interaction paths. Finally, the PPI network with RSPGM scores can be queried and visualized on a user interactive website, which is freely available at .
Affiliation(s)
- Xiao-Tai Huang
- Department of Electronic Engineering, City University of Hong Kong, Hong Kong, China
- Yuan Zhu
- Department of Electronic Engineering, City University of Hong Kong, Hong Kong, China and School of Automation, China University of Geosciences, Wuhan, China
- Leanne Lai Hang Chan
- Department of Electronic Engineering, City University of Hong Kong, Hong Kong, China
- Zhongying Zhao
- Department of Biology, Faculty of Science, Hong Kong Baptist University, Hong Kong, China
- Hong Yan
- Department of Electronic Engineering, City University of Hong Kong, Hong Kong, China
|
36
|
Uhart M, Flores G, Bustos DM. Controllability of protein-protein interaction phosphorylation-based networks: Participation of the hub 14-3-3 protein family. Sci Rep 2016; 6:26234. [PMID: 27195976 PMCID: PMC4872533 DOI: 10.1038/srep26234] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 12/11/2015] [Accepted: 04/28/2016] [Indexed: 12/26/2022] Open
Abstract
Posttranslational regulation of protein function is a ubiquitous mechanism in eukaryotic cells. Here, we analyzed biological properties of nodes and edges of a human protein-protein interaction phosphorylation-based network, especially of those nodes critical for the network controllability. We found that the minimal number of critical nodes needed to control the whole network is 29%, which is considerably lower compared to other real networks. These critical nodes are more regulated by posttranslational modifications and contain more binding domains to these modifications than other kinds of nodes in the network, suggesting an intra-group fast regulation. Also, when we analyzed the characteristics of the edges that connect critical and non-critical nodes, we found that the former are enriched in domain-to-eukaryotic linear motif interactions, whereas the latter are enriched in domain-domain interactions. Our findings suggest a possible structure for protein-protein interaction networks with a densely interconnected and self-regulated central core, composed of critical nodes with a high participation in the controllability of the full network, and less regulated peripheral nodes. Our study offers a deeper understanding of complex network control and bridges the controllability theorems for complex networks and biological protein-protein interaction phosphorylation-based networked systems.
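To make the notion of a minimal control set concrete, the sketch below counts driver nodes under the standard structural-controllability result for directed networks: the minimum number of driver nodes equals max(N - |M|, 1), where |M| is the size of a maximum matching between the "out" and "in" copies of the nodes. Whether this abstract's 29% figure was obtained in exactly this way is an assumption, and the function names and toy graphs are hypothetical.

```python
def max_matching_size(n, edges):
    """Maximum matching in the bipartite graph (out-copy of u) -> (in-copy of v).

    n: number of nodes, labelled 0..n-1; edges: iterable of directed (u, v) pairs.
    Uses simple augmenting paths (Kuhn's algorithm), fine for small graphs.
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
    matched_to = [None] * n  # matched_to[v] = u if edge u -> v is in the matching

    def augment(u, seen):
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            # Use v if it is free, or if its current partner can be re-matched.
            if matched_to[v] is None or augment(matched_to[v], seen):
                matched_to[v] = u
                return True
        return False

    return sum(1 for u in range(n) if augment(u, set()))

def min_driver_nodes(n, edges):
    """Driver-node count: unmatched nodes, with at least one driver always needed."""
    return max(n - max_matching_size(n, edges), 1)

# Toy examples (hypothetical): a directed chain needs one driver node,
# while a star needs a driver for all but one of the hub's targets.
chain = [(0, 1), (1, 2), (2, 3)]
star = [(0, 1), (0, 2), (0, 3)]
chain_fraction = min_driver_nodes(4, chain) / 4
star_fraction = min_driver_nodes(4, star) / 4
```

Dividing the driver-node count by the number of nodes gives a fraction comparable in kind to the percentage reported in the abstract.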
Affiliation(s)
- Marina Uhart
- Cell Signal Integration Lab, Instituto de Histología y Embriología “Dr. Mario H. Burgos” CCT CONICET Mendoza Facultad de Ciencias Médicas U.N. Cuyo P.O. Box 56 - Mendoza - ZIP 5500 Argentina
- Gabriel Flores
- Eventioz/Eventbrite Company, Adolfo A Calle 1853, Dorrego, Guaymallén, Mendoza, Argentina
- Diego M. Bustos
- Cell Signal Integration Lab, Instituto de Histología y Embriología “Dr. Mario H. Burgos” CCT CONICET Mendoza Facultad de Ciencias Médicas U.N. Cuyo P.O. Box 56 - Mendoza - ZIP 5500 Argentina
|
37
|
Tuncbag N, Gosline SJC, Kedaigle A, Soltis AR, Gitter A, Fraenkel E. Network-Based Interpretation of Diverse High-Throughput Datasets through the Omics Integrator Software Package. PLoS Comput Biol 2016; 12:e1004879. [PMID: 27096930 PMCID: PMC4838263 DOI: 10.1371/journal.pcbi.1004879] [Citation(s) in RCA: 91] [Impact Index Per Article: 11.4] [Received: 07/17/2015] [Accepted: 03/23/2016] [Indexed: 02/07/2023] Open
Abstract
High-throughput, ‘omic’ methods provide sensitive measures of biological responses to perturbations. However, inherent biases in high-throughput assays make it difficult to interpret experiments in which more than one type of data is collected. In this work, we introduce Omics Integrator, a software package that takes a variety of ‘omic’ data as input and identifies putative underlying molecular pathways. The approach applies advanced network optimization algorithms to a network of thousands of molecular interactions to find high-confidence, interpretable subnetworks that best explain the data. These subnetworks connect changes observed in gene expression, protein abundance or other global assays to proteins that may not have been measured in the screens due to inherent bias or noise in measurement. This approach reveals unannotated molecular pathways that would not be detectable by searching pathway databases. Omics Integrator also provides an elegant framework to incorporate not only positive data, but also negative evidence. Incorporating negative evidence allows Omics Integrator to avoid unexpressed genes and avoid being biased toward highly-studied hub proteins, except when they are strongly implicated by the data. The software is comprised of two individual tools, Garnet and Forest, that can be run together or independently to allow a user to perform advanced integration of multiple types of high-throughput data as well as create condition-specific subnetworks of protein interactions that best connect the observed changes in various datasets. It is available at http://fraenkel.mit.edu/omicsintegrator and on GitHub at https://github.com/fraenkel-lab/OmicsIntegrator.
Affiliation(s)
- Nurcan Tuncbag
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Sara J. C. Gosline
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Amanda Kedaigle
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Anthony R. Soltis
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Anthony Gitter
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Ernest Fraenkel
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
|
38
|
Crespo I, Doucey MA, Xenarios I. Social networks help to infer causality in the tumor microenvironment. BMC Res Notes 2016; 9:168. [PMID: 26979239 PMCID: PMC4793762 DOI: 10.1186/s13104-016-1976-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/02/2016] [Accepted: 03/03/2016] [Indexed: 11/10/2022] Open
Abstract
Background Networks have become a popular way to conceptualize a system of interacting elements, such as electronic circuits, social communication, metabolism or gene regulation. Network inference, analysis, and modeling techniques have been developed in different areas of science and technology, such as computer science, mathematics, physics, and biology, with an active interdisciplinary exchange of concepts and approaches. However, some concepts seem to belong to a specific field without a clear transferability to other domains. At the same time, it is increasingly recognized that within some biological systems (such as the tumor microenvironment) where different types of resident and infiltrating cells interact to carry out their functions, the complexity of the system demands a theoretical framework, such as statistical inference, graph analysis and dynamical models, in order to assess and study the information derived from high-throughput experimental technologies. Results In this article we propose to adopt and adapt the concepts of influence and investment from the world of social network analysis to biological problems, and in particular to apply this approach to infer causality in the tumor microenvironment. We showed that constructing a bidirectional network of influence between cell and cell communication molecules allowed us to determine the direction of inferred regulations at the expression level and correctly recapitulate cause-effect relationships described in literature. Conclusions This work constitutes an example of a transfer of knowledge and concepts from the world of social network analysis to biomedical research, in particular to infer network causality in biological networks. This causality elucidation is essential to model the homeostatic response of biological systems to internal and external factors, such as environmental conditions, pathogens or treatments.
Electronic supplementary material The online version of this article (doi:10.1186/s13104-016-1976-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Isaac Crespo
  - Vital-IT, SIB (Swiss Institute of Bioinformatics), University of Lausanne, Lausanne, Switzerland
- Marie-Agnès Doucey
  - Ludwig Center for Cancer Research, University of Lausanne, Epalinges, Switzerland
- Ioannis Xenarios
  - Vital-IT, SIB (Swiss Institute of Bioinformatics), University of Lausanne, Lausanne, Switzerland
|
39
|
Pathways on demand: automated reconstruction of human signaling networks. NPJ Syst Biol Appl 2016; 2:16002. [PMID: 28725467 PMCID: PMC5516854 DOI: 10.1038/npjsba.2016.2] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Received: 07/09/2015] [Revised: 11/26/2015] [Accepted: 11/27/2015] [Indexed: 12/13/2022] Open
Abstract
Signaling pathways are a cornerstone of systems biology. Several databases store high-quality representations of these pathways that are amenable to automated analyses. Despite painstaking and manual curation, these databases remain incomplete. We present PATHLINKER, a new computational method to reconstruct the interactions in a signaling pathway of interest. PATHLINKER efficiently computes multiple short paths from the receptors to transcriptional regulators (TRs) in a pathway within a background protein interaction network. We use PATHLINKER to accurately reconstruct a comprehensive set of signaling pathways from the NetPath and KEGG databases. We show that PATHLINKER has higher precision and recall than several state-of-the-art algorithms, while also ensuring that the resulting network connects receptor proteins to TRs. PATHLINKER’s reconstruction of the Wnt pathway identified CFTR, an ABC class chloride ion channel transporter, as a novel intermediary that facilitates the signaling of Ryk to Dab2, which are known components of Wnt/β-catenin signaling. In HEK293 cells, we show that the Ryk–CFTR–Dab2 path is a novel amplifier of β-catenin signaling specifically in response to Wnt 1, 2, 3, and 3a of the 11 Wnts tested. PATHLINKER captures the structure of signaling pathways as represented in pathway databases better than existing methods. PATHLINKER’s success in reconstructing pathways from NetPath and KEGG databases points to its applicability for complementing manual curation of these databases. PATHLINKER may serve as a promising approach for prioritizing proteins and interactions for experimental study, as illustrated by its discovery of a novel pathway in Wnt/β-catenin signaling.
Our supplementary website at http://bioinformatics.cs.vt.edu/~murali/supplements/2016-sys-bio-applications-pathlinker/ provides links to the PATHLINKER software, input datasets, PATHLINKER reconstructions of NetPath pathways, and links to interactive visualizations of these reconstructions on GraphSpace.
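The idea of computing multiple short receptor-to-TR paths in a background network can be illustrated with a small sketch. This is not the PATHLINKER implementation: the function name `k_shortest_paths`, the toy graph, and the best-first enumeration strategy are all assumptions made for illustration, and the approach is only practical for small graphs.

```python
import heapq


def k_shortest_paths(graph, sources, targets, k):
    """Return up to k lowest-cost simple paths from any source to any target.

    graph maps node -> {neighbor: non-negative edge cost}. Partial paths are
    expanded in order of increasing cost, so complete paths come out sorted.
    Hypothetical helper for illustration; exponential in the worst case.
    """
    targets = set(targets)
    heap = [(0.0, [s]) for s in sorted(sources)]
    heapq.heapify(heap)
    found = []
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node in targets:
            # A complete receptor-to-TR path; do not extend it further.
            found.append((cost, path))
            continue
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in path:  # keep paths simple (no repeated nodes)
                heapq.heappush(heap, (cost + weight, path + [neighbor]))
    return found


# Toy directed interaction network (made-up nodes and costs).
toy_graph = {
    "R1": {"A": 1.0, "B": 2.0},
    "R2": {"B": 1.0},
    "A": {"TF1": 1.0},
    "B": {"TF1": 0.5},
}

for cost, path in k_shortest_paths(toy_graph, ["R1", "R2"], ["TF1"], k=2):
    print(cost, " -> ".join(path))
```

Taking the union of the edges on the returned paths gives one possible candidate pathway subnetwork; the choice of k controls how large that subnetwork grows.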
|
40
|
Hou J, Acharya L, Zhu D, Cheng J. An overview of bioinformatics methods for modeling biological pathways in yeast. Brief Funct Genomics 2016; 15:95-108. [PMID: 26476430 PMCID: PMC5065356 DOI: 10.1093/bfgp/elv040] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 11/14/2022] Open
Abstract
The advent of high-throughput genomics techniques, along with the completion of genome sequencing projects, identification of protein-protein interactions and reconstruction of genome-scale pathways, has accelerated the development of systems biology research in the yeast organism Saccharomyces cerevisiae. In particular, discovery of biological pathways in yeast has become an important forefront in systems biology, which aims to understand the interactions among molecules within a cell leading to certain cellular processes in response to a specific environment. While the existing theoretical and experimental approaches enable the investigation of well-known pathways involved in metabolism, gene regulation and signal transduction, bioinformatics methods offer new insights into computational modeling of biological pathways. A wide range of computational approaches has been proposed in the past for reconstructing biological pathways from high-throughput datasets. Here we review selected bioinformatics approaches for modeling biological pathways in S. cerevisiae, including metabolic pathways, gene-regulatory pathways and signaling pathways. We start with reviewing the research on biological pathways followed by discussing key biological databases. In addition, several representative computational approaches for modeling biological pathways in yeast are discussed.
|
41
|
De Maeyer D, Weytjens B, De Raedt L, Marchal K. Network-Based Analysis of eQTL Data to Prioritize Driver Mutations. Genome Biol Evol 2016; 8:481-94. [PMID: 26802430 PMCID: PMC4825419 DOI: 10.1093/gbe/evw010] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 12/23/2022] Open
Abstract
In clonal systems, interpreting driver genes in terms of molecular networks helps in understanding how these drivers elicit an adaptive phenotype. Obtaining such a network-based understanding depends on the correct identification of driver genes. In clonal systems, independent evolved lines can acquire a similar adaptive phenotype by affecting the same molecular pathways, a phenomenon referred to as parallelism at the molecular pathway level. This implies that successful driver identification depends on interpreting mutated genes in terms of molecular networks. Driver identification and obtaining a network-based understanding of the adaptive phenotype are thus confounded problems that ideally should be solved simultaneously. In this study, a network-based eQTL method is presented that solves both the driver identification and the network-based interpretation problem. As input the method uses coupled genotype-expression phenotype data (eQTL data) of independently evolved lines with similar adaptive phenotypes and an organism-specific genome-wide interaction network. The search for mutational consistency at pathway level is defined as a subnetwork inference problem, which consists of inferring a subnetwork from the genome-wide interaction network that best connects the genes containing mutations to differentially expressed genes. Based on their connectivity with the differentially expressed genes, mutated genes are prioritized as driver genes. Based on semisynthetic data and two publicly available data sets, we illustrate the potential of the network-based eQTL method to prioritize driver genes and to gain insights into the molecular mechanisms underlying an adaptive phenotype. The method is available at http://bioinformatics.intec.ugent.be/phenetic_eqtl/index.html
Affiliation(s)
- Dries De Maeyer
  - Department of Information Technology (INTEC, iMINDS), UGent, 9052 Ghent, Belgium
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, 9052 Gent, Belgium
  - Bioinformatics Institute Ghent, Technologiepark 927, 9052 Ghent, Belgium
  - Department of Microbial and Molecular Systems, KU Leuven, Kasteelpark Arenberg 20, B-3001 Leuven, Belgium
- Bram Weytjens
  - Department of Information Technology (INTEC, iMINDS), UGent, 9052 Ghent, Belgium
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, 9052 Gent, Belgium
  - Bioinformatics Institute Ghent, Technologiepark 927, 9052 Ghent, Belgium
  - Department of Microbial and Molecular Systems, KU Leuven, Kasteelpark Arenberg 20, B-3001 Leuven, Belgium
- Luc De Raedt
  - Department of Computer Science, KU Leuven, Celestijnenlaan 200A, B-3001 Leuven, Belgium
- Kathleen Marchal
  - Department of Information Technology (INTEC, iMINDS), UGent, 9052 Ghent, Belgium
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, 9052 Gent, Belgium
  - Bioinformatics Institute Ghent, Technologiepark 927, 9052 Ghent, Belgium
  - Department of Genetics, University of Pretoria, Hatfield Campus, Pretoria 0028, South Africa
  - Department of Microbial and Molecular Systems, KU Leuven, Kasteelpark Arenberg 20, B-3001 Leuven, Belgium
|
42
|
Gitter A, Bar-Joseph Z. The SDREM Method for Reconstructing Signaling and Regulatory Response Networks: Applications for Studying Disease Progression. Methods Mol Biol 2016; 1303:493-506. [PMID: 26235087 DOI: 10.1007/978-1-4939-2627-5_30] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/04/2023]
Abstract
The Signaling and Dynamic Regulatory Events Miner (SDREM) is a powerful computational approach for identifying which signaling pathways and transcription factors control the temporal cellular response to a stimulus. SDREM builds end-to-end response models by combining condition-independent protein-protein interactions and transcription factor binding data with two types of condition-specific data: source proteins that detect the stimulus and changes in gene expression over time. Here we describe how to apply SDREM to study human diseases, using epidermal growth factor (EGF) response impacting neurogenesis and Alzheimer's disease as an example.
|
43
|
Mei S, Zhu H. Multi-label multi-instance transfer learning for simultaneous reconstruction and cross-talk modeling of multiple human signaling pathways. BMC Bioinformatics 2015; 16:417. [PMID: 26718335 PMCID: PMC4697333 DOI: 10.1186/s12859-015-0841-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 03/02/2015] [Accepted: 07/13/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Signaling pathways play important roles in the life processes of cell growth, cell apoptosis and organism development. At present the signal transduction networks are far from complete. As an effective complement to experimental methods, computational modeling is suited to rapidly reconstruct the signaling pathways at low cost. To our knowledge, the existing computational methods seldom integrate more than three signaling pathways into one predictive model for the discovery of novel signaling components and the modeling of cross-talk between signaling pathways. RESULTS: In this work, we propose a multi-label multi-instance transfer learning method to simultaneously reconstruct 27 human signaling pathways and model their cross-talks. Computational results show that the proposed method demonstrates satisfactory multi-label learning performance and rational proteome-wide predictions. Some predicted signaling components or pathway-targeted proteins have been validated by recent literature. The predicted signaling components are further linked to pathways using the experimentally derived PPIs (protein-protein interactions) to reconstruct the human signaling pathways. Thus the map of the cross-talks via common signaling components and common signaling PPIs is conveniently inferred to provide valuable insights into the regulatory and cooperative relationships between signaling pathways. Lastly, gene ontology enrichment analysis is conducted to gain statistical knowledge about the reconstructed human signaling pathways. CONCLUSIONS: The multi-label learning framework is shown in this work to be effective for modeling the phenomenon that a signaling protein belongs to more than one signaling pathway. As a result, novel signaling components and pathway-targeted proteins are predicted to simultaneously reconstruct multiple human signaling pathways and the static map of their cross-talks for further biomedical research.
Affiliation(s)
- Suyu Mei
  - Software College, Shenyang Normal University, Shenyang, China
  - Bioinformatics Section, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Hao Zhu
  - Bioinformatics Section, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
|
44
|
Mei S, Zhu H. A simple feature construction method for predicting upstream/downstream signal flow in human protein-protein interaction networks. Sci Rep 2015; 5:17983. [PMID: 26648121 PMCID: PMC4673612 DOI: 10.1038/srep17983] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 07/10/2015] [Accepted: 11/10/2015] [Indexed: 12/24/2022] Open
Abstract
Signaling pathways play important roles in understanding the underlying mechanism of cell growth, cell apoptosis, organismal development and pathway-aberrant diseases. Protein-protein interaction (PPI) networks are a commonly used infrastructure for inferring signaling pathways. However, PPI networks generally carry no information on the upstream/downstream relationship between interacting proteins, which hinders inference of the signal flow of signaling pathways. In this work, we propose a simple feature construction method to train an SVM (support vector machine) classifier to predict PPI upstream/downstream relations. The domain-based asymmetric feature representation naturally embodies domain-domain upstream/downstream relations, providing an unconventional avenue to predict the directionality between two objects. Moreover, we propose a semantically interpretable decision function and a macro bag-level performance metric to satisfy the need of two-instance depiction of an interacting protein pair. Experimental results show that the proposed method achieves satisfactory cross validation performance and independent test performance. Lastly, we use the trained model to predict the PPIs in HPRD, Reactome and IntAct. Some predictions have been validated against recent literature.
Affiliation(s)
- Suyu Mei
  - Software College, Shenyang Normal University, Shenyang, China
  - Bioinformatics Section, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Hao Zhu
  - Bioinformatics Section, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
|
45
|
Drug target prioritization by perturbed gene expression and network information. Sci Rep 2015; 5:17417. [PMID: 26615774 PMCID: PMC4663505 DOI: 10.1038/srep17417] [Citation(s) in RCA: 85] [Impact Index Per Article: 9.4] [Received: 04/10/2015] [Accepted: 10/29/2015] [Indexed: 12/27/2022] Open
Abstract
Drugs bind to their target proteins, which interact with downstream effectors and ultimately perturb the transcriptome of a cancer cell. These perturbations reveal information about their source, i.e., drugs’ targets. Here, we investigate whether these perturbations and protein interaction networks can uncover drug targets and key pathways. We performed the first systematic analysis of over 500 drugs from the Connectivity Map. First, we show that the gene expression of drug targets is usually not significantly affected by the drug perturbation. Hence, expression changes after drug treatment on their own are not sufficient to identify drug targets. However, ranking of candidate drug targets by network topological measures prioritizes the targets. We introduce a novel measure, local radiality, which combines perturbed genes and functional interaction network information. The new measure outperforms other methods in target prioritization and proposes cancer-specific pathways from drugs to affected genes for the first time. Local radiality identifies more diverse targets with fewer neighbors and possibly fewer side effects.
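Ranking candidate targets by a topological measure that combines perturbed genes with network information can be sketched as follows. The abstract does not give the formula for local radiality, so the score below is a simplified stand-in and not the paper's measure: it assumes an unweighted, undirected network and ranks candidates by their mean shortest-path distance to the perturbed genes. The function names and the toy network are hypothetical.

```python
from collections import deque


def bfs_distances(adj, start):
    """Hop distances from start to every reachable node (unweighted BFS)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adj.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def mean_distance_to_perturbed(adj, candidate, perturbed):
    """Mean hop distance from candidate to the perturbed genes it can reach.

    Assumed proximity score for illustration; lower means closer.
    Returns infinity when no perturbed gene is reachable.
    """
    dist = bfs_distances(adj, candidate)
    reachable = [dist[g] for g in perturbed if g in dist]
    if not reachable:
        return float("inf")
    return sum(reachable) / len(reachable)


def rank_candidates(adj, candidates, perturbed):
    """Sort candidate targets from closest to farthest from the perturbed genes."""
    return sorted(candidates, key=lambda c: mean_distance_to_perturbed(adj, c, perturbed))


# Toy undirected network as adjacency lists (made-up gene names).
toy_adj = {
    "T1": ["A"],
    "A": ["T1", "G1", "G2"],
    "G1": ["A", "C"],
    "G2": ["A", "C"],
    "C": ["G1", "G2", "B"],
    "B": ["C", "T2"],
    "T2": ["B"],
}

print(rank_candidates(toy_adj, ["T2", "T1"], ["G1", "G2"]))
```

A distance-based score of this kind uses only network position relative to the perturbed genes, which is consistent with the abstract's observation that the expression of the targets themselves is usually not significantly changed.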
|
46
|
De Maeyer D, Weytjens B, Renkens J, De Raedt L, Marchal K. PheNetic: network-based interpretation of molecular profiling data. Nucleic Acids Res 2015; 43:W244-50. [PMID: 25878035 PMCID: PMC4489255 DOI: 10.1093/nar/gkv347] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 01/31/2015] [Accepted: 04/03/2015] [Indexed: 12/17/2022] Open
Abstract
Molecular profiling experiments have become standard in current wet-lab practices. Classically, enrichment analysis has been used to identify biological functions related to these experimental results. Combining molecular profiling results with the wealth of currently available interactomics data, however, offers the opportunity to identify the molecular mechanism behind an observed molecular phenotype. In this paper, we therefore introduce ‘PheNetic’, a user-friendly web server for inferring a sub-network based on probabilistic logical querying. PheNetic extracts from an interactome the sub-network that best explains genes prioritized through a molecular profiling experiment. Depending on its run mode, PheNetic searches either for a regulatory mechanism that gave rise to the observed molecular phenotype or for the pathways (in)activated in the molecular phenotype. The web server provides access to a large number of interactomes, making sub-network inference readily applicable to a wide variety of organisms. The inferred sub-networks can be interactively visualized in the browser. PheNetic's method and use are illustrated using an example analysis of differential expression results of ampicillin-treated Escherichia coli cells. The PheNetic web service is available at http://bioinformatics.intec.ugent.be/phenetic/.
Affiliation(s)
- Dries De Maeyer
  - Dept. of Microbial and Molecular Systems, KULeuven, Leuven, 3000, Belgium
  - Dept. of Information Technology (INTEC, iMINDS), U.Ghent, Ghent, 9052, Belgium
- Bram Weytjens
  - Dept. of Microbial and Molecular Systems, KULeuven, Leuven, 3000, Belgium
  - Dept. of Information Technology (INTEC, iMINDS), U.Ghent, Ghent, 9052, Belgium
- Joris Renkens
  - Dept. of Computer Science, KULeuven, Leuven, 3000, Belgium
- Luc De Raedt
  - Dept. of Computer Science, KULeuven, Leuven, 3000, Belgium
- Kathleen Marchal
  - Dept. of Microbial and Molecular Systems, KULeuven, Leuven, 3000, Belgium
  - Dept. of Information Technology (INTEC, iMINDS), U.Ghent, Ghent, 9052, Belgium
  - Dept. of Plant Biotechnology and Bioinformatics, U.Ghent, Ghent, 9052, Belgium
|
47
|
Nguyen HA, Vu CL, Tu MP, Bui TL. Discovery of pathways in protein–protein interaction networks using a genetic algorithm. DATA KNOWL ENG 2015. [DOI: 10.1016/j.datak.2015.04.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/23/2022]
|
48
|
Abstract
Behaviours of complex biomolecular systems are often irreducible to the elementary properties of their individual components. Explanatory and predictive mathematical models are therefore useful for fully understanding and precisely engineering cellular functions. The development and analyses of these models require their adaptation to the problems that need to be solved and the type and amount of available genetic or molecular data. Quantitative and logic modelling are among the main methods currently used to model molecular and gene networks. Each approach comes with inherent advantages and weaknesses. Recent developments show that hybrid approaches will become essential for further progress in synthetic biology and in the development of virtual organisms.
Affiliation(s)
- Nicolas Le Novère
  - Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT, UK
|
49
|
Speegle G. P-Finder: Reconstruction of Signaling Networks from Protein-Protein Interactions and GO Annotations. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2015; 12:309-321. [PMID: 26357219 DOI: 10.1109/tcbb.2014.2355216] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 06/05/2023]
Abstract
Because most complex genetic diseases are caused by defects of cell signaling, illuminating a signaling cascade is essential for understanding their mechanisms. We present three novel computational algorithms to reconstruct signaling networks between a starting protein and an ending protein using genome-wide protein-protein interaction (PPI) networks and gene ontology (GO) annotation data. A signaling network is represented as a directed acyclic graph in a merged form of multiple linear pathways. An advanced semantic similarity metric is applied for weighting PPIs as the preprocessing of all three methods. The first algorithm repeatedly extends the list of nodes based on path frequency towards an ending protein. The second algorithm repeatedly appends edges based on the occurrence of network motifs which indicate the link patterns more frequently appearing in a PPI network than in a random graph. The last algorithm uses the information propagation technique which iteratively updates edge orientations based on the path strength and merges the selected directed edges. Our experimental results demonstrate that the proposed algorithms achieve higher accuracy than previous methods when they are tested on well-studied pathways of S. cerevisiae. Furthermore, we introduce an interactive web application tool, called P-Finder, to visualize reconstructed signaling networks.
|
50
|
Jain S, Gitter A, Bar-Joseph Z. Multitask learning of signaling and regulatory networks with application to studying human response to flu. PLoS Comput Biol 2014; 10:e1003943. [PMID: 25522349 PMCID: PMC4270428 DOI: 10.1371/journal.pcbi.1003943] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 06/03/2014] [Accepted: 09/28/2014] [Indexed: 01/04/2023] Open
Abstract
Reconstructing regulatory and signaling response networks is one of the major goals of systems biology. While several successful methods have been suggested for this task, some integrating large and diverse datasets, these methods have so far been applied to reconstruct a single response network at a time, even when studying and modeling related conditions. To improve network reconstruction we developed MT-SDREM, a multi-task learning method which jointly models networks for several related conditions. In MT-SDREM, parameters are jointly constrained across the networks while still allowing for condition-specific pathways and regulation. We formulate the multi-task learning problem and discuss methods for optimizing the joint target function. We applied MT-SDREM to reconstruct dynamic human response networks for three flu strains: H1N1, H5N1 and H3N2. Our multi-task learning method was able to identify known and novel factors and genes, improving upon prior methods that model each condition independently. The MT-SDREM networks were also better at identifying proteins whose removal affects viral load, indicating that joint learning can still lead to accurate, condition-specific networks. Supporting website with MT-SDREM implementation: http://sb.cs.cmu.edu/mtsdrem
To understand why some flu strains are more virulent than others, researchers attempt to profile and model the molecular human response to these strains and identify similarities and differences between the resulting models. So far, the modeling and analysis part has been done independently for each strain and the results contrasted in a post-processing step. Here we present a new method, termed MT-SDREM, that simultaneously models the response to all strains, allowing us to identify both the core response elements that are shared among the strains and factors that are uniquely activated or repressed by individual strains.
We applied this method to study the human response to three flu strains: H1N1, H3N2 and H5N1. As we show, the method was able to correctly identify several common and known factors regulating immune response to such strains and also identified unique factors for each of the strains. The models reconstructed by the simultaneous analysis method improved upon those generated by methods that model each strain response separately. Our joint models can be used to identify strain specific treatments as well as treatments that are likely to be effective against all three strains.
Affiliation(s)
- Siddhartha Jain
  - Computer Science Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Anthony Gitter
  - Microsoft Research, Cambridge, Massachusetts, United States of America
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Ziv Bar-Joseph
  - Lane Center for Computational Biology and Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
|