1
Carvalho Silva R, Martini P, Hohoff C, Mattevi S, Bortolomasi M, Menesello V, Gennarelli M, Baune BT, Minelli A. DNA methylation changes in association with trauma-focused psychotherapy efficacy in treatment-resistant depression patients: a prospective longitudinal study. Eur J Psychotraumatol 2024; 15:2314913. [PMID: 38362742 PMCID: PMC10878335 DOI: 10.1080/20008066.2024.2314913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 01/30/2024] [Indexed: 02/17/2024] Open
Abstract
Background: Stressful events increase the risk for treatment-resistant depression (TRD), and trauma-focused psychotherapy can be useful for TRD patients exposed to early life stress (ELS). Epigenetic processes are known to be related to depression and ELS, but there is no evidence of the effects of trauma-focused psychotherapy on methylation alterations. Objective: We performed the first epigenome-wide association study to investigate methylation changes related to the effects of trauma-focused psychotherapies in TRD patients. Method: Thirty TRD patients assessed for ELS underwent trauma-focused psychotherapy; of those, 12 received trauma-focused cognitive behavioural therapy and 18 received Eye Movement Desensitization and Reprocessing (EMDR). DNA methylation was profiled with the Illumina Infinium EPIC array at baseline (T0), after 8 weeks (T8, end of psychotherapy), and after 12 weeks (T12, follow-up). We examined differentially methylated CpG sites and regions and performed pathway analyses in association with the treatment. Results: We identified 110 differentially methylated regions (DMRs) with a significant adjusted p-value area associated with the effects of trauma-focused psychotherapies in the entire cohort. Several annotated genes are related to inflammatory processes and psychiatric disorders, such as LTA, GFI1, ARID5B, TNFSF13, and LST1. Gene enrichment analyses revealed statistically significant processes related to the tumour necrosis factor (TNF) receptor and the TNF signalling pathway. Analyses stratified by type of trauma-focused psychotherapy showed a statistically significant adjusted p-value area in 141 DMRs only for the group of patients receiving EMDR, with annotated genes related to inflammation and psychiatric disorders, including LTA, GFI1, and S100A8. Gene set enrichment analyses in the EMDR group indicated biological processes related to inflammatory response, particularly the TNF signalling pathway. Conclusion: We provide valuable preliminary insights into global DNA methylation changes associated with the effects of trauma-focused psychotherapies, in particular EMDR treatment.
Affiliation(s)
- Rosana Carvalho Silva
  - Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
- Paolo Martini
  - Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
- Christa Hohoff
  - Department of Psychiatry and Psychotherapy, University of Münster, Münster, Germany
- Stefania Mattevi
  - Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
- Valentina Menesello
  - Genetics Unit, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy
- Massimo Gennarelli
  - Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
  - Genetics Unit, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy
- Bernhard T. Baune
  - Department of Psychiatry and Psychotherapy, University of Münster, Münster, Germany
  - Department of Psychiatry, Melbourne Medical School, University of Melbourne, Melbourne, Australia
  - The Florey Institute of Neuroscience and Mental Health, The University of Melbourne, Parkville, Australia
- Alessandra Minelli
  - Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
  - Genetics Unit, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy
2
Castaneda EU, Baker EJ. KNeXT: a NetworkX-based topologically relevant KEGG parser. Front Genet 2024; 15:1292394. [PMID: 38415058 PMCID: PMC10896898 DOI: 10.3389/fgene.2024.1292394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 01/25/2024] [Indexed: 02/29/2024] Open
Abstract
Automating the recreation of gene and mixed gene-compound networks from Kyoto Encyclopedia of Genes and Genomes (KEGG) Markup Language (KGML) files is challenging because the data structure does not preserve the independent or loosely connected neighborhoods from which the networks were originally derived, referred to here as the topological environment. Identical accession numbers may overlap, causing neighborhoods to artificially collapse based on duplicated identifiers. This causes current parsers to create misleading or erroneous graphical representations when mixed gene networks are converted to gene-only networks. To overcome these challenges, we created a Python-based KEGG NetworkX Topological (KNeXT) parser that allows users to accurately recapitulate genetic networks and mixed networks from KGML map data. The software, archived as a Python Package Index (PyPI) package to ensure broad application, is designed to ingest KGML files through built-in APIs and dynamically create high-fidelity topological representations. The use of NetworkX's framework to generate tab-separated files additionally ensures that KNeXT results may be imported into other graph frameworks and maintain programmatic access to the original x-y axis positions of each node in the KEGG pathway. KNeXT is a well-described Python 3 package that allows users to rapidly download and aggregate specific KGML files and recreate KEGG pathways based on a range of user-defined settings. KNeXT is platform-independent and distinctive, and it is not written on top of other Python parsers. Furthermore, KNeXT enables users to parse entire local folders or single files through command line scripts and convert the output into NCBI or UniProt IDs. KNeXT enables researchers to generate pathway visualizations while preserving the original context of a KEGG pathway. Source code is freely available at https://github.com/everest-castaneda/knext.
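The collapse caused by duplicated accession numbers can be illustrated with a minimal sketch. This is not KNeXT's API or implementation: the toy KGML fragment, the function names, and the `accession-entryid` node label are all assumptions made for illustration, and only the Python standard library is used. The sketch builds the same relations twice, once with nodes keyed by accession and once with nodes keyed by KGML entry, and counts the connected neighborhoods in each.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy KGML: accession hsa:200 appears as two separate entries
# (ids 2 and 3) drawn in two different places on the pathway map.
KGML = """<pathway name="path:hsa00000" org="hsa" number="00000" title="toy pathway">
  <entry id="1" name="hsa:100" type="gene"><graphics name="A" x="100" y="50"/></entry>
  <entry id="2" name="hsa:200" type="gene"><graphics name="B" x="200" y="50"/></entry>
  <entry id="3" name="hsa:200" type="gene"><graphics name="B" x="200" y="300"/></entry>
  <entry id="4" name="hsa:300" type="gene"><graphics name="C" x="300" y="300"/></entry>
  <relation entry1="1" entry2="2" type="PPrel"/>
  <relation entry1="3" entry2="4" type="PPrel"/>
</pathway>"""


def parse_kgml(text):
    """Return ({entry_id: {"name", "pos"}}, [(entry1, entry2), ...])."""
    root = ET.fromstring(text)
    entries = {}
    for entry in root.findall("entry"):
        graphics = entry.find("graphics")
        pos = None
        if graphics is not None:
            pos = (int(graphics.get("x")), int(graphics.get("y")))
        entries[entry.get("id")] = {"name": entry.get("name"), "pos": pos}
    relations = [(r.get("entry1"), r.get("entry2")) for r in root.findall("relation")]
    return entries, relations


def edges_by_accession(entries, relations):
    """Nodes keyed by accession only: duplicated accessions merge."""
    return {(entries[a]["name"], entries[b]["name"]) for a, b in relations}


def edges_by_entry(entries, relations):
    """Nodes keyed by accession plus entry id: each map occurrence stays distinct."""
    return {
        (f'{entries[a]["name"]}-{a}', f'{entries[b]["name"]}-{b}')
        for a, b in relations
    }


def components(edges):
    """Connected components of the undirected graph defined by the edge set."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    seen, found = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        found.append(component)
    return found


entries, relations = parse_kgml(KGML)
print(len(components(edges_by_accession(entries, relations))))
print(len(components(edges_by_entry(entries, relations))))
```

Keying nodes by accession alone joins the two neighborhoods through the shared identifier, while keying by entry keeps them separate and retains each occurrence's x-y position from the `graphics` element.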
Affiliation(s)
- Everest Uriel Castaneda
  - Department of Biology, Baylor University, Waco, TX, United States
  - School of Engineering and Computer Science, Baylor University, Waco, TX, United States
- Erich J Baker
  - Department of Mathematics and Computer Science, Belmont University, Nashville, TN, United States
3
Groppa E, Tung LW, Mattevi S, Ritso M, Rossi FMV, Martini P. Protocol for generation of a time-resolved cellular interactome during tissue remodeling in adult mice. STAR Protoc 2023; 4:102638. [PMID: 37831606 PMCID: PMC10583169 DOI: 10.1016/j.xpro.2023.102638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Revised: 08/25/2023] [Accepted: 09/22/2023] [Indexed: 10/15/2023] Open
Abstract
Efficient skeletal muscle regeneration necessitates fine-tuned coordination among multiple cell types through an intricate network of intercellular communication. We present a protocol for generation of a time-resolved cellular interactome during tissue remodeling. We describe steps for isolating distinct cell populations from skeletal muscle of adult mice after acute damage and extracting RNA from purified cells prior to the generation of RNA sequencing data. We then detail procedures for generating and deciphering a time- and lineage-resolved model of intercellular crosstalk. For complete details on the use and execution of this protocol, please refer to Groppa et al. (2023).
Affiliation(s)
- Elena Groppa
  - Borea Therapeutics, Scuola Internazionale Superiore di Studi Avanzati, 34136 Trieste, Italy
- Lin Wei Tung
  - School of Biomedical Engineering, University of British Columbia, Vancouver, BC V6T 2B9, Canada
- Stefania Mattevi
  - Department of Molecular and Translational Medicine, University of Brescia, 25121 Brescia, Italy
- Morten Ritso
  - School of Biomedical Engineering, University of British Columbia, Vancouver, BC V6T 2B9, Canada
- Fabio M V Rossi
  - School of Biomedical Engineering, University of British Columbia, Vancouver, BC V6T 2B9, Canada
- Paolo Martini
  - Department of Molecular and Translational Medicine, University of Brescia, 25121 Brescia, Italy
4
Xu B, Hwangbo DS, Saurabh S, Rosensweig C, Allada R, Kath WL, Braun R. Temperature-driven coordination of circadian transcriptome regulation. bioRxiv 2023:2023.10.27.563979. [PMID: 37961403 PMCID: PMC10634908 DOI: 10.1101/2023.10.27.563979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
The circadian rhythm is an evolutionarily conserved molecular oscillator that enables species to anticipate rhythmic changes in their environment. At a molecular level, the core clock genes induce a circadian oscillation in thousands of genes in a tissue-specific manner, orchestrating myriad biological processes. While studies have investigated how the core clock circuit responds to environmental perturbations such as temperature, the downstream effects of such perturbations on circadian regulation remain poorly understood. By analyzing bulk RNA sequencing data from Drosophila fat bodies harvested from flies subjected to different environmental conditions, we demonstrate a highly condition-specific circadian transcriptome. Further employing a reference-based gene regulatory network (Reactome), we find evidence of increased gene-gene coordination at low temperatures and synchronization of rhythmic genes that are network neighbors. Our results point to the mechanisms by which the circadian clock mediates the fly's response to seasonal changes in temperature.
Affiliation(s)
- Bingxian Xu
  - Department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA
  - NSF-Simons Center for Quantitative Biology, Northwestern University, Evanston, IL 60208, USA
- Dae-Sung Hwangbo
  - Department of Biology, University of Louisville, Louisville, KY 40292, USA
  - Department of Neurobiology, Northwestern University, Evanston, IL 60208, USA
- Sumit Saurabh
  - Department of Biology, Loyola University, Chicago, IL 60660, USA
- Clark Rosensweig
  - NSF-Simons Center for Quantitative Biology, Northwestern University, Evanston, IL 60208, USA
  - Department of Neurobiology, Northwestern University, Evanston, IL 60208, USA
- Ravi Allada
  - NSF-Simons Center for Quantitative Biology, Northwestern University, Evanston, IL 60208, USA
  - Department of Neurobiology, Northwestern University, Evanston, IL 60208, USA
  - Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI 48109, USA
  - Department of Anesthesiology, University of Michigan, Ann Arbor, MI 48109, USA
- William L Kath
  - Department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA
  - NSF-Simons Center for Quantitative Biology, Northwestern University, Evanston, IL 60208, USA
  - Department of Neurobiology, Northwestern University, Evanston, IL 60208, USA
  - Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL 60208, USA
- Rosemary Braun
  - Department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA
  - NSF-Simons Center for Quantitative Biology, Northwestern University, Evanston, IL 60208, USA
  - Department of Engineering Sciences and Applied Mathematics, Northwestern University, Evanston, IL 60208, USA
  - Department of Physics and Astronomy, Northwestern University, Evanston, IL 60208, USA
  - Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL 60208, USA
  - Santa Fe Institute, Santa Fe, NM 87501, USA
5
Martini P, Mingardi J, Carini G, Mattevi S, Ndoj E, La Via L, Magri C, Gennarelli M, Russo I, Popoli M, Musazzi L, Barbon A. Transcriptional Profiling of Rat Prefrontal Cortex after Acute Inescapable Footshock Stress. Genes (Basel) 2023; 14:genes14030740. [PMID: 36981011 PMCID: PMC10048409 DOI: 10.3390/genes14030740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Revised: 01/31/2023] [Accepted: 03/16/2023] [Indexed: 03/30/2023] Open
Abstract
Stress is a primary risk factor for psychiatric disorders such as Major Depressive Disorder (MDD) and Post Traumatic Stress Disorder (PTSD). The response to stress involves the regulation of transcriptional programs, which is thought to play a role in coping with stress. To evaluate transcriptional processes triggered by exposure to unavoidable traumatic stress, we applied microarray expression analysis to the prefrontal cortex (PFC) of rats exposed to acute footshock (FS) stress that were sacrificed immediately after the 40 min session, or 2 h or 24 h later. While no substantial changes were observed at the single gene level immediately after the stress session, gene set enrichment analysis showed alterations in neuronal pathways associated with glia development, glia-neuron networking, and synaptic function. Furthermore, we found alterations in the expression of gene sets regulated by specific transcription factors that could represent master regulators of the acute stress response. Of note, these pathways and transcriptional programs are activated during the early stress response (immediately after FS) and are already turned off after 2 h, while at 24 h the transcriptional profile is largely unaffected. Overall, our analysis provided a transcriptional landscape of the early changes triggered by acute unavoidable FS stress in the PFC of rats, suggesting that the transcriptional wave is fast and mild, but probably enough to activate a cellular response to acute stress.
Affiliation(s)
- Paolo Martini
  - Department of Molecular and Translational Medicine, University of Brescia, 25123 Brescia, Italy
- Jessica Mingardi
  - Department of Medicine and Surgery, University of Milano-Bicocca, 20900 Monza, Italy
- Giulia Carini
  - Department of Molecular and Translational Medicine, University of Brescia, 25123 Brescia, Italy
  - Genetics Unit, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, 25123 Brescia, Italy
- Stefania Mattevi
  - Department of Molecular and Translational Medicine, University of Brescia, 25123 Brescia, Italy
- Elona Ndoj
  - Department of Molecular and Translational Medicine, University of Brescia, 25123 Brescia, Italy
- Luca La Via
  - Department of Molecular and Translational Medicine, University of Brescia, 25123 Brescia, Italy
- Chiara Magri
  - Department of Molecular and Translational Medicine, University of Brescia, 25123 Brescia, Italy
- Massimo Gennarelli
  - Department of Molecular and Translational Medicine, University of Brescia, 25123 Brescia, Italy
  - Genetics Unit, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, 25123 Brescia, Italy
- Isabella Russo
  - Department of Molecular and Translational Medicine, University of Brescia, 25123 Brescia, Italy
  - Genetics Unit, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, 25123 Brescia, Italy
- Maurizio Popoli
  - Department of Pharmaceutical Sciences, University of Milan, 20133 Milan, Italy
- Laura Musazzi
  - Department of Medicine and Surgery, University of Milano-Bicocca, 20900 Monza, Italy
- Alessandro Barbon
  - Department of Molecular and Translational Medicine, University of Brescia, 25123 Brescia, Italy
6
Groppa E, Martini P, Derakhshan N, Theret M, Ritso M, Tung LW, Wang YX, Soliman H, Hamer MS, Stankiewicz L, Eisner C, Erwan LN, Chang C, Yi L, Yuan JH, Kong S, Weng C, Adams J, Chang L, Peng A, Blau HM, Romualdi C, Rossi FMV. Spatial compartmentalization of signaling imparts source-specific functions on secreted factors. Cell Rep 2023; 42:112051. [PMID: 36729831 DOI: 10.1016/j.celrep.2023.112051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2020] [Revised: 09/08/2022] [Accepted: 01/16/2023] [Indexed: 02/03/2023] Open
Abstract
Efficient regeneration requires multiple cell types acting in coordination. To better understand the intercellular networks involved and how they change when regeneration fails, we profile the transcriptome of hematopoietic, stromal, myogenic, and endothelial cells over 14 days following acute muscle damage. We generate a time-resolved computational model of interactions and identify VEGFA-driven endothelial engagement as a key differentiating feature in models of successful and failed regeneration. In addition, the analysis highlights that the majority of secreted signals, including VEGFA, are simultaneously produced by multiple cell types. To test whether the cellular source of a factor determines its function, we delete VEGFA from two cell types residing in close proximity: stromal and myogenic progenitors. By comparing responses to different types of damage, we find that myogenic and stromal VEGFA have distinct functions in regeneration. This suggests that spatial compartmentalization of signaling plays a key role in intercellular communication networks.
Affiliation(s)
- Elena Groppa
  - School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC, Canada; Borea Therapeutics, Scuola Internazionale Superiore di Studi Avanzati, Via Bonomea 265, Trieste, Italy
- Paolo Martini
  - Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy; Department of Biology, University of Padova, via U. Bassi 58B, Padova, Italy
- Nima Derakhshan
  - School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC, Canada
- Marine Theret
  - School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC, Canada
- Morten Ritso
  - School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC, Canada
- Lin Wei Tung
  - School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC, Canada
- Yu Xin Wang
  - Baxter Laboratory for Stem Cell Biology, Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA, USA
- Hesham Soliman
  - School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC, Canada; Faculty of Pharmaceutical Sciences, Minia University, Minia, Egypt; Aspect Biosystems, 1781 W 75th Avenue, Vancouver, BC, Canada
- Mark Stephen Hamer
  - School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC, Canada
- Laura Stankiewicz
  - School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC, Canada
- Christine Eisner
  - School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC, Canada
- Le Nevé Erwan
  - Department of Pediatrics, Université Laval, Laval, QC, Canada
- Chihkai Chang
  - School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC, Canada
- Lin Yi
  - School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC, Canada
- Jack H Yuan
  - School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC, Canada
- Sunny Kong
  - School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC, Canada
- Curtis Weng
  - School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC, Canada
- Josephine Adams
  - School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC, Canada
- Lucas Chang
  - School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC, Canada
- Anne Peng
  - School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC, Canada
- Helen M Blau
  - Baxter Laboratory for Stem Cell Biology, Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA, USA
- Chiara Romualdi
  - Department of Biology, University of Padova, via U. Bassi 58B, Padova, Italy
- Fabio M V Rossi
  - School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC, Canada
7
Aslanis I, Krokidis MG, Dimitrakopoulos GN, Vrahatis AG. Identifying Network Biomarkers for Alzheimer's Disease Using Single-Cell RNA Sequencing Data. Adv Exp Med Biol 2023; 1423:207-214. [PMID: 37525046 DOI: 10.1007/978-3-031-31978-5_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/02/2023]
Abstract
System-level network-based approaches are an emerging field in the biomedical domain since biological networks can be used to analyze complicated biological processes and complex human disorders more efficiently. Network biomarkers are groups of interconnected molecular components causing perturbations in the entire network topology that can be used as indicators of pathogenic biological processes when studying a given disease. Although computational systems-based approaches have gained ground in recent years on the path to discovering new network biomarkers, this approach still has much to offer in complex diseases like Alzheimer's disease (AD). In particular, single-cell RNA sequencing (scRNA-seq) has now become the dominant technology for the study of stochastic gene expression. To this end, we propose an R workflow that extracts disease-perturbed subpathways within a pathway network. We construct a gene-gene interaction network integrated with scRNA-seq expression profiles, and after network processing and pruning, the most active subnetworks are isolated from the entire network topology. The proposed methodology was applied to real AD scRNA-seq data, yielding both previously reported and new potential AD biomarkers in a gene network context.
Affiliation(s)
- Ioannis Aslanis
  - Bioinformatics and Human Electrophysiology Laboratory, Department of Informatics, Ionian University, Corfu, Greece
- Marios G Krokidis
  - Bioinformatics and Human Electrophysiology Laboratory, Department of Informatics, Ionian University, Corfu, Greece
- Georgios N Dimitrakopoulos
  - Bioinformatics and Human Electrophysiology Laboratory, Department of Informatics, Ionian University, Corfu, Greece
- Aristidis G Vrahatis
  - Bioinformatics and Human Electrophysiology Laboratory, Department of Informatics, Ionian University, Corfu, Greece
8
Wong Fok Lung T, Charytonowicz D, Beaumont KG, Shah SS, Sridhar SH, Gorrie CL, Mu A, Hofstaedter CE, Varisco D, McConville TH, Drikic M, Fowler B, Urso A, Shi W, Fucich D, Annavajhala MK, Khan IN, Oussenko I, Francoeur N, Smith ML, Stockwell BR, Lewis IA, Hachani A, Upadhyay Baskota S, Uhlemann AC, Ahn D, Ernst RK, Howden BP, Sebra R, Prince A. Klebsiella pneumoniae induces host metabolic stress that promotes tolerance to pulmonary infection. Cell Metab 2022; 34:761-774.e9. [PMID: 35413274 PMCID: PMC9081115 DOI: 10.1016/j.cmet.2022.03.009] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 10/29/2021] [Revised: 01/18/2022] [Accepted: 03/22/2022] [Indexed: 12/21/2022]
Abstract
K. pneumoniae sequence type 258 (Kp ST258) is a major cause of healthcare-associated pneumonia. However, it remains unclear how it causes protracted courses of infection in spite of its expression of immunostimulatory lipopolysaccharide, which should activate a brisk inflammatory response and bacterial clearance. We predicted that the metabolic stress induced by the bacteria in the host cells shapes an immune response that tolerates infection. We combined in situ metabolic imaging and transcriptional analyses to demonstrate that Kp ST258 activates host glutaminolysis and fatty acid oxidation. This response creates an oxidant-rich microenvironment conducive to the accumulation of anti-inflammatory myeloid cells. In this setting, metabolically active Kp ST258 elicits a disease-tolerant immune response. The bacteria, in turn, adapt to airway oxidants by upregulating the type VI secretion system, which is highly conserved across ST258 strains worldwide. Thus, much of the global success of Kp ST258 in hospital settings can be explained by the metabolic activity provoked in the host that promotes disease tolerance.
Affiliation(s)
- Daniel Charytonowicz
  - Department of Genetics and Genomic Sciences, Mt. Sinai Icahn School of Medicine, New York, NY 10029, USA
- Kristin G Beaumont
  - Department of Genetics and Genomic Sciences, Mt. Sinai Icahn School of Medicine, New York, NY 10029, USA
- Shivang S Shah
  - Department of Pediatrics, Columbia University, New York, NY 10032, USA
- Shwetha H Sridhar
  - Department of Genetics and Genomic Sciences, Mt. Sinai Icahn School of Medicine, New York, NY 10029, USA
- Claire L Gorrie
  - Department of Microbiology and Immunology, The University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, VIC 3000, Australia
- Andre Mu
  - Department of Microbiology and Immunology, The University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, VIC 3000, Australia
- Casey E Hofstaedter
  - Department of Microbial Pathogenesis, University of Maryland, Baltimore, MD 21201, USA
- David Varisco
  - Department of Microbial Pathogenesis, University of Maryland, Baltimore, MD 21201, USA
- Marija Drikic
  - Department of Biological Sciences, University of Calgary, Calgary, T2N 1N4, Canada
- Brandon Fowler
  - Microbiome & Pathogen Genomics Collaborative Center, Columbia University, New York, NY 10032, USA
- Andreacarola Urso
  - Department of Pediatrics, Columbia University, New York, NY 10032, USA
- Wei Shi
  - Department of Pediatrics, Columbia University, New York, NY 10032, USA
- Dario Fucich
  - Department of Pediatrics, Columbia University, New York, NY 10032, USA
- Medini K Annavajhala
  - Department of Medicine, Columbia University, New York, NY 10032, USA; Microbiome & Pathogen Genomics Collaborative Center, Columbia University, New York, NY 10032, USA
- Ibrahim N Khan
  - Department of Pediatrics, Columbia University, New York, NY 10032, USA
- Irina Oussenko
  - Department of Genetics and Genomic Sciences, Mt. Sinai Icahn School of Medicine, New York, NY 10029, USA
- Nancy Francoeur
  - Department of Genetics and Genomic Sciences, Mt. Sinai Icahn School of Medicine, New York, NY 10029, USA
- Melissa L Smith
  - Department of Genetics and Genomic Sciences, Mt. Sinai Icahn School of Medicine, New York, NY 10029, USA
- Brent R Stockwell
  - Department of Chemistry, Columbia University, New York, NY 10027, USA; Department of Biological Sciences, Columbia University, New York, NY 10027, USA
- Ian A Lewis
  - Department of Biological Sciences, University of Calgary, Calgary, T2N 1N4, Canada
- Abderrahman Hachani
  - Department of Microbiology and Immunology, The University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, VIC 3000, Australia
- Anne-Catrin Uhlemann
  - Department of Medicine, Columbia University, New York, NY 10032, USA; Microbiome & Pathogen Genomics Collaborative Center, Columbia University, New York, NY 10032, USA
- Danielle Ahn
  - Department of Pediatrics, Columbia University, New York, NY 10032, USA
- Robert K Ernst
  - Department of Microbial Pathogenesis, University of Maryland, Baltimore, MD 21201, USA
- Benjamin P Howden
  - Department of Microbiology and Immunology, The University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, VIC 3000, Australia; Microbiological Diagnostic Unit Public Health Laboratory, The University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, VIC 3000, Australia
- Robert Sebra
  - Department of Genetics and Genomic Sciences, Mt. Sinai Icahn School of Medicine, New York, NY 10029, USA; Sema4: A Mount Sinai Venture, Stamford, CT 06902, USA
- Alice Prince
  - Department of Pediatrics, Columbia University, New York, NY 10032, USA
9
Corso D, Chemello F, Alessio E, Urso I, Ferrarese G, Bazzega M, Romualdi C, Lanfranchi G, Sales G, Cagnin S. MyoData: An expression knowledgebase at single cell/nucleus level for the discovery of coding-noncoding RNA functional interactions in skeletal muscle. Comput Struct Biotechnol J 2021; 19:4142-4155. [PMID: 34527188 PMCID: PMC8342900 DOI: 10.1016/j.csbj.2021.07.020] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/15/2021] [Revised: 07/19/2021] [Accepted: 07/19/2021] [Indexed: 12/22/2022] Open
Abstract
Regulation of gene expression through non-coding RNAs at single myofiber and nucleus resolution. Reinterpretation of KEGG pathways with microRNA and long non-coding RNA activities. miR-149, -214, and let-7e alter mitochondrial shape. The long non-coding RNA Pvt1 is a sponge for miR-27a. miR-208b regulates Sox6; miR-214 regulates both Sox6 and Slc16a3.
Non-coding RNAs represent the largest part of transcribed mammalian genomes and prevalently exert regulatory functions. Long non-coding RNAs (lncRNAs) and microRNAs (miRNAs) can modulate the activity of each other. Skeletal muscle is the most abundant tissue in mammals. It is composed of different cell types with myofibers that represent the smallest complete contractile system. Considering that lncRNAs and miRNAs are more cell type-specific than coding RNAs, to understand their function it is imperative to evaluate their expression and action within single myofibers. In this database, we collected gene expression data for coding and non-coding genes in single myofibers and used them to produce interaction networks based on expression correlations. Since biological pathways are more informative than networks based on gene expression correlation for understanding how altered genes participate in the studied phenotype, we integrated KEGG pathways with miRNAs and lncRNAs. The database also integrates single nucleus gene expression data on skeletal muscle in different patho-physiological conditions. We demonstrated that these networks can serve as a framework from which to dissect new miRNA and lncRNA functions for experimental validation. Some interactions included in the database have been previously experimentally validated using high-throughput methods. These can be the basis for further functional studies. Using database information, we demonstrate the involvement of miR-149, -214 and let-7e in mitochondria shaping; the ability of the lncRNA Pvt1 to mitigate the action of miR-27a via sponging; and the regulatory activity of miR-214 on Sox6 and Slc16a3. MyoData is available at https://myodata.bio.unipd.it.
Affiliation(s)
- Davide Corso
- Department of Biology, University of Padova, Via Ugo Bassi 58/b, 35131 Padova, Italy
| | - Francesco Chemello
- Department of Biology, University of Padova, Via Ugo Bassi 58/b, 35131 Padova, Italy
| | - Enrico Alessio
- Department of Biology, University of Padova, Via Ugo Bassi 58/b, 35131 Padova, Italy
| | - Ilenia Urso
- Department of Biology, University of Padova, Via Ugo Bassi 58/b, 35131 Padova, Italy
| | - Giulia Ferrarese
- Department of Biology, University of Padova, Via Ugo Bassi 58/b, 35131 Padova, Italy
| | - Martina Bazzega
- Department of Biology, University of Padova, Via Ugo Bassi 58/b, 35131 Padova, Italy
| | - Chiara Romualdi
- Department of Biology, University of Padova, Via Ugo Bassi 58/b, 35131 Padova, Italy
| | - Gerolamo Lanfranchi
- Department of Biology, University of Padova, Via Ugo Bassi 58/b, 35131 Padova, Italy.,CRIBI Biotechnology Centre, University of Padova, Via Ugo Bassi 58/b, 35131 Padova, Italy.,CIR-Myo Myology Center, University of Padova, Via Ugo Bassi 58/b, 35131 Padova, Italy
| | - Gabriele Sales
- Department of Biology, University of Padova, Via Ugo Bassi 58/b, 35131 Padova, Italy
| | - Stefano Cagnin
- Department of Biology, University of Padova, Via Ugo Bassi 58/b, 35131 Padova, Italy.,CRIBI Biotechnology Centre, University of Padova, Via Ugo Bassi 58/b, 35131 Padova, Italy.,CIR-Myo Myology Center, University of Padova, Via Ugo Bassi 58/b, 35131 Padova, Italy
| |
|
10
|
Wang L, Xie W, Li K, Wang Z, Li X, Feng W, Li J. DysPIA: A Novel Dysregulated Pathway Identification Analysis Method. Front Genet 2021; 12:647653. [PMID: 34290733 PMCID: PMC8287415 DOI: 10.3389/fgene.2021.647653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2021] [Accepted: 04/20/2021] [Indexed: 11/13/2022] Open
Abstract
Differential co-expression-based pathway analysis is still limited and not widely used. Most current methods treat pathways as gene sets without considering gene regulation relationships, and they are computationally slow. In this article, we propose a novel Dysregulated Pathway Identification Analysis (DysPIA) method to overcome these shortcomings. We adopted the idea of Correlation by Individual Level Product in the analysis and performed a fast enrichment analysis. We constructed a combined gene-pair background that is more comprehensive than the background used in Edge Set Enrichment Analysis. In a simulation study, DysPIA identified the causal pathways with high AUC (0.9584 to 0.9896). On p53 mutation data, DysPIA performed better than other methods: it identified more potentially dysregulated pathways that could be verified in the literature, and it ran much faster (∼1,700-8,000 times faster than other methods with 10,000 permutations). DysPIA was also applied to a breast cancer relapse dataset and a breast cancer subtype dataset. The results show that DysPIA is effective and yields biologically meaningful results. The R packages "DysPIA" and "DysPIAData" are freely available on CRAN (https://cran.r-project.org/web/packages/DysPIA/index.html and https://cran.r-project.org/web/packages/DysPIAData/index.html) and on GitHub (https://github.com/lemonwang2020).
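The idea of scoring a gene pair by individual-level products can be illustrated with a minimal Python sketch. It is not the DysPIA implementation: the function names, the within-group standardization, and the use of a Welch t statistic to compare groups are assumptions made for illustration only.

```python
import numpy as np


def cilp_products(x, y, groups):
    """Per-individual products of two genes after standardizing within each group.

    With population standard deviations, the mean of the products inside a
    group equals that group's Pearson correlation between x and y.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    products = np.empty_like(x)
    for g in np.unique(groups):
        mask = groups == g
        zx = (x[mask] - x[mask].mean()) / x[mask].std()
        zy = (y[mask] - y[mask].mean()) / y[mask].std()
        products[mask] = zx * zy
    return products


def welch_t(a, b):
    """Welch t statistic for the difference in means of two samples."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    se = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
    return (a.mean() - b.mean()) / se


def differential_coexpression_score(x, y, groups, case, control):
    """Hypothetical gene-pair score: compare product values between two groups."""
    groups = np.asarray(groups)
    products = cilp_products(x, y, groups)
    return welch_t(products[groups == case], products[groups == control])
```

A large absolute score suggests that the correlation of the gene pair differs between the two groups; a pathway-level method would then aggregate such pair scores, which this sketch does not attempt.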
Affiliation(s)
- Limei Wang
- College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, China.,Key Laboratory of Tropical Translational Medicine, Ministry of Education, College of Biomedical Information and Engineering, Hainan Medical University, Haikou, China.,College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Weixin Xie
- College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, China
| | - Kongning Li
- Key Laboratory of Tropical Translational Medicine, Ministry of Education, College of Biomedical Information and Engineering, Hainan Medical University, Haikou, China
| | - Zhenzhen Wang
- Key Laboratory of Tropical Translational Medicine, Ministry of Education, College of Biomedical Information and Engineering, Hainan Medical University, Haikou, China
| | - Xia Li
- Key Laboratory of Tropical Translational Medicine, Ministry of Education, College of Biomedical Information and Engineering, Hainan Medical University, Haikou, China.,College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Weixing Feng
- College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, China
| | - Jin Li
- Key Laboratory of Tropical Translational Medicine, Ministry of Education, College of Biomedical Information and Engineering, Hainan Medical University, Haikou, China.,College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| |
|
11
|
Huang X, Wang Z, Su B, He X, Liu B, Kang B. A computational strategy for metabolic network construction based on the overlapping ratio: Study of patients' metabolic responses to different dialysis patterns. Comput Biol Chem 2021; 93:107539. [PMID: 34246891 DOI: 10.1016/j.compbiolchem.2021.107539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2021] [Revised: 06/25/2021] [Accepted: 07/01/2021] [Indexed: 11/16/2022]
Abstract
BACKGROUND Uremia is a worldwide epidemic disease and poses a serious threat to human health. Maintenance hemodialysis (HD) and maintenance high flux hemodialysis (HFD) are both common clinical treatments for uremia. In-depth exploration of patients' metabolic responses to different dialysis patterns can facilitate the understanding of pathological alterations associated with uremia and of the effects of different dialysis methods, which may be used for future personalized therapy. However, because multiple factors (i.e., genetic, epigenetic and environmental) vary during disease treatment, identifying the similarities and differences in plasma metabolite changes in uremic patients in response to HD and HFD remains challenging. METHODS In this study, a computational strategy for metabolic network construction based on the overlapping ratio (MNC-OR) was proposed for research on disease treatment effects. In MNC-OR, the overlapping ratio was introduced to measure metabolic reactions and to construct metabolic networks for the analysis of different treatment options. MNC-OR was then employed to analyze HD-pattern-dependent changes in plasma metabolites in order to explore the pathological alterations associated with uremia and the effectiveness of the two dialysis patterns (HD and HFD). Based on the networks constructed by MNC-OR, two network analysis techniques, namely similarity analysis and difference analysis of network topology, were used to find the similarities and differences in metabolic signals in patients treated with either HD or HFD, which can facilitate the understanding of pathological alterations associated with uremia and provide guidance for personalized dialysis therapy.
RESULTS Similarity analysis of network topology suggested that abnormal energy metabolism, gut metabolism and pyrimidine metabolism might occur in uremic patients, and that both maintenance HFD and maintenance HD have beneficial effects on uremia. Difference analysis of network topology was then employed to extract the crucial information related to HD-pattern-dependent changes in plasma metabolites. Experimental results indicated that amino acid metabolism was closer to the normal status in HFD-treated patients, whereas in HD-treated patients antioxidant capacity showed a greater reduction and the protein O-GlcNAcylation level was higher. Our findings demonstrate the potential of MNC-OR for explaining the metabolic similarities and differences of patients in response to different dialysis methods, thereby contributing to the guidance of personalized dialysis therapy.
Affiliation(s)
- Xin Huang
- School of Mathematics and Information Science, Anshan Normal University, Anshan, Liaoning, China.
| | - Zeyu Wang
- School of Mathematics and Information Science, Anshan Normal University, Anshan, Liaoning, China
| | - Benzhe Su
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning, China
| | - Xinyu He
- School of Computer and Information Technology, Liaoning Normal University, Dalian, Liaoning, China
| | - Bing Liu
- School of Mathematics and Information Science, Anshan Normal University, Anshan, Liaoning, China
| | - Baolin Kang
- School of Mathematics and Information Science, Anshan Normal University, Anshan, Liaoning, China
| |
|
12
|
Differential metabolic network construction for personalized medicine: Study of type 2 diabetes mellitus patients' response to gliclazide-modified-release-treated. J Biomed Inform 2021; 118:103796. [PMID: 33932596 DOI: 10.1016/j.jbi.2021.103796] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/25/2020] [Revised: 02/26/2021] [Accepted: 04/26/2021] [Indexed: 11/21/2022]
Abstract
Individual variation in genetic and environmental factors can cause differences in metabolic phenotypes, which may affect patients' drug responses. In-depth exploration of patients' responses to therapeutic agents is a crucial and urgent task in personalized treatment research. Using machine learning methods to discover biomarkers for suitability evaluation can provide deep insight into the mechanism of disease therapy and facilitate the development of personalized medicine. To find important metabolic network signals for the prediction of patients' drug responses, a novel method referred to as differential metabolic network construction (DMNC) was proposed. In DMNC, concentration changes in metabolite ratios between different pathological states are measured to construct differential metabolic networks, which can be used to advance clinical decision-making. In this study, DMNC was applied to characterize the responses of type 2 diabetes mellitus (T2DM) patients to gliclazide modified-release (MR) therapy. Two T2DM metabolomics datasets from different batches of subjects treated with gliclazide MR were analyzed in depth. A network biomarker was defined to assess the patients' suitability for gliclazide MR. It was effective in distinguishing significant responders from nonsignificant responders, achieving area under the curve values of 0.893 and 1.000 for the discovery and validation sets, respectively. Compared with the metabolites selected by the other methods, the network biomarker selected by DMNC reflected the metabolic responses of patients to gliclazide MR therapy more stably and precisely, thereby contributing to the personalized medicine of T2DM patients. The better performance of DMNC validated its potential for identifying network biomarkers that characterize responses to therapeutic treatments and provide valuable information for personalized medicine.
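As a rough illustration of building a network from changes in metabolite ratios between two states, the following Python sketch may help. The abstract does not say how DMNC quantifies ratio changes, so the mean log2 ratio difference, the threshold parameter, and the function name are all assumptions rather than the published method.

```python
import itertools

import numpy as np


def differential_ratio_network(state_a, state_b, names, threshold=1.0):
    """Return edges between metabolites whose ratio shifts between two states.

    state_a, state_b: arrays of shape (samples, metabolites) holding positive
    concentrations. An edge (m1, m2, delta) is kept when the mean log2 ratio
    m1/m2 changes by at least `threshold` (an assumed criterion).
    """
    log_a = np.log2(np.asarray(state_a, dtype=float))
    log_b = np.log2(np.asarray(state_b, dtype=float))
    edges = []
    for i, j in itertools.combinations(range(len(names)), 2):
        # log2(m_i / m_j) is the difference of the log2 concentrations
        ratio_a = (log_a[:, i] - log_a[:, j]).mean()
        ratio_b = (log_b[:, i] - log_b[:, j]).mean()
        delta = ratio_b - ratio_a
        if abs(delta) >= threshold:
            edges.append((names[i], names[j], float(delta)))
    return edges
```

Working on log ratios keeps the measure symmetric: swapping numerator and denominator only flips the sign of the change.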
|
13
|
Li F, Michelson AP, Foraker R, Zhan M, Payne PRO. Computational analysis to repurpose drugs for COVID-19 based on transcriptional response of host cells to SARS-CoV-2. BMC Med Inform Decis Mak 2021; 21:15. [PMID: 33413329 PMCID: PMC7789899 DOI: 10.1186/s12911-020-01373-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 07/15/2020] [Accepted: 12/16/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The Coronavirus Disease 2019 (COVID-19) pandemic has infected over 10 million people globally with a relatively high mortality rate. Many therapeutics are undergoing clinical trials, but there is no effective vaccine or therapy thus far. After infection by the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), molecular signaling pathways of host cells play critical roles during the viral life cycle. Thus, it is important to identify the molecular signaling pathways involved within the host cells. Drugs targeting these pathways could be potentially effective for COVID-19 treatment. METHODS In this study, we developed a novel integrative analysis approach to identify the related molecular signaling pathways within host cells, and repurposed drugs as potentially effective treatments for COVID-19, based on the transcriptional response of host cells. RESULTS We identified activated signaling pathways associated with SARS-CoV-2 infection in human lung epithelial cells through integrative analysis. Then, the activated gene ontologies (GOs) and super GOs were identified. Signaling pathways and GOs such as MAPK, JNK, STAT, ERK, JAK-STAT, IRF7-NFkB signaling, and MYD88/CXCR6 immune signaling were particularly activated. Based on the identified signaling pathways and GOs, a set of potentially effective drugs was repurposed by integrating drug-target and reverse gene expression data resources. In addition to many drugs being evaluated in clinical trials, dexamethasone was top-ranked in the prediction; it was the first drug reported to significantly reduce the death rate of COVID-19 patients receiving respiratory support. CONCLUSIONS The integrative genomics data analysis and results can help to understand the associated molecular signaling pathways within host cells and facilitate the discovery of effective drugs for COVID-19 treatment.
Affiliation(s)
- Fuhai Li
- Institute for Informatics (I2), Washington University in St. Louis School of Medicine, St. Louis, MO, USA.
- Department of Pediatrics, Washington University in St. Louis School of Medicine, St. Louis, MO, USA.
| | - Andrew P Michelson
- Institute for Informatics (I2), Washington University in St. Louis School of Medicine, St. Louis, MO, USA
- Pulmonary and Critical Care Medicine, Washington University in St. Louis School of Medicine, St. Louis, MO, USA
| | - Randi Foraker
- Institute for Informatics (I2), Washington University in St. Louis School of Medicine, St. Louis, MO, USA
| | - Ming Zhan
- National Institute of Mental Health (NIMH), NIH, Bethesda, MD, USA
| | - Philip R O Payne
- Institute for Informatics (I2), Washington University in St. Louis School of Medicine, St. Louis, MO, USA
| |
|
14
|
A strategy to incorporate prior knowledge into correlation network cutoff selection. Nat Commun 2020; 11:5153. [PMID: 33056991 PMCID: PMC7560866 DOI: 10.1038/s41467-020-18675-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/19/2018] [Accepted: 08/27/2020] [Indexed: 12/16/2022] Open
Abstract
Correlation networks are frequently used to statistically extract biological interactions between omics markers. Network edge selection is typically based on the statistical significance of the correlation coefficients. This procedure, however, is not guaranteed to capture biological mechanisms. We here propose an alternative approach for network reconstruction: a cutoff selection algorithm that maximizes the overlap of the inferred network with available prior knowledge. We first evaluate the approach on IgG glycomics data, for which the biochemical pathway is known and well-characterized. Importantly, even in the case of incomplete or incorrect prior knowledge, the optimal network is close to the true optimum. We then demonstrate the generalizability of the approach with applications to untargeted metabolomics and transcriptomics data. For the transcriptomics case, we demonstrate that the optimized network is superior to statistical networks in systematically retrieving interactions that were not included in the biological reference used for optimization.
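The cutoff selection idea, choosing the correlation threshold whose network best overlaps with prior knowledge, can be sketched in Python as follows. The abstract does not state which overlap measure the authors optimize; the F1 score used here, the function name, and the representation of prior knowledge as a set of index pairs are assumptions for illustration.

```python
import numpy as np


def select_cutoff(corr, prior_edges, cutoffs):
    """Pick the absolute-correlation cutoff that best matches prior knowledge.

    corr: square symmetric correlation matrix.
    prior_edges: set of (i, j) index pairs regarded as known interactions.
    cutoffs: candidate thresholds to scan.
    Returns (best_cutoff, best_f1); best_cutoff is None if no cutoff
    recovers any prior edge.
    """
    corr = np.asarray(corr, dtype=float)
    rows, cols = np.triu_indices(corr.shape[0], k=1)
    abs_corr = np.abs(corr[rows, cols])
    prior = np.array([
        (int(i), int(j)) in prior_edges or (int(j), int(i)) in prior_edges
        for i, j in zip(rows, cols)
    ])
    best_cutoff, best_score = None, -1.0
    for cutoff in cutoffs:
        inferred = abs_corr >= cutoff
        true_pos = np.sum(inferred & prior)
        if true_pos == 0:
            continue
        precision = true_pos / inferred.sum()
        recall = true_pos / prior.sum()
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_score:
            best_cutoff, best_score = cutoff, float(f1)
    return best_cutoff, best_score
```

A very low cutoff keeps every known edge but adds many unsupported ones, while a very high cutoff drops known edges; scanning the candidates makes that trade-off explicit.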
|
15
|
Balomenos P, Dragomir A, Tsakalidis AK, Bezerianos A. Identification of differentially expressed subpathways via a bilevel consensus scoring of network topology and gene expression. Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference 2020; 2020:5316-5319. [PMID: 33019184 DOI: 10.1109/embc44109.2020.9176556] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022]
Abstract
Identifying differentially expressed subpathways connected to the emergence of a disease that can be considered as candidates for pharmacological intervention, with minimal off-target effects, is a daunting task. In this direction, we present a bilevel subpathway analysis method to identify differentially expressed subpathways that are connected with an experimental condition, while taking into account potential crosstalks between subpathways which arise due to their connectivity in a combined multi-pathway network. The efficacy of the method is demonstrated on a hematopoietic stem cell aging dataset, with findings corroborated using recent literature.
|
16
|
Yang M, Petralia F, Li Z, Li H, Ma W, Song X, Kim S, Lee H, Yu H, Lee B, Bae S, Heo E, Kaczmarczyk J, Stępniak P, Warchoł M, Yu T, Calinawan AP, Boutros PC, Payne SH, Reva B, Boja E, Rodriguez H, Stolovitzky G, Guan Y, Kang J, Wang P, Fenyö D, Saez-Rodriguez J, Aderinwale T, Afyounian E, Agrawal P, Ali M, Amadoz A, Azuaje F, Bachman J, Bae S, Bhalla S, Carbonell-Caballero J, Chakraborty P, Chaudhary K, Choi Y, Choi Y, Çubuk C, Dhanda SK, Dopazo J, Elo LL, Fóthi Á, Gevaert O, Granberg K, Greiner R, Heo E, Hidalgo MR, Jayaswal V, Jeon H, Jeon M, Kalmady SV, Kambara Y, Kang J, Kang K, Kaoma T, Kaur H, Kazan H, Kesar D, Kesseli J, Kim D, Kim K, Kim SY, Kim S, Kumar S, Lee B, Lee H, Liu Y, Luethy R, Mahajan S, Mahmoudian M, Muller A, Nazarov PV, Nguyen H, Nykter M, Okuda S, Park S, Pal Singh Raghava G, Rajapakse JC, Rantapero T, Ryu H, Salavert F, Saraei S, Sharma R, Siitonen A, Sokolov A, Subramanian K, Suni V, Suomi T, Tranchevent LC, Usmani SS, Välikangas T, Vega R, Zhong H. Community Assessment of the Predictability of Cancer Protein and Phosphoprotein Levels from Genomics and Transcriptomics. Cell Syst 2020; 11:186-195.e9. [DOI: 10.1016/j.cels.2020.06.013] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 08/22/2019] [Revised: 03/12/2020] [Accepted: 06/29/2020] [Indexed: 10/23/2022]
|
17
|
Djordjilović V, Chiogna M, Romualdi C. Simulating gene silencing through intervention analysis. J R Stat Soc Ser C Appl Stat 2020. [DOI: 10.1111/rssc.12412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
18
|
Martini P, Chiogna M, Calura E, Romualdi C. MOSClip: multi-omic and survival pathway analysis for the identification of survival associated gene and modules. Nucleic Acids Res 2019; 47:e80. [PMID: 31049575 PMCID: PMC6698707 DOI: 10.1093/nar/gkz324] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/22/2018] [Revised: 03/29/2019] [Accepted: 04/29/2019] [Indexed: 01/09/2023] Open
Abstract
Survival analysis of gene expression data has been a useful and widely used approach in clinical applications. However, in complex diseases such as cancer, the identification of survival-associated cell processes, rather than single genes, provides more informative results because the efficacy of survival prediction increases when multiple prognostic features are combined to enlarge the possibility of having druggable targets. Moreover, genome-wide screening in molecular medicine has rapidly grown, providing not only gene expression but also multi-omic measurements such as DNA mutations, methylation, expression, and copy number data. In cancer, virtually all these aberrations can contribute in synergy to pathological processes, and their measurements can improve a patient’s outcome and help in diagnosis and treatment decisions. Here, we present MOSClip, an R package implementing a new topological pathway analysis tool able to integrate multi-omic data and look for survival-associated gene modules. MOSClip tests the survival association of dimensionality-reduced multi-omic data using multivariate models, providing graphical devices for management, browsing and interpretation of results. Using simulated data, we evaluated MOSClip performance in terms of false positives and false negatives in different settings, while the TCGA ovarian cancer dataset is used as a case study to highlight MOSClip’s potential.
Affiliation(s)
- Paolo Martini
- Department of Biology, University of Padova, Via U.Bassi 58B, 35121 Padova, Italy
| | - Monica Chiogna
- Department of Statistical Sciences 'Paolo Fortunati', University of Bologna, via delle Belle Arti 41, 40126 Bologna, Italy
| | - Enrica Calura
- Department of Biology, University of Padova, Via U.Bassi 58B, 35121 Padova, Italy
| | - Chiara Romualdi
- Department of Biology, University of Padova, Via U.Bassi 58B, 35121 Padova, Italy
| |
|
19
|
Calura E, Ciciani M, Sambugaro A, Paracchini L, Benvenuto G, Milite S, Martini P, Beltrame L, Zane F, Fruscio R, Delle Marchette M, Borella F, Tognon G, Ravaggi A, Katsaros D, Bignotti E, Odicino F, D’Incalci M, Marchini S, Romualdi C. Transcriptional Characterization of Stage I Epithelial Ovarian Cancer: A Multicentric Study. Cells 2019; 8:cells8121554. [PMID: 31805750 PMCID: PMC6952972 DOI: 10.3390/cells8121554] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/23/2019] [Revised: 11/19/2019] [Accepted: 11/26/2019] [Indexed: 02/07/2023] Open
Abstract
Stage I epithelial ovarian cancer (EOC) represents about 10% of all EOCs. It is characterized by complex histopathological and molecular heterogeneity and comprises five main histological subtypes (mucinous, endometrioid, clear cell, and high- and low-grade serous), each with distinct genetic, molecular, and clinical characteristics. As it occurs less frequently than advanced-stage EOC, its molecular features have not been thoroughly investigated. In this study, using in silico approaches and gene expression data from a multicentric cohort of 208 snap-frozen tumor biopsies, we explored the subtype-specific molecular alterations that regulate tumor aggressiveness in stage I EOC. We found that single genes rather than pathways are responsible for histotype specificities and that a cAMP-PKA-CREB1 signaling axis seems to play a central role in histotype differentiation. Moreover, our results indicate that the immune response seems to be, at least in part, involved in histotype differences, as serous and mucinous samples showed a more immune-reactive behavior than other histotypes.
Affiliation(s)
- Enrica Calura
- Department of Biology, University of Padova, 35121 Padua, Italy; (E.C.); (A.S.); (G.B.); (S.M.); (P.M.); (C.R.)
| | - Matteo Ciciani
- Department of Cellular, Computational and Integrative Biology—CIBIO, University of Trento, 38123 Povo Trento, Italy;
| | - Andrea Sambugaro
- Department of Biology, University of Padova, 35121 Padua, Italy; (E.C.); (A.S.); (G.B.); (S.M.); (P.M.); (C.R.)
| | - Lara Paracchini
- Department of Oncology, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, 20156 Milano, Italy; (L.P.); (L.B.); (S.M.)
| | - Giuseppe Benvenuto
- Department of Biology, University of Padova, 35121 Padua, Italy; (E.C.); (A.S.); (G.B.); (S.M.); (P.M.); (C.R.)
| | - Salvatore Milite
- Department of Biology, University of Padova, 35121 Padua, Italy; (E.C.); (A.S.); (G.B.); (S.M.); (P.M.); (C.R.)
| | - Paolo Martini
- Department of Biology, University of Padova, 35121 Padua, Italy; (E.C.); (A.S.); (G.B.); (S.M.); (P.M.); (C.R.)
| | - Luca Beltrame
- Department of Oncology, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, 20156 Milano, Italy; (L.P.); (L.B.); (S.M.)
| | - Flaminia Zane
- Unit of Biological Adaptation and Ageing UMR8256, Institute of Biology Paris-Seine, Sorbonne University, 75005 Paris, France;
| | - Robert Fruscio
- Clinic of Obstetrics and Gynaecology, University of Milano-Bicocca, San Gerardo Hospital, 20900 Monza, Italy; (R.F.); (M.D.M.)
| | - Martina Delle Marchette
- Clinic of Obstetrics and Gynaecology, University of Milano-Bicocca, San Gerardo Hospital, 20900 Monza, Italy; (R.F.); (M.D.M.)
| | - Fulvio Borella
- Department of Surgical Science and Gynecology, Azienda Ospedaliero Universitaria, Città della Salute, presidio S.Anna, University of Torino, 10126 Torino, Italy; (F.B.); (D.K.)
| | - Germana Tognon
- Division of Obstetrics and Gynecology, ASST Spedali Civili di Brescia, 25123 Brescia, Italy; (G.T.); (E.B.); (F.O.)
| | - Antonella Ravaggi
- Angelo Nocivelli Institute of Molecular Medicine, University of Brescia and ASST-Spedali Civili of Brescia, 25123 Brescia, Italy;
- Department of Clinical and Experimental Sciences, Division of Obstetrics and Gynecology, University of Brescia, 25123 Brescia, Italy
| | - Dionyssios Katsaros
- Department of Surgical Science and Gynecology, Azienda Ospedaliero Universitaria, Città della Salute, presidio S.Anna, University of Torino, 10126 Torino, Italy; (F.B.); (D.K.)
| | - Eliana Bignotti
- Division of Obstetrics and Gynecology, ASST Spedali Civili di Brescia, 25123 Brescia, Italy; (G.T.); (E.B.); (F.O.)
- Angelo Nocivelli Institute of Molecular Medicine, University of Brescia and ASST-Spedali Civili of Brescia, 25123 Brescia, Italy;
| | - Franco Odicino
- Division of Obstetrics and Gynecology, ASST Spedali Civili di Brescia, 25123 Brescia, Italy; (G.T.); (E.B.); (F.O.)
| | - Maurizio D’Incalci
- Department of Oncology, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, 20156 Milano, Italy; (L.P.); (L.B.); (S.M.)
| | - Sergio Marchini
- Department of Oncology, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, 20156 Milano, Italy; (L.P.); (L.B.); (S.M.)
| | - Chiara Romualdi
- Department of Biology, University of Padova, 35121 Padua, Italy; (E.C.); (A.S.); (G.B.); (S.M.); (P.M.); (C.R.)
| |
|
20
|
Mubeen S, Hoyt CT, Gemünd A, Hofmann-Apitius M, Fröhlich H, Domingo-Fernández D. The Impact of Pathway Database Choice on Statistical Enrichment Analysis and Predictive Modeling. Front Genet 2019; 10:1203. [PMID: 31824580 PMCID: PMC6883970 DOI: 10.3389/fgene.2019.01203] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Received: 08/23/2019] [Accepted: 10/30/2019] [Indexed: 02/04/2023] Open
Abstract
Pathway-centric approaches are widely used to interpret and contextualize -omics data. However, databases contain different representations of the same biological pathway, which may lead to different results of statistical enrichment analysis and predictive models in the context of precision medicine. We have performed an in-depth benchmarking of the impact of pathway database choice on statistical enrichment analysis and predictive modeling. We analyzed five cancer datasets using three major pathway databases and developed an approach to merge several databases into a single integrative one: MPath. Our results show that equivalent pathways from different databases yield disparate results in statistical enrichment analysis. Moreover, we observed a significant dataset-dependent impact on the performance of machine learning models on different prediction tasks. In some cases, MPath significantly improved prediction performance and also reduced the variance of prediction performances. Furthermore, MPath yielded more consistent and biologically plausible results in statistical enrichment analyses. In summary, this benchmarking study demonstrates that pathway database choice can influence the results of statistical enrichment analysis and predictive modeling. Therefore, we recommend the use of multiple pathway databases or integrative ones.
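To see why two representations of the same pathway can give different enrichment results, consider a minimal over-representation test in Python. This is a generic hypergeometric sketch, not the benchmarking pipeline or MPath; the function names and the toy gene identifiers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, pathway_size, n_hits, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    X counts how many of the n_hits selected genes fall in a pathway of
    pathway_size genes, drawn without replacement from universe_size genes.
    """
    total = comb(universe_size, n_hits)
    upper = min(pathway_size, n_hits)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, n_hits - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def pathway_p_value(hit_genes, pathway_genes, universe):
    """Enrichment p-value of one pathway definition for a list of hit genes."""
    universe = set(universe)
    hits = set(hit_genes) & universe
    pathway = set(pathway_genes) & universe
    return hypergeom_enrichment_p(
        len(universe), len(pathway), len(hits), len(hits & pathway)
    )


# Toy example: the "same" pathway as defined by two hypothetical databases.
universe = [f"g{i}" for i in range(20)]
hits = ["g0", "g1", "g2", "g3", "g4"]
pathway_db_a = ["g0", "g1", "g2", "g3", "g10"]
pathway_db_b = ["g0", "g1", "g11", "g12", "g13", "g14"]
p_a = pathway_p_value(hits, pathway_db_a, universe)
p_b = pathway_p_value(hits, pathway_db_b, universe)
```

Because the two definitions share only part of their gene content, the same hit list yields a much smaller p-value for the first definition than for the second, which is the kind of database-dependent discrepancy the abstract describes.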
Affiliation(s)
- Sarah Mubeen
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany.,Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
| | - Charles Tapley Hoyt
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany.,Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
| | - André Gemünd
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
| | - Martin Hofmann-Apitius
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany.,Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
| | - Holger Fröhlich
- Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
| | - Daniel Domingo-Fernández
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany.,Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
| |
|
21
|
Salviato E, Djordjilović V, Chiogna M, Romualdi C. SourceSet: A graphical model approach to identify primary genes in perturbed biological pathways. PLoS Comput Biol 2019; 15:e1007357. [PMID: 31652275 PMCID: PMC6834292 DOI: 10.1371/journal.pcbi.1007357] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 12/17/2018] [Revised: 11/06/2019] [Accepted: 08/23/2019] [Indexed: 11/24/2022] Open
Abstract
Topological gene-set analysis has emerged as a powerful means for omic data interpretation. Although numerous methods for identifying dysregulated genes have been proposed, few of them aim to distinguish genes that are the real source of perturbation from those that merely respond to the signal dysregulation. Here, we propose a new method, called SourceSet, able to distinguish between the primary and the secondary dysregulation within a Gaussian graphical model context. The proposed method compares gene expression profiles in the control and in the perturbed condition and detects the differences in both the mean and the covariance parameters with a series of likelihood ratio tests. The resulting evidence is used to infer the primary and the secondary set, i.e. the genes responsible for the primary dysregulation, and the genes affected by the perturbation through network propagation. The proposed method demonstrates high specificity and sensitivity in different simulated scenarios and on several real biological case studies. In order to fit into the more traditional pathway analysis framework, SourceSet R package also extends the analysis from a single to multiple pathways and provides several graphical outputs, including Cytoscape visualization to browse the results.
The rapid increase in omic studies has created a need to understand the biological implications of their results. Gene-set analysis has emerged as a powerful means for gaining such understanding, evolving in the last decade from the classical enrichment analysis to the more powerful topological approaches. Although numerous methods for identifying dysregulated genes have been proposed, few of them aim to distinguish genes that are the real source of perturbation from those that merely respond to the signal dysregulation. This distinction is crucial for network medicine, where the prioritization of the effect of biological perturbations may help in the molecular understanding of drug treatments and diseases. Here we propose a new method, called SourceSet, able to distinguish between primary and secondary dysregulation within a graphical model context, demonstrating a high specificity and sensitivity in different simulated scenarios and on real biological case studies.
Affiliation(s)
- Elisa Salviato
  - IFOM - The FIRC Institute of Molecular Oncology, Milan, Italy
- Monica Chiogna
  - Department of Statistical Sciences, University of Bologna, Bologna, Italy
- Chiara Romualdi
  - Department of Biology, University of Padova, Padova, Italy
|
22
|
Stanstrup J, Broeckling CD, Helmus R, Hoffmann N, Mathé E, Naake T, Nicolotti L, Peters K, Rainer J, Salek RM, Schulze T, Schymanski EL, Stravs MA, Thévenot EA, Treutler H, Weber RJM, Willighagen E, Witting M, Neumann S. The metaRbolomics Toolbox in Bioconductor and beyond. Metabolites 2019; 9:E200. [PMID: 31548506 PMCID: PMC6835268 DOI: 10.3390/metabo9100200] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Received: 08/11/2019] [Revised: 09/16/2019] [Accepted: 09/17/2019] [Indexed: 11/17/2022]
Abstract
Metabolomics aims to measure and characterise the complex composition of metabolites in a biological system. Metabolomics studies involve sophisticated analytical techniques such as mass spectrometry and nuclear magnetic resonance spectroscopy, and generate large amounts of high-dimensional and complex experimental data. Open source processing and analysis tools are of major interest in light of innovative, open and reproducible science. The scientific community has developed a wide range of open source software, providing freely available advanced processing and analysis approaches. The programming and statistics environment R has emerged as one of the most popular environments to process and analyse Metabolomics datasets. A major benefit of such an environment is the possibility of connecting different tools into more complex workflows. Combining reusable data processing R scripts with the experimental data thus allows for open, reproducible research. This review provides an extensive overview of existing packages in R for different steps in a typical computational metabolomics workflow, including data processing, biostatistics, metabolite annotation and identification, and biochemical network and pathway analysis. Multifunctional workflows, possible user interfaces and integration into workflow management systems are also reviewed. In total, this review summarises more than two hundred metabolomics specific packages primarily available on CRAN, Bioconductor and GitHub.
Affiliation(s)
- Jan Stanstrup
  - Preventive and Clinical Nutrition, University of Copenhagen, Rolighedsvej 30, 1958 Frederiksberg C, Denmark.
- Corey D Broeckling
  - Proteomics and Metabolomics Facility, Colorado State University, Fort Collins, CO 80523, USA.
- Rick Helmus
  - Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, 1098 XH Amsterdam, The Netherlands.
- Nils Hoffmann
  - Leibniz-Institut für Analytische Wissenschaften-ISAS-e.V., Otto-Hahn-Straße 6b, 44227 Dortmund, Germany.
- Ewy Mathé
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA.
- Thomas Naake
  - Max Planck Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany.
- Luca Nicolotti
  - The Australian Wine Research Institute, Metabolomics Australia, PO Box 197, Adelaide SA 5064, Australia.
- Kristian Peters
  - Leibniz Institute of Plant Biochemistry (IPB Halle), Bioinformatics and Scientific Data, 06120 Halle, Germany.
- Johannes Rainer
  - Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, 39100 Bolzano, Italy.
- Reza M Salek
  - The International Agency for Research on Cancer, 150 cours Albert Thomas, CEDEX 08, 69372 Lyon, France.
- Tobias Schulze
  - Department of Effect-Directed Analysis, Helmholtz Centre for Environmental Research-UFZ, Permoserstraße 15, 04318 Leipzig, Germany.
- Emma L Schymanski
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 avenue du Swing, L-4367 Belvaux, Luxembourg.
- Michael A Stravs
  - Eawag, Swiss Federal Institute of Aquatic Science and Technology, Überlandstrasse 133, 8600 Dubendorf, Switzerland.
- Etienne A Thévenot
  - CEA, LIST, Laboratory for Data Sciences and Decision, MetaboHUB, Gif-Sur-Yvette F-91191, France.
- Hendrik Treutler
  - Leibniz Institute of Plant Biochemistry (IPB Halle), Bioinformatics and Scientific Data, 06120 Halle, Germany.
- Ralf J M Weber
  - Phenome Centre Birmingham and School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK.
- Egon Willighagen
  - Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, 6229 ER Maastricht, The Netherlands.
- Michael Witting
  - Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, 85764 Neuherberg, Germany.
  - Chair of Analytical Food Chemistry, Technische Universität München, 85354 Weihenstephan, Germany.
- Steffen Neumann
  - Leibniz Institute of Plant Biochemistry (IPB Halle), Bioinformatics and Scientific Data, 06120 Halle, Germany.
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany.
|
23
|
Domingo-Fernández D, Mubeen S, Marín-Llaó J, Hoyt CT, Hofmann-Apitius M. PathMe: merging and exploring mechanistic pathway knowledge. BMC Bioinformatics 2019; 20:243. [PMID: 31092193 PMCID: PMC6521546 DOI: 10.1186/s12859-019-2863-9] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 03/05/2019] [Accepted: 04/29/2019] [Indexed: 12/12/2022]
Abstract
Background: The complexity of representing biological systems is compounded by an ever-expanding body of knowledge emerging from multi-omics experiments. A number of pathway databases have facilitated pathway-centric approaches that assist in the interpretation of molecular signatures yielded by these experiments. However, the lack of interoperability between pathway databases has hindered the ability to harmonize these resources and to exploit their consolidated knowledge. Such a unification of pathway knowledge is imperative in enhancing the comprehension and modeling of biological abstractions. Results: Here, we present PathMe, a Python package that transforms pathway knowledge from three major pathway databases into a unified abstraction using Biological Expression Language as the pivotal, integrative schema. PathMe is complemented by a novel web application (freely available at https://pathme.scai.fraunhofer.de/) which allows users to comprehensively explore pathway crosstalk and compare areas of consensus and discrepancies. Conclusions: This work has harmonized three major pathway databases and transformed them into a unified schema in order to gain a holistic picture of pathway knowledge. We demonstrate the utility of the PathMe framework in: i) integrating pathway landscapes at the database level, ii) comparing the degree of consensus at the pathway level, and iii) exploring pathway crosstalk and investigating consensus at the molecular level.
Affiliation(s)
- Daniel Domingo-Fernández
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53754, Sankt Augustin, Germany.
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115, Bonn, Germany.
- Sarah Mubeen
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53754, Sankt Augustin, Germany.
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115, Bonn, Germany.
- Josep Marín-Llaó
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53754, Sankt Augustin, Germany.
- Charles Tapley Hoyt
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53754, Sankt Augustin, Germany.
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115, Bonn, Germany.
- Martin Hofmann-Apitius
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53754, Sankt Augustin, Germany.
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115, Bonn, Germany.
|
24
|
Mubeen S, Hoyt CT, Gemünd A, Hofmann-Apitius M, Fröhlich H, Domingo-Fernández D. The Impact of Pathway Database Choice on Statistical Enrichment Analysis and Predictive Modeling. Front Genet 2019. [PMID: 31824580 DOI: 10.3389/fgene.2019.01203] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 05/06/2023]
Abstract
Pathway-centric approaches are widely used to interpret and contextualize -omics data. However, databases contain different representations of the same biological pathway, which may lead to different results of statistical enrichment analysis and predictive models in the context of precision medicine. We have performed an in-depth benchmarking of the impact of pathway database choice on statistical enrichment analysis and predictive modeling. We analyzed five cancer datasets using three major pathway databases and developed an approach to merge several databases into a single integrative one: MPath. Our results show that equivalent pathways from different databases yield disparate results in statistical enrichment analysis. Moreover, we observed a significant dataset-dependent impact on the performance of machine learning models on different prediction tasks. In some cases, MPath significantly improved prediction performance and also reduced the variance of prediction performances. Furthermore, MPath yielded more consistent and biologically plausible results in statistical enrichment analyses. In summary, this benchmarking study demonstrates that pathway database choice can influence the results of statistical enrichment analysis and predictive modeling. Therefore, we recommend the use of multiple pathway databases or integrative ones.
Affiliation(s)
- Sarah Mubeen
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- Charles Tapley Hoyt
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- André Gemünd
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
- Martin Hofmann-Apitius
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- Holger Fröhlich
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- Daniel Domingo-Fernández
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
|