1
Candia J, Ferrucci L. Assessment of Gene Set Enrichment Analysis using curated RNA-seq-based benchmarks. PLoS One 2024; 19:e0302696. [PMID: 38753612 PMCID: PMC11098418 DOI: 10.1371/journal.pone.0302696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 04/09/2024] [Indexed: 05/18/2024] Open
Abstract
Pathway enrichment analysis is a ubiquitous computational biology method to interpret a list of genes (typically derived from the association of large-scale omics data with phenotypes of interest) in terms of higher-level, predefined gene sets that share biological function, chromosomal location, or other common features. Among many tools developed so far, Gene Set Enrichment Analysis (GSEA) stands out as one of the pioneering and most widely used methods. Although originally developed for microarray data, GSEA is nowadays extensively utilized for RNA-seq data analysis. Here, we quantitatively assessed the performance of a variety of GSEA modalities and provide guidance in the practical use of GSEA in RNA-seq experiments. We leveraged harmonized RNA-seq datasets available from The Cancer Genome Atlas (TCGA) in combination with large, curated pathway collections from the Molecular Signatures Database to obtain cancer-type-specific target pathway lists across multiple cancer types. We carried out a detailed analysis of GSEA performance using both gene-set and phenotype permutations combined with four different choices for the Kolmogorov-Smirnov enrichment statistic. Based on our benchmarks, we conclude that the classic/unweighted gene-set permutation approach offered comparable or better sensitivity-vs-specificity tradeoffs across cancer types compared with other, more complex and computationally intensive permutation methods. Finally, we analyzed other large cohorts for thyroid cancer and hepatocellular carcinoma. We utilized a new consensus metric, the Enrichment Evidence Score (EES), which showed a remarkable agreement between pathways identified in TCGA and those from other sources, despite differences in cancer etiology. This finding suggests an EES-based strategy to identify a core set of pathways that may be complemented by an expanded set of pathways for downstream exploratory analysis. 
This work fills the existing gap in current guidelines and benchmarks for the use of GSEA with RNA-seq data and provides a framework to enable detailed benchmarking of other RNA-seq-based pathway analysis tools.
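As a rough illustration of the classic/unweighted gene-set permutation approach mentioned in this abstract, the following Python sketch computes a Kolmogorov-Smirnov-style running-sum enrichment score and an empirical p-value from random gene sets of the same size. The function names, the two-sided comparison of absolute scores, and the toy gene identifiers are assumptions made for illustration; this is not the GSEA software or the authors' benchmark code.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Classic (unweighted) running-sum statistic: each hit adds 1/n_hits,
    each miss subtracts 1/n_miss; the score is the largest signed deviation."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def gene_set_permutation_pvalue(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Empirical p-value from random gene sets of the same size.
    Simplification (assumption): compares absolute scores, two-sided."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, gene_set)
    size = sum(gene in gene_set for gene in ranked_genes)
    extreme = 0
    for _ in range(n_perm):
        random_set = set(rng.sample(ranked_genes, size))
        if abs(enrichment_score(ranked_genes, random_set)) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (extreme + 1) / (n_perm + 1)


# Hypothetical toy ranking: g1 is the most up-regulated gene.
ranked = [f"g{i}" for i in range(1, 11)]
es, p = gene_set_permutation_pvalue(ranked, {"g1", "g2", "g3"})
```

Gene-set permutation only reshuffles set membership over a fixed ranking, which is why it is much cheaper than phenotype permutation, where the ranking itself has to be recomputed for every shuffle.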
Affiliation(s)
- Julián Candia
  - Longitudinal Studies Section, Translational Gerontology Branch, National Institute on Aging, National Institutes of Health, Baltimore, MD, United States of America
- Luigi Ferrucci
  - Longitudinal Studies Section, Translational Gerontology Branch, National Institute on Aging, National Institutes of Health, Baltimore, MD, United States of America
2
Jeuken GS, Käll L. Pathway analysis through mutual information. Bioinformatics 2024; 40:btad776. [PMID: 38195928 PMCID: PMC10783954 DOI: 10.1093/bioinformatics/btad776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Revised: 12/09/2023] [Accepted: 01/08/2024] [Indexed: 01/11/2024] Open
Abstract
MOTIVATION In pathway analysis, we aim to establish a connection between the activity of a particular biological pathway and a difference in phenotype. Many methods are available for pathway analysis; many of them rely on an upstream differential expression analysis, and many model the relations between the abundances of the analytes in a pathway as linear relationships. RESULTS Here, we propose a new method for pathway analysis, MIPath, that relies on information theoretical principles and, therefore, does not model the association between pathway activity and phenotype, resulting in relatively few assumptions. For this, we construct a graph of the data points for each pathway using a nearest-neighbor approach and score the association between the structure of this graph and the phenotype of these same samples using Mutual Information while adjusting for the effects of random chance in each score. The initial nearest-neighbor approach evades individual gene-level comparisons, hence making the method scalable and less vulnerable to missing values. These properties make our method particularly useful for single-cell data. We benchmarked our method on several single-cell datasets, comparing it to established and new methods, and found that it produces robust, reproducible, and meaningful scores. AVAILABILITY AND IMPLEMENTATION Source code is available at https://github.com/statisticalbiotechnology/mipath, or through Python Package Index as "mipathway."
Affiliation(s)
- Gustavo S Jeuken
  - Science for Life Laboratory, KTH – Royal Institute of Technology, Stockholm 171 65, Sweden
  - Computer Science Department, Vrije Universiteit Amsterdam, Amsterdam 1081 HV, The Netherlands
- Lukas Käll
  - Science for Life Laboratory, KTH – Royal Institute of Technology, Stockholm 171 65, Sweden
3
Nguyen TM, Craig DB, Tran D, Nguyen T, Draghici S. A novel approach for predicting upstream regulators (PURE) that affect gene expression. Sci Rep 2023; 13:18571. [PMID: 37903768 PMCID: PMC10616115 DOI: 10.1038/s41598-023-41374-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Accepted: 08/25/2023] [Indexed: 11/01/2023] Open
Abstract
External factors such as exposure to a chemical, drug, or toxicant (CDT), or conversely, the lack of certain chemicals can cause many diseases. The ability to identify such causal CDTs based on changes in the gene expression profile is extremely important in many studies. Furthermore, the ability to correctly infer CDTs that can revert the gene expression changes induced by a given disease phenotype is a crucial step in drug repurposing. We present an approach for Predicting Upstream REgulators (PURE) designed to tackle this challenge. PURE can correctly infer a CDT from the measured expression changes in a given phenotype, as well as correctly identify drugs that could revert disease-induced gene expression changes. We compared the proposed approach with four classical approaches as well as with the causal analysis used in Ingenuity Pathway Analysis (IPA) on 16 data sets (1 rat, 5 mouse, and 10 human data sets), involving 8 chemicals or drugs. We assessed the results based on the ability to correctly identify the CDT as indicated by its rank. We also considered the number of false positives, i.e. CDTs other than the correct CDT that were reported to be significant by each method. The proposed approach performed best in 11 out of the 16 experiments, reporting the correct CDT at the very top 7 times. IPA was the second best, reporting the correct CDT at the top 5 times, but was unable to identify the correct CDT at all in 5 out of the 16 experiments. The validation results showed that our approach, PURE, outperformed some of the most popular methods in the field. PURE could effectively infer the true CDTs responsible for the observed gene expression changes and could also be useful in drug repurposing applications.
Affiliation(s)
- Tuan-Minh Nguyen
  - Department of Computer Science, Wayne State University, Detroit, 48202, USA
- Douglas B Craig
  - Department of Computer Science, Wayne State University, Detroit, 48202, USA
  - Department of Oncology, School of Medicine, Wayne State University, Detroit, MI, 48201, USA
- Duc Tran
  - Department of Medicine, Washington University School of Medicine, St. Louis, MO, 63110, USA
- Tin Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, 36849, USA
- Sorin Draghici
  - Department of Computer Science, Wayne State University, Detroit, 48202, USA
  - Advaita Bioinformatics, Ann Arbor, MI, 48105, USA
4
Zhao K, Rhee SY. Interpreting omics data with pathway enrichment analysis. Trends Genet 2023; 39:308-319. [PMID: 36750393 DOI: 10.1016/j.tig.2023.01.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/25/2022] [Revised: 11/24/2022] [Accepted: 01/13/2023] [Indexed: 02/09/2023]
Abstract
Pathway enrichment analysis is indispensable for interpreting omics datasets and generating hypotheses. However, the foundations of enrichment analysis remain elusive to many biologists. Here, we discuss best practices in interpreting different types of omics data using pathway enrichment analysis and highlight the importance of considering intrinsic features of various types of omics data. We further explain major components that influence the outcomes of a pathway enrichment analysis, including defining background sets and choosing reference annotation databases. To improve reproducibility, we describe how to standardize reporting methodological details in publications. This article aims to serve as a primer for biologists to leverage the wealth of omics resources and motivate bioinformatics tool developers to enhance the power of pathway enrichment analysis.
Affiliation(s)
- Kangmei Zhao
  - Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94025, USA
- Seung Yon Rhee
  - Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94025, USA
5
Whittaker CA, Kucukural A, Gates C, Wilkins OM, Bell GW, Hutchinson JN, Polson SW, Dragon J. Functional Annotation Routines Used by ABRF Bioinformatics Core Facilities - Observations, Comparisons, and Considerations. J Biomol Tech 2023; 34:3fc1f5fe.0b74b9db. [PMID: 37089874 PMCID: PMC10121236 DOI: 10.7171/3fc1f5fe.0b74b9db] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/30/2023]
Abstract
The functional annotation of gene lists is a common analysis routine required for most genomics experiments, and bioinformatics core facilities must support these analyses. In contrast to methods such as the quantitation of RNA-Seq reads or differential expression analysis, our research group noted a lack of consensus in our preferred approaches to functional annotation. To investigate this observation, we selected 4 experiments that represent a range of experimental designs encountered by our cores and analyzed those data with 6 tools used by members of the Association of Biomolecular Resource Facilities (ABRF) Genomic Bioinformatics Research Group (GBIRG). To facilitate comparisons between tools, we focused on a single biological result for each experiment. These results were represented by a gene set, and we analyzed these gene sets with each tool considered in our study to map the result to the annotation categories presented by each tool. In most cases, each tool produces data that would facilitate identification of the selected biological result for each experiment. For the exceptions, Fisher's exact test parameters could be adjusted to detect the result. Because Fisher's exact test is used by many functional annotation tools, we investigated input parameters and demonstrate that, while background set size is unlikely to have a significant impact on the results, the numbers of differentially expressed genes in an annotation category and the total number of differentially expressed genes under consideration are both critical parameters that may need to be modified during analyses. In addition, we note that differences in the annotation categories tested by each tool, as well as the composition of those categories, can have a significant impact on results.
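The four Fisher's exact test inputs discussed in this abstract (background set size, annotation category size, number of differentially expressed genes, and their overlap) can be made concrete with a small sketch. The one-sided over-representation p-value below is the upper tail of the hypergeometric distribution, computed with the standard library only. The function name, parameter names, and the example numbers are illustrative assumptions, not taken from any of the tools compared in the study.

```python
from math import comb


def ora_pvalue(n_background, n_category, n_de, n_overlap):
    """One-sided Fisher's exact test for over-representation.

    Returns P(X >= n_overlap), where X is the number of category genes
    among n_de genes drawn without replacement from n_background genes,
    of which n_category belong to the annotation category.
    """
    max_overlap = min(n_category, n_de)
    total = comb(n_background, n_de)
    tail = sum(
        comb(n_category, k) * comb(n_background - n_category, n_de - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 background genes, a 100-gene category,
# 500 differentially expressed genes, 10 of them in the category.
p_value = ora_pvalue(20000, 100, 500, 10)
```

Calling the function repeatedly while changing one argument at a time is a simple way to explore how sensitive the p-value is to each input for a given dataset.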
Affiliation(s)
- Charles A. Whittaker
  - Barbara K. Ostrom (1978) Bioinformatics and Computing Core Facility, Swanson Biotechnology Center, Koch Institute at the Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Alper Kucukural
  - Bioinformatics Core, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA
- Chris Gates
  - BRCF Bioinformatics Core, University of Michigan, Ann Arbor, Michigan 48109, USA
- Owen Michael Wilkins
  - Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire 03755, USA
  - Dartmouth Cancer Center, Dartmouth Hitchcock Medical Center, Lebanon, New Hampshire 03756, USA
- George W. Bell
  - Bioinformatics and Research Computing, Whitehead Institute, Cambridge, Massachusetts 02142, USA
- John N. Hutchinson
  - Harvard T.H. Chan School of Public Health, Department of Biostatistics, Boston, Massachusetts 02115, USA
- Shawn W. Polson
  - Bioinformatics Core, Center for Bioinformatics and Computational Biology, University of Delaware, Delaware Biotechnology Institute, Newark, Delaware 19713, USA
- Julie Dragon
  - Vermont Integrative Genomics Resource and Vermont Biomedical Research Network Bioinformatic Core, University of Vermont, Burlington, Vermont 05405, USA
6
Palshikar MG, Min X, Crystal A, Meng J, Hilchey SP, Zand MS, Thakar J. Executable Network Models of Integrated Multiomics Data. J Proteome Res 2023; 22:1546-1556. [PMID: 37000949 PMCID: PMC10167691 DOI: 10.1021/acs.jproteome.2c00730] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/03/2023]
Abstract
Multiomics profiling provides a holistic picture of a condition being examined and captures the complexity of signaling events, beginning from the original cause (environmental or genetic), to downstream functional changes at multiple molecular layers. Pathway enrichment analysis has been used with multiomics data sets to characterize signaling mechanisms. However, technical and biological variability between these layered data limits integrative computational analysis. We present a Boolean network-based method, multiomics Boolean Omics Network Invariant-Time Analysis (mBONITA), to integrate omics data sets that quantify multiple molecular layers. mBONITA utilizes prior knowledge networks to perform topology-based pathway analysis. In addition, mBONITA identifies genes that are consistently modulated across molecular measurements by combining observed fold-changes and variance with a measure of node (i.e., gene or protein) influence over signaling and a measure of the strength of evidence for that gene across data sets. We used mBONITA to integrate multiomics data sets from RAMOS B cells treated with the immunosuppressant drug cyclosporine A under varying O2 tensions to identify pathways involved in hypoxia-mediated chemotaxis. We compare mBONITA's performance with 6 other pathway analysis methods designed for multiomics data and show that mBONITA identifies a set of pathways with evidence of modulation across all omics layers. mBONITA is freely available at https://github.com/Thakar-Lab/mBONITA.
7
Yousef M, Ozdemir F, Jaber A, Allmer J, Bakir-Gungor B. PriPath: identifying dysregulated pathways from differential gene expression via grouping, scoring, and modeling with an embedded feature selection approach. BMC Bioinformatics 2023; 24:60. [PMID: 36823571 PMCID: PMC9947447 DOI: 10.1186/s12859-023-05187-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2022] [Accepted: 02/14/2023] [Indexed: 02/25/2023] Open
Abstract
BACKGROUND Cell homeostasis relies on the concerted actions of genes, and dysregulated genes can lead to diseases. In living organisms, genes or their products do not act alone but within networks. Subsets of these networks can be viewed as modules that provide specific functionality to an organism. The Kyoto encyclopedia of genes and genomes (KEGG) systematically analyzes gene functions, proteins, and molecules and combines them into pathways. Measurements of gene expression (e.g., RNA-seq data) can be mapped to KEGG pathways to determine which modules are affected or dysregulated in the disease. However, genes acting in multiple pathways and other inherent issues complicate such analyses. Many current approaches may only employ gene expression data and need to pay more attention to some of the existing knowledge stored in KEGG pathways for detecting dysregulated pathways. New methods that consider more precompiled information are required for a more holistic association between gene expression and diseases. RESULTS PriPath is a novel approach that transfers the generic process of grouping and scoring, followed by modeling to analyze gene expression with KEGG pathways. In PriPath, KEGG pathways are utilized as the grouping function as part of a machine learning algorithm for selecting the most significant KEGG pathways. A machine learning model is trained to differentiate between diseases and controls using those groups. We have tested PriPath on 13 gene expression datasets of various cancers and other diseases. Our proposed approach successfully assigned biologically and clinically relevant KEGG terms to the samples based on the differentially expressed genes. We have comparatively evaluated the performance of PriPath against other tools, which are similar in their merit. For each dataset, we manually confirmed the top results of PriPath in the literature and found that most predictions can be supported by previous experimental research. 
CONCLUSIONS PriPath can thus aid in determining dysregulated pathways, which applies to medical diagnostics. In the future, we aim to advance this approach so that it can perform patient stratification based on gene expression and identify druggable targets. Thereby, we cover two aspects of precision medicine.
Affiliation(s)
- Malik Yousef
  - Department of Information Systems, Zefat Academic College, 13206, Zefat, Israel
  - Galilee Digital Health Research Center (GDH), Zefat Academic College, Zefat, Israel
- Fatma Ozdemir
  - Department of Computer Engineering, Faculty of Engineering, Abdullah Gul University, Kayseri, Turkey
  - University Institute of Digital Communication Systems, Ruhr-University, Bochum, Germany
- Amhar Jaber
  - Department of Computer Engineering, Faculty of Engineering, Abdullah Gul University, Kayseri, Turkey
- Jens Allmer
  - Medical Informatics and Bioinformatics, Institute for Measurement Engineering and Sensor Technology, Hochschule Ruhr West, University of Applied Sciences, Mülheim an der Ruhr, Germany
- Burcu Bakir-Gungor
  - Department of Computer Engineering, Faculty of Engineering, Abdullah Gul University, Kayseri, Turkey
8
Metabolic Pathway Analysis: Advantages and Pitfalls for the Functional Interpretation of Metabolomics and Lipidomics Data. Biomolecules 2023; 13:biom13020244. [PMID: 36830612 PMCID: PMC9953275 DOI: 10.3390/biom13020244] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/05/2022] [Revised: 01/14/2023] [Accepted: 01/24/2023] [Indexed: 01/31/2023] Open
Abstract
Over the past decades, pathway analysis has become one of the most commonly used approaches for the functional interpretation of metabolomics data. Although the approach is widely used, it is not well standardized and the impact of different methodologies on the functional outcome is not well understood. Using four publicly available datasets, we investigated two main aspects of topological pathway analysis, namely the consideration of non-human native enzymatic reactions (e.g., from microbiota) and the interconnectivity of individual pathways. The exclusion of non-human native reactions led to detached and poorly represented reaction networks and to loss of information. The consideration of connectivity between pathways led to better emphasis of certain central metabolites in the network; however, it occasionally overemphasized the hub compounds. We proposed and examined a penalization scheme to diminish the effect of such compounds in the pathway evaluation. In order to compare and assess the results between different methodologies, we also performed over-representation analysis of the same datasets. We believe that our findings will raise awareness on both the capabilities and shortcomings of the currently used pathway analysis practices in metabolomics. Additionally, it will provide insights on various methodologies and strategies that should be considered for the analysis and interpretation of metabolomics data.
9
Lu Y, Pang Z, Xia J. Comprehensive investigation of pathway enrichment methods for functional interpretation of LC-MS global metabolomics data. Brief Bioinform 2023; 24:bbac553. [PMID: 36572652 PMCID: PMC9851290 DOI: 10.1093/bib/bbac553] [Citation(s) in RCA: 30] [Impact Index Per Article: 30.0] [Received: 09/23/2022] [Revised: 10/31/2022] [Accepted: 11/15/2022] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND Global or untargeted metabolomics is widely used to comprehensively investigate metabolic profiles under various pathophysiological conditions such as inflammation, infections, responses to exposures or interactions with microbial communities. However, biological interpretation of global metabolomics data remains a daunting task. Recent years have seen growing applications of pathway enrichment analysis based on putative annotations of liquid chromatography coupled with mass spectrometry (LC-MS) peaks for functional interpretation of LC-MS-based global metabolomics data. However, due to intricate peak-metabolite and metabolite-pathway relationships, considerable variations are observed among results obtained using different approaches. There is an urgent need to benchmark these approaches to inform best practices. RESULTS We have conducted a benchmark study of common peak annotation approaches and pathway enrichment methods in current metabolomics studies. Representative approaches, including three peak annotation methods and four enrichment methods, were selected and benchmarked under different scenarios. Based on the results, we have provided a set of recommendations regarding peak annotation, ranking metrics and feature selection. Overall, the mummichog approach performed better than the other methods. We have observed that a ~30% annotation rate is sufficient to achieve high recall (~90% based on mummichog), and using semi-annotated data improves functional interpretation. Based on the current platforms and enrichment methods, we further propose an identifiability index to indicate the possibility of a pathway being reliably identified. Finally, we evaluated all methods using 11 COVID-19 and 8 inflammatory bowel disease (IBD) global metabolomics datasets.
Affiliation(s)
- Yao Lu
  - Department of Microbiology and Immunology, McGill University, Quebec, Canada
- Zhiqiang Pang
  - Institute of Parasitology, McGill University, Quebec, Canada
- Jianguo Xia
  - Department of Microbiology and Immunology, McGill University, Quebec, Canada
  - Institute of Parasitology, McGill University, Quebec, Canada
10
Wieder C, Lai RPJ, Ebbels TMD. Single sample pathway analysis in metabolomics: performance evaluation and application. BMC Bioinformatics 2022; 23:481. [PMID: 36376837 PMCID: PMC9664704 DOI: 10.1186/s12859-022-05005-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/22/2022] [Accepted: 10/25/2022] [Indexed: 11/15/2022] Open
Abstract
BACKGROUND Single sample pathway analysis (ssPA) transforms molecular level omics data to the pathway level, enabling the discovery of patient-specific pathway signatures. Compared to conventional pathway analysis, ssPA overcomes the limitations by enabling multi-group comparisons, alongside facilitating numerous downstream analyses such as pathway-based machine learning. While in transcriptomics ssPA is a widely used technique, there is little literature evaluating its suitability for metabolomics. Here we provide a benchmark of established ssPA methods (ssGSEA, GSVA, SVD (PLAGE), and z-score) alongside the evaluation of two novel methods we propose: ssClustPA and kPCA, using semi-synthetic metabolomics data. We then demonstrate how ssPA can facilitate pathway-based interpretation of metabolomics data by performing a case-study on inflammatory bowel disease mass spectrometry data, using clustering to determine subtype-specific pathway signatures. RESULTS While GSEA-based and z-score methods outperformed the others in terms of recall, clustering/dimensionality reduction-based methods provided higher precision at moderate-to-high effect sizes. A case study applying ssPA to inflammatory bowel disease data demonstrates how these methods yield a much richer depth of interpretation than conventional approaches, for example by clustering pathway scores to visualise a pathway-based patient subtype-specific correlation network. We also developed the sspa python package (freely available at https://pypi.org/project/sspa/ ), providing implementations of all the methods benchmarked in this study. CONCLUSION This work underscores the value ssPA methods can add to metabolomic studies and provides a useful reference for those wishing to apply ssPA methods to metabolomics data.
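To make the idea of transforming molecular-level data to the pathway level concrete, here is a minimal sketch of a z-score-style single-sample pathway score: each metabolite is standardized across samples, and a pathway's score for a sample is the sum of its members' z-scores divided by the square root of the number of members. The data layout, function name, and toy values are assumptions for illustration only; this is not the sspa package's API or implementation.

```python
from math import sqrt
from statistics import mean, stdev


def zscore_pathway_scores(abundance, pathways):
    """abundance: dict metabolite -> list of values, one per sample (same order).
    pathways: dict pathway name -> list of member metabolites.
    Returns dict pathway name -> list of per-sample pathway scores."""
    n_samples = len(next(iter(abundance.values())))
    z = {}
    for metabolite, values in abundance.items():
        mu, sd = mean(values), stdev(values)
        if sd == 0:
            continue  # constant features carry no information; skip them
        z[metabolite] = [(v - mu) / sd for v in values]
    scores = {}
    for name, members in pathways.items():
        present = [m for m in members if m in z]
        if not present:
            continue  # no measured, non-constant member: pathway is not scored
        k = len(present)
        scores[name] = [
            sum(z[m][i] for m in present) / sqrt(k) for i in range(n_samples)
        ]
    return scores


# Hypothetical toy data: three samples, three metabolites, two pathways.
toy_abundance = {"a": [1, 2, 3], "b": [2, 4, 6], "c": [5, 5, 5]}
toy_pathways = {"P": ["a", "b", "c"], "Q": ["c"]}
toy_scores = zscore_pathway_scores(toy_abundance, toy_pathways)
```

The resulting sample-by-pathway scores can be passed to downstream steps such as clustering, which is the kind of use the case study in this abstract describes.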
Affiliation(s)
- Cecilia Wieder
  - Section of Bioinformatics, Division of Systems Medicine, Department of Metabolism, Digestion, and Reproduction, Faculty of Medicine, Imperial College London, London, UK
- Rachel P J Lai
  - Department of Infectious Disease, Faculty of Medicine, Imperial College London, London, UK
- Timothy M D Ebbels
  - Section of Bioinformatics, Division of Systems Medicine, Department of Metabolism, Digestion, and Reproduction, Faculty of Medicine, Imperial College London, London, UK
11
Jiménez‐Santos MJ, García‐Martín S, Fustero‐Torre C, Di Domenico T, Gómez‐López G, Al‐Shahrour F. Bioinformatics roadmap for therapy selection in cancer genomics. Mol Oncol 2022; 16:3881-3908. [PMID: 35811332 PMCID: PMC9627786 DOI: 10.1002/1878-0261.13286] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2022] [Revised: 06/22/2022] [Accepted: 07/08/2022] [Indexed: 12/24/2022] Open
Abstract
Tumour heterogeneity is one of the main characteristics of cancer and can be categorised into inter- or intratumour heterogeneity. This heterogeneity has been revealed as one of the key causes of treatment failure and relapse. Precision oncology is an emerging field that seeks to design tailored treatments for each cancer patient according to epidemiological, clinical and omics data. This discipline relies on bioinformatics tools designed to compute scores to prioritise available drugs, with the aim of helping clinicians in treatment selection. In this review, we describe the current approaches for therapy selection depending on which type of tumour heterogeneity is being targeted and the available next-generation sequencing data. We cover intertumour heterogeneity studies and individual treatment selection using genomics variants, expression data or multi-omics strategies. We also describe intratumour dissection through clonal inference and single-cell transcriptomics, in each case providing bioinformatics tools for tailored treatment selection. Finally, we discuss how these therapy selection workflows could be integrated into the clinical practice.
Affiliation(s)
- Coral Fustero‐Torre
  - Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Tomás Di Domenico
  - Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Gonzalo Gómez‐López
  - Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Fátima Al‐Shahrour
  - Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
12
Panossian A, Abdelfatah S, Efferth T. Network Pharmacology of Ginseng (Part III): Antitumor Potential of a Fixed Combination of Red Ginseng and Red Sage as Determined by Transcriptomics. Pharmaceuticals (Basel) 2022; 15:ph15111345. [PMID: 36355517 PMCID: PMC9696821 DOI: 10.3390/ph15111345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Revised: 10/25/2022] [Accepted: 10/28/2022] [Indexed: 11/30/2022] Open
Abstract
Background: This study aimed to assess the effect of a fixed combination of Red Ginseng and Red Sage (RG–RS) on the gene expression of neuronal cells to evaluate the potential impacts on cellular functions and predict its relevance in the treatment of stress and aging-related diseases and disorders. Methods: Gene expression profiling was conducted by transcriptome-wide mRNA microarray analyses of murine HT22 hippocampal cell culture after treatment with RG–RS preparation. Ingenuity pathway analysis (IPA) was performed with datasets of significantly upregulated or downregulated genes and the expected effects on the physiological and cellular function and the diseases were identified. Results: RG–RS deregulates 1028 genes associated with cancer and 139 with metastasis, suggesting a predicted decrease in tumorigenesis, the proliferation of tumor cells, tumor growth, metastasis, and an increase in apoptosis and autophagy by their effects on the various signaling and metabolic pathways, including the inhibition of Warburg’s aerobic glycolysis, estrogen-mediated S-phase entry signaling, osteoarthritis signaling, and the super-pathway of cholesterol biosynthesis. Conclusion: The results of this study provide evidence of the potential efficacy of the fixed combination of Red Ginseng (Panax ginseng C.A. Mey.) and Red Sage/Danshen (Salvia miltiorrhiza Bunge) in cancer. Further clinical and experimental studies are required to assess the efficacy and safety of RG–RS in preventing the progression of cancer, osteoarthritis, and other aging-related diseases.
Affiliation(s)
- Alexander Panossian
  - EuroPharma USA Inc., Green Bay, WI 54311, USA
  - Phytomed AB, 58344 Vastervick, Sweden
  - Correspondence: (A.P.); (T.E.)
- Sara Abdelfatah
  - Department of Pharmaceutical Biology, Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg University, 55131 Mainz, Germany
- Thomas Efferth
  - Department of Pharmaceutical Biology, Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg University, 55131 Mainz, Germany
  - Correspondence: (A.P.); (T.E.)
13
Metabolomic Profiling of End-Stage Heart Failure Secondary to Chronic Chagas Cardiomyopathy. Int J Mol Sci 2022; 23:ijms231810456. [PMID: 36142367 PMCID: PMC9499603 DOI: 10.3390/ijms231810456] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/28/2022] [Revised: 08/25/2022] [Accepted: 08/31/2022] [Indexed: 11/16/2022] Open
Abstract
Chronic Chagas cardiomyopathy (CCC) is the most frequent and severe clinical form of chronic Chagas disease, representing one of the leading causes of morbidity and mortality in Latin America, and a growing global public health problem. There is currently no approved treatment for CCC; however, omics technologies have enabled significant progress to be made in the search for new therapeutic targets. The metabolic alterations associated with pathogenic mechanisms of CCC and their relationship to cellular and immunopathogenic processes in cardiac tissue remain largely unknown. This exploratory study aimed to evaluate the potential underlying pathogenic mechanisms in the failing myocardium of patients with end-stage heart failure (ESHF) secondary to CCC by applying an untargeted metabolomic profiling approach. Cardiac tissue samples from the left ventricle of patients with ESHF of CCC etiology (n = 7) and healthy donors (n = 7) were analyzed using liquid chromatography-mass spectrometry. Metabolite profiles showed altered branched-chain amino acid and acylcarnitine levels, decreased fatty acid uptake and oxidation, increased activity of the pentose phosphate pathway, dysregulation of the TCA cycle, and alterations in critical cellular antioxidant systems. These findings suggest processes of energy deficit, alterations in substrate availability, and enhanced production of reactive oxygen species in the affected myocardium. This profile potentially contributes to the development and maintenance of a chronic inflammatory state that leads to progression and severity of CCC. Further studies involving larger sample sizes and comparisons with heart failure patients without CCC are needed to validate these results, opening an avenue to investigate new therapeutic approaches for the treatment and prevention of progression of this unique and severe cardiomyopathy.
14
Garrido-Rodriguez M, Zirngibl K, Ivanova O, Lobentanzer S, Saez-Rodriguez J. Integrating knowledge and omics to decipher mechanisms via large-scale models of signaling networks. Mol Syst Biol 2022; 18:e11036. [PMID: 35880747 PMCID: PMC9316933 DOI: 10.15252/msb.202211036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2022] [Revised: 05/12/2022] [Accepted: 05/31/2022] [Indexed: 11/10/2022] Open
Abstract
Signal transduction governs cellular behavior, and its dysregulation often leads to human disease. To understand this process, we can use network models based on prior knowledge, where nodes represent biomolecules, usually proteins, and edges indicate interactions between them. Several computational methods combine untargeted omics data with prior knowledge to estimate the state of signaling networks in specific biological scenarios. Here, we review, compare, and classify recent network approaches according to their characteristics in terms of input omics data, prior knowledge and underlying methodologies. We highlight existing challenges in the field, such as the general lack of ground truth and the limitations of prior knowledge. We also point out new omics developments that may have a profound impact, such as single‐cell proteomics or large‐scale profiling of protein conformational changes. We provide both an introduction for interested users seeking strategies to study cell signaling on a large scale and an update for seasoned modelers.
Affiliation(s)
- Martin Garrido-Rodriguez
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Katharina Zirngibl
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Olga Ivanova
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Sebastian Lobentanzer
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Julio Saez-Rodriguez
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
15
Comparing Bayesian-Based Reconstruction Strategies in Topology-Based Pathway Enrichment Analysis. Biomolecules 2022; 12:biom12070906. [PMID: 35883462 PMCID: PMC9313337 DOI: 10.3390/biom12070906] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/05/2022] [Revised: 06/21/2022] [Accepted: 06/24/2022] [Indexed: 02/01/2023] Open
Abstract
The development of high-throughput omics technologies has enabled the quantification of vast amounts of genes and gene products in the whole genome. Pathway enrichment analysis (PEA) provides an intuitive solution for extracting biological insights from massive amounts of data. Topology-based pathway analysis (TPA) represents the latest generation of PEA methods, which exploit pathway topology in addition to lists of differentially expressed genes and their expression profiles. A subset of these TPA methods, such as BPA, BNrich, and PROPS, reconstruct pathway structures by training Bayesian networks (BNs) from canonical biological pathways, providing superior representations that explain causal relationships between genes. However, these methods have never been compared for their differences in the PEA and their different topology reconstruction strategies. In this study, we aim to compare the BN reconstruction strategies of the BPA, BNrich, PROPS, Clipper, and Ensemble methods and their PEA and performance on tumor and non-tumor classification based on gene expression data. Our results indicate that they performed equally well in distinguishing tumor and non-tumor samples (AUC > 0.95) yet with a varying ranking of pathways, which can be attributed to the different BN structures resulting from the different cyclic structure removal strategies. This can be clearly seen from the reconstructed JAK-STAT networks by different strategies. In a nutshell, BNrich, which relies on expert intervention to remove loops and cyclic structures, produces BNs that best fit the biological facts. The plausibility of the Clipper strategy can also be partially explained by intuitive biological rules and theorems. Our results may offer an informed reference for the proper method for a given data analysis task.
16
Mubeen S, Tom Kodamullil A, Hofmann-Apitius M, Domingo-Fernández D. On the influence of several factors on pathway enrichment analysis. Brief Bioinform 2022; 23:bbac143. [PMID: 35453140 PMCID: PMC9116215 DOI: 10.1093/bib/bbac143] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 01/17/2022] [Revised: 03/21/2022] [Accepted: 03/30/2022] [Indexed: 02/01/2023] Open
Abstract
Pathway enrichment analysis has become a widely used knowledge-based approach for the interpretation of biomedical data. Its popularity has led to an explosion of both enrichment methods and pathway databases. While the elegance of pathway enrichment lies in its simplicity, multiple factors can impact the results of such an analysis, which may not be accounted for. Researchers may fail to give influential aspects their due, resorting instead to popular methods and gene set collections, or default settings. Despite ongoing efforts to establish set guidelines, meaningful results are still hampered by a lack of consensus or gold standards around how enrichment analysis should be conducted. Nonetheless, such concerns have prompted a series of benchmark studies specifically focused on evaluating the influence of various factors on pathway enrichment results. In this review, we organize and summarize the findings of these benchmarks to provide a comprehensive overview on the influence of these factors. Our work covers a broad spectrum of factors, spanning from methodological assumptions to those related to prior biological knowledge, such as pathway definitions and database choice. In doing so, we aim to shed light on how these aspects can lead to insignificant, uninteresting or even contradictory results. Finally, we conclude the review by proposing future benchmarks as well as solutions to overcome some of the challenges, which originate from the outlined factors.
Affiliation(s)
- Sarah Mubeen
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
  - Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115 Bonn, Germany
  - Fraunhofer Center for Machine Learning, Germany
- Alpha Tom Kodamullil
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
- Martin Hofmann-Apitius
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
  - Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115 Bonn, Germany
- Daniel Domingo-Fernández
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
  - Fraunhofer Center for Machine Learning, Germany
  - Enveda Biosciences, Boulder, CO, 80301, USA
17
Winkler S, Winkler I, Figaschewski M, Tiede T, Nordheim A, Kohlbacher O. De novo identification of maximally deregulated subnetworks based on multi-omics data with DeRegNet. BMC Bioinformatics 2022; 23:139. [PMID: 35439941 PMCID: PMC9020058 DOI: 10.1186/s12859-022-04670-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/05/2021] [Accepted: 03/29/2022] [Indexed: 12/14/2022] Open
Abstract
Background: With a growing amount of (multi-)omics data being available, the extraction of knowledge from these datasets is still a difficult problem. Classical enrichment-style analyses require predefined pathways or gene sets that are tested for significant deregulation to assess whether the pathway is functionally involved in the biological process under study. De novo identification of these pathways can reduce the bias inherent in predefined pathways or gene sets. At the same time, the definition and efficient identification of these pathways de novo from large biological networks is a challenging problem. Results: We present a novel algorithm, DeRegNet, for the identification of maximally deregulated subnetworks on directed graphs based on deregulation scores derived from (multi-)omics data. DeRegNet can be interpreted as maximum likelihood estimation given a certain probabilistic model for de novo subgraph identification. We use fractional integer programming to solve the resulting combinatorial optimization problem. We can show that the approach outperforms related algorithms on simulated data with known ground truths. On a publicly available liver cancer dataset we can show that DeRegNet can identify biologically meaningful subgraphs suitable for patient stratification. DeRegNet can also be used to find explicitly multi-omics subgraphs, which we demonstrate by presenting subgraphs with consistent methylation-transcription patterns. DeRegNet is freely available as open-source software. Conclusion: The proposed algorithmic framework and its available implementation can serve as a valuable heuristic hypothesis generation tool contextualizing omics data within biomolecular networks.
Affiliation(s)
- Sebastian Winkler
  - Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
  - International Max Planck Research School (IMPRS) "From Molecules to Organism", Tübingen, Germany
- Ivana Winkler
  - International Max Planck Research School (IMPRS) "From Molecules to Organism", Tübingen, Germany
  - Interfaculty Institute for Cell Biology (IFIZ), University of Tuebingen, Tübingen, Germany
  - German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Mirjam Figaschewski
  - Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Thorsten Tiede
  - Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Alfred Nordheim
  - Interfaculty Institute for Cell Biology (IFIZ), University of Tuebingen, Tübingen, Germany
  - Leibniz Institute on Aging (FLI), Jena, Germany
- Oliver Kohlbacher
  - Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
  - Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
  - Translational Bioinformatics, University Hospital Tuebingen, Tübingen, Germany
18
Baruzzo G, Cesaro G, Di Camillo B. Identify, quantify and characterize cellular communication from single-cell RNA sequencing data with scSeqComm. Bioinformatics 2022; 38:1920-1929. [PMID: 35043939 DOI: 10.1093/bioinformatics/btac036] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/07/2021] [Revised: 01/11/2022] [Accepted: 01/14/2022] [Indexed: 02/03/2023] Open
Abstract
Motivation: Recently, single-cell RNA-seq (scRNA-seq) data have been used to study cellular communication. Most bioinformatics methods infer only the intercellular signaling between groups of cells, mainly exploiting ligand-receptor expression levels. Only a few methods consider the entire intercellular + intracellular signaling, mainly inferring lists/networks of genes involved in signaling. Results: Here, we present scSeqComm, a computational method to identify and quantify the evidence of ongoing intercellular and intracellular signaling from scRNA-seq data, while at the same time providing a functional characterization of the inferred cellular communication. The possibility to quantify the evidence of ongoing communication assists the prioritization of the results, while the combined evidence of both intercellular and intracellular signaling increases the reliability of inferred communication. The application to a scRNA-seq dataset of tumor microenvironment, the agreement with independent bioinformatics analysis, the validation using spatial transcriptomics data and the comparison with state-of-the-art intercellular scoring schemes confirmed the robustness and reliability of the proposed method. Availability and implementation: The scSeqComm R package is freely available at https://gitlab.com/sysbiobig/scseqcomm and https://sysbiobig.dei.unipd.it/software/#scSeqComm. Submitted software version and test data are available in Zenodo, at https://dx.doi.org/10.5281/zenodo.5833298. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Giacomo Baruzzo
  - Department of Information Engineering, University of Padova, Padova, Italy
- Giulia Cesaro
  - Department of Information Engineering, University of Padova, Padova, Italy
- Barbara Di Camillo
  - Department of Information Engineering, University of Padova, Padova, Italy
  - Department of Comparative Biomedicine and Food Science, University of Padova, Padova, Italy
  - CRIBI Innovative Biotechnology Center, University of Padova, Padova, Italy
19
NMR in Metabolomics: From Conventional Statistics to Machine Learning and Neural Network Approaches. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12062824] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/17/2022]
Abstract
NMR measurements combined with chemometrics provide a great amount of information for the identification of potential biomarkers responsible for a precise metabolic pathway. These kinds of data are useful in different fields, ranging from food to biomedical fields, including health science. The investigation of the whole set of metabolites in a sample, representing its fingerprint in the considered condition, is known as metabolomics and may take advantage of different statistical tools. The new frontier is to adopt self-learning techniques to enhance clustering or classification actions that can improve the predictive power over large amounts of data. Although machine learning is already employed in metabolomics, deep learning and artificial neural network approaches have only recently been applied successfully. In this work, we give an overview of the statistical approaches underlying the wide range of opportunities that machine learning and neural networks offer for accurate metabolite assignment and quantification. Various current challenges are discussed, such as proper metabolomics, deep learning architectures and model accuracy.
20
Laganà A. The Architecture of a Precision Oncology Platform. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2022; 1361:1-22. [DOI: 10.1007/978-3-030-91836-1_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
21
Toward modeling metabolic state from single-cell transcriptomics. Mol Metab 2021; 57:101396. [PMID: 34785394 PMCID: PMC8829761 DOI: 10.1016/j.molmet.2021.101396] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 08/05/2021] [Revised: 10/21/2021] [Accepted: 11/09/2021] [Indexed: 12/31/2022] Open
Abstract
Background: Single-cell metabolic studies bring new insights into cellular function, which can often not be captured on other omics layers. Metabolic information has wide applicability, such as for the study of cellular heterogeneity or for the understanding of drug mechanisms and biomarker development. However, metabolic measurements on single-cell level are limited by insufficient scalability and sensitivity, as well as resource intensiveness, and are currently not possible in parallel with measuring transcript state, commonly used to identify cell types. Nevertheless, because omics layers are strongly intertwined, it is possible to make metabolic predictions based on measured data of more easily measurable omics layers together with prior metabolic network knowledge. Scope of Review: We summarize the current state of single-cell metabolic measurement and modeling approaches, motivating the use of computational techniques. We review three main classes of computational methods used for prediction of single-cell metabolism: pathway-level analysis, constraint-based modeling, and kinetic modeling. We describe the unique challenges arising when transitioning from bulk to single-cell modeling. Finally, we propose potential model extensions and computational methods that could be leveraged to achieve these goals. Major Conclusions: Single-cell metabolic modeling is a rising field that provides a new perspective for understanding cellular functions. The presented modeling approaches vary in terms of input requirements and assumptions, scalability, modeled metabolic layers, and newly gained insights. We believe that the use of prior metabolic knowledge will lead to more robust predictions and will pave the way for mechanistic and interpretable machine-learning models. Single-cell RNA sequencing and prior metabolic knowledge enable metabolic predictions. When compared to bulk, single-cell modeling is linked to unique insights and challenges.
Computational modelling approaches differ in applicability and newly provided insights. The use of prior metabolic knowledge paves the way for mechanistic machine-learning.
22
Panossian A, Abdelfatah S, Efferth T. Network Pharmacology of Ginseng (Part II): The Differential Effects of Red Ginseng and Ginsenoside Rg5 in Cancer and Heart Diseases as Determined by Transcriptomics. Pharmaceuticals (Basel) 2021; 14:ph14101010. [PMID: 34681234 PMCID: PMC8540751 DOI: 10.3390/ph14101010] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/14/2021] [Revised: 09/27/2021] [Accepted: 09/29/2021] [Indexed: 01/08/2023] Open
Abstract
Panax ginseng C.A.Mey. is an adaptogenic plant traditionally used to enhance mental and physical capacities in cases of weakness, exhaustion, tiredness, or loss of concentration, and during recovery. According to ancient records, red ginseng root preparations enhance longevity with long-term intake. Recent pharmacokinetic studies of ginsenosides in humans and our in vitro study in neuronal cells suggest that ginsenosides are effective when their levels in blood are low, at concentrations from 10⁻⁶ to 10⁻¹⁸ M. In the present study, we compared the effects of red ginseng root preparation HRG80™ (HRG) at concentrations from 0.01 to 10,000 ng/mL with effects of white ginseng (WG) and purified ginsenosides Rb1, Rg3, Rg5 and Rk1 on gene expression in isolated hippocampal neurons. The aim of this study was to predict the effects of differentially expressed genes on cellular and physiological functions in organismal disorders and diseases. Gene expression profiling was performed by transcriptome-wide mRNA microarray analyses in murine HT22 cells after treatment with ginseng preparations. Ingenuity pathway downstream/upstream analysis (IPA) was performed with datasets of significantly up- or downregulated genes, and expected effects on cellular function and disease were identified by IPA software. Ginsenosides Rb1, Rg3, Rg5, and Rk1 have substantially varied effects on gene expression profiles (signatures) and are different from signatures of HRG and WG. Furthermore, the signature of HRG changes significantly with dilution from 10,000 to 0.01 ng/mL. Network pharmacological analyses of gene expression profiles showed that HRG exhibits predictable positive effects in neuroinflammation, senescence, apoptosis, and immune response, suggesting beneficial soft-acting effects in cancer, gastrointestinal, and endocrine systems diseases and disorders in a wide range of low concentrations in blood.
Affiliation(s)
- Sara Abdelfatah
  - Department of Pharmaceutical Biology, Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg University, 55099 Mainz, Germany
- Thomas Efferth
  - Department of Pharmaceutical Biology, Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg University, 55099 Mainz, Germany
  - Correspondence: (A.P.); (T.E.)
23
Panossian A, Abdelfatah S, Efferth T. Network Pharmacology of Red Ginseng (Part I): Effects of Ginsenoside Rg5 at Physiological and Sub-Physiological Concentrations. Pharmaceuticals (Basel) 2021; 14:ph14100999. [PMID: 34681222 PMCID: PMC8537973 DOI: 10.3390/ph14100999] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/14/2021] [Accepted: 09/27/2021] [Indexed: 01/01/2023] Open
Abstract
Numerous in vitro studies on isolated cells have been conducted to uncover the molecular mechanisms of action of Panax ginseng Meyer root extracts and purified ginsenosides. However, the concentrations of ginsenosides and the extracts used in these studies were much higher than those detected in pharmacokinetic studies in humans and animals orally administered with ginseng preparations at therapeutic doses. Our study aimed to assess: (a) the effects of ginsenoside Rg5, the major “rare” ginsenoside of Red Ginseng, on gene expression in the murine neuronal cell line HT22 in a wide range of concentrations, from 10⁻⁴ to 10⁻¹⁸ M, and (b) the effects of differentially expressed genes on cellular and physiological functions in organismal disorders and diseases. Gene expression profiling was performed by transcriptome-wide mRNA microarray analyses in HT22 cells after treatment with ginsenoside Rg5. Ginsenoside Rg5 exhibits soft-acting effects on gene expression of neuronal cells in a wide range of physiological concentrations and a strong reversal impact at high (toxic) concentration: significant up- or downregulation of expression of about 300 genes at concentrations from 10⁻⁶ M to 10⁻¹⁸ M, and a dramatic increase in both the number of differentially expressed target genes (up to 1670) and the extent of their expression (fold changes compared to unexposed cells) at a toxic concentration of 10⁻⁴ M. Network pharmacology analyses of gene expression profiles using ingenuity pathway analysis (IPA) software showed that at low physiological concentrations, ginsenoside Rg5 has the potential to activate the biosynthesis of cholesterol and to exhibit predictable effects in senescence, neuroinflammation, apoptosis, and immune response, suggesting soft-acting, beneficial effects on organismal death, movement disorders, and cancer.
Affiliation(s)
- Sara Abdelfatah
  - Department of Pharmaceutical Biology, Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg University, 55131 Mainz, Germany
- Thomas Efferth
  - Department of Pharmaceutical Biology, Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg University, 55131 Mainz, Germany
  - Correspondence: (A.P.); (T.E.)
24
Tan YJ, Lee YT, Mancera RL, Oon CE. BZD9L1 sirtuin inhibitor: Identification of key molecular targets and their biological functions in HCT 116 colorectal cancer cells. Life Sci 2021; 284:119747. [PMID: 34171380 DOI: 10.1016/j.lfs.2021.119747] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/18/2020] [Revised: 05/22/2021] [Accepted: 06/11/2021] [Indexed: 02/07/2023]
Abstract
BZD9L1 was previously described as a SIRT1/2 inhibitor with anti-cancer activities in colorectal cancer (CRC), either as a standalone chemotherapy or in combination with 5-fluorouracil. BZD9L1 was reported to induce apoptosis in CRC cells; however, the network of intracellular pathways and crosstalk between molecular players mediated by BZD9L1 is not fully understood. This study aimed to uncover the mechanisms involved in BZD9L1-mediated cytotoxicity based on previous and new findings for the prediction and identification of related pathways and key molecular players. BZD9L1-regulated candidate targets (RCTs) were identified using a range of molecular, cell-based and biochemical techniques on the HCT 116 cell line. BZD9L1 regulated major cancer pathways including Notch, p53, cell cycle, NFκB, Myc/MAX, and MAPK/ERK signalling pathways. BZD9L1 also induced reactive oxygen species (ROS), regulated apoptosis-related proteins, and altered cell polarity and adhesion profiles. In silico analyses revealed that most RCTs were interconnected, and were involved in the modulation of catalytic activity, metabolism and transcription regulation, response to cytokines, and apoptosis signalling pathways. These RCTs were implicated in p53-dependent apoptosis pathway. This study provides the first assessment of possible associations of molecular players underlying the cytotoxic activity of BZD9L1, and establishes the links between RCTs and apoptosis through the p53 pathway.
Affiliation(s)
- Yi Jer Tan
  - Institute for Research in Molecular Medicine (INFORMM), Universiti Sains Malaysia, Penang 11800, Malaysia
  - Curtin Medical School, Curtin Health Innovation Research Institute (CHIRI) and Curtin Institute for Computation, Curtin University, GPO Box U1987, Perth, WA 6845, Australia
- Yeuan Ting Lee
  - Institute for Research in Molecular Medicine (INFORMM), Universiti Sains Malaysia, Penang 11800, Malaysia
- Ricardo L Mancera
  - Curtin Medical School, Curtin Health Innovation Research Institute (CHIRI) and Curtin Institute for Computation, Curtin University, GPO Box U1987, Perth, WA 6845, Australia
- Chern Ein Oon
  - Institute for Research in Molecular Medicine (INFORMM), Universiti Sains Malaysia, Penang 11800, Malaysia
25
Koh AEH, Alsaeedi HA, Rashid MBA, Lam C, Harun MHN, Ng MH, Mohd Isa H, Then KY, Bastion MLC, Farhana A, Khursheed Alam M, Subbiah SK, Mok PL. Transplanted Erythropoietin-Expressing Mesenchymal Stem Cells Promote Pro-survival Gene Expression and Protect Photoreceptors From Sodium Iodate-Induced Cytotoxicity in a Retinal Degeneration Model. Front Cell Dev Biol 2021; 9:652017. [PMID: 33987180 PMCID: PMC8111290 DOI: 10.3389/fcell.2021.652017] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/11/2021] [Accepted: 03/29/2021] [Indexed: 12/18/2022] Open
Abstract
Mesenchymal stem cells (MSC) are highly regarded as a potential treatment for retinal degenerative disorders like retinitis pigmentosa and age-related macular degeneration. However, donor cell heterogeneity and inconsistent protocols for transplantation have led to varied outcomes in clinical trials. We previously showed that genetically-modifying MSCs to express erythropoietin (MSCEPO) improved its regenerative capabilities in vitro. Hence, in this study, we sought to prove its potential in vivo by transplanting MSCsEPO in a rat retinal degeneration model and analyzing its retinal transcriptome using RNA-Seq. Firstly, MSCsEPO were cultured and expanded before being intravitreally transplanted into the sodium iodate-induced model. After the procedure, electroretinography (ERG) was performed bi-weekly for 30 days. Histological analyses were performed after the ERG assessment. The retina was then harvested for RNA extraction. After mRNA-enrichment and library preparation, paired-end RNA-Seq was performed. Salmon and DESeq2 were used to process the output files. The generated dataset was then analyzed using over-representation (ORA), functional enrichment (GSEA), and pathway topology analysis tools (SPIA) to identify enrichment of key pathways in the experimental groups. The results showed that the MSCEPO-treated group had detectable ERG waves (P <0.05), which were indicative of successful phototransduction. The stem cells were also successfully detected by immunohistochemistry 30 days after intravitreal transplantation. An initial over-representation analysis revealed a snapshot of immune-related pathways in all the groups but was mainly overexpressed in the MSC group. A subsequent GSEA and SPIA analysis later revealed enrichment in a large number of biological processes including phototransduction, regeneration, and cell death (Padj <0.05). Based on these pathways, a set of pro-survival gene expressions were extracted and tabulated. 
This study provided an in-depth transcriptomic analysis on the MSCEPO-treated retinal degeneration model as well as a profile of pro-survival genes that can be used as candidates for further genetic enhancement studies on stem cells.
Affiliation(s)
- Avin Ee-Hwan Koh
- Department of Biomedical Science, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, Serdang, Malaysia
| | - Hiba Amer Alsaeedi
- Department of Biomedical Science, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, Serdang, Malaysia
| | - Munirah Binti Abd Rashid
- Department of Ophthalmology, Faculty of Medicine, Universiti Kebangsaan Malaysia Medical Centre, Kuala Lumpur, Malaysia
| | - Chenshen Lam
- Department of Ophthalmology, Faculty of Medicine, Universiti Kebangsaan Malaysia Medical Centre, Kuala Lumpur, Malaysia
| | - Mohd Hairul Nizam Harun
- Department of Ophthalmology, Faculty of Medicine, Universiti Kebangsaan Malaysia Medical Centre, Kuala Lumpur, Malaysia
| | - Min Hwei Ng
- Tissue Engineering Centre, Universiti Kebangsaan Malaysia Medical Center, Kuala Lumpur, Malaysia
| | - Hazlita Mohd Isa
- Department of Ophthalmology, Faculty of Medicine, Universiti Kebangsaan Malaysia Medical Centre, Kuala Lumpur, Malaysia
| | - Kong Yong Then
- Department of Ophthalmology, Faculty of Medicine, Universiti Kebangsaan Malaysia Medical Centre, Kuala Lumpur, Malaysia
| | - Mae-Lynn Catherine Bastion
- Department of Ophthalmology, Faculty of Medicine, Universiti Kebangsaan Malaysia Medical Centre, Kuala Lumpur, Malaysia
| | - Aisha Farhana
- Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, Jouf University, Sakaka, Saudi Arabia
| | | | - Suresh Kumar Subbiah
- Department of Medical Microbiology and Parasitology, Universiti Putra Malaysia, Serdang, Malaysia.,Genetics and Regenerative Medicine Research Group, Universiti Putra Malaysia, Serdang, Malaysia.,Department of Biotechnology, Bharath Institute of Higher Education and Research, Chennai, India
| | - Pooi Ling Mok
- Department of Biomedical Science, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, Serdang, Malaysia.,Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, Jouf University, Sakaka, Saudi Arabia.,Genetics and Regenerative Medicine Research Group, Universiti Putra Malaysia, Serdang, Malaysia.,Department of Biotechnology, Bharath Institute of Higher Education and Research, Chennai, India
| |
Collapse
|
26
|
Somekh J. Model-based pathway enrichment analysis applied to the TGF-beta regulation of autophagy in autism. J Biomed Inform 2021; 118:103781. [PMID: 33839306 DOI: 10.1016/j.jbi.2021.103781] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/03/2020] [Revised: 03/23/2021] [Accepted: 04/05/2021] [Indexed: 10/21/2022]
Abstract
To differentiate between conditions of health and disease, current pathway enrichment analysis methods detect the differential expression of distinct biological pathways. System-level model-driven approaches, however, are lacking. Here we present a new methodology that uses a dynamic model to suggest a unified subsystem to better differentiate between diseased and healthy conditions. Our methodology includes the following steps: 1) detecting connections between relevant differentially expressed pathways; 2) construction of a unified in silico model, a stochastic Petri net model that links these distinct pathways; 3) model execution to predict subsystem activation; and 4) enrichment analysis of the predicted subsystem. We apply our approach to the TGF-beta regulation of the autophagy system implicated in autism. Our model was constructed manually, based on the literature, to predict, using model simulation, the TGF-beta-to-autophagy active subsystem and downstream gene expression changes associated with TGF-beta, which go beyond the individual findings derived from literature. We evaluated the in silico predicted subsystem and found it to be co-expressed in the normative whole blood human gene expression data. Finally, we show our subsystem's gene set to be significantly differentially expressed in two independent datasets of blood samples of ASD (autistic spectrum disorders) individuals as opposed to controls. Our study demonstrates that dynamic pathway unification can define a new refined subsystem that can significantly differentiate between disease conditions.
Affiliation(s)
- Judith Somekh
  - Department of Information Systems, University of Haifa, Haifa 3498838, Israel.

27
Krokidis MG, Exarchos TP, Vlamos P. Data-driven biomarker analysis using computational omics approaches to assess neurodegenerative disease progression. Math Biosci Eng 2021; 18:1813-1832. [PMID: 33757212 DOI: 10.3934/mbe.2021094] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
The complexity of biological systems suggests that current definitions of molecular dysfunctions are essential distinctions of a complex phenotype. This is well seen in neurodegenerative diseases (ND), such as Alzheimer's disease (AD) and Parkinson's disease (PD), multi-factorial pathologies characterized by high heterogeneity. These challenges make it necessary to understand the effectiveness of candidate biomarkers for early diagnosis, as well as to obtain a comprehensive mapping of how selective treatment alters the progression of the disorder. A large number of computational methods have been developed to explain network-based approaches by integrating individual components for modeling a complex system. In this review, high-throughput omics methodologies are presented for the identification of potent biomarkers associated with AD and PD pathogenesis as well as for monitoring the response of dysfunctional molecular pathways incorporating multilevel clinical information. In addition, principles for efficient data analysis pipelines are being discussed that can help address current limitations during the experimental process by increasing the reproducibility of benchmarking studies.
Affiliation(s)
- Marios G Krokidis
  - Bioinformatics and Human Electrophysiology Laboratory, Department of Informatics, Ionian University, Greece
- Themis P Exarchos
  - Bioinformatics and Human Electrophysiology Laboratory, Department of Informatics, Ionian University, Greece
- Panagiotis Vlamos
  - Bioinformatics and Human Electrophysiology Laboratory, Department of Informatics, Ionian University, Greece

28
Thistlethwaite LR, Petrosyan V, Li X, Miller MJ, Elsea SH, Milosavljevic A. CTD: An information-theoretic algorithm to interpret sets of metabolomic and transcriptomic perturbations in the context of graphical models. PLoS Comput Biol 2021; 17:e1008550. [PMID: 33513132 PMCID: PMC7875364 DOI: 10.1371/journal.pcbi.1008550] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/04/2020] [Revised: 02/10/2021] [Accepted: 11/16/2020] [Indexed: 01/17/2023] Open
Abstract
We consider the following general family of algorithmic problems that arises in transcriptomics, metabolomics and other fields: given a weighted graph G and a subset of its nodes S, find subsets of S that show significant connectedness within G. A specific solution to this problem may be defined by devising a scoring function, the Maximum Clique problem being a classic example, where S includes all nodes in G and where the score is defined by the size of the largest subset of S fully connected within G. Major practical obstacles for the plethora of algorithms addressing this type of problem include computational efficiency and, particularly for more complex scores which take edge weights into account, the computational cost of permutation testing, a statistical procedure required to obtain a bound on the p-value for a connectedness score. To address these problems, we developed CTD, "Connect the Dots", a fast algorithm based on data compression that detects highly connected subsets within S. CTD provides information-theoretic upper bounds on p-values when S contains a small fraction of nodes in G without requiring computationally costly permutation testing. We apply the CTD algorithm to interpret multi-metabolite perturbations due to inborn errors of metabolism and multi-transcript perturbations associated with breast cancer in the context of disease-specific Gaussian Markov Random Field networks learned directly from respective molecular profiling data.
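The permutation testing that the abstract describes as computationally costly can be illustrated with a short Python sketch. This is not the CTD algorithm: it shows the baseline procedure that CTD is designed to avoid. The connectedness score (sum of edge weights inside the subset), the function names, and the toy graph are all assumptions made for illustration.

```python
import random
from itertools import combinations


def connectedness(weights, nodes):
    """Sum of edge weights among `nodes` (an illustrative score, not CTD's)."""
    return sum(
        weights.get(frozenset(pair), 0.0)
        for pair in combinations(sorted(nodes), 2)
    )


def permutation_p_value(weights, all_nodes, subset, n_perm=200, seed=0):
    """Empirical p-value: how often a random subset of the same size
    is at least as connected as the observed subset."""
    rng = random.Random(seed)
    observed = connectedness(weights, subset)
    pool = sorted(all_nodes)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(pool, len(subset))
        if connectedness(weights, sample) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy graph: 20 nodes, with a unit-weight clique on nodes 0..3 only.
toy_weights = {frozenset(p): 1.0 for p in combinations(range(4), 2)}
toy_score, toy_p = permutation_p_value(toy_weights, range(20), {0, 1, 2, 3})
```

The smallest p-value this procedure can report is 1 / (n_perm + 1), so resolving very small p-values requires a correspondingly large number of permutations. That cost is what motivates a permutation-free bound of the kind the abstract describes.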
Affiliation(s)
- Lillian R. Thistlethwaite
  - Quantitative and Computational Biosciences Program, Baylor College of Medicine, Houston, Texas, United States of America
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Varduhi Petrosyan
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Xiqi Li
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Marcus J. Miller
  - Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, Indiana, United States of America
- Sarah H. Elsea
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Aleksandar Milosavljevic
  - Quantitative and Computational Biosciences Program, Baylor College of Medicine, Houston, Texas, United States of America
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America

29
Agapito G, Pastrello C, Jurisica I. Comprehensive pathway enrichment analysis workflows: COVID-19 case study. Brief Bioinform 2020. [PMCID: PMC7799312 DOI: 10.1093/bib/bbaa377] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/11/2022] Open
Abstract
The coronavirus disease 2019 (COVID-19) outbreak due to the novel coronavirus named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been classified as a pandemic disease by the World Health Organization on the 12th March 2020. This world-wide crisis created an urgent need to identify effective countermeasures against SARS-CoV-2. In silico methods, artificial intelligence and bioinformatics analysis pipelines provide effective and useful infrastructure for comprehensive interrogation and interpretation of available data, helping to find biomarkers, explainable models and eventually cures. One class of such tools, pathway enrichment analysis (PEA) methods, helps researchers to find possible key targets present in biological pathways of host cells that are targeted by SARS-CoV-2. Since many software tools are available, it is not easy for non-computational users to choose the best one for their needs. In this paper, we highlight how to choose the most suitable PEA method based on the type of COVID-19 data to analyze. We aim to provide a comprehensive overview of PEA techniques and the tools that implement them.
Affiliation(s)
- Chiara Pastrello
  - Krembil Research Institute, University Health Network, Toronto, Canada
- Igor Jurisica
  - Departments of Medical Biophysics and Computer Science, University of Toronto, Canada

30
Giannuzzi D, Biolatti B, Longato E, Divari S, Starvaggi Cucuzza L, Pregel P, Scaglione FE, Rinaldi A, Chiesa LM, Cannizzo FT. Application of RNA-sequencing to identify biomarkers in broiler chickens prophylactic administered with antimicrobial agents. Animal 2020; 15:100113. [PMID: 33573988 DOI: 10.1016/j.animal.2020.100113] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/15/2020] [Revised: 10/08/2020] [Accepted: 10/15/2020] [Indexed: 12/19/2022] Open
Abstract
Antimicrobial (AM) resistance is largely acknowledged as one of the biggest global health and food safety challenges and the overuse of AMs is known to generate resistance in bacteria that may affect both animals and humans. Poultry meat is the second most-produced meat in the European Union and in recent years consumers are becoming more concerned about food safety, traceability, and animal welfare in poultry rearing system, increasingly requiring meats from broilers reared without AMs. In the present study, we performed RNA sequencing to analyze 64 liver and 54 muscle transcriptomic profiles in broilers reared without treatment or treated with different classes of AMs. Moreover, we validated the most differentially expressed genes among the treated groups to detect putative novel biomarkers able to discriminate meats of broilers reared without AMs. The PDK4, IGFBP1, and RHOB genes were identified as putative novel hepatic biomarkers, discriminating broilers treated with AMs compared to broilers reared without treatments. The whole transcriptome changes revealed the liver as a valuable target organ for AM administration screening. In addition, our results suggest a leading effect of the coccidiostat when associated with AMs, influencing several biological processes. Our study showed that RNA sequencing is a powerful and valuable method to detect aberrant regulated genes and to identify biomarker candidates for AM misuse detection in farm animals. Further validation on larger sample size and a wider spectrum of AMs are needed to confirm the viability of the aforementioned biomarkers in poultry population.
Affiliation(s)
- D Giannuzzi
  - Department of Agronomy, Food, Natural Resources, Animals and Environment, University of Padua, Legnaro, I-35020 Padua, Italy.
- B Biolatti
  - Department of Veterinary Science, University of Turin, Grugliasco, I-10095 Turin, Italy
- E Longato
  - Department of Veterinary Science, University of Turin, Grugliasco, I-10095 Turin, Italy
- S Divari
  - Department of Veterinary Science, University of Turin, Grugliasco, I-10095 Turin, Italy
- L Starvaggi Cucuzza
  - Department of Veterinary Science, University of Turin, Grugliasco, I-10095 Turin, Italy
- P Pregel
  - Department of Veterinary Science, University of Turin, Grugliasco, I-10095 Turin, Italy
- F E Scaglione
  - Department of Veterinary Science, University of Turin, Grugliasco, I-10095 Turin, Italy
- A Rinaldi
  - Faculty of Biomedical Sciences, Università della Svizzera italiana (USI), Institute of Oncology Research (IOR), CH-6500 Bellinzona, Switzerland
- L M Chiesa
  - Department of Veterinary Science and Public Health, University of Milan, I-20133 Milan, Italy
- F T Cannizzo
  - Department of Veterinary Science, University of Turin, Grugliasco, I-10095 Turin, Italy

31
Vrahatis AG, Kotsireas IS, Vlamos P. Detecting Common Pathways and Key Molecules of Neurodegenerative Diseases from the Topology of Molecular Networks. Adv Exp Med Biol 2020; 1194:409-421. [PMID: 32468556 DOI: 10.1007/978-3-030-32622-7_38] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
Abstract
Motivation: Neurodegenerative diseases (NDs), including amyotrophic lateral sclerosis, Parkinson's disease, Alzheimer's disease, and Huntington's disease, occur as a result of neurodegenerative processes. Thus, it has been increasingly appreciated that many neurodegenerative conditions overlap at multiple levels. However, traditional clinicopathological correlation approaches to better classify a disease have met with limited success. Discovering this overlap offers hope for therapeutic advances that could ameliorate many NDs simultaneously. In parallel, in the last decade, systems biology approaches have become a reliable choice in complex disease analysis for gaining more delicate biological insights and have enabled the comprehension of the higher-order functions of biological systems. Results: Toward this orientation, we developed a systems biology approach for the identification of common links and pathways of NDs, based on well-established and novel topological and functional measures. For this purpose, a molecular pathway network was constructed, using molecular interactions and relations of four main neurodegenerative diseases (Alzheimer's disease, Parkinson's disease, amyotrophic lateral sclerosis, and Huntington's disease). Our analysis captured the overlapping subregions forming molecular subpathways fully enriched in these four NDs. Also, it exported molecules that act as bridges, hubs, and key players for neurodegeneration concerning either their topology or their functional role. Conclusion: Understanding these common links and central topologies from the perspective of systems biology and network theory provides greater insights into the complex neurodegeneration processes.
Affiliation(s)
- Ilias S Kotsireas
  - Department of Physics and Computer Science, Wilfrid Laurier University, Waterloo, Canada

32
Saberian N, Shafi A, Peyvandipour A, Draghici S. MAGPEL: an autoMated pipeline for inferring vAriant-driven Gene PanEls from the full-length biomedical literature. Sci Rep 2020; 10:12365. [PMID: 32703994 PMCID: PMC7378213 DOI: 10.1038/s41598-020-68649-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/13/2019] [Accepted: 06/17/2020] [Indexed: 11/09/2022] Open
Abstract
In spite of the efforts in developing and maintaining accurate variant databases, a large number of disease-associated variants are still hidden in the biomedical literature. Curation of the biomedical literature in an effort to extract this information is a challenging task due to: (i) the complexity of natural language processing, (ii) inconsistent use of standard recommendations for variant description, and (iii) the lack of clarity and consistency in describing the variant-genotype-phenotype associations in the biomedical literature. In this article, we employ text mining and word cloud analysis techniques to address these challenges. The proposed framework extracts the variant-gene-disease associations from the full-length biomedical literature and designs an evidence-based variant-driven gene panel for a given condition. We validate the identified genes by showing their diagnostic abilities to predict the patients' clinical outcome on several independent validation cohorts. As representative examples, we present our results for acute myeloid leukemia (AML), breast cancer and prostate cancer. We compare these panels with other variant-driven gene panels obtained from Clinvar, Mastermind and others from literature, as well as with a panel identified with a classical differentially expressed genes (DEGs) approach. The results show that the panels obtained by the proposed framework yield better results than the other gene panels currently available in the literature.
Affiliation(s)
- Nafiseh Saberian
  - Department of Computer Science, Wayne State University, Detroit, MI, USA
- Adib Shafi
  - Department of Computer Science, Wayne State University, Detroit, MI, USA
- Azam Peyvandipour
  - Department of Computer Science, Wayne State University, Detroit, MI, USA
- Sorin Draghici
  - Department of Computer Science, Wayne State University, Detroit, MI, USA.
  - Department of Obstetrics and Gynecology, Wayne State University, Detroit, MI, USA.

33
Rotroff DM. A Bioinformatics Crash Course for Interpreting Genomics Data. Chest 2020; 158:S113-S123. [PMID: 32658646 PMCID: PMC8176646 DOI: 10.1016/j.chest.2020.03.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2019] [Revised: 11/11/2019] [Accepted: 03/09/2020] [Indexed: 10/23/2022] Open
Abstract
Reductions in genotyping costs and improvements in computational power have made conducting genome-wide association studies (GWAS) standard practice for many complex diseases. GWAS is the assessment of genetic variants across the genome of many individuals to determine which, if any, genetic variants are associated with a specific trait. As with any analysis, there are evolving best practices that should be followed to ensure scientific rigor and reliability in the conclusions. This article presents a brief summary for many of the key bioinformatics considerations when either planning or evaluating GWAS. This review is meant to serve as a guide to those without deep expertise in bioinformatics and GWAS and give them tools to critically evaluate this popular approach to investigating complex diseases. In addition, a checklist is provided that can be used by investigators to evaluate whether a GWAS has appropriately accounted for the many potential sources of bias and generally followed current best practices.
Affiliation(s)
- Daniel M Rotroff
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH.

34
Maleki F, Ovens K, Hogan DJ, Kusalik AJ. Gene Set Analysis: Challenges, Opportunities, and Future Research. Front Genet 2020; 11:654. [PMID: 32695141 PMCID: PMC7339292 DOI: 10.3389/fgene.2020.00654] [Citation(s) in RCA: 93] [Impact Index Per Article: 23.3] [Received: 02/01/2020] [Accepted: 05/29/2020] [Indexed: 12/14/2022] Open
Abstract
Gene set analysis methods are widely used to provide insight into high-throughput gene expression data. There are many gene set analysis methods available. These methods rely on various assumptions and have different requirements, strengths and weaknesses. In this paper, we classify gene set analysis methods based on their components, describe the underlying requirements and assumptions for each class, and provide directions for future research in developing and evaluating gene set analysis methods.

35
Jo K, Santos-Buitrago B, Kim M, Rhee S, Talcott C, Kim S. Logic-based analysis of gene expression data predicts association between TNF, TGFB1 and EGF pathways in basal-like breast cancer. Methods 2020; 179:89-100. [PMID: 32445696 DOI: 10.1016/j.ymeth.2020.05.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/13/2020] [Revised: 04/30/2020] [Accepted: 05/13/2020] [Indexed: 12/16/2022] Open
Abstract
For breast cancer, clinically important subtypes are well characterized at the molecular level in terms of gene expression profiles. In addition, signaling pathways in breast cancer have been extensively studied as therapeutic targets due to their roles in tumor growth and metastasis. However, it is challenging to put signaling pathways and gene expression profiles together to characterize biological mechanisms of breast cancer subtypes since many signaling events result from post-translational modifications, rather than gene expression differences. We designed a logic-based computational framework to explain the differences in gene expression profiles among breast cancer subtypes using Pathway Logic and transcriptional network information. Pathway Logic is a rewriting-logic-based formal system for modeling biological pathways including post-translational modifications. Our method demonstrated its utility by constructing subtype-specific path from key receptors (TNFR, TGFBR1 and EGFR) to key transcription factor (TF) regulators (RELA, ATF2, SMAD3 and ELK1) and identifying potential association between pathways via TFs in basal-specific paths, which could provide a novel insight on aggressive breast cancer subtypes. Codes and results are available at http://epigenomics.snu.ac.kr/PL/.
Affiliation(s)
- Kyuri Jo
  - Department of Computer Engineering, Chungbuk National University, Cheongju, Republic of Korea
- Beatriz Santos-Buitrago
  - Department of Computer Science and Engineering, Seoul National University, Seoul, Republic of Korea
- Minsu Kim
  - Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Sungmin Rhee
  - Department of Computer Science and Engineering, Seoul National University, Seoul, Republic of Korea
- Sun Kim
  - Department of Computer Science and Engineering, Seoul National University, Seoul, Republic of Korea
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
  - Institute of Engineering Research, Seoul National University, Seoul, Republic of Korea
  - Bioinformatics Institute, Seoul National University, Seoul, Republic of Korea.

36
Eicher T, Kinnebrew G, Patt A, Spencer K, Ying K, Ma Q, Machiraju R, Mathé EA. Metabolomics and Multi-Omics Integration: A Survey of Computational Methods and Resources. Metabolites 2020; 10:E202. [PMID: 32429287 PMCID: PMC7281435 DOI: 10.3390/metabo10050202] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Received: 04/15/2020] [Revised: 05/07/2020] [Accepted: 05/13/2020] [Indexed: 02/06/2023] Open
Abstract
As researchers are increasingly able to collect data on a large scale from multiple clinical and omics modalities, multi-omics integration is becoming a critical component of metabolomics research. This introduces a need for increased understanding by the metabolomics researcher of computational and statistical analysis methods relevant to multi-omics studies. In this review, we discuss common types of analyses performed in multi-omics studies and the computational and statistical methods that can be used for each type of analysis. We pinpoint the caveats and considerations for analysis methods, including required parameters, sample size and data distribution requirements, sources of a priori knowledge, and techniques for the evaluation of model accuracy. Finally, for the types of analyses discussed, we provide examples of the applications of corresponding methods to clinical and basic research. We intend that our review may be used as a guide for metabolomics researchers to choose effective techniques for multi-omics analyses relevant to their field of study.
Affiliation(s)
- Tara Eicher
  - Biomedical Informatics Department, The Ohio State University College of Medicine, Columbus, OH 43210, USA
  - Computer Science and Engineering Department, The Ohio State University College of Engineering, Columbus, OH 43210, USA
- Garrett Kinnebrew
  - Biomedical Informatics Department, The Ohio State University College of Medicine, Columbus, OH 43210, USA
  - Comprehensive Cancer Center, The Ohio State University and James Cancer Hospital, Columbus, OH 43210, USA
  - Bioinformatics Shared Resource Group, The Ohio State University, Columbus, OH 43210, USA
- Andrew Patt
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences, NIH, 9800 Medical Center Dr., Rockville, MD, 20892, USA
  - Biomedical Sciences Graduate Program, The Ohio State University, Columbus, OH 43210, USA
- Kyle Spencer
  - Biomedical Informatics Department, The Ohio State University College of Medicine, Columbus, OH 43210, USA
  - Biomedical Sciences Graduate Program, The Ohio State University, Columbus, OH 43210, USA
  - Nationwide Children’s Research Hospital, Columbus, OH 43210, USA
- Kevin Ying
  - Comprehensive Cancer Center, The Ohio State University and James Cancer Hospital, Columbus, OH 43210, USA
  - Molecular, Cellular and Developmental Biology Program, The Ohio State University, Columbus, OH 43210, USA
- Qin Ma
  - Biomedical Informatics Department, The Ohio State University College of Medicine, Columbus, OH 43210, USA
- Raghu Machiraju
  - Biomedical Informatics Department, The Ohio State University College of Medicine, Columbus, OH 43210, USA
  - Computer Science and Engineering Department, The Ohio State University College of Engineering, Columbus, OH 43210, USA
  - Department of Pathology, Wexner Medical Center, The Ohio State University, Columbus, OH 43210, USA
  - Translational Data Analytics Institute, The Ohio State University, Columbus, OH 43210, USA
- Ewy A. Mathé
  - Biomedical Informatics Department, The Ohio State University College of Medicine, Columbus, OH 43210, USA
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences, NIH, 9800 Medical Center Dr., Rockville, MD, 20892, USA

37
Giannuzzi D, Giudice L, Marconato L, Ferraresso S, Giugno R, Bertoni F, Aresu L. Integrated analysis of transcriptome, methylome and copy number aberrations data of marginal zone lymphoma and follicular lymphoma in dog. Vet Comp Oncol 2020; 18:645-655. [PMID: 32154977 DOI: 10.1111/vco.12588] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/10/2019] [Revised: 02/10/2020] [Accepted: 03/05/2020] [Indexed: 12/17/2022]
Abstract
Marginal zone lymphoma (MZL) and follicular lymphoma (FL) are classified as indolent B-cell lymphomas in dogs. Aside from the clinical and histopathological similarities with the human counterpart, the molecular pathogenesis remains unclear. We integrated transcriptome, genome-wide DNA methylation and copy number aberration analysis to provide insights on the pathogenesis of canine MZL (n = 5) and FL (n = 7), also comparing them with diffuse large B-cell lymphoma (DLBCL). Transcriptome profiling highlighted the presence of similar biological processes affecting both histotypes, including BCR and TLR signalling pathways. However, FLs showed an enrichment of E2F targets, whereas MZLs were characterized by MYC-driven transcriptional activation signatures. FLs showed a distinctive loss on chr1 containing CEACAM23 and 24, conversely MZLs presented multiple recurrent gains on chr13, where MYC is located. The distribution of methylation peaks was similar between the two histotypes. Integrating data from the three omics, FLs resulted clearly separated from MZLs and DLBCL dataset. MZLs showed the enrichment of FoxM1 network and TLR associated TICAM1-dependent IRFs activation pathway. However, no specific signatures differentiated MZLs from DLBCLs. In conclusion, our study presents the first comprehensive analysis of molecular and epigenetic pathogenesis of canine FL and MZL.
Affiliation(s)
- Diana Giannuzzi
- Department of Comparative Biomedicine and Food Science, University of Padua, Padua, Italy
- Luca Giudice
- Department of Computer Science, University of Verona, Verona, Italy
- Laura Marconato
- Department of Veterinary Medical Sciences, University of Bologna, Bologna, Italy
- Serena Ferraresso
- Department of Comparative Biomedicine and Food Science, University of Padua, Padua, Italy
- Rosalba Giugno
- Department of Computer Science, University of Verona, Verona, Italy
- Francesco Bertoni
- Università della Svizzera italiana (USI), Institute of Oncology Research (IOR), Bellinzona, Switzerland
- Luca Aresu
- Department of Veterinary Science, University of Turin, Turin, Italy
|
38
|
Zyla J, Marczyk M, Domaszewska T, Kaufmann SHE, Polanska J, Weiner J. Gene set enrichment for reproducible science: comparison of CERNO and eight other algorithms. Bioinformatics 2019; 35:5146-5154. [PMID: 31165139 PMCID: PMC6954644 DOI: 10.1093/bioinformatics/btz447] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 12/10/2018] [Revised: 05/08/2019] [Accepted: 06/10/2019] [Indexed: 01/12/2023] Open
Abstract
MOTIVATION Analysis of gene set (GS) enrichment is an essential part of functional omics studies. Here, we complement the established evaluation metrics of GS enrichment algorithms with a novel approach to assess the practical reproducibility of scientific results obtained from GS enrichment tests when applied to related data from different studies. RESULTS We evaluated eight established and one novel algorithm for reproducibility, sensitivity, prioritization, false positive rate and computational time. In addition to eight established algorithms, we also included Coincident Extreme Ranks in Numerical Observations (CERNO), a flexible and fast algorithm based on modified Fisher P-value integration. Using real-world datasets, we demonstrate that CERNO is robust to ranking metrics, as well as sample and GS size. CERNO had the highest reproducibility while remaining sensitive, specific and fast. In the overall ranking Pathway Analysis with Down-weighting of Overlapping Genes, CERNO and over-representation analysis performed best, while CERNO and GeneSetTest scored high in terms of reproducibility. AVAILABILITY AND IMPLEMENTATION tmod package implementing the CERNO algorithm is available from CRAN (cran.r-project.org/web/packages/tmod/index.html) and an online implementation can be found at http://tmod.online/. The datasets analyzed in this study are widely available in the KEGGdzPathwaysGEO, KEGGandMetacoreDzPathwaysGEO R package and GEO repository. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
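The "modified Fisher P-value integration" mentioned in this abstract can be illustrated with a short sketch. It assumes the commonly cited form of the CERNO statistic, F = -2 Σ ln(r_i / N) over the ranks r_i of a gene set's k members among N ranked genes, compared against a chi-square distribution with 2k degrees of freedom. The function name and inputs are hypothetical, and this is not the tmod implementation.

```python
import math


def cerno_statistic(ranks, n_genes):
    """Return (F, p_value) for one gene set.

    ranks   -- 1-based ranks of the gene set's members in the ordered gene list
    n_genes -- total number of ranked genes (N)
    """
    k = len(ranks)
    # Fisher-style combination, with rank / N playing the role of a p-value.
    f_stat = -2.0 * sum(math.log(r / n_genes) for r in ranks)

    # Chi-square survival function for an even number of degrees of freedom
    # (2k) has a closed form: exp(-x/2) * sum_{j<k} (x/2)^j / j!
    half = f_stat / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    p_value = math.exp(-half) * total
    return f_stat, p_value


if __name__ == "__main__":
    # A set concentrated near the top of the list versus one spread across it.
    print(cerno_statistic([1, 2, 3], 1000))
    print(cerno_statistic([200, 500, 800], 1000))
```

Because only ranks enter the statistic, the result does not depend on the scale of the underlying ranking metric, which is consistent with the robustness to ranking metrics reported in the abstract.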
Affiliation(s)
- Joanna Zyla
- Data Mining Group, Faculty of Automatic Control, Electronic and Computer Science, Institute of Automatic Control, Silesian University of Technology, Gliwice, Poland
- Department of Immunology, Max Planck Institute for Infection Biology, Berlin, Germany
- Michal Marczyk
- Data Mining Group, Faculty of Automatic Control, Electronic and Computer Science, Institute of Automatic Control, Silesian University of Technology, Gliwice, Poland
- Yale School of Medicine, Yale Cancer Center, New Haven, CT 06510, USA
- Teresa Domaszewska
- Department of Immunology, Max Planck Institute for Infection Biology, Berlin, Germany
- Stefan H E Kaufmann
- Department of Immunology, Max Planck Institute for Infection Biology, Berlin, Germany
- Joanna Polanska
- Data Mining Group, Faculty of Automatic Control, Electronic and Computer Science, Institute of Automatic Control, Silesian University of Technology, Gliwice, Poland
- January Weiner
- Department of Immunology, Max Planck Institute for Infection Biology, Berlin, Germany
|
39
|
Mubeen S, Hoyt CT, Gemünd A, Hofmann-Apitius M, Fröhlich H, Domingo-Fernández D. The Impact of Pathway Database Choice on Statistical Enrichment Analysis and Predictive Modeling. Front Genet 2019; 10:1203. [PMID: 31824580 PMCID: PMC6883970 DOI: 10.3389/fgene.2019.01203] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Received: 08/23/2019] [Accepted: 10/30/2019] [Indexed: 02/04/2023] Open
Abstract
Pathway-centric approaches are widely used to interpret and contextualize -omics data. However, databases contain different representations of the same biological pathway, which may lead to different results of statistical enrichment analysis and predictive models in the context of precision medicine. We have performed an in-depth benchmarking of the impact of pathway database choice on statistical enrichment analysis and predictive modeling. We analyzed five cancer datasets using three major pathway databases and developed an approach to merge several databases into a single integrative one: MPath. Our results show that equivalent pathways from different databases yield disparate results in statistical enrichment analysis. Moreover, we observed a significant dataset-dependent impact on the performance of machine learning models on different prediction tasks. In some cases, MPath significantly improved prediction performance and also reduced the variance of prediction performances. Furthermore, MPath yielded more consistent and biologically plausible results in statistical enrichment analyses. In summary, this benchmarking study demonstrates that pathway database choice can influence the results of statistical enrichment analysis and predictive modeling. Therefore, we recommend the use of multiple pathway databases or integrative ones.
Affiliation(s)
- Sarah Mubeen
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
- Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- Charles Tapley Hoyt
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
- Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- André Gemünd
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
- Martin Hofmann-Apitius
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
- Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- Holger Fröhlich
- Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- Daniel Domingo-Fernández
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
- Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
|
40
|
Ma J, Shojaie A, Michailidis G. A comparative study of topology-based pathway enrichment analysis methods. BMC Bioinformatics 2019; 20:546. [PMID: 31684881 PMCID: PMC6829999 DOI: 10.1186/s12859-019-3146-1] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Received: 09/17/2018] [Accepted: 10/02/2019] [Indexed: 02/01/2023] Open
Abstract
BACKGROUND Pathway enrichment is extensively used in the analysis of Omics data for gaining biological insights into the functional roles of pre-defined subsets of genes, proteins and metabolites. A large number of methods have been proposed in the literature for this task. The vast majority of these methods use as input expression levels of the biomolecules under study together with their membership in pathways of interest. The latest generation of pathway enrichment methods also leverages information on the topology of the underlying pathways, which, as evidence from their evaluation reveals, leads to improved sensitivity and specificity. Nevertheless, a systematic empirical comparison of such methods is still lacking, making selection of the most suitable method for a specific experimental setting challenging. This comparative study of nine network-based methods for pathway enrichment analysis aims to provide a systematic evaluation of their performance based on three real data sets with different numbers of features (genes/metabolites) and samples. RESULTS The findings highlight both methodological and empirical differences across the nine methods. In particular, certain methods assess pathway enrichment due to differences both across expression levels and in the strength of the interconnectedness of the members of the pathway, while others only leverage differential expression levels. In the more challenging setting involving a metabolomics data set, the results show that methods that utilize both pieces of information (with NetGSA being a prototypical one) exhibit superior statistical power in detecting pathway enrichment. CONCLUSION The analysis reveals that a number of methods perform equally well when testing large pathways, which is the case with genomic data. On the other hand, NetGSA, which takes into consideration both differential expression of the biomolecules in the pathway and changes in the topology, exhibits superior performance when testing small pathways, which is usually the case for metabolomics data.
Affiliation(s)
- Jing Ma
- Texas A&M University, Department of Statistics, College Station, 77840 USA
- Fred Hutchinson Cancer Research Center, Public Health Sciences Division, Seattle, 98107 USA
- Ali Shojaie
- University of Washington, Department of Biostatistics, Seattle, 98105 USA
|
41
|
Executable pathway analysis using ensemble discrete-state modeling for large-scale data. PLoS Comput Biol 2019; 15:e1007317. [PMID: 31479446 PMCID: PMC6743792 DOI: 10.1371/journal.pcbi.1007317] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 02/12/2019] [Revised: 09/13/2019] [Accepted: 08/01/2019] [Indexed: 12/15/2022] Open
Abstract
Pathway analysis is widely used to gain mechanistic insights from high-throughput omics data. However, most existing methods do not consider signal integration represented by pathway topology, resulting in enrichment of convergent pathways when downstream genes are modulated. Incorporation of signal flow and integration in pathway analysis could rank the pathways based on modulation in key regulatory genes. This implementation can be facilitated for large-scale data by discrete state network modeling due to simplicity in parameterization. Here, we model cellular heterogeneity using discrete state dynamics and measure pathway activities in cross-sectional data. We introduce a new algorithm, Boolean Omics Network Invariant-Time Analysis (BONITA), for signal propagation, signal integration, and pathway analysis. Our signal propagation approach models heterogeneity in transcriptomic data as arising from intercellular heterogeneity rather than intracellular stochasticity, and propagates binary signals repeatedly across networks. Logic rules defining signal integration are inferred by a genetic algorithm and are refined by local search. The rules determine the impact of each node in a pathway, which is used to score the probability of the pathway's modulation by chance. We have comprehensively tested BONITA for application to transcriptomics data from translational studies. Comparison with state-of-the-art pathway analysis methods shows that BONITA has higher sensitivity at lower levels of source node modulation and similar sensitivity at higher levels of source node modulation. Application of BONITA pathway analysis to previously validated RNA-sequencing studies identifies additional relevant pathways in in-vitro human cell line experiments and in-vivo infant studies. Additionally, BONITA successfully detected modulation of disease-specific pathways when comparing relevant RNA-sequencing data with healthy controls. Most interestingly, the two highest impact score nodes identified by BONITA included known drug targets. Thus, compared to existing methods, BONITA is a powerful approach to prioritize not only pathways but also the specific mechanistic roles of genes. BONITA is available at: https://github.com/thakar-lab/BONITA.
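The idea of propagating binary signals repeatedly across a network under logic rules can be sketched generically as follows. This is a toy synchronous Boolean update, not BONITA's algorithm: the node names, the rules, and the `propagate` helper are all hypothetical, and rule inference by genetic algorithm and local search is not shown.

```python
def propagate(state, rules, steps):
    """Synchronously update a Boolean network for a fixed number of steps.

    state -- dict mapping node name to bool (initial binarized signal)
    rules -- dict mapping node name to a function of the full state;
             nodes without a rule (source nodes) keep their value
    """
    current = dict(state)
    for _ in range(steps):
        # Every node is recomputed from the previous step's state.
        current = {
            node: rules[node](current) if node in rules else value
            for node, value in current.items()
        }
    return current


# Hypothetical logic rules: C integrates A AND B, D integrates A OR B,
# and E simply follows D one step later.
RULES = {
    "C": lambda s: s["A"] and s["B"],
    "D": lambda s: s["A"] or s["B"],
    "E": lambda s: s["D"],
}

if __name__ == "__main__":
    start = {"A": True, "B": False, "C": False, "D": False, "E": False}
    print(propagate(start, RULES, steps=1))
    print(propagate(start, RULES, steps=2))
```

In this toy network the signal from source node A reaches D after one update and E only after two, which is why the propagation has to be repeated rather than applied once.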
|
42
|
Mubeen S, Hoyt CT, Gemünd A, Hofmann-Apitius M, Fröhlich H, Domingo-Fernández D. The Impact of Pathway Database Choice on Statistical Enrichment Analysis and Predictive Modeling. Front Genet 2019. [PMID: 31824580 DOI: 10.3389/fgene.2019.01203/bibtex] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 05/06/2023] Open
|
43
|
Domingo-Fernández D, Hoyt CT, Bobis-Álvarez C, Marín-Llaó J, Hofmann-Apitius M. ComPath: an ecosystem for exploring, analyzing, and curating mappings across pathway databases. NPJ Syst Biol Appl 2018; 5:3. [PMID: 30564458 PMCID: PMC6292919 DOI: 10.1038/s41540-018-0078-8] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 07/05/2018] [Revised: 10/31/2018] [Accepted: 11/02/2018] [Indexed: 11/09/2022] Open
Abstract
Although pathways are widely used for the analysis and representation of biological systems, their lack of clear boundaries, their dispersion across numerous databases, and the lack of interoperability impede the evaluation of the coverage, agreements, and discrepancies between them. Here, we present ComPath, an ecosystem that supports curation of pathway mappings between databases and fosters the exploration of pathway knowledge through several novel visualizations. We have curated mappings between three of the major pathway databases and present a case study focusing on Parkinson’s disease that illustrates how ComPath can generate new biological insights by identifying pathway modules, clusters, and cross-talks with these mappings. The ComPath source code and resources are available at https://github.com/ComPath and the web application can be accessed at https://compath.scai.fraunhofer.de/.
Affiliation(s)
- Daniel Domingo-Fernández
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53754 Sankt Augustin, Germany
- Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115 Bonn, Germany
- Charles Tapley Hoyt
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53754 Sankt Augustin, Germany
- Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115 Bonn, Germany
- Carlos Bobis-Álvarez
- Faculty of Medicine and Health Sciences, University of Oviedo, 33006 Oviedo, Spain
- Josep Marín-Llaó
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53754 Sankt Augustin, Germany
- Rovira i Virgili University, 43003 Tarragona, Spain
- Martin Hofmann-Apitius
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53754 Sankt Augustin, Germany
- Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115 Bonn, Germany
|
44
|
Seo EJ, Efferth T, Panossian A. Curcumin downregulates expression of opioid-related nociceptin receptor gene (OPRL1) in isolated neuroglia cells. Phytomedicine: International Journal of Phytotherapy and Phytopharmacology 2018; 50:285-299. [PMID: 30466988 DOI: 10.1016/j.phymed.2018.09.202] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 06/13/2018] [Revised: 08/22/2018] [Accepted: 09/17/2018] [Indexed: 06/09/2023]
Abstract
BACKGROUND Curcumin (CC) exerts polyvalent pharmacological actions and multi-target effects, including pain relief and anti-nociceptive activity. In combination with Boswellia serrata extract (BS), curcumin shows greater efficacy in knee osteoarthritis management, presumably due to synergistic interaction of the ingredients. AIM To elucidate the molecular mechanisms underlying the analgesic activity of curcumin and its synergistic interaction with BS. METHODS We performed gene expression profiling by transcriptome-wide mRNA sequencing in human T98G neuroglia cells treated with CC (Curamed), BS, and the combination of CC and BS (CC-BS; Curamin), followed by interactive pathways analysis of the regulated genes. RESULTS Treatment with CC and with CC-BS selectively downregulated opioid-related nociceptin receptor 1 gene (OPRL1) expression by 5.9-fold and 7.2-fold, respectively. No changes were detected in the other canonical opioid receptor genes: OPRK1, OPRD1, and OPRM1. Nociceptin reportedly increases the sensation of pain in supra-spinal pain transduction pathways. Thus, CC and CC-BS may downregulate OPRL1, consequently inhibiting production of the nociception receptor NOP, leading to pain relief. In neuroglia cells, CC and CC-BS inhibited signaling pathways related to opioids, neuropathic pain, neuroinflammation, osteoarthritis, and rheumatoid diseases. CC and CC-BS also downregulated ADAM metallopeptidase gene ADAMTS5 expression by 11.2-fold and 13.5-fold, respectively. ADAMTS5 encodes a peptidase that plays a crucial role in osteoarthritis development via inhibition of a corresponding signaling pathway. CONCLUSION Here, we report for the first time that CC and CC-BS act as nociceptin receptor antagonists, selectively downregulating opioid-related nociceptin receptor 1 gene (OPRL1) expression, which is associated with pain relief. BS alone did not affect OPRL1 expression, but rather appears to potentiate the effects of CC via multiple mechanisms, including synergistic interactions of molecular networks.
Affiliation(s)
- Ean-Jeong Seo
- Department of Pharmaceutical Biology, Institute of Pharmacy and Biochemistry, Johannes Gutenberg University, Staudinger Weg 5, 55128 Mainz, Germany
- Thomas Efferth
- Department of Pharmaceutical Biology, Institute of Pharmacy and Biochemistry, Johannes Gutenberg University, Staudinger Weg 5, 55128 Mainz, Germany
- Alexander Panossian
- EuroPharma USA Inc., 955 Challenger Dr., Green Bay, WI 54311, USA; Phytomed AB, Bofinkvagen 1, 31275 Vaxtorp, Halland, Sweden
|