1
Dione MN, Shang S, Zhang Q, Zhao S, Lu X. Non-Thermal Effects of Terahertz Radiation on Gene Expression: Systematic Review and Meta-Analysis. Genes (Basel) 2024; 15:1045. [PMID: 39202405 PMCID: PMC11354197 DOI: 10.3390/genes15081045] [Received: 07/22/2024] [Revised: 08/05/2024] [Accepted: 08/07/2024] [Indexed: 09/03/2024] [Open access]
Abstract
With the advancement of terahertz technology, unveiling the mysteries of terahertz has had a profound impact on the field of biomedicine. However, the lack of systematic comparisons for gene expression signatures may diminish the effectiveness and efficiency of identifying common mechanisms underlying terahertz effects across diverse research findings. We performed a comprehensive review and meta-analysis to compile patterns of gene expression profiles associated with THz radiation. Thorough bibliographic reviews were conducted, utilizing the PubMed, Embase, Web of Science, and ProQuest databases to extract references from published articles. Raw CEL files were obtained from Gene Expression Omnibus and preprocessed using Bioconductor packages. This systematic review (Registration No. CDR42024502937) resulted in a detailed analysis of 13 studies (14 papers). There are several possible mechanisms and pathways through which THz radiation could cause biological changes. While the established gene expression results are largely associated with immune response and inflammatory markers, other genes demonstrated transcriptional outcomes that may unravel unknown functions. The enrichment of genes primarily found networks associated with broader stress responses. Altogether, the findings showed that THz can induce a distinct transcriptomic profile that is not associated with a microthermal cellular response. However, it is impossible to pinpoint a single gene or family of genes that would accurately and reliably justify the patterns of gene expression response under THz exposure.
Affiliation(s)
- Mactar Ndiaga Dione
  - School of Life Science and Technology, Xi’an Jiaotong University (XJTU), Xi’an 710049, China
- Sen Shang
  - School of Life Science and Technology, Xi’an Jiaotong University (XJTU), Xi’an 710049, China
- Qi Zhang
  - School of Life Science and Technology, Xi’an Jiaotong University (XJTU), Xi’an 710049, China
- Sicheng Zhao
  - School of Life Science and Technology, Xi’an Jiaotong University (XJTU), Xi’an 710049, China
- Xiaoyun Lu
  - School of Life Science and Technology, Xi’an Jiaotong University (XJTU), Xi’an 710049, China
  - Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an 710049, China
2
He S, Gubin MM, Rafei H, Basar R, Dede M, Jiang X, Liang Q, Tan Y, Kim K, Gillison ML, Rezvani K, Peng W, Haymaker C, Hernandez S, Solis LM, Mohanty V, Chen K. Elucidating immune-related gene transcriptional programs via factorization of large-scale RNA-profiles. iScience 2024; 27:110096. [PMID: 38957791 PMCID: PMC11217617 DOI: 10.1016/j.isci.2024.110096] [Received: 12/14/2023] [Revised: 04/03/2024] [Accepted: 05/21/2024] [Indexed: 07/04/2024] [Open access]
Abstract
Recent developments in immunotherapy, including immune checkpoint blockade (ICB) and adoptive cell therapy (ACT), have encountered challenges such as immune-related adverse events and resistance, especially in solid tumors. To advance the field, a deeper understanding of the molecular mechanisms behind treatment responses and resistance is essential. However, the lack of functionally characterized immune-related gene sets has limited data-driven immunological research. To address this gap, we adopted non-negative matrix factorization on 83 human bulk RNA sequencing (RNA-seq) datasets and constructed 28 immune-specific gene sets. After rigorous immunologist-led manual annotations and orthogonal validations across immunological contexts and functional omics data, we demonstrated that these gene sets can be applied to refine pan-cancer immune subtypes, improve ICB response prediction and functionally annotate spatial transcriptomic data. These functional gene sets, informing diverse immune states, will advance our understanding of immunology and cancer research.
Affiliation(s)
- Shan He
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Matthew M. Gubin
  - Department of Immunology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Hind Rafei
  - Department of Stem Cell Transplantation and Cellular Therapy, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Rafet Basar
  - Department of Stem Cell Transplantation and Cellular Therapy, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Merve Dede
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Xianli Jiang
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Qingnan Liang
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Yukun Tan
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Kunhee Kim
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Maura L. Gillison
  - Department of Thoracic/Head and Neck Medical Oncology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Katayoun Rezvani
  - Department of Stem Cell Transplantation and Cellular Therapy, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Weiyi Peng
  - Department of Biology and Biochemistry, The University of Houston, Houston, TX, USA
- Cara Haymaker
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Sharia Hernandez
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Luisa M. Solis
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Vakul Mohanty
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Ken Chen
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
3
Cao Z, Zhan G, Qin J, Cupertino RB, Ottino-Gonzalez J, Murphy A, Pancholi D, Hahn S, Yuan D, Callas P, Mackey S, Garavan H. Unraveling the molecular relevance of brain phenotypes: A comparative analysis of null models and test statistics. Neuroimage 2024; 293:120622. [PMID: 38648869 PMCID: PMC11132826 DOI: 10.1016/j.neuroimage.2024.120622] [Received: 03/10/2023] [Revised: 04/17/2024] [Accepted: 04/19/2024] [Indexed: 04/25/2024] [Open access]
Abstract
Correlating transcriptional profiles with imaging-derived phenotypes has the potential to reveal possible molecular architectures associated with cognitive functions, brain development and disorders. Competitive null models built by resampling genes and self-contained null models built by spinning brain regions, along with varying test statistics, have been used to determine the significance of transcriptional associations. However, there has been no systematic evaluation of their performance in imaging transcriptomics analyses. Here, we evaluated the performance of eight different test statistics (mean, mean absolute value, mean squared value, max mean, median, Kolmogorov-Smirnov (KS), Weighted KS and the number of significant correlations) in both competitive null models and self-contained null models. Simulated brain maps (n = 1,000) and gene sets (n = 500) were used to calculate the probability of significance (Psig) for each statistical test. Our results suggested that competitive null models may result in false positive results driven by co-expression within gene sets. Furthermore, we demonstrated that the self-contained null models may fail to account for distribution characteristics (e.g., bimodality) of correlations between all available genes and brain phenotypes, leading to false positives. These two confounding factors interacted differently with test statistics, resulting in varying outcomes. Specifically, the sign-sensitive test statistics (i.e., mean, median, KS, Weighted KS) were influenced by co-expression bias in the competitive null models, while median and sign-insensitive test statistics were sensitive to the bimodality bias in the self-contained null models. Additionally, KS-based statistics produced conservative results in the self-contained null models, which increased the risk of false negatives. Comprehensive supplementary analyses with various configurations, including realistic scenarios, supported the results. 
These findings suggest utilizing sign-insensitive test statistics such as mean absolute value, max mean in the competitive null models and the mean as the test statistic for the self-contained null models. Additionally, adopting the confounder-matched (e.g., coexpression-matched) null models as an alternative to standard null models can be a viable strategy. Overall, the present study offers insights into the selection of statistical tests for imaging transcriptomics studies, highlighting areas for further investigation and refinement in the evaluation of novel and commonly used tests.
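To make the terminology in this abstract concrete, the following is a minimal sketch of a competitive null model (random gene sets of the same size drawn from all genes) evaluated with one sign-sensitive statistic (mean) and one sign-insensitive statistic (mean absolute value). It is not the authors' code: the function names, the toy correlation values, and the one-sided upper-tail p-value are illustrative assumptions, and the sketch does not model co-expression or spatial (spin-based) nulls.

```python
import random


def mean_stat(values):
    """Sign-sensitive statistic: positive and negative correlations cancel."""
    return sum(values) / len(values)


def mean_abs_stat(values):
    """Sign-insensitive statistic: only the magnitude of each correlation counts."""
    return sum(abs(v) for v in values) / len(values)


def competitive_null_pvalue(gene_corr, gene_set, statistic, n_perm=1000, seed=0):
    """Upper-tail permutation p-value under a competitive null.

    gene_corr: dict mapping gene -> correlation with a brain phenotype
               (hypothetical input format).
    The null distribution is built by resampling random gene sets of the
    same size from all available genes.
    """
    rng = random.Random(seed)
    all_genes = list(gene_corr)
    observed = statistic([gene_corr[g] for g in gene_set])
    exceed = 0
    for _ in range(n_perm):
        null_set = rng.sample(all_genes, len(gene_set))
        if statistic([gene_corr[g] for g in null_set]) >= observed:
            exceed += 1
    # Add-one correction so the p-value is never exactly zero.
    return (exceed + 1) / (n_perm + 1)


def make_toy_data(seed=1):
    """Toy data: 200 background genes near zero, plus a 10-gene set whose
    correlations are strong but of mixed sign (assumed values)."""
    rng = random.Random(seed)
    gene_corr = {f"bg{i}": rng.gauss(0.0, 0.1) for i in range(200)}
    mixed_set = [f"set{i}" for i in range(10)]
    for i, gene in enumerate(mixed_set):
        gene_corr[gene] = 0.5 if i % 2 == 0 else -0.5
    return gene_corr, mixed_set


if __name__ == "__main__":
    corr, mixed = make_toy_data()
    print("mean:", competitive_null_pvalue(corr, mixed, mean_stat))
    print("mean |r|:", competitive_null_pvalue(corr, mixed, mean_abs_stat))
```

On this toy input the mixed-sign gene set is invisible to the mean but stands out under the mean absolute value, which illustrates why the choice of test statistic matters independently of the null model.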
Affiliation(s)
- Zhipeng Cao
  - Shanghai Xuhui Mental Health Center, Shanghai 200232, China
  - Department of Psychiatry, University of Vermont College of Medicine, Burlington VT, 05401, USA
- Guilai Zhan
  - Shanghai Xuhui Mental Health Center, Shanghai 200232, China
- Jinmei Qin
  - Shanghai Xuhui Mental Health Center, Shanghai 200232, China
- Renata B Cupertino
  - Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- Jonatan Ottino-Gonzalez
  - Division of Endocrinology, The Saban Research Institute, Children's Hospital Los Angeles, Los Angeles, CA, USA
- Alistair Murphy
  - Department of Psychiatry, University of Vermont College of Medicine, Burlington VT, 05401, USA
- Devarshi Pancholi
  - Department of Psychiatry, University of Vermont College of Medicine, Burlington VT, 05401, USA
- Sage Hahn
  - Department of Psychiatry, University of Vermont College of Medicine, Burlington VT, 05401, USA
- Dekang Yuan
  - Department of Psychiatry, University of Vermont College of Medicine, Burlington VT, 05401, USA
- Peter Callas
  - Department of Mathematics and Statistics, University of Vermont College of Engineering and Mathematical Sciences, Burlington VT, 05401, USA
- Scott Mackey
  - Department of Psychiatry, University of Vermont College of Medicine, Burlington VT, 05401, USA
- Hugh Garavan
  - Department of Psychiatry, University of Vermont College of Medicine, Burlington VT, 05401, USA
4
Candia J, Ferrucci L. Assessment of Gene Set Enrichment Analysis using curated RNA-seq-based benchmarks. PLoS One 2024; 19:e0302696. [PMID: 38753612 PMCID: PMC11098418 DOI: 10.1371/journal.pone.0302696] [Received: 10/31/2023] [Accepted: 04/09/2024] [Indexed: 05/18/2024] [Open access]
Abstract
Pathway enrichment analysis is a ubiquitous computational biology method to interpret a list of genes (typically derived from the association of large-scale omics data with phenotypes of interest) in terms of higher-level, predefined gene sets that share biological function, chromosomal location, or other common features. Among many tools developed so far, Gene Set Enrichment Analysis (GSEA) stands out as one of the pioneering and most widely used methods. Although originally developed for microarray data, GSEA is nowadays extensively utilized for RNA-seq data analysis. Here, we quantitatively assessed the performance of a variety of GSEA modalities and provide guidance in the practical use of GSEA in RNA-seq experiments. We leveraged harmonized RNA-seq datasets available from The Cancer Genome Atlas (TCGA) in combination with large, curated pathway collections from the Molecular Signatures Database to obtain cancer-type-specific target pathway lists across multiple cancer types. We carried out a detailed analysis of GSEA performance using both gene-set and phenotype permutations combined with four different choices for the Kolmogorov-Smirnov enrichment statistic. Based on our benchmarks, we conclude that the classic/unweighted gene-set permutation approach offered comparable or better sensitivity-vs-specificity tradeoffs across cancer types compared with other, more complex and computationally intensive permutation methods. Finally, we analyzed other large cohorts for thyroid cancer and hepatocellular carcinoma. We utilized a new consensus metric, the Enrichment Evidence Score (EES), which showed a remarkable agreement between pathways identified in TCGA and those from other sources, despite differences in cancer etiology. This finding suggests an EES-based strategy to identify a core set of pathways that may be complemented by an expanded set of pathways for downstream exploratory analysis. 
This work fills the existing gap in current guidelines and benchmarks for the use of GSEA with RNA-seq data and provides a framework to enable detailed benchmarking of other RNA-seq-based pathway analysis tools.
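For readers unfamiliar with the quantities this abstract compares, here is a minimal sketch of the Kolmogorov-Smirnov-style running-sum enrichment score with a gene-set permutation test. It assumes the commonly described GSEA form in which hits are weighted by |score|^p, with p = 0 giving the classic (unweighted) statistic; the function names, input format, and the two-sided comparison on |ES| are simplifying assumptions and do not reproduce the GSEA software or the paper's benchmark.

```python
import random


def enrichment_score(ranked_genes, scores, gene_set, weight=0.0):
    """Running-sum enrichment score over a ranked gene list.

    ranked_genes: genes ordered from most to least associated with the phenotype.
    scores: dict gene -> ranking metric (hypothetical input format).
    weight: exponent p on |score| for hits; 0.0 is the classic/unweighted case.
    """
    in_set = set(gene_set) & set(ranked_genes)
    n_miss = len(ranked_genes) - len(in_set)
    hit_weights = {g: abs(scores[g]) ** weight for g in in_set}
    total_hit = sum(hit_weights.values())
    if not in_set or n_miss == 0 or total_hit == 0:
        raise ValueError("gene set must be a non-trivial subset with non-zero weight")

    running, best = 0.0, 0.0
    for gene in ranked_genes:
        if gene in in_set:
            running += hit_weights[gene] / total_hit  # step up on a hit
        else:
            running -= 1.0 / n_miss                   # step down on a miss
        if abs(running) > abs(best):
            best = running                            # keep the largest deviation from zero
    return best


def gene_set_permutation_pvalue(ranked_genes, scores, gene_set,
                                weight=0.0, n_perm=1000, seed=0):
    """Gene-set permutation: compare the observed ES with the ES of random
    gene sets of the same size (simplified two-sided test on |ES|)."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, scores, gene_set, weight)
    size = len(set(gene_set) & set(ranked_genes))
    exceed = 0
    for _ in range(n_perm):
        null_set = rng.sample(ranked_genes, size)
        if abs(enrichment_score(ranked_genes, scores, null_set, weight)) >= abs(observed):
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    genes = [f"g{i}" for i in range(100)]
    metric = {g: 50.0 - i for i, g in enumerate(genes)}
    print(gene_set_permutation_pvalue(genes, metric, genes[:10]))
```

Phenotype permutation, the alternative mentioned in the abstract, would instead shuffle sample labels and recompute the ranking metric for every permutation, which is why it is considerably more expensive than the gene-set permutation sketched here.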
Affiliation(s)
- Julián Candia
  - Longitudinal Studies Section, Translational Gerontology Branch, National Institute on Aging, National Institutes of Health, Baltimore, MD, United States of America
- Luigi Ferrucci
  - Longitudinal Studies Section, Translational Gerontology Branch, National Institute on Aging, National Institutes of Health, Baltimore, MD, United States of America
5
He S, Gubin MM, Rafei H, Basar R, Dede M, Jiang X, Liang Q, Tan Y, Kim K, Gillison ML, Rezvani K, Peng W, Haymaker C, Hernandez S, Solis LM, Mohanty V, Chen K. Elucidating immune-related gene transcriptional programs via factorization of large-scale RNA-profiles. bioRxiv 2024:2024.05.10.593433. [PMID: 38798470 PMCID: PMC11118452 DOI: 10.1101/2024.05.10.593433] [Indexed: 05/29/2024]
Abstract
Recent developments in immunotherapy, including immune checkpoint blockade (ICB) and adoptive cell therapy, have encountered challenges such as immune-related adverse events and resistance, especially in solid tumors. To advance the field, a deeper understanding of the molecular mechanisms behind treatment responses and resistance is essential. However, the lack of functionally characterized immune-related gene sets has limited data-driven immunological research. To address this gap, we adopted non-negative matrix factorization on 83 human bulk RNA-seq datasets and constructed 28 immune-specific gene sets. After rigorous immunologist-led manual annotations and orthogonal validations across immunological contexts and functional omics data, we demonstrated that these gene sets can be applied to refine pan-cancer immune subtypes, improve ICB response prediction and functionally annotate spatial transcriptomic data. These functional gene sets, informing diverse immune states, will advance our understanding of immunology and cancer research.
6
Kim Y, Choi SR, Cho KH. Reducing State Conflicts between Network Motifs Synergistically Enhances Cancer Drug Effects and Overcomes Adaptive Resistance. Cancers (Basel) 2024; 16:1337. [PMID: 38611015 PMCID: PMC11010870 DOI: 10.3390/cancers16071337] [Received: 02/01/2024] [Revised: 03/20/2024] [Accepted: 03/26/2024] [Indexed: 04/14/2024] [Open access]
Abstract
Inducing apoptosis in cancer cells is a primary goal in anti-cancer therapy, but curing cancer with a single drug is unattainable due to drug resistance. The complex molecular network in cancer cells causes heterogeneous responses to single-target drugs, thereby inducing an adaptive drug response. Here, we showed that targeted drug perturbations can trigger state conflicts between multi-stable motifs within a molecular regulatory network, resulting in heterogeneous drug responses. However, we revealed that properly regulating an interconnecting molecule between these motifs can synergistically minimize the heterogeneous responses and overcome drug resistance. We extracted the essential cellular response dynamics of the Boolean network driven by the target node perturbation and developed an algorithm to identify a synergistic combinatorial target that can reduce heterogeneous drug responses. We validated the proposed approach using exemplary network models and a gastric cancer model from a previous study by showing that the targets identified with our algorithm can better drive the networks to desired states than those with other control theories. Of note, our approach suggests a new synergistic pair of control targets that can increase cancer drug efficacy to overcome adaptive drug resistance.
Affiliation(s)
- Kwang-Hyun Cho
  - Laboratory for Systems Biology and Bio-Inspired Engineering, Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea; (Y.K.); (S.R.C.)
7
Liu W, Li HM, Bai G. Integrated bioinformatics analysis of ferroptosis-related gene signature in inflammation and immunity in intervertebral disc degeneration. Nucleosides Nucleotides Nucleic Acids 2024:1-21. [PMID: 38531048 DOI: 10.1080/15257770.2024.2332403] [Received: 10/11/2023] [Accepted: 03/12/2024] [Indexed: 03/28/2024]
Abstract
Ferroptosis has recently been shown to play a significant role in the progression of intervertebral disk degeneration (IDD), although the underlying mechanism is still unknown. The objective of this work was to use stringent bioinformatic techniques to clarify the crucial roles played by genes associated with ferroptosis in the emergence of IDD. For additional study, the microarray data pertinent to the IDD were acquired from the Gene Expression Omnibus database. The ferroptosis-related and IDD-related genes (FIDDRGs) were identified using a variety of bioinformatic techniques, which were also used to carry out function enrichment analysis, protein-protein correlation analysis, build the correlation regulatory network, and examine the potential connections between ferroptosis and immune abnormalities and inflammatory responses in IDD. A total of 16 FIDDRGs were eliminated for the further function enrichment analysis, and 10 hub FIDDRGs were chosen to build the correlation regulatory network. Hub FIDDRGs were shown to be highly associated with M2 macrophages and hub inflammatory response-related genes in IDD. When seen as a whole, our findings can give fresh perspectives on the mechanistic studies of ferroptosis in the emergence of IDD and new prospective targets for the therapeutic approaches.
Affiliation(s)
- Wei Liu
  - Department of Orthopedics, the Fourth Affiliated Hospital of School of Medicine, and International School of Medicine, International Institutes of Medicine, Zhejiang University, Yiwu, Zhejiang, PR China
- Hui-Min Li
  - Department of Orthopedics, the Fourth Affiliated Hospital of School of Medicine, and International School of Medicine, International Institutes of Medicine, Zhejiang University, Yiwu, Zhejiang, PR China
- Guangchao Bai
  - Department of Orthopedics, the Fourth Affiliated Hospital of School of Medicine, and International School of Medicine, International Institutes of Medicine, Zhejiang University, Yiwu, Zhejiang, PR China
8
Vaswani CM, Simone J, Pavelick JL, Wu X, Tan GW, Ektesabi AM, Gupta S, Tsoporis JN, Dos Santos CC. Tiny Guides, Big Impact: Focus on the Opportunities and Challenges of miR-Based Treatments for ARDS. Int J Mol Sci 2024; 25:2812. [PMID: 38474059 DOI: 10.3390/ijms25052812] [Received: 01/28/2024] [Revised: 02/24/2024] [Accepted: 02/25/2024] [Indexed: 03/14/2024] [Open access]
Abstract
Acute Respiratory Distress Syndrome (ARDS) is characterized by lung inflammation and increased membrane permeability, which represents the leading cause of mortality in ICUs. Mechanical ventilation strategies are at the forefront of supportive approaches for ARDS. Recently, an increasing understanding of RNA biology, function, and regulation, as well as the success of RNA vaccines, has spurred enthusiasm for the emergence of novel RNA-based therapeutics. The most common types of RNA seen in development are silencing (si)RNAs, antisense oligonucleotide therapy (ASO), and messenger (m)RNAs that collectively account for 80% of the RNA therapeutics pipeline. These three RNA platforms are the most mature, with approved products and demonstrated commercial success. Most recently, miRNAs have emerged as pivotal regulators of gene expression. Their dysregulation in various clinical conditions offers insights into ARDS pathogenesis and offers the innovative possibility of using microRNAs as targeted therapy. This review synthesizes the current state of the literature to contextualize the therapeutic potential of miRNA modulation. It considers the potential for miR-based therapeutics as a nuanced approach that incorporates the complexity of ARDS pathophysiology and the multifaceted nature of miRNA interactions.
Affiliation(s)
- Chirag M Vaswani
  - Department of Physiology, Temerty Faculty of Medicine, University of Toronto, Toronto, ON M5S 1A8, Canada
  - Keenan Research Centre for Biomedical Science, St. Michael's Hospital, University of Toronto, Toronto, ON M5B 1W8, Canada
- Julia Simone
  - Department of Medicine, McMaster University, Hamilton, ON L8V 5C2, Canada
- Jacqueline L Pavelick
  - Institute of Medical Sciences, Temerty Faculty of Medicine, University of Toronto, Toronto, ON M5S 1A8, Canada
- Xiao Wu
  - Keenan Research Centre for Biomedical Science, St. Michael's Hospital, University of Toronto, Toronto, ON M5B 1W8, Canada
- Greaton W Tan
  - Department of Physiology, Temerty Faculty of Medicine, University of Toronto, Toronto, ON M5S 1A8, Canada
  - Keenan Research Centre for Biomedical Science, St. Michael's Hospital, University of Toronto, Toronto, ON M5B 1W8, Canada
- Amin M Ektesabi
  - Keenan Research Centre for Biomedical Science, St. Michael's Hospital, University of Toronto, Toronto, ON M5B 1W8, Canada
  - Institute of Medical Sciences, Temerty Faculty of Medicine, University of Toronto, Toronto, ON M5S 1A8, Canada
- Sahil Gupta
  - Faculty of Medicine, School of Medicine, The University of Queensland, Herston, QLD 4006, Australia
- James N Tsoporis
  - Keenan Research Centre for Biomedical Science, St. Michael's Hospital, University of Toronto, Toronto, ON M5B 1W8, Canada
- Claudia C Dos Santos
  - Department of Physiology, Temerty Faculty of Medicine, University of Toronto, Toronto, ON M5S 1A8, Canada
  - Keenan Research Centre for Biomedical Science, St. Michael's Hospital, University of Toronto, Toronto, ON M5B 1W8, Canada
  - Institute of Medical Sciences, Temerty Faculty of Medicine, University of Toronto, Toronto, ON M5S 1A8, Canada
  - Laboratory Medicine and Pathobiology, Temerty Faculty of Medicine, University of Toronto, Toronto, ON M5S 1A8, Canada
  - Interdepartmental Division of Critical Care, St. Michael's Hospital, University of Toronto, Toronto, ON M5B 1W8, Canada
9
Hui TX, Kasim S, Aziz IA, Fudzee MFM, Haron NS, Sutikno T, Hassan R, Mahdin H, Sen SC. Robustness evaluations of pathway activity inference methods on gene expression data. BMC Bioinformatics 2024; 25:23. [PMID: 38216898 PMCID: PMC10785356 DOI: 10.1186/s12859-024-05632-w] [Received: 07/15/2023] [Accepted: 01/02/2024] [Indexed: 01/14/2024] [Open access]
Abstract
BACKGROUND: With the exponential growth of high-throughput technologies, multiple pathway analysis methods have been proposed to estimate pathway activities from gene expression profiles. These pathway activity inference methods can be divided into two main categories: non-Topology-Based (non-TB) and Pathway Topology-Based (PTB) methods. Although some review and survey articles discussed the topic from different aspects, there is a lack of systematic assessment and comparisons on the robustness of these approaches. RESULTS: Thus, this study presents comprehensive robustness evaluations of seven widely used pathway activity inference methods using six cancer datasets based on two assessments. The first assessment seeks to investigate the robustness of pathway activity in pathway activity inference methods, while the second assessment aims to assess the robustness of risk-active pathways and genes predicted by these methods. The mean reproducibility power and total number of identified informative pathways and genes were evaluated. Based on the first assessment, the mean reproducibility power of pathway activity inference methods generally decreased as the number of pathway selections increased. Entropy-based Directed Random Walk (e-DRW) distinctly outperformed other methods in exhibiting the greatest reproducibility power across all cancer datasets. On the other hand, the second assessment shows that no methods provide satisfactory results across datasets. CONCLUSION: However, PTB methods generally appear to perform better in producing greater reproducibility power and identifying potential cancer markers compared to non-TB methods.
Affiliation(s)
- Tay Xin Hui
  - Soft Computing and Data Mining Center, Faculty of Computer Sciences and Information Technology, Universiti Tun Hussein Onn Malaysia (UTHM), 83000, Batu Pahat, Malaysia
- Shahreen Kasim
  - Soft Computing and Data Mining Center, Faculty of Computer Sciences and Information Technology, Universiti Tun Hussein Onn Malaysia (UTHM), 83000, Batu Pahat, Malaysia
- Izzatdin Abdul Aziz
  - Computer and Information Sciences Department (CISD), Universiti Teknologi PETRONAS (UTP), 32610, Seri Iskandar, Malaysia
- Mohd Farhan Md Fudzee
  - Soft Computing and Data Mining Center, Faculty of Computer Sciences and Information Technology, Universiti Tun Hussein Onn Malaysia (UTHM), 83000, Batu Pahat, Malaysia
- Nazleeni Samiha Haron
  - Computer and Information Sciences Department (CISD), Universiti Teknologi PETRONAS (UTP), 32610, Seri Iskandar, Malaysia
- Tole Sutikno
  - Department of Electrical Engineering, Universitas Ahmad Dahlan (UAD), 55166, Yogyakarta, Indonesia
- Rohayanti Hassan
  - Faculty of Electrical Engineering, Universiti Teknologi Malaysia (UTM), 81310, Johor Bahru, Malaysia
- Hairulnizam Mahdin
  - Soft Computing and Data Mining Center, Faculty of Computer Sciences and Information Technology, Universiti Tun Hussein Onn Malaysia (UTHM), 83000, Batu Pahat, Malaysia
- Seah Choon Sen
  - Faculty of Computing, Universiti Teknologi Malaysia (UTM), 81310, Johor Bahru, Malaysia
10
Song B, Wang K, Peng Y, Zhu Y, Cui Z, Chen L, Yu Z, Song B. Combined signature of G protein-coupled receptors and tumor microenvironment provides a prognostic and therapeutic biomarker for skin cutaneous melanoma. J Cancer Res Clin Oncol 2023; 149:18135-18160. [PMID: 38006451 DOI: 10.1007/s00432-023-05486-4] [Received: 08/24/2023] [Accepted: 10/19/2023] [Indexed: 11/27/2023]
Abstract
BACKGROUND: G protein-coupled receptors (GPCRs) have been shown to have an important role in tumor development and metastasis, and abnormal expression of GPCRs is significantly associated with poor prognosis of tumor patients. In this study, we analyzed the GPCR-related genes (GPRGs) and tumor microenvironment (TME) in skin cutaneous melanoma (SKCM) to construct a prognostic model to help SKCM patients obtain accurate clinical treatment strategies. METHODS: SKCM expression data and clinical information were obtained from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases. Differential expression analysis, the LASSO algorithm, and univariate and multivariate Cox regression analysis were used to screen prognosis-related genes (GPR19, GPR146, S1PR2, PTH1R, ADGRE5, CXCR3, GPR143, and OR2I1P) and multiple immune cell types associated with good prognosis; the data set was then analyzed according to these results to build a GPR-TME classifier. The model was further subjected to immune infiltration, functional enrichment, tumor mutational load, immunotherapy prediction, and scRNA-seq data analysis. Finally, cellular experiments were conducted to validate the functionality of the key gene GPR19 in the model. RESULTS: The findings indicate that high expression of GPRGs is associated with a poor prognosis in patients with SKCM, highlighting the significant role of GPRGs and the tumor microenvironment (TME) in SKCM development. Notably, the group characterized by low GPR expression and a high TME exhibited the most favorable prognosis and immunotherapeutic efficacy. Furthermore, cellular assays demonstrated that knockdown of GPR19 significantly reduced the proliferation, migration, and invasive capabilities of melanoma cells in A375 and A2058 cell lines. CONCLUSION: This study provides novel insights for the prognosis evaluation and treatment of melanoma, along with the identification of a new biomarker, GPR19.
Affiliation(s)
- Binyu Song
  - Department of Plastic Surgery, Xijing Hospital, Fourth Military Medical University, 127 Changle West Road, Xi'an, 710032, Shaanxi Province, China
- Kai Wang
  - Department of Plastic Surgery, Xijing Hospital, Fourth Military Medical University, 127 Changle West Road, Xi'an, 710032, Shaanxi Province, China
- Yixuan Peng
  - School of Basic Medicine, The Fourth Military Medical University, 169 Changle West Road, Xi'an, 710032, China
- Yuhan Zhu
  - Department of Plastic Surgery, Xijing Hospital, Fourth Military Medical University, 127 Changle West Road, Xi'an, 710032, Shaanxi Province, China
- Zhiwei Cui
  - Department of Plastic Surgery, Xijing Hospital, Fourth Military Medical University, 127 Changle West Road, Xi'an, 710032, Shaanxi Province, China
- Lin Chen
  - Department of Plastic Surgery, Xijing Hospital, Fourth Military Medical University, 127 Changle West Road, Xi'an, 710032, Shaanxi Province, China
- Zhou Yu
  - Department of Plastic Surgery, Xijing Hospital, Fourth Military Medical University, 127 Changle West Road, Xi'an, 710032, Shaanxi Province, China
- Baoqiang Song
  - Department of Plastic Surgery, Xijing Hospital, Fourth Military Medical University, 127 Changle West Road, Xi'an, 710032, Shaanxi Province, China
11
|
Tran MN, Baek SJ, Jun HJ, Lee S. Identifying target organ location of Radix Achyranthis Bidentatae: a bioinformatics approach on active compounds and genes. Front Pharmacol 2023; 14:1187896. [PMID: 37637410 PMCID: PMC10448535 DOI: 10.3389/fphar.2023.1187896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Accepted: 08/01/2023] [Indexed: 08/29/2023] Open
Abstract
Background: Herbal medicines traditionally target organs for treatment based on medicinal properties, and this theory is widely used for prescriptions. However, the scientific evidence explaining how herbs act on specific organs by biological methods is still limited. This study used bioinformatic tools to identify the target organ locations of Radix Achyranthis Bidentatae (RAB), a blood-activating herb that nourishes the liver and kidney, strengthens bones, and directs prescription to the lower body. Methods: RAB's active compounds and targets were collected and predicted using databases such as TCMSP, HIT2.0, and BATMAN-TCM. Next, the RAB's target list was analyzed based on two approaches to obtain target organ locations. DAVID and Gene ORGANizer enrichment-based approaches were used to enrich an entire gene list, and the BioGPS and HPA gene expression-based approaches were used to analyze the expression of core genes. Results: RAB's targets were found to be involved in whole blood, blood components, and lymphatic organs across all four tools. Each tool indicated a particular aspect of RAB's target organ locations: DAVID-enriched genes showed a predominance in blood, liver, and kidneys; Gene ORGANizer showed the effect on lower body parts as well as bones and joints; BioGPS and HPA showed high gene expression in bone marrow, lymphoid tissue, and smooth muscle. Conclusion: Our bioinformatics-based target organ location prediction can serve as a modern interpretation tool for the target organ location theory of traditional medicine. Future studies should predict therapeutic target organ locations in complex prescriptions rather than single herbs and conduct experiments to verify predictions.
Affiliation(s)
- Minh Nhat Tran
  - Korean Medicine Data Division, Korea Institute of Oriental Medicine, Daejeon, Republic of Korea
  - Korean Convergence Medical Science, University of Science and Technology, Daejeon, Republic of Korea
  - Faculty of Traditional Medicine, Hue University of Medicine and Pharmacy, Hue University, Hue, Vietnam
- Su-Jin Baek
  - Korean Medicine Data Division, Korea Institute of Oriental Medicine, Daejeon, Republic of Korea
- Hyeong Joon Jun
  - Korean Medicine Data Division, Korea Institute of Oriental Medicine, Daejeon, Republic of Korea
- Sanghun Lee
  - Korean Medicine Data Division, Korea Institute of Oriental Medicine, Daejeon, Republic of Korea
  - Korean Convergence Medical Science, University of Science and Technology, Daejeon, Republic of Korea
|
12
|
Tong Z, Li H, Jin Y, Sheng L, Ying M, Liu Q, Wang C, Teng C. Mechanisms of ferroptosis with immune infiltration and inflammatory response in rotator cuff injury. Genomics 2023; 115:110645. [PMID: 37230182 DOI: 10.1016/j.ygeno.2023.110645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2023] [Revised: 04/03/2023] [Accepted: 05/19/2023] [Indexed: 05/27/2023]
Abstract
The processes driving ferroptosis and rotator cuff (RC) inflammation are yet unknown. The mechanism of ferroptosis and inflammation involved in the development of RC tears was investigated. The Gene Expression Omnibus database was used to obtain the microarray data relevant to the RC tears for further investigation. In this study, we created an RC tears rat model for in vivo experimental validation. For the additional function enrichment analysis, 10 hub ferroptosis-related genes were chosen to construct the correlation regulation network. In RC tears, it was discovered that genes related to hub ferroptosis and hub inflammatory response were strongly correlated. The outcomes of in vivo tests showed that RC tears were related to Cd68-Cxcl13, Acsl4-Sat1, Acsl3-Eno3, Acsl3-Ccr7, and Ccr7-Eno3 pairings in regulating ferroptosis and inflammatory response. Thus, our results show an association between ferroptosis and inflammation, providing a new avenue to explore the clinical treatment of RC tears.
Affiliation(s)
- Zhicheng Tong
  - Department of Orthopaedic Surgery, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu 32200, China
- Huimin Li
  - Department of Orthopaedic Surgery, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu 32200, China
- Yanglei Jin
  - Department of Orthopaedic Surgery, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu 32200, China
- Lingchao Sheng
  - Department of Orthopaedic Surgery, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu 32200, China
- Mingshuai Ying
  - Department of Orthopaedic Surgery, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu 32200, China
- Qixue Liu
  - Department of Orthopaedic Surgery, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu 32200, China
- Chenhuan Wang
  - Department of Orthopaedic Surgery, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu 32200, China
- Chong Teng
  - Department of Orthopaedic Surgery, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu 32200, China.
|
13
|
Zhao K, Rhee SY. Interpreting omics data with pathway enrichment analysis. Trends Genet 2023; 39:308-319. [PMID: 36750393 DOI: 10.1016/j.tig.2023.01.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/25/2022] [Revised: 11/24/2022] [Accepted: 01/13/2023] [Indexed: 02/09/2023]
Abstract
Pathway enrichment analysis is indispensable for interpreting omics datasets and generating hypotheses. However, the foundations of enrichment analysis remain elusive to many biologists. Here, we discuss best practices in interpreting different types of omics data using pathway enrichment analysis and highlight the importance of considering intrinsic features of various types of omics data. We further explain major components that influence the outcomes of a pathway enrichment analysis, including defining background sets and choosing reference annotation databases. To improve reproducibility, we describe how to standardize reporting methodological details in publications. This article aims to serve as a primer for biologists to leverage the wealth of omics resources and motivate bioinformatics tool developers to enhance the power of pathway enrichment analysis.
Affiliation(s)
- Kangmei Zhao
  - Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94025, USA.
- Seung Yon Rhee
  - Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94025, USA.
|
14
|
Lu Y, Pang Z, Xia J. Comprehensive investigation of pathway enrichment methods for functional interpretation of LC-MS global metabolomics data. Brief Bioinform 2023; 24:bbac553. [PMID: 36572652 PMCID: PMC9851290 DOI: 10.1093/bib/bbac553] [Citation(s) in RCA: 30] [Impact Index Per Article: 30.0] [Received: 09/23/2022] [Revised: 10/31/2022] [Accepted: 11/15/2022] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND Global or untargeted metabolomics is widely used to comprehensively investigate metabolic profiles under various pathophysiological conditions such as inflammations, infections, responses to exposures or interactions with microbial communities. However, biological interpretation of global metabolomics data remains a daunting task. Recent years have seen growing applications of pathway enrichment analysis based on putative annotations of liquid chromatography coupled with mass spectrometry (LC-MS) peaks for functional interpretation of LC-MS-based global metabolomics data. However, due to intricate peak-metabolite and metabolite-pathway relationships, considerable variations are observed among results obtained using different approaches. There is an urgent need to benchmark these approaches to inform the best practices. RESULTS We have conducted a benchmark study of common peak annotation approaches and pathway enrichment methods in current metabolomics studies. Representative approaches, including three peak annotation methods and four enrichment methods, were selected and benchmarked under different scenarios. Based on the results, we have provided a set of recommendations regarding peak annotation, ranking metrics and feature selection. The overall better performance was obtained for the mummichog approach. We have observed that a ~30% annotation rate is sufficient to achieve high recall (~90% based on mummichog), and using semi-annotated data improves functional interpretation. Based on the current platforms and enrichment methods, we further propose an identifiability index to indicate the possibility of a pathway being reliably identified. Finally, we evaluated all methods using 11 COVID-19 and 8 inflammatory bowel diseases (IBD) global metabolomics datasets.
Affiliation(s)
- Yao Lu
  - Department of Microbiology and Immunology, McGill University, Quebec, Canada
- Zhiqiang Pang
  - Institute of Parasitology, McGill University, Quebec, Canada
- Jianguo Xia
  - Department of Microbiology and Immunology, McGill University, Quebec, Canada
  - Institute of Parasitology, McGill University, Quebec, Canada
|
15
|
Chen JW, Shrestha L, Green G, Leier A, Marquez-Lago TT. The hitchhikers' guide to RNA sequencing and functional analysis. Brief Bioinform 2023; 24:bbac529. [PMID: 36617463 PMCID: PMC9851315 DOI: 10.1093/bib/bbac529] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 07/11/2022] [Revised: 10/18/2022] [Accepted: 11/07/2022] [Indexed: 01/10/2023] Open
Abstract
DNA and RNA sequencing technologies have revolutionized biology and biomedical sciences, sequencing full genomes and transcriptomes at very high speeds and reasonably low costs. RNA sequencing (RNA-Seq) enables transcript identification and quantification, but once sequencing has concluded researchers can be easily overwhelmed with questions such as how to go from raw data to differential expression (DE), pathway analysis and interpretation. Several pipelines and procedures have been developed to this effect. Even though there is no unique way to perform RNA-Seq analysis, it usually follows these steps: 1) raw reads quality check, 2) alignment of reads to a reference genome, 3) aligned reads' summarization according to an annotation file, 4) DE analysis and 5) gene set analysis and/or functional enrichment analysis. Each step requires researchers to make decisions, and the wide variety of options and resulting large volumes of data often lead to interpretation challenges. There also seems to be insufficient guidance on how best to obtain relevant information and derive actionable knowledge from transcription experiments. In this paper, we explain RNA-Seq steps in detail and outline differences and similarities of different popular options, as well as advantages and disadvantages. We also discuss non-coding RNA analysis, multi-omics, meta-transcriptomics and the use of artificial intelligence methods complementing the arsenal of tools available to researchers. Lastly, we perform a complete analysis from raw reads to DE and functional enrichment analysis, visually illustrating how results are not absolute truths and how algorithmic decisions can greatly impact results and interpretation.
Affiliation(s)
- Jiung-Wen Chen
  - Department of Biology, University of Alabama at Birmingham, Birmingham, AL, USA
- Lisa Shrestha
  - Department of Genetics, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- George Green
  - Department of Biology, University of Alabama at Birmingham, Birmingham, AL, USA
- André Leier
  - Department of Genetics, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
  - Department of Cell, Developmental and Integrative Biology, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- Tatiana T Marquez-Lago
  - Department of Genetics, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
  - Department of Cell, Developmental and Integrative Biology, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
  - Department of Microbiology, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
|
16
|
Balestra C, Maj C, Müller E, Mayr A. Redundancy-aware unsupervised ranking based on game theory: Ranking pathways in collections of gene sets. PLoS One 2023; 18:e0282699. [PMID: 36893181 PMCID: PMC9997904 DOI: 10.1371/journal.pone.0282699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2022] [Accepted: 02/13/2023] [Indexed: 03/10/2023] Open
Abstract
In Genetics, gene sets are grouped in collections concerning their biological function. This often leads to high-dimensional, overlapping, and redundant families of sets, thus precluding a straightforward interpretation of their biological meaning. In Data Mining, it is often argued that techniques to reduce the dimensionality of data could increase the maneuverability and consequently the interpretability of large data. In the past years, moreover, we witnessed an increasing consciousness of the importance of understanding data and interpretable models in the machine learning and bioinformatics communities. On the one hand, there exist techniques aiming to aggregate overlapping gene sets to create larger pathways. While these methods could partly solve the problem of the collections' large size, modifying biological pathways is hardly justifiable in this biological context. On the other hand, the representation methods to increase interpretability of collections of gene sets that have been proposed so far have proved to be insufficient. Inspired by this Bioinformatics context, we propose a method to rank sets within a family of sets based on the distribution of the singletons and their size. We obtain sets' importance scores by computing Shapley values; making use of microarray games, we do not incur the typical exponential computational complexity. Moreover, we address the challenge of constructing redundancy-aware rankings where, in our case, redundancy is a quantity proportional to the size of intersections among the sets in the collections. We use the obtained rankings to reduce the dimension of the families, therefore showing lower redundancy among sets while still preserving a high coverage of their elements.
We finally evaluate our approach for collections of gene sets and apply Gene Sets Enrichment Analysis techniques to the now smaller collections: As expected, the unsupervised nature of the proposed rankings allows for unremarkable differences in the number of significant gene sets for specific phenotypic traits. In contrast, the number of performed statistical tests can be drastically reduced. The proposed rankings show a practical utility in bioinformatics to increase interpretability of the collections of gene sets and a step forward to include redundancy-awareness into Shapley values computations.
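The closed-form idea behind Shapley values in games of this kind can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the game is an average of unanimity games, one per gene, where a gene counts as covered only when every set containing it is in the coalition. Under that assumption each gene contributes 1/(number of sets containing it) to each of those sets, so no enumeration of coalitions is needed. The function name `shapley_scores` and the toy family are hypothetical, and the redundancy-aware re-ranking described in the abstract is not included.

```python
from collections import Counter


def shapley_scores(family):
    """Closed-form Shapley values for a sum-of-unanimity-games model.

    family maps a set name to its member genes. Assumed game (illustrative):
    a coalition of sets scores the fraction of genes whose containing sets
    are all inside the coalition.
    """
    # How many sets contain each gene.
    containing = Counter(g for members in family.values() for g in set(members))
    n_genes = len(containing)
    # Each gene splits one unit of credit equally among the sets containing it.
    return {
        name: sum(1.0 / containing[g] for g in set(members)) / n_genes
        for name, members in family.items()
    }


# Hypothetical toy family of three overlapping gene sets.
family = {
    "set_A": {"g1", "g2", "g3"},
    "set_B": {"g3", "g4"},
    "set_C": {"g4"},
}
scores = shapley_scores(family)
for name in sorted(scores, key=scores.get, reverse=True):
    print(name, scores[name])
```

Because the scores come from a Shapley value, they sum to the value of the full collection (here 1), which gives a quick sanity check on any implementation.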
Affiliation(s)
- Chiara Balestra
  - Department of Computer Science, TU Dortmund, Dortmund, Germany
  - Department of Medical Biometry, Informatics and Epidemiology (IMBIE), University Hospital Bonn, Bonn, Germany
- Carlo Maj
  - Institute for Genomic Statistics and Bioinformatics IGSB, University Hospital Bonn, Bonn, Germany
  - Centre for Human Genetics, University of Marburg, Marburg, Germany
- Emmanuel Müller
  - Department of Computer Science, TU Dortmund, Dortmund, Germany
- Andreas Mayr
  - Department of Medical Biometry, Informatics and Epidemiology (IMBIE), University Hospital Bonn, Bonn, Germany
|
17
|
Arowolo O, Salemme V, Suvorov A. Towards Whole Health Toxicology: In-Silico Prediction of Diseases Sensitive to Multi-Chemical Exposures. Toxics 2022; 10:764. [PMID: 36548597 PMCID: PMC9784704 DOI: 10.3390/toxics10120764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2022] [Revised: 11/15/2022] [Accepted: 12/07/2022] [Indexed: 06/17/2023]
Abstract
Chemical exposures from diverse sources merge on a limited number of molecular pathways described as toxicity pathways. Changes in the same set of molecular pathways in different cell and tissue types may generate seemingly unrelated health conditions. Today, no approaches are available to predict in an unbiased way sensitivities of different disease states and their combinations to multi-chemical exposures across the exposome. We propose an inductive in-silico workflow where sensitivities of genes to chemical exposures are identified based on the overlap of existing genomic datasets, and data on sensitivities of individual genes is further used to sequentially derive predictions on sensitivities of molecular pathways, disease states, and groups of disease states (syndromes). Our analysis predicts that conditions representing the most significant public health problems are among the most sensitive to cumulative chemical exposures. These conditions include six leading types of cancer in the world (prostatic, breast, stomach, lung, colorectal neoplasms, and hepatocellular carcinoma), obesity, type 2 diabetes, non-alcoholic fatty liver disease, autistic disorder, Alzheimer's disease, hypertension, heart failure, brain and myocardial ischemia, and myocardial infarction. Overall, our predictions suggest that environmental risk factors may be underestimated for the most significant public health problems.
Affiliation(s)
- Olatunbosun Arowolo
  - Department of Environmental Health Sciences, School of Public Health and Health Sciences, University of Massachusetts, 686 North Pleasant Street, Amherst, MA 01003, USA
- Victoria Salemme
  - Department of Pharmacology, University of California, 1275 Med Science, Davis, CA 95616, USA
- Alexander Suvorov
  - Department of Environmental Health Sciences, School of Public Health and Health Sciences, University of Massachusetts, 686 North Pleasant Street, Amherst, MA 01003, USA
|
18
|
Srinivasu PN, Shafi J, Krishna TB, Sujatha CN, Praveen SP, Ijaz MF. Using Recurrent Neural Networks for Predicting Type-2 Diabetes from Genomic and Tabular Data. Diagnostics (Basel) 2022; 12:3067. [PMID: 36553074 PMCID: PMC9776641 DOI: 10.3390/diagnostics12123067] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/04/2022] [Revised: 12/01/2022] [Accepted: 12/04/2022] [Indexed: 12/12/2022] Open
Abstract
The development of genomic technology for smart diagnosis and therapies for various diseases has lately been the most demanding area for computer-aided diagnostic and treatment research. Exponential breakthroughs in artificial intelligence and machine intelligence technologies could pave the way for identifying challenges afflicting the healthcare industry. Genomics is paving the way for predicting future illnesses, including cancer, Alzheimer's disease, and diabetes. Machine learning advancements have expedited the pace of biomedical informatics research and inspired new branches of computational biology. Furthermore, knowing gene relationships has resulted in developing more accurate models that can effectively detect patterns in vast volumes of data, making classification models important in various domains. Recurrent Neural Network models have a memory that allows them to quickly remember knowledge from previous cycles and process genetic data. The present work focuses on type 2 diabetes prediction using gene sequences derived from genomic DNA fragments through automated feature selection and feature extraction procedures for matching gene patterns with training data. The suggested model was tested using tabular data to predict type 2 diabetes based on several parameters. The performance of neural networks incorporating Recurrent Neural Network (RNN) components, Long Short-Term Memory (LSTM), and Gated Recurrent Units (GRU) was tested in this research. The model's efficiency is assessed using evaluation metrics such as Sensitivity, Specificity, Accuracy, F1-Score, and Matthews Correlation Coefficient (MCC). The suggested technique predicted future illnesses with fair accuracy. Furthermore, our research showed that the suggested model could be used in real-world scenarios and that input risk variables from an end-user Android application could be kept and evaluated on a secure remote server.
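The evaluation metrics named in this abstract all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions only; the function name `classification_metrics` and the counts are made up for illustration and are not results from the paper.

```python
from math import sqrt


def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # recall, true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0  # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (
        2 * precision * sensitivity / (precision + sensitivity)
        if precision + sensitivity
        else 0.0
    )
    # MCC uses all four cells, so it stays informative under class imbalance.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts, not taken from the study.
for metric, value in classification_metrics(tp=40, fp=10, tn=45, fn=5).items():
    print(f"{metric}: {value:.3f}")
```

Degenerate cases (an empty row or column of the confusion matrix) are mapped to 0.0 here as a convention of this sketch; other conventions exist.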
Affiliation(s)
- Parvathaneni Naga Srinivasu
  - Department of Computer Science and Engineering, Prasad V. Potluri Siddhartha Institute of Technology, Vijayawada 520007, Andhra Pradesh, India
- Jana Shafi
  - Department of Computer Science, College of Arts and Science, Prince Sattam bin Abdul Aziz University, Wadi Ad-Dawasir 11991, Saudi Arabia
- T Balamurali Krishna
  - Department of Computer Science and Engineering, Dhanekula Institute of Engineering and Technology, Vijayawada 521139, Andhra Pradesh, India
- Canavoy Narahari Sujatha
  - Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, Hyderabad 501301, Telangana, India
- S Phani Praveen
  - Department of Computer Science and Engineering, Prasad V. Potluri Siddhartha Institute of Technology, Vijayawada 520007, Andhra Pradesh, India
- Muhammad Fazal Ijaz
  - Department of Intelligent Mechatronics Engineering, Sejong University, Seoul 05006, Republic of Korea
|
19
|
Koskeridis F, Evangelou E, Said S, Boyle JJ, Elliott P, Dehghan A, Tzoulaki I. Pleiotropic genetic architecture and novel loci for C-reactive protein levels. Nat Commun 2022; 13:6939. [PMID: 36376304 PMCID: PMC9663411 DOI: 10.1038/s41467-022-34688-6] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 06/10/2022] [Accepted: 11/02/2022] [Indexed: 11/16/2022] Open
Abstract
C-reactive protein is involved in a plethora of pathophysiological conditions. Many genetic loci associated with C-reactive protein are annotated to lipid and glucose metabolism genes supporting common biological pathways between inflammation and metabolic traits. To identify novel pleiotropic loci, we perform multi-trait analysis of genome-wide association studies on C-reactive protein levels along with cardiometabolic traits, followed by a series of in silico analyses including colocalization, phenome-wide association studies and Mendelian randomization. We find 41 novel loci and 19 gene sets associated with C-reactive protein with various pleiotropic effects. Additionally, 41 variants colocalize between C-reactive protein and cardiometabolic risk factors and 12 of them display unexpected discordant effects between the shared traits which are translated into discordant associations with clinical outcomes in subsequent phenome-wide association studies. Our findings provide insights into shared mechanisms underlying inflammation and lipid metabolism, representing potential preventive and therapeutic targets.
Affiliation(s)
- Fotios Koskeridis
  - Department of Hygiene and Epidemiology, University of Ioannina Medical School, Ioannina, Greece.
- Evangelos Evangelou
  - Department of Hygiene and Epidemiology, University of Ioannina Medical School, Ioannina, Greece
  - Institute of Biosciences, University Research Center of Ioannina, University of Ioannina, 45110, Ioannina, Greece
  - Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, UK
- Saredo Said
  - Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Joseph J Boyle
  - National Heart and Lung Institute, Imperial College London, London, UK
- Paul Elliott
  - Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, UK
  - UK Dementia Research Institute, Imperial College London, London, UK
  - BHF Centre of Excellence, Imperial College London, London, UK
- Abbas Dehghan
  - Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, UK
  - UK Dementia Research Institute, Imperial College London, London, UK
  - BHF Centre of Excellence, Imperial College London, London, UK
- Ioanna Tzoulaki
  - Department of Hygiene and Epidemiology, University of Ioannina Medical School, Ioannina, Greece
  - Institute of Biosciences, University Research Center of Ioannina, University of Ioannina, 45110, Ioannina, Greece
  - Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, UK
  - UK Dementia Research Institute, Imperial College London, London, UK
  - BHF Centre of Excellence, Imperial College London, London, UK
|
20
|
Abstract
Pathway enrichment analysis (PEA) is a computational biology method that identifies biological functions that are overrepresented in a group of genes more than would be expected by chance and ranks these functions by relevance. The relative abundance of genes pertinent to specific pathways is measured through statistical methods, and associated functional pathways are retrieved from online bioinformatics databases. In the last decade, along with the spread of the internet, higher availability of computational resources made PEA software tools easy to access and to use for bioinformatics practitioners worldwide. Although it became easier to use these tools, it also became easier to make mistakes that could generate inflated or misleading results, especially for beginners and inexperienced computational biologists. With this article, we propose nine quick tips to avoid common mistakes and to carry out a complete, sound, thorough PEA, which can produce relevant and robust results. We describe our nine guidelines in a simple way, so that they can be understood and used by anyone, including students and beginners. Some tips explain what to do before starting a PEA, others are suggestions of how to correctly generate meaningful results, and some final guidelines indicate some useful steps to properly interpret PEA results. Our nine tips can help users perform better pathway enrichment analyses and eventually contribute to a better understanding of current biology.
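One common way to quantify "overrepresented more than would be expected by chance" is a one-sided hypergeometric test against an explicit background gene set. The sketch below illustrates that test using only the standard library; the abstract does not say which statistic the authors recommend, and the gene identifiers, pathway names, and function names here are hypothetical. Multiple-testing correction, which a real analysis needs, is left out.

```python
from math import comb


def hypergeom_upper_tail(overlap, background_size, pathway_size, query_size):
    """P(X >= overlap) when query_size genes are drawn without replacement
    from a background that contains pathway_size pathway members."""
    total = comb(background_size, query_size)
    favourable = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, query_size - i)
        for i in range(overlap, min(pathway_size, query_size) + 1)
    )
    return favourable / total


def over_representation(query, pathways, background):
    """Rank pathways by hypergeometric p-value (smallest first)."""
    background = set(background)
    query = set(query) & background  # ignore genes outside the background
    rows = []
    for name, genes in pathways.items():
        members = set(genes) & background
        overlap = len(query & members)
        p = hypergeom_upper_tail(overlap, len(background), len(members), len(query))
        rows.append((name, overlap, p))
    return sorted(rows, key=lambda row: row[2])


# Hypothetical toy data: 20 background genes, two pathways, four query genes.
background = [f"g{i}" for i in range(20)]
pathways = {
    "pathway_A": ["g0", "g1", "g2", "g3", "g4"],
    "pathway_B": ["g10", "g11", "g12", "g13", "g14"],
}
query = ["g0", "g1", "g2", "g3"]
for name, overlap, p in over_representation(query, pathways, background):
    print(name, overlap, p)
```

The choice of `background` changes every p-value, so it is passed explicitly rather than defaulted.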
|
21
|
Zhang Q, Taniguchi S, So K, Tsuda M, Higuchi Y, Hashida M, Yamashita F. CREB is a potential marker associated with drug-induced liver injury: Identification and validation through transcriptome database analysis. J Toxicol Sci 2022; 47:337-348. [PMID: 35922923 DOI: 10.2131/jts.47.337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Abstract
Drug-induced liver injury (DILI) is the main cause of failure in drug development and postapproval withdrawal. Although toxicogenomic techniques provide an unprecedented opportunity for mechanistic assessment and biomarker discovery, they are not suitable for the screening of large numbers of exploratory compounds in early drug discovery. Using a comprehensive analysis of toxicogenomics (TGx) data, we aimed to find DILI-relevant transcription factors (TFs) that could be incorporated into a reporter gene assay system. Gene set enrichment analysis (GSEA) of the Open TG-GATEs dataset highlighted 4 DILI-relevant TFs, including CREB, NRF2, ELK-1, and E2F. Using ten drugs with already assigned idiosyncratic toxicity (IDT) risks, reporter gene assays were conducted in HepG2 cells in the presence of the S9 mix. There were weak correlations between NRF2 activity and IDT risk, whereas strong correlations were observed between CREB activity and IDT risk. In addition, CREB activation associated with 3 Withdrawn/Black box Warning drugs was reversed by pretreatment with a PKA inhibitor. Collectively, we suggest that CREB might be a sensitive biomarker for DILI prediction, and its response to stress induced by high-risk drugs might be primarily regulated by the PKA/CREB signaling pathway.
Affiliation(s)
- Qiyue Zhang
  - Department of Drug Delivery Research, Graduate School of Pharmaceutical Sciences, Kyoto University
- Shiori Taniguchi
  - Department of Drug Delivery Research, Graduate School of Pharmaceutical Sciences, Kyoto University
- Kanako So
  - Department of Applied Pharmaceutics and Pharmacokinetics, Graduate School of Pharmaceutical Sciences, Kyoto University
- Masahiro Tsuda
  - Department of Applied Pharmaceutics and Pharmacokinetics, Graduate School of Pharmaceutical Sciences, Kyoto University
- Yuriko Higuchi
  - Department of Drug Delivery Research, Graduate School of Pharmaceutical Sciences, Kyoto University
- Mitsuru Hashida
  - Department of Drug Delivery Research, Graduate School of Pharmaceutical Sciences, Kyoto University
- Fumiyoshi Yamashita
  - Department of Drug Delivery Research, Graduate School of Pharmaceutical Sciences, Kyoto University
  - Department of Applied Pharmaceutics and Pharmacokinetics, Graduate School of Pharmaceutical Sciences, Kyoto University
|
22
|
Javidi H, Mariam A, Khademi G, Zabor EC, Zhao R, Radivoyevitch T, Rotroff DM. Identification of robust deep neural network models of longitudinal clinical measurements. NPJ Digit Med 2022; 5:106. [PMID: 35896817 PMCID: PMC9329311 DOI: 10.1038/s41746-022-00651-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2022] [Accepted: 07/06/2022] [Indexed: 11/09/2022] Open
Abstract
Deep learning (DL) from electronic health records holds promise for disease prediction, but systematic methods for learning from simulated longitudinal clinical measurements have yet to be reported. We compared nine DL frameworks using simulated body mass index (BMI), glucose, and systolic blood pressure trajectories, independently isolated shape and magnitude changes, and evaluated model performance across various parameters (e.g., irregularity, missingness). Overall, discrimination based on variation in shape was more challenging than magnitude. Time-series forest-convolutional neural networks (TSF-CNN) and Gramian angular field (GAF)-CNN outperformed other approaches (P < 0.05) with overall area-under-the-curve (AUCs) of 0.93 for both models, and 0.92 and 0.89 for variation in magnitude and shape with up to 50% missing data. Furthermore, in a real-world assessment, the TSF-CNN model predicted T2D with AUCs reaching 0.72 using only BMI trajectories. In conclusion, we performed an extensive evaluation of DL approaches and identified robust modeling frameworks for disease prediction based on longitudinal clinical measurements.
Affiliation(s)
- Hamed Javidi
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
  - Department of Electrical Engineering and Computer Science, Cleveland State University, Cleveland, OH, USA
- Arshiya Mariam
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
- Gholamreza Khademi
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
- Emily C Zabor
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
- Ran Zhao
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
- Tomas Radivoyevitch
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
- Daniel M Rotroff
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
  - Department of Electrical Engineering and Computer Science, Cleveland State University, Cleveland, OH, USA
  - Endocrinology and Metabolism Institute, Cleveland Clinic, Cleveland, OH, USA
  - Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH, USA
|
23
|
Lee K, Yu D, Hyung D, Cho SY, Park C. ASpediaFI: Functional Interaction Analysis of Alternative Splicing Events. Genomics, Proteomics & Bioinformatics 2022; 20:466-482. [PMID: 35085775 PMCID: PMC9801047 DOI: 10.1016/j.gpb.2021.10.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2020] [Revised: 10/15/2021] [Accepted: 11/01/2021] [Indexed: 01/26/2023]
Abstract
Alternative splicing (AS) regulates biological processes governing phenotypes and diseases. Differential AS (DAS) gene test methods have been developed to investigate important exonic expression from high-throughput datasets. However, the DAS events extracted using statistical tests are insufficient to delineate relevant biological processes. In this study, we developed a novel application, Alternative Splicing Encyclopedia: Functional Interaction (ASpediaFI), to systemically identify DAS events and co-regulated genes and pathways. ASpediaFI establishes a heterogeneous interaction network of genes and their feature nodes (i.e., AS events and pathways) connected by co-expression or pathway gene set knowledge. Next, ASpediaFI explores the interaction network using the random walk with restart algorithm and interrogates the proximity from a query gene set. Finally, ASpediaFI extracts significant AS events, genes, and pathways. To evaluate the performance of our method, we simulated RNA sequencing (RNA-seq) datasets to consider various conditions of sequencing depth and sample size. The performance was compared with that of other methods. Additionally, we analyzed three public datasets of cancer patients or cell lines to evaluate how well ASpediaFI detects biologically relevant candidates. ASpediaFI exhibits strong performance in both simulated and public datasets. Our integrative approach reveals that DAS events that recognize a global co-expression network and relevant pathways determine the functional importance of spliced genes in the subnetwork. ASpediaFI is publicly available at https://bioconductor.org/packages/ASpediaFI.
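The random walk with restart step mentioned in the abstract can be sketched generically as follows. This is a minimal illustration on a toy weighted network, not the ASpediaFI implementation: the node names, edge weights, and the restart probability of 0.7 are all assumptions made for the example. Each iteration sends a fraction of the walker back to the query (seed) nodes and spreads the rest along edges in proportion to their weights.

```python
def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walk that restarts at `seeds`.

    `adj` maps each node to a dict of {neighbor: edge weight}. Every node is
    assumed to have at least one outgoing edge, so probability mass is conserved.
    """
    missing = set(seeds) - set(adj)
    if missing:
        raise ValueError(f"seed nodes not in network: {sorted(missing)}")
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart component: jump back to the seed distribution.
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk component: move along edges in proportion to their weights.
        for u in nodes:
            total = sum(adj[u].values())
            for v, w in adj[u].items():
                nxt[v] += (1 - restart) * p[u] * w / total
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p

# Hypothetical heterogeneous network: genes, one AS event, one pathway.
network = {
    "GENE_A": {"GENE_B": 1.0, "AS_EVENT_1": 0.8},
    "GENE_B": {"GENE_A": 1.0, "PATHWAY_X": 1.0},
    "AS_EVENT_1": {"GENE_A": 0.8},
    "PATHWAY_X": {"GENE_B": 1.0, "GENE_C": 1.0},
    "GENE_C": {"PATHWAY_X": 1.0},
}
scores = random_walk_with_restart(network, seeds={"GENE_A"})
```

Nodes with higher scores are closer to the query set in the network, which is the sense of "proximity" that a ranking of AS events and pathways could be built on.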
|
24
|
Mubeen S, Tom Kodamullil A, Hofmann-Apitius M, Domingo-Fernández D. On the influence of several factors on pathway enrichment analysis. Brief Bioinform 2022; 23:bbac143. [PMID: 35453140 PMCID: PMC9116215 DOI: 10.1093/bib/bbac143] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 01/17/2022] [Revised: 03/21/2022] [Accepted: 03/30/2022] [Indexed: 02/01/2023] Open
Abstract
Pathway enrichment analysis has become a widely used knowledge-based approach for the interpretation of biomedical data. Its popularity has led to an explosion of both enrichment methods and pathway databases. While the elegance of pathway enrichment lies in its simplicity, multiple factors can impact the results of such an analysis, which may not be accounted for. Researchers may fail to give influential aspects their due, resorting instead to popular methods and gene set collections, or default settings. Despite ongoing efforts to establish set guidelines, meaningful results are still hampered by a lack of consensus or gold standards around how enrichment analysis should be conducted. Nonetheless, such concerns have prompted a series of benchmark studies specifically focused on evaluating the influence of various factors on pathway enrichment results. In this review, we organize and summarize the findings of these benchmarks to provide a comprehensive overview on the influence of these factors. Our work covers a broad spectrum of factors, spanning from methodological assumptions to those related to prior biological knowledge, such as pathway definitions and database choice. In doing so, we aim to shed light on how these aspects can lead to insignificant, uninteresting or even contradictory results. Finally, we conclude the review by proposing future benchmarks as well as solutions to overcome some of the challenges, which originate from the outlined factors.
Affiliation(s)
- Sarah Mubeen
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
  - Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115 Bonn, Germany
  - Fraunhofer Center for Machine Learning, Germany
- Alpha Tom Kodamullil
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
- Martin Hofmann-Apitius
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
  - Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115 Bonn, Germany
- Daniel Domingo-Fernández
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
  - Fraunhofer Center for Machine Learning, Germany
  - Enveda Biosciences, Boulder, CO, 80301, USA
|
25
|
Xu N, Solari A, Goeman JJ. Closed testing with Globaltest, with application in metabolomics. Biometrics 2022. [PMID: 35567306 DOI: 10.1111/biom.13693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2021] [Accepted: 05/02/2022] [Indexed: 11/30/2022]
Abstract
The Globaltest is a powerful test for the global null hypothesis that there is no association between a group of features and a response of interest, which is popular in pathway testing in metabolomics. Evaluating multiple feature sets, however, requires multiple testing correction. In this paper, we propose a multiple testing method, based on closed testing, specifically designed for the Globaltest. The proposed method controls the family-wise error rate simultaneously over all possible feature sets, and therefore allows post hoc inference, i.e. the researcher may choose feature sets of interest after seeing the data without jeopardizing error control. To circumvent the exponential computation time of closed testing, we derive a novel shortcut that allows exact closed testing to be performed on the scale of metabolomics data. An R package ctgt is available on CRAN for the implementation of the shortcut procedure, with applications on several real metabolomics data examples.
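To make the closed testing principle concrete, here is a brute-force sketch that enumerates every subset of a handful of hypotheses. It is an illustration under stated assumptions: the local test is a simple Bonferroni test on made-up p-values, standing in for the Globaltest, and the paper's shortcut is not reproduced. A feature set is rejected only if every superset of it is rejected by the local test, which is why the naive procedure scales exponentially.

```python
from itertools import combinations

def locally_rejected(p_values, subset, alpha):
    """Bonferroni local test of the intersection hypothesis over `subset`."""
    return min(p_values[i] for i in subset) <= alpha / len(subset)

def closed_testing(p_values, alpha=0.05):
    """Return every index set rejected by the closed testing procedure."""
    m = len(p_values)
    subsets = [frozenset(c) for k in range(1, m + 1)
               for c in combinations(range(m), k)]
    local = {s for s in subsets if locally_rejected(p_values, s, alpha)}
    # A set is rejected only if all of its supersets are locally rejected.
    return {s for s in subsets if all(t in local for t in subsets if s <= t)}

# Hypothetical per-feature p-values (illustrative only).
p_values = [0.001, 0.012, 0.04, 0.6]
rejected = closed_testing(p_values)
```

With Bonferroni local tests the single-feature rejections coincide with Holm's procedure; here features 0 and 1 are rejected while features 2 and 3 are not. Rejecting a larger set means that at least one feature in it is associated with the response.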
Affiliation(s)
- Ningning Xu
  - Department of Biomedical Data Sciences, Leiden University Medical Center, The Netherlands
- Aldo Solari
  - Department of Economics, Management and Statistics, University of Milano-Bicocca, Italy
- Jelle J Goeman
  - Department of Biomedical Data Sciences, Leiden University Medical Center, The Netherlands
|
26
|
Transcript and blood-microbiome analysis towards a blood diagnostic tool for goats affected by Haemonchus contortus. Sci Rep 2022; 12:5362. [PMID: 35354850 PMCID: PMC8967894 DOI: 10.1038/s41598-022-08939-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2021] [Accepted: 03/10/2022] [Indexed: 11/19/2022] Open
Abstract
The Alpine goat (Capra aegagrus hircus) is parasitized by the barber pole worm (Haemonchus contortus). Hematological parameters from transcript and metagenome analysis in the host are reflective of infestation. We explored comparisons between blood samples of control, infected, infected zoledronic acid-treated, and infected antibody (anti-γδ T cells) treated wethers under controlled conditions. Seven days post-inoculation (dpi), we identified 7,627 transcripts associated with the different treatment types. Microbiome measurements at 7 dpi revealed fewer raw read counts across all treatments and a less diverse microbial flora than at 21 dpi. This study identifies treatment-specific transcripts and an increase in microflora abundance and diversity as wethers age. Further, the Firmicutes/Bacteroidetes (F/B) ratio reflects health, based on depression below or elevation above thresholds defined by the baseline of non-infected controls. Forty Alpine wethers were studied, with blood samples collected from five goats in four treatment groups on 7 dpi and 21 dpi. Transcript and microbiome profiles were obtained using the Partek Flow (St. Louis, Missouri, USA) software suite pipelines. Inflammation comparisons were based on the calculated Firmicutes/Bacteroidetes ratios as well as the reduction of microbial diversity.
|
27
|
Functional Enrichment Analysis of Regulatory Elements. Biomedicines 2022; 10:590. [PMID: 35327392 PMCID: PMC8945021 DOI: 10.3390/biomedicines10030590] [Citation(s) in RCA: 47] [Impact Index Per Article: 23.5] [Received: 01/27/2022] [Revised: 02/22/2022] [Accepted: 02/25/2022] [Indexed: 01/27/2023] Open
Abstract
Statistical methods for enrichment analysis are important tools to extract biological information from omics experiments. Although these methods have been widely used for the analysis of gene and protein lists, the development of high-throughput technologies for regulatory elements demands dedicated statistical and bioinformatics tools. Here, we present a set of enrichment analysis methods for regulatory elements, including CpG sites, miRNAs, and transcription factors. Statistical significance is determined via a power weighting function for target genes and tested by the Wallenius noncentral hypergeometric distribution model to avoid selection bias. These new methodologies have been applied to the analysis of a set of miRNAs associated with arrhythmia, showing the potential of this tool to extract biological information from a list of regulatory elements. These new methods are available in GeneCodis 4, a web tool able to perform singular and modular enrichment analysis that allows the integration of heterogeneous information.
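For orientation, the sketch below shows the ordinary over-representation test that methods of this kind build on. It is deliberately not the method in the abstract: it uses the central hypergeometric distribution, which is the equal-odds special case of the Wallenius noncentral model, and it applies no weighting of target genes. The function name and the counts are illustrative assumptions.

```python
from math import comb

def enrichment_p_value(universe_size, set_size, list_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    X is the number of annotated genes found when `list_size` genes are drawn
    without replacement from a universe in which `set_size` genes carry the
    annotation.
    """
    upper = min(set_size, list_size)
    total = comb(universe_size, list_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20,000 genes, a 100-gene annotation, a 200-gene input
# list, and 5 genes in common (about 1 would be expected by chance).
p_enriched = enrichment_p_value(20000, 100, 200, 5)
```

A weighted scheme such as the one described in the abstract changes the sampling odds per gene, so its p-values would generally differ from this unweighted baseline.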
|
28
|
Trapotsi MA, Hosseini-Gerami L, Bender A. Computational analyses of mechanism of action (MoA): data, methods and integration. RSC Chem Biol 2022; 3:170-200. [PMID: 35360890 PMCID: PMC8827085 DOI: 10.1039/d1cb00069a] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 03/30/2021] [Accepted: 12/09/2021] [Indexed: 12/15/2022] Open
Abstract
The elucidation of a compound's Mechanism of Action (MoA) is a challenging task in the drug discovery process, but it is important in order to rationalise phenotypic findings and to anticipate potential side-effects. Bioinformatic approaches, advances in machine learning techniques and the increasing deposition of high-throughput data in public databases have significantly contributed to recent advances in the field, but it is not straightforward to decide which data and methods are most suitable to use in a given case. In this review, we focus on these methods and data and their applications in generating MoA hypotheses for subsequent experimental validation. We discuss compound-specific data such as -omics, cell morphology and bioactivity data, as well as commonly used supplementary prior knowledge such as network and pathway data, and provide information on databases where this data can be accessed. In terms of methodologies, we discuss both well-established methods (connectivity mapping, pathway enrichment) as well as more developing methods (neural networks and multi-omics integration). Finally, we review case studies where the MoA of a compound was successfully suggested from computational analysis by incorporating multiple data modalities and/or methodologies. Our aim for this review is to provide researchers with insights into the benefits and drawbacks of both the data and methods in terms of level of understanding, biases and interpretation - and to highlight future avenues of investigation which we foresee will improve the field of MoA elucidation, including greater public access to -omics data and methodologies which are capable of data integration.
Affiliation(s)
- Maria-Anna Trapotsi
  - Centre for Molecular Informatics, Yusuf Hamied Department of Chemistry, University of Cambridge, UK
- Layla Hosseini-Gerami
  - Centre for Molecular Informatics, Yusuf Hamied Department of Chemistry, University of Cambridge, UK
- Andreas Bender
  - Centre for Molecular Informatics, Yusuf Hamied Department of Chemistry, University of Cambridge, UK
|
29
|
Zareifi DS, Chaliotis O, Chala N, Meimetis N, Sofotasiou M, Zeakis K, Pantiora E, Vezakis A, Matsopoulos GK, Fragulidis G, Alexopoulos LG. A network-based computational and experimental framework for repurposing compounds towards the treatment of Non-Alcoholic Fatty Liver Disease. iScience 2022; 25:103890. [PMID: 35252807 PMCID: PMC8889147 DOI: 10.1016/j.isci.2022.103890] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/13/2021] [Revised: 01/11/2022] [Accepted: 02/04/2022] [Indexed: 11/29/2022] Open
Abstract
Non-alcoholic fatty liver disease (NAFLD) is among the most common liver pathologies; however, no approved condition-specific therapy yet exists. The present study introduces a drug repositioning (DR) approach that combines in vitro steatosis models with a network-based computational platform, constructed upon genomic data from diseased liver biopsies and compound-treated cell lines, to propose effectively repositioned therapeutic compounds. The introduced in silico approach screened 20,000 compounds, while complementary in vitro and proteomic assays were developed to test the efficacy of the 46 in silico predictions. This approach successfully identified six compounds, including the known anti-steatogenic drugs resveratrol and sirolimus. In short, gallamine triethiodide, diflorasone, fenoterol, and pralidoxime ameliorate steatosis similarly to resveratrol/sirolimus. The implementation holds great potential in reducing screening time in the early drug discovery stages and in delivering promising compounds for in vivo testing.
A computational and experimental drug-screening platform for NAFLD was created. This framework evaluates in silico and validates in vitro a great number of compounds. 20,000 compounds were screened in silico and 21 were selected for validation. Six compounds reversed fully or partially the steatotic phenotype.
Affiliation(s)
- Danae Stella Zareifi
  - School of Mechanical Engineering, National Technical University of Athens, Iroon Polytechneiou 9, Zografou, 15780 Athens, Greece
- Odysseas Chaliotis
  - School of Mechanical Engineering, National Technical University of Athens, Iroon Polytechneiou 9, Zografou, 15780 Athens, Greece
- Nafsika Chala
  - School of Mechanical Engineering, National Technical University of Athens, Iroon Polytechneiou 9, Zografou, 15780 Athens, Greece
- Nikos Meimetis
  - School of Mechanical Engineering, National Technical University of Athens, Iroon Polytechneiou 9, Zografou, 15780 Athens, Greece
- Maria Sofotasiou
  - School of Mechanical Engineering, National Technical University of Athens, Iroon Polytechneiou 9, Zografou, 15780 Athens, Greece
- Konstantinos Zeakis
  - School of Electrical Engineering, National Technical University of Athens, 15780 Athens, Greece
- Eirini Pantiora
  - 2nd Department of Surgery, Aretaieio Hospital, University of Athens, School of Medicine, 11528, Athens, Greece
- Antonis Vezakis
  - 2nd Department of Surgery, Aretaieio Hospital, University of Athens, School of Medicine, 11528, Athens, Greece
- George K. Matsopoulos
  - School of Electrical Engineering, National Technical University of Athens, 15780 Athens, Greece
- Georgios Fragulidis
  - 2nd Department of Surgery, Aretaieio Hospital, University of Athens, School of Medicine, 11528, Athens, Greece
- Leonidas G. Alexopoulos
  - School of Mechanical Engineering, National Technical University of Athens, Iroon Polytechneiou 9, Zografou, 15780 Athens, Greece
  - ProtATonce Ltd, Patriarchou Grigoriou & Neapoleos Demokritos Science Park, Building#27, Agia Paraskevi GR15343, Greece
  - Corresponding author
|
30
|
Qu J, Cui Y. Gene set analysis with graph-embedded kernel association test. Bioinformatics 2021; 38:1560-1567. [PMID: 34935928 PMCID: PMC8896609 DOI: 10.1093/bioinformatics/btab851] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2021] [Revised: 11/20/2021] [Accepted: 12/16/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: The kernel-based association test (KAT) has been a popular approach to evaluate the association of expressions of a gene set (e.g. pathway) with a phenotypic trait. KATs rely on kernel functions, which capture the sample similarity across multiple features, to capture potential linear or non-linear relationships among features in a gene set. When calculating the kernel functions, no network graphical information about the features is considered. Because genes in a functional group (e.g. a pathway) are generally not independent owing to regulatory interactions, incorporating regulatory network (or graph) information can potentially increase the power of KAT. In this work, we propose a graph-embedded kernel association test, termed gKAT. gKAT incorporates prior pathway knowledge into hypothesis testing when constructing a kernel function. RESULTS: We apply a diffusion kernel to capture any graph structures in a gene set, then incorporate such information to build a kernel function for further association testing. We illustrate the geometric meaning of the approach. Through extensive simulation studies, we show that the proposed gKAT algorithm can improve testing power compared to the one without considering graph structures. Application to a real dataset further demonstrates the utility of the method. AVAILABILITY AND IMPLEMENTATION: The R code used for the analysis can be accessed at https://github.com/JialinQu/gKAT. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
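The diffusion kernel named in the abstract is commonly defined as the matrix exponential exp(-βL) of the graph Laplacian L = D - A. The sketch below computes it for a tiny gene graph with a truncated Taylor series in plain Python. It is an assumption-laden illustration: the three-gene path graph, β = 0.5, and the series length are invented for the example, and it does not show how gKAT combines this kernel with expression data.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diffusion_kernel(adjacency, beta=0.5, terms=30):
    """Approximate exp(-beta * L) for an undirected graph without self-loops."""
    n = len(adjacency)
    # Graph Laplacian L = D - A, with node degrees on the diagonal.
    lap = [[(sum(adjacency[i]) if i == j else 0.0) - adjacency[i][j]
            for j in range(n)] for i in range(n)]
    step = [[-beta * v for v in row] for row in lap]
    kernel = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in kernel]
    # Taylor series: I + M + M^2/2! + ... with M = -beta * L.
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, step)]
        kernel = [[kernel[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return kernel

# Hypothetical pathway graph: gene 0 - gene 1 - gene 2 (a simple path).
path_graph = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
gene_kernel = diffusion_kernel(path_graph)
```

Genes joined by an edge receive a larger kernel value than genes two steps apart, so the matrix encodes graph proximity as similarity. For larger graphs a library matrix exponential would be a more robust choice than the series used here.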
Affiliation(s)
- Jialin Qu
  - Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
- Yuehua Cui
  - To whom correspondence should be addressed.
|
31
|
Manzini S, Busnelli M, Colombo A, Franchi E, Grossano P, Chiesa G. reString: an open-source Python software to perform automatic functional enrichment retrieval, results aggregation and data visualization. Sci Rep 2021; 11:23458. [PMID: 34873191 PMCID: PMC8648753 DOI: 10.1038/s41598-021-02528-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/08/2021] [Accepted: 10/28/2021] [Indexed: 11/30/2022] Open
Abstract
Functional enrichment analysis is an analytical method to extract biological insights from gene expression data, popularized by the ever-growing application of high-throughput techniques. Typically, expression profiles are generated for hundreds to thousands of genes/proteins from samples belonging to two experimental groups, and after ad-hoc statistical tests, researchers are left with lists of statistically significant entities, possibly lacking any unifying biological theme. Functional enrichment tackles the problem of putting overall gene expression changes into a broader biological context, based on pre-existing knowledge bases of reference: database collections of known expression regulation, relationships and molecular interactions. STRING is among the most popular tools, providing both protein-protein interaction networks and functional enrichment analysis for any given set of identifiers. For complex experimental designs, manually retrieving, interpreting, analyzing and abridging functional enrichment results is a daunting task, usually performed by hand by the average wet-biology researcher. We have developed reString, a cross-platform software that seamlessly retrieves from STRING functional enrichments from multiple user-supplied gene sets, with just a few clicks, without any need for specific bioinformatics skills. Further, it aggregates all findings into human-readable table summaries, with built-in features to easily produce user-customizable publication-grade clustermaps and bubble plots. Herein, we outline a complete reString protocol, showcasing its features on a real use-case.
Affiliation(s)
- Stefano Manzini
  - Department of Pharmacological and Biomolecular Sciences, Università Degli Studi Di Milano, Milan, Italy
- Marco Busnelli
  - Department of Pharmacological and Biomolecular Sciences, Università Degli Studi Di Milano, Milan, Italy
- Alice Colombo
  - Department of Pharmacological and Biomolecular Sciences, Università Degli Studi Di Milano, Milan, Italy
- Elsa Franchi
  - Department of Pharmacological and Biomolecular Sciences, Università Degli Studi Di Milano, Milan, Italy
- Pasquale Grossano
  - Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico, Milan, Italy
- Giulia Chiesa
  - Department of Pharmacological and Biomolecular Sciences, Università Degli Studi Di Milano, Milan, Italy
|
32
|
Baltoumas FA, Zafeiropoulou S, Karatzas E, Paragkamian S, Thanati F, Iliopoulos I, Eliopoulos AG, Schneider R, Jensen LJ, Pafilis E, Pavlopoulos GA. OnTheFly 2.0: a text-mining web application for automated biomedical entity recognition, document annotation, network and functional enrichment analysis. NAR Genom Bioinform 2021; 3:lqab090. [PMID: 34632381 PMCID: PMC8494211 DOI: 10.1093/nargab/lqab090] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/14/2021] [Revised: 09/09/2021] [Accepted: 09/20/2021] [Indexed: 02/06/2023] Open
Abstract
Extracting and processing information from documents is of great importance as lots of experimental results and findings are stored in local files. Therefore, extracting and analyzing biomedical terms from such files in an automated way is absolutely necessary. In this article, we present OnTheFly2.0, a web application for extracting biomedical entities from individual files such as plain texts, office documents, PDF files or images. OnTheFly2.0 can generate informative summaries in popup windows containing knowledge related to the identified terms along with links to various databases. It uses the EXTRACT tagging service to perform named entity recognition (NER) for genes/proteins, chemical compounds, organisms, tissues, environments, diseases, phenotypes and gene ontology terms. Multiple files can be analyzed, whereas identified terms such as proteins or genes can be explored through functional enrichment analysis or be associated with diseases and PubMed entries. Finally, protein-protein and protein-chemical networks can be generated with the use of STRING and STITCH services. To demonstrate its capacity for knowledge discovery, we interrogated published meta-analyses of clinical biomarkers of severe COVID-19 and uncovered inflammatory and senescence pathways that impact disease pathogenesis. OnTheFly2.0 currently supports 197 species and is available at http://bib.fleming.gr:3838/OnTheFly/ and http://onthefly.pavlopouloslab.info.
Affiliation(s)
- Fotis A Baltoumas
  - Institute for Fundamental Biomedical Research, Biomedical Sciences Research Center "Alexander Fleming", Vari 16672, Greece
- Sofia Zafeiropoulou
  - Institute for Fundamental Biomedical Research, Biomedical Sciences Research Center "Alexander Fleming", Vari 16672, Greece
- Evangelos Karatzas
  - Institute for Fundamental Biomedical Research, Biomedical Sciences Research Center "Alexander Fleming", Vari 16672, Greece
- Savvas Paragkamian
  - Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Former U.S. Base of Gournes P.O. Box 2214, 71003 Heraklion, Crete, Greece
- Foteini Thanati
  - Institute for Fundamental Biomedical Research, Biomedical Sciences Research Center "Alexander Fleming", Vari 16672, Greece
- Ioannis Iliopoulos
  - Department of Basic Sciences, School of Medicine, University of Crete, Heraklion 71003, Crete, Greece
- Aristides G Eliopoulos
  - Department of Biology, School of Medicine, National and Kapodistrian University of Athens, Athens, 70013, Greece
- Reinhard Schneider
  - University of Luxembourg, Luxembourg Centre for Systems Biomedicine, Bioinformatics Core, Esch-sur-Alzette, L-4365, Luxembourg
- Lars Juhl Jensen
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, 2200, Denmark
- Evangelos Pafilis
  - Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Former U.S. Base of Gournes P.O. Box 2214, 71003 Heraklion, Crete, Greece
- Georgios A Pavlopoulos
  - Institute for Fundamental Biomedical Research, Biomedical Sciences Research Center "Alexander Fleming", Vari 16672, Greece
|
33
|
Li X, Zhang B, Yu K, Bao Z, Zhang W, Bai Y. Identifying cancer specific signaling pathways based on the dysregulation between genes. Comput Biol Chem 2021; 95:107586. [PMID: 34619555 DOI: 10.1016/j.compbiolchem.2021.107586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2021] [Revised: 08/10/2021] [Accepted: 09/26/2021] [Indexed: 11/26/2022]
Abstract
A large collection of studies has shown that the occurrence of cancer is related to the dysfunction of pathways. Identification of cancer-related pathways could help researchers better understand the mechanisms of complex diseases. However, most current signaling pathway analysis methods take no account of the gene interaction variations within pathways. Furthermore, some pathways are associated with two or more cancer types, while others are likely to be cancer-type specific. Identifying cancer-type specific pathways contributes to interpreting the different mechanisms of different cancer types. In this study, we first proposed a pathway analysis method named Pathway Analysis of Intergenic Regulation (PAIGR) to identify pathways with dysregulation between genes and compared the performance of this method with four existing methods on four colorectal cancer (CRC) datasets. The results showed that PAIGR could find cancer-related pathways more accurately. Moreover, in order to explore the relationship between the identified pathways and the cancer type, we constructed a pathway interaction network, in which nodes and edges represented pathways and interactions between pathways, respectively. Highly connected pathways were considered to play a central role in an extensive range of biological processes, while sparsely connected pathways were considered to have a certain specificity. Our results showed that pathways identified by PAIGR had a low nodal degree (i.e., few interactions), which suggested that most of these pathways were cancer-type specific.
Affiliation(s)
- Xiaohan Li
  - State Key Lab of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu, 210096, China
- Bing Zhang
  - State Key Lab of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu, 210096, China
- Kequan Yu
  - State Key Lab of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu, 210096, China
- Zhenshen Bao
  - State Key Lab of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu, 210096, China
- Weizhong Zhang
  - Department of Ophthalmology, First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Yunfei Bai
  - State Key Lab of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu, 210096, China
|
34
|
Maleki F, Ovens K, McQuillan I, Kusalik AJ. Silver: Forging almost Gold Standard Datasets. Genes (Basel) 2021; 12:1523. [PMID: 34680918 PMCID: PMC8535810 DOI: 10.3390/genes12101523] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/20/2021] [Revised: 09/19/2021] [Accepted: 09/22/2021] [Indexed: 11/16/2022] Open
Abstract
Gene set analysis has been widely used to gain insight from high-throughput expression studies. Although various tools and methods have been developed for gene set analysis, there is no consensus among researchers regarding best practice(s). Most often, evaluation studies have reported contradictory recommendations of which methods are superior. Therefore, an unbiased quantitative framework for evaluations of gene set analysis methods will be valuable. Such a framework requires gene expression datasets where enrichment status of gene sets is known a priori. In the absence of such gold standard datasets, artificial datasets are commonly used for evaluations of gene set analysis methods; however, they often rely on oversimplifying assumptions that make them biased in favor of or against a given method. In this paper, we propose a quantitative framework for evaluation of gene set analysis methods by synthesizing expression datasets using real data, without relying on oversimplifying or unrealistic assumptions, while preserving complex gene-gene correlations and retaining the distribution of expression values. The utility of the quantitative approach is shown by evaluating ten widely used gene set analysis methods. An implementation of the proposed method is publicly available. We suggest using Silver to evaluate existing and new gene set analysis methods. Evaluation using Silver provides a better understanding of current methods and can aid in the development of gene set analysis methods to achieve higher specificity without sacrificing sensitivity.
Affiliation(s)
- Farhad Maleki
  - Augmented Intelligence & Precision Health Laboratory, Institute of the McGill University Health Centre, McGill University, Montreal, QC H4A 3S5, Canada
  - Correspondence:
- Katie Ovens
  - Augmented Intelligence & Precision Health Laboratory, Institute of the McGill University Health Centre, McGill University, Montreal, QC H4A 3S5, Canada
- Ian McQuillan
  - Department of Computer Science, University of Saskatchewan, Saskatoon, SK S7N 5C9, Canada
- Anthony J. Kusalik
  - Department of Computer Science, University of Saskatchewan, Saskatoon, SK S7N 5C9, Canada
|
35
|
Fang S, Xu X, Zhong L, Wang AQ, Gao WL, Lu M, Yin ZS. Bioinformatics-based study to identify immune infiltration and inflammatory-related hub genes as biomarkers for the treatment of rheumatoid arthritis. Immunogenetics 2021; 73:435-448. [PMID: 34477936 DOI: 10.1007/s00251-021-01224-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/24/2021] [Accepted: 08/05/2021] [Indexed: 10/20/2022]
Abstract
Rheumatoid arthritis (RA) is a systemic autoimmune disease whose principal pathological change is aggressive chronic synovial inflammation; however, the specific etiology and pathogenesis have not been fully elucidated. We downloaded the synovial tissue gene expression profiles of four human knees from the Gene Expression Omnibus database, analyzed the differentially expressed genes in the normal and RA groups, and assessed their enrichment in functions and pathways using bioinformatics methods and the STRING online database to establish protein-protein interaction networks. Cytoscape software was used to obtain 10 hub genes; receiver operating characteristic (ROC) curves were calculated for each hub gene, and the differential expression of the hub genes between the two groups was analyzed. The CIBERSORT algorithm was used to impute immune infiltration. We identified the signaling pathways that play important roles in RA and 10 hub genes: Ccr1, Ccr2, Ccr5, Ccr7, Cxcl5, Cxcl6, Cxcl13, Ccl13, Adcy2, and Pnoc. The diagnostic value of these 10 hub genes for RA was confirmed using ROC curves and expression analysis. Adcy2, Cxcl13, and Ccr5 are strongly associated with RA development. The study also revealed that the differential infiltration profile of different inflammatory immune cells in the synovial tissue of RA is an extremely critical factor in RA progression. This study may contribute to the understanding of signaling pathways and biological processes associated with RA and the role of inflammatory immune infiltration in the pathogenesis of RA. In addition, this study shows that Adcy2, Cxcl13, and Ccr5 have the potential to be biomarkers for RA treatment.
Affiliation(s)
- Sheng Fang
  - Department of Orthopedics, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Hefei, Anhui Province, 230022, People's Republic of China
- Xin Xu
  - Department of Orthopedics, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Hefei, Anhui Province, 230022, People's Republic of China
- Lin Zhong
  - Department of Orthopedics, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Hefei, Anhui Province, 230022, People's Republic of China
  - Department of Orthopedics, The Third Affiliated Hospital of Anhui Medical University, 390 Huaihe Road, Hefei, Anhui Province, 230061, People's Republic of China
- An-Quan Wang
  - Department of Orthopedics, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Hefei, Anhui Province, 230022, People's Republic of China
- Wei-Lu Gao
  - Department of Orthopedics, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Hefei, Anhui Province, 230022, People's Republic of China
- Ming Lu
  - Department of Orthopedics, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Hefei, Anhui Province, 230022, People's Republic of China
- Zong-Sheng Yin
  - Department of Orthopedics, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Hefei, Anhui Province, 230022, People's Republic of China
|
36
|
Mubeen S, Bharadhwaj VS, Gadiya Y, Hofmann-Apitius M, Kodamullil AT, Domingo-Fernández D. DecoPath: a web application for decoding pathway enrichment analysis. NAR Genom Bioinform 2021; 3:lqab087. [PMID: 34568823 PMCID: PMC8459727 DOI: 10.1093/nargab/lqab087] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/10/2021] [Revised: 08/31/2021] [Accepted: 09/14/2021] [Indexed: 12/16/2022] Open
Abstract
The past decades have brought a steady growth of pathway databases and enrichment methods. However, the advent of pathway data has not been accompanied by an improvement in interoperability across databases, hampering the use of pathway knowledge from multiple databases for enrichment analysis. While integrative databases have attempted to address this issue, they often do not account for redundant information across resources. Furthermore, the majority of studies that employ pathway enrichment analysis still rely upon a single database or enrichment method, though the use of another could yield differing results. These shortcomings call for approaches that investigate the differences and agreements across databases and methods as their selection in the design of a pathway analysis can be a crucial step in ensuring the results of such an analysis are meaningful. Here we present DecoPath, a web application to assist in the interpretation of the results of pathway enrichment analysis. DecoPath provides an ecosystem to run enrichment analysis or directly upload results and facilitate the interpretation of results with custom visualizations that highlight the consensus and/or discrepancies at the pathway- and gene-levels. DecoPath is available at https://decopath.scai.fraunhofer.de, and its source code and documentation can be found on GitHub at https://github.com/DecoPath/DecoPath.
Affiliation(s)
- Sarah Mubeen
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn 53115, Germany
  - Fraunhofer Center for Machine Learning, Germany
- Vinay S Bharadhwaj
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn 53115, Germany
- Yojana Gadiya
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn 53115, Germany
- Martin Hofmann-Apitius
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn 53115, Germany
- Alpha T Kodamullil
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
- Daniel Domingo-Fernández
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
  - Fraunhofer Center for Machine Learning, Germany
  - Enveda Biosciences, Boulder, CO 80301, USA
|
37
|
Cingiz MÖ, Biricik G, Diri B. The Performance Comparison of Gene Co-expression Networks of Breast and Prostate Cancer using Different Selection Criteria. Interdiscip Sci 2021; 13:500-510. [PMID: 34003445 DOI: 10.1007/s12539-021-00440-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2020] [Revised: 04/21/2021] [Accepted: 05/11/2021] [Indexed: 06/12/2023]
Abstract
Gene co-expression networks (GCNs) present undirected relations between genes to help understand the molecular structures behind diseases, including cancer. The utilization of various biological datasets and gene network inference (GNI) algorithms can reveal meaningful gene-gene interactions of GCNs. This study applies three GNI algorithms on mRNA gene expression, RNA-Seq, and miRNA-target genes datasets to infer GCNs of breast and prostate cancers. To evaluate the performance of the GCNs, we utilize overlap analysis via literature data, topological assessment, and Gene Ontology-based biological assessment. The results emphasize how the selection of biological datasets and GNI algorithms affects the performance results on different evaluation criteria. GCNs on microarray gene expression data perform slightly better in overlap analysis. Also, GCNs on RNA-Seq and gene expression datasets follow scale-free topology. The biological assessment results are close to each other on all biological datasets. C3NET algorithm-based GCNs did not contain any biological assessment modules; therefore, C3NET is not optimal for biological assessment. The choice of GNI algorithm did not change the overlap analysis and topological assessment results. Our primary objective is to compare the performance results of biological datasets and GNI algorithms based on different evaluation criteria. For this purpose, we developed the GNIAP R package that enables users to select different GNI algorithms to infer GCNs. The GNIAP R package also provides literature-based overlap analysis, and topological and biological analyses on GCNs. Users can access the GNIAP R package via https://github.com/ozgurcingiz/GNIAP.
Affiliation(s)
- Mustafa Özgür Cingiz
  - Department of Computer Engineering, Faculty of Engineering and Natural Sciences, Bursa Technical University, 16310, Yildirim, Bursa, Turkey
- Göksel Biricik
  - Computer Engineering Department, Yildiz Technical University, Istanbul, Turkey
- Banu Diri
  - Computer Engineering Department, Yildiz Technical University, Istanbul, Turkey
|
38
|
Trasierras AM, Luna JM, Ventura S. Improving the understanding of cancer in a descriptive way: An emerging pattern mining-based approach. Int J Intell Syst 2021. [DOI: 10.1002/int.22503] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/24/2022]
Affiliation(s)
- José María Luna
  - Department of Computer Science and Numerical Analysis, Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI), University of Cordoba, Córdoba, Spain
- Sebastián Ventura
  - Department of Computer Science and Numerical Analysis, Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI), University of Cordoba, Córdoba, Spain
|
39
|
Fang S, Zhong L, Wang AQ, Zhang H, Yin ZS. Identification of Regeneration and Hub Genes and Pathways at Different Time Points after Spinal Cord Injury. Mol Neurobiol 2021; 58:2643-2662. [PMID: 33484404 DOI: 10.1007/s12035-021-02289-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/24/2020] [Accepted: 01/11/2021] [Indexed: 12/19/2022]
Abstract
Spinal cord injury (SCI) is a neurological injury that can cause neuronal loss around the lesion site and lead to locomotor and sensory deficits. However, the underlying molecular mechanisms remain unclear. This study aimed to verify differential gene time-course expression in SCI and provide new insights for gene-level studies. We downloaded two rat expression profiles (GSE464 and GSE45006) from the Gene Expression Omnibus database, including 1 day, 3 days, 7 days, and 14 days post-SCI, along with thoracic spinal cord data for analysis. At each time point, gene integration was performed using "batch normalization." The raw data were standardized, and differentially expressed genes at the different time points versus the control were analyzed by Gene Ontology enrichment analysis, the Kyoto Encyclopedia of Genes and Genomes pathway analysis, and gene set enrichment analysis. A protein-protein interaction network was then built and visualized. In addition, ten hub genes were identified at each time point. Among them, Gnb5, Gng8, Agt, Gnai1, and Psap lack correlation studies in SCI and deserve further investigation. Finally, we screened and analyzed genes for tissue repair, reconstruction, and regeneration and found that Anxa1, Snap25, and Spp1 were closely related to repair and regeneration after SCI. In conclusion, hub genes, signaling pathways, and regeneration genes involved in secondary SCI were identified in our study. These results may be useful for understanding SCI-related biological processes and the development of targeted intervention strategies.
Affiliation(s)
- Sheng Fang
  - Department of Orthopedics, The First Affiliated Hospital of Anhui Medical University, #218 Jixi Road, Hefei, 230022, Anhui Province, China
- Lin Zhong
  - Department of Orthopedics, The First Affiliated Hospital of Anhui Medical University, #218 Jixi Road, Hefei, 230022, Anhui Province, China
  - Department of Orthopedics, The Third Affiliated Hospital of Anhui Medical University, Hefei, Anhui Province, China
- An-Quan Wang
  - Department of Orthopedics, The First Affiliated Hospital of Anhui Medical University, #218 Jixi Road, Hefei, 230022, Anhui Province, China
- Hui Zhang
  - Department of Orthopedics, The First Affiliated Hospital of Anhui Medical University, #218 Jixi Road, Hefei, 230022, Anhui Province, China
- Zong-Sheng Yin
  - Department of Orthopedics, The First Affiliated Hospital of Anhui Medical University, #218 Jixi Road, Hefei, 230022, Anhui Province, China
|
40
|
Abstract
PURPOSE OF REVIEW: Drug development has evolved over the years from testing compounds one at a time to running massive screens on an industrial scale. Bringing a new therapeutic agent from concept to bedside can take a decade and cost billions of dollars, with most concepts failing along the way. Of the few compounds that make it to clinical testing, fewer than one in eight make it to approval. This traditional drug development pipeline is challenging for prevalent diseases and makes the development of new therapeutics for rare diseases financially intractable. RECENT FINDINGS: Repurposing of drugs is an alternative that identifies new applications for the thousands of compounds that have already been approved for clinical use. There is now a range of strategies for such efforts that leverage clinical data, pharmacologic data, and/or genomic or transcriptomic data. These strategies, together with examples, are detailed in this review. Drug repurposing bypasses the pre-clinical work and thereby opens up the opportunity to provide targeted treatment at a fraction of the cost associated with development from ideation to full approval. Such an approach makes drug discovery for any disease process more efficient but holds particular promise for rare diseases for which there is little to no other viable drug development channel.
Affiliation(s)
- Eric Kort
  - DeVos Cardiovascular Research Program, Van Andel Institute/Spectrum Health, Grand Rapids, MI, USA
  - Dept of Pediatrics & Human Development, Michigan State University, Grand Rapids, MI, USA
  - Helen DeVos Children's Hospital, Grand Rapids, MI, USA
- Stefan Jovinge
  - DeVos Cardiovascular Research Program, Van Andel Institute/Spectrum Health, Grand Rapids, MI, USA
  - Frederik Meijer Heart and Vascular Institute, Spectrum Health, Grand Rapids, MI, USA
  - Cardiovascular Institute, Stanford University, Palo Alto, CA, USA
|
41
|
Katz S, Song J, Webb KP, Lounsbury NW, Bryant CE, Fraser IDC. SIGNAL: A web-based iterative analysis platform integrating pathway and network approaches optimizes hit selection from genome-scale assays. Cell Syst 2021; 12:338-352.e5. [PMID: 33894945 DOI: 10.1016/j.cels.2021.03.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/16/2020] [Revised: 11/25/2020] [Accepted: 03/03/2021] [Indexed: 01/13/2023]
Abstract
Hit selection from high-throughput assays remains a critical bottleneck in realizing the potential of omic-scale studies in biology. Widely used methods such as setting of cutoffs, prioritizing pathway enrichments, or incorporating predicted network interactions offer divergent solutions yet are associated with critical analytical trade-offs. The specific limitations of these individual approaches and the lack of a systematic way by which to integrate their rankings have contributed to limited overlap in the reported results from comparable genome-wide studies and costly inefficiencies in secondary validation efforts. Using comparative analysis of parallel independent studies as a benchmark, we characterize the specific complementary contributions of each approach and demonstrate an optimal framework to integrate these methods. We describe selection by iterative pathway group and network analysis looping (SIGNAL), an integrated, iterative approach that uses both pathway and network methods to optimize gene prioritization. SIGNAL is accessible as a rapid user-friendly web-based application (https://signal.niaid.nih.gov). A record of this paper's transparent peer review is included in the Supplemental information.
Affiliation(s)
- Samuel Katz
  - NIAID, National Institutes of Health, Laboratory of Immune System Biology, Bethesda, MD 20892, USA
  - University of Cambridge, Department of Veterinary Medicine, Cambridge, UK
- Jian Song
  - NIAID, National Institutes of Health, Laboratory of Immune System Biology, Bethesda, MD 20892, USA
- Kyle P Webb
  - NIAID, National Institutes of Health, Laboratory of Immune System Biology, Bethesda, MD 20892, USA
- Nicolas W Lounsbury
  - NIAID, National Institutes of Health, Laboratory of Immune System Biology, Bethesda, MD 20892, USA
- Clare E Bryant
  - University of Cambridge, Department of Veterinary Medicine, Cambridge, UK
- Iain D C Fraser
  - NIAID, National Institutes of Health, Laboratory of Immune System Biology, Bethesda, MD 20892, USA
|
42
|
Seifert S, Gundlach S, Junge O, Szymczak S. Integrating biological knowledge and gene expression data using pathway-guided random forests: a benchmarking study. Bioinformatics 2021; 36:4301-4308. [PMID: 32399562 PMCID: PMC7520048 DOI: 10.1093/bioinformatics/btaa483] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/09/2019] [Revised: 03/13/2020] [Accepted: 05/05/2020] [Indexed: 12/12/2022] Open
Abstract
MOTIVATION High-throughput technologies allow comprehensive characterization of individuals on many molecular levels. However, training computational models to predict disease status based on omics data is challenging. A promising solution is the integration of external knowledge about structural and functional relationships into the modeling process. We compared four published random forest-based approaches using two simulation studies and nine experimental datasets. RESULTS The self-sufficient prediction error approach should be applied when large numbers of relevant pathways are expected. The competing methods hunting and learner of functional enrichment should be used when low numbers of relevant pathways are expected or the most strongly associated pathways are of interest. The hybrid approach synthetic features is not recommended because of its high false discovery rate. AVAILABILITY AND IMPLEMENTATION An R package providing functions for data analysis and simulation is available at GitHub (https://github.com/szymczak-lab/PathwayGuidedRF). An accompanying R data package (https://github.com/szymczak-lab/DataPathwayGuidedRF) stores the processed and quality controlled experimental datasets downloaded from Gene Expression Omnibus (GEO). SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Stephan Seifert
  - Institute of Medical Informatics and Statistics, Kiel University, University Hospital Schleswig-Holstein, Kiel 24105, Germany
- Sven Gundlach
  - Institute of Medical Informatics and Statistics, Kiel University, University Hospital Schleswig-Holstein, Kiel 24105, Germany
- Olaf Junge
  - Institute of Medical Informatics and Statistics, Kiel University, University Hospital Schleswig-Holstein, Kiel 24105, Germany
- Silke Szymczak
  - Institute of Medical Informatics and Statistics, Kiel University, University Hospital Schleswig-Holstein, Kiel 24105, Germany
|
43
|
Gilhooley MJ, Owen N, Moosajee M, Yu Wai Man P. From Transcriptomics to Treatment in Inherited Optic Neuropathies. Genes (Basel) 2021; 12:147. [PMID: 33499292 PMCID: PMC7912133 DOI: 10.3390/genes12020147] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/21/2020] [Revised: 01/13/2021] [Accepted: 01/20/2021] [Indexed: 02/06/2023] Open
Abstract
Inherited optic neuropathies, including Leber Hereditary Optic Neuropathy (LHON) and Dominant Optic Atrophy (DOA), are monogenetic diseases with a final common pathway of mitochondrial dysfunction leading to retinal ganglion cell (RGC) death and ultimately loss of vision. They are, therefore, excellent models with which to investigate this ubiquitous disease process, which is implicated in both common polygenetic ocular diseases (e.g., glaucoma) and late-onset central nervous system neurodegenerative diseases (e.g., Parkinson disease). In recent years, cellular and animal models of LHON and DOA have matured in parallel with techniques (such as RNA-seq) to determine and analyze the transcriptomes of affected cells. This confluence leaves us at a particularly exciting time with the potential for the identification of novel pathogenic players and therapeutic targets. Here, we present a discussion of the importance of inherited optic neuropathies and how transcriptomic techniques can be exploited in the development of novel mutation-independent, neuroprotective therapies.
Affiliation(s)
- Michael James Gilhooley
  - Institute of Ophthalmology, University College London, Bath Street, London EC1V 9EL, UK
  - Moorfields Eye Hospital NHS Foundation Trust, 162 City Road, London EC1V 2PD, UK
- Nicholas Owen
  - Institute of Ophthalmology, University College London, Bath Street, London EC1V 9EL, UK
- Mariya Moosajee
  - Institute of Ophthalmology, University College London, Bath Street, London EC1V 9EL, UK
  - Moorfields Eye Hospital NHS Foundation Trust, 162 City Road, London EC1V 2PD, UK
  - The Francis Crick Institute, 1 Midland Road, Somers Town, London NW1 1AT, UK
  - Great Ormond Street Hospital for Children NHS Foundation Trust, London WC1N 3JH, UK
- Patrick Yu Wai Man
  - Institute of Ophthalmology, University College London, Bath Street, London EC1V 9EL, UK
  - Moorfields Eye Hospital NHS Foundation Trust, 162 City Road, London EC1V 2PD, UK
  - Department of Clinical Neurosciences, University of Cambridge, Robinson Way, Cambridge CB2 0PY, UK
  - MRC Mitochondrial Biology Unit, University of Cambridge, Robinson Way, Cambridge CB2 0PY, UK
  - Cambridge Eye Unit, Addenbrooke’s Hospital, Hills Road, Cambridge CB2 0QQ, UK
|
44
|
Pan X, Ma X. A Novel Six-Gene Signature for Prognosis Prediction in Ovarian Cancer. Front Genet 2020; 11:1006. [PMID: 33193589 PMCID: PMC7593580 DOI: 10.3389/fgene.2020.01006] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 03/04/2020] [Accepted: 08/06/2020] [Indexed: 12/18/2022] Open
Abstract
Ovarian cancer (OC) is the most malignant tumor in the female reproductive tract. Although abundant molecular biomarkers have been identified, a robust and accurate gene expression signature is still essential to assist oncologists in evaluating the prognosis of OC patients. In this study, samples from 367 patients in The Cancer Genome Atlas (TCGA) database were subjected to mRNA expression profiling. Then, we used a gene set enrichment analysis (GSEA) to screen genes correlated with epithelial–mesenchymal transition (EMT) and assess their prognostic power with a Cox proportional regression model. Six genes (TGFBI, SFRP1, COL16A1, THY1, PPIB, BGN) associated with overall survival (OS) were used to construct a risk assessment model, after which the patients were divided into high-risk and low-risk groups. The six-gene signature was an independent prognostic biomarker of OS for OC patients based on the multivariate Cox regression analysis. In addition, the six-gene model was validated with samples from the Gene Expression Omnibus (GEO) database. In summary, we established a six-gene signature relevant to the prognosis of OC, which might become a therapeutic tool with clinical applications in the future.
Affiliation(s)
- Xin Pan
  - Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
- Xiaoxin Ma
  - Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
|
45
|
Ghulam A, Lei X, Guo M, Bian C. A Review of Pathway Databases and Related Methods Analysis. Curr Bioinform 2020. [DOI: 10.2174/1574893614666191018162505] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Pathway analysis integrates most of the computational tools for the investigation of high-level and complex human diseases. In the field of bioinformatics research, biological pathways analysis is an important part of systems biology. The molecular complexities of biological pathways are difficult to understand in human diseases, which can be explored through pathway analysis. In this review, we describe essential information related to pathway databases and their mechanisms, algorithms and methods. In the pathway database analysis, we present a brief introduction on how to gain knowledge from fundamental pathway data in regard to specific human pathways and how to use pathway databases and pathway analysis to predict diseases during an experiment. We also provide detailed information related to computational tools that are used in complex pathway data analysis, the roles of these tools in the bioinformatics field and how to store the pathway data. We illustrate various methodological difficulties that are faced during pathway analysis. The main ideas and techniques for the pathway-based examination approaches are presented. We provide the list of pathway databases and analytical tools. This review will serve as a helpful manual for pathway analysis databases.
Affiliation(s)
- Ali Ghulam
  - School of Computer Science, Shaanxi Normal University, Xian, China
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, Xian, China
- Min Guo
  - School of Computer Science, Shaanxi Normal University, Xian, China
- Chen Bian
  - School of Computer Science, Shaanxi Normal University, Xian, China
|
46
|
Rotroff DM. A Bioinformatics Crash Course for Interpreting Genomics Data. Chest 2020; 158:S113-S123. [PMID: 32658646 PMCID: PMC8176646 DOI: 10.1016/j.chest.2020.03.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2019] [Revised: 11/11/2019] [Accepted: 03/09/2020] [Indexed: 10/23/2022] Open
Abstract
Reductions in genotyping costs and improvements in computational power have made conducting genome-wide association studies (GWAS) standard practice for many complex diseases. GWAS is the assessment of genetic variants across the genome of many individuals to determine which, if any, genetic variants are associated with a specific trait. As with any analysis, there are evolving best practices that should be followed to ensure scientific rigor and reliability in the conclusions. This article presents a brief summary for many of the key bioinformatics considerations when either planning or evaluating GWAS. This review is meant to serve as a guide to those without deep expertise in bioinformatics and GWAS and give them tools to critically evaluate this popular approach to investigating complex diseases. In addition, a checklist is provided that can be used by investigators to evaluate whether a GWAS has appropriately accounted for the many potential sources of bias and generally followed current best practices.
Affiliation(s)
- Daniel M Rotroff
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH
|
47
|
Maleki F, Ovens K, Hogan DJ, Kusalik AJ. Gene Set Analysis: Challenges, Opportunities, and Future Research. Front Genet 2020; 11:654. [PMID: 32695141 PMCID: PMC7339292 DOI: 10.3389/fgene.2020.00654] [Citation(s) in RCA: 106] [Impact Index Per Article: 26.5] [Received: 02/01/2020] [Accepted: 05/29/2020] [Indexed: 12/14/2022] Open
Abstract
Gene set analysis methods are widely used to provide insight into high-throughput gene expression data. There are many gene set analysis methods available. These methods rely on various assumptions and have different requirements, strengths and weaknesses. In this paper, we classify gene set analysis methods based on their components, describe the underlying requirements and assumptions for each class, and provide directions for future research in developing and evaluating gene set analysis methods.
|
48
|
Guo Y, Huang P, Ning W, Zhang H, Yu C. Identification of Core Genes and Pathways in Medulloblastoma by Integrated Bioinformatics Analysis. J Mol Neurosci 2020; 70:1702-1712. [PMID: 32535713 DOI: 10.1007/s12031-020-01556-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/10/2020] [Accepted: 04/13/2020] [Indexed: 12/20/2022]
Abstract
Medulloblastoma (MB) is one of the most common intracranial malignancies in children. The present study applied integrated bioinformatics to identify potential core genes associated with the pathogenesis of MB and reveal potential molecular mechanisms. Through the integrated analysis of multiple data sets from the Gene Expression Omnibus (GEO), 414 differentially expressed genes (DEGs) were identified. Combining the protein-protein interaction (PPI) network analysis with gene set enrichment analysis (GSEA), eight core genes, including CCNA2, CCNB1, CCNB2, AURKA, CDK1, MAD2L1, BUB1B, and RRM2, as well as four core pathways, including "cell cycle", "oocyte meiosis", "p53 pathway" and "DNA replication", were selected. In independent data sets, the core genes showed superior diagnostic values and significant prognostic correlations. Moreover, in the pan-cancer data of The Cancer Genome Atlas (TCGA), the core genes were also widely abnormally expressed. In conclusion, this study identified core genes and pathways of MB through integrated analysis to deepen the understanding of the molecular mechanisms underlying MB and provide potential targets and pathways for diagnosis and treatment of MB.
Affiliation(s)
- Yuduo Guo: Department of Neurosurgery, Sanbo Brain Hospital, Capital Medical University, Beijing, China
- Peng Huang: Department of Neurosurgery, Sanbo Brain Hospital, Capital Medical University, Beijing, China
- Weihai Ning: Department of Neurosurgery, Sanbo Brain Hospital, Capital Medical University, Beijing, China
- Hongwei Zhang: Department of Neurosurgery, Sanbo Brain Hospital, Capital Medical University, Beijing, China
- Chunjiang Yu: Department of Neurosurgery, Sanbo Brain Hospital, Capital Medical University, Beijing, China
|
49
|
Pradines JR, Farutin V, Cilfone NA, Ghavami A, Kurtagic E, Guess J, Manning AM, Capila I. Enhancing reproducibility of gene expression analysis with known protein functional relationships: The concept of well-associated protein. PLoS Comput Biol 2020; 16:e1007684. [PMID: 32058996 PMCID: PMC7046299 DOI: 10.1371/journal.pcbi.1007684] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/20/2019] [Revised: 02/27/2020] [Accepted: 01/27/2020] [Indexed: 12/27/2022] Open
Abstract
Identification of differentially expressed genes (DEGs) is well recognized to be variable across independent replications of genome-wide transcriptional studies. These studies are often employed to characterize disease state early in the process of discovery and to prioritize novel targets aimed at addressing unmet medical need. Increasing the reproducibility of biological findings from these studies could positively impact the success rate of new clinical interventions. This work demonstrates that a statistically sound combination of gene expression data with prior knowledge about biology, in the form of large protein interaction networks, can yield quantitatively more reproducible observations from studies characterizing human disease. The novel concept of Well-Associated Proteins (WAPs) introduced herein (gene products significantly associated on protein interaction networks with the differences in transcript levels between control and disease) does not require choosing a differential expression threshold and can be computed efficiently enough to enable false discovery rate estimation via permutation. Reproducibility of WAPs is shown to be on average superior to that of DEGs under easily quantifiable conditions, suggesting that they can yield a significantly more robust description of disease. Enhanced reproducibility of WAPs versus DEGs is first demonstrated with four independent data sets focused on systemic sclerosis. This finding is then validated over thousands of pairs of data sets obtained by random partitions of large studies in several other diseases. Conditions that individual data sets must satisfy to yield robust WAP scores are examined. Reproducible identification of WAPs can potentially benefit drug target selection and precision medicine studies.
Affiliation(s)
- Joël R. Pradines: Momenta Pharmaceuticals, 301 Binney Street, Cambridge, Massachusetts, United States of America
- Victor Farutin: Momenta Pharmaceuticals, 301 Binney Street, Cambridge, Massachusetts, United States of America (corresponding author)
- Nicholas A. Cilfone: Momenta Pharmaceuticals, 301 Binney Street, Cambridge, Massachusetts, United States of America
- Abouzar Ghavami: Momenta Pharmaceuticals, 301 Binney Street, Cambridge, Massachusetts, United States of America
- Elma Kurtagic: Momenta Pharmaceuticals, 301 Binney Street, Cambridge, Massachusetts, United States of America
- Jamey Guess: Momenta Pharmaceuticals, 301 Binney Street, Cambridge, Massachusetts, United States of America
- Anthony M. Manning: Momenta Pharmaceuticals, 301 Binney Street, Cambridge, Massachusetts, United States of America
- Ishan Capila: Momenta Pharmaceuticals, 301 Binney Street, Cambridge, Massachusetts, United States of America (corresponding author)
|
50
|
Sevim Bayrak C, Zhang P, Tristani-Firouzi M, Gelb BD, Itan Y. De novo variants in exomes of congenital heart disease patients identify risk genes and pathways. Genome Med 2020; 12:9. [PMID: 31941532 PMCID: PMC6961332 DOI: 10.1186/s13073-019-0709-8] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 09/04/2019] [Accepted: 12/26/2019] [Indexed: 12/14/2022] Open
Abstract
Background: Congenital heart disease (CHD) affects ~1% of live births and is the most common birth defect. Although a genetic contribution to CHD has long been suspected, it has only been well established recently. De novo variants are estimated to contribute to approximately 8% of sporadic CHD. Methods: CHD is genetically heterogeneous, making pathway enrichment analysis an effective approach to explore and statistically validate CHD-associated genes. In this study, we performed novel gene and pathway enrichment analyses of high-impact de novo variants in the recently published whole-exome sequencing (WES) data generated from a cohort of 2645 CHD parent-offspring trios to identify new CHD-causing candidate genes and mutations. We performed rigorous variant- and gene-level filtrations to identify potentially damaging variants, followed by enrichment analyses and gene prioritization. Results: Our analyses revealed 23 novel genes that are likely to cause CHD, including HSP90AA1, ROCK2, IQGAP1, and CHD4, and that share biological functions, pathways, molecular interactions, and properties with known CHD-causing genes. Conclusions: Ultimately, these findings suggest novel genes that are likely to contribute to CHD pathogenesis.
Affiliation(s)
- Cigdem Sevim Bayrak: Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Peng Zhang: St. Giles Laboratory of Human Genetics of Infectious Diseases, The Rockefeller University, New York, NY, USA
- Martin Tristani-Firouzi: Nora Eccles Harrison Cardiovascular Research and Training Institute, University of Utah, Salt Lake City, UT, USA
- Bruce D Gelb: Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Pediatrics, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Yuval Itan: Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
|